Open MPI

Open MPI
Stable release	4.1.6 / September 30, 2023;2 months ago
Preview release	5.0.0rc13 / September 29, 2023;2 months ago
Repository	github.com/open-mpi/ompi ;
Operating system	Unix, Linux, macOS, FreeBSD
Platform	Cross-platform
Type	Library
License	New BSD License
Website	www.open-mpi.org

Last updated December 24, 2023

Open MPI is a Message Passing Interface (MPI) library project combining technologies and resources from several other projects (FT-MPI, LA-MPI, LAM/MPI, and PACX-MPI). It is used by many TOP500 supercomputers including Roadrunner, which was the world's fastest supercomputer from June 2008 to November 2009,^[3] and K computer, the fastest supercomputer from June 2011 to June 2012.^[4]^[5]

Overview

Open MPI represents the merger between three well-known MPI implementations:

FT-MPI from the University of Tennessee
LA-MPI from Los Alamos National Laboratory
LAM/MPI from Indiana University

with contributions from the PACX-MPI team at the University of Stuttgart. These four institutions comprise the founding members of the Open MPI development team.

The Open MPI developers selected these MPI implementations as excelling in one or more areas. Open MPI aims to use the best ideas and technologies from the individual projects and create one world-class open-source MPI implementation that excels in all areas. The Open MPI project specifies several top-level goals:

to create a free, open source software, peer-reviewed, production-quality complete MPI-3.0 implementation
to provide extremely high, competitive performance (low latency or high bandwidth)
to involve the high-performance computing community directly with external development and feedback (vendors, 3rd party researchers, users, etc.)
to provide a stable platform for 3rd-party research and commercial development
to help prevent the "forking problem" common to other MPI projects^[6]
to support a wide variety of high-performance computing platforms and environments

Code modules

The Open MPI code has 3 major code modules:

OMPI - MPI code
ORTE - the Open Run-Time Environment
OPAL - the Open Portable Access Layer

Commercial implementations

Sun HPC Cluster Tools - beginning with version 7, Sun switched to Open MPI
Bullx MPI—In 2010 Bull announced the release of bullx MPI, based on Open MPI^[7]

Consortium

Open MPI development is performed within a consortium of many industrial and academic partners. The consortium also covers several other software projects such as the hwloc (Hardware Locality) library which takes care of discovering and modeling the topology of parallel platforms.

Related Research Articles

<span class="mw-page-title-main">Beowulf cluster</span> Type of computing cluster

A Beowulf cluster is a computer cluster of what are normally identical, commodity-grade computers networked into a small local area network with libraries and programs installed which allow processing to be shared among them. The result is a high-performance parallel computing cluster from inexpensive personal computer hardware.

Message Passing Interface (MPI) is a standardized and portable message-passing standard designed to function on parallel computing architectures. The MPI standard defines the syntax and semantics of library routines that are useful to a wide range of users writing portable message-passing programs in C, C++, and Fortran. There are several open-source MPI implementations, which fostered the development of a parallel software industry, and encouraged development of portable and scalable large-scale parallel applications.

Quadrics was a supercomputer company formed in 1996 as a joint venture between Alenia Spazio and the technical team from Meiko Scientific. They produced hardware and software for clustering commodity computer systems into massively parallel systems. Their highpoint was in June 2003 when six out of the ten fastest supercomputers in the world were based on Quadrics' interconnect. They officially closed on June 29, 2009.

<span class="mw-page-title-main">Jack Dongarra</span> American computer scientist (born 1950)

Jack Joseph Dongarra is an American computer scientist and mathematician. He is the American University Distinguished Professor of Computer Science in the Electrical Engineering and Computer Science Department at the University of Tennessee. He holds the position of a Distinguished Research Staff member in the Computer Science and Mathematics Division at Oak Ridge National Laboratory, Turing Fellowship in the School of Mathematics at the University of Manchester, and is an adjunct professor and teacher in the Computer Science Department at Rice University. He served as a faculty fellow at the Texas A&M University Institute for Advanced Study (2014–2018). Dongarra is the founding director of the Innovative Computing Laboratory at the University of Tennessee. He was the recipient of the Turing Award in 2021.

MPICH, formerly known as MPICH2, is a freely available, portable implementation of MPI, a standard for message-passing for distributed-memory applications used in parallel computing. MPICH is Free and open source software with some public domain components that were developed by a US governmental organisation, and is available for most flavours of Unix-like OS.

In computing, single program, multiple data (SPMD) is a term that has been used to refer to computational models for exploiting parallelism where-by multiple processors cooperate in the execution of a program in order to obtain results faster.

LAM/MPI is one of the predecessors of the Open MPI project. Open MPI represents a community-driven, next generation implementation of a Message Passing Interface (MPI) fundamentally designed upon a component architecture to make an extremely powerful platform for high-performance computing. LAM/MPI was officially retired in March 2015.

The Parallel Virtual File System (PVFS) is an open-source parallel file system. A parallel file system is a type of distributed file system that distributes file data across multiple servers and provides for concurrent access by multiple tasks of a parallel application. PVFS was designed for use in large scale cluster computing. PVFS focuses on high performance access to large data sets. It consists of a server process and a client library, both of which are written entirely of user-level code. A Linux kernel module and pvfs-client process allow the file system to be mounted and used with standard utilities. The client library provides for high performance access via the message passing interface (MPI). PVFS is being jointly developed between The Parallel Architecture Research Laboratory at Clemson University and the Mathematics and Computer Science Division at Argonne National Laboratory, and the Ohio Supercomputer Center. PVFS development has been funded by NASA Goddard Space Flight Center, The DOE Office of Science Advanced Scientific Computing Research program, NSF PACI and HECURA programs, and other government and private agencies. PVFS is now known as OrangeFS in its newest development branch.

Trilinos is a collection of open-source software libraries, called packages, intended to be used as building blocks for the development of scientific applications. The word "Trilinos" is Greek and conveys the idea of "a string of pearls", suggesting a number of software packages linked together by a common infrastructure. Trilinos was developed at Sandia National Laboratories from a core group of existing algorithms and utilizes the functionality of software interfaces such as the BLAS, LAPACK, and MPI . In 2004, Trilinos received an R&D100 Award.

Global Arrays, or GA, is the library developed by scientists at Pacific Northwest National Laboratory for parallel computing. GA provides a friendly API for shared-memory programming on distributed-memory computers for multidimensional arrays. The GA library is a predecessor to the GAS languages currently being developed for high-performance computing.

<span class="mw-page-title-main">Computer cluster</span> Set of computers configured in a distributed computing system

A computer cluster is a set of computers that work together so that they can be viewed as a single system. Unlike grid computers, computer clusters have each node set to perform the same task, controlled and scheduled by software. The newest manifestation of cluster computing is cloud computing.

Manycore processors are special kinds of multi-core processors designed for a high degree of parallel processing, containing numerous simpler, independent processor cores. Manycore processors are used extensively in embedded computers and high-performance computing.

Portals is a low-level network API for high-performance networking on high-performance computing systems developed by Sandia National Laboratories and the University of New Mexico. Portals is currently the lowest-level network programming interface on the commercially successful XT line of supercomputers from Cray.

Exascale computing refers to computing systems capable of calculating at least "10¹⁸ IEEE 754 Double Precision (64-bit) operations (multiplications and/or additions) per second (exaFLOPS)"; it is a measure of supercomputer performance.

Tachyon is a parallel/multiprocessor ray tracing software. It is a parallel ray tracing library for use on distributed memory parallel computers, shared memory computers, and clusters of workstations. Tachyon implements rendering features such as ambient occlusion lighting, depth-of-field focal blur, shadows, reflections, and others. It was originally developed for the Intel iPSC/860 by John Stone for his M.S. thesis at University of Missouri-Rolla. Tachyon subsequently became a more functional and complete ray tracing engine, and it is now incorporated into a number of other open source software packages such as VMD, and SageMath. Tachyon is released under a permissive license.

<span class="mw-page-title-main">Message passing in computer clusters</span> Aspect of computer clusters

Message passing is an inherent element of all computer clusters. All computer clusters, ranging from homemade Beowulfs to some of the fastest supercomputers in the world, rely on message passing to coordinate the activities of the many nodes they encompass. Message passing in computer clusters built with commodity servers and switches is used by virtually every internet service.

SHMEM is a family of parallel programming libraries, providing one-sided, RDMA, parallel-processing interfaces for low-latency distributed-memory supercomputers. The SHMEM acronym was subsequently reverse engineered to mean "Symmetric Hierarchical MEMory”. Later it was expanded to distributed memory parallel computer clusters, and is used as parallel programming interface or as low-level interface to build partitioned global address space (PGAS) systems and languages. “Libsma”, the first SHMEM library, was created by Richard Smith at Cray Research in 1993 as a set of thin interfaces to access the CRAY T3D's inter-processor-communication hardware. SHMEM has been implemented by Cray Research, SGI, Cray Inc., Quadrics, HP, GSHMEM, IBM, QLogic, Mellanox, Universities of Houston and Florida; there is also open-source OpenSHMEM.

Arm MAP, is an application profiler produced by Allinea Software now part of Arm. of Warwick, United Kingdom, for profiling the performance of C, C++, Fortran 90 and Python software. It is widely used for its multithreaded and multiprocess capabilities such as profiling parallel Message Passing Interface (MPI) or OpenMP applications, including those running on clusters of Linux machines, and for scalar (sequential) code.

ROCm is an Advanced Micro Devices (AMD) software stack for graphics processing unit (GPU) programming. ROCm spans several domains: general-purpose computing on graphics processing units (GPGPU), high performance computing (HPC), heterogeneous computing. It offers several programming models: HIP, OpenMP/Message Passing Interface (MPI), OpenCL.

References

↑ "Open MPI: Version 5.0".
↑ "FreshPorts -- net/Openmpi2: High Performance Message Passing Library".
↑ Jeff Squyres. "Open MPI: 10^15 Flops Can't Be Wrong" (PDF). Open MPI Project. Retrieved 2011-09-27.
↑ "Programming on K computer" (PDF). Fujitsu. Retrieved 2012-01-17.
↑ "Open MPI powers 8 petaflops". Cisco Systems. Archived from the original on 2011-06-28. Retrieved 2011-09-27.
↑ Preventing forking is a goal; how will you enforce that?
↑ Aurélie Negro. "Bull launches bullx supercomputer suite". Bull SAS. Archived from the original on 2014-04-21. Retrieved 2013-09-27.

External links

Open MPI Project Homepage
RCE01: Open MPI - RCE Podcast interview

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

[1] "Open MPI: Version 5.0".

[2] "FreshPorts -- net/Openmpi2: High Performance Message Passing Library".

[3] Jeff Squyres. "Open MPI: 10^15 Flops Can't Be Wrong" (PDF). Open MPI Project. Retrieved 2011-09-27.

[4] "Programming on K computer" (PDF). Fujitsu. Retrieved 2012-01-17.

[5] "Open MPI powers 8 petaflops". Cisco Systems. Archived from the original on 2011-06-28. Retrieved 2011-09-27.

[6] Preventing forking is a goal; how will you enforce that?

[7] Aurélie Negro. "Bull launches bullx supercomputer suite". Bull SAS. Archived from the original on 2014-04-21. Retrieved 2013-09-27.

[1]

[2]

[3]

[4]

[5]

[6]

[7]