Partitioned global address space

In computer science, partitioned global address space (PGAS) is a parallel programming model. PGAS is typified by communication operations involving a global memory address space abstraction that is logically partitioned, with a portion local to each process, thread, or processing element. [1] [2] The novelty of PGAS is that the portions of the shared memory space may have an affinity for a particular process, thereby exploiting locality of reference to improve performance. A PGAS memory model is featured in various parallel programming languages and libraries, including Coarray Fortran, Unified Parallel C, Split-C, Fortress, Chapel, X10, UPC++, Coarray C++, Global Arrays, DASH and SHMEM. The PGAS paradigm is now an integral part of the Fortran language as of Fortran 2008, which standardized coarrays.
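
To make the partitioning concrete, the following is a minimal sketch in Chapel, one of the PGAS languages listed above. It assumes a multi-locale Chapel build; the classic dmapped/Block distribution spelling shown here reflects one version of the language, and newer releases spell the distribution API differently.

    use BlockDist;

    // A global index space, block-partitioned across all locales.
    const Space = {1..16};
    const D = Space dmapped Block(boundingBox=Space);

    // A is one logical array, but its elements are physically
    // distributed: each element lives in the memory of one locale.
    var A: [D] int;

    // forall runs each iteration on the locale that owns A[i],
    // so every write below is local ("owner computes").
    forall i in D do
      A[i] = here.id;        // record which locale owns element i

    writeln(A);              // e.g. 0 0 0 0 1 1 1 1 ... on 4 locales

Any locale may also read or write an element it does not own; the runtime turns such accesses into communication, which is the sense in which the address space is global yet partitioned.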

The various languages and libraries offering a PGAS memory model differ widely in other details, such as the base programming language and the mechanisms used to express parallelism. Many PGAS systems combine the advantages of an SPMD programming style for distributed memory systems (as employed by MPI) with the data referencing semantics of shared memory systems. In contrast to message passing, PGAS programming models frequently offer one-sided communication operations such as Remote Memory Access (RMA), whereby one processing element may directly access memory with affinity to a different (potentially remote) process, without explicit semantic involvement by the passive target process. PGAS can offer better efficiency and scalability than traditional shared-memory approaches with a flat address space, because hardware-specific data locality is explicitly exposed in the semantic partitioning of the address space.
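
Chapel illustrates the one-sided style: ordinary lexical scoping gives transparent remote access, and the compiler and runtime translate a reference to a variable owned by another locale into a one-sided get or put, with no matching receive executed by the owner. A minimal sketch (assumes at least two locales so that communication actually occurs):

    var x: int = 42;                  // x lives in locale 0's memory

    on Locales[numLocales-1] {        // shift execution to the last locale
      // Reading x here is a one-sided remote get from locale 0;
      // locale 0 runs no matching receive.
      writeln("locale ", here.id, " read x = ", x);

      x = 7;                          // one-sided remote put back to locale 0
    }

    writeln("after remote update: x = ", x);   // prints 7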

A variant of the PGAS paradigm, asynchronous partitioned global address space (APGAS), augments the programming model with facilities for both local and remote asynchronous task creation. [3] Two programming languages that use this model are Chapel and X10.
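
In Chapel, the APGAS combination of place-shifting and task creation is expressed with the on and begin constructs; the sketch below spawns one asynchronous task per locale and waits for all of them with a sync block. X10 expresses the same pattern with its async, at, and finish constructs.

    // Spawn one asynchronous task on every locale; each runs remotely
    // and concurrently with the spawning code on locale 0.
    sync {
      for loc in Locales {
        begin on loc {
          writeln("asynchronous task on locale ", here.id);
        }
      }
    }
    // Control reaches here only after every remote task has finished.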

Related Research Articles

Fortran

Fortran is a general-purpose, compiled imperative programming language that is especially suited to numeric computation and scientific computing.

Message Passing Interface (MPI) is a portable message-passing standard designed to function on parallel computing architectures. The MPI standard defines the syntax and semantics of library routines useful to a wide range of users writing portable message-passing programs in C, C++, and Fortran. Several open-source MPI implementations exist, which fostered the development of a parallel software industry and encouraged the development of portable and scalable large-scale parallel applications.

High-performance computing

High-performance computing (HPC) uses supercomputers and computer clusters to solve advanced computation problems.

Coarray Fortran (CAF), formerly known as F--, started in the 1990s as an extension of Fortran 95/2003 for parallel processing, created by Robert Numrich and John Reid. The Fortran 2008 standard now includes coarrays, as decided at the May 2005 meeting of the ISO Fortran Committee; the syntax in the Fortran 2008 standard differs slightly from the original CAF proposal.

Unified Parallel C (UPC) is an extension of the C programming language designed for high-performance computing on large-scale parallel machines, including those with a common global address space and those with distributed memory. The programmer is presented with a single partitioned global address space, in which shared variables may be directly read and written by any processor, but each variable is physically associated with a single processor. UPC uses a single program, multiple data (SPMD) model of computation in which the amount of parallelism is fixed at program startup time, typically with a single thread of execution per processor.

High Performance Fortran (HPF) is an extension of Fortran 90 with constructs that support parallel computing, published by the High Performance Fortran Forum (HPFF). The HPFF was convened and chaired by Ken Kennedy of Rice University. The first version of the HPF Report was published in 1993.

X10 is a programming language being developed by IBM at the Thomas J. Watson Research Center as part of the Productive, Easy-to-use, Reliable Computing System (PERCS) project funded by DARPA's High Productivity Computing Systems (HPCS) program.

In computing, a parallel programming model is an abstraction of parallel computer architecture, with which it is convenient to express algorithms and their composition in programs. The value of a programming model can be judged on its generality: how well a range of different problems can be expressed for a variety of different architectures, and its performance: how efficiently the compiled programs can execute. The implementation of a parallel programming model can take the form of a library invoked from a sequential language, as an extension to an existing language, or as an entirely new language.

Concurrent computing is a form of computing in which several computations are executed concurrently—during overlapping time periods—instead of sequentially—with one completing before the next starts.

Split-C is a parallel extension of the C programming language. The Split-C project website describes Split-C as:

a parallel extension of the C programming language that supports efficient access to a global address space on current distributed memory multiprocessors. It retains the "small language" character of C and supports careful engineering and optimization of programs by providing a simple, predictable cost model.

Chapel (programming language)

Chapel, the Cascade High Productivity Language, is a parallel programming language that was developed by Cray, and later by Hewlett Packard Enterprise, which acquired Cray. It originated in the Cray Cascade project, a participant in DARPA's High Productivity Computing Systems (HPCS) program, which had the goal of increasing supercomputer productivity by 2010. It is developed as an open-source project under version 2 of the Apache License.
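
A canonical multi-locale Chapel program makes the PGAS structure visible directly in the language (a minimal sketch; assumes a multi-locale Chapel build):

    // One task per locale, each executing on its own locale
    // and printing from its own partition of the system.
    coforall loc in Locales do
      on loc do
        writeln("Hello from locale ", here.id, " of ", numLocales);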

CUDA

CUDA is a parallel computing platform and application programming interface (API) that allows software to use certain types of graphics processing units (GPUs) for general purpose processing, an approach called general-purpose computing on GPUs (GPGPU). CUDA is a software layer that gives direct access to the GPU's virtual instruction set and parallel computational elements, for the execution of compute kernels.

Global Arrays (GA) is a library for parallel computing developed by scientists at Pacific Northwest National Laboratory. GA provides a friendly API for shared-memory-style programming with multidimensional arrays on distributed-memory computers. The GA library is a predecessor to the global address space (GAS) languages currently being developed for high-performance computing.

In computing, algorithmic skeletons, or parallelism patterns, are a high-level parallel programming model for parallel and distributed computing.

HPC Challenge Benchmark combines several benchmarks to test a number of independent attributes of the performance of high-performance computing (HPC) systems. The project has been co-sponsored by the DARPA High Productivity Computing Systems program, the United States Department of Energy, and the National Science Foundation.

Katherine Yelick

Katherine "Kathy" Anne Yelick, an American computer scientist, is the Vice Chancellor for Research and the Robert S. Pepper Professor of Electrical Engineering and Computer Sciences at the University of California, Berkeley. She is also a faculty scientist at Lawrence Berkeley National Laboratory, where she was Associate Laboratory Director for Computing Sciences from 2010 to 2019.

SHMEM is a family of parallel programming libraries providing one-sided, RDMA, parallel-processing interfaces for low-latency distributed-memory supercomputers. The SHMEM acronym was subsequently given the backronym "Symmetric Hierarchical MEMory". The libraries were later extended to distributed-memory parallel computer clusters, and SHMEM is used as a parallel programming interface or as a low-level interface on which to build partitioned global address space (PGAS) systems and languages. "Libsma", the first SHMEM library, was created by Richard Smith at Cray Research in 1993 as a set of thin interfaces to access the CRAY T3D's inter-processor-communication hardware. SHMEM has been implemented by Cray Research, SGI, Cray Inc., Quadrics, HP, GSHMEM, IBM, QLogic, Mellanox, and the Universities of Houston and Florida; there is also the open-source OpenSHMEM.

In computing, a memory access pattern or IO access pattern is the pattern with which a system or program reads and writes memory on secondary storage. These patterns differ in the level of locality of reference and drastically affect cache performance, and also have implications for the approach to parallelism and distribution of workload in shared memory systems. Further, cache coherency issues can affect multiprocessor performance, which means that certain memory access patterns place a ceiling on parallelism.

References

  1. Almasi, George. "PGAS (Partitioned Global Address Space) Languages." Encyclopedia of Parallel Computing, Springer (2011): 1539–1545. https://doi.org/10.1007/978-0-387-09766-4_210
  2. Cristian Coarfă; Yuri Dotsenko; John Mellor-Crummey, "An Evaluation of Global Address Space Languages: Co-Array Fortran and Unified Parallel C"
  3. Tim Stitt, "An Introduction to the Partitioned Global Address Space (PGAS) Programming Model"
  4. Numrich, R.W., Reid, J., Co-array Fortran for parallel programming. ACM SIGPLAN Fortran Forum 17(2), 1–31 (1998).
  5. J. Reid: Coarrays in the Next Fortran Standard. SIGPLAN Fortran Forum 29(2), 10–27 (July 2010)
  6. GCC wiki, Coarray support in gfortran as specified in the Fortran 2008 standard
  7. W. Chen, D. Bonachea, J. Duell, P. Husbands, C. Iancu, K. Yelick. A Performance Analysis of the Berkeley UPC Compiler 17th Annual International Conference on Supercomputing (ICS), 2003. https://doi.org/10.1145/782814.782825
  8. Tarek El-Ghazawi, William Carlson, Thomas Sterling, and Katherine Yelick. UPC: distributed shared memory programming. John Wiley & Sons, 2005.
  9. UPC Consortium, UPC Language and Library Specifications, v1.3, Lawrence Berkeley National Lab Tech Report LBNL-6623E, Nov 2013. https://doi.org/10.2172/1134233
  10. Bradford L. Chamberlain, Chapel, Programming Models for Parallel Computing, edited by Pavan Balaji, MIT Press, November 2015.
  11. John Bachan, Scott B. Baden, Steven Hofmeyr, Mathias Jacquelin, Amir Kamil, Dan Bonachea, Paul H. Hargrove, Hadia Ahmed. "UPC++: A High-Performance Communication Framework for Asynchronous Computation", In 33rd IEEE International Parallel & Distributed Processing Symposium (IPDPS'19), May 20–24, 2019. https://doi.org/10.25344/S4V88H
  12. T. A. Johnson: Coarray C++. Proceedings of the 7th International Conference on PGAS Programming Models, pp. 54–66. PGAS'13 (2013).
  13. Nieplocha, Jaroslaw; Harrison, Robert J.; Littlefield, Richard J. (1996). Global arrays: A nonuniform memory access programming model for high-performance computers. The Journal of Supercomputing. 10 (2): 169–189.
  14. K. Furlinger, C. Glass, A. Knupfer, J. Tao, D. Hunich, et al. DASH: Data Structures and Algorithms with Support for Hierarchical Locality. Euro-Par Parallel Processing Workshops (2014).
  15. P. Charles, C. Grothoff, V. Saraswat, C. Donawa, A. Kielstra, et al. X10: an object-oriented approach to nonuniform cluster computing. Proceedings of the 20th Annual ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA’05) (2005).
  16. Katherine Yelick, Paul Hilfinger, Susan Graham, Dan Bonachea, Jimmy Su, Amir Kamil, Kaushik Datta, Phillip Colella, and Tong Wen, "Parallel Languages and Compilers: Perspective from the Titanium Experience", The International Journal Of High Performance Computing Applications, August 1, 2007, 21(3):266-290
  17. Katherine Yelick, Susan Graham, Paul Hilfinger, Dan Bonachea, Jimmy Su, Amir Kamil, Kaushik Datta, Phillip Colella, Tong Wen, "Titanium", Encyclopedia of Parallel Computing, edited by David Padua, (Springer: 2011) Pages: 2049-2055
  18. Culler, D. E., Dusseau, A., Goldstein, S. C., Krishnamurthy, A., Lumetta, S., Von Eicken, T., & Yelick, K. Parallel programming in Split-C. In Supercomputing'93: Proceedings of the 1993 ACM/IEEE conference on Supercomputing (pp. 262-273). IEEE.
  19. Bonachea D, Hargrove P. GASNet-EX: A High-Performance, Portable Communication Library for Exascale. Proceedings of Languages and Compilers for Parallel Computing (LCPC'18), Oct 2018. https://doi.org/10.25344/S4QP4W