Coarray Fortran

Last updated
Coarray Fortran
Paradigm multi-paradigm: parallel, message passing, imperative (procedural, object-oriented), structured
Designed by Robert Numrich and John Reid
Developer PL22.3 Fortran Committee
Stable release
Fortran 2008 (ISO/IEC 1539-1:2010)
Typing discipline strong, static
OS Cross-platform
Major implementations
Cray, g95, GNU Fortran, Intel Fortran Compiler, Rice (CAF 2.0), OpenUH, NAG Fortran Compiler
Influenced by

Coarray Fortran (CAF), formerly known as F--, started as an extension of Fortran 95/2003 for parallel processing created by Robert Numrich and John Reid in the 1990s. The Fortran 2008 standard (ISO/IEC 1539-1:2010) now includes coarrays (spelled without hyphen), as decided at the May 2005 meeting of the ISO Fortran Committee; the syntax in the Fortran 2008 standard is slightly different from the original CAF proposal.


A CAF program is interpreted as if it were replicated a number of times and all copies were executed asynchronously. Each copy has its own set of data objects and is termed an image. The array syntax of Fortran is extended with additional trailing subscripts in square brackets to provide a concise representation of references to data that is spread across images.

The CAF extension was implemented in some Fortran compilers such as those from Cray (since release 3.1). Since the inclusion of coarrays in the Fortran 2008 standard, the number of implementations is growing. The first open-source compiler which implemented coarrays as specified in the Fortran 2008 standard for Linux architectures is G95. Currently, GNU Fortran provides wide coverage of Fortran's coarray features in single- and multi-image configuration (the latter based on the OpenCoarrays library). Another implementation of coarrays and related parallel extensions from Fortran 2008 is available in the OpenUH compiler (a branch of Open64) developed at the University of Houston.

Implementation in compilers

CAF is often implemented on top of a Message Passing Interface (MPI) library for portability. Some implementations, such as the ones available in the GNU Fortran and OpenUH compilers, may run on top of other low-level layers (for example, GASNet) designed for supporting partitioned global address space languages.


A simple example is given below. CAF is used in CGPACK, an open source package for simulating polycrystalline materials developed at the University of Bristol. [1]

program Hello_Worldimplicit noneinteger::i! Local variablecharacter(len=20)::name[*]! scalar coarray, one "name" for each image.! Note: "name" is the local variable while "name[<index>]" accesses the! variable in a specific image; "name[this_image()]" is the same as "name".! Interact with the user on Image 1; execution for all others pass by.if(this_image()==1)then       write(*,'(a)',advance='no')'Enter your name: 'read(*,'(a)')name! Distribute information to other imagesdo i=2,num_images()name[i]=nameend do  end if  sync all! Barrier to make sure the data have arrived.! I/O from all images, executing in any order, but each record written is intact. write(*,'(3a,i0)')'Hello ',trim(name),' from image ',this_image()end program Hello_world

The program above scales poorly because the loop that distributes information executes sequentially. Writing scalable programs often requires a sophisticated understanding of parallel algorithms, a detailed knowledge of the underlying network characteristics, and special tuning for application characteristics such as the size of data transfers. For most application developers, letting the compiler or runtime library decide the best algorithm proves more robust and high-performing. Fortran 2018 will offer collective communication subroutines that empower compiler and runtime library teams to encapsulate efficient parallel algorithms for collective communication and distributed computation in a set of collective subroutines. These subroutines and other new parallel programming features are summarized in a technical specification [2] that the Fortran standards committee has voted to incorporate into Fortran 2018. These enable the user to write a more efficient version of the above algorithm

program Hello_Worldimplicit nonecharacter(len=20)::name[*]! scalar coarray, one "name" for each image.! Note: "name" is the local variable while "name[<index>]" accesses the! variable in a specific image; "name[this_image()]" is the same as "name".! Interact with the user on Image 1; execution for all others pass by.if(this_image()==1)then       write(*,'(a)',advance='no')'Enter your name: 'read(*,'(a)')nameend if! Distribute information to all imagescall co_broadcast(name,source_image=1)! I/O from all images, executing in any order, but each record written is intact. write(*,'(3a,i0)')'Hello ',trim(name),' from image ',this_image()end program Hello_world

where the lack of explicit synchronization offers the potential for higher performance due to less coordination between the images. Furthermore, TS 18508 guarantees that "A transfer from an image cannot occur before the collective subroutine has been invoked on that image." This implies some partial synchronization inside co_broadcast, but could be higher performing than the "sync all" in the prior example. TS 18508 also incorporates several other new features that address issues targeted by the CAF 2.0 effort described below. Examples include teams of images and events.

An alternate perspective

In 2011, Rice University pursued an alternate vision of coarray extensions for the Fortran language. [3] Their perspective is that the Fortran 2008 standard committee's design choices were shaped more by the desire to introduce as few modifications to the language as possible than to assemble the best set of extensions to support parallel programming. In their view, both Numrich and Reid's original design and the coarray extensions proposed for Fortran 2008 suffer from the following shortcomings:

To address these shortcomings, the Rice University group is developing a clean-slate redesign of the Coarray Fortran programming model. Rice's new design for Coarray Fortran, which they call Coarray Fortran 2.0, is an expressive set of coarray-based extensions to Fortran designed to provide a productive parallel programming model. Compared to Fortran 2008, Rice's new coarray-based language extensions include some additional features:

See also

Related Research Articles

C (programming language) general-purpose programming language

C is a general-purpose, procedural computer programming language supporting structured programming, lexical variable scope, and recursion, while a static type system prevents unintended operations. By design, C provides constructs that map efficiently to typical machine instructions and has found lasting use in applications previously coded in assembly language. Such applications include operating systems and various application software for computers, from supercomputers to PLCs and embedded systems.

Fortran General-purpose programming language

Fortran is a general-purpose, compiled imperative programming language that is especially suited to numeric computation and scientific computing.

Pascal (programming language) Programming language

Pascal is an imperative and procedural programming language, designed by Niklaus Wirth as a small, efficient language intended to encourage good programming practices using structured programming and data structuring. It is named in honour of the French mathematician, philosopher and physicist Blaise Pascal.

Message Passing Interface (MPI) is a standardized and portable message-passing standard designed by a group of researchers from academia and industry to function on a wide variety of parallel computing architectures. The standard defines the syntax and semantics of a core of library routines useful to a wide range of users writing portable message-passing programs in C, C++, and Fortran. There are several well-tested and efficient implementations of MPI, many of which are open-source or in the public domain. These fostered the development of a parallel software industry, and encouraged development of portable and scalable large-scale parallel applications.

OpenMP open standard for parallelizing

The application programming interface (API) OpenMP supports multi-platform shared-memory multiprocessing programming in C, C++, and Fortran, on many platforms, instruction-set architectures and operating systems, including Solaris, AIX, HP-UX, Linux, macOS, and Windows. It consists of a set of compiler directives, library routines, and environment variables that influence run-time behavior.

IDL, short for Interactive Data Language, is a programming language used for data analysis. It is popular in particular areas of science, such as astronomy, atmospheric physics and medical imaging. IDL shares a common syntax with PV-Wave and originated from the same codebase, though the languages have subsequently diverged in detail. There are also free or costless implementations, such as GNU Data Language (GDL) and Fawlty Language (FL).

A+ (programming language) programming language

A+ is an array programming language descended from the programming language A, which in turn was created to replace APL in 1988. Arthur Whitney developed the A portion of A+, while other developers at Morgan Stanley extended it, adding a graphical user interface and other language features. A+ is a high-level, interactive, interpreted language, designed for numerically intensive applications, especially those found in financial applications. A+ runs on many Unix variants, including Linux. It is free and open source software released under a GNU General Public License.

LAPACK Software library for numerical linear algebra

LAPACK is a standard software library for numerical linear algebra. It provides routines for solving systems of linear equations and linear least squares, eigenvalue problems, and singular value decomposition. It also includes routines to implement the associated matrix factorizations such as LU, QR, Cholesky and Schur decomposition. LAPACK was originally written in FORTRAN 77, but moved to Fortran 90 in version 3.2 (2008). The routines handle both real and complex matrices in both single and double precision.

Unified Parallel C (UPC) is an extension of the C programming language designed for high-performance computing on large-scale parallel machines, including those with a common global address space and those with distributed memory. The programmer is presented with a single shared, partitioned address space, where variables may be directly read and written by any processor, but each variable is physically associated with a single processor. UPC uses a single program, multiple data (SPMD) model of computation in which the amount of parallelism is fixed at program startup time, typically with a single thread of execution per processor.

In computing, a parallel programming model is an abstraction of parallel computer architecture, with which it is convenient to express algorithms and their composition in programs. The value of a programming model can be judged on its generality: how well a range of different problems can be expressed for a variety of different architectures, and its performance: how efficiently the compiled programs can execute. The implementation of a parallel programming model can take the form of a library invoked from a sequential language, as an extension to an existing language, or as an entirely new language.

Concurrent computing is a form of computing in which several computations are executed concurrently—during overlapping time periods—instead of sequentially, with one completing before the next starts.

Stream processing is a computer programming paradigm, equivalent to dataflow programming, event stream processing, and reactive programming, that allows some applications to more easily exploit a limited form of parallel processing. Such applications can use multiple computational units, such as the floating point unit on a graphics processing unit or field-programmable gate arrays (FPGAs), without explicitly managing allocation, synchronization, or communication among those units.

G95 is a free, portable, open-source Fortran 95 compiler. It implements the Fortran 95 standard, part of the Fortran 2003 standard, as well as some old and new extensions including features for the Fortran 2008 standard like coarray Fortran. It also supports the F programming language subset.

In computer science, a partitioned global address space (PGAS) is a parallel programming model. It assumes a global memory address space that is logically partitioned and a portion of it is local to each process, thread, or processing element. The novelty of PGAS is that the portions of the shared memory space may have an affinity for a particular process, thereby exploiting locality of reference. The PGAS model is the basis of Unified Parallel C, Coarray Fortran, Split-C, Fortress, Chapel, X10, UPC++, Coarray C++, Global Arrays, DASH and SHMEM. In standard Fortran, this model is now an integrated part of the language. PGAS attempts to combine the advantages of a SPMD programming style for distributed memory systems with the data referencing semantics of shared memory systems. This is more realistic than the traditional shared memory approach of one flat address space, because hardware-specific data locality can be modeled in the partitioning of the address space.

Intel Fortran Compiler, also known as IFORT, is a group of Fortran compilers from Intel for Windows, OS X, and Linux.

Comparison of programming languages is a common topic of discussion among software engineers. Basic instructions of several programming languages are compared here.

OpenHMPP - programming standard for heterogeneous computing. Based on a set of compiler directives, standard is a programming model designed to handle hardware accelerators without the complexity associated with GPU programming. This approach based on directives has been implemented because they enable a loose relationship between an application code and the use of a hardware accelerator (HWA).

In computer programming, a subroutine is a sequence of program instructions that performs a specific task, packaged as a unit. This unit can then be used in programs wherever that particular task should be performed.

Absoft Fortran Compilers are set of Fortran compilers for Microsoft Windows, Apple Macintosh, and Linux produced by Absoft Corporation. The compilers are source code compatible across platforms.


  1. A. Shterenlikht, Fortran coarray library for 3D cellular automata microstructure simulation Archived 2016-03-04 at the Wayback Machine , (2013) In Proc. 7th PGAS conf, Eds. M. Weiland, A. Jackson, N. Johnson, Published by The University of Edinburgh, ISBN   978-0-9926615-0-2
  2. TS 18508 Additional Parallel Features in Fortran
  3. "CoArray Fortran 2.0".