Chapel (programming language)

Chapel
Paradigm: object-oriented, partitioned global address space, parallel programming
Designed by: David Callahan, Hans Zima, Brad Chamberlain, John Plevyak
Developer: Hewlett Packard Enterprise (previously Cray Inc.)
First appeared: 2009
Stable release: 2.3.0 / December 12, 2024
Typing discipline: static, inferred
Platform: multiplatform, including Amazon Web Services and HPE Cray EX
OS: macOS, Linux, POSIX, Windows (with Cygwin), NetBSD
License: Apache License 2.0
Filename extensions: .chpl
Website: chapel-lang.org
Influenced by: Ada, C#,[1] C, C++, Fortran, Java, HPF, ZPL, and the Cray MTA/XMT extensions to C and Fortran.[2]

Chapel, the Cascade High Productivity Language, is a parallel programming language that was developed by Cray,[3] and later by Hewlett Packard Enterprise, which acquired Cray. It was developed as part of the Cray Cascade project, a participant in DARPA's High Productivity Computing Systems (HPCS) program, which had the goal of increasing supercomputer productivity by 2010. It is developed as an open-source project under version 2.0 of the Apache License.[4]


The Chapel compiler is written in C and C++ (C++14). By default it compiles to binary executables using LLVM (written in C++) as its backend optimizer and code generator; it can instead generate C code, in which case LLVM is not used. Python 3.7 or newer is required for some optional components, such as Chapel's test system and c2chapel, a tool that generates Chapel bindings from C files. Chapel code can also be compiled into libraries callable from C, Fortran, or Python. Chapel supports GPU programming through code generation for NVIDIA and AMD graphics processing units.[5]
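A minimal sketch of this workflow, assuming the `chpl` compiler driver is installed; the file name and flags in the comments are illustrative defaults rather than requirements:

```chapel
// hello.chpl -- a minimal Chapel program.
//
// Typical builds (illustrative):
//   chpl hello.chpl -o hello      # default path: compile via the LLVM backend
//   chpl --savec gen hello.chpl   # also keep the intermediate C code
writeln("Hello, world!");
```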

Goals

Chapel aims to improve the programmability of parallel computers in general and the Cascade system in particular, by providing a higher level of expression than current programming languages do and by improving the separation between algorithmic expression and data structure implementation details.

The language designers aspire for Chapel to bridge the gap between current high-performance computing (HPC) practitioners, whom they describe as Fortran, C, or C++ users writing procedural code with technologies such as OpenMP and MPI, and newly graduating programmers who tend to prefer Java, Python, or MATLAB, only some of whom have experience with C++ or C. Chapel should offer the productivity advances of the latter languages without alienating users of the former.[2]

Features

Chapel supports a multithreaded parallel programming model at a high level by providing abstractions for data parallelism, task parallelism, and nested parallelism. It enables optimizations for the locality of data and computation via abstractions for data distribution and data-driven placement of subcomputations. It allows for code reuse and generality through object-oriented concepts and generic programming features. For instance, Chapel allows locales, its abstraction for the units of the target architecture (such as compute nodes), to be declared and targeted directly.[6]
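As a sketch of these abstractions (the loop bound `n` and the messages are arbitrary), the fragment below uses a `forall` loop for data parallelism, a `cobegin` block for task parallelism, and the built-in `Locales` array with `on` clauses for locality:

```chapel
// Data parallelism: iterations of a forall loop may run concurrently.
config const n = 4;              // overridable at run time: ./a.out --n=8
forall i in 1..n do
  writeln("data-parallel iteration ", i);

// Task parallelism: each statement in a cobegin runs as a separate task.
cobegin {
  writeln("task A");
  writeln("task B");
}

// Locality: Locales describes the compute nodes the program runs on;
// `on` moves execution to a given locale.
coforall loc in Locales do
  on loc do
    writeln("hello from locale ", loc.id);
```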

While Chapel borrows concepts from many preceding languages, its parallel concepts are most closely based on ideas from High Performance Fortran (HPF), ZPL, and the Cray MTA's extensions to Fortran and C.

Notes

  1. "Chapel spec (Acknowledgments)" (PDF). Cray Inc. 2015-10-01. Retrieved 2016-01-14.
  2. Chamberlain, Bradford L. "A Brief Overview of Chapel" (PDF). Cray Inc. Retrieved 22 April 2015.
  3. Lightfoot, David E. (2006). Modular programming languages: 7th Joint Modular Languages Conference. Springer. p. 20. ISBN 978-3-540-40927-4.
  4. "Chapel license information". chapel-lang.org. Retrieved November 15, 2015.
  5. "Chapel Technote: GPU Programming". Hewlett Packard Enterprise. 2023-09-28. Retrieved 2023-11-03.
  6. Bongen Gu; Weikuan Yu; Yoonsik Kwak (June 28–30, 2011). "Communication and Computation Overlap through Task Synchronization in Multi-locale Chapel Environment". In James J. Park; Laurence T. Yang; Changhoon Lee (eds.). Future Information Technology, Part I: 6th International Conference. Loutraki, Greece: Springer-Verlag. pp. 285–292. doi:10.1007/978-3-642-22333-4_37. ISBN 978-3-642-22332-7.

Related Research Articles

Cray Inc., a subsidiary of Hewlett Packard Enterprise, is an American supercomputer manufacturer headquartered in Seattle, Washington. It also manufactures systems for data storage and analytics. Several Cray supercomputer systems are listed in the TOP500, which ranks the most powerful supercomputers in the world.


OpenMP is an application programming interface (API) that supports multi-platform shared-memory multiprocessing programming in C, C++, and Fortran, on many platforms, instruction-set architectures and operating systems, including Solaris, AIX, FreeBSD, HP-UX, Linux, macOS, and Windows. It consists of a set of compiler directives, library routines, and environment variables that influence run-time behavior.


LLVM, also called LLVM Core, is a target-independent optimizer and code generator. It can be used to develop a frontend for any programming language and a backend for any instruction set architecture. LLVM is designed around a language-independent intermediate representation (IR) that serves as a portable, high-level assembly language that can be optimized with a variety of transformations over multiple passes. The name LLVM originally stood for Low Level Virtual Machine. However, the project has since expanded, and the name is no longer an acronym but an orphan initialism.

Fortress is a discontinued experimental programming language for high-performance computing, created by Sun Microsystems with funding from DARPA's High Productivity Computing Systems project. One of the language designers was Guy L. Steele Jr., whose previous work includes Scheme, Common Lisp, and Java.

In computing, a parallel programming model is an abstraction of parallel computer architecture, with which it is convenient to express algorithms and their composition in programs. The value of a programming model can be judged on its generality: how well a range of different problems can be expressed for a variety of different architectures, and its performance: how efficiently the compiled programs can execute. The implementation of a parallel programming model can take the form of a library invoked from a programming language, an extension to an existing language, or an entirely new language.

Automatic parallelization, also auto parallelization, or autoparallelization refers to converting sequential code into multi-threaded and/or vectorized code in order to use multiple processors simultaneously in a shared-memory multiprocessor (SMP) machine. Fully automatic parallelization of sequential programs is a challenge because it requires complex program analysis and the best approach may depend upon parameter values that are not known at compilation time.

VTune Profiler is a performance analysis tool for x86-based machines running Linux or Microsoft Windows operating systems. Many features work on both Intel and AMD hardware, but the advanced hardware-based sampling features require an Intel-manufactured CPU.


In computing, CUDA is a proprietary parallel computing platform and application programming interface (API) that allows software to use certain types of graphics processing units (GPUs) for accelerated general-purpose processing, an approach called general-purpose computing on GPUs. CUDA was created by Nvidia in 2006. When it was first introduced, the name was an acronym for Compute Unified Device Architecture, but Nvidia later dropped the common use of the acronym and now rarely expands it.

In computer science, partitioned global address space (PGAS) is a parallel programming model paradigm. PGAS is typified by communication operations involving a global memory address space abstraction that is logically partitioned, where a portion is local to each process, thread, or processing element. The novelty of PGAS is that the portions of the shared memory space may have an affinity for a particular process, thereby exploiting locality of reference in order to improve performance. A PGAS memory model is featured in various parallel programming languages and libraries, including: Coarray Fortran, Unified Parallel C, Split-C, Fortress, Chapel, X10, UPC++, Coarray C++, Global Arrays, DASH and SHMEM. The PGAS paradigm is now an integrated part of the Fortran language, as of Fortran 2008 which standardized coarrays.
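As a sketch of the PGAS style in Chapel (the array size is arbitrary, and the `blockDist.createDomain` API follows recent Chapel releases), a block-distributed array spreads its elements across locales while remaining accessible through a single global index space:

```chapel
use BlockDist;

// A domain whose indices are block-distributed across all locales.
const D = blockDist.createDomain({1..100});
var A: [D] real;

// Each element is written by the locale that owns it; `here` names
// the locale executing the current iteration.
forall i in D do
  A[i] = here.id;
```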


Data parallelism is parallelization across multiple processors in parallel computing environments. It focuses on distributing the data across different nodes, which operate on the data in parallel. It can be applied to regular data structures like arrays and matrices by working on each element in parallel. It contrasts with task parallelism as another form of parallelism.
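A minimal Chapel sketch of the idea (array size arbitrary): each element is updated independently, and a parallel reduction then combines the results:

```chapel
var A: [1..8] real;

// Independent per-element work: iterations may execute in parallel.
forall i in A.domain do
  A[i] = i * i;

// A parallel reduction over the array.
writeln("sum of squares = ", + reduce A);
```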


PGI was a company that produced a set of commercially available Fortran, C and C++ compilers for high-performance computing systems. On July 29, 2013, Nvidia acquired The Portland Group, Inc. As of August 5, 2020, the "PGI Compilers and Tools" technology is a part of the Nvidia HPC SDK product available as a free download from Nvidia.

In computing, algorithmic skeletons, or parallelism patterns, are a high-level parallel programming model for parallel and distributed computing.

OpenACC is a programming standard for parallel computing developed by Cray, CAPS, Nvidia and PGI. The standard is designed to simplify parallel programming of heterogeneous CPU/GPU systems.

C++ Accelerated Massive Parallelism is a native programming model that contains elements that span the C++ programming language and its runtime library. It provides an easy way to write programs that compile and execute on data-parallel hardware, such as graphics cards (GPUs).

In computing, a memory access pattern or IO access pattern is the pattern with which a system or program reads and writes memory on secondary storage. These patterns differ in the level of locality of reference and drastically affect cache performance, and also have implications for the approach to parallelism and distribution of workload in shared memory systems. Further, cache coherency issues can affect multiprocessor performance, which means that certain memory access patterns place a ceiling on parallelism.


SYCL is a higher-level programming model to improve programming productivity on various hardware accelerators. It is a single-source embedded domain-specific language (eDSL) based on pure C++17. It is a standard developed by Khronos Group, announced in March 2014.


ROCm is an Advanced Micro Devices (AMD) software stack for graphics processing unit (GPU) programming. ROCm spans several domains: general-purpose computing on graphics processing units (GPGPU), high performance computing (HPC), and heterogeneous computing. It offers several programming models: HIP, OpenMP, and OpenCL.


Performance portability refers to the ability of computer programs and applications to operate effectively across different platforms. Developers of performance portable applications seek to support multiple platforms without impeding performance, and ideally while minimizing platform-specific code.

The Center for Supercomputing Research and Development (CSRD) at the University of Illinois (UIUC) was a research center funded from 1984 to 1993. It built the shared memory Cedar computer system, which included four hardware multiprocessor clusters, as well as parallel system and applications software. It was distinguished from the four earlier UIUC Illiac systems by starting with commercial shared memory subsystems that were based on an earlier paper published by the CSRD founders. Thus CSRD was able to avoid many of the hardware design issues that slowed the Illiac series work. Over its 9 years of major funding, plus follow-on work by many of its participants, CSRD pioneered many of the shared memory architectural and software technologies upon which all 21st century computation is based.
