Futhark (programming language)

Last updated
Futhark
Paradigm Array, functional
Family ML
Designed by Troels Henriksen, Cosmin Oancea, Martin Elsman
Developer University of Copenhagen [1]
First appeared2014;10 years ago (2014)
Typing discipline Inferred, static, strong
OS Cross-platform
License ISC
Website futhark-lang.org
Influenced by
APL, Haskell, NESL, Standard ML

Futhark is a functional data parallel array programming language originally developed at UCPH Department of Computer Science (DIKU) as part of the HIPERFIT project. [2] It focuses on enabling data parallel programs written in a functional style to be executed with high performance on massively parallel hardware, in particular on graphics processing units (GPUs). Futhark is strongly inspired by NESL, and its implementation uses a variant of the flattening transformation, but imposes constraints on how parallelism can be expressed in order to enable more aggressive compiler optimisations. In particular, irregular nested data parallelism is not supported. [3]

Contents

Overview

Futhark is a language in the ML family, with an indentation-insensitive syntax derived from OCaml, Standard ML, and Haskell. The type system is based on Hindley-Milner with a variety of extensions, such as uniqueness types and size-dependent types. Futhark is not intended as a general-purpose programming language for writing full applications, but is instead focused on writing computational "kernels" (not necessarily the same as a GPU kernel) which are then invoked from applications written in conventional languages. [4]

Examples

Dot product

The following program computes the dot product of two vectors containing double-precision numbers.

defdotprodxsys=f64.sum(map2(*)xsys))

It can also be equivalently written with explicit type annotations as follows.

defdotprod[n](xs:[n]f64)(ys:[n]f64):f64=f64.sum(map2(*)xsys))

This makes the size-dependent types explicit: this function can only be invoked with two arrays of the same size, and the type checker will reject any program where this cannot be statically determined.

Matrix multiplication

The following program performs matrix multiplication, using the definition of dot product above.

defmatmul[n][m][p](A:[n][m]f64)(B:[m][p]f64):[n][p]f64=map(\A_row->map(\B_col->dotprodA_rowB_col)(transposeB))A

Note how the types enforce that the function is only invoked with matrices of compatible size. Further, this is an example of nested data parallelism.

Related Research Articles

<span class="mw-page-title-main">Merge sort</span> Divide and conquer sorting algorithm

In computer science, merge sort is an efficient, general-purpose, and comparison-based sorting algorithm. Most implementations produce a stable sort, which means that the relative order of equal elements is the same in the input and output. Merge sort is a divide-and-conquer algorithm that was invented by John von Neumann in 1945. A detailed description and analysis of bottom-up merge sort appeared in a report by Goldstine and von Neumann as early as 1948.

In programming languages, a closure, also lexical closure or function closure, is a technique for implementing lexically scoped name binding in a language with first-class functions. Operationally, a closure is a record storing a function together with an environment. The environment is a mapping associating each free variable of the function with the value or reference to which the name was bound when the closure was created. Unlike a plain function, a closure allows the function to access those captured variables through the closure's copies of their values or references, even when the function is invoked outside their scope.

<span class="mw-page-title-main">Miranda (programming language)</span>

Miranda is a lazy, purely functional programming language designed by David Turner as a successor to his earlier programming languages SASL and KRC, using some concepts from ML and Hope. It was produced by Research Software Ltd. of England and was the first purely functional language to be commercially supported.

Standard ML (SML) is a general-purpose, modular, functional programming language with compile-time type checking and type inference. It is popular for writing compilers, for programming language research, and for developing theorem provers.

Flynn's taxonomy is a classification of computer architectures, proposed by Michael J. Flynn in 1966 and extended in 1972. The classification system has stuck, and it has been used as a tool in the design of modern processors and their functionalities. Since the rise of multiprocessing central processing units (CPUs), a multiprogramming context has evolved as an extension of the classification system. Vector processing, covered by Duncan's taxonomy, is missing from Flynn's work because the Cray-1 was released in 1977: Flynn's second paper was published in 1972.

<span class="mw-page-title-main">F Sharp (programming language)</span> Microsoft programming language

F# is a general-purpose, strongly typed, multi-paradigm programming language that encompasses functional, imperative, and object-oriented programming methods. It is most often used as a cross-platform Common Language Infrastructure (CLI) language on .NET, but can also generate JavaScript and graphics processing unit (GPU) code.

Curry is a declarative programming language, an implementation of the functional logic programming paradigm, and based on the Haskell language. It merges elements of functional and logic programming, including constraint programming integration.

In computing, a persistent data structure or not ephemeral data structure is a data structure that always preserves the previous version of itself when it is modified. Such data structures are effectively immutable, as their operations do not (visibly) update the structure in-place, but instead always yield a new updated structure. The term was introduced in Driscoll, Sarnak, Sleator, and Tarjan's 1986 article.

In computer science, a programming language is said to have first-class functions if it treats functions as first-class citizens. This means the language supports passing functions as arguments to other functions, returning them as the values from other functions, and assigning them to variables or storing them in data structures. Some programming language theorists require support for anonymous functions as well. In languages with first-class functions, the names of functions do not have any special status; they are treated like ordinary variables with a function type. The term was coined by Christopher Strachey in the context of "functions as first-class citizens" in the mid-1960s.

General-purpose computing on graphics processing units is the use of a graphics processing unit (GPU), which typically handles computation only for computer graphics, to perform computation in applications traditionally handled by the central processing unit (CPU). The use of multiple video cards in one computer, or large numbers of graphics chips, further parallelizes the already parallel nature of graphics processing.

In computing, a parallel programming model is an abstraction of parallel computer architecture, with which it is convenient to express algorithms and their composition in programs. The value of a programming model can be judged on its generality: how well a range of different problems can be expressed for a variety of different architectures, and its performance: how efficiently the compiled programs can execute. The implementation of a parallel programming model can take the form of a library invoked from a sequential language, as an extension to an existing language, or as an entirely new language.

In computer science, stream processing is a programming paradigm which views streams, or sequences of events in time, as the central input and output objects of computation. Stream processing encompasses dataflow programming, reactive programming, and distributed data processing. Stream processing systems aim to expose parallel processing for data streams and rely on streaming algorithms for efficient implementation. The software stack for these systems includes components such as programming models and query languages, for expressing computation; stream management systems, for distribution and scheduling; and hardware components for acceleration including floating-point units, graphics processing units, and field-programmable gate arrays.

<span class="mw-page-title-main">Data parallelism</span> Parallelization across multiple processors in parallel computing environments

Data parallelism is parallelization across multiple processors in parallel computing environments. It focuses on distributing the data across different nodes, which operate on the data in parallel. It can be applied on regular data structures like arrays and matrices by working on each element in parallel. It contrasts to task parallelism as another form of parallelism.

<span class="mw-page-title-main">OpenCL</span> Open standard for programming heterogenous computing systems, such as CPUs or GPUs

OpenCL is a framework for writing programs that execute across heterogeneous platforms consisting of central processing units (CPUs), graphics processing units (GPUs), digital signal processors (DSPs), field-programmable gate arrays (FPGAs) and other processors or hardware accelerators. OpenCL specifies programming languages for programming these devices and application programming interfaces (APIs) to control the platform and execute programs on the compute devices. OpenCL provides a standard interface for parallel computing using task- and data-based parallelism.

In computing, algorithmic skeletons, or parallelism patterns, are a high-level parallel programming model for parallel and distributed computing.

This article describes the features in the programming language Haskell.

C++ Accelerated Massive Parallelism is a native programming model that contains elements that span the C++ programming language and its runtime library. It provides an easy way to write programs that compile and execute on data-parallel hardware, such as graphics cards (GPUs).

SequenceL is a general purpose functional programming language and auto-parallelizing compiler and tool set, whose primary design objectives are performance on multi-core processor hardware, ease of programming, platform portability/optimization, and code clarity and readability. Its main advantage is that it can be used to write straightforward code that automatically takes full advantage of all the processing power available, without programmers needing to be concerned with identifying parallelisms, specifying vectorization, avoiding race conditions, and other challenges of manual directive-based programming approaches such as OpenMP.

A conc-tree is a data structure that stores element sequences, and provides amortized O(1) time append and prepend operations, O(log n) time insert and remove operations and O(log n) time concatenation. This data structure is particularly viable for functional task-parallel and data-parallel programming, and is relatively simple to implement compared to other data-structures with similar asymptotic complexity. Conc-trees were designed to improve efficiency of data-parallel operations that do not require sequential left-to-right iteration order, and improve constant factors in these operations by avoiding unnecessary copies of the data. Orthogonally, they are used to efficiently aggregate data in functional-style task-parallel algorithms, as an implementation of the conc-list data abstraction. Conc-list is a parallel programming counterpart to functional cons-lists, and was originally introduced by the Fortress language.

Flix is a functional, imperative, and logic programming language developed at Aarhus University, with funding from the Independent Research Fund Denmark, and by a community of open source contributors. The Flix language supports algebraic data types, pattern matching, parametric polymorphism, currying, higher-order functions, extensible records, channel and process-based concurrency, and tail call elimination. Two notable features of Flix are its type and effect system and its support for first-class Datalog constraints.

References

  1. "License". futhark-lang.org. Retrieved 2023-03-26. Developed at DIKU
  2. "Home". hiperfit.dk.
  3. Henriksen, Troels; Serup, Niels G. W.; Elsman, Martin; Henglein, Fritz; Oancea, Cosmin (2017). "Futhark: Purely Functional GPU-Programming with Nested Parallelism and In-Place Array Updates" (PDF). Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation. PLDI 2017. ACM.
  4. "Futhark User's Guide". futhark.readthedocs.io.