In computer science, a programming language is said to have first-class functions if it treats functions as first-class citizens. This means the language supports passing functions as arguments to other functions, returning them as the values from other functions, and assigning them to variables or storing them in data structures. [1] Some programming language theorists require support for anonymous functions (function literals) as well. [2] In languages with first-class functions, the names of functions do not have any special status; they are treated like ordinary variables with a function type. [3] The term was coined by Christopher Strachey in the context of "functions as first-class citizens" in the mid-1960s. [4]
First-class functions are a necessity for the functional programming style, in which the use of higher-order functions is a standard practice. A simple example of a higher-ordered function is the map function, which takes, as its arguments, a function and a list, and returns the list formed by applying the function to each member of the list. For a language to support map, it must support passing a function as an argument.
There are certain implementation difficulties in passing functions as arguments or returning them as results, especially in the presence of non-local variables introduced in nested and anonymous functions. Historically, these were termed the funarg problems, the name coming from "function argument". [5] In early imperative languages these problems were avoided by either not supporting functions as result types (e.g. ALGOL 60, Pascal) or omitting nested functions and thus non-local variables (e.g. C). The early functional language Lisp took the approach of dynamic scoping, where non-local variables refer to the closest definition of that variable at the point where the function is executed, instead of where it was defined. Proper support for lexically scoped first-class functions was introduced in Scheme and requires handling references to functions as closures instead of bare function pointers, [4] which in turn makes garbage collection a necessity.
In this section, we compare how particular programming idioms are handled in a functional language with first-class functions (Haskell) compared to an imperative language where functions are second-class citizens (C).
In languages where functions are first-class citizens, functions can be passed as arguments to other functions in the same way as other values (a function taking another function as argument is called a higher-order function). In the language Haskell:
map::(a->b)->[a]->[b]mapf[]=[]mapf(x:xs)=fx:mapfxs
Languages where functions are not first-class often still allow one to write higher-order functions through the use of features such as function pointers or delegates. In the language C:
voidmap(int(*f)(int),intx[],size_tn){for(inti=0;i<n;i++)x[i]=f(x[i]);}
There are a number of differences between the two approaches that are not directly related to the support of first-class functions. The Haskell sample operates on lists, while the C sample operates on arrays. Both are the most natural compound data structures in the respective languages and making the C sample operate on linked lists would have made it unnecessarily complex. This also accounts for the fact that the C function needs an additional parameter (giving the size of the array.) The C function updates the array in-place, returning no value, whereas in Haskell data structures are persistent (a new list is returned while the old is left intact.) The Haskell sample uses recursion to traverse the list, while the C sample uses iteration. Again, this is the most natural way to express this function in both languages, but the Haskell sample could easily have been expressed in terms of a fold and the C sample in terms of recursion. Finally, the Haskell function has a polymorphic type, as this is not supported by C we have fixed all type variables to the type constant int
.
In languages supporting anonymous functions, we can pass such a function as an argument to a higher-order function:
main=map(\x->3*x+1)[1,2,3,4,5]
In a language which does not support anonymous functions, we have to bind it to a name instead:
intf(intx){return3*x+1;}intmain(){intlist[]={1,2,3,4,5};map(f,list,5);}
Once we have anonymous or nested functions, it becomes natural for them to refer to variables outside of their body (called non-local variables):
main=leta=3b=1inmap(\x->a*x+b)[1,2,3,4,5]
If functions are represented with bare function pointers, we can not know anymore how the value that is outside of the function's body should be passed to it, and because of that a closure needs to be built manually. Therefore we can not speak of "first-class" functions here.
typedefstruct{int(*f)(int,int,int);inta;intb;}closure_t;voidmap(closure_t*closure,intx[],size_tn){for(inti=0;i<n;++i)x[i]=(closure->f)(closure->a,closure->b,x[i]);}intf(inta,intb,intx){returna*x+b;}voidmain(){intl[]={1,2,3,4,5};inta=3;intb=1;closure_tclosure={f,a,b};map(&closure,l,5);}
Also note that the map
is now specialized to functions referring to two int
s outside of their environment. This can be set up more generally, but requires more boilerplate code. If f
would have been a nested function we would still have run into the same problem and this is the reason they are not supported in C. [6]
When returning a function, we are in fact returning its closure. In the C example any local variables captured by the closure will go out of scope once we return from the function that builds the closure. Forcing the closure at a later point will result in undefined behaviour, possibly corrupting the stack. This is known as the upwards funarg problem.
Assigning functions to variables and storing them inside (global) datastructures potentially suffers from the same difficulties as returning functions.
f::[[Integer]->[Integer]]f=leta=3b=1in[map(\x->a*x+b),map(\x->b*x+a)]
As one can test most literals and values for equality, it is natural to ask whether a programming language can support testing functions for equality. On further inspection, this question appears more difficult and one has to distinguish between several types of function equality: [7]
In type theory, the type of functions accepting values of type A and returning values of type B may be written as A → B or BA. In the Curry–Howard correspondence, function types are related to logical implication; lambda abstraction corresponds to discharging hypothetical assumptions and function application corresponds to the modus ponens inference rule. Besides the usual case of programming functions, type theory also uses first-class functions to model associative arrays and similar data structures.
In category-theoretical accounts of programming, the availability of first-class functions corresponds to the closed category assumption. For instance, the simply typed lambda calculus corresponds to the internal language of Cartesian closed categories.
Functional programming languages, such as Erlang, Scheme, ML, Haskell, F#, and Scala, all have first-class functions. When Lisp, one of the earliest functional languages, was designed, not all aspects of first-class functions were then properly understood, resulting in functions being dynamically scoped. The later Scheme and Common Lisp dialects do have lexically scoped first-class functions.
Many scripting languages, including Perl, Python, PHP, Lua, Tcl/Tk, JavaScript and Io, have first-class functions.
For imperative languages, a distinction has to be made between Algol and its descendants such as Pascal, the traditional C family, and the modern garbage-collected variants. The Algol family has allowed nested functions and higher-order taking function as arguments, but not higher-order functions that return functions as results (except Algol 68, which allows this). The reason for this was that it was not known how to deal with non-local variables if a nested-function was returned as a result (and Algol 68 produces runtime errors in such cases).
The C family allowed both passing functions as arguments and returning them as results, but avoided any problems by not supporting nested functions. (The gcc compiler allows them as an extension.) As the usefulness of returning functions primarily lies in the ability to return nested functions that have captured non-local variables, instead of top-level functions, these languages are generally not considered to have first-class functions.
Modern imperative languages often support garbage-collection making the implementation of first-class functions feasible. First-class functions have often only been supported in later revisions of the language, including C# 2.0 and Apple's Blocks extension to C, C++, and Objective-C. C++11 has added support for anonymous functions and closures to the language, but because of the non-garbage collected nature of the language, special care has to be taken for non-local variables in functions to be returned as results (see below).
Language | Higher-order functions | Nested functions | Non-local variables | Notes | ||||
---|---|---|---|---|---|---|---|---|
Arguments | Results | Named | Anonymous | Closures | Partial application | |||
Algol family | ALGOL 60 | Yes | No | Yes | No | Downwards | No | Have function types. |
ALGOL 68 | Yes | Yes [8] | Yes | Yes | Downwards [9] | No | ||
Pascal | Yes | No | Yes | No | Downwards | No | ||
Ada | Yes | No | Yes | No | Downwards | No | ||
Oberon | Yes | Non-nested only | Yes | No | Downwards | No | ||
Delphi | Yes | Yes | Yes | 2009 | 2009 | No | ||
C family | C | Yes | Yes | Yes in GNU C | Yes in Clang(Blocks) | Yes in Clang(Blocks) | No | Has function pointers. |
C++ | Yes | Yes | C++11 [10] | C++11 [11] | C++11 [11] | C++11 | Has function pointers, function objects. (Also, see below.) Explicit partial application possible with | |
C# | Yes | Yes | 7 | 2.0 / 3.0 | 2.0 | 3.0 | Has delegates (2.0) and lambda expressions (3.0). | |
Objective-C | Yes | Yes | Using anonymous | 2.0 + Blocks [12] | 2.0 + Blocks | No | Has function pointers. | |
Java | Yes | Yes | Using anonymous | Java 8 | Java 8 | Yes | Has anonymous inner classes. | |
Go | Yes | Yes | Using anonymous | Yes | Yes | Yes [13] | ||
Limbo | Yes | Yes | Yes | Yes | Yes | No | ||
Newsqueak | Yes | Yes | Yes | Yes | Yes | No | ||
Rust | Yes | Yes | Yes | Yes | Yes | Yes [14] | ||
Functional languages | Lisp | Syntax | Syntax | Yes | Yes | Common Lisp | No | (see below) |
Scheme | Yes | Yes | Yes | Yes | Yes | SRFI 26 [15] | ||
Julia | Yes | Yes | Yes | Yes | Yes | Yes | ||
Clojure | Yes | Yes | Yes | Yes | Yes | Yes | ||
ML | Yes | Yes | Yes | Yes | Yes | Yes | ||
Haskell | Yes | Yes | Yes | Yes | Yes | Yes | ||
jq | Yes | No | Yes | Expressions only | Downwards | No | ||
Scala | Yes | Yes | Yes | Yes | Yes | Yes | ||
Erlang | Yes | Yes | Yes | Yes | Yes | Yes | ||
Elixir | Yes | Yes | Yes | Yes | Yes | Yes | ||
F# | Yes | Yes | Yes | Yes | Yes | Yes | ||
OCaml | Yes | Yes | Yes | Yes | Yes | Yes | ||
Scripting languages | Io | Yes | Yes | Yes | Yes | Yes | No | |
JavaScript | Yes | Yes | Yes | Yes | Yes | ECMAScript 5 | Partial application possible with user-land code on ES3 [16] | |
Lua | Yes | Yes | Yes | Yes | Yes | Yes [17] | ||
PHP | Yes | Yes | Using anonymous | 5.3 | 5.3 | No | Partial application possible with user-land code. | |
Perl | Yes | Yes | 6 | Yes | Yes | 6 [18] | ||
Python | Yes | Yes | Yes | Expressions only | Yes | 2.5 [19] | (see below) | |
Ruby | Syntax | Syntax | Unscoped | Yes | Yes | 1.9 | (see below) | |
Other languages | Fortran | Yes | Yes | Yes | No | No | No | |
Maple | Yes | Yes | Yes | Yes | Yes | No | ||
Mathematica | Yes | Yes | Yes | Yes | Yes | No | ||
MATLAB | Yes | Yes | Yes | Yes [20] | Yes | Yes | Partial application possible by automatic generation of new functions. [21] | |
Smalltalk | Yes | Yes | Yes | Yes | Yes | Partial | Partial application possible through library. | |
Swift | Yes | Yes | Yes | Yes | Yes | Yes |
function
must be used to retrieve the function as a value: (function foo)
evaluates to a function object. #'foo
exists as a shorthand notation. To apply such a function object, one must use the funcall
function: (funcall #'foo bar baz)
. functools.partial
since version 2.5, and operator.methodcaller
since version 2.6.Method
or Proc
object to be used as first-class data. The syntax for calling such a function object differs from calling regular methods.
.{{cite journal}}
: CS1 maint: bot: original URL status unknown (link) (also on 2010-02-16Common Lisp (CL) is a dialect of the Lisp programming language, published in American National Standards Institute (ANSI) standard document ANSI INCITS 226-1994 (S2018). The Common Lisp HyperSpec, a hyperlinked HTML version, has been derived from the ANSI Common Lisp standard.
In computer science, functional programming is a programming paradigm where programs are constructed by applying and composing functions. It is a declarative programming paradigm in which function definitions are trees of expressions that map values to other values, rather than a sequence of imperative statements which update the running state of the program.
Scheme is a dialect of the Lisp family of programming languages. Scheme was created during the 1970s at the MIT Computer Science and Artificial Intelligence Laboratory and released by its developers, Guy L. Steele and Gerald Jay Sussman, via a series of memos now known as the Lambda Papers. It was the first dialect of Lisp to choose lexical scope and the first to require implementations to perform tail-call optimization, giving stronger support for functional programming and associated techniques such as recursive algorithms. It was also one of the first programming languages to support first-class continuations. It had a significant influence on the effort that led to the development of Common Lisp.
In computer programming, the scope of a name binding is the part of a program where the name binding is valid; that is, where the name can be used to refer to the entity. In other parts of the program, the name may refer to a different entity, or to nothing at all. Scope helps prevent name collisions by allowing the same name to refer to different objects – as long as the names have separate scopes. The scope of a name binding is also known as the visibility of an entity, particularly in older or more technical literature—this is in relation to the referenced entity, not the referencing name.
In programming languages, a closure, also lexical closure or function closure, is a technique for implementing lexically scoped name binding in a language with first-class functions. Operationally, a closure is a record storing a function together with an environment. The environment is a mapping associating each free variable of the function with the value or reference to which the name was bound when the closure was created. Unlike a plain function, a closure allows the function to access those captured variables through the closure's copies of their values or references, even when the function is invoked outside their scope.
In combinatory logic for computer science, a fixed-point combinator, is a higher-order function that returns some fixed point of its argument function, if one exists.
In mathematics and computer science, a higher-order function (HOF) is a function that does at least one of the following:
In computer programming, an assignment statement sets and/or re-sets the value stored in the storage location(s) denoted by a variable name; in other words, it copies a value into the variable. In most imperative programming languages, the assignment statement is a fundamental construct.
In computer science, conditionals are programming language commands for handling decisions. Specifically, conditionals perform different computations or actions depending on whether a programmer-defined Boolean condition evaluates to true or false. In terms of control flow, the decision is always achieved by selectively altering the control flow based on some condition . Although dynamic dispatch is not usually classified as a conditional construct, it is another way to select between alternatives at runtime. Conditional statements are the checkpoints in the programme that determines behaviour according to situation.
In computer programming, a function object is a construct allowing an object to be invoked or called as if it were an ordinary function, usually with the same syntax. In some languages, particularly C++, function objects are often called functors.
In computer programming, a thunk is a subroutine used to inject a calculation into another subroutine. Thunks are primarily used to delay a calculation until its result is needed, or to insert operations at the beginning or end of the other subroutine. They have many other applications in compiler code generation and modular programming.
In computer science, the funarg problem(function argument problem) refers to the difficulty in implementing first-class functions in programming language implementations so as to use stack-based memory allocation of the functions.
In computer science, a relational operator is a programming language construct or operator that tests or defines some kind of relation between two entities. These include numerical equality and inequalities.
In computer programming, a nested function is a named function that is defined within another, enclosing, block and is lexically scoped within the enclosing block – meaning it is only callable by name within the body of the enclosing block and can use identifiers declared in outer blocks, including outer functions. The enclosing block is typically, but not always, another function.
In computing, a procedural parameter is a parameter of a procedure that is itself a procedure.
In many programming languages, map is a higher-order function that applies a given function to each element of a collection, e.g. a list or set, returning the results in a collection of the same type. It is often called apply-to-all when considered in functional form.
In computer programming, an anonymous function is a function definition that is not bound to an identifier. Anonymous functions are often arguments being passed to higher-order functions or used for constructing the result of a higher-order function that needs to return a function. If the function is only used once, or a limited number of times, an anonymous function may be syntactically lighter than using a named function. Anonymous functions are ubiquitous in functional programming languages and other languages with first-class functions, where they fulfil the same role for the function type as literals do for other data types.
This article compares a large number of programming languages by tabulating their data types, their expression, statement, and declaration syntax, and some common operating-system interfaces.
In computer programming, a function, subprogram, procedure, method, routine or subroutine is a callable unit that has a well-defined behavior and can be invoked by other software units to exhibit that behavior.
In programming language theory, a non-local variable is a variable that is not defined in the local scope. While the term can refer to global variables, it is primarily used in the context of nested and anonymous functions where some variables can be in neither the local nor the global scope.