Pure function

Last updated

In computer programming, a pure function is a function that has the following properties: [1] [2]

Contents

  1. the function return values are identical for identical arguments (no variation with local static variables, non-local variables, mutable reference arguments or input streams, i.e., referential transparency), and
  2. the function has no side effects (no mutation of local static variables, non-local variables, mutable reference arguments or input/output streams).

Some authors, particularly from the imperative language community, use the term "pure" for all functions that just have the above property 2 (discussed in § Compiler optimizations below). [3]

Examples

Pure functions

The following examples of C++ functions are pure:

  • floor, returning the floor of a number;
  • max, returning the maximum of two values.
  • the function f, defined as
    voidf(){staticstd::atomic<unsignedint>x=0;++x;}
    The value of x can be only observed inside other invocations of f(), and as f() does not communicate the value of x to its environment, it is indistinguishable from function void f() {} that does nothing. Note that x is std::atomic so that modifications from multiple threads executing f() concurrently do not result in a data race, which has undefined behavior in C and C++.

Impure functions

The following C++ functions are impure as they lack the above property 1:

  • because of return value variation with a static variable
    intf(){staticintx=0;++x;returnx;}
  • because of return value variation with a non-local variable
    intf(){returnx;}
    For the same reason, e.g. the C++ library function sin() is not pure, since its result depends on the IEEE rounding mode which can be changed at runtime.
  • because of return value variation with a mutable reference argument
    intf(int*x){return*x;}
  • because of return value variation with an input stream
    intf(){intx=0;std::cin>>x;returnx;}

The following C++ functions are impure as they lack the above property 2:

  • because of mutation of a local static variable
    voidf(){staticintx=0;++x;}
  • because of mutation of a non-local variable
    voidf(){++x;}
  • because of mutation of a mutable reference argument
    voidf(int*x){++*x;}
  • because of mutation of an output stream
    voidf(){std::cout<<"Hello, world!"<<std::endl;}

The following C++ functions are impure as they lack both the above properties 1 and 2:

  • because of return value variation with a local static variable and mutation of a local static variable
    intf(){staticintx=0;++x;returnx;}
  • because of return value variation with an input stream and mutation of an input stream
    intf(){intx=0;std::cin>>x;returnx;}

I/O in pure functions

I/O is inherently impure: input operations undermine referential transparency, and output operations create side effects. Nevertheless, there is a sense in which a function can perform input or output and still be pure, if the sequence of operations on the relevant I/O devices is modeled explicitly as both an argument and a result, and I/O operations are taken to fail when the input sequence does not describe the operations actually taken since the program began execution.[ clarification needed ]

The second point ensures that the only sequence usable as an argument must change with each I/O action; the first allows different calls to an I/O-performing function to return different results on account of the sequence arguments having changed. [4] [5]

The I/O monad is a programming idiom typically used to perform I/O in pure functional languages.

Memoization

staticintfact(intn){returnn<=1?1:fact(n-1)*n;}intfact_wrapper(intn){staticintcache[13];assert(0<=n&&n<13);if(cache[n]==0)cache[n]=fact(n);returncache[n];}
C program for cached computation of factorial [6]

The outputs of a pure function can be precomputed and cached in a look-up table. In a technique called memoization, any result that is returned from a given function is cached, and the next time the function is called with the same input parameters, the cached result is returned instead of computing the function again.

Memoization can be performed by wrapping the function in another function (wrapper function). [7]

By means of memoization, the computational effort involved in the computations of the function itself can be reduced, at the cost of the overhead for managing the cache and an increase of memory requirements.

Compiler optimizations

Functions that have just the above property 2 – that is, have no side efects – allow for compiler optimization techniques such as common subexpression elimination and loop optimization similar to arithmetic operators. [8] A C++ example is the length method, returning the size of a string, which depends on the memory contents where the string points to, therefore lacking the above property 1. Nevertheless, in a single-threaded environment, the following C++ code

std::strings="Hello, world!";inta[10]={1,2,3,4,5,6,7,8,9,10};intl=0;for(inti=0;i<10;++i){l+=s.length()+a[i];}

can be optimized such that the value of s.length() is computed only once, before the loop.

Some programming languages allow for declaring a pure property to a function:

Unit testing

Since pure functions have identical return values for identical arguments, they are well suited to unit testing.

See also

Related Research Articles

Templates are a feature of the C++ programming language that allows functions and classes to operate with generic types. This allows a function or class declaration to reference via a generic variable another different class without creating full declaration for each of these different classes.

In object-oriented (OO) and functional programming, an immutable object is an object whose state cannot be modified after it is created. This is in contrast to a mutable object, which can be modified after it is created. In some cases, an object is considered immutable even if some internally used attributes change, but the object's state appears unchanging from an external point of view. For example, an object that uses memoization to cache the results of expensive computations could still be considered an immutable object.

Template metaprogramming (TMP) is a metaprogramming technique in which templates are used by a compiler to generate temporary source code, which is merged by the compiler with the rest of the source code and then compiled. The output of these templates can include compile-time constants, data structures, and complete functions. The use of templates can be thought of as compile-time polymorphism. The technique is used by a number of languages, the best-known being C++, but also Curl, D, Nim, and XL.

<span class="mw-page-title-main">D (programming language)</span> Multi-paradigm system programming language

D, also known as dlang, is a multi-paradigm system programming language created by Walter Bright at Digital Mars and released in 2001. Andrei Alexandrescu joined the design and development effort in 2007. Though it originated as a re-engineering of C++, D is now a very different language drawing inspiration from other high-level programming languages, notably Java, Python, Ruby, C#, and Eiffel.

In mathematics and computer science, a higher-order function (HOF) is a function that does at least one of the following:

In computer programming, a parameter or a formal argument is a special kind of variable used in a subroutine to refer to one of the pieces of data provided as input to the subroutine. These pieces of data are the values of the arguments with which the subroutine is going to be called/invoked. An ordered list of parameters is usually included in the definition of a subroutine, so that, each time the subroutine is called, its arguments for that call are evaluated, and the resulting values can be assigned to the corresponding parameters.

In computer programming, a default argument is an argument to a function that a programmer is not required to specify. In most programming languages, functions may take one or more arguments. Usually, each argument must be specified in full. Later languages allow the programmer to specify default arguments that always have a value, even if one is not specified when calling the function.

In compiler construction, name mangling is a technique used to solve various problems caused by the need to resolve unique names for programming entities in many modern programming languages.

In mathematics and in computer programming, a variadic function is a function of indefinite arity, i.e., one which accepts a variable number of arguments. Support for variadic functions differs widely among programming languages.

In some programming languages, const is a type qualifier, which indicates that the data is read-only. While this can be used to declare constants, const in the C family of languages differs from similar constructs in other languages in that it is part of the type, and thus has complicated behavior when combined with pointers, references, composite data types, and type-checking. In other languages, the data is not in a single memory location, but copied at compile time for each use. Languages which use it include C, C++, D, JavaScript, Julia, and Rust.

In a programming language, an evaluation strategy is a set of rules for evaluating expressions. The term is often used to refer to the more specific notion of a parameter-passing strategy that defines the kind of value that is passed to the function for each parameter and whether to evaluate the parameters of a function call, and if so in what order. The notion of reduction strategy is distinct, although some authors conflate the two terms and the definition of each term is not widely agreed upon.

Interprocedural optimization (IPO) is a collection of compiler techniques used in computer programming to improve performance in programs containing many frequently used functions of small or medium length. IPO differs from other compiler optimizations by analyzing the entire program as opposed to a single function or block of code.

C++11 is a version of the ISO/IEC 14882 standard for the C++ programming language. C++11 replaced the prior version of the C++ standard, called C++03, and was later replaced by C++14. The name follows the tradition of naming language versions by the publication year of the specification, though it was formerly named C++0x because it was expected to be published before 2010.

In computer programming, an anonymous function is a function definition that is not bound to an identifier. Anonymous functions are often arguments being passed to higher-order functions or used for constructing the result of a higher-order function that needs to return a function. If the function is only used once, or a limited number of times, an anonymous function may be syntactically lighter than using a named function. Anonymous functions are ubiquitous in functional programming languages and other languages with first-class functions, where they fulfil the same role for the function type as literals do for other data types.

In computing, compile-time function execution is the ability of a compiler, that would normally compile a function to machine code and execute it at run time, to execute the function at compile time. This is possible if the arguments to the function are known at compile time, and the function does not make any reference to or attempt to modify any global state.

This article compares a large number of programming languages by tabulating their data types, their expression, statement, and declaration syntax, and some common operating-system interfaces.

Concepts are an extension to the templates feature provided by the C++ programming language. Concepts are named Boolean predicates on template parameters, evaluated at compile time. A concept may be associated with a template, in which case it serves as a constraint: it limits the set of arguments that are accepted as template parameters.

C++14 is a version of the ISO/IEC 14882 standard for the C++ programming language. It is intended to be a small extension over C++11, featuring mainly bug fixes and small improvements, and was replaced by C++17. Its approval was announced on August 18, 2014. C++14 was published as ISO/IEC 14882:2014 in December 2014.

C23 is the informal name for ISO/IEC 9899:2024, the next standard for the C programming language, which will replace C17. It was started in 2016 informally as C2x, and expected to be published in 2024. The most recent publicly available working draft of C23 was released on April 1, 2023. The first WG14 meeting for the C2x draft was held in October 2019, virtual remote meetings were held in 2020 due to the COVID-19 pandemic, then various teleconference meetings continued to occur through 2024.

C++23 is the informal name for the version of the ISO/IEC 14882 standard for the C++ programming language that followed C++20. The final draft of this version is N4950.

References

  1. Bartosz Milewski (2013). "Basics of Haskell". School of Haskell. FP Complete. Archived from the original on 2016-10-27. Retrieved 2018-07-13. Here are the fundamental properties of a pure function: 1. A function returns exactly the same result every time it's called with the same set of arguments. In other words a function has no state, nor can it access any external state. Every time you call it, it behaves like a newborn baby with blank memory and no knowledge of the external world. 2. A function has no side effects. Calling a function once is the same as calling it twice and discarding the result of the first call.
  2. Brian Lonsdorf (2015). "Professor Frisby's Mostly Adequate Guide to Functional Programming". GitHub. Retrieved 2020-03-20. A pure function is a function that, given the same input, will always return the same output and does not have any observable side effect.
  3. "Common Function Attributes - Using the GNU Compiler Collection (GCC)". gcc.gnu.org, the GNU Compiler Collection. Free Software Foundation, Inc. Retrieved 2018-06-28.
  4. Peyton Jones, Simon L. (2003). Haskell 98 Language and Libraries: The Revised Report (PDF). Cambridge, United Kingdom: Cambridge University Press. p. 95. ISBN   0-521 826144 . Retrieved 17 July 2014.
  5. Hanus, Michael. "Curry: An Integrated Functional Logic Language" (PDF). www-ps.informatik.uni-kiel.de. Institut für Informatik, Christian-Albrechts-Universität zu Kiel. p. 33. Archived from the original (PDF) on 25 July 2014. Retrieved 17 July 2014.
  6. assert() aborts with an error message if its argument is false; on a 32-bit machine, values beyond fact(12) cannot be represented, anyway.
  7. Aley, R. (2017). Pro Functional PHP Programming: Application Development Strategies for Performance Optimization, Concurrency, Testability, and Code Brevity. SpringerLink : Bücher. Apress. p. 109. ISBN   978-1-4842-2958-3 . Retrieved 2024-02-04.
  8. "Common Function Attributes - Using the GNU Compiler Collection (GCC)". gcc.gnu.org, the GNU Compiler Collection. Free Software Foundation, Inc. Retrieved 2018-06-28.
  9. Pure attribute in Fortran
  10. Pure attribute in D language
  11. "Common Function Attributes". Using the GNU Compiler Collection (GCC. Retrieved 22 July 2021.
  12. constexpr attribute in C++