Pure function

Last updated

In computer programming, a pure function is a function that has the following properties: [1] [2]

Contents

  1. the function return values are identical for identical arguments (no variation with local static variables, non-local variables, mutable reference arguments or input streams, i.e., referential transparency), and
  2. the function has no side effects (no mutation of local static variables, non-local variables, mutable reference arguments or input/output streams).

Examples

Pure functions

The following examples of C++ functions are pure:

  • floor, returning the floor of a number;
  • max, returning the maximum of two values.
  • the function f, defined as
    voidf(){staticstd::atomic<unsignedint>x=0;++x;}
    The value of x can be only observed inside other invocations of f(), and as f() does not communicate the value of x to its environment, it is indistinguishable from function void f() {} that does nothing. Note that x is std::atomic so that modifications from multiple threads executing f() concurrently do not result in a data race, which has undefined behavior in C and C++.

Impure functions

The following C++ functions are impure as they lack the above property 1:

  • because of return value variation with a static variable
    intf(){staticintx=0;++x;returnx;}
  • because of return value variation with a non-local variable
    intf(){returnx;}
    For the same reason, e.g. the C++ library function sin() is not pure, since its result depends on the IEEE rounding mode which can be changed at runtime.
  • because of return value variation with a mutable reference argument
    intf(int*x){return*x;}
  • because of return value variation with an input stream
    intf(){intx=0;std::cin>>x;returnx;}

The following C++ functions are impure as they lack the above property 2:

  • because of mutation of a local static variable
    voidf(){staticintx=0;++x;}
  • because of mutation of a non-local variable
    voidf(){++x;}
  • because of mutation of a mutable reference argument
    voidf(int*x){++*x;}
  • because of mutation of an output stream
    voidf(){std::cout<<"Hello, world!"<<std::endl;}

The following C++ functions are impure as they lack both the above properties 1 and 2:

  • because of return value variation with a local static variable and mutation of a local static variable
    intf(){staticintx=0;++x;returnx;}
  • because of return value variation with an input stream and mutation of an input stream
    intf(){intx=0;std::cin>>x;returnx;}

I/O in pure functions

I/O is inherently impure: input operations undermine referential transparency, and output operations create side effects. Nevertheless, there is a sense in which a function can perform input or output and still be pure, if the sequence of operations on the relevant I/O devices is modeled explicitly as both an argument and a result, and I/O operations are taken to fail when the input sequence does not describe the operations actually taken since the program began execution.[ clarification needed ]

The second point ensures that the only sequence usable as an argument must change with each I/O action; the first allows different calls to an I/O-performing function to return different results on account of the sequence arguments having changed. [3] [4]

The I/O monad is a programming idiom typically used to perform I/O in pure functional languages.

Memoization

The outputs of a pure function can be precomputed and cached in a look-up table. In a technique called memoization, any result that is returned from a given function is cached, and the next time the function is called with the same input parameters, the cached result is returned instead of computing the function again.

Memoization can be performed by wrapping the function in another function (wrapper function). [5]

By means of memoization, the computational effort involved in the computations of the function itself can be reduced, at the cost of the overhead for managing the cache and an increase of memory requirements.

A C program for cached computation of factorial (assert() aborts with an error message if its argument is false; on a 32-bit machine, values beyond fact(12) cannot be represented anyway.[ citation needed ]

staticintfact(intn){returnn<=1?1:fact(n-1)*n;}intfact_wrapper(intn){staticintcache[13];assert(0<=n&&n<13);if(cache[n]==0)cache[n]=fact(n);returncache[n];}

Compiler optimizations

Functions that have just the above property 2 – that is, have no side effects – allow for compiler optimization techniques such as common subexpression elimination and loop optimization similar to arithmetic operators. [6] A C++ example is the length method, returning the size of a string, which depends on the memory contents where the string points to, therefore lacking the above property 1. Nevertheless, in a single-threaded environment, the following C++ code

std::strings="Hello, world!";inta[10]={1,2,3,4,5,6,7,8,9,10};intl=0;for(inti=0;i<10;++i){l+=s.length()+a[i];}

can be optimized such that the value of s.length() is computed only once, before the loop.

Some programming languages allow for declaring a pure property to a function:

Unit testing

Since pure functions have identical return values for identical arguments, they are well suited to unit testing.

See also

Related Research Articles

Templates are a feature of the C++ programming language that allows functions and classes to operate with generic types. This allows a function or class declaration to reference via a generic variable another different class without creating full declaration for each of these different classes.

In computer programming, lazy initialization is the tactic of delaying the creation of an object, the calculation of a value, or some other expensive process until the first time it is needed. It is a kind of lazy evaluation that refers specifically to the instantiation of objects or other resources.

In object-oriented (OO) and functional programming, an immutable object is an object whose state cannot be modified after it is created. This is in contrast to a mutable object, which can be modified after it is created. In some cases, an object is considered immutable even if some internally used attributes change, but the object's state appears unchanging from an external point of view. For example, an object that uses memoization to cache the results of expensive computations could still be considered an immutable object.

Template metaprogramming (TMP) is a metaprogramming technique in which templates are used by a compiler to generate temporary source code, which is merged by the compiler with the rest of the source code and then compiled. The output of these templates can include compile-time constants, data structures, and complete functions. The use of templates can be thought of as compile-time polymorphism. The technique is used by a number of languages, the best-known being C++, but also Curl, D, Nim, and XL.

In mathematics and computer science, a higher-order function (HOF) is a function that does at least one of the following:

In computer programming, a parameter or a formal argument is a special kind of variable used in a subroutine to refer to one of the pieces of data provided as input to the subroutine. These pieces of data are the values of the arguments with which the subroutine is going to be called/invoked. An ordered list of parameters is usually included in the definition of a subroutine, so that, each time the subroutine is called, its arguments for that call are evaluated, and the resulting values can be assigned to the corresponding parameters.

In computer programming, a default argument is an argument to a function that a programmer is not required to specify. In most programming languages, functions may take one or more arguments. Usually, each argument must be specified in full. Later languages allow the programmer to specify default arguments that always have a value, even if one is not specified when calling the function.

In computer programming, a function object is a construct allowing an object to be invoked or called as if it were an ordinary function, usually with the same syntax. In some languages, particularly C++, function objects are often called functors.

In compiler construction, name mangling is a technique used to solve various problems caused by the need to resolve unique names for programming entities in many modern programming languages.

In the C++ programming language, a reference is a simple reference datatype that is less powerful but safer than the pointer type inherited from C. The name C++ reference may cause confusion, as in computer science a reference is a general concept datatype, with pointers and C++ references being specific reference datatype implementations. The definition of a reference in C++ is such that it does not need to exist. It can be implemented as a new name for an existing object.

In some programming languages, const is a type qualifier that indicates that the data is read-only. While this can be used to declare constants, const in the C family of languages differs from similar constructs in other languages in that it is part of the type, and thus has complicated behavior when combined with pointers, references, composite data types, and type-checking. In other languages, the data is not in a single memory location, but copied at compile time for each use. Languages which use it include C, C++, D, JavaScript, Julia, and Rust.

In a programming language, an evaluation strategy is a set of rules for evaluating expressions. The term is often used to refer to the more specific notion of a parameter-passing strategy that defines the kind of value that is passed to the function for each parameter and whether to evaluate the parameters of a function call, and if so in what order. The notion of reduction strategy is distinct, although some authors conflate the two terms and the definition of each term is not widely agreed upon.

C++11 is a version of a joint technical standard, ISO/IEC 14882, by the International Organization for Standardization (ISO) and International Electrotechnical Commission (IEC), for the C++ programming language. C++11 replaced the prior version of the C++ standard, named C++03, and was later replaced by C++14. The name follows the tradition of naming language versions by the publication year of the specification, though it was formerly named C++0x because it was expected to be published before 2010.

In computing, compile-time function execution is the ability of a compiler, that would normally compile a function to machine code and execute it at run time, to execute the function at compile time. This is possible if the arguments to the function are known at compile time, and the function does not make any reference to or attempt to modify any global state.

This article describes the syntax of the C# programming language. The features described are compatible with .NET Framework and Mono.

SystemVerilog DPI is an interface which can be used to interface SystemVerilog with foreign languages. These foreign languages can be C, C++, SystemC as well as others. DPIs consist of two layers: a SystemVerilog layer and a foreign language layer. Both the layers are isolated from each other.

C++14 is a version of the ISO/IEC 14882 standard for the C++ programming language. It is intended to be a small extension over C++11, featuring mainly bug fixes and small improvements, and was replaced by C++17. Its approval was announced on August 18, 2014. C++14 was published as ISO/IEC 14882:2014 in December 2014.

C23 is a draft open standard for the C programming language, which is expected to become ISO/IEC 9899:2024, replacing C17. It was started in 2016 informally as C2x, and is expected to be published in October or November 2024. The freely available draft most similar to the one balloted is doc N3220. The first WG14 meeting for the C2x draft was held in October 2019, virtual remote meetings were held in 2020 due to the COVID-19 pandemic, then various teleconference meetings continued to occur through 2024.

C++23 is the name for the version of the International Organization for Standardization (ISO) and International Electrotechnical Commission (IEC) 14882 standard for the C++ programming language that follows C++20. The final draft of this version is N4950.

References

  1. Bartosz Milewski (2013). "Basics of Haskell". School of Haskell. FP Complete. Archived from the original on 2016-10-27. Retrieved 2018-07-13.
  2. Brian Lonsdorf (2015). "Professor Frisby's Mostly Adequate Guide to Functional Programming". GitHub. Retrieved 2020-03-20.
  3. Peyton Jones, Simon L. (2003). Haskell 98 Language and Libraries: The Revised Report (PDF). Cambridge, United Kingdom: Cambridge University Press. p. 95. ISBN   0-521 826144 . Retrieved 17 July 2014.
  4. Hanus, Michael. "Curry: An Integrated Functional Logic Language" (PDF). www-ps.informatik.uni-kiel.de. Institut für Informatik, Christian-Albrechts-Universität zu Kiel. p. 33. Archived from the original (PDF) on 25 July 2014. Retrieved 17 July 2014.
  5. Aley, R. (2017). Pro Functional PHP Programming: Application Development Strategies for Performance Optimization, Concurrency, Testability, and Code Brevity. SpringerLink : Bücher. Apress. p. 109. ISBN   978-1-4842-2958-3 . Retrieved 2024-02-04.
  6. "Common Function Attributes - Using the GNU Compiler Collection (GCC)". gcc.gnu.org, the GNU Compiler Collection. Free Software Foundation, Inc. Retrieved 2018-06-28.
  7. Pure attribute in Fortran
  8. Pure attribute in D language
  9. "Common Function Attributes". Using the GNU Compiler Collection (GCC. Retrieved 22 July 2021.
  10. constexpr attribute in C++