In programming language theory, lazy evaluation, or call-by-need, [1] is an evaluation strategy which delays the evaluation of an expression until its value is needed (non-strict evaluation) and which also avoids repeated evaluations (by the use of sharing). [2] [3]
The benefits of lazy evaluation include:
- the ability to define control flow structures as abstractions instead of primitives;
- the ability to define potentially infinite data structures, allowing for more straightforward implementation of some algorithms;
- performance increases by avoiding needless calculations and error conditions when evaluating compound expressions.
Lazy evaluation is often combined with memoization, as described in Jon Bentley's Writing Efficient Programs. [4] After a function's value is computed for a given parameter or set of parameters, the result is stored in a lookup table that is indexed by the values of those parameters; the next time the function is called, the table is consulted to determine whether the result for that combination of parameter values is already available. If so, the stored result is simply returned. If not, the function is evaluated, and another entry is added to the lookup table for reuse.
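The lookup-table scheme described above can be sketched in Python; the decorator and function names here are illustrative, not from any particular library:

```python
def memoize(f):
    """Wrap f with a lookup table indexed by its argument values."""
    table = {}  # maps argument tuples to previously computed results

    def wrapper(*args):
        if args not in table:       # result not yet available?
            table[args] = f(*args)  # evaluate once and store for reuse
        return table[args]          # return the stored result
    return wrapper

calls = []  # records real evaluations, to show sharing

@memoize
def slow_square(n):
    calls.append(n)  # the body runs only on a table miss
    return n * n

print(slow_square(4))  # computed: prints 16
print(slow_square(4))  # looked up: prints 16 without recomputation
print(len(calls))      # the body ran only once: prints 1
```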
Lazy evaluation is difficult to combine with imperative features such as exception handling and input/output, because the order of operations becomes indeterminate.
The opposite of lazy evaluation is eager evaluation, sometimes known as strict evaluation. Eager evaluation is the evaluation strategy employed in most programming languages.
Lazy evaluation was introduced for lambda calculus by Christopher Wadsworth. [5] For programming languages, it was independently introduced by Peter Henderson and James H. Morris [6] and by Daniel P. Friedman and David S. Wise. [7] [8]
Delayed evaluation is used particularly in functional programming languages. When using delayed evaluation, an expression is not evaluated as soon as it gets bound to a variable, but when the evaluator is forced to produce the expression's value. That is, a statement such as x = expression; (i.e. the assignment of the result of an expression to a variable) clearly calls for the expression to be evaluated and the result placed in x, but what actually is in x is irrelevant until there is a need for its value via a reference to x in some later expression, whose evaluation could itself be deferred, though eventually the rapidly growing tree of dependencies would be pruned to produce some symbol rather than another for the outside world to see. [9]
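In an eager language such as Python, this delayed binding can be imitated by storing a thunk (a zero-argument function) in the variable instead of the evaluated result; the names below are invented for illustration:

```python
evaluated = []  # side-effect log marking when evaluation actually happens

def expression():
    evaluated.append(True)
    return 2 + 3

x = expression    # bind the unevaluated thunk, not its value
print(evaluated)  # prints [] -- nothing has been evaluated yet
print(x())        # referencing the value forces evaluation: prints 5
print(evaluated)  # prints [True]
```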
Lazy evaluation allows control structures to be defined normally, and not as primitives or compile-time techniques. For example, one can define if-then-else and short-circuit evaluation operators: [10] [11]
ifThenElse True  b c = b
ifThenElse False b c = c

-- or
True  || b = True
False || b = b

-- and
True  && b = b
False && b = False
These have the usual semantics, i.e., ifThenElse a b c evaluates (a); then, if and only if (a) evaluates to true, does it evaluate (b), otherwise it evaluates (c). That is, exactly one of (b) or (c) will be evaluated. Similarly, for EasilyComputed || LotsOfWork, if the easy part gives True the lots-of-work expression can be avoided. Finally, when evaluating SafeToTry && Expression, if SafeToTry is false there will be no attempt at evaluating Expression.
Conversely, in an eager language the above definition for ifThenElse a b c would evaluate (a), (b), and (c) regardless of the value of (a). This is not the desired behavior, as (b) or (c) may have side effects, take a long time to compute, or throw errors. It is usually possible to introduce user-defined lazy control structures in eager languages as functions, though they may depart from the language's syntax for eager evaluation: often the involved code bodies need to be wrapped in a function value, so that they are executed only when called.
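For instance, in eager Python a user-defined if-then-else must receive its branches wrapped in function values (thunks), mirroring the Haskell definition above; the function name is invented for illustration:

```python
def if_then_else(cond, then_thunk, else_thunk):
    # evaluate exactly one branch, by calling only the chosen thunk
    return then_thunk() if cond else else_thunk()

# each branch is wrapped in a lambda, so it runs only when selected
result = if_then_else(True, lambda: "safe", lambda: 1 // 0)
print(result)  # prints "safe"; the division by zero is never attempted
```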
Delayed evaluation has the advantage of being able to create calculable infinite lists without infinite loops or size limitations interfering with computation. The actual values are only computed when needed. For example, one could create a function that creates an infinite list (often called a stream) of Fibonacci numbers. The calculation of the n-th Fibonacci number would be merely the extraction of that element from the infinite list, forcing the evaluation of only the first n members of the list. [12] [13]
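Such a stream can be sketched in Python with a generator, which produces elements only on demand; the helper names are illustrative:

```python
from itertools import islice

def fibs():
    """Yield the conceptually infinite stream 0, 1, 1, 2, 3, 5, ..."""
    a, b = 0, 1
    while True:        # only the values actually demanded are computed
        yield a
        a, b = b, a + b

def nth_fib(n):
    # extracting element n forces evaluation of only the first n+1 members
    return next(islice(fibs(), n, None))

print(nth_fib(6))  # prints 8
```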
Take for example this trivial program in Haskell:
numberFromInfiniteList :: Int -> Int
numberFromInfiniteList n = infinity !! n - 1
    where infinity = [1..]

main = print $ numberFromInfiniteList 4
In the function numberFromInfiniteList, the value of infinity is an infinite range, but until an actual value (or more specifically, a specific value at a certain index) is needed, the list is not evaluated, and even then it is only evaluated as needed (that is, up to the desired index). Provided the programmer is careful, the program completes normally. However, certain calculations may result in the program attempting to evaluate an infinite number of elements; for example, requesting the length of the list or trying to sum its elements with a fold operation would result in the program either failing to terminate or running out of memory.
As another example, the list of all Fibonacci numbers can be written in the programming language Haskell as: [13]
fibs = 0 : 1 : zipWith (+) fibs (tail fibs)
In Haskell syntax, ":" prepends an element to a list, tail returns a list without its first element, and zipWith uses a specified function (in this case addition) to combine corresponding elements of two lists to produce a third. [12]
In computer windowing systems, the painting of information to the screen is driven by expose events which drive the display code at the last possible moment. By doing this, windowing systems avoid computing unnecessary display content updates. [14]
Another example of laziness in modern computer systems is copy-on-write page allocation or demand paging, where memory is allocated only when a value stored in that memory is changed. [14]
Laziness can be useful in high-performance scenarios. An example is the Unix mmap function, which provides demand-driven loading of pages from disk, so that only those pages actually touched are loaded into memory, and unneeded memory is not allocated.
MATLAB implements copy on edit , where arrays which are copied have their actual memory storage replicated only when their content is changed, possibly leading to an out of memory error when updating an element afterwards instead of during the copy operation. [15]
The number of beta reductions to reduce a lambda term with call-by-need is no larger than the number needed by call-by-value or call-by-name reduction. [16] [17] And with certain programs the number of steps may be much smaller; for example, a specific family of lambda terms using Church numerals takes an infinite number of steps with call-by-value (i.e. never completes), an exponential number of steps with call-by-name, but only a polynomial number with call-by-need. Call-by-need embodies two optimizations: never repeat work (similar to call-by-value), and never perform unnecessary work (similar to call-by-name). [18] Lazy evaluation can also lead to a reduction in memory footprint, since values are created only when needed. [19]
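The two optimizations can be simulated in Python by counting how often an argument expression is evaluated: a call-by-name argument is re-evaluated at every use, while a call-by-need argument is evaluated at most once and then shared. All names here are illustrative:

```python
count = [0]  # counts real evaluations of the argument expression

def arg():
    count[0] += 1
    return 21

def double_by_name(thunk):
    return thunk() + thunk()   # call-by-name: each use re-evaluates

def double_by_need(thunk):
    cache = []
    def force():
        if not cache:          # evaluate only on first use, then share
            cache.append(thunk())
        return cache[0]
    return force() + force()   # call-by-need: the argument runs once

count[0] = 0
print(double_by_name(arg), count[0])  # prints 42 2
count[0] = 0
print(double_by_need(arg), count[0])  # prints 42 1
```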
In practice, lazy evaluation may cause significant performance issues compared to eager evaluation. For example, on modern computer architectures, delaying a computation and performing it later is slower than performing it immediately. This can be alleviated through strictness analysis. [18] Lazy evaluation can also introduce memory leaks due to unevaluated expressions. [20] [21]
Some programming languages delay evaluation of expressions by default, and some others provide functions or special syntax to delay evaluation. In Miranda and Haskell, evaluation of function arguments is delayed by default. In many other languages, evaluation can be delayed by explicitly suspending the computation using special syntax (as with Scheme's "delay" and "force" and OCaml's "lazy" and "Lazy.force") or, more generally, by wrapping the expression in a thunk. The object representing such an explicitly delayed evaluation is called a lazy future. Raku uses lazy evaluation of lists, so one can assign infinite lists to variables and use them as arguments to functions, but unlike Haskell and Miranda, Raku does not use lazy evaluation of arithmetic operators and functions by default. [9]
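A delay/force pair of the kind found in Scheme and OCaml can be modelled in Python with a memoizing thunk class; the class name Delayed is invented for this sketch:

```python
class Delayed:
    """A suspended computation that is evaluated at most once (a lazy future)."""

    def __init__(self, thunk):
        self._thunk = thunk   # the wrapped zero-argument computation
        self._done = False
        self._value = None

    def force(self):
        if not self._done:             # first force: run the suspension
            self._value = self._thunk()
            self._thunk = None         # drop the closure so it can be collected
            self._done = True
        return self._value             # later forces reuse the shared result

p = Delayed(lambda: sum(range(10)))  # suspended; nothing computed yet
print(p.force())  # prints 45
print(p.force())  # prints 45 again, without recomputation
```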
In lazy programming languages such as Haskell, although the default is to evaluate expressions only when they are demanded, it is possible in some cases to make code more eager—or conversely, to make it more lazy again after it has been made more eager. This can be done by explicitly coding something which forces evaluation (which may make the code more eager) or avoiding such code (which may make the code more lazy). Strict evaluation usually implies eagerness, but they are technically different concepts.
However, there is an optimisation implemented in some compilers called strictness analysis, which, in some cases, allows the compiler to infer that a value will always be used. In such cases, this may render the programmer's choice of whether to force that particular value or not, irrelevant, because strictness analysis will force strict evaluation.
In Haskell, marking constructor fields strict means that their values will always be demanded immediately. The seq function can also be used to demand a value immediately and then pass it on, which is useful if a constructor field should generally be lazy. However, neither of these techniques implements recursive strictness; for that, a function called deepSeq was invented.
Also, pattern matching in Haskell 98 is strict by default, so the ~ qualifier has to be used to make it lazy. [22]
In Java, lazy evaluation can be done by using objects that have a method to evaluate them when the value is needed. The body of this method must contain the code required to perform this evaluation. Since the introduction of lambda expressions in Java SE 8, Java has supported a compact notation for this. The following example generic interface provides a framework for lazy evaluation: [23] [24]
interface Lazy<T> {
    T eval();
}
The Lazy interface with its eval() method is equivalent to the Supplier interface with its get() method in the java.util.function library. [25] [26]: 200
Each class that implements the Lazy interface must provide an eval method, and instances of the class may carry whatever values the method needs to accomplish lazy evaluation. For example, consider the following code to lazily compute and print 2^10:
Lazy<Integer> a = () -> 1;
for (int i = 0; i < 10; i++) {
    Lazy<Integer> b = a;
    a = () -> b.eval() + b.eval();
}
System.out.println("a = " + a.eval());
In the above, the variable a initially refers to a lazy integer object created by the lambda expression () -> 1. Evaluating this lambda expression is similar [lower-alpha 1] to constructing a new instance of an anonymous class that implements Lazy<Integer> with an eval method returning 1.
Each iteration of the loop links a to a new object created by evaluating the lambda expression inside the loop. Each of these objects holds a reference to another lazy object, b, and has an eval method that calls b.eval() twice and returns the sum. The variable b is needed here to meet Java's requirement that variables referenced from within a lambda expression be effectively final.
This is an inefficient program because this implementation of lazy integers does not memoize the result of previous calls to eval. It also involves considerable autoboxing and unboxing. What may not be obvious is that, at the end of the loop, the program has constructed a linked list of 11 objects and that all of the actual additions involved in computing the result are done in response to the call to a.eval() on the final line of code. This call recursively traverses the list to perform the necessary additions.
We can build a Java class that memoizes a lazy object as follows: [23] [24]
class Memo<T> implements Lazy<T> {
    private Lazy<T> lazy; // a lazy expression, eval sets it to null
    private T memo;       // the memorandum of the previous value

    public Memo(Lazy<T> lazy) {
        this.lazy = lazy;
    }

    public T eval() {
        if (lazy != null) {
            memo = lazy.eval();
            lazy = null;
        }
        return memo;
    }
}
This allows the previous example to be rewritten to be far more efficient. Where the original ran in time exponential in the number of iterations, the memoized version runs in linear time:
Lazy<Integer> a = () -> 1;
for (int i = 0; i < 10; i++) {
    Lazy<Integer> b = a;
    a = new Memo<Integer>(() -> b.eval() + b.eval());
}
System.out.println("a = " + a.eval());
Java's lambda expressions are just syntactic sugar. Anything that can be written with a lambda expression can be rewritten as a call to construct an instance of an anonymous inner class implementing the interface, [lower-alpha 1] and any use of an anonymous inner class can be rewritten using a named inner class, and any named inner class can be moved to the outermost nesting level.
In JavaScript, lazy evaluation can be simulated by using a generator. For example, the stream of all Fibonacci numbers can be written, using memoization, as:
/**
 * Generator functions return generator objects, which reify lazy evaluation.
 * @return {!Generator<bigint>} A non-null generator of integers.
 */
function* fibonacciNumbers() {
    let memo = [1n, -1n]; // create the initial state (e.g. a vector of "negafibonacci" numbers)
    while (true) { // repeat indefinitely
        memo = [memo[0] + memo[1], memo[0]]; // update the state on each evaluation
        yield memo[0]; // yield the next value and suspend execution until resumed
    }
}

let stream = fibonacciNumbers(); // create a lazily evaluated stream of numbers
let first10 = Array.from(new Array(10), () => stream.next().value); // evaluate only the first 10 numbers
console.log(first10); // the output is [0n, 1n, 1n, 2n, 3n, 5n, 8n, 13n, 21n, 34n]
In Python 2.x the range() function [27] computes a list of integers. The entire list is stored in memory when the first assignment statement is evaluated, so this is an example of eager or immediate evaluation:
>>> r = range(10)
>>> print r
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> print r[3]
3
In Python 3.x the range() function [28] returns a lazy sequence object which computes elements of the list on demand. Elements are only generated when they are needed (e.g., when print(r[3]) is evaluated in the following example), so this is an example of lazy or deferred evaluation:
>>> r = range(10)
>>> print(r)
range(0, 10)
>>> print(r[3])
3
In Python 2.x it is possible to use a function called xrange(), which returns an object that generates the numbers in the range on demand. The advantage of xrange is that the generated object will always take the same amount of memory.
>>> r = xrange(10)
>>> print(r)
xrange(10)
>>> lst = [x for x in r]
>>> print(lst)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
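This constant-memory behaviour can be checked directly in Python 3, where range objects play the role of xrange: the range object stays the same size regardless of its bounds, while a realized list grows with its length.

```python
import sys

lazy = range(10**6)         # stores only start, stop and step
eager = list(range(10**6))  # stores a million integer references

# the lazy range is the same size as a tiny one; the eager list is not
print(sys.getsizeof(lazy) == sys.getsizeof(range(10)))  # prints True
print(sys.getsizeof(eager) > 10**6)                     # prints True
```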
From version 2.2 forward, Python manifests lazy evaluation by implementing iterators (lazy sequences), in contrast to tuple or list sequences. For instance (Python 2):
>>> numbers = range(10)
>>> iterator = iter(numbers)
>>> print numbers
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> print iterator
<listiterator object at 0xf7e8dd4c>
>>> print iterator.next()
0
In the .NET Framework, it is possible to do lazy evaluation using the class System.Lazy<T>. [29] The class can be easily exploited in F# using the lazy keyword, while the force method will force the evaluation. There are also specialized collections like Microsoft.FSharp.Collections.Seq that provide built-in support for lazy evaluation.
let fibonacci = Seq.unfold (fun (x, y) -> Some(x, (y, x + y))) (0I, 1I)
fibonacci |> Seq.nth 1000
In C# and VB.NET, the class System.Lazy<T> is used directly.
public int Sum()
{
    int a = 0;
    int b = 0;
    Lazy<int> x = new Lazy<int>(() => a + b);
    a = 3;
    b = 5;
    return x.Value; // returns 8
}
Or with a more practical example:
// recursive calculation of the n'th fibonacci number
public int Fib(int n)
{
    return (n == 1) ? 1 : (n == 2) ? 1 : Fib(n - 1) + Fib(n - 2);
}

public void Main()
{
    Console.WriteLine("Which Fibonacci number do you want to calculate?");
    int n = Int32.Parse(Console.ReadLine());
    Lazy<int> fib = new Lazy<int>(() => Fib(n)); // function is prepared, but not executed
    bool execute;
    if (n > 100)
    {
        Console.WriteLine("This can take some time. Do you really want to calculate this large number? [y/n]");
        execute = (Console.ReadLine() == "y");
    }
    else
        execute = true;

    if (execute)
        Console.WriteLine(fib.Value); // number is only calculated if needed
}
Another way is to use the yield keyword:
// eager evaluation
public IEnumerable<int> Fibonacci(int x)
{
    IList<int> fibs = new List<int>();
    int prev = -1;
    int next = 1;
    for (int i = 0; i < x; i++)
    {
        int sum = prev + next;
        prev = next;
        next = sum;
        fibs.Add(sum);
    }
    return fibs;
}

// lazy evaluation
public IEnumerable<int> LazyFibonacci(int x)
{
    int prev = -1;
    int next = 1;
    for (int i = 0; i < x; i++)
    {
        int sum = prev + next;
        prev = next;
        next = sum;
        yield return sum;
    }
}