In computer science, future, promise, delay, and deferred refer to constructs used for synchronizing program execution in some concurrent programming languages. They describe an object that acts as a proxy for a result that is initially unknown, usually because the computation of its value is not yet complete.
The term promise was proposed in 1976 by Daniel P. Friedman and David Wise, [1] and Peter Hibbard called it eventual. [2] A somewhat similar concept future was introduced in 1977 in a paper by Henry Baker and Carl Hewitt. [3]
The terms future, promise, delay, and deferred are often used interchangeably, although some differences in usage between future and promise are treated below. Specifically, when usage is distinguished, a future is a read-only placeholder view of a variable, while a promise is a writable, single assignment container which sets the value of the future. Notably, a future may be defined without specifying which specific promise will set its value, and different possible promises may set the value of a given future, though this can be done only once for a given future. In other cases a future and a promise are created together and associated with each other: the future is the value, the promise is the function that sets the value – essentially the return value (future) of an asynchronous function (promise). Setting the value of a future is also called resolving, fulfilling, or binding it.
Futures and promises originated in functional programming and related paradigms (such as logic programming) to decouple a value (a future) from how it was computed (a promise), allowing the computation to be done more flexibly, notably by parallelizing it. Later, they found use in distributed computing, in reducing the latency from communication round trips. Later still, they gained more use by allowing asynchronous programs to be written in direct style, rather than in continuation-passing style.
Use of futures may be implicit (any use of the future automatically obtains its value, as if it were an ordinary reference) or explicit (the user must call a function to obtain the value, such as the get method of java.util.concurrent.Future in Java). Obtaining the value of an explicit future can be called stinging or forcing. Explicit futures can be implemented as a library, whereas implicit futures are usually implemented as part of the language.
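As a minimal sketch of an explicit future, the following example uses Python's standard concurrent.futures module rather than Java, purely for brevity. The function name slow_square is hypothetical. The caller has to invoke result() to obtain the value, which plays roughly the role that get plays in Java.

```python
from concurrent.futures import ThreadPoolExecutor
import time

def slow_square(n):
    """Hypothetical stand-in for a long-running computation."""
    time.sleep(0.05)
    return n * n

with ThreadPoolExecutor(max_workers=1) as pool:
    fut = pool.submit(slow_square, 7)  # returns a Future immediately
    # The future is only a proxy; the value must be requested explicitly.
    value = fut.result()               # blocks until the computation is done
```

Nothing in the language forces the future here; the explicit call to result() is the only way to reach the value, which is why this style can live entirely in a library.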
The original Baker and Hewitt paper described implicit futures, which are naturally supported in the actor model of computation and pure object-oriented programming languages like Smalltalk. The Friedman and Wise paper described only explicit futures, probably reflecting the difficulty of efficiently implementing implicit futures on stock hardware. The difficulty is that stock hardware does not deal with futures for primitive data types like integers. For example, an add instruction does not know how to deal with 3 + future factorial(100000). In pure actor or object languages this problem can be solved by sending future factorial(100000) the message +[3], which asks the future to add 3 to itself and return the result. Note that the message passing approach works regardless of when factorial(100000) finishes computation and that no stinging/forcing is needed.
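The message-passing idea can be approximated in Python, under the assumption that operator overloading stands in for sending the + message. The class MessageFuture and its force method are hypothetical names introduced only for this sketch; force is used solely to inspect the final result.

```python
from concurrent.futures import Future, ThreadPoolExecutor
import math

class MessageFuture:
    """Hypothetical proxy that answers '+' without waiting for its value."""

    def __init__(self, inner):
        self._inner = inner  # a concurrent.futures.Future

    def __radd__(self, other):
        # Handles `other + self`: register the addition to run once the
        # value exists, and hand back a new future for the sum right away.
        out = Future()

        def on_done(done):
            try:
                out.set_result(other + done.result())
            except Exception as exc:
                out.set_exception(exc)

        self._inner.add_done_callback(on_done)
        return MessageFuture(out)

    def force(self):
        """Only used here to inspect the final result."""
        return self._inner.result()

with ThreadPoolExecutor(max_workers=1) as pool:
    fact = MessageFuture(pool.submit(math.factorial, 1000))
    total = 3 + fact         # does not wait for factorial to finish
    answer = total.force()
```

The expression 3 + fact returns another future whether or not the factorial has finished, which mirrors the point that the addition itself needs no forcing.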
The use of futures can dramatically reduce latency in distributed systems. For instance, futures enable promise pipelining, [4] [5] as implemented in the languages E and Joule, which was also called call-stream [6] in the language Argus.
Consider an expression involving conventional remote procedure calls, such as:
t3 := ( x.a() ).c( y.b() )
which could be expanded to
t1 := x.a(); t2 := y.b(); t3 := t1.c(t2);
Each statement needs a message to be sent and a reply received before the next statement can proceed. Suppose, for example, that x, y, t1, and t2 are all located on the same remote machine. In this case, two complete network round-trips to that machine must take place before the third statement can begin to execute. The third statement will then cause yet another round-trip to the same remote machine.
Using futures, the above expression could be written
t3 := (x <- a()) <- c(y <- b())
which could be expanded to
t1 := x <- a(); t2 := y <- b(); t3 := t1 <- c(t2);
The syntax used here is that of the language E, where x <- a() means to send the message a() asynchronously to x. All three variables are immediately assigned futures for their results, and execution proceeds to subsequent statements. Later attempts to resolve the value of t3 may cause a delay; however, pipelining can reduce the number of round-trips needed. If, as in the prior example, x, y, t1, and t2 are all located on the same remote machine, a pipelined implementation can compute t3 with one round-trip instead of three. Because all three messages are destined for objects which are on the same remote machine, only one request need be sent and only one response need be received containing the result. The send t1 <- c(t2) would not block even if t1 and t2 were on different machines to each other, or to x or y.
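The difference in round-trips can be illustrated with a small in-process simulation. This is only a sketch: RemoteMachine, Ref, Account, X and Y are hypothetical names, the remote machine is an ordinary local object, and a round-trip is simply a counter increment. It is not how E implements pipelining.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Ref:
    """Placeholder for the result of an earlier message in the same batch."""
    index: int

class RemoteMachine:
    """Hypothetical stand-in for a remote host that counts round-trips."""

    def __init__(self, **objects):
        self._table = dict(objects)   # handle -> object living remotely
        self.round_trips = 0

    def _run(self, target, method, args):
        receiver = self._table[target]
        result = getattr(receiver, method)(*(self._table[a] for a in args))
        handle = f"result{len(self._table)}"
        self._table[handle] = result  # the result stays on the remote side
        return handle

    def call(self, target, method, *args):
        """Conventional RPC: one round-trip per call."""
        self.round_trips += 1
        return self._run(target, method, args)

    def pipeline(self, messages):
        """Pipelined sends: the whole batch travels in one round-trip."""
        self.round_trips += 1
        handles = []

        def resolve(item):
            return handles[item.index] if isinstance(item, Ref) else item

        for target, method, args in messages:
            handles.append(
                self._run(resolve(target), method, [resolve(a) for a in args]))
        return handles

    def peek(self, handle):
        """Inspection helper for the demo; not counted as a round-trip."""
        return self._table[handle]

class Account:
    def __init__(self, balance):
        self.balance = balance
    def c(self, amount):
        return self.balance + amount

class X:
    def a(self):
        return Account(10)

class Y:
    def b(self):
        return 5

# Conventional RPC: t1, t2 and t3 each cost a round-trip.
rpc = RemoteMachine(x=X(), y=Y())
t1 = rpc.call("x", "a")
t2 = rpc.call("y", "b")
t3 = rpc.call(t1, "c", t2)

# Pipelined: the third message names the first two results by position.
piped = RemoteMachine(x=X(), y=Y())
p1, p2, p3 = piped.pipeline([
    ("x", "a", ()),
    ("y", "b", ()),
    (Ref(0), "c", (Ref(1),)),
])
```

In the pipelined batch the third message refers to results that do not exist yet when it is sent, which is the essential step that plain parallel sends lack.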
Promise pipelining should be distinguished from parallel asynchronous message passing. In a system supporting parallel message passing but not pipelining, the message sends x <- a() and y <- b() in the above example could proceed in parallel, but the send of t1 <- c(t2) would have to wait until both t1 and t2 had been received, even when x, y, t1, and t2 are on the same remote machine. The relative latency advantage of pipelining becomes even greater in more complicated situations involving many messages.
Promise pipelining also should not be confused with pipelined message processing in actor systems, where it is possible for an actor to specify and begin executing a behaviour for the next message before having completed processing of the current message.
In some programming languages such as Oz, E, and AmbientTalk, it is possible to obtain a read-only view of a future, which allows reading its value when resolved, but does not permit resolving it:
In Oz, the !! operator is used to obtain a read-only view.
In C++11, a std::future provides a read-only view. The value is set directly by using a std::promise, or set to the result of a function call using std::packaged_task or std::async.
In .NET, System.Threading.Tasks.Task<T> represents a read-only view. Resolving the value can be done via System.Threading.Tasks.TaskCompletionSource<T>.
Support for read-only views is consistent with the principle of least privilege, since it allows the ability to set the value to be restricted to the subjects that need to set it. In a system that also supports pipelining, the sender of an asynchronous message (with result) receives the read-only promise for the result, and the target of the message receives the resolver.
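A read-only view paired with a separate resolver can be sketched in Python with closures. The names make_promise_pair and ReadOnlyFuture are hypothetical, and Python cannot truly enforce the separation, so this only illustrates the shape of the interface.

```python
import threading

class ReadOnlyFuture:
    """Read side only: exposes result() but nothing that can resolve it."""

    def __init__(self, getter):
        self.result = getter

def make_promise_pair():
    """Return (read-only future, resolver) sharing one hidden slot."""
    resolved = threading.Event()
    lock = threading.Lock()
    slot = []

    def resolve(value):
        # Single assignment: a second attempt is rejected.
        with lock:
            if slot:
                raise RuntimeError("future already resolved")
            slot.append(value)
        resolved.set()

    def result(timeout=None):
        if not resolved.wait(timeout):
            raise TimeoutError("future not resolved in time")
        return slot[0]

    return ReadOnlyFuture(result), resolve

future, resolve = make_promise_pair()
worker = threading.Thread(target=resolve, args=(42,))
worker.start()              # only the holder of `resolve` can set the value
value = future.result()     # holders of `future` can only read it
worker.join()
```

Handing future to clients and resolve to the producer follows the least-privilege idea: code that only needs to read cannot accidentally set the value.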
Some languages, such as Alice ML, define futures that are associated with a specific thread that computes the future's value. [9] This computation can start either eagerly when the future is created, or lazily when its value is first needed. A lazy future is similar to a thunk, in the sense of a delayed computation.
Alice ML also supports futures that can be resolved by any thread, and calls these promises. [8] This use of promise is different from its use in E as described above. In Alice, a promise is not a read-only view, and promise pipelining is unsupported. Instead, pipelining naturally happens for futures, including ones associated with promises.
If the value of a future is accessed asynchronously, for example by sending a message to it, or by explicitly waiting for it using a construct such as when in E, then there is no difficulty in delaying until the future is resolved before the message can be received or the wait completes. This is the only case to be considered in purely asynchronous systems such as pure actor languages.
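However, in some systems it may also be possible to attempt to immediately or synchronously access a future's value. Then there is a design choice to be made: the access could block the current thread or process until the future is resolved (possibly with a timeout), or the attempted synchronous access could signal an error, for example by throwing an exception.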
As an example of the first possibility, in C++11, a thread that needs the value of a future can block until it is available by calling the wait() or get() member functions. A timeout can also be specified on the wait using the wait_for() or wait_until() member functions to avoid indefinite blocking. If the future arose from a call to std::async then a blocking wait (without a timeout) may cause synchronous invocation of the function to compute the result on the waiting thread.
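A rough Python analogue of a blocking wait with and without a timeout is shown below. It uses concurrent.futures.Future in place of std::future, so it illustrates the idea rather than the C++ API. Creating a Future directly, as done here, is normally reserved for tests and sketches.

```python
from concurrent.futures import Future, TimeoutError as FutureTimeout
import threading

fut = Future()  # an unresolved future

# Bounded wait: give up after 50 ms instead of blocking indefinitely.
try:
    fut.result(timeout=0.05)
    timed_out = False
except FutureTimeout:
    timed_out = True

# Resolve the future from another thread a little later.
threading.Timer(0.05, fut.set_result, args=("ready",)).start()

# Unbounded wait: blocks the calling thread until the value is set.
value = fut.result()
```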
Futures are a particular case of the synchronization primitive "events," which can be completed only once. In general, events can be reset to initial empty state and, thus, completed as many times as desired. [11]
An I-var (as in the language Id) is a future with blocking semantics as defined above. An I-structure is a data structure containing I-vars. A related synchronization construct that can be set multiple times with different values is called an M-var. M-vars support atomic operations to take or put the current value, where taking the value also sets the M-var back to its initial empty state. [12]
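The take and put operations of an M-var can be sketched with a condition variable. The class name MVar is hypothetical, and having put block while the variable is full is an assumption of this sketch rather than something stated above.

```python
import threading

class MVar:
    """Hypothetical M-var: a slot that is either empty or holds one value."""

    _EMPTY = object()

    def __init__(self):
        self._value = MVar._EMPTY
        self._cond = threading.Condition()

    def put(self, value):
        # Assumption of this sketch: put waits until the slot is empty.
        with self._cond:
            while self._value is not MVar._EMPTY:
                self._cond.wait()
            self._value = value
            self._cond.notify_all()

    def take(self):
        # Taking the value also resets the M-var to its empty state.
        with self._cond:
            while self._value is MVar._EMPTY:
                self._cond.wait()
            value, self._value = self._value, MVar._EMPTY
            self._cond.notify_all()
            return value

mvar = MVar()
received = []

def consumer():
    for _ in range(3):
        received.append(mvar.take())

thread = threading.Thread(target=consumer)
thread.start()
for item in (1, 2, 3):
    mvar.put(item)   # unlike a future, the slot is reused for each value
thread.join()
```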
A concurrent logic variable is similar to a future, but is updated by unification, in the same way as logic variables in logic programming. Thus it can be bound more than once to unifiable values, but cannot be set back to an empty or unresolved state. The dataflow variables of Oz act as concurrent logic variables, and also have blocking semantics as mentioned above.
A concurrent constraint variable is a generalization of concurrent logic variables to support constraint logic programming: the constraint may be narrowed multiple times, indicating smaller sets of possible values. Typically there is a way to specify a thunk that should run whenever the constraint is narrowed further; this is needed to support constraint propagation.
Eager thread-specific futures can be straightforwardly implemented in terms of non-thread-specific futures, by creating a thread to calculate the value at the same time as creating the future. In this case it is desirable to return a read-only view to the client, so that only the newly created thread is able to resolve this future.
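A minimal sketch of this construction in Python follows. The helper name spawn is hypothetical. Python's concurrent.futures.Future also exposes set_result to the client, so the read-only view recommended above is not enforced here.

```python
from concurrent.futures import Future
import threading

def spawn(fn, *args):
    """Create a future and, at the same time, a thread that resolves it."""
    fut = Future()

    def run():
        try:
            fut.set_result(fn(*args))
        except Exception as exc:
            fut.set_exception(exc)  # failures are delivered through the future

    threading.Thread(target=run, daemon=True).start()
    return fut

total = spawn(sum, [1, 2, 3]).result()
```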
Implementing implicit lazy thread-specific futures (as provided by Alice ML, for example) in terms of non-thread-specific futures requires a mechanism to determine when the future's value is first needed (for example, the WaitNeeded construct in Oz [13]). If all values are objects, then the ability to implement transparent forwarding objects is sufficient, since the first message sent to the forwarder indicates that the future's value is needed.
Non-thread-specific futures can be implemented in thread-specific futures, assuming that the system supports message passing, by having the resolving thread send a message to the future's own thread. However, this can be viewed as unneeded complexity. In programming languages based on threads, the most expressive approach seems to be to provide a mix of non-thread-specific futures, read-only views, and either a WaitNeeded construct, or support for transparent forwarding.
The evaluation strategy of futures, which may be termed call by future, is non-deterministic: the value of a future will be evaluated at some time between when the future is created and when its value is used, but the precise time is not determined beforehand and can change from run to run. The computation can start as soon as the future is created (eager evaluation) or only when the value is actually needed (lazy evaluation), and may be suspended part-way through, or executed in one run. Once the value of a future is assigned, it is not recomputed on future accesses; this is like the memoization used in call by need.
A lazy future is a future that deterministically has lazy evaluation semantics: the computation of the future's value starts when the value is first needed, as in call by need. Lazy futures are of use in languages whose evaluation strategy is not lazy by default. For example, in C++11 such lazy futures can be created by passing the std::launch::deferred launch policy to std::async, along with the function to compute the value.
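A lazy future can be sketched in Python as an object that runs its function on the first request and then remembers the result. The class name LazyFuture is hypothetical. As with a deferred launch policy, the computation in this sketch runs on whichever thread first asks for the value.

```python
import threading

class LazyFuture:
    """Hypothetical lazy future: computes on first use, then memoizes."""

    def __init__(self, fn, *args):
        self._fn, self._args = fn, args
        self._lock = threading.Lock()
        self._done = False
        self._value = None

    def result(self):
        with self._lock:
            if not self._done:
                # First access: run the computation on the calling thread.
                self._value = self._fn(*self._args)
                self._done = True
            return self._value

calls = []

def expensive():
    calls.append("ran")
    return 42

lazy = LazyFuture(expensive)
ran_before_use = bool(calls)    # nothing has been computed yet
first = lazy.result()           # triggers the computation
second = lazy.result()          # served from the stored value
```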
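In the actor model, an expression of the form future <Expression> is defined by how it responds to an Eval message with environment E and customer C as follows: The future expression responds to the Eval message by sending the customer C a newly created actor F (the proxy for the response of evaluating <Expression>) as a return value concurrently with sending <Expression> an Eval message with environment E and customer F. The default behavior of F is as follows:
When F receives a request R, it checks whether it has already received a response (either a return value or a thrown exception) from evaluating <Expression>, proceeding as follows:
If it already has a response V that is a return value, then V is sent the request R; if V is an exception, then it is thrown to the customer of the request R.
If it does not already have a response, then R is stored in the queue of requests inside F.
When F receives the response V from evaluating <Expression>, then V is stored in F and, if V is a return value, all of the queued requests are sent to V; if V is an exception, it is thrown to the customer of each of the queued requests.
However, some futures can deal with requests in special ways to provide greater parallelism. For example, the expression 1 + future factorial(n) can create a new future that will behave like the number 1+factorial(n). This trick does not always work. For example, the following conditional expression:
if m>future factorial(n) then print("bigger") else print("smaller")
suspends until the future for factorial(n) has responded to the request asking if m is greater than itself.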
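The proxy behaviour of F can be sketched in Python, under the assumption that requests arriving before the response are queued and forwarded once the value is known. FutureActor, request and respond are hypothetical names. Requests are modelled as plain functions applied to the value, and exception responses are left out for brevity.

```python
import threading

class FutureActor:
    """Hypothetical proxy F for the eventual value of an expression."""

    def __init__(self):
        self._lock = threading.Lock()
        self._has_response = False
        self._value = None
        self._queue = []  # requests received before the response

    def request(self, message, customer):
        """Forward `message` to the value, replying to `customer`."""
        with self._lock:
            if not self._has_response:
                self._queue.append((message, customer))
                return
            value = self._value
        customer(message(value))

    def respond(self, value):
        """Store the response, then deliver every queued request."""
        with self._lock:
            self._value, self._has_response = value, True
            pending, self._queue = self._queue, []
        for message, customer in pending:
            customer(message(value))

replies = []
f = FutureActor()
f.request(lambda v: 100 > v, replies.append)  # asks whether 100 > value
queued_only = list(replies)                   # no reply yet: value unknown
f.respond(120)                                # evaluation finished
f.request(lambda v: 1 + v, replies.append)    # answered immediately
```

The comparison request receives no reply until respond is called, which corresponds to the conditional expression suspending until the future has a value.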
The future and/or promise constructs were first implemented in programming languages such as MultiLisp and Act 1. The use of logic variables for communication in concurrent logic programming languages was quite similar to futures. These began in Prolog with Freeze and IC Prolog, and became a true concurrency primitive with Relational Language, Concurrent Prolog, guarded Horn clauses (GHC), Parlog, Strand, Vulcan, Janus, Oz-Mozart, Flow Java, and Alice ML. The single-assignment I-var from dataflow programming languages, originating in Id and included in Reppy's Concurrent ML, is much like the concurrent logic variable.
The promise pipelining technique (using futures to overcome latency) was invented by Barbara Liskov and Liuba Shrira in 1988, [6] and independently by Mark S. Miller, Dean Tribble and Rob Jellinghaus in the context of Project Xanadu circa 1989. [14]
Liskov and Shrira used the term promise for the construct, although they referred to the pipelining mechanism by the name call-stream, which is now rarely used.
Both the design described in Liskov and Shrira's paper, and the implementation of promise pipelining in Xanadu, had the limitation that promise values were not first-class: an argument to, or the value returned by, a call or send could not directly be a promise (so the example of promise pipelining given earlier, which uses a promise for the result of one send as an argument to another, would not have been directly expressible in the call-stream design or in the Xanadu implementation). It seems that promises and call-streams were never implemented in any public release of Argus, [15] the programming language used in the Liskov and Shrira paper. Argus development stopped around 1988. [16] The Xanadu implementation of promise pipelining only became publicly available with the release of the source code for Udanax Gold [17] in 1999, and was never explained in any published document. [18] The later implementations in Joule and E support fully first-class promises and resolvers.
Several early actor languages, including the Act series, [19] [20] supported both parallel message passing and pipelined message processing, but not promise pipelining. (Although it is technically possible to implement the last of these features in the first two, there is no evidence that the Act languages did so.)
After 2000, a major revival of interest in futures and promises occurred, due to their use in keeping user interfaces responsive, and in web development, due to the request–response model of message-passing. Several mainstream languages now have language support for futures and promises, most notably popularized by FutureTask in Java 5 (announced 2004) [21] and the async/await constructions in .NET 4.5 (announced 2010, released 2012), [22] [23] largely inspired by the asynchronous workflows of F#, [24] which dates to 2007. [25] This has subsequently been adopted by other languages, notably Dart (2014), [26] Python (2015), [27] Hack (HHVM), and drafts of ECMAScript 7 (JavaScript), Scala, and C++ (2011).
Some programming languages support futures, promises, concurrent logic variables, dataflow variables, or I-vars, either by direct language support or in the standard library. Examples include:
Java, via java.util.concurrent.Future or java.util.concurrent.CompletableFuture
JavaScript, via async and await since ECMAScript 2017 [33]
Kotlin, although kotlin.native.concurrent.Future is usually only used when writing Kotlin that is intended to run natively [35]
Rust (usually achieved via .await) [41]
Languages also supporting promise pipelining include E and Joule.
Other implementations offer async/non-blocking await. [95]
Futures can be implemented in coroutines [27] or generators, [103] resulting in the same evaluation strategy (e.g., cooperative multitasking or lazy evaluation).
Futures can easily be implemented in channels: a future is a one-element channel, and a promise is a process that sends to the channel, fulfilling the future. [104] [105] This allows futures to be implemented in concurrent programming languages with support for channels, such as CSP and Go. The resulting futures are explicit, as they must be accessed by reading from the channel, rather than by evaluation alone.
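The channel-based construction can be approximated in Python, with a bounded queue.Queue standing in for a one-element channel. The function name future_via_channel is hypothetical. Reading from the queue removes the value, so in this sketch the future can be read only once.

```python
import queue
import threading

def future_via_channel(fn, *args):
    """Return a one-element channel that will receive fn(*args)."""
    channel = queue.Queue(maxsize=1)   # the future

    def promise():
        channel.put(fn(*args))         # sending fulfils the future

    threading.Thread(target=promise, daemon=True).start()
    return channel

channel = future_via_channel(pow, 2, 10)
value = channel.get()   # explicit access: read from the channel
```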