Execution model

In computing, a programming language consists of a syntax plus an execution model. The execution model specifies the behavior of the elements of the language; by applying it, one can derive the behavior of a program written in that language. For example, when a programmer "reads" code, they mentally walk through what each line does, simulating the behavior in their mind. In effect, the programmer is applying the execution model to the code, and the result is the behavior of the code.

Every programming language has an execution model, which determines the manner in which the units of work (indicated by the program's syntax) are scheduled for execution. Detailed examples of specified execution models include those of Python,[1] the Unified Parallel C (UPC) programming language,[2] various classes of execution model such as those of imperative versus functional languages,[3] and execution models for real-time embedded languages.[4]

Details of an execution model

Operational semantics is one method of specifying a language's execution model. The observed behavior of a running program must match the behavior derived from the operational semantics, which defines the execution model of the language.

An execution model covers such things as what constitutes an indivisible unit of work and what constraints there are on the order in which those units of work may take place. For example, the addition operation is an indivisible unit of work in many languages, and in sequential languages such units of work are constrained to take place one after the other.

To illustrate this, consider the C programming language as described in the book by Kernighan and Ritchie.[5] C has a concept called a statement. The language specification defines a statement as a chunk of syntax terminated by a ";". The specification then says that "execution of the program proceeds one statement after the other, in sequence". Those words are one piece of the execution model of C: they tell us that statements are indivisible units of work and that they proceed in the same order as their syntactic appearance in the code, except when a control statement such as if or for modifies that order. In stating this, the execution model places constraints on the order in which units of work are performed.
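
As a minimal sketch (not taken from the K&R text; the variable names are illustrative), the following C program consists of statements that execute in syntactic order, except where the if statement chooses between two alternatives at run time:

    #include <stdio.h>

    int main(void) {
        int balance = 100;               /* statement 1: executed first       */
        balance = balance - 30;          /* statement 2: executed second      */
        if (balance > 50)                /* control statement: modifies order */
            printf("high\n");            /* runs only if the test is true     */
        else
            printf("low\n");             /* otherwise this statement runs     */
        printf("balance=%d\n", balance); /* always executed last              */
        return 0;
    }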

The C language actually has an additional level to its execution model: operator precedence. Precedence states the rules for the order of operations within a single statement, and can be viewed as stating the constraints on performing the units of work that make up a single statement. So ";", if, and while cover constraints on the order of statements, while operator precedence covers constraints on work within a statement. Hence these parts of the C language specification are also part of the execution model of C.
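
A short illustration of precedence constraining the work inside a single statement (the values are illustrative):

    #include <stdio.h>

    int main(void) {
        /* '*' binds tighter than '+', so the multiplication is a unit
           of work that must complete before the addition can run.    */
        int r1 = 2 + 3 * 4;      /* evaluates as 2 + (3 * 4) == 14      */
        int r2 = (2 + 3) * 4;    /* parentheses override precedence: 20 */
        printf("%d %d\n", r1, r2);
        return 0;
    }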

Execution models can also exist independently of programming languages; examples include the POSIX Threads library and Hadoop's MapReduce programming model. The implementation of an execution model can be via a compiler or an interpreter, and often includes a runtime system.

An implementation of an execution model controls the order in which work takes place during execution. This order may be chosen ahead of time in some situations, or it may be determined dynamically as execution proceeds. Most execution models allow varying degrees of both. For example, the C language fixes the order of work within a statement and the order of all statements, except those that involve an if statement or a form of loop statement. Hence most of the order of execution may be chosen statically, before execution begins, but a small portion must be chosen dynamically, as execution proceeds.

The static choices are most often implemented inside a compiler, in which case the order of work is represented by the order in which instructions are placed into the executable binary. The dynamic choices are then implemented inside the language's runtime system. The runtime system may be a library called by instructions the compiler inserts, or it may be embedded into the executable directly, for example as branch instructions that make dynamic choices about which work to perform next.

However, an interpreter may also be constructed for any language, in which case all decisions about the order of execution are dynamic. An interpreter can be viewed as part translator and part execution-model implementation.
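
As a minimal sketch of this view, the following toy interpreter (the opcode set and the program it runs are invented for illustration) makes every ordering decision dynamically inside its dispatch loop:

    #include <stdio.h>

    /* A toy stack-machine interpreter: each trip through the dispatch
       loop dynamically chooses the next unit of work to perform.      */
    enum op { PUSH, ADD, JNZ, HALT };

    struct instr { enum op op; int arg; };

    int main(void) {
        struct instr prog[] = {
            { PUSH, 3 },                 /* counter = 3                  */
            { PUSH, -1 }, { ADD, 0 },    /* counter -= 1                 */
            { JNZ, 1 },                  /* loop back while counter != 0 */
            { HALT, 0 },
        };
        int stack[16], sp = 0, pc = 0;
        for (;;) {                       /* the dispatch loop            */
            struct instr i = prog[pc++];
            switch (i.op) {
            case PUSH: stack[sp++] = i.arg; break;
            case ADD:  sp--; stack[sp - 1] += stack[sp]; break;
            case JNZ:  if (stack[sp - 1] != 0) pc = i.arg; break;
            case HALT: printf("result: %d\n", stack[sp - 1]); return 0;
            }
        }
    }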

Assembly language execution model versus implementation by micro-architectures

Assembly languages also have execution models, the same as any other language. Such an execution model is implemented by a CPU micro-architecture. For example, both a five-stage in-order pipeline and a large out-of-order CPU implement the same assembly language execution model. The execution model is the definition of the behavior, so all implementations, whether in-order, out-of-order, interpreted, or JIT-compiled, must give exactly the same result, and that result is defined by the execution model.

Parallel execution models

Parallel programming is an increasingly important topic. Parallel execution models tend to be complex because they involve multiple timelines. They necessarily include the behavior of synchronization constructs, which establish an ordering between activities in one timeline relative to activities in another timeline.

For example, a common synchronization construct is the lock. Consider one timeline that reaches a point at which it executes the "gain ownership of the lock" synchronization construct. In POSIX threads this would be pthread_mutex_lock(&myMutex); in Java it would be lock.lock(). In both cases the timeline is called a thread. The C and Java execution models are sequential, and they state that the timeline has activities that come before the call to gain ownership of the lock and activities that come after it. Likewise there is a "give up ownership of the lock" operation: pthread_mutex_unlock(&myMutex) in C, lock.unlock() in Java. Again, the C and Java execution models define that one group of statements is executed before ownership of the lock is given up and another group is executed after.
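
A minimal sketch of this pairing in C with POSIX threads (myMutex, sharedCounter, and worker are illustrative names, not part of any standard API):

    #include <pthread.h>

    pthread_mutex_t myMutex = PTHREAD_MUTEX_INITIALIZER;
    long sharedCounter = 0;

    void *worker(void *arg) {
        (void)arg;
        /* pre-gain-lock statements run here */
        pthread_mutex_lock(&myMutex);    /* gain ownership of the lock    */
        sharedCounter++;                 /* post-gain-lock statements     */
        pthread_mutex_unlock(&myMutex);  /* give up ownership of the lock */
        /* post-give-up-lock statements run here */
        return NULL;
    }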

Now, consider the case of two timelines, also known as two threads. One thread, call it thread A, executes some statements; call them the A-pre-gain-lock statements. Thread A then executes "gain ownership of the lock", after which it executes the A-post-gain-lock statements. Finally, thread A performs "give up ownership of the lock" and then executes the A-post-give-up-lock statements.

A second thread, call it thread B, executes some statements; call them the B-pre-gain-lock statements. Thread B then executes "gain ownership of the lock", after which it executes the B-post-gain-lock statements.

Now we can state the parallel execution model of the "gain ownership of lock" and "give up ownership of lock" synchronization constructs. The execution model is this:

"In the case that ownership of the lock goes from thread A to thread B, A-post-gain-lock statements come before B-post-gain-lock statements."

The complication comes from the fact that the execution model gives the execution of "give up ownership of the lock" no means of influencing which execution of "gain ownership of the lock" in some other timeline (thread) comes next. Very often only certain handoffs give valid results, so the programmer must think through all possible combinations of one thread giving up the lock and another thread gaining it next, and make sure their code allows only the valid combinations.

The only effect is that the A-post-gain-lock statements come before the B-post-gain-lock statements. No other effect happens, and no other relative ordering can be relied upon. In particular, the A-post-give-up-lock and B-post-gain-lock statements have no defined relative ordering, which surprises many people. Thread A may have been swapped out after giving up ownership, so the A-post-give-up-lock statements may happen long after many B-post-gain-lock statements have finished. That is one of the possibilities that must be considered when designing with locks, and it illustrates why multi-threaded programming is difficult.

Modern parallel languages have execution models that are much easier to use. The thread model was one of the original parallel execution models, which may account for why it has persisted despite being difficult to use.

Related Research Articles

In computer science, a thread of execution is the smallest sequence of programmed instructions that can be managed independently by a scheduler, which is typically a part of the operating system. In many cases, a thread is a component of a process.
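
A minimal POSIX threads sketch of creating and joining one thread (the function and argument names are illustrative):

    #include <pthread.h>
    #include <stdio.h>

    void *greet(void *arg) {
        printf("hello from thread %s\n", (const char *)arg);
        return NULL;
    }

    int main(void) {
        pthread_t t;
        pthread_create(&t, NULL, greet, "1");   /* start a new timeline  */
        pthread_join(t, NULL);                  /* wait for it to finish */
        return 0;
    }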

Thread safety is a computer programming concept applicable to multi-threaded code. Thread-safe code only manipulates shared data structures in a manner that ensures that all threads behave properly and fulfill their design specifications without unintended interaction. There are various strategies for making thread-safe data structures.

Parallel computing is a type of computation in which many calculations or processes are carried out simultaneously. Large problems can often be divided into smaller ones, which can then be solved at the same time. There are several different forms of parallel computing: bit-level, instruction-level, data, and task parallelism. Parallelism has long been employed in high-performance computing, but has gained broader interest due to the physical constraints preventing frequency scaling. As power consumption by computers has become a concern in recent years, parallel computing has become the dominant paradigm in computer architecture, mainly in the form of multi-core processors.

In software engineering, double-checked locking is a software design pattern used to reduce the overhead of acquiring a lock by testing the locking criterion before acquiring the lock. Locking occurs only if the locking criterion check indicates that locking is required.
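
A sketch of the pattern in C11, assuming atomics are available; get_instance and struct config are illustrative names. The first check avoids the lock on the fast path, and the acquire/release ordering keeps readers from seeing a pointer to a partially initialized object:

    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdlib.h>

    struct config { int value; };              /* illustrative payload    */

    static _Atomic(struct config *) instance;  /* zero-initialized: NULL  */
    static pthread_mutex_t init_lock = PTHREAD_MUTEX_INITIALIZER;

    struct config *get_instance(void) {
        struct config *p = atomic_load_explicit(&instance, memory_order_acquire);
        if (p == NULL) {                       /* first check: no lock    */
            pthread_mutex_lock(&init_lock);
            p = atomic_load_explicit(&instance, memory_order_relaxed);
            if (p == NULL) {                   /* second check: locked    */
                p = malloc(sizeof *p);         /* error handling omitted  */
                p->value = 42;
                atomic_store_explicit(&instance, p, memory_order_release);
            }
            pthread_mutex_unlock(&init_lock);
        }
        return p;
    }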

In computer science, a lock or mutex is a synchronization primitive that prevents state from being modified or accessed by multiple threads of execution at once. Locks enforce mutual exclusion concurrency control policies, and with a variety of possible methods there exist multiple unique implementations for different applications.

In concurrent programming, concurrent accesses to shared resources can lead to unexpected or erroneous behavior, so parts of the program where the shared resource is accessed need to be protected in ways that avoid the concurrent access. One way to do so is known as a critical section or critical region. This protected section cannot be entered by more than one process or thread at a time; others are suspended until the first leaves the critical section. Typically, the critical section accesses a shared resource, such as a data structure, a peripheral device, or a network connection, that would not operate correctly in the context of multiple concurrent accesses.

Resource acquisition is initialization (RAII) is a programming idiom used in several object-oriented, statically typed programming languages to describe a particular language behavior. In RAII, holding a resource is a class invariant, and is tied to object lifetime. Resource allocation is done during object creation, by the constructor, while resource deallocation (release) is done during object destruction, by the destructor. In other words, resource acquisition must succeed for initialization to succeed. Thus the resource is guaranteed to be held between when initialization finishes and finalization starts, and to be held only when the object is alive. Thus if there are no object leaks, there are no resource leaks.

In computing, a memory barrier, also known as a membar, memory fence or fence instruction, is a type of barrier instruction that causes a central processing unit (CPU) or compiler to enforce an ordering constraint on memory operations issued before and after the barrier instruction. This typically means that operations issued prior to the barrier are guaranteed to be performed before operations issued after the barrier.
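
A C11 sketch using explicit fences (the producer/consumer names and the spin loop are illustrative): the release fence orders the data write before the flag write, and the acquire fence orders the flag read before the data read:

    #include <stdatomic.h>

    static int data;                 /* plain, non-atomic shared data */
    static atomic_int flag;          /* zero-initialized              */

    void producer(void) {
        data = 42;                                      /* write the data */
        atomic_thread_fence(memory_order_release);      /* memory barrier */
        atomic_store_explicit(&flag, 1, memory_order_relaxed);
    }

    int consumer(void) {
        while (atomic_load_explicit(&flag, memory_order_relaxed) == 0)
            ;                                           /* spin until set */
        atomic_thread_fence(memory_order_acquire);      /* memory barrier */
        return data;                 /* guaranteed to read 42 */
    }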

In concurrent programming, a monitor is a synchronization construct that prevents threads from concurrently accessing a shared object's state and allows them to wait for the state to change. They provide a mechanism for threads to temporarily give up exclusive access in order to wait for some condition to be met, before regaining exclusive access and resuming their task. A monitor consists of a mutex (lock) and at least one condition variable. A condition variable is explicitly 'signalled' when the object's state is modified, temporarily passing the mutex to another thread 'waiting' on the conditional variable.
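
A monitor built in C from a POSIX mutex and condition variable (the names are illustrative); note the while loop that rechecks the condition after every wakeup:

    #include <pthread.h>

    static pthread_mutex_t mon = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t ready_cv = PTHREAD_COND_INITIALIZER;
    static int ready = 0;

    void wait_for_ready(void) {
        pthread_mutex_lock(&mon);
        while (!ready)                       /* recheck after each wakeup */
            pthread_cond_wait(&ready_cv, &mon);  /* releases mon while
                                                    waiting, reacquires it */
        /* ... use the protected state here ... */
        pthread_mutex_unlock(&mon);
    }

    void signal_ready(void) {
        pthread_mutex_lock(&mon);
        ready = 1;                           /* modify state under the lock */
        pthread_cond_signal(&ready_cv);      /* wake a waiting thread       */
        pthread_mutex_unlock(&mon);
    }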

In computer programming, a runtime system or runtime environment is a sub-system that exists both in the computer where a program is created, as well as in the computers where the program is intended to be run. The name comes from the compile time and runtime division from compiled languages, which similarly distinguishes the computer processes involved in the creation of a program (compilation) and its execution in the target machine.

Concurrent computing is a form of computing in which several computations are executed concurrently—during overlapping time periods—instead of sequentially—with one completing before the next starts.

In computer science, a readers–writer lock (RW lock) is a synchronization primitive that solves one of the readers–writers problems. An RW lock allows concurrent access for read-only operations, whereas write operations require exclusive access. This means that multiple threads can read the data in parallel but an exclusive lock is needed for writing or modifying data. When a writer is writing the data, all other writers and readers will be blocked until the writer is finished writing. A common use might be to control access to a data structure in memory that cannot be updated atomically and is invalid until the update is complete.
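
A minimal POSIX readers–writer lock sketch (table and the accessor functions are illustrative names):

    #include <pthread.h>

    static pthread_rwlock_t table_lock = PTHREAD_RWLOCK_INITIALIZER;
    static int table[64];                     /* illustrative shared data */

    int read_entry(int i) {
        pthread_rwlock_rdlock(&table_lock);   /* shared (read) access     */
        int v = table[i];
        pthread_rwlock_unlock(&table_lock);
        return v;
    }

    void write_entry(int i, int v) {
        pthread_rwlock_wrlock(&table_lock);   /* exclusive (write) access */
        table[i] = v;
        pthread_rwlock_unlock(&table_lock);
    }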

In parallel computing, a barrier is a type of synchronization method. A barrier for a group of threads or processes in the source code means any thread/process must stop at this point and cannot proceed until all other threads/processes reach this barrier.
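
A POSIX barrier sketch (the names are illustrative): no thread prints its phase-2 line until all four threads have printed phase 1:

    #include <pthread.h>
    #include <stdio.h>

    #define NTHREADS 4

    static pthread_barrier_t phase_barrier;

    void *worker(void *arg) {
        long id = (long)arg;
        printf("thread %ld: phase 1\n", id);
        pthread_barrier_wait(&phase_barrier);   /* wait for everyone */
        printf("thread %ld: phase 2\n", id);
        return NULL;
    }

    int main(void) {
        pthread_t t[NTHREADS];
        pthread_barrier_init(&phase_barrier, NULL, NTHREADS);
        for (long i = 0; i < NTHREADS; i++)
            pthread_create(&t[i], NULL, worker, (void *)i);
        for (int i = 0; i < NTHREADS; i++)
            pthread_join(t[i], NULL);
        pthread_barrier_destroy(&phase_barrier);
        return 0;
    }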

The Java memory model describes how threads in the Java programming language interact through memory. Together with the description of single-threaded execution of code, the memory model provides the semantics of the Java programming language.

The Java programming language and the Java virtual machine (JVM) is designed to support concurrent programming. All execution takes place in the context of threads. Objects and resources can be accessed by many separate threads. Each thread has its own path of execution, but can potentially access any object in the program. The programmer must ensure read and write access to objects is properly coordinated between threads. Thread synchronization ensures that objects are modified by only one thread at a time and prevents threads from accessing partially updated objects during modification by another thread. The Java language has built-in constructs to support this coordination.

Parallel Extensions was the development name for a managed concurrency library developed by a collaboration between Microsoft Research and the CLR team at Microsoft. The library was released in version 4.0 of the .NET Framework. It is composed of two parts: Parallel LINQ (PLINQ) and Task Parallel Library (TPL). It also consists of a set of coordination data structures (CDS) – sets of data structures used to synchronize and co-ordinate the execution of concurrent tasks.

Loop-level parallelism is a form of parallelism in software programming that is concerned with extracting parallel tasks from loops. The opportunity for loop-level parallelism often arises in computing programs where data is stored in random access data structures. Where a sequential program will iterate over the data structure and operate on indices one at a time, a program exploiting loop-level parallelism will use multiple threads or processes which operate on some or all of the indices at the same time. Such parallelism provides a speedup to overall execution time of the program, typically in line with Amdahl's law.
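
As a sketch (assuming an OpenMP-capable compiler, e.g. cc -fopenmp), the pragma below asks the runtime to divide the loop's independent iterations among threads:

    #include <stdio.h>

    int main(void) {
        enum { N = 1000000 };
        static double a[N], b[N];            /* zero-initialized arrays */

        /* iterations are independent, so they may run in parallel */
        #pragma omp parallel for
        for (int i = 0; i < N; i++)
            a[i] = 2.0 * b[i] + 1.0;

        printf("a[0]=%f\n", a[0]);
        return 0;
    }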

In computing, a memory model describes the interactions of threads through memory and their shared use of the data.

In parallel computing, work stealing is a scheduling strategy for multithreaded computer programs. It solves the problem of executing a dynamically multithreaded computation, one that can "spawn" new threads of execution, on a statically multithreaded computer, with a fixed number of processors. It does so efficiently in terms of execution time, memory usage, and inter-processor communication.

References

  1. "Python Documentation: Execution Model".
  2. "UPC Language Features".
  3. Cardoso, J.M.P.; Diniz, P.C. (2011). Programming Languages and Execution Models. Springer US. ISBN   9780387096711.
  4. PELLIZZONI, R.; BETTI, E.; BAK, S.; YAO, G.; CRISWELL, J.; CACCAMO, M. & KEGLEY, R (2011). "A Predictable Execution Model for COTS-based Embedded Systems" (PDF). Real-Time and Embedded Technology and Applications Symposium. IEEE.
  5. Kernighan, Brian W.; Dennis M. Ritchie (February 1978). The C Programming Language (1st ed.). Englewood Cliffs, NJ: Prentice Hall. ISBN   0-13-110163-3.