Memory model (programming)

In computing, a memory model describes the interactions of threads through memory and their shared use of data.

History and significance

A memory model allows a compiler to perform many important optimizations. Compiler optimizations like loop fusion move statements in the program, which can influence the order of read and write operations of potentially shared variables. Changes in the ordering of reads and writes can cause race conditions. Without a memory model, a compiler is either not allowed to apply such optimizations to multi-threaded programs at all, or may apply them only in special cases. Alternatively, some compilers assume no multi-threaded execution (so that better-optimized code can be produced), which leads to optimizations that are incompatible with multi-threading; these often cause subtle bugs that do not show up in early testing.

Modern programming languages like Java therefore implement a memory model. The memory model specifies synchronization barriers that are established via special, well-defined synchronization operations such as acquiring a lock by entering a synchronized block or method. The memory model stipulates that changes to the values of shared variables only need to be made visible to other threads when such a synchronization barrier is reached. Moreover, the entire notion of a race condition is defined over the order of operations with respect to these memory barriers. [1]
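
A minimal Java sketch of such a barrier (the class, field, and method names are illustrative, not taken from any cited source): the writer publishes values while holding a lock, and a reader acquiring the same lock is guaranteed by the memory model to see those values.

    // Illustrative sketch: entering and leaving a synchronized block acts as a synchronization barrier.
    public class SharedBox {
        private final Object lock = new Object();
        private int value;          // shared; only accessed while holding 'lock'
        private boolean published;  // shared; only accessed while holding 'lock'

        public void publish(int v) {
            synchronized (lock) {   // acquiring the lock is a synchronization barrier
                value = v;
                published = true;
            }                       // releasing the lock makes the writes visible to later acquirers
        }

        public Integer read() {
            synchronized (lock) {   // sees all writes made before the previous release of 'lock'
                if (!published) return null;
                return value;
            }
        }
    }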

These semantics give optimizing compilers a higher degree of freedom when applying optimizations: the compiler only needs to make sure that the values of (potentially shared) variables at synchronization barriers are the same in both the optimized and the unoptimized code. In particular, the compiler can assume that reordering statements in a block of code that contains no synchronization barrier is safe.
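
A hedged Java illustration of that freedom (names are illustrative): between barriers the two plain writes below may be reordered or combined, and only their values at the next barrier are required to match the unoptimized program.

    public class Reordering {
        private int a, b;
        private final Object lock = new Object();

        public void update() {
            a = 1;                 // no synchronization barrier between these two plain
            b = 2;                 // writes, so the compiler may reorder or combine them
            synchronized (lock) {
                // barrier: at this point a == 1 and b == 2 must be observable,
                // in both the optimized and the unoptimized program
            }
        }
    }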

Most research in the area of memory models revolves around designing a memory model that allows a maximal degree of freedom for compiler optimizations while still giving sufficient guarantees about race-free (and, perhaps more importantly, racy) programs, and around proving program optimizations correct with respect to such a memory model.

The Java Memory Model was the first attempt to provide a comprehensive threading memory model for a popular programming language. [2] After it was established that threads could not be implemented safely as a library without placing certain restrictions on the implementation and, in particular, that the C and C++ standards (C99 and C++03) lacked the necessary restrictions, [3] [4] the C++ threading subcommittee set to work on a suitable memory model; in 2005, they submitted C working document n1131 [5] to get the C Committee on board with their efforts. The final revision of the proposed memory model, C++ n2429, [6] was accepted into the C++ draft standard at the October 2007 meeting in Kona. [7] The memory model was then included in the next C++ and C standards, C++11 and C11. [8] [9] The Rust programming language inherited most of C/C++'s memory model. [10]

Related Research Articles

Java and C++ are two prominent object-oriented programming languages. By many language popularity metrics, the two languages have dominated object-oriented and high-performance software development for much of the 21st century, and are often directly compared and contrasted. Java's syntax was based on C/C++.

<span class="mw-page-title-main">Thread (computing)</span> Smallest sequence of programmed instructions that can be managed independently by a scheduler

In computer science, a thread of execution is the smallest sequence of programmed instructions that can be managed independently by a scheduler, which is typically a part of the operating system. In many cases, a thread is a component of a process.
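
For instance, in Java (a minimal sketch, with illustrative names), a thread runs a task concurrently with the thread that started it:

    public class HelloThread {
        public static void main(String[] args) throws InterruptedException {
            Thread worker = new Thread(() ->
                    System.out.println("running in " + Thread.currentThread().getName()));
            worker.start();   // the scheduler now manages this thread independently of main
            worker.join();    // main waits for the worker to finish
        }
    }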

In multi-threaded computer programming, a function is thread-safe when it can be invoked or accessed concurrently by multiple threads without causing unexpected behavior, race conditions, or data corruption. In a multi-threaded context, where a program executes several threads simultaneously in a shared address space and each thread has access to every other thread's memory, thread-safe functions need to ensure that all those threads behave properly and fulfill their design specifications without unintended interaction.
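
A minimal Java sketch of a thread-safe counter (illustrative, not from any cited source): the synchronized methods serialize the read-modify-write so concurrent callers cannot interleave destructively.

    public class SafeCounter {
        private long count;

        public synchronized void increment() {  // at most one thread executes a synchronized
            count++;                            // method of this object at a time
        }

        public synchronized long get() {
            return count;
        }
    }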

In software engineering, double-checked locking is a software design pattern used to reduce the overhead of acquiring a lock by testing the locking criterion before acquiring the lock. Locking occurs only if the locking criterion check indicates that locking is required.
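
A common Java rendering of the pattern (a sketch; the class name is illustrative) declares the field volatile so that, under the Java memory model, a thread passing the first check observes a fully constructed object:

    public class LazySingleton {
        // 'volatile' is what makes double-checked locking safe under the Java memory model
        private static volatile LazySingleton instance;

        public static LazySingleton getInstance() {
            LazySingleton local = instance;        // first check, without the lock
            if (local == null) {
                synchronized (LazySingleton.class) {
                    local = instance;              // second check, under the lock
                    if (local == null) {
                        instance = local = new LazySingleton();
                    }
                }
            }
            return local;
        }

        private LazySingleton() {}
    }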

<span class="mw-page-title-main">OpenMP</span> Open standard for parallelizing

OpenMP is an application programming interface (API) that supports multi-platform shared-memory multiprocessing programming in C, C++, and Fortran, on many platforms, instruction-set architectures and operating systems, including Solaris, AIX, FreeBSD, HP-UX, Linux, macOS, and Windows. It consists of a set of compiler directives, library routines, and environment variables that influence run-time behavior.

<span class="mw-page-title-main">Race condition</span> When a systems behavior depends on timing of uncontrollable events

A race condition or race hazard is the condition of an electronic, software, or other system in which the system's substantive behavior depends on the sequence or timing of other uncontrollable events, leading to unexpected or inconsistent results. It becomes a bug when one or more of the possible behaviors is undesirable.
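
A classic software example (a Java sketch, not from the article): two threads incrementing a shared counter without synchronization race on the read-modify-write, so the final result depends on how the operations interleave.

    public class LostUpdate {
        static int count = 0;   // shared and unsynchronized

        public static void main(String[] args) throws InterruptedException {
            Runnable task = () -> {
                for (int i = 0; i < 100_000; i++) {
                    count++;    // read, add, write: another thread can interleave between the steps
                }
            };
            Thread t1 = new Thread(task), t2 = new Thread(task);
            t1.start(); t2.start();
            t1.join();  t2.join();
            System.out.println(count);   // often less than 200000, because increments are lost
        }
    }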

In computer science, an algorithm is called non-blocking if failure or suspension of any thread cannot cause failure or suspension of another thread; for some operations, these algorithms provide a useful alternative to traditional blocking implementations. A non-blocking algorithm is lock-free if there is guaranteed system-wide progress, and wait-free if there is also guaranteed per-thread progress. "Non-blocking" was used as a synonym for "lock-free" in the literature until the introduction of obstruction-freedom in 2003.
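
A lock-free structure typically retries a compare-and-set (CAS) until it succeeds; below is a minimal Java sketch of a Treiber-style stack (illustrative names, not from a cited source).

    import java.util.concurrent.atomic.AtomicReference;

    public class LockFreeStack<T> {
        private static final class Node<T> {
            final T value;
            final Node<T> next;
            Node(T value, Node<T> next) { this.value = value; this.next = next; }
        }

        private final AtomicReference<Node<T>> top = new AtomicReference<>();

        public void push(T value) {
            Node<T> oldTop;
            Node<T> newTop;
            do {
                oldTop = top.get();
                newTop = new Node<>(value, oldTop);
                // the CAS fails only if another thread's CAS succeeded in the meantime,
                // which guarantees system-wide progress (lock-free, but not wait-free)
            } while (!top.compareAndSet(oldTop, newTop));
        }

        public T pop() {
            Node<T> oldTop;
            do {
                oldTop = top.get();
                if (oldTop == null) {
                    return null;   // empty stack
                }
            } while (!top.compareAndSet(oldTop, oldTop.next));
            return oldTop.value;
        }
    }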

In computing, a memory barrier, also known as a membar, memory fence or fence instruction, is a type of barrier instruction that causes a central processing unit (CPU) or compiler to enforce an ordering constraint on memory operations issued before and after the barrier instruction. This typically means that operations issued prior to the barrier are guaranteed to be performed before operations issued after the barrier.
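
In Java, explicit fences are exposed (since Java 9) as static methods on java.lang.invoke.VarHandle. The following is only a hedged sketch with illustrative names; with plain field accesses the code is still formally racy under the Java memory model, and declaring the flag volatile is the usual approach.

    import java.lang.invoke.VarHandle;

    public class FenceSketch {
        private int data;
        private boolean ready;

        void writer() {
            data = 42;
            VarHandle.releaseFence();      // the write to 'data' may not be reordered past this fence
            ready = true;
        }

        void reader() {
            if (ready) {
                VarHandle.acquireFence();  // reads below may not be reordered before this fence
                System.out.println(data);  // sketch only: in practice, make 'ready' volatile
            }
        }
    }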

In computer programming, volatile means that a value is prone to change over time, outside the control of some code. Volatility has implications within function calling conventions, and also impacts how variables are stored, accessed and cached.
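
In Java, for example, marking a field volatile tells the compiler and runtime that it may be changed by another thread, so its reads cannot be cached away (a minimal sketch; names are illustrative, and volatile semantics differ between languages):

    public class Worker implements Runnable {
        private volatile boolean running = true;   // written by one thread, read by another

        public void run() {
            while (running) {          // without 'volatile', the JIT could hoist this read out of the loop
                Thread.onSpinWait();   // stand-in for a unit of work
            }
        }

        public void stop() {
            running = false;           // guaranteed to become visible to run()
        }
    }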

Automatic parallelization, also auto parallelization, or autoparallelization refers to converting sequential code into multi-threaded and/or vectorized code in order to use multiple processors simultaneously in a shared-memory multiprocessor (SMP) machine. Fully automatic parallelization of sequential programs is a challenge because it requires complex program analysis and the best approach may depend upon parameter values that are not known at compilation time.

Concurrent computing is a form of computing in which several computations are executed concurrently—during overlapping time periods—instead of sequentially—with one completing before the next starts.

In computer science and engineering, transactional memory attempts to simplify concurrent programming by allowing a group of load and store instructions to execute in an atomic way. It is a concurrency control mechanism analogous to database transactions for controlling access to shared memory in concurrent computing. Transactional memory systems provide high-level abstraction as an alternative to low-level thread synchronization. This abstraction allows for coordination between concurrent reads and writes of shared data in parallel systems.

In computer science, a readers–writer lock (RW lock) is a synchronization primitive that solves one of the readers–writers problems. An RW lock allows concurrent access for read-only operations, whereas write operations require exclusive access. This means that multiple threads can read the data in parallel, but an exclusive lock is needed for writing or modifying data. When a writer is writing the data, all other writers and readers will be blocked until the writer is finished writing. A common use might be to control access to a data structure in memory that cannot be updated atomically and is invalid until the update is complete.
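
In Java, for instance, java.util.concurrent.locks.ReentrantReadWriteLock provides this primitive (a minimal sketch with illustrative names):

    import java.util.HashMap;
    import java.util.Map;
    import java.util.concurrent.locks.ReentrantReadWriteLock;

    public class CachedTable {
        private final ReentrantReadWriteLock rw = new ReentrantReadWriteLock();
        private final Map<String, String> table = new HashMap<>();

        public String get(String key) {
            rw.readLock().lock();        // many readers may hold the read lock at once
            try {
                return table.get(key);
            } finally {
                rw.readLock().unlock();
            }
        }

        public void put(String key, String value) {
            rw.writeLock().lock();       // exclusive: blocks readers and other writers
            try {
                table.put(key, value);
            } finally {
                rw.writeLock().unlock();
            }
        }
    }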

The Java memory model describes how threads in the Java programming language interact through memory. Together with the description of single-threaded execution of code, the memory model provides the semantics of the Java programming language.

In computer programming, a green thread is a thread that is scheduled by a runtime library or virtual machine (VM) instead of natively by the underlying operating system (OS). Green threads emulate multithreaded environments without relying on any native OS abilities, and they are managed in user space instead of kernel space, enabling them to work in environments that do not have native thread support.
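
Java's virtual threads (Java 21 and later) are a contemporary example of this idea: they are scheduled by the JDK runtime onto a small pool of carrier threads rather than one-to-one by the operating system (a minimal sketch):

    public class VirtualThreads {
        public static void main(String[] args) throws InterruptedException {
            // Scheduled by the JDK's scheduler in user space, not mapped 1:1 to OS threads
            Thread vt = Thread.ofVirtual().start(() ->
                    System.out.println("hello from " + Thread.currentThread()));
            vt.join();
        }
    }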

The Java programming language and the Java virtual machine (JVM) are designed to support concurrent programming. All execution takes place in the context of threads. Objects and resources can be accessed by many separate threads. Each thread has its own path of execution, but can potentially access any object in the program. The programmer must ensure that read and write access to objects is properly coordinated between threads. Thread synchronization ensures that objects are modified by only one thread at a time and prevents threads from accessing partially updated objects during modification by another thread. The Java language has built-in constructs to support this coordination.

Memory ordering describes the order of accesses to computer memory by a CPU. The term can refer either to the memory ordering generated by the compiler during compile time, or to the memory ordering generated by a CPU during runtime.

In software development, the programming language Java was historically considered slower than the fastest third-generation typed languages such as C and C++. In contrast to those languages, Java compiles by default to bytecode for a Java virtual machine (JVM), whose operations are distinct from those of the actual computer hardware. Early JVM implementations were interpreters; they simulated the virtual operations one by one rather than translating them into machine code for direct hardware execution.

For several years parallel hardware was available only for distributed computing, but it has since become available in low-end computers as well, making it increasingly necessary for software developers to write parallel applications. Programmers tend to think sequentially and are therefore less accustomed to writing multi-threaded or parallel applications. Parallel programming requires handling issues such as synchronization and deadlock avoidance, so it demands expertise beyond the application domain itself. As a result, programmers prefer to write sequential code, which most popular programming languages support and which lets them concentrate on the application. This creates a need to convert sequential applications into parallel ones with the help of automated tools, a need that is far from trivial because a large amount of legacy code written over the past few decades has to be reused and parallelized.

<span class="mw-page-title-main">Nim (programming language)</span> Programming language

Nim is a general-purpose, multi-paradigm, statically typed, compiled high-level systems programming language, designed and developed by a team around Andreas Rumpf. Nim is designed to be "efficient, expressive, and elegant", supporting metaprogramming, functional, message passing, procedural, and object-oriented programming styles by providing several features such as compile time code generation, algebraic data types, a foreign function interface (FFI) with C, C++, Objective-C, and JavaScript, and supporting compiling to those same languages as intermediate representations.

References

  1. Jeremy Manson and Brian Goetz (February 2004). "JSR 133 (Java Memory Model) FAQ". Retrieved 2010-10-18. The Java Memory Model describes what behaviors are legal in multithreaded code, and how threads may interact through memory. It describes the relationship between variables in a program and the low-level details of storing and retrieving them to and from memory or registers in a real computer system. It does this in a way that can be implemented correctly using a wide variety of hardware and a wide variety of compiler optimizations.
  2. Goetz, Brian (2004-02-24). "Fixing the Java Memory Model, Part 1". IBM . Retrieved 2008-02-17.
  3. Buhr, Peter A. (September 11, 1995). "Are Safe Concurrency Libraries Possible?" (PDF). Communications of the ACM. Retrieved 2015-05-12.
  4. Boehm, Hans-J. (November 12, 2004). "Threads Cannot be Implemented as a Library" (PDF). Archived from the original (PDF) on 2017-05-30. Retrieved 2015-05-12.
  5. Boehm, Hans; Lea, Doug; Pugh, Bill (2005-08-26). "Implications of C++ Memory Model Discussions on the C Language" (PDF). www.open-std.org. Retrieved 2015-05-12.
  6. "WG21/N2429: Concurrency memory model (final revision)". www.open-std.org. 2007-10-05. Retrieved 2015-05-12.
  7. "N2480: A Less Formal Explanation of the Proposed C++ Concurrency Memory Model". www.open-std.org. Retrieved 2015-05-12.
  8. Alexandrescu, Andrei; Boehm, Hans; Henney, Kevlin; Hutchings, Ben; Lea, Doug; Pugh, Bill (2005-03-04). "Memory Model for Multithreaded C++: Issues" (PDF). Retrieved 2014-04-24. C++ threading libraries are in the awkward situation of specifying (implicitly or explicitly) an extended memory model for C++ in order to specify program execution. We propose integrating a memory model suitable for multithreaded execution into the C++ Standard.
  9. Boehm, Hans. "Threads and memory model for C++". Retrieved 2014-04-24. This [link farm] provides information related to the effort to clarify the meaning of multi-threaded C++ programs, and to provide some standard thread-related APIs where those are currently missing.
  10. "The Rustonomicon, Atomics". Retrieved 2024-07-08. Rust pretty blatantly just inherits the memory model for atomics from C++20.