Java memory model

Last updated March 30, 2023

The Java memory model describes how threads in the Java programming language interact through memory. Together with the description of single-threaded execution of code, the memory model provides the semantics of the Java programming language.

The original Java memory model developed in 1995, was widely perceived as broken,^[1] preventing many runtime optimizations and not providing strong enough guarantees for code safety. It was updated through the Java Community Process, as Java Specification Request 133 (JSR-133), which took effect back in 2004, for Tiger (Java 5.0).^[2]^[3]

Context

The Java programming language and platform provide thread capabilities. Synchronization between threads is notoriously difficult for developers; this difficulty is compounded because Java applications can run on a wide range of processors and operating systems. To be able to draw conclusions about a program's behavior, Java's designers decided they had to clearly define possible behaviors of all Java programs.

On modern platforms, code is frequently not executed in the order it was written. It is reordered by the compiler, the processor and the memory subsystem to achieve maximum performance. On multiprocessor architectures, individual processors may have their own local caches that are out of sync with main memory. It is generally undesirable to require threads to remain perfectly in sync with one another because this would be too costly from a performance point of view. This means that at any given time, different threads may see different values for the same shared data.

In a single-threaded environment, it is easy to reason about code execution. The typical approach requires the system to implement as-if-serial semantics for individual threads in isolation. When an individual thread executes, it will appear as if all of the actions taken by that thread occur in the order they appear in the program, even if the actions themselves occur out of order.

If one thread executes its instructions out of order, then another thread might see the fact that those instructions were executed out of order, even if that did not affect the semantics of the first thread. For example, consider two threads with the following instructions, executing concurrently, where the variables x and y are both initialized to 0:

Thread 1	Thread 2
x = 1;	int r1 = y;
y = 2;	int r2 = x;

If no reorderings are performed, and the read of y in Thread 2 returns the value 2, then the subsequent read of x should return the value 1, because the write to x was performed before the write to y. However, if the two writes are reordered, then the read of y can return the value 2, and the read of x can return the value 0.

The Java Memory Model (JMM) defines the allowable behavior of multithreaded programs, and therefore describes when such reorderings are possible. It places execution-time constraints on the relationship between threads and main memory in order to achieve consistent and reliable Java applications. By doing this, it makes it possible to reason about code execution in a multithreaded environment, even in the face of optimizations performed by the dynamic compiler, the processor(s), and the caches.

The memory model

For the execution of a single thread, the rules are simple. The Java Language Specification requires a Java virtual machine to observe within-thread as-if-serial semantics. The runtime (which, in this case, usually refers to the dynamic compiler, the processor and the memory subsystem) is free to introduce any useful execution optimizations as long as the result of the thread in isolation is guaranteed to be exactly the same as it would have been had all the statements been executed in the order the statements occurred in the program (also called program order).^[4]

The major caveat of this is that as-if-serial semantics do not prevent different threads from having different views of the data. The memory model provides clear guidance about what values are allowed to be returned when the data is read. The basic rules imply that individual actions can be reordered, as long as the as-if-serial semantics of the thread are not violated, and actions that imply communication between threads, such as the acquisition or release of a lock, ensure that actions that happen prior to them are seen by other threads that see their effects. For example, everything that happens before the release of a lock will be seen to be ordered before and visible to everything that happens after a subsequent acquisition of that same lock.^[5]

Mathematically, there is a partial order called the happens-before order over all actions performed by the program. The happens-before order subsumes the program order; if one action occurs before another in the program order, it will occur before the other in the happens-before order. In addition, releases and subsequent acquisitions of locks form edges in the happens-before graph. A read is allowed to return the value of a write if that write is the last write to that variable before the read along some path in the happens-before order, or if the write is not ordered with respect to that read in the happens-before order.

Impact

The Java memory model was the first attempt to provide a comprehensive memory model for a popular programming language.^[6] It was justified by the increasing prevalence of concurrent and parallel systems, and the need to provide tools and technologies with clear semantics for such systems. Since then, the need for a memory model has been more widely accepted, with similar semantics being provided for languages such as C++.^[7]

Related Research Articles

In computer science, a thread of execution is the smallest sequence of programmed instructions that can be managed independently by a scheduler, which is typically a part of the operating system. The implementation of threads and processes differs between operating systems. In Modern Operating Systems, Tanenbaum shows that many distinct models of process organization are possible. In many cases, a thread is a component of a process. The multiple threads of a given process may be executed concurrently, sharing resources such as memory, while different processes do not share these resources. In particular, the threads of a process share its executable code and the values of its dynamically allocated variables and non-thread-local global variables at any given time.

Thread safety is a computer programming concept applicable to multi-threaded code. Thread-safe code only manipulates shared data structures in a manner that ensures that all threads behave properly and fulfill their design specifications without unintended interaction. There are various strategies for making thread-safe data structures.

In software engineering, double-checked locking is a software design pattern used to reduce the overhead of acquiring a lock by testing the locking criterion before acquiring the lock. Locking occurs only if the locking criterion check indicates that locking is required.

In computer science, a consistency model specifies a contract between the programmer and a system, wherein the system guarantees that if the programmer follows the rules for operations on memory, memory will be consistent and the results of reading, writing, or updating memory will be predictable. Consistency models are used in distributed systems like distributed shared memory systems or distributed data stores. Consistency is different from coherence, which occurs in systems that are cached or cache-less, and is consistency of data with respect to all processors. Coherence deals with maintaining a global order in which writes to a single location or single variable are seen by all processors. Consistency deals with the ordering of operations to multiple locations with respect to all processors.

A race condition or race hazard is the condition of an electronics, software, or other system where the system's substantive behavior is dependent on the sequence or timing of other uncontrollable events. It becomes a bug when one or more of the possible behaviors is undesirable.

Release consistency is one of the synchronization-based consistency models used in concurrent programming.

In computing, a memory barrier, also known as a membar, memory fence or fence instruction, is a type of barrier instruction that causes a central processing unit (CPU) or compiler to enforce an ordering constraint on memory operations issued before and after the barrier instruction. This typically means that operations issued prior to the barrier are guaranteed to be performed before operations issued after the barrier.

In computer programming, a thread pool is a software design pattern for achieving concurrency of execution in a computer program. Often also called a replicated workers or worker-crew model, a thread pool maintains multiple threads waiting for tasks to be allocated for concurrent execution by the supervising program. By maintaining a pool of threads, the model increases performance and avoids latency in execution due to frequent creation and destruction of threads for short-lived tasks. The number of available threads is tuned to the computing resources available to the program, such as a parallel task queue after completion of execution.

In computer programming, volatile means that a value is prone to change over time, outside the control of some code. Volatility has implications within function calling conventions, and also impacts how variables are stored, accessed and cached.

In concurrent programming, an operation is linearizable if it consists of an ordered list of invocation and response events, that may be extended by adding response events such that:

The extended list can be re-expressed as a sequential history.
That sequential history is a subset of the original unextended list.

Lamport's bakery algorithm is a computer algorithm devised by computer scientist Leslie Lamport, as part of his long study of the formal correctness of concurrent systems, which is intended to improve the safety in the usage of shared resources among multiple threads by means of mutual exclusion.

Automatic parallelization, also auto parallelization, or autoparallelization refers to converting sequential code into multi-threaded and/or vectorized code in order to use multiple processors simultaneously in a shared-memory multiprocessor (SMP) machine. Fully automatic parallelization of sequential programs is a challenge because it requires complex program analysis and the best approach may depend upon parameter values that are not known at compilation time.

Concurrent computing is a form of computing in which several computations are executed concurrently—during overlapping time periods—instead of sequentially—with one completing before the next starts.

In computer science and engineering, transactional memory attempts to simplify concurrent programming by allowing a group of load and store instructions to execute in an atomic way. It is a concurrency control mechanism analogous to database transactions for controlling access to shared memory in concurrent computing. Transactional memory systems provide high-level abstraction as an alternative to low-level thread synchronization. This abstraction allows for coordination between concurrent reads and writes of shared data in parallel systems.

The Java programming language and the Java virtual machine (JVM) have been designed to support concurrent programming, and all execution takes place in the context of threads. Objects and resources can be accessed by many separate threads; each thread has its own path of execution but can potentially access any object in the program. The programmer must ensure read and write access to objects is properly coordinated between threads. Thread synchronization ensures that objects are modified by only one thread at a time and that threads are prevented from accessing partially updated objects during modification by another thread. The Java language has built-in constructs to support this coordination.

Memory ordering describes the order of accesses to computer memory by a CPU. The term can refer either to the memory ordering generated by the compiler during compile time, or to the memory ordering generated by a CPU during runtime.

Thread Level Speculation (TLS), also known as Speculative Multithreading, or Speculative Parallelization, is a technique to speculatively execute a section of computer code that is anticipated to be executed later in parallel with the normal execution on a separate independent thread. Such a speculative thread may need to make assumptions about the values of input variables. If these prove to be invalid, then the portions of the speculative thread that rely on these input variables will need to be discarded and squashed. If the assumptions are correct the program can complete in a shorter time provided the thread was able to be scheduled efficiently.

In software development, the programming language Java was historically considered slower than the fastest 3rd generation typed languages such as C and C++. The main reason being a different language design, where after compiling, Java programs run on a Java virtual machine (JVM) rather than directly on the computer's processor as native code, as do C and C++ programs. Performance was a matter of concern because much business software has been written in Java after the language quickly became popular in the late 1990s and early 2000s.

In computing, a memory model describes the interactions of threads through memory and their shared use of the data.

References

↑ Pugh, William (2000). "The Java memory model is fatally flawed" (PDF). Concurrency: Practice and Experience. 12 (6): 445–455. doi:10.1002/1096-9128(200005)12:6<445::AID-CPE484>3.0.CO;2-A . Retrieved 15 July 2021.
↑ Goetz, Brian (2004-02-24). "Fixing the Java Memory Model, Part 2" (PDF). IBM . Retrieved 2010-10-18.
↑ Jeremy Manson and Brian Goetz (February 2004). "JSR 133 (Java Memory Model) FAQ" . Retrieved 2010-10-18. The Java Memory Model describes what behaviors are legal in multithreaded code, and how threads may interact through memory. It describes the relationship between variables in a program and the low-level details of storing and retrieving them to and from memory or registers in a real computer system. It does this in a way that can be implemented correctly using a wide variety of hardware and a wide variety of compiler optimizations.
↑ Manson, Jeremy. "JSR-133 FAQ".
↑ "JLS Happens-before Order".
↑ Goetz, Brian (2004-02-24). "Fixing the Java Memory Model, Part 1" (PDF). IBM . Retrieved 2008-02-17.
↑ Boehm, Hans. "Threads and memory model for C++" . Retrieved 2014-08-08.

External links

Java theory and practice: Fixing the Java Memory Model, part 1 - An article describing problems with the original Java memory model.
Java theory and practice: Fixing the Java Memory Model, part 2 - Explains the changes JSR 133 made to the Java memory model.
Java Memory Model Pragmatics (transcript)
The Java Memory Model links
Java internal structure
JSR-133 webpage
JSR-133 FAQ
JSR-133 implementation guide

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

[1] Pugh, William (2000). "The Java memory model is fatally flawed" (PDF). Concurrency: Practice and Experience. 12 (6): 445–455. doi:10.1002/1096-9128(200005)12:6<445::AID-CPE484>3.0.CO;2-A . Retrieved 15 July 2021.

[2] Goetz, Brian (2004-02-24). "Fixing the Java Memory Model, Part 2" (PDF). IBM . Retrieved 2010-10-18.

[3] Jeremy Manson and Brian Goetz (February 2004). "JSR 133 (Java Memory Model) FAQ" . Retrieved 2010-10-18. The Java Memory Model describes what behaviors are legal in multithreaded code, and how threads may interact through memory. It describes the relationship between variables in a program and the low-level details of storing and retrieving them to and from memory or registers in a real computer system. It does this in a way that can be implemented correctly using a wide variety of hardware and a wide variety of compiler optimizations.

[4] Manson, Jeremy. "JSR-133 FAQ".

[5] "JLS Happens-before Order".

[6] Goetz, Brian (2004-02-24). "Fixing the Java Memory Model, Part 1" (PDF). IBM . Retrieved 2008-02-17.

[7] Boehm, Hans. "Threads and memory model for C++" . Retrieved 2014-08-08.

[1]

[2]

[3]

[4]

[5]

[6]

[7]