The Java memory model describes how threads in the Java programming language interact through memory. Together with the description of single-threaded execution of code, the memory model provides the semantics of the Java programming language.
The original Java memory model developed in 1995, was widely perceived as broken, [1] preventing many runtime optimizations and not providing strong enough guarantees for code safety. It was updated through the Java Community Process, as Java Specification Request 133 (JSR-133), which took effect back in 2004, for Tiger (Java 5.0). [2] [3]
The Java programming language and platform provide thread capabilities. Synchronization between threads is notoriously difficult for developers; this difficulty is compounded because Java applications can run on a wide range of processors and operating systems. To be able to draw conclusions about a program's behavior, Java's designers decided they had to clearly define possible behaviors of all Java programs.
On modern platforms, code is frequently not executed in the order it was written. It is reordered by the compiler, the processor and the memory subsystem to achieve maximum performance. On multiprocessor architectures, individual processors may have their own local caches that are out of sync with main memory. It is generally undesirable to require threads to remain perfectly in sync with one another because this would be too costly from a performance point of view. This means that at any given time, different threads may see different values for the same shared data.
In a single-threaded environment, it is easy to reason about code execution. The typical approach requires the system to implement as-if-serial semantics for individual threads in isolation. When an individual thread executes, it will appear as if all of the actions taken by that thread occur in the order they appear in the program, even if the actions themselves occur out of order.
If one thread executes its instructions out of order, then another thread might see the fact that those instructions were executed out of order, even if that did not affect the semantics of the first thread. For example, consider two threads with the following instructions, executing concurrently, where the variables x and y are both initialized to 0:
Thread 1 | Thread 2 |
---|---|
x = 1; | int r1 = y; |
y = 2; | int r2 = x; |
If no reorderings are performed, and the read of y in Thread 2 returns the value 2, then the subsequent read of x should return the value 1, because the write to x was performed before the write to y. However, if the two writes are reordered, then the read of y can return the value 2, and the read of x can return the value 0.
The Java Memory Model (JMM) defines the allowable behavior of multithreaded programs, and therefore describes when such reorderings are possible. It places execution-time constraints on the relationship between threads and main memory in order to achieve consistent and reliable Java applications. By doing this, it makes it possible to reason about code execution in a multithreaded environment, even in the face of optimizations performed by the dynamic compiler, the processor(s), and the caches.
For the execution of a single thread, the rules are simple. The Java Language Specification requires a Java virtual machine to observe within-thread as-if-serial semantics. The runtime (which, in this case, usually refers to the dynamic compiler, the processor and the memory subsystem) is free to introduce any useful execution optimizations as long as the result of the thread in isolation is guaranteed to be exactly the same as it would have been had all the statements been executed in the order the statements occurred in the program (also called program order). [4]
The major caveat of this is that as-if-serial semantics do not prevent different threads from having different views of the data. The memory model provides clear guidance about what values are allowed to be returned when the data is read. The basic rules imply that individual actions can be reordered, as long as the as-if-serial semantics of the thread are not violated, and actions that imply communication between threads, such as the acquisition or release of a lock, ensure that actions that happen prior to them are seen by other threads that see their effects. For example, everything that happens before the release of a lock will be seen to be ordered before and visible to everything that happens after a subsequent acquisition of that same lock. [5]
Mathematically, there is a partial order called the happens-before order over all actions performed by the program. The happens-before order subsumes the program order; if one action occurs before another in the program order, it will occur before the other in the happens-before order. In addition, releases and subsequent acquisitions of locks form edges in the happens-before graph. A read is allowed to return the value of a write if that write is the last write to that variable before the read along some path in the happens-before order, or if the write is not ordered with respect to that read in the happens-before order.
The Java memory model was the first attempt to provide a comprehensive memory model for a popular programming language. [6] It was justified by the increasing prevalence of concurrent and parallel systems, and the need to provide tools and technologies with clear semantics for such systems. Since then, the need for a memory model has been more widely accepted, with similar semantics being provided for languages such as C++. [7]
An optimizing compiler is a compiler designed to generate code that is optimized in aspects such as minimizing program execution time, memory use, storage size, and power consumption. Optimization is generally implemented as a sequence of optimizing transformations, algorithms that transform code to produce semantically equivalent code optimized for some aspect.
In computer science, a thread of execution is the smallest sequence of programmed instructions that can be managed independently by a scheduler, which is typically a part of the operating system. In many cases, a thread is a component of a process.
In multi-threaded computer programming, a function is thread-safe when it can be invoked or accessed concurrently by multiple threads without causing unexpected behavior, race conditions, or data corruption. As in the multi-threaded context where a program executes several threads simultaneously in a shared address space and each of those threads has access to every other thread's memory, thread-safe functions need to ensure that all those threads behave properly and fulfill their design specifications without unintended interaction.
In software engineering, double-checked locking is a software design pattern used to reduce the overhead of acquiring a lock by testing the locking criterion before acquiring the lock. Locking occurs only if the locking criterion check indicates that locking is required.
A race condition or race hazard is the condition of an electronics, software, or other system where the system's substantive behavior is dependent on the sequence or timing of other uncontrollable events, leading to unexpected or inconsistent results. It becomes a bug when one or more of the possible behaviors is undesirable.
Sequential consistency is a consistency model used in the domain of concurrent computing.
Release consistency is one of the synchronization-based consistency models used in concurrent programming.
In computing, a memory barrier, also known as a membar, memory fence or fence instruction, is a type of barrier instruction that causes a central processing unit (CPU) or compiler to enforce an ordering constraint on memory operations issued before and after the barrier instruction. This typically means that operations issued prior to the barrier are guaranteed to be performed before operations issued after the barrier.
In computer programming, a value is said to be volatile if it can be read or modified asynchronously by something other than the current thread of execution. The value of a volatile
variable may spontaneously change for reasons such as: sharing values with other threads; sharing values with asynchronous signal handlers; accessing hardware devices via memory-mapped I/O . Support for these use cases varies considerably among the several programming language that have the volatile
keyword. Volatility can have implications regarding function calling conventions and how variables are stored, accessed and cached.
In concurrent programming, an operation is linearizable if it consists of an ordered list of invocation and response events, that may be extended by adding response events such that:
Lamport's bakery algorithm is a computer algorithm devised by computer scientist Leslie Lamport, as part of his long study of the formal correctness of concurrent systems, which is intended to improve the safety in the usage of shared resources among multiple threads by means of mutual exclusion.
Automatic parallelization, also auto parallelization, or autoparallelization refers to converting sequential code into multi-threaded and/or vectorized code in order to use multiple processors simultaneously in a shared-memory multiprocessor (SMP) machine. Fully automatic parallelization of sequential programs is a challenge because it requires complex program analysis and the best approach may depend upon parameter values that are not known at compilation time.
Concurrent computing is a form of computing in which several computations are executed concurrently—during overlapping time periods—instead of sequentially—with one completing before the next starts.
In computer science and engineering, transactional memory attempts to simplify concurrent programming by allowing a group of load and store instructions to execute in an atomic way. It is a concurrency control mechanism analogous to database transactions for controlling access to shared memory in concurrent computing. Transactional memory systems provide high-level abstraction as an alternative to low-level thread synchronization. This abstraction allows for coordination between concurrent reads and writes of shared data in parallel systems.
In computer programming, a green thread is a thread that is scheduled by a runtime library or virtual machine (VM) instead of natively by the underlying operating system (OS). Green threads emulate multithreaded environments without relying on any native OS abilities, and they are managed in user space instead of kernel space, enabling them to work in environments that do not have native thread support.
The Java programming language and the Java virtual machine (JVM) is designed to support concurrent programming. All execution takes place in the context of threads. Objects and resources can be accessed by many separate threads. Each thread has its own path of execution, but can potentially access any object in the program. The programmer must ensure read and write access to objects is properly coordinated between threads. Thread synchronization ensures that objects are modified by only one thread at a time and prevents threads from accessing partially updated objects during modification by another thread. The Java language has built-in constructs to support this coordination.
Memory ordering is the order of accesses to computer memory by a CPU. Memory ordering depends on both the order of the instructions generated by the compiler at compile time and the execution order of the CPU at runtime. However, memory order is of little concern outside of multithreading and memory-mapped I/O, because if the compiler or CPU changes the order of any operations, it must necessarily ensure that the reordering does not change the output of ordinary single-threaded code.
Thread Level Speculation (TLS), also known as Speculative Multi-threading, or Speculative Parallelization, is a technique to speculatively execute a section of computer code that is anticipated to be executed later in parallel with the normal execution on a separate independent thread. Such a speculative thread may need to make assumptions about the values of input variables. If these prove to be invalid, then the portions of the speculative thread that rely on these input variables will need to be discarded and squashed. If the assumptions are correct the program can complete in a shorter time provided the thread was able to be scheduled efficiently.
In software development, the programming language Java was historically considered slower than the fastest third-generation typed languages such as C and C++. In contrast to those languages, Java compiles by default to a Java Virtual Machine (JVM) with operations distinct from those of the actual computer hardware. Early JVM implementations were interpreters; they simulated the virtual operations one-by-one rather than translating them into machine code for direct hardware execution.
In computing, a memory model describes the interactions of threads through memory and their shared use of the data.
The Java Memory Model describes what behaviors are legal in multithreaded code, and how threads may interact through memory. It describes the relationship between variables in a program and the low-level details of storing and retrieving them to and from memory or registers in a real computer system. It does this in a way that can be implemented correctly using a wide variety of hardware and a wide variety of compiler optimizations.