Treiber stack

Last updated January 25, 2025

The Treiber stack algorithm is a scalable lock-free stack utilizing the fine-grained concurrency primitive compare-and-swap.^[1] It is believed that R. Kent Treiber was the first to publish it in his 1986 article "Systems Programming: Coping with Parallelism".^[2]

Basic principle

The basic principle for the algorithm is to only add something new to the stack once you know the item you are trying to add is the only thing that has been added since you began the operation. This is done by using compare-and-swap. Pushing an item to the stack is done by first taking the top of the stack (old head) and placing it after your new item to create a new head. You then compare the old head to the current head. If the two are matching then you can swap old head to the new one, if not then it means another thread has added an item to the stack, in which case you must try again.

When popping an item from the stack, before returning the item you must check that another thread has not added a new item since the operation began.

Correctness

In some languages—particularly, those without garbage collection—the Treiber stack can be at risk for the ABA problem. When a process is about to remove an element from the stack (just before the compare and set in the pop routine below) another process can change the stack such that the head is the same, but the second element is different. The compare and swap will set the head of the stack to the old second element in the stack mixing up the complete data structure. However, the Java version on this page is not subject to this problem, because of the stronger guarantees offered by the Java runtime (it is impossible for a newly created, unaliased object reference to be reference-equal to any other reachable object.)

Testing for failures such as ABA can be exceedingly difficult, because the problematic sequence of events is very rare. Model checking is an excellent way to uncover such problems. See for instance exercise 7.3.3 in "Modeling and analysis of communicating Systems".^[3]

Java example

Below is an implementation of the Treiber Stack in Java, based on the one provided by book Java Concurrency in Practice.^[4]

importjava.util.concurrent.atomic.*;importnet.jcip.annotations.*;/** * ConcurrentStack * * Nonblocking stack using Treiber's algorithm * * @author Brian Goetz and Tim Peierls */@ThreadSafepublicclassConcurrentStack<E>{AtomicReference<Node<E>>top=newAtomicReference<Node<E>>();publicvoidpush(Eitem){Node<E>newHead=newNode<E>(item);Node<E>oldHead;do{oldHead=top.get();newHead.next=oldHead;}while(!top.compareAndSet(oldHead,newHead));}publicEpop(){Node<E>oldHead;Node<E>newHead;do{oldHead=top.get();if(oldHead==null)returnnull;newHead=oldHead.next;}while(!top.compareAndSet(oldHead,newHead));returnoldHead.item;}privatestaticclassNode<E>{publicfinalEitem;publicNode<E>next;publicNode(Eitem){this.item=item;}}}

Related Research Articles

In computer science, a linked list is a linear collection of data elements whose order is not given by their physical placement in memory. Instead, each element points to the next. It is a data structure consisting of a collection of nodes which together represent a sequence. In its most basic form, each node contains data, and a reference to the next node in the sequence. This structure allows for efficient insertion or removal of elements from any position in the sequence during iteration. More complex variants add additional links, allowing more efficient insertion or removal of nodes at arbitrary positions. A drawback of linked lists is that data access time is linear in respect to the number of nodes in the list. Because nodes are serially linked, accessing any node requires that the prior node be accessed beforehand. Faster access, such as random access, is not feasible. Arrays have better cache locality compared to linked lists.

In computer science, a priority queue is an abstract data type similar to a regular queue or stack abstract data type. Each element in a priority queue has an associated priority. In a priority queue, elements with high priority are served before elements with low priority. In some implementations, if two elements have the same priority, they are served in the same order in which they were enqueued. In other implementations, the order of elements with the same priority is undefined.

A splay tree is a binary search tree with the additional property that recently accessed elements are quick to access again. Like self-balancing binary search trees, a splay tree performs basic operations such as insertion, look-up and removal in O(log n) amortized time. For random access patterns drawn from a non-uniform random distribution, their amortized time can be faster than logarithmic, proportional to the entropy of the access pattern. For many patterns of non-random operations, also, splay trees can take better than logarithmic time, without requiring advance knowledge of the pattern. According to the unproven dynamic optimality conjecture, their performance on all access patterns is within a constant factor of the best possible performance that could be achieved by any other self-adjusting binary search tree, even one selected to fit that pattern. The splay tree was invented by Daniel Sleator and Robert Tarjan in 1985.

In computer science, mutual exclusion is a property of concurrency control, which is instituted for the purpose of preventing race conditions. It is the requirement that one thread of execution never enters a critical section while a concurrent thread of execution is already accessing said critical section, which refers to an interval of time during which a thread of execution accesses a shared resource or shared memory.

<span class="mw-page-title-main">Binary heap</span> Variant of heap data structure

A binary heap is a heap data structure that takes the form of a binary tree. Binary heaps are a common way of implementing priority queues. The binary heap was introduced by J. W. J. Williams in 1964 as a data structure for implementing heapsort.

In computer science, an algorithm is called non-blocking if failure or suspension of any thread cannot cause failure or suspension of another thread; for some operations, these algorithms provide a useful alternative to traditional blocking implementations. A non-blocking algorithm is lock-free if there is guaranteed system-wide progress, and wait-free if there is also guaranteed per-thread progress. "Non-blocking" was used as a synonym for "lock-free" in the literature until the introduction of obstruction-freedom in 2003.

In computer science, compare-and-swap (CAS) is an atomic instruction used in multithreading to achieve synchronization. It compares the contents of a memory location with a given value and, only if they are the same, modifies the contents of that memory location to a new given value. This is done as a single atomic operation. The atomicity guarantees that the new value is calculated based on up-to-date information; if the value had been updated by another thread in the meantime, the write would fail. The result of the operation must indicate whether it performed the substitution; this can be done either with a simple boolean response, or by returning the value read from the memory location, thus "swapping" the read and written values.

In concurrent programming, an operation is linearizable if it consists of an ordered list of invocation and response events, that may be extended by adding response events such that:

The extended list can be re-expressed as a sequential history.
That sequential history is a subset of the original unextended list.

In a multithreaded computing environment, hazard pointers are one approach to solving the problems posed by dynamic memory management of the nodes in a lock-free data structure. These problems generally arise only in environments that don't have automatic garbage collection.

Concurrent computing is a form of computing in which several computations are executed concurrently—during overlapping time periods—instead of sequentially—with one completing before the next starts.

In computing, a threaded binary tree is a binary tree variant that facilitates traversal in a particular order.

The Java collections framework is a set of classes and interfaces that implement commonly reusable collection data structures.

In multithreaded computing, the ABA problem occurs during synchronization, when a location is read twice, has the same value for both reads, and the read value being the same twice is used to conclude that nothing has happened in the interim; however, another thread can execute between the two reads and change the value, do other work, then change the value back, thus fooling the first thread into thinking nothing has changed even though the second thread did work that violates that assumption.

<span class="mw-page-title-main">Bubble sort</span> Simple sorting algorithm using comparisons

Bubble sort, sometimes referred to as sinking sort, is a simple sorting algorithm that repeatedly steps through the input list element by element, comparing the current element with the one after it, swapping their values if needed. These passes through the list are repeated until no swaps have to be performed during a pass, meaning that the list has become fully sorted. The algorithm, which is a comparison sort, is named for the way the larger elements "bubble" up to the top of the list.

In parallel computing, work stealing is a scheduling strategy for multithreaded computer programs. It solves the problem of executing a dynamically multithreaded computation, one that can "spawn" new threads of execution, on a statically multithreaded computer, with a fixed number of processors. It does so efficiently in terms of execution time, memory usage, and inter-processor communication.

In computer science, persistent memory is any method or apparatus for efficiently storing data structures such that they can continue to be accessed using memory instructions or memory APIs even after the end of the process that created or last modified them.

A non-blocking linked list is an example of non-blocking data structures designed to implement a linked list in shared memory using synchronization primitives:

The Java programming language's Java Collections Framework version 1.5 and later defines and implements the original regular single-threaded Maps, and also new thread-safe Maps implementing the java.util.concurrent.ConcurrentMap interface among other concurrent interfaces. In Java 1.6, the java.util.NavigableMap interface was added, extending java.util.SortedMap, and the java.util.concurrent.ConcurrentNavigableMap interface was added as a subinterface combination.

A concurrent hash table or concurrent hash map is an implementation of hash tables allowing concurrent access by multiple threads using a hash function.

In computer programming, a virtual thread is a thread that is managed by a runtime library or virtual machine (VM) and made to resemble "real" operating system thread to code executing on it, while requiring substantially fewer resources than the latter.

References

↑ Hendler, D., Shavit, N. and Yerushalmi, L., 2004, June. A scalable lock-free stack algorithm. In Proceedings of the sixteenth annual ACM symposium on Parallelism in algorithms and architectures (pp. 206-215). ACM.
↑ Treiber, R.K., 1986. Systems programming: Coping with parallelism. International Business Machines Incorporated, Thomas J. Watson Research Center.
↑ J.F. Groote and M.R. Mousavi. Modeling and analysis of communicating systems. The MIT Press 2014.
↑ Jcip.net. (2016). [online] Available at: http://jcip.net/listings/ConcurrentStack.java [Accessed 13 May 2016]

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

[1] Hendler, D., Shavit, N. and Yerushalmi, L., 2004, June. A scalable lock-free stack algorithm. In Proceedings of the sixteenth annual ACM symposium on Parallelism in algorithms and architectures (pp. 206-215). ACM.

[2] Treiber, R.K., 1986. Systems programming: Coping with parallelism. International Business Machines Incorporated, Thomas J. Watson Research Center.

[3] J.F. Groote and M.R. Mousavi. Modeling and analysis of communicating systems. The MIT Press 2014.

[4] Jcip.net. (2016). [online] Available at: http://jcip.net/listings/ConcurrentStack.java [Accessed 13 May 2016]

[1]

[2]

[3]

[4]