Shared register

Last updated

In distributed computing, shared-memory systems and message-passing systems are two means of interprocess communication which have been heavily studied. In shared-memory systems, processes communicate by accessing shared data structures. A shared (read/write) register, sometimes just called a register, is a fundamental type of shared data structure which stores a value and has two operations: read, which returns the value stored in the register, and write, which updates the value stored. Other types of shared data structures include read–modify–write, test-and-set, compare-and-swap etc. The memory location which is concurrently accessed is sometimes called a register.

Contents

Classification

Registers can be classified according to the consistency condition they satisfy when accessed concurrently, the domain of possible values that can be stored, and how many processes can access with the read or write operation, which leads to in total 24 register types. [1]

Shared Register Classification
Consistency conditionDomainWriteRead
safe
regular
atomic
binary
integer
single-writer
multi-writer
single-reader
multi-reader

When read and write happen concurrently, the value returned by read may not be uniquely determined. Lamport defined three types of registers: safe registers, regular registers and atomic registers. [1] A read operation of a safe register can return any value if it is concurrent with a Write operation, and returns the value written by the most recent write operation if the read operation does not overlap with any write. A regular register differs from a safe register in that the read operation can return the value written by either the most recent completed Write operation or a Write operation it overlaps with. An atomic register satisfies the stronger condition of being linearizable.

Registers can be characterized by how many processes can access with a read or write operation. A single-writer (SW) register can only be written by one process and a multiple-writer (MW) register can be written by multiple processes. Similarly single-reader (SR) register can only be read by one process and multiple-reader (MR) register can be read by multiple processes. For a SWSR register, it is not necessary that the writer process and the reader process are the same.

Constructions

The figure below illustrates the constructions stage by stage from the implementation of SWSR register in an asynchronous message-passing system to the implementation of MWMR register using a SW Snapshot object. This kind of construction is sometimes called simulation or emulation. [2] In each stage (except Stage 3), the object type on the right can be implemented by the simpler object type on the left. The constructions of each stage (except Stage 3) are briefly presented below. There is an article which discusses the details of constructing snapshot objects.



 
Shared Register Stages of Constructions

An implementation is linearizable if, for every execution there is a linearization ordering that satisfies the following two properties:

  1. if operations were done sequentially in order of their linearization, they would return the same result as in the concurrent execution.
  2. If operation op1 ends before operation op2 begins, then op1 comes before op2 in linearization.

Implementing an atomic SWSR register in a message passing system

A SWSR atomic (linearizable) register can be implemented in an asynchronous message-passing system, even if processes may crash. There is no time limit for processes to deliver messages to receivers or to execute local instructions. In other words, processes can not distinguish between processes which respond slowly or simply crash.

Implementation of Atomic SWSR Register in MP System SWSR2.JPG
Implementation of Atomic SWSR Register in MP System

The implementation given by Attiya, Bar-Noy and Dolev [3] requires n > 2f, where n is the total number of processes in the system, and f is the maximum number of processes that can crash during execution. The algorithm is as follows:

WriterReader
WRITE(v)   t++   send (v,t) to all processes   wait until getting (n-f) acknowledgements 
READ()   send read request to all processes   wait until getting (n-f) responses of them   choose v with the biggest t 

The linearization order of operations is: linearize writes in the order as they occur and insert the read after the write whose value it returns. We can check that the implementation is linearizable. We can check property 2 especially when op1 is write and op2 is read, and read is immediately after write. We can show by contradiction. Assume the read does not see the write, and then according to the implementation, we must have two disjoint sets of size (n - f) among the n processes. So 2 * (n - f) ≤ n leading to n≤ 2f, which contradicts the fact that n > 2f. So the read must read at least one value written by that write.

Implementing a SWMR register from SWSR registers

A SWMR register can be written by only one process but can be read by multiple processes.

Implementation of SWMR register using SWSR registers
Readers
Writers
A[1,1]A[1,2]...A[1,n]
A[2,1]A[2,2]...A[2,n]
............
A[n,1]A[n,2]...A[n,n]
wA[n+1,1]A[n+1,2]...A[n+1,n]

Let n be the number of processes which can read the SWMR register. Let Ri, 0 < in, refer to the readers of the SWMR register. Let w be the single writer of the SWMR. The figure on the right shows a construction of a SWMR register using an array of n(n + 1) SWSR registers. We denote the array by A. Each SWSR register A[i,j] is writable by Ri when 0 < in and is writable by w when i = n + 1. Each SWSR register A[i,j] is readable by Rj. The implementations of read and write are shown below.

Writer

w: WRITE(v)

for j = i..n      t++     write (v,t) in A[n+1,j] end for 
Readers

Ri: READ()

for k = 1..(n+1)     (V[k],T[k]) <- read A[k,i] end for take k such that for all l, T[k] >= T[l] r <- V[k] t <- T[k] for j=1..n     write (r,t) in A[i,j] end for return r 

The t-value of an operation is the value of t it writes and the operations are linearized by t-values. If write and read have the same t-value, order write before read. If several reads have the same t-values, order them by the start time.

Implementing a MWMR register from a SW Snapshot object

We can use the a SW Snapshot object of size n to construct a MWMR register.

WriterReaders
Pi: WRITE(v)Pi: READ()
((v1, t1), ..., (vn, tn)) <- V.SCAN() let t = max(t1, ..., tn) + 1 V.UPDATE(i, (v, t)) 
V.SCAN return value with largest timestamp, in the result of the scan 

(break ties by using rightmost pair of largest timestamp)

The linearization order is as follows. Order write operations by t-values. If several writes have the same t-value, order the operation with small process ID in front. Insert reads right after write whose value they return, breaking ties by process ID and if still tied, break tie by start time.

See also

Related Research Articles

<span class="mw-page-title-main">Atomic semantics</span>

Atomic semantics is a type of guarantee provided by a data register shared by several processors in a parallel machine or in a network of computers working together. Atomic semantics are very strong. An atomic register provides strong guarantees even when there is concurrency and failures.

Regular semantics is a computing term which describes one type of guarantee provided by a data register shared by several processors in a parallel machine or in a network of computers working together. Regular semantics are defined for a variable with a single writer but multiple readers. These semantics are stronger than safe semantics but weaker than atomic semantics: they guarantee that there is a total order to the write operations which is consistent with real-time and that read operations return either the value of the last write completed before the read begins, or that of one of the writes which are concurrent with the read.

Safe semantics is a computer hardware consistency model. It describes one type of guarantee that a data register provides when it is shared by several processors in a parallel computer or in a network of computers working together.

<span class="mw-page-title-main">Mutual exclusion</span> In computing, restricting data to be accessible by one thread at a time

In computer science, mutual exclusion is a property of concurrency control, which is instituted for the purpose of preventing race conditions. It is the requirement that one thread of execution never enters a critical section while a concurrent thread of execution is already accessing said critical section, which refers to an interval of time during which a thread of execution accesses a shared resource or shared memory.

In information technology and computer science, especially in the fields of computer programming, operating systems, multiprocessors, and databases, concurrency control ensures that correct results for concurrent operations are generated, while getting those results as quickly as possible.

Multiversion concurrency control, is a concurrency control method commonly used by database management systems to provide concurrent access to the database and in programming languages to implement transactional memory.

A database transaction symbolizes a unit of work, performed within a database management system against a database, that is treated in a coherent and reliable way independent of other transactions. A transaction generally represents any change in a database. Transactions in a database environment have two main purposes:

  1. To provide reliable units of work that allow correct recovery from failures and keep a database consistent even in cases of system failure. For example: when execution prematurely and unexpectedly stops in which case many operations upon a database remain uncompleted, with unclear status.
  2. To provide isolation between programs accessing a database concurrently. If this isolation is not provided, the programs' outcomes are possibly erroneous.

In computer science, a lock or mutex is a synchronization primitive: a mechanism that enforces limits on access to a resource when there are many threads of execution. A lock is designed to enforce a mutual exclusion concurrency control policy, and with a variety of possible methods there exists multiple unique implementations for different applications.

In databases and transaction processing, two-phase locking (2PL) is a concurrency control method that guarantees serializability. It is also the name of the resulting set of database transaction schedules (histories). The protocol uses locks, applied by a transaction to data, which may block other transactions from accessing the same data during the transaction's life.

In computer science, a consistency model specifies a contract between the programmer and a system, wherein the system guarantees that if the programmer follows the rules for operations on memory, memory will be consistent and the results of reading, writing, or updating memory will be predictable. Consistency models are used in distributed systems like distributed shared memory systems or distributed data stores. Consistency is different from coherence, which occurs in systems that are cached or cache-less, and is consistency of data with respect to all processors. Coherence deals with maintaining a global order in which writes to a single location or single variable are seen by all processors. Consistency deals with the ordering of operations to multiple locations with respect to all processors.

<span class="mw-page-title-main">Linearizability</span> Property of some operation(s) in concurrent programming

In concurrent programming, an operation is linearizable if it consists of an ordered list of invocation and response events, that may be extended by adding response events such that:

  1. The extended list can be re-expressed as a sequential history.
  2. That sequential history is a subset of the original unextended list.

In computing, a parallel programming model is an abstraction of parallel computer architecture, with which it is convenient to express algorithms and their composition in programs. The value of a programming model can be judged on its generality: how well a range of different problems can be expressed for a variety of different architectures, and its performance: how efficiently the compiled programs can execute. The implementation of a parallel programming model can take the form of a library invoked from a sequential language, as an extension to an existing language, or as an entirely new language.

Replication in computing involves sharing information so as to ensure consistency between redundant resources, such as software or hardware components, to improve reliability, fault-tolerance, or accessibility.

In concurrency control of databases, transaction processing, and various transactional applications, both centralized and distributed, a transaction schedule is serializable if its outcome is equal to the outcome of its transactions executed serially, i.e. without overlapping in time. Transactions are normally executed concurrently, since this is the most efficient way. Serializability is the major correctness criterion for concurrent transactions' executions. It is considered the highest level of isolation between transactions, and plays an essential role in concurrency control. As such it is supported in all general purpose database systems. Strong strict two-phase locking (SS2PL) is a popular serializability mechanism utilized in most of the database systems since their early days in the 1970s.

Causal consistency is one of the major memory consistency models. In concurrent programming, where concurrent processes are accessing a shared memory, a consistency model restricts which accesses are legal. This is useful for defining correct data structures in distributed shared memory or distributed transactions.

A fundamental problem in distributed computing and multi-agent systems is to achieve overall system reliability in the presence of a number of faulty processes. This often requires coordinating processes to reach consensus, or agree on some data value that is needed during computation. Example applications of consensus include agreeing on what transactions to commit to a database in which order, state machine replication, and atomic broadcasts. Real-world applications often requiring consensus include cloud computing, clock synchronization, PageRank, opinion formation, smart power grids, state estimation, control of UAVs, load balancing, blockchain, and others.

The Java programming language and the Java virtual machine (JVM) is designed to support concurrent programming. All execution takes place in the context of threads. Objects and resources can be accessed by many separate threads. Each thread has its own path of execution, but can potentially access any object in the program. The programmer must ensure read and write access to objects is properly coordinated between threads. Thread synchronization ensures that objects are modified by only one thread at a time and prevents threads from accessing partially updated objects during modification by another thread. The Java language has built-in constructs to support this coordination.

A concurrent hash-trie or Ctrie is a concurrent thread-safe lock-free implementation of a hash array mapped trie. It is used to implement the concurrent map abstraction. It has particularly scalable concurrent insert and remove operations and is memory-efficient. It is the first known concurrent data-structure that supports O(1), atomic, lock-free snapshots.

In distributed computing, a shared snapshot object is a type of data structure, which is shared between several threads or processes. For many tasks, it is important to have a data structure, that can provide a consistent view of the state of the memory. In practice, it turns out that it is not possible to get such a consistent state of the memory by just accessing one shared register after another, since the values stored in individual registers can be changed at any time during this process. To solve this problem, snapshot objects store a vector of n components and provide the following two atomic operations: update(i,v) changes the value in the ith component to v, and scan returns the values stored in all n components. Snapshot objects can be constructed using atomic single-writer multi-reader shared registers.

The Java programming language's Java Collections Framework version 1.5 and later defines and implements the original regular single-threaded Maps, and also new thread-safe Maps implementing the java.util.concurrent.ConcurrentMap interface among other concurrent interfaces. In Java 1.6, the java.util.NavigableMap interface was added, extending java.util.SortedMap, and the java.util.concurrent.ConcurrentNavigableMap interface was added as a subinterface combination.

References

  1. 1 2 Kshemkalyani, Ajay D.; Singhal, Mukesh (2008). Distributed computing : principles, algorithms, and systems . Cambridge: Cambridge University Press. pp.  435–437. ISBN   9780521876346.
  2. Attiya, Hagit; Welch, Jennifer (Mar 25, 2004). Distributed computing: fundamentals, simulations, and advanced topics. John Wiley & Sons, Inc. ISBN   978-0-471-45324-6.
  3. Attiya, Hagit; Bar-Noy, Amotz; Dolev, Danny (1990). "Sharing memory robustly in message-passing systems". Proceedings of the ninth annual ACM symposium on Principles of distributed computing. Vol. PODC '90. pp. 363–375. doi:10.1145/93385.93441. ISBN   089791404X. S2CID   1233774.