Timestamp-based concurrency control

In computer science, a timestamp-based concurrency control algorithm is an optimistic concurrency control method. It is used in some databases to safely handle transactions, using timestamps.

Operation

Assumptions

  1. Every timestamp value is unique and accurately represents an instant in time.
  2. No two timestamps can be the same.
  3. A higher-valued timestamp occurs later in time than a lower-valued timestamp.

Generating a timestamp

A number of different approaches can generate timestamps, such as using the value of the system clock at the start of the transaction, using a thread-safe shared counter that is incremented at the start of each transaction, or a combination of the two.
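As an illustration only (a minimal sketch; the TimestampGenerator name is invented for the example), a thread-safe shared counter guarantees unique, strictly increasing timestamps:

```python
import threading

class TimestampGenerator:
    """Thread-safe shared counter: every call returns a strictly larger value,
    so no two transactions can ever receive the same timestamp."""
    def __init__(self):
        self._lock = threading.Lock()
        self._last = 0

    def next_ts(self):
        with self._lock:
            self._last += 1
            return self._last

gen = TimestampGenerator()
print(gen.next_ts(), gen.next_ts())   # 1 2
```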

Formal definition

Each transaction (Ti) is an ordered list of actions (Ai,1, Ai,2, ..., Ai,n). Before the transaction performs its first action (Ai,1), it is marked with the current timestamp, or any other value from a strictly totally ordered sequence: TS(Ti) = NOW(). Every transaction is also given an initially empty set of transactions upon which it depends, DEP(Ti) = [], and an initially empty set of old objects which it updated, OLD(Ti) = [].

Each object Ok in the database is given two timestamp fields which are not used other than for concurrency control: RT(Ok), the time at which the object's value was last read by a transaction, and WT(Ok), the time at which the object's value was last written by a transaction.

For each transaction Ti:

  For each action Ai,j:
    If Ai,j wishes to read the value of object Ok:
      If WT(Ok) > TS(Ti) then abort (a more recent transaction has overwritten the value),
      Otherwise add the transaction that last wrote Ok to the set of dependencies DEP(Ti) and set RT(Ok) = max(RT(Ok), TS(Ti));
    If Ai,j wishes to update the value of object Ok:
      If RT(Ok) > TS(Ti) then abort (a more recent transaction is already relying on the old value),
      If WT(Ok) > TS(Ti) then skip the write (the Thomas write rule),
      Otherwise store the previous value and WT(Ok) in OLD(Ti), set WT(Ok) = TS(Ti), and update the value of Ok.
  While there is a transaction in DEP(Ti) that has not ended: wait.
  If there is a transaction in DEP(Ti) that aborted, then abort;
  Otherwise: commit.

To abort:

For each pair (Ok, old value, old WT(Ok)) in OLD(Ti):
  If WT(Ok) equals TS(Ti), then restore the old value of Ok and the old WT(Ok).
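The rules above can be condensed into a minimal, single-threaded Python sketch. Everything here (the DBObject and Transaction classes, the Abort exception, and so on) is invented for illustration and is not tied to any particular database; the "wait" step of the commit rule is only noted in a comment because the sketch is single-threaded, and DEP(Ti) is tracked by remembering the writing transaction itself rather than its timestamp.

```python
import itertools

# Monotonically increasing timestamp source (here a simple counter; see
# "Generating a timestamp" above).
_next_ts = itertools.count(1)

class Abort(Exception):
    """Raised when a transaction must be aborted and restarted."""

class DBObject:
    """A database object carrying the two extra concurrency-control fields."""
    def __init__(self, value):
        self.value = value
        self.rt = 0          # RT(Ok): timestamp of the last transaction to read the value
        self.wt = 0          # WT(Ok): timestamp of the last transaction to write the value
        self.writer = None   # transaction that produced the current value (used for DEP)

class Transaction:
    def __init__(self):
        self.ts = next(_next_ts)   # TS(Ti)
        self.dep = set()           # DEP(Ti): transactions this one has read from
        self.old = []              # OLD(Ti): (object, old value, old WT, old writer) for undo
        self.status = 'active'     # 'active', 'committed' or 'aborted'

    def read(self, obj):
        if obj.wt > self.ts:       # a more recent transaction has overwritten the value
            self.abort()
            raise Abort
        if obj.writer is not None and obj.writer is not self:
            self.dep.add(obj.writer)
        obj.rt = max(obj.rt, self.ts)
        return obj.value

    def write(self, obj, value):
        if obj.rt > self.ts:       # a more recent transaction already relies on the old value
            self.abort()
            raise Abort
        if obj.wt > self.ts:       # Thomas write rule: silently skip the outdated write
            return
        self.old.append((obj, obj.value, obj.wt, obj.writer))
        obj.value, obj.wt, obj.writer = value, self.ts, self

    def commit(self):
        # The formal definition waits here until every transaction in DEP(Ti)
        # has ended; this single-threaded sketch assumes they already have.
        if any(t.status == 'aborted' for t in self.dep):
            self.abort()
            raise Abort
        self.status = 'committed'

    def abort(self):
        # Undo only those writes whose value is still the current one.
        for obj, old_value, old_wt, old_writer in reversed(self.old):
            if obj.wt == self.ts:
                obj.value, obj.wt, obj.writer = old_value, old_wt, old_writer
        self.status = 'aborted'
```

A caller that catches Abort is expected to retry the whole transaction with a fresh timestamp.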

Informal definition

Whenever a transaction is initiated, it receives a timestamp, which indicates when it began. These timestamps ensure that transactions affect each object in the order of their timestamps: given two operations on the same object from different transactions, the operation of the transaction with the earlier timestamp must execute before the operation of the transaction with the later timestamp. If an operation arrives too late to satisfy this ordering, its transaction is aborted and must be restarted.

Every object in the database has a read timestamp, which is updated whenever the object's data is read, and a write timestamp, which is updated whenever the object's data is changed.

If a transaction wants to read an object:

  1. If the transaction started before the object's write timestamp (TS(T) < WT(X)), something changed the object's data after the transaction started, so the transaction is aborted and must be restarted.
  2. Otherwise it is safe to read the object, and the object's read timestamp is set to the later of its current value and the transaction's timestamp.

If a transaction wants to write to an object:

  1. If the transaction started before the object's read timestamp (TS(T) < RT(X)), another transaction has already read the old value, and writing now would invalidate that read; the transaction is aborted and must be restarted.
  2. If the transaction started before the object's write timestamp (TS(T) < WT(X)), a later transaction has already overwritten the value; by the Thomas write rule the write is simply skipped and the transaction continues normally.
  3. Otherwise, the transaction writes to the object, and the object's write timestamp is set to the transaction's timestamp.
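Continuing the sketch from the formal definition (the hypothetical DBObject and Transaction classes), the Thomas write rule means an outdated write is simply dropped rather than causing an abort:

```python
x = DBObject(0)

t_old = Transaction()    # receives the smaller (earlier) timestamp
t_new = Transaction()    # receives the larger (later) timestamp

t_new.write(x, 2)        # WT(x) is now t_new's timestamp
t_old.write(x, 1)        # WT(x) > TS(t_old): the write is skipped, no abort
t_new.commit()
t_old.commit()

print(x.value)           # prints 2: the newer value survives
```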

Physically unrealizable

The behavior is physically unrealizable if the results of the transactions could not have occurred had each transaction executed instantaneously at its timestamp. The following are the only two situations that result in physically unrealizable behavior:

  1. Transaction T tries to read X but TS(T) < WT(X). Reason: It means that X has been written to by another transaction after T began.
  2. Transaction T tries to write X but TS(T) < RT(X). Reason: It means that a later transaction read X before it was written by T.
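Both situations can be reproduced with the same sketch used above (the names are again the hypothetical ones introduced in the formal definition):

```python
x = DBObject("initial")
y = DBObject("initial")

t1 = Transaction()       # earlier timestamp
t2 = Transaction()       # later timestamp

# Situation 1: t1 tries to read x, but TS(t1) < WT(x).
t2.write(x, "newer")
try:
    t1.read(x)
except Abort:
    print("t1 aborted: x was overwritten by a younger transaction")

# Situation 2: t3 tries to write y, but TS(t3) < RT(y).
t3 = Transaction()       # earlier timestamp
t4 = Transaction()       # later timestamp
t4.read(y)
try:
    t3.write(y, "too late")
except Abort:
    print("t3 aborted: a younger transaction has already read y")
```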

Recoverability

Note that timestamp ordering in its basic form does not produce recoverable histories. Consider, for example, the following history with transactions T1 and T2, where TS(T1) < TS(T2):

  W1(X), R2(X), C2

This could be produced by a TO scheduler, but it is not recoverable, as T2 commits even though it has read from the uncommitted transaction T1. To make sure that it produces recoverable histories, a scheduler can keep a list of the other transactions each transaction has read from, and not let a transaction commit until this list contains only committed transactions. To avoid cascading aborts, the scheduler can tag data written by uncommitted transactions as dirty, and never let a read operation commence on such a data item until it is untagged. To get a strict history, the scheduler should not allow any operations (reads or writes) on dirty items.
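As an illustration (a sketch building on the hypothetical DBObject/Transaction classes above, not part of the algorithm as defined), dirty tagging can be expressed as a check on whether the item's current value was produced by a still-active transaction:

```python
def is_dirty(obj):
    # An item is dirty while its current value comes from a transaction
    # that has neither committed nor aborted.
    return obj.writer is not None and obj.writer.status == 'active'

def cautious_read(txn, obj):
    # To avoid cascading aborts, refuse to read a dirty item; a real
    # scheduler would block here until the item is untagged.
    if is_dirty(obj):
        raise RuntimeError("dirty item: wait until its writer ends")
    return txn.read(obj)
```

The commit-time rule (wait until every transaction read from has ended, and abort if any of them aborted) corresponds to the DEP(Ti) check already present in the earlier sketch.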

Implementation issues

Timestamp resolution

This is the minimum time elapsed between two adjacent timestamps. If the resolution of the timestamp is too large (coarse), the possibility of two or more timestamps being equal increases, which can allow some transactions to commit out of the correct order. For example, in a system that can create one hundred unique timestamps per second, two events that occur two milliseconds apart may be given the same timestamp even though they occurred at different times.
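As a rough illustration (assuming Python's time module and a hypothetical coarse_ts helper), a clock read at 10 ms resolution yields only one hundred distinct values per second, so nearby events collide:

```python
import time

def coarse_ts():
    # 10 ms resolution: at most one hundred distinct values per second.
    return int(time.time() * 100)

a = coarse_ts()
time.sleep(0.002)        # two events two milliseconds apart...
b = coarse_ts()
print(a == b)            # ...will often receive the same timestamp
```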

Timestamp locking

Even though this technique is a non-locking one, inasmuch as the object is not locked from concurrent access for the duration of a transaction, the act of recording each timestamp against the object requires an extremely short-duration lock on the object or its proxy.
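A sketch of what this short-lived lock might look like, again extending the hypothetical classes introduced in the formal definition:

```python
import threading

class LatchedObject(DBObject):
    """Variant of the earlier DBObject with a per-object latch that protects
    only the timestamp bookkeeping, never the whole transaction."""
    def __init__(self, value):
        super().__init__(value)
        self.latch = threading.Lock()

def record_read(txn, obj):
    # The latch is held only while the timestamps are checked and recorded.
    with obj.latch:
        if obj.wt > txn.ts:
            raise Abort          # physically unrealizable read
        obj.rt = max(obj.rt, txn.ts)
    return obj.value
```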

Related Research Articles

In computer science, particularly the field of databases, the Thomas write rule is a rule in timestamp-based concurrency control. It can be summarized as "ignore outdated writes".

In information technology and computer science, especially in the fields of computer programming, operating systems, multiprocessors, and databases, concurrency control ensures that correct results for concurrent operations are generated, while getting those results as quickly as possible.

In computer science, in the field of databases, a write–read conflict is a computational anomaly associated with interleaved execution of transactions. Specifically, a write–read conflict occurs when a transaction requests to read an entity for which an unclosed transaction has already made a write request.

In computer science, in the field of databases, a write–write conflict, also known as overwriting uncommitted data, is a computational anomaly associated with interleaved execution of transactions. Specifically, a write–write conflict occurs when a transaction requests to write an entity for which an unclosed transaction has already made a write request.

In computer science, in the field of databases, a read–write conflict, also known as an unrepeatable read, is a computational anomaly associated with interleaved execution of transactions. Specifically, a read–write conflict occurs when a transaction requests to write an entity for which an unclosed transaction has already made a read request.

Multiversion concurrency control is a non-locking concurrency control method commonly used by database management systems to provide concurrent access to the database, and in programming languages to implement transactional memory.

Optimistic concurrency control (OCC), also known as optimistic locking, is a non-locking concurrency control method applied to transactional systems such as relational database management systems and software transactional memory. OCC assumes that multiple transactions can frequently complete without interfering with each other. While running, transactions use data resources without acquiring locks on those resources. Before committing, each transaction verifies that no other transaction has modified the data it has read. If the check reveals conflicting modifications, the committing transaction rolls back and can be restarted. Optimistic concurrency control was first proposed in 1979 by H. T. Kung and John T. Robinson.

In the fields of databases and transaction processing, a schedule of a system is an abstract model to describe the order of executions in a set of transactions running in the system. Often it is a list of operations (actions) ordered by time, performed by a set of transactions that are executed together in the system. If the order in time between certain operations is not determined by the system, then a partial order is used. Examples of such operations are requesting a read operation, reading, writing, aborting, committing, requesting a lock, locking, etc. Often, only a subset of the transaction operation types are included in a schedule.

In concurrent programming, an operation is linearizable if it consists of an ordered list of invocation and response events, that may be extended by adding response events such that:

  1. The extended list can be re-expressed as a sequential history.
  2. That sequential history is a subset of the original unextended list.

In computer science, software transactional memory (STM) is a concurrency control mechanism analogous to database transactions for controlling access to shared memory in concurrent computing. It is an alternative to lock-based synchronization. STM is a strategy implemented in software, rather than as a hardware component. A transaction in this context occurs when a piece of code executes a series of reads and writes to shared memory. These reads and writes logically occur at a single instant in time; intermediate states are not visible to other (successful) transactions. The idea of providing hardware support for transactions originated in a 1986 paper by Tom Knight. The idea was popularized by Maurice Herlihy and J. Eliot B. Moss. In 1995, Nir Shavit and Dan Touitou extended this idea to software-only transactional memory (STM). Since 2005, STM has been the focus of intense research and support for practical implementations is growing.

Commitment ordering (CO) is a class of interoperable serializability techniques in concurrency control of databases, transaction processing, and related applications. It allows optimistic (non-blocking) implementations. With the proliferation of multi-core processors, CO has also been increasingly utilized in concurrent programming, transactional memory, and software transactional memory (STM) to achieve serializability optimistically. CO is also the name of the resulting transaction schedule (history) property, defined in 1988 with the name dynamic atomicity. In a CO compliant schedule, the chronological order of commitment events of transactions is compatible with the precedence order of the respective transactions. CO is a broad special case of conflict serializability and effective means to achieve global serializability across any collection of database systems that possibly use different concurrency control mechanisms.

In databases, and transaction processing, snapshot isolation is a guarantee that all reads made in a transaction will see a consistent snapshot of the database, and the transaction itself will successfully commit only if no updates it has made conflict with any concurrent updates made since that snapshot.

A precedence graph, also named conflict graph and serializability graph, is used in the context of concurrency control in databases. It is the directed graph representing precedence of transactions in the schedule, as reflected by precedence of conflicting operations in the transactions. A schedule is conflict-serializable if and only if its precedence graph of committed transactions is acyclic.

A quorum is the minimum number of votes that a distributed transaction has to obtain in order to be allowed to perform an operation in a distributed system. A quorum-based technique is implemented to enforce consistent operation in a distributed system.