Timestamp-based concurrency control

Last updated

In computer science, a timestamp-based concurrency control algorithm is a optimistic concurrency control method. It is used in some databases to safely handle transactions, using timestamps.

Contents

Operation

Assumptions

Generating a timestamp

A number of different ways have been used to generate timestamp

Formal

Each transaction () is an ordered list of actions (). Before the transaction performs its first action (), it is marked with the current timestamp, or any other strictly totally ordered sequence: . Every transaction is also given an initially empty set of transactions upon which it depends, , and an initially empty set of old objects which it updated, .

Each object in the database is given two timestamp fields which are not used other than for concurrency control: is the time at which the value of object was last read by a transaction, is the time at which the value of the object was last updated by a transaction.

For all :

For each action :
If wishes to read the value of :
If then abort (a more recent thread has overwritten the value),
Otherwise update the set of dependencies and set ;
If wishes to update the value of :
If then abort (a more recent thread is already relying on the old value),
If then skip (the Thomas Write Rule),
Otherwise store the previous values, , set , and update the value of .
While there is a transaction in that has not ended: wait
If there is a transaction in that aborted then abort
Otherwise: commit.

To abort:

For each in
If equals then restore and

Informal

Whenever a transaction initiated, it receives a timestamp. The transaction's timestamp indicates when the transaction was initiated. These timestamps ensure that transactions affect each object in the same sequence of their respective timestamps. Thus, given two operations that affect the same object from different transactions, the operation of the transaction with the earlier timestamp must execute before the operation of the transaction with the later timestamp. However, if the operation of the wrong transaction is actually presented first, then it is aborted and the transaction must be restarted.

Every object in the database has a read timestamp, which is updated whenever the object's data is read, and a write timestamp, which is updated whenever the object's data is changed.

If a transaction wants to read an object,

If a transaction wants to write to an object,

Physically unrealizable

The behavior is physically unrealizable if the results of transactions could not have occurred if transactions were instantaneous. The following are the only two situations that result in physically unrealizable behavior:

  1. Transaction T tries to read X but TS(T) < WT(X). Reason: It means that X has been written to by another transaction after T began.
  2. Transaction T tries to write X but TS(T) < RT(X). Reason: It means that a later transaction read X before it was written by T.

Recoverability

Note that timestamp ordering in its basic form does not produce recoverable histories. Consider for example the following history with transactions and :

This could be produced by a TO scheduler, but is not recoverable, as commits even though having read from an uncommitted transaction. To make sure that it produces recoverable histories, a scheduler can keep a list of other transactions each transaction has read from, and not let a transaction commit before this list consisted of only committed transactions. To avoid cascading aborts, the scheduler could tag data written by uncommitted transactions as dirty, and never let a read operation commence on such a data item before it was untagged. To get a strict history, the scheduler should not allow any operations on dirty items.

Implementation issues

Timestamp resolution

This is the minimum time elapsed between two adjacent timestamps. If the resolution of the timestamp is too large (coarse), the possibility of two or more timestamps being equal is increased and thus enabling some transactions to commit out of correct order. For example, assuming that we have a system that can create one hundred unique timestamps per second, and given two events that occur 2 milliseconds apart, they will probably be given the same timestamp even though they actually occurred at different times.

Timestamp locking

Even though this technique is a non-locking one, in as much as the Object is not locked from concurrent access for the duration of a transaction, the act of recording each timestamp against the Object requires an extremely short duration lock on the Object or its proxy.

See also

Related Research Articles

In artificial intelligence, with implications for cognitive science, the frame problem describes an issue with using first-order logic to express facts about a robot in the world. Representing the state of a robot with traditional first-order logic requires the use of many axioms that simply imply that things in the environment do not change arbitrarily. For example, Hayes describes a "block world" with rules about stacking blocks together. In a first-order logic system, additional axioms are required to make inferences about the environment. The frame problem is the problem of finding adequate collections of axioms for a viable description of a robot environment.

In computer science, particularly the field of databases, the Thomas write rule is a rule in timestamp-based concurrency control. It can be summarized as ignore outdated writes.

In information technology and computer science, especially in the fields of computer programming, operating systems, multiprocessors, and databases, concurrency control ensures that correct results for concurrent operations are generated, while getting those results as quickly as possible.

In computer science, in the field of databases, read–write conflict, also known as unrepeatable reads, is a computational anomaly associated with interleaved execution of transactions. A read-write conflict occurs when a transaction needs to lock an object both before and after another transaction writes it.

Multiversion concurrency control, is a non-locking concurrency control method commonly used by database management systems to provide concurrent access to the database and in programming languages to implement transactional memory.

Optimistic concurrency control (OCC), also known as optimistic locking, is a non-locking concurrency control method applied to transactional systems such as relational database management systems and software transactional memory. OCC assumes that multiple transactions can frequently complete without interfering with each other. While running, transactions use data resources without acquiring locks on those resources. Before committing, each transaction verifies that no other transaction has modified the data it has read. If the check reveals conflicting modifications, the committing transaction rolls back and can be restarted. Optimistic concurrency control was first proposed in 1979 by H. T. Kung and John T. Robinson. The two main versions of OCC is timestamp OCC and validation OCC. Validation OCC is similar to the timestamp system except that data is recorded about the actions of transactions rather than data about database objects.

<span class="mw-page-title-main">Gibbs free energy</span> Type of thermodynamic potential; useful for calculating reversible work in certain systems

In thermodynamics, the Gibbs free energy is a thermodynamic potential that can be used to calculate the maximum amount of work, other than pressure-volume work, that may be performed by a thermodynamically closed system at constant temperature and pressure. It also provides a necessary condition for processes such as chemical reactions that may occur under these conditions. The Gibbs free energy is expressed as

In the fields of databases and transaction processing, a schedule of a system is an abstract model to describe execution of transactions running in the system. Often it is a list of operations (actions) ordered by time, performed by a set of transactions that are executed together in the system. If the order in time between certain operations is not determined by the system, then a partial order is used. Examples of such operations are requesting a read operation, reading, writing, aborting, committing, requesting a lock, locking, etc. Not all transaction operation types should be included in a schedule, and typically only selected operation types are included, as needed to reason about and describe certain phenomena. Schedules and schedule properties are fundamental concepts in database concurrency control theory.

Association rule learning is a rule-based machine learning method for discovering interesting relations between variables in large databases. It is intended to identify strong rules discovered in databases using some measures of interestingness. In any given transaction with a variety of items, association rules are meant to discover the rules that determine how or why certain items are connected.

The equilibrium constant of a chemical reaction is the value of its reaction quotient at chemical equilibrium, a state approached by a dynamic chemical system after sufficient time has elapsed at which its composition has no measurable tendency towards further change. For a given set of reaction conditions, the equilibrium constant is independent of the initial analytical concentrations of the reactant and product species in the mixture. Thus, given the initial composition of a system, known equilibrium constant values can be used to determine the composition of the system at equilibrium. However, reaction parameters like temperature, solvent, and ionic strength may all influence the value of the equilibrium constant.

<span class="mw-page-title-main">Linearizability</span> Property of some operation(s) in concurrent programming

In concurrent programming, an operation is linearizable if it consists of an ordered list of invocation and response events, that may be extended by adding response events such that:

  1. The extended list can be re-expressed as a sequential history.
  2. That sequential history is a subset of the original unextended list.

In computer science, software transactional memory (STM) is a concurrency control mechanism analogous to database transactions for controlling access to shared memory in concurrent computing. It is an alternative to lock-based synchronization. STM is a strategy implemented in software, rather than as a hardware component. A transaction in this context occurs when a piece of code executes a series of reads and writes to shared memory. These reads and writes logically occur at a single instant in time; intermediate states are not visible to other (successful) transactions. The idea of providing hardware support for transactions originated in a 1986 paper by Tom Knight. The idea was popularized by Maurice Herlihy and J. Eliot B. Moss. In 1995 Nir Shavit and Dan Touitou extended this idea to software-only transactional memory (STM). Since 2005, STM has been the focus of intense research and support for practical implementations is growing.

Eventual consistency is a consistency model used in distributed computing to achieve high availability that informally guarantees that, if no new updates are made to a given data item, eventually all accesses to that item will return the last updated value. Eventual consistency, also called optimistic replication, is widely deployed in distributed systems and has origins in early mobile computing projects. A system that has achieved eventual consistency is often said to have converged, or achieved replica convergence. Eventual consistency is a weak guarantee – most stronger models, like linearizability, are trivially eventually consistent.

The Goldman–Hodgkin–Katz voltage equation, sometimes called the Goldman equation, is used in cell membrane physiology to determine the reversal potential across a cell's membrane, taking into account all of the ions that are permeant through that membrane.

In concurrency control of databases, transaction processing, and various transactional applications, both centralized and distributed, a transaction schedule is serializable if its outcome is equal to the outcome of its transactions executed serially, i.e. without overlapping in time. Transactions are normally executed concurrently, since this is the most efficient way. Serializability is the major correctness criterion for concurrent transactions' executions. It is considered the highest level of isolation between transactions, and plays an essential role in concurrency control. As such it is supported in all general purpose database systems. Strong strict two-phase locking (SS2PL) is a popular serializability mechanism utilized in most of the database systems since their early days in the 1970s.

Commitment ordering (CO) is a class of interoperable serializability techniques in concurrency control of databases, transaction processing, and related applications. It allows optimistic (non-blocking) implementations. With the proliferation of multi-core processors, CO has also been increasingly utilized in concurrent programming, transactional memory, and software transactional memory (STM) to achieve serializability optimistically. CO is also the name of the resulting transaction schedule (history) property, defined in 1988 with the name dynamic atomicity. In a CO compliant schedule, the chronological order of commitment events of transactions is compatible with the precedence order of the respective transactions. CO is a broad special case of conflict serializability and effective means to achieve global serializability across any collection of database systems that possibly use different concurrency control mechanisms.

In databases, and transaction processing, snapshot isolation is a guarantee that all reads made in a transaction will see a consistent snapshot of the database, and the transaction itself will successfully commit only if no updates it has made conflict with any concurrent updates made since that snapshot.

In Itô calculus, the Euler–Maruyama method is a method for the approximate numerical solution of a stochastic differential equation (SDE). It is an extension of the Euler method for ordinary differential equations to stochastic differential equations. It is named after Leonhard Euler and Gisiro Maruyama. Unfortunately, the same generalization cannot be done for any arbitrary deterministic method.

A precedence graph, also named conflict graph and serializability graph, is used in the context of concurrency control in databases.

A quorum is the minimum number of votes that a distributed transaction has to obtain in order to be allowed to perform an operation in a distributed system. A quorum-based technique is implemented to enforce consistent operation in a distributed system.