Two-phase locking

In databases and transaction processing, two-phase locking (2PL) is a pessimistic concurrency control method that guarantees conflict-serializability.[1][2] It is also the name of the resulting set of database transaction schedules (histories). The protocol uses locks, applied by a transaction to data, which may block (interpreted as signals to stop) other transactions from accessing the same data during the transaction's life.

By the 2PL protocol, locks are applied and removed in two phases:

  1. Expanding phase: locks are acquired and no locks are released.
  2. Shrinking phase: locks are released and no locks are acquired.

Two types of locks are used by the basic protocol: shared and exclusive locks. Refinements of the basic protocol may use more lock types. Because they use locks that block transactions, 2PL, S2PL, and SS2PL may be subject to deadlocks resulting from the mutual blocking of two or more transactions.
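
As an illustration of such mutual blocking, the following sketch uses Python's standard threading module (purely for illustration; it is not database code). Two "transactions" acquire two locks in opposite order, each holding its first lock while waiting for the other's; the acquisition timeout stands in for a deadlock detector, and the names lock_a, lock_b, and transaction are illustrative.

```python
import threading
import time

lock_a = threading.Lock()
lock_b = threading.Lock()

def transaction(first, second, name):
    with first:
        time.sleep(0.1)  # let both transactions acquire their first lock
        # Second lock request of the expanding phase; the timeout stands in
        # for a deadlock detector that would abort one of the transactions.
        if second.acquire(timeout=1.0):
            second.release()
            print(f"{name}: acquired both locks")
        else:
            print(f"{name}: blocked waiting for its second lock (deadlock)")

t1 = threading.Thread(target=transaction, args=(lock_a, lock_b, "T1"))
t2 = threading.Thread(target=transaction, args=(lock_b, lock_a, "T2"))
t1.start(); t2.start()
t1.join(); t2.join()
```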

Read and write locks

Locks are used to guarantee serializability. A transaction holds a lock on an object if that transaction has acquired a lock on that object and has not yet released it.

For 2PL, the only data-access locks used are read-locks (shared locks) and write-locks (exclusive locks). Their interaction is summarized by the compatibility table below:

Lock compatibility table (X marks a conflict, i.e., the two locks cannot be held on the same object by different transactions; a blank cell means the locks are compatible)

  Lock type    read-lock   write-lock
  read-lock                X
  write-lock   X           X
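
The table can be read as a simple predicate: a requested lock is granted only if it is compatible with every lock already held on the object by other transactions. The sketch below encodes just that check; the names LockType and compatible are illustrative, not taken from any particular system.

```python
from enum import Enum, auto

class LockType(Enum):
    SHARED = auto()     # read-lock
    EXCLUSIVE = auto()  # write-lock

def compatible(held: LockType, requested: LockType) -> bool:
    """True if the requested lock can coexist with a lock already held
    on the same object by a different transaction."""
    return held is LockType.SHARED and requested is LockType.SHARED

# Two read-locks may coexist; any combination involving a write-lock conflicts.
assert compatible(LockType.SHARED, LockType.SHARED)
assert not compatible(LockType.SHARED, LockType.EXCLUSIVE)
assert not compatible(LockType.EXCLUSIVE, LockType.SHARED)
assert not compatible(LockType.EXCLUSIVE, LockType.EXCLUSIVE)
```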

Variants

  Property                              2PL   C2PL                    S2PL   SS2PL
  Guarantees conflict-serializability   Yes   Yes                     Yes    Yes
  Guarantees view-serializability       No    Yes [citation needed]   No     No
  Eliminates deadlocks                  No    Yes                     No     No
  Guarantees recoverability             No    Yes? [citation needed]  Yes    Yes
  Guarantees strictness                 No    Yes? [citation needed]  Yes    Yes
  Prevents phantom reads                No    No [citation needed]    Yes    Yes
  Prevents dirty reads                  No    Yes [citation needed]   Yes    Yes

Two-phase locking

According to the two-phase locking protocol, each transaction handles its locks in two distinct, consecutive phases during the transaction's execution:

  1. Expanding phase (aka Growing phase): locks are acquired and no locks are released (the number of locks can only increase).
  2. Shrinking phase (aka Contracting phase): locks are released and no locks are acquired.

The two-phase locking rule can be summarized as follows: a transaction must never acquire a lock after it has released a lock. Serializability is guaranteed for any schedule whose transactions obey this rule.

Typically, a transaction has no explicit knowledge of when its expanding phase ends. In practice, the rule can only be satisfied safely once the transaction has completed processing and requested to commit; at that point all of its locks can be released at once (phase 2).
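
The sketch below shows one way such a discipline might be enforced inside a transaction object. It is a simplified illustration, not an implementation from the cited sources: the names TwoPhaseTransaction and LockManager are assumptions, and the LockManager here grants every request, whereas a real lock manager would block requests that conflict with locks held by other transactions.

```python
class TwoPhaseViolation(Exception):
    pass

class LockManager:
    """Hypothetical stand-in that grants every request; a real lock manager
    would block while conflicting locks are held by other transactions."""
    def lock(self, txn, obj, lock_type):
        pass

    def unlock(self, txn, obj):
        pass

class TwoPhaseTransaction:
    def __init__(self, lock_manager):
        self.lock_manager = lock_manager
        self.held = set()
        self.shrinking = False  # False = expanding phase, True = shrinking phase

    def acquire(self, obj, lock_type="read"):
        if self.shrinking:
            raise TwoPhaseViolation("lock requested after a lock was released")
        self.lock_manager.lock(self, obj, lock_type)
        self.held.add(obj)

    def release(self, obj):
        self.shrinking = True  # phase 2 begins with the first release
        self.lock_manager.unlock(self, obj)
        self.held.discard(obj)

    def commit(self):
        # Without advance knowledge of when phase 1 ends, release all locks
        # only once the transaction has finished and requested to commit.
        for obj in list(self.held):
            self.release(obj)

# Usage: acquiring after releasing violates the protocol.
txn = TwoPhaseTransaction(LockManager())
txn.acquire("x", "read")
txn.acquire("y", "write")
txn.release("x")
try:
    txn.acquire("z", "read")
except TwoPhaseViolation as err:
    print("rejected:", err)
txn.commit()
```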

Conservative two-phase locking

The difference between 2PL and C2PL is that C2PL's transactions obtain all the locks they need before they begin. This ensures that a transaction already holding some locks will never block waiting for other locks. Conservative 2PL therefore prevents deadlocks.
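
A minimal sketch of this idea is shown below, under the simplifying assumption that lock modes (shared vs. exclusive) are ignored. The hypothetical ConservativeLockManager grants a transaction its entire declared lock set in one atomic step, so the transaction never waits for a lock while already holding another.

```python
import threading

class ConservativeLockManager:
    """Hypothetical all-or-nothing lock manager: a transaction atomically
    acquires every object in its declared lock set before it starts, waiting
    if any of them is currently held by another transaction."""
    def __init__(self):
        self._cond = threading.Condition()
        self._held = set()

    def acquire_all(self, lock_set):
        with self._cond:
            # Wait until the whole set is free, then take it in one step, so a
            # transaction never blocks while already holding some of its locks.
            self._cond.wait_for(lambda: self._held.isdisjoint(lock_set))
            self._held.update(lock_set)

    def release_all(self, lock_set):
        with self._cond:
            self._held.difference_update(lock_set)
            self._cond.notify_all()

def run_conservative(manager, lock_set, body):
    manager.acquire_all(lock_set)      # phase 1 completes before the body runs
    try:
        return body()                  # the body issues no further lock requests
    finally:
        manager.release_all(lock_set)  # phase 2: release everything at the end

# Example: both locks for {"x", "y"} are obtained before any work starts.
mgr = ConservativeLockManager()
print(run_conservative(mgr, {"x", "y"}, lambda: "transferred"))
```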

Strict two-phase locking

To comply with the strict two-phase locking (S2PL) protocol, a transaction must obey 2PL and, in addition, release its write (exclusive) locks only after it has ended (i.e., either committed or aborted). Read (shared) locks, on the other hand, may be released as usual during the shrinking phase.

Unlike 2PL, S2PL provides strictness (a special case of cascade-less recoverability). This protocol is not well suited to B-trees, because every B-tree traversal starts at the root, so holding locks near the root creates a bottleneck. [citation needed]
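
The sketch below illustrates the strict discipline with hypothetical names and the same simplified, always-granting lock manager idea as in the basic 2PL sketch above: read locks may be released during the shrinking phase, while write locks are released only when the transaction ends.

```python
class SimpleLockManager:
    """Hypothetical stand-in; a real manager would block on conflicts."""
    def lock(self, txn, obj, mode):   # mode: "shared" or "exclusive"
        pass

    def unlock(self, txn, obj):
        pass

class StrictTwoPhaseTransaction:
    def __init__(self, lock_manager):
        self.lm = lock_manager
        self.read_locks, self.write_locks = set(), set()
        self.shrinking = False

    def read_lock(self, obj):
        assert not self.shrinking, "2PL: no acquisition after a release"
        self.lm.lock(self, obj, "shared")
        self.read_locks.add(obj)

    def write_lock(self, obj):
        assert not self.shrinking, "2PL: no acquisition after a release"
        self.lm.lock(self, obj, "exclusive")
        self.write_locks.add(obj)

    def release_read_lock(self, obj):
        # Permitted before the transaction ends; this starts the shrinking phase.
        self.shrinking = True
        self.lm.unlock(self, obj)
        self.read_locks.discard(obj)

    def end(self, commit=True):
        # Write locks are released only here, after commit or abort, which is
        # what gives S2PL its strictness property.
        for obj in list(self.write_locks) + list(self.read_locks):
            self.lm.unlock(self, obj)
        self.read_locks.clear()
        self.write_locks.clear()
```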

Strong strict two-phase locking

Also known as rigorousness, rigorous scheduling, or rigorous two-phase locking.

To comply with strong strict two-phase locking (SS2PL), a transaction releases both its read and its write locks only after it has ended (i.e., either committed or aborted). A transaction obeying SS2PL can be viewed as having only an expanding phase that lasts its entire execution, with no shrinking phase until the transaction has completed. Every SS2PL schedule is also an S2PL schedule, but not vice versa.
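
A corresponding sketch, again with a hypothetical, always-granting lock manager (a real one would block on conflicts), simply defers every release, shared or exclusive, to the end of the transaction.

```python
class _GrantAllLockManager:
    """Hypothetical stand-in that grants every request."""
    def lock(self, txn, obj, mode):
        pass

    def unlock(self, txn, obj):
        pass

class StrongStrictTwoPhaseTransaction:
    def __init__(self, lock_manager):
        self.lm = lock_manager
        self.held = []  # (obj, mode) pairs acquired during the single expanding phase

    def lock(self, obj, mode):
        self.lm.lock(self, obj, mode)   # "shared" or "exclusive"; never released early
        self.held.append((obj, mode))

    def end(self, commit=True):
        # Commit or abort first, then release every lock, read and write, at once.
        for obj, _mode in self.held:
            self.lm.unlock(self, obj)
        self.held.clear()

# Usage: all locks are held until end(), regardless of when they were acquired.
txn = StrongStrictTwoPhaseTransaction(_GrantAllLockManager())
txn.lock("x", "shared")
txn.lock("y", "exclusive")
txn.end(commit=True)
```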

Related Research Articles

In computer science, ACID is a set of properties of database transactions intended to guarantee data validity despite errors, power failures, and other mishaps. In the context of databases, a sequence of database operations that satisfies the ACID properties is called a transaction. For example, a transfer of funds from one bank account to another, even involving multiple changes such as debiting one account and crediting another, is a single transaction.

In information technology and computer science, especially in the fields of computer programming, operating systems, multiprocessors, and databases, concurrency control ensures that correct results for concurrent operations are generated, while getting those results as quickly as possible.


In computer science, conservative two-phase locking (C2PL) is a locking method used in DBMS and relational databases.

Multiversion concurrency control (MVCC) is a non-locking concurrency control method commonly used by database management systems to provide concurrent access to the database, and in programming languages to implement transactional memory.

In computer science, a lock or mutex is a synchronization primitive that prevents state from being modified or accessed by multiple threads of execution at once. Locks enforce mutual exclusion concurrency control policies, and with a variety of possible methods there exist multiple unique implementations for different applications.

In database systems, isolation determines how transaction integrity is visible to other users and systems. A lower isolation level increases the ability of many users to access the same data at the same time, but also increases the number of concurrency effects users might encounter. Conversely, a higher isolation level reduces the types of concurrency effects that users may encounter, but requires more system resources and increases the chances that one transaction will block another.

In the fields of databases and transaction processing, a schedule of a system is an abstract model to describe execution of transactions running in the system. Often it is a list of operations (actions) ordered by time, performed by a set of transactions that are executed together in the system. If the order in time between certain operations is not determined by the system, then a partial order is used. Examples of such operations are requesting a read operation, reading, writing, aborting, committing, requesting a lock, locking, etc. Not all transaction operation types should be included in a schedule, and typically only selected operation types are included, as needed to reason about and describe certain phenomena. Schedules and schedule properties are fundamental concepts in database concurrency control theory. A schedule describes the order of actions of the transactions as seen by the DBMS.

In computer science, an algorithm is called non-blocking if failure or suspension of any thread cannot cause failure or suspension of another thread; for some operations, these algorithms provide a useful alternative to traditional blocking implementations. A non-blocking algorithm is lock-free if there is guaranteed system-wide progress, and wait-free if there is also guaranteed per-thread progress. "Non-blocking" was used as a synonym for "lock-free" in the literature until the introduction of obstruction-freedom in 2003.

A distributed transaction is a database transaction in which two or more network hosts are involved. Usually, hosts provide transactional resources, while a transaction manager creates and manages a global transaction that encompasses all operations against such resources. Distributed transactions, as any other transactions, must have all four ACID properties, where atomicity guarantees all-or-nothing outcomes for the unit of work.

In transaction processing, databases, and computer networking, the two-phase commit protocol is a type of atomic commitment protocol (ACP). It is a distributed algorithm that coordinates all the processes that participate in a distributed atomic transaction on whether to commit or abort the transaction. This protocol achieves its goal even in many cases of temporary system failure, and is thus widely used. However, it is not resilient to all possible failure configurations, and in rare cases, manual intervention is needed to remedy an outcome. To accommodate recovery from failure the protocol's participants use logging of the protocol's states. Log records, which are typically slow to generate but survive failures, are used by the protocol's recovery procedures. Many protocol variants exist that primarily differ in logging strategies and recovery mechanisms. Though usually intended to be used infrequently, recovery procedures compose a substantial portion of the protocol, due to many possible failure scenarios to be considered and supported by the protocol.

In concurrent programming, an operation is linearizable if it consists of an ordered list of invocation and response events, that may be extended by adding response events such that:

  1. The extended list can be re-expressed as a sequential history.
  2. That sequential history is a subset of the original unextended list.

In computer science, software transactional memory (STM) is a concurrency control mechanism analogous to database transactions for controlling access to shared memory in concurrent computing. It is an alternative to lock-based synchronization. STM is a strategy implemented in software, rather than as a hardware component. A transaction in this context occurs when a piece of code executes a series of reads and writes to shared memory. These reads and writes logically occur at a single instant in time; intermediate states are not visible to other (successful) transactions. The idea of providing hardware support for transactions originated in a 1986 paper by Tom Knight. The idea was popularized by Maurice Herlihy and J. Eliot B. Moss. In 1995, Nir Shavit and Dan Touitou extended this idea to software-only transactional memory (STM). Since 2005, STM has been the focus of intense research and support for practical implementations is growing.

For transaction processing in computing, the X/Open XA standard is a specification released in 1991 by X/Open for distributed transaction processing (DTP).

In computer science and engineering, transactional memory attempts to simplify concurrent programming by allowing a group of load and store instructions to execute in an atomic way. It is a concurrency control mechanism analogous to database transactions for controlling access to shared memory in concurrent computing. Transactional memory systems provide high-level abstraction as an alternative to low-level thread synchronization. This abstraction allows for coordination between concurrent reads and writes of shared data in parallel systems.

Commitment ordering (CO) is a class of interoperable serializability techniques in concurrency control of databases, transaction processing, and related applications. It allows optimistic (non-blocking) implementations. With the proliferation of multi-core processors, CO has also been increasingly utilized in concurrent programming, transactional memory, and software transactional memory (STM) to achieve serializability optimistically. CO is also the name of the resulting transaction schedule (history) property, defined in 1988 with the name dynamic atomicity. In a CO compliant schedule, the chronological order of commitment events of transactions is compatible with the precedence order of the respective transactions. CO is a broad special case of conflict serializability and effective means to achieve global serializability across any collection of database systems that possibly use different concurrency control mechanisms.

In concurrency control of databases, transaction processing, and other transactional distributed applications, global serializability is a property of a global schedule of transactions. A global schedule is the unified schedule of all the individual database schedules in a multidatabase environment. Complying with global serializability means that the global schedule is serializable (has the serializability property) while each component database (module) has a serializable schedule as well. In other words, the assumption that a collection of serializable components by itself provides overall system serializability is usually incorrect. A need for correctness across databases in multidatabase systems makes global serializability a major goal for global concurrency control. With the proliferation of the Internet, cloud computing, grid computing, and small, portable, powerful computing devices, as well as increases in systems management sophistication, the need for atomic distributed transactions, and thus for effective global serializability techniques to ensure correctness in and among distributed transactional applications, seems to be increasing.

Distributed concurrency control is the concurrency control of a system distributed over a computer network.

A wait-for graph in computer science is a directed graph used for deadlock detection in operating systems and relational database systems.

In databases and transaction processing, the term locks with ordered sharing comprises several variants of the two-phase locking (2PL) concurrency control protocol, generated by changing the blocking semantics of locks upon conflicts. Further softening of locks eliminates thrashing.

A quorum is the minimum number of votes that a distributed transaction has to obtain in order to be allowed to perform an operation in a distributed system. A quorum-based technique is implemented to enforce consistent operation in a distributed system.

References

  1. Philip A. Bernstein, Vassos Hadzilacos, Nathan Goodman (1987): Concurrency Control and Recovery in Database Systems, Addison Wesley Publishing Company, ISBN 0-201-10715-5
  2. Gerhard Weikum, Gottfried Vossen (2001): Transactional Information Systems, Elsevier, ISBN 1-55860-508-8