Long-lived transaction

Last updated

A long-lived transaction is a transaction that spans multiple database transactions. The transaction is considered "long-lived" because its boundaries must, by necessity of business logic, extend past a single database transaction. A long-lived transaction can be thought of as a sequence of database transactions grouped to achieve a single atomic result.

A transaction symbolizes a unit of work performed within a database management system against a database, and treated in a coherent and reliable way independent of other transactions. A transaction generally represents any change in a database. Transactions in a database environment have two main purposes:

  1. To provide reliable units of work that allow correct recovery from failures and keep a database consistent even in cases of system failure, when execution stops and many operations upon a database remain uncompleted, with unclear status.
  2. To provide isolation between programs accessing a database concurrently. If this isolation is not provided, the programs' outcomes are possibly erroneous.

A common example is a multi-step sequence of requests and responses of an interaction with a user through a web client.

A long-lived transaction creates challenges of concurrency control and scalability.

In information technology and computer science, especially in the fields of computer programming, operating systems, multiprocessors, and databases, concurrency control ensures that correct results for concurrent operations are generated, while getting those results as quickly as possible.

Scalability

Scalability is the property of a system to handle a growing amount of work by adding resources to the system.

A chief strategy in designing long-lived transactions is optimistic concurrency control with versioning.

Optimistic concurrency control (OCC) is a concurrency control method applied to transactional systems such as relational database management systems and software transactional memory. OCC assumes that multiple transactions can frequently complete without interfering with each other. While running, transactions use data resources without acquiring locks on those resources. Before committing, each transaction verifies that no other transaction has modified the data it has read. If the check reveals conflicting modifications, the committing transaction rolls back and can be restarted. Optimistic concurrency control was first proposed by H.T. Kung and John T. Robinson.

A component of software configuration management, version control, also known as revision control or source control, is the management of changes to documents, computer programs, large web sites, and other collections of information. Changes are usually identified by a number or letter code, termed the "revision number", "revision level", or simply "revision". For example, an initial set of files is "revision 1". When the first change is made, the resulting set is "revision 2", and so on. Each revision is associated with a timestamp and the person making the change. Revisions can be compared, restored, and with some types of files, merged.

See also


Related Research Articles

In computer science, ACID is a set of properties of database transactions intended to guarantee validity even in the event of errors, power failures, etc. In the context of databases, a sequence of database operations that satisfies the ACID properties is called a transaction. For example, a transfer of funds from one bank account to another, even involving multiple changes such as debiting one account and crediting another, is a single transaction.

Transaction processing is information processing in computer science that is divided into individual, indivisible operations called transactions. Each transaction must succeed or fail as a complete unit; it can never be only partially complete.

Multiversion concurrency control, is a concurrency control method commonly used by database management systems to provide concurrent access to the database and in programming languages to implement transactional memory.

In database systems, isolation determines how transaction integrity is visible to other users and systems. For example, when a user is creating a Purchase Order and has created the header, but not the Purchase Order lines, is the header available for other systems/users to see?

In the fields of databases and transaction processing, a schedule of a system is an abstract model to describe execution of transactions running in the system. Often it is a list of operations (actions) ordered by time, performed by a set of transactions that are executed together in the system. If the order in time between certain operations is not determined by the system, then a partial order is used. Examples of such operations are requesting a read operation, reading, writing, aborting, committing, requesting a lock, locking, etc. Not all transaction operation types should be included in a schedule, and typically only selected operation types are included, as needed to reason about and describe certain phenomena. Schedules and schedule properties are fundamental concepts in database concurrency control theory.

A distributed transaction is a database transaction in which two or more network hosts are involved. Usually, hosts provide transactional resources, while the transaction manager is responsible for creating and managing a global transaction that encompasses all operations against such resources. Distributed transactions, as any other transactions, must have all four ACID properties, where atomicity guarantees all-or-nothing outcomes for the unit of work.

In databases an index is a data structure, part of the database, used by a database system to efficiently navigate access to user data. Index data are system data distinct from user data, and consist primarily of pointers. Changes in a database, may require indexes to be updated to maintain accurate user data accesses. Index locking is a technique used to maintain index integrity. A portion of an index is locked during a database transaction when this portion is being accessed by the transaction as a result of attempt to access related user data. Additionally, special database system transactions may be invoked to maintain and modify an index, as part of a system's self-maintenance activities. When a portion of an index is locked by a transaction, other transactions may be blocked from accessing this index portion. Index Locking Protocol guarantees that Phantom Phenomenon won't occur. Index locking protocol states:

Extensible Storage Engine (ESE), also known as JET Blue, is an ISAM data storage technology from Microsoft. ESE is the core of Microsoft Exchange Server, Active Directory, and Windows Search. It's also used by a number of Windows components including Windows Update client and Help and Support Center. Its purpose is to allow applications to store and retrieve data via indexed and sequential access.

In Online transaction processing (OLTP), information systems typically facilitate and manage transaction-oriented applications.

Transaction processing is a way of computing that divides work into individual, indivisible operations, called transactions. A transaction processing system (TPS) is a software system, or software/hardware combination, that supports transaction processing.

In concurrency control of databases, transaction processing, and various transactional applications, both centralized and distributed, a transaction schedule is serializable if its outcome is equal to the outcome of its transactions executed serially, i.e. without overlapping in time. Transactions are normally executed concurrently, since this is the most efficient way. Serializability is the major correctness criterion for concurrent transactions' executions. It is considered the highest level of isolation between transactions, and plays an essential role in concurrency control. As such it is supported in all general purpose database systems. Strong strict two-phase locking (SS2PL) is a popular serializability mechanism utilized in most of the database systems since their early days in the 1970s.

Commitment ordering (CO) is a class of interoperable serializability techniques in concurrency control of databases, transaction processing, and related applications. It allows optimistic (non-blocking) implementations. With the proliferation of multi-core processors, CO has been also increasingly utilized in concurrent programming, transactional memory, and especially in software transactional memory (STM) for achieving serializability optimistically. CO is also the name of the resulting transaction schedule (history) property, which was originally defined in 1988 with the name dynamic atomicity. In a CO compliant schedule the chronological order of commitment events of transactions is compatible with the precedence order of the respective transactions. CO is a broad special case of conflict serializability, and effective means to achieve global serializability across any collection of database systems that possibly use different concurrency control mechanisms.

In databases, and transaction processing, snapshot isolation is a guarantee that all reads made in a transaction will see a consistent snapshot of the database, and the transaction itself will successfully commit only if no updates it has made conflict with any concurrent updates made since that snapshot.

Long-running transactions are computer database transactions that avoid locks on non-local resources, use compensation to handle failures, potentially aggregate smaller ACID transactions, and typically use a coordinator to complete or abort the transaction. In contrast to rollback in ACID transactions, compensation restores the original state, or an equivalent, and is business-specific. For example, the compensating action for making a hotel reservation is canceling that reservation, possibly with a penalty.

In concurrency control of databases, transaction processing, and other transactional distributed applications, global serializability is a property of a global schedule of transactions. A global schedule is the unified schedule of all the individual database schedules in a multidatabase environment. Complying with global serializability means that the global schedule is serializable, has the serializability property, while each component database (module) has a serializable schedule as well. In other words, a collection of serializable components provides overall system serializability, which is usually incorrect. A need in correctness across databases in multidatabase systems makes global serializability a major goal for global concurrency control. With the proliferation of the Internet, Cloud computing, Grid computing, and small, portable, powerful computing devices, as well as increase in systems management sophistication, the need for atomic distributed transactions and thus effective global serializability techniques, to ensure correctness in and among distributed transactional applications, seems to increase.

Distributed concurrency control is the concurrency control of a system distributed over a computer network.

FoundationDB is a free and open-source multi-model distributed NoSQL database developed by Apple Inc. with a shared nothing architecture. The product was designed around a "core" database, with additional features supplied in "layers." The core database exposes an ordered key-value store with transactions. The transactions are able to read or write multiple keys stored on any machine in the cluster while fully supporting ACID properties. Transactions are used to implement a variety of data models via layers.