X/Open XA

Last updated

For transaction processing in computing, the X/Open XA standard (short for "eXtended Architecture") is a specification released in 1991 [1] by X/Open (which later merged with The Open Group) for distributed transaction processing (DTP).

Contents

Goals

The goal of XA is to guarantee atomicity in "global transactions" that are executed across heterogeneous components. A transaction is a unit of work such as transferring money from one person to another. Distributed transactions update multiple data stores (such as databases, application servers, message queues, transactional caches, etc.). To guarantee integrity, XA uses a two-phase commit (2PC) to ensure that all of a transaction's changes either take effect (commit) or do not (roll back), i.e., atomically.

Architecture

Specifically, XA describes the interface between a global transaction manager and a specific application. An application that wants to use XA engages an XA transaction manager using a library or separate service. The transaction manager tracks the participants in the transaction (i.e. the various data stores to which the application writes), and works with them to carry out the two-phase commit. In other words, the XA transaction manager is separate from an application's interactions with servers. XA maintains a log of its decisions to commit or roll back, which it can use to recover in case of a system outage. [1]

Many software vendors support XA (meaning the software can participate in XA transactions), including a variety of relational databases and message brokers. [1]

Advantages and disadvantages

Since XA uses two-phase commit, the advantages and disadvantages of that protocol generally apply to XA. The main advantage is that XA (using 2PC) allows an atomic transaction across multiple heterogeneous technologies (e.g. a single transaction could encompass multiple databases from different vendors as well as an email server and a message broker), whereas traditional database transactions are limited to a single database.

The main disadvantage is that 2PC is a blocking protocol: the other servers need to wait for the transaction manager to issue a decision about whether to commit or abort each transaction. If the transaction manager goes offline while transactions are waiting for its final decision, they will be stuck and hold their database locks until the transaction manager comes online again and issues its decision. This extended holding of locks may be disruptive to other applications that are using the same databases. [1]

Moreover, if the transaction manager crashes and its record of decisions cannot be recovered (e.g. due to a bug in how the decisions were logged, or due to data corruption on the server), manual intervention may be necessary. Many XA implementations provide an "escape hatch" for transactions to independently decide whether to commit or abort (without waiting to hear from the transaction manager), but this risks violating the atomicity guarantee and is therefore reserved for emergencies. [1]

Specification

The XA specification describes what a resource manager must do to support transactional access. Resource managers that follow this specification are said to be XA-compliant.

The XA specification was based on an interface used in the Tuxedo system developed in the 1980s, but adopted by several systems since then. [2]

See also

Related Research Articles

The Jakarta Transactions, one of the Jakarta EE APIs, enables distributed transactions to be done across multiple X/Open XA resources in a Java environment. JTA was a specification developed under the Java Community Process as JSR 907. JTA provides for:

In computer science, ACID is a set of properties of database transactions intended to guarantee data validity despite errors, power failures, and other mishaps. In the context of databases, a sequence of database operations that satisfies the ACID properties is called a transaction. For example, a transfer of funds from one bank account to another, even involving multiple changes such as debiting one account and crediting another, is a single transaction.

In information technology and computer science, especially in the fields of computer programming, operating systems, multiprocessors, and databases, concurrency control ensures that correct results for concurrent operations are generated, while getting those results as quickly as possible.

Optimistic concurrency control (OCC), also known as optimistic locking, is a non-locking concurrency control method applied to transactional systems such as relational database management systems and software transactional memory. OCC assumes that multiple transactions can frequently complete without interfering with each other. While running, transactions use data resources without acquiring locks on those resources. Before committing, each transaction verifies that no other transaction has modified the data it has read. If the check reveals conflicting modifications, the committing transaction rolls back and can be restarted. Optimistic concurrency control was first proposed in 1979 by H. T. Kung and John T. Robinson.

A database transaction symbolizes a unit of work, performed within a database management system against a database, that is treated in a coherent and reliable way independent of other transactions. A transaction generally represents any change in a database. Transactions in a database environment have two main purposes:

  1. To provide reliable units of work that allow correct recovery from failures and keep a database consistent even in cases of system failure. For example: when execution prematurely and unexpectedly stops in which case many operations upon a database remain uncompleted, with unclear status.
  2. To provide isolation between programs accessing a database concurrently. If this isolation is not provided, the programs' outcomes are possibly erroneous.

In databases and transaction processing, two-phase locking (2PL) is a pessimistic concurrency control method that guarantees conflict-serializability. It is also the name of the resulting set of database transaction schedules (histories). The protocol uses locks, applied by a transaction to data, which may block other transactions from accessing the same data during the transaction's life.

In database systems, durability is the ACID property that guarantees that the effects of transactions that have been committed will survive permanently, even in case of failures, including incidents and catastrophic events. For example, if a flight booking reports that a seat has successfully been booked, then the seat will remain booked even if the system crashes.

In database systems, isolation is one of the ACID transaction properties. It determines how transaction integrity is visible to other users and systems. A lower isolation level increases the ability of many users to access the same data at the same time, but also increases the number of concurrency effects users might encounter. Conversely, a higher isolation level reduces the types of concurrency effects that users may encounter, but requires more system resources and increases the chances that one transaction will block another.

A distributed transaction is a database transaction in which two or more network hosts are involved. Usually, hosts provide transactional resources, while a transaction manager creates and manages a global transaction that encompasses all operations against such resources. Distributed transactions, as any other transactions, must have all four ACID properties, where atomicity guarantees all-or-nothing outcomes for the unit of work.

In transaction processing, databases, and computer networking, the two-phase commit protocol is a type of atomic commitment protocol (ACP). It is a distributed algorithm that coordinates all the processes that participate in a distributed atomic transaction on whether to commit or abort the transaction. This protocol achieves its goal even in many cases of temporary system failure, and is thus widely used. However, it is not resilient to all possible failure configurations, and in rare cases, manual intervention is needed to remedy an outcome. To accommodate recovery from failure the protocol's participants use logging of the protocol's states. Log records, which are typically slow to generate but survive failures, are used by the protocol's recovery procedures. Many protocol variants exist that primarily differ in logging strategies and recovery mechanisms. Though usually intended to be used infrequently, recovery procedures compose a substantial portion of the protocol, due to many possible failure scenarios to be considered and supported by the protocol.

Tuxedo is a middleware platform used to manage distributed transaction processing in distributed computing environments. Tuxedo is a transaction processing system or transaction-oriented middleware, or enterprise application server for a variety of systems and programming languages. Developed by AT&T in the 1980s, it became a software product of Oracle Corporation in 2008 when they acquired BEA Systems. Tuxedo is now part of the Oracle Fusion Middleware.

In computer science, software transactional memory (STM) is a concurrency control mechanism analogous to database transactions for controlling access to shared memory in concurrent computing. It is an alternative to lock-based synchronization. STM is a strategy implemented in software, rather than as a hardware component. A transaction in this context occurs when a piece of code executes a series of reads and writes to shared memory. These reads and writes logically occur at a single instant in time; intermediate states are not visible to other (successful) transactions. The idea of providing hardware support for transactions originated in a 1986 paper by Tom Knight. The idea was popularized by Maurice Herlihy and J. Eliot B. Moss. In 1995, Nir Shavit and Dan Touitou extended this idea to software-only transactional memory (STM). Since 2005, STM has been the focus of intense research and support for practical implementations is growing.

In computer networking and databases, the three-phase commit protocol (3PC) is a distributed algorithm which lets all nodes in a distributed system agree to commit a transaction. It is a more failure-resilient refinement of the two-phase commit protocol (2PC).

Online transaction processing (OLTP) is a type of database system used in transaction-oriented applications, such as many operational systems. "Online" refers to that such systems are expected to respond to user requests and process them in real-time. The term is contrasted with online analytical processing (OLAP) which instead focuses on data analysis.

Replication in computing involves sharing information so as to ensure consistency between redundant resources, such as software or hardware components, to improve reliability, fault-tolerance, or accessibility.

Commitment ordering (CO) is a class of interoperable serializability techniques in concurrency control of databases, transaction processing, and related applications. It allows optimistic (non-blocking) implementations. With the proliferation of multi-core processors, CO has also been increasingly utilized in concurrent programming, transactional memory, and software transactional memory (STM) to achieve serializability optimistically. CO is also the name of the resulting transaction schedule (history) property, defined in 1988 with the name dynamic atomicity. In a CO compliant schedule, the chronological order of commitment events of transactions is compatible with the precedence order of the respective transactions. CO is a broad special case of conflict serializability and effective means to achieve global serializability across any collection of database systems that possibly use different concurrency control mechanisms.

<span class="mw-page-title-main">Virtuoso Universal Server</span> Computer software

Virtuoso Universal Server is a middleware and database engine hybrid that combines the functionality of a traditional relational database management system (RDBMS), object–relational database (ORDBMS), virtual database, RDF, XML, free-text, web application server and file server functionality in a single system. Rather than have dedicated servers for each of the aforementioned functionality realms, Virtuoso is a "universal server"; it enables a single multithreaded server process that implements multiple protocols. The free and open source edition of Virtuoso Universal Server is also known as OpenLink Virtuoso. The software has been developed by OpenLink Software with Kingsley Uyi Idehen and Orri Erling as the chief software architects.

A quorum is the minimum number of votes that a distributed transaction has to obtain in order to be allowed to perform an operation in a distributed system. A quorum-based technique is implemented to enforce consistent operation in a distributed system.

Enduro/X is an open-source middleware platform for distributed transaction processing. It is built on proven APIs such as X/Open group's XATMI and XA. The platform is designed for building real-time microservices based applications with a clusterization option. Enduro/X functions as an extended drop-in replacement for Oracle Tuxedo. The platform uses in-memory POSIX Kernel queues which insures high interprocess communication throughput.

Database scalability is the ability of a database to handle changing demands by adding/removing resources. Databases use a host of techniques to cope.

References

  1. 1 2 3 4 5 Kleppmann, Martin (April 2, 2017). Designing Data-Intensive Applications (1 ed.). O'Reilly Media. pp. 361–364. ISBN   978-1449373320.
  2. Philip A. Bernstein; Eric Newcomer (2009). Principles of transaction processing. Morgan Kaufmann. pp. 330–336. ISBN   978-1-55860-623-4.