CAP theorem

Last updated

In database theory, the CAP theorem, also named Brewer's theorem after computer scientist Eric Brewer, states that any distributed data store can provide only two of the following three guarantees: [1] [2] [3]

Contents

Consistency
Every read receives the most recent write or an error.
Availability
Every request received by a non-failing node in the system must result in a response. This is definition of availability in CAP theorem as defined by Gilbert and Lynch. [1] But note that there are different notions of availability: in distributed systems, availability means the percentage of requests that clients see as successful, or something close to that. [4]
Partition tolerance
The system continues to operate despite an arbitrary number of messages being dropped (or delayed) by the network between nodes.

When a network partition failure happens, it must be decided whether to do one of the following:

CAP theorem Euler diagram CAP Theorem Venn Diagram.png
CAP theorem Euler diagram

Thus, if there is a network partition, one has to choose between consistency or availability. Note that consistency as defined in the CAP theorem is quite different from the consistency guaranteed in ACID database transactions. [5]

Explanation

No distributed system is safe from network failures, thus network partitioning generally has to be tolerated. [6] [7] In the presence of a partition, one is then left with two options: consistency or availability. When choosing consistency over availability, the system will return an error or a time out if particular information cannot be guaranteed to be up to date due to network partitioning. When choosing availability over consistency, the system will always process the query and try to return the most recent available version of the information, even if it cannot guarantee it is up to date due to network partitioning.

In the absence of a partition, both availability and consistency can be satisfied. [8]

Database systems designed with traditional ACID guarantees in mind such as RDBMS choose consistency over availability, whereas systems designed around the BASE philosophy, common in the NoSQL movement for example, choose availability over consistency. [9]

Marc Brooker from Amazon Web Services argues CAP Available doesn’t mean ‘highly available to clients’ and availability suffers whether you choose Availability or Consistency. In practice, picking an Availability over Consistency means that it’s going to be more available to some clients in a fairly limited set of circumstances. That may very well be worth it, but it’s in no way a panacea for availability. [10]

History

According to computer scientist Eric Brewer of the University of California, Berkeley, the theorem first appeared in autumn 1998. [9] It was published as the CAP principle in 1999 [11] and presented as a conjecture by Brewer at the 2000 Symposium on Principles of Distributed Computing (PODC). [12] In 2002, Seth Gilbert and Nancy Lynch of MIT published a formal proof of Brewer's conjecture, rendering it a theorem. [1]

In 2012, Brewer clarified some of his positions, including why the often-used "two out of three" concept can be somewhat misleading because system designers only need to sacrifice consistency or availability in the presence of partitions; partition management and recovery techniques exist. Brewer also noted the different definition of consistency used in the CAP theorem relative to the definition used in ACID. [9] [13]

A similar theorem stating the trade-off between consistency and availability in distributed systems was published by Birman and Friedman in 1996. [14] Birman and Friedman's result restricted this lower bound to non-commuting operations.

The PACELC theorem, introduced in 2010, [8] builds on CAP by stating that even in the absence of partitioning, there is another trade-off between latency and consistency. PACELC means, if partition (P) happens, the trade-off is between availability (A) and consistency (C); Else (E), the trade-off is between latency (L) and consistency (C).

See also

Related Research Articles

<span class="mw-page-title-main">Database</span> Organized collection of data in computing

In computing, a database is an organized collection of data or a type of data store based on the use of a database management system (DBMS), the software that interacts with end users, applications, and the database itself to capture and analyze the data. The DBMS additionally encompasses the core facilities provided to administer the database. The sum total of the database, the DBMS and the associated applications can be referred to as a database system. Often the term "database" is also used loosely to refer to any of the DBMS, the database system or an application associated with the database.

In computer science, ACID is a set of properties of database transactions intended to guarantee data validity despite errors, power failures, and other mishaps. In the context of databases, a sequence of database operations that satisfies the ACID properties is called a transaction. For example, a transfer of funds from one bank account to another, even involving multiple changes such as debiting one account and crediting another, is a single transaction.

In information technology and computer science, especially in the fields of computer programming, operating systems, multiprocessors, and databases, concurrency control ensures that correct results for concurrent operations are generated, while getting those results as quickly as possible.

A distributed data store is a computer network where information is stored on more than one node, often in a replicated fashion. It is usually specifically used to refer to either a distributed database where users store information on a number of nodes, or a computer network in which users store information on a number of peer network nodes.

In database systems, consistency refers to the requirement that any given database transaction must change affected data only in allowed ways. Any data written to the database must be valid according to all defined rules, including constraints, cascades, triggers, and any combination thereof. This does not guarantee correctness of the transaction in all ways the application programmer might have wanted but merely that any programming errors cannot result in the violation of any defined database constraints.

Eventual consistency is a consistency model used in distributed computing to achieve high availability that informally guarantees that, if no new updates are made to a given data item, eventually all accesses to that item will return the last updated value. Eventual consistency, also called optimistic replication, is widely deployed in distributed systems and has origins in early mobile computing projects. A system that has achieved eventual consistency is often said to have converged, or achieved replica convergence. Eventual consistency is a weak guarantee – most stronger models, like linearizability, are trivially eventually consistent.

Replication in computing involves sharing information so as to ensure consistency between redundant resources, such as software or hardware components, to improve reliability, fault-tolerance, or accessibility.

Causal consistency is one of the major memory consistency models. In concurrent programming, where concurrent processes are accessing a shared memory, a consistency model restricts which accesses are legal. This is useful for defining correct data structures in distributed shared memory or distributed transactions.

M. Dale Skeen is an American computer scientist. He specializes in designing and implementing large-scale computing systems, distributed computing and database management systems.

In computer science, state machine replication (SMR) or state machine approach is a general method for implementing a fault-tolerant service by replicating servers and coordinating client interactions with server replicas. The approach also provides a framework for understanding and designing replication management protocols.

Paxos is a family of protocols for solving consensus in a network of unreliable or fallible processors. Consensus is the process of agreeing on one result among a group of participants. This problem becomes difficult when the participants or their communications may experience failures.

<span class="mw-page-title-main">Eric Brewer (scientist)</span>

Eric Allen Brewer is professor emeritus of computer science at the University of California, Berkeley and vice-president of infrastructure at Google. His research interests include operating systems and distributed computing. He is known for formulating the CAP theorem about distributed network applications in the late 1990s.

Fault Tolerant Messaging in the context of computer systems and networks, refers to a design approach and set of techniques aimed at ensuring reliable and continuous communication between components or nodes even in the presence of errors or failures. This concept is especially critical in distributed systems, where components may be geographically dispersed and interconnected through networks, making them susceptible to various potential points of failure.

A network partition is a division of a computer network into relatively independent subnets, either by design, to optimize them separately, or due to the failure of network devices. Distributed software must be designed to be partition-tolerant, that is, even after the network is partitioned, it still works correctly.

NewSQL is a class of relational database management systems that seek to provide the scalability of NoSQL systems for online transaction processing (OLTP) workloads while maintaining the ACID guarantees of a traditional database system.

Kenneth P. Birman is a professor in the Department of Computer Science at Cornell University. He currently holds the N. Rama Rao Chair in Computer Science.

In distributed computing, a conflict-free replicated data type (CRDT) is a data structure that is replicated across multiple computers in a network, with the following features:

  1. The application can update any replica independently, concurrently and without coordinating with other replicas.
  2. An algorithm automatically resolves any inconsistencies that might occur.
  3. Although replicas may have different state at any particular point in time, they are guaranteed to eventually converge.
<span class="mw-page-title-main">PACELC theorem</span> Theorem in theoretical computer science

In database theory, the PACELC theorem is an extension to the CAP theorem. It states that in case of network partitioning (P) in a distributed computer system, one has to choose between availability (A) and consistency (C), but else (E), even when the system is running normally in the absence of partitions, one has to choose between latency (L) and loss of consistency (C).

A distributed SQL database is a single relational database which replicates data across multiple servers. Distributed SQL databases are strongly consistent and most support consistency across racks, data centers, and wide area networks including cloud availability zones and cloud geographic zones. Distributed SQL databases typically use the Paxos or Raft algorithms to achieve consensus across multiple nodes.

Daniel Abadi is the Darnell-Kanal Professor of Computer Science at University of Maryland, College Park. His primary area of research is database systems, with contributions to stream databases, distributed databases, graph databases, and column-store databases. He helped create C-Store, a column-oriented database, and HadoopDB, a hybrid of relational databases and Hadoop. Both database systems were commercialized by companies.

References

  1. 1 2 3 Gilbert, Seth; Lynch, Nancy (2002). "Brewer's conjecture and the feasibility of consistent, available, partition-tolerant web services". ACM SIGACT News. 33 (2). Association for Computing Machinery (ACM): 51–59. doi:10.1145/564585.564601. ISSN   0163-5700. S2CID   15892169.
  2. "Brewer's CAP Theorem". julianbrowne.com. 2009-01-11.
  3. "Brewers CAP Theorem on distributed systems". royans.net. 2010-02-14.
  4. "Availability and availability - Marc's Blog". brooker.co.za. Retrieved 2024-05-12.
  5. Liochon, Nicolas. "The confusing CAP and ACID wording". This long run. Retrieved 1 February 2019.
  6. Kleppmann, Martin (2015-09-18). A Critique of the CAP Theorem (Report). Apollo - University of Cambridge Repository. arXiv: 1509.05393 . Bibcode:2015arXiv150905393K. doi:10.17863/CAM.13083. S2CID   1991487 . Retrieved 24 November 2019.
  7. Martin, Kleppmann. "Please stop calling databases CP or AP". Martin Kleppmann's Blog. Retrieved 24 November 2019.
  8. 1 2 Abadi, Daniel (2010-04-23). "DBMS Musings: Problems with CAP, and Yahoo's little known NoSQL system". DBMS Musings. Retrieved 2018-01-23.
  9. 1 2 3 Brewer, Eric (2012). "CAP twelve years later: How the "rules" have changed". Computer. 45 (2). Institute of Electrical and Electronics Engineers (IEEE): 23–29. doi:10.1109/mc.2012.37. ISSN   0018-9162. S2CID   890105.
  10. "Availability and availability - Marc's Blog". brooker.co.za. Retrieved 2024-05-12.
  11. Armando Fox; Eric Brewer (1999). Harvest, Yield and Scalable Tolerant Systems. Proc. 7th Workshop Hot Topics in Operating Systems (HotOS 99). IEEE CS. pp. 174–178. doi:10.1109/HOTOS.1999.798396.
  12. Eric Brewer. "Towards Robust Distributed Systems" (PDF).
  13. Carpenter, Jeff; Hewitt, Eben (July 2016). Cassandra: The Definitive Guide (2nd ed.). Oreilly. ISBN   9781491933657. In February 2012, Eric Brewer provided an updated perspective on his CAP theorem [..] Brewer now describes the "2 out of 3" axiom as somewhat misleading. He notes that designers only need sacrifice consistency or availability in the presence of partitions, and that advances in partition recovery techniques have made it possible for designers to achieve high levels of both consistency and availability.
  14. Ken Birman; Roy Friedman (April 1996). "Trading Consistency for Availability in Distributed Systems". hdl:1813/7235.