PACELC theorem

Last updated December 06, 2024

In database theory, the PACELC theorem is an extension of the CAP theorem. It states that in case of network partitioning (P) in a distributed computer system, one has to choose between availability (A) and consistency (C) (as per the CAP theorem), else (E), even when the system is running normally in the absence of partitions, one has to choose between latency (L) and consistency (C).

Overview

The CAP theorem can be phrased as "PAC", the impossibility theorem that no distributed data store can be both consistent and available in executions that contain partitions. This can be proved by examining latency: if a system ensures consistency, then operation latencies grow with message delays, and hence operations cannot terminate eventually if the network is partitioned, i.e. the system cannot ensure availability.^[1]

In the absence of partitions, both consistency and availability can be satisfied.^[2] PACELC therefore goes further and examines how the system replicates data. Specifically, in the absence of partitions, an additional trade-off (ELC) exists between latency and consistency.^[3] If the store is atomically consistent, then the sum of the read and write delays is at least the message delay. In practice, most systems rely on explicit acknowledgments rather than timed delays to ensure delivery, which requires a full network round trip, and therefore a message delay on both reads and writes, to ensure consistency.^[1] Low-latency systems, in contrast, relax consistency in order to reduce latency.^[2]
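The cost of waiting for acknowledgments can be illustrated with a toy latency model. This is only a sketch under stated assumptions: the `Replica` type, the `write_latency_ms` function, and the delay figures are hypothetical and do not describe any particular database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    name: str
    one_way_delay_ms: float  # hypothetical one-way message delay from the coordinator


def write_latency_ms(replicas, acks_required):
    """Latency of a write that waits for `acks_required` acknowledgments.

    Each acknowledgment costs a full round trip (request plus ack), so the
    write completes when the acks_required-th fastest round trip returns.
    acks_required=0 models fully asynchronous replication.
    """
    if not 0 <= acks_required <= len(replicas):
        raise ValueError("acks_required must be between 0 and the replica count")
    if acks_required == 0:
        return 0.0
    round_trips = sorted(2 * r.one_way_delay_ms for r in replicas)
    return round_trips[acks_required - 1]


# Illustrative delays only (assumed values).
replicas = [
    Replica("local", 0.5),
    Replica("same-region", 2.0),
    Replica("remote", 40.0),
]

for acks in range(len(replicas) + 1):
    print(f"acks={acks}: {write_latency_ms(replicas, acks):.1f} ms")
```

In this model, waiting for every replica makes the slowest round trip the floor on write latency, while waiting for none removes that cost at the price of replicas that may briefly lag behind.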

There are four configurations or tradeoffs in the PACELC space:

PA/EL - prioritize availability and latency over consistency
PA/EC - when there is a partition, choose availability; else, choose consistency
PC/EL - when there is a partition, choose consistency; else, choose latency
PC/EC - choose consistency at all times
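The four configurations can be written down as a small lookup. The following is a minimal sketch; the `Pacelc` enum and its `prioritizes` method are hypothetical names introduced here for illustration.

```python
from enum import Enum


class Pacelc(Enum):
    # Each value is (choice under partition, choice otherwise).
    PA_EL = ("A", "L")
    PA_EC = ("A", "C")
    PC_EL = ("C", "L")
    PC_EC = ("C", "C")

    def prioritizes(self, partitioned: bool) -> str:
        """Return the property this configuration favors in the given state."""
        on_partition, otherwise = self.value
        choice = on_partition if partitioned else otherwise
        return {"A": "availability", "C": "consistency", "L": "latency"}[choice]


for config in Pacelc:
    print(
        config.name,
        "| partition:", config.prioritizes(partitioned=True),
        "| else:", config.prioritizes(partitioned=False),
    )
```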

PC/EC and PA/EL provide natural cognitive models for an application developer. A PC/EC system provides a firm guarantee of atomic consistency, as in ACID, while PA/EL provides high availability and low latency with a more complex consistency model. In contrast, PA/EC and PC/EL systems only make conditional guarantees of consistency. The developer still has to write code to handle the cases where the guarantee is not upheld. PA/EC systems are rare outside of the in-memory data grid industry, where systems are localized to geographic regions and the latency vs. consistency tradeoff is not significant.^[4] PC/EL is even trickier to understand. PC does not indicate that the system is fully consistent; rather, it indicates that the system does not reduce consistency beyond the baseline consistency level when a network partition occurs. Instead, it reduces availability.^[3]
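The PC/EL case can be made concrete with a toy store. This is a deliberately simplified sketch, not a model of any real system: `PcElRegister`, `Unavailable`, and the one-primary, one-replica layout are assumptions made for illustration.

```python
class Unavailable(Exception):
    """Raised when the store refuses an operation to protect consistency."""


class PcElRegister:
    """Toy single-value store with one primary and one replica (hypothetical)."""

    def __init__(self):
        self.primary = None
        self.replica = None
        self.partitioned = False
        self._backlog = []  # writes acknowledged but not yet applied to the replica

    def write(self, value):
        if self.partitioned:
            # PC: do not let consistency fall below the baseline; give up availability.
            raise Unavailable("replica unreachable")
        self.primary = value
        # EL: acknowledge immediately and replicate later.
        self._backlog.append(value)

    def replicate(self):
        """Apply the backlog to the replica, in order, if the network allows it."""
        if self.partitioned:
            return
        for value in self._backlog:
            self.replica = value
        self._backlog.clear()

    def read_replica(self):
        # May be stale: the baseline here is weaker than atomic consistency.
        return self.replica


store = PcElRegister()
store.write("v1")
print(store.read_replica())  # None: the replica has not caught up yet
store.replicate()
print(store.read_replica())  # v1
store.partitioned = True
try:
    store.write("v2")
except Unavailable as exc:
    print("write refused:", exc)
```

The sketch shows both halves of the conditional guarantee: a stale read is possible during normal operation, and a write is refused during a partition, so calling code has to handle each case.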

Some experts, such as Marc Brooker, argue that the CAP theorem is particularly relevant in intermittently connected environments, such as those related to the Internet of Things (IoT) and mobile applications. In these contexts, devices may become partitioned due to challenging physical conditions, such as power outages or entering confined spaces like elevators. For distributed systems such as cloud applications, the PACELC theorem is more appropriate: it is more comprehensive and considers trade-offs such as latency and consistency even in the absence of network partitions.^[5]

History

The PACELC theorem was first described by Daniel Abadi from Yale University in 2010 in a blog post,^[2] which he later clarified in a paper in 2012.^[3] The purpose of PACELC is to address his thesis that "Ignoring the consistency/latency trade-off of replicated systems is a major oversight [in CAP], as it is present at all times during system operation, whereas CAP is only relevant in the arguably rare case of a network partition." The PACELC theorem was proved formally in 2018 in a SIGACT News article.^[1]

Database PACELC ratings

The original database PACELC ratings are from Abadi's paper.^[3]^[6] Subsequent updates were contributed by the Wikipedia community.

The default versions of Amazon's early (internal) Dynamo, Cassandra, Riak, and Cosmos DB are PA/EL systems: if a partition occurs, they give up consistency for availability, and under normal operation they give up consistency for lower latency.
Fully ACID systems such as VoltDB/H-Store, Megastore, MySQL Cluster, and PostgreSQL are PC/EC: they refuse to give up consistency, and will pay the availability and latency costs to achieve it. Bigtable and related systems such as HBase are also PC/EC.
Amazon DynamoDB (launched January 2012) is quite different from the early (Amazon internal) Dynamo which was considered for the PACELC paper.^[6] DynamoDB follows a strong leader model, where every write is strictly serialized (and conditional writes carry no penalty), and supports read-after-write consistency. This guarantee does not apply to "Global Tables"^[7] across regions. The DynamoDB SDKs use eventually consistent reads by default (improved availability and throughput), but when a consistent read is requested the service will return either a current view of the item or an error.
Couchbase provides a range of consistency and availability options during a partition, and likewise a range of latency and consistency options when there is no partition. Unlike most other databases, Couchbase does not have a single API set, nor does it scale or replicate all data services homogeneously. For writes, Couchbase favors consistency over availability, making it formally CP, but for reads there is more user-controlled variability depending on index replication, the desired consistency level, and the type of access (single-document lookup, range scan, full-text search, etc.). There is further variability from cross-datacenter replication (XDCR), which connects multiple CP clusters with asynchronous replication, and from Couchbase Lite, an embedded database that creates a fully multi-master (with revision tracking) distributed topology.
Cosmos DB supports five tunable consistency levels that allow for tradeoffs between C/A during P, and L/C during E. Cosmos DB never violates the specified consistency level, so it's formally CP.
MongoDB can be classified as a PA/EC system. In the baseline case, the system guarantees reads and writes to be consistent.
PNUTS is a PC/EL system.
Hazelcast IMDG and indeed most in-memory data grids are an implementation of a PA/EC system; Hazelcast can be configured to be EL rather than EC.^[8] Concurrency primitives (Lock, AtomicReference, CountDownLatch, etc.) can be either PC/EC or PA/EC.^[9]
FaunaDB implements Calvin, a transaction protocol created by Dr. Daniel Abadi, the author^[3] of the PACELC theorem, and offers users adjustable controls for LC tradeoff. It is PC/EC for strictly serializable transactions, and EL for serializable reads.
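As an illustration of what a user-adjustable latency/consistency setting can look like, the sketch below uses the common quorum rule for replicated stores: with N replicas, a read quorum R and a write quorum W are guaranteed to overlap when R + W > N. The rule is general background rather than something stated in the ratings above, and the function names are hypothetical. Overlapping quorums alone do not establish atomic consistency in a real system.

```python
def quorums_overlap(n: int, r: int, w: int) -> bool:
    """True if every read quorum shares at least one replica with every write quorum."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must each be between 1 and n")
    return r + w > n


def leaning(n: int, r: int, w: int) -> str:
    """Rough label for where a setting sits on the latency/consistency axis."""
    return "consistency-leaning" if quorums_overlap(n, r, w) else "latency-leaning"


# Assumed example settings for a store with three replicas.
for r, w in [(1, 1), (2, 2), (1, 3), (3, 1)]:
    print(f"N=3 R={r} W={w}: {leaning(3, r, w)}")
```

Smaller quorums wait on fewer replicas and so respond sooner, which is the latency side of the tradeoff; larger, overlapping quorums make it more likely that a read observes the latest acknowledged write.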

DDBS	P+A	P+C	E+L	E+C
Aerospike^[10]		paid only	optional	
Bigtable/HBase		Yes		Yes
Cassandra	Yes		Yes^[a]	Yes^[a]
Cosmos DB	Yes		Yes^[b]	
Couchbase		Yes		
Dynamo	Yes		Yes^[a]	
DynamoDB				
FaunaDB^[12]		Yes	Yes	Yes
Hazelcast IMDG^[8]^[9]	Yes	Yes	Yes	Yes
Megastore		Yes		Yes
MongoDB	Yes			Yes
MySQL Cluster		Yes		Yes
PNUTS		Yes	Yes	
PostgreSQL		Yes		Yes
Riak	Yes		Yes^[a]	
SpiceDB^[13]				
VoltDB/H-Store		Yes		Yes

Notes

[a] Dynamo, Cassandra, and Riak have user-adjustable settings to control the LC tradeoff.^[6]
[b] Cosmos DB has five selectable consistency levels to control the LC tradeoff.^[11]

References

[1] Golab, Wojciech (2018). "Proving PACELC". ACM SIGACT News. 49 (1): 73–81. doi:10.1145/3197406.3197420. S2CID 3989621.
[2] Abadi, Daniel J. (2010-04-23). "DBMS Musings: Problems with CAP, and Yahoo's little known NoSQL system". Retrieved 2016-09-11.
[3] Abadi, Daniel J. "Consistency Tradeoffs in Modern Distributed Database System Design" (PDF). Yale University.
[4] Abadi, Daniel (15 July 2019). "The dangers of conditional consistency guarantees". DBMS Musings. Retrieved 29 August 2024.
[5] Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems. O'Reilly Media. ISBN 978-1449373320.
[6] Abadi, Daniel J.; Murdopo, Arinto (2012-04-17). "Consistency Tradeoffs in Modern Distributed Database System Design". Retrieved 2022-07-18.
[7] "Global tables - multi-Region replication for DynamoDB". AWS Documentation. Retrieved 4 January 2023.
[8] Abadi, Daniel (2017-10-08). "DBMS Musings: Hazelcast and the Mythical PA/EC System". DBMS Musings. Retrieved 2017-10-20.
[9] "Hazelcast IMDG Reference Manual". docs.hazelcast.org. Retrieved 2020-09-17.
[10] Porter, Kevin (29 March 2023). "Where does aerospike fall in PACELC?". Aerospike Community Forum. Retrieved 30 March 2023.
[11] "Consistency Levels in Azure Cosmos DB". Retrieved 2021-06-21.
[12] Abadi, Daniel (2018-09-21). "DBMS Musings: NewSQL database systems are failing to guarantee consistency, and I blame Spanner". DBMS Musings. Retrieved 2019-02-23.
[13] Zelinskie, Jimmy (2024-04-23). "SpiceDB Concepts: Consistency". SpiceDB documentation. Retrieved 2024-05-02.

External links

"Consistency Tradeoffs in Modern Distributed Database System Design", by Daniel J. Abadi, Yale University. Original paper that formalized PACELC.
"Problems with CAP, and Yahoo's little known NoSQL system", by Daniel J. Abadi, Yale University. Original blog post that first described PACELC.
"Proving PACELC", by Wojciech Golab, University of Waterloo. Formal proof of the PACELC theorem.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
