Eventual consistency

Eventual consistency is a consistency model used in distributed computing to achieve high availability that informally guarantees that, if no new updates are made to a given data item, eventually all accesses to that item will return the last updated value. [1] Eventual consistency, also called optimistic replication, [2] is widely deployed in distributed systems and has origins in early mobile computing projects. [3] A system that has achieved eventual consistency is often said to have converged, or achieved replica convergence. [4] Eventual consistency is a weak guarantee – most stronger models, like linearizability, are trivially eventually consistent.
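
To illustrate, a minimal Python sketch (with hypothetical names, not any particular system's API) of two last-writer-wins replicas: each accepts writes independently, so they can temporarily disagree, but once updates stop and they exchange state, every read returns the last updated value.

```python
class Replica:
    """Hypothetical last-writer-wins key-value replica, used only for illustration."""

    def __init__(self):
        self.store = {}  # key -> (timestamp, value)

    def write(self, key, value, ts):
        current = self.store.get(key)
        if current is None or ts > current[0]:
            self.store[key] = (ts, value)

    def read(self, key):
        entry = self.store.get(key)
        return entry[1] if entry else None

    def sync_from(self, other):
        """Anti-entropy step: pull the other replica's entries and keep the newer ones."""
        for key, (ts, value) in other.store.items():
            self.write(key, value, ts)


# Two replicas accept writes independently and may temporarily disagree ...
a, b = Replica(), Replica()
a.write("x", "v1", ts=1)
b.write("x", "v2", ts=2)  # a later write that only b has seen so far
assert a.read("x") != b.read("x")

# ... but once updates stop and state is exchanged, all reads return the last value.
a.sync_from(b)
b.sync_from(a)
assert a.read("x") == b.read("x") == "v2"
```

Because the merge keeps the entry with the larger timestamp, repeated exchanges are idempotent and order-independent, which is what allows the replicas to converge.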

Eventually-consistent services are often classified as providing BASE semantics (basically-available, soft-state, eventual consistency), in contrast to traditional ACID (atomicity, consistency, isolation, durability). [5] [6] In chemistry, a base is the opposite of an acid, which helps in remembering the acronym. [7] According to the same resource, these are the rough definitions of each term in BASE:

  1. Basically available: read and write operations are available as much as possible, but without any consistency guarantee; a write may not persist once conflicts are reconciled, and a read may not return the latest write.
  2. Soft state: without consistency guarantees, after some amount of time there is only some probability of knowing the state, since it may not yet have converged.
  3. Eventually consistent: if the system functions long enough after the last write, the state of the data item becomes known, and any further reads of that item return the same value.

Eventual consistency is sometimes criticized [8] as increasing the complexity of distributed software applications. This is partly because eventual consistency is purely a liveness guarantee (reads eventually return the same value) and does not guarantee safety: an eventually consistent system can return any value before it converges.

Conflict resolution

In order to ensure replica convergence, a system must reconcile differences between multiple copies of distributed data. This consists of two parts:

  1. Exchanging versions or updates of data between servers (often known as anti-entropy); [9]
  2. Choosing an appropriate final state when concurrent updates have occurred, called reconciliation.

The most appropriate approach to reconciliation depends on the application. A widespread approach is "last writer wins". [1] Another is to invoke a user-specified conflict handler. [4] Timestamps and vector clocks are often used to detect concurrency between updates. "First writer wins" is sometimes used in situations where "last writer wins" is unacceptable. [10]
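
A small sketch of that detection step (illustrative code, not a specific library): two updates are concurrent, and therefore need reconciliation, exactly when neither vector clock dominates the other.

```python
def happens_before(vc_a, vc_b):
    """True if vector clock vc_a is causally before vc_b (strictly dominated by it)."""
    keys = set(vc_a) | set(vc_b)
    return (all(vc_a.get(k, 0) <= vc_b.get(k, 0) for k in keys)
            and any(vc_a.get(k, 0) < vc_b.get(k, 0) for k in keys))


def concurrent(vc_a, vc_b):
    """Neither update happened before the other: a genuine conflict to reconcile."""
    return not happens_before(vc_a, vc_b) and not happens_before(vc_b, vc_a)


# Updates accepted independently at nodes n1 and n2 are concurrent, so a policy
# such as "last writer wins" or a user-specified handler must pick the outcome.
v1 = {"n1": 2, "n2": 1}
v2 = {"n1": 1, "n2": 2}
assert concurrent(v1, v2)

# An update whose clock dominates another simply supersedes it; no conflict.
v3 = {"n1": 2, "n2": 2}
assert happens_before(v1, v3)
```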

Reconciliation of concurrent writes must occur at some point before the next read, and can be scheduled at different instants: [3] [11]

  1. Read repair: the correction is done when a read finds an inconsistency, slowing the read operation.
  2. Write repair: the correction takes place during a write operation, slowing the write operation.
  3. Asynchronous repair: the correction is not part of a read or write operation, and runs in the background.
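
The read-repair case can be sketched as follows, reusing the hypothetical Replica type from the earlier sketch: the read queries every replica, returns the newest value, and writes that value back to any replica holding a stale or missing copy.

```python
def read_with_repair(key, replicas):
    """Illustrative read repair: reconcile replicas as a side effect of a read."""
    results = {replica: replica.store.get(key) for replica in replicas}
    known = [entry for entry in results.values() if entry is not None]
    if not known:
        return None
    winner_ts, winner_value = max(known, key=lambda entry: entry[0])
    for replica, entry in results.items():
        if entry is None or entry[0] < winner_ts:
            replica.write(key, winner_value, winner_ts)  # repair on the read path
    return winner_value


a, b = Replica(), Replica()  # Replica as sketched above
a.write("x", "old", ts=1)
b.write("x", "new", ts=2)
assert read_with_repair("x", [a, b]) == "new"
assert a.read("x") == "new"  # the stale replica was repaired by the read
```

Write repair and asynchronous repair move the same reconciliation work to the write path or to a background process, respectively.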

Strong eventual consistency

Whereas eventual consistency is only a liveness guarantee (updates will be observed eventually), strong eventual consistency (SEC) adds the safety guarantee that any two nodes that have received the same (unordered) set of updates will be in the same state. If, furthermore, the system is monotonic, the application will never suffer rollbacks. A common approach to ensure SEC is conflict-free replicated data types. [12]
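
As one concrete example, here is a sketch of a grow-only counter (G-Counter), one of the simplest state-based CRDTs (illustrative Python, not a specific library): each replica records only its own increments, and merging takes the element-wise maximum, so any two replicas that have received the same set of updates end up in the same state, regardless of the order in which the updates arrived.

```python
class GCounter:
    """Grow-only counter CRDT: merge is an element-wise maximum, so it is
    commutative, associative, and idempotent, yielding strong eventual consistency."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.counts = {}  # node id -> increments observed from that node

    def increment(self, amount=1):
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + amount

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)


# Two replicas increment independently; merging in either order yields the same state.
a, b = GCounter("a"), GCounter("b")
a.increment(3)
b.increment(2)
a.merge(b)
b.merge(a)
assert a.value() == b.value() == 5
assert a.counts == b.counts
```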

Related Research Articles

In computer science, ACID is a set of properties of database transactions intended to guarantee data validity despite errors, power failures, and other mishaps. In the context of databases, a sequence of database operations that satisfies the ACID properties is called a transaction. For example, a transfer of funds from one bank account to another, even involving multiple changes such as debiting one account and crediting another, is a single transaction.

In information technology and computer science, especially in the fields of computer programming, operating systems, multiprocessors, and databases, concurrency control ensures that correct results for concurrent operations are generated, while getting those results as quickly as possible.

A database transaction symbolizes a unit of work, performed within a database management system against a database, that is treated in a coherent and reliable way independent of other transactions. A transaction generally represents any change in a database. Transactions in a database environment have two main purposes:

  1. To provide reliable units of work that allow correct recovery from failures and keep a database consistent even in cases of system failure. For example, when execution stops prematurely and unexpectedly, many operations upon the database may remain uncompleted, with unclear status.
  2. To provide isolation between programs accessing a database concurrently. If this isolation is not provided, the programs' outcomes are possibly erroneous.

In database systems, durability is the ACID property that guarantees that the effects of transactions that have been committed will survive permanently, even in case of failures, including incidents and catastrophic events. For example, if a flight booking reports that a seat has successfully been booked, then the seat will remain booked even if the system crashes.

In computer science, a consistency model specifies a contract between the programmer and a system, wherein the system guarantees that if the programmer follows the rules for operations on memory, memory will be consistent and the results of reading, writing, or updating memory will be predictable. Consistency models are used in distributed systems such as distributed shared memory systems or distributed data stores. Consistency is distinct from coherence: coherence deals with maintaining a global order in which writes to a single location or single variable are seen by all processors, while consistency deals with the ordering of operations to multiple locations with respect to all processors.

In computer science, an algorithm is called non-blocking if failure or suspension of any thread cannot cause failure or suspension of another thread; for some operations, these algorithms provide a useful alternative to traditional blocking implementations. A non-blocking algorithm is lock-free if there is guaranteed system-wide progress, and wait-free if there is also guaranteed per-thread progress. "Non-blocking" was used as a synonym for "lock-free" in the literature until the introduction of obstruction-freedom in 2003.

In database systems, consistency refers to the requirement that any given database transaction must change affected data only in allowed ways. Any data written to the database must be valid according to all defined rules, including constraints, cascades, triggers, and any combination thereof. This does not guarantee correctness of the transaction in all ways the application programmer might have wanted but merely that any programming errors cannot result in the violation of any defined database constraints.

Google File System is a proprietary distributed file system developed by Google to provide efficient, reliable access to data using large clusters of commodity hardware. It was replaced by Colossus in 2010.

Multi-master replication is a method of database replication which allows data to be stored by a group of computers, and updated by any member of the group. All members are responsive to client data queries. The multi-master replication system is responsible for propagating the data modifications made by each member to the rest of the group and resolving any conflicts that might arise between concurrent changes made by different members.

Replication in computing involves sharing information so as to ensure consistency between redundant resources, such as software or hardware components, to improve reliability, fault-tolerance, or accessibility.

In databases, and transaction processing, snapshot isolation is a guarantee that all reads made in a transaction will see a consistent snapshot of the database, and the transaction itself will successfully commit only if no updates it has made conflict with any concurrent updates made since that snapshot.
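
A toy sketch of that commit rule (all names are illustrative, not any particular database's API): each transaction reads from the snapshot taken when it began, and at commit time it aborts if any item it wrote has been committed by another transaction since that snapshot, a policy often described as "first committer wins".

```python
class SnapshotStore:
    """Toy multi-version store illustrating the snapshot-isolation commit check."""

    def __init__(self):
        self.commit_ts = 0
        self.versions = {}  # key -> list of (commit_ts, value), ascending by commit_ts

    def begin(self):
        return {"start_ts": self.commit_ts, "writes": {}}

    def read(self, txn, key):
        # Read the newest version committed at or before the transaction's snapshot.
        for ts, value in reversed(self.versions.get(key, [])):
            if ts <= txn["start_ts"]:
                return value
        return None

    def write(self, txn, key, value):
        txn["writes"][key] = value  # buffered until commit

    def commit(self, txn):
        # First committer wins: abort if a written key was committed after our snapshot.
        for key in txn["writes"]:
            versions = self.versions.get(key, [])
            if versions and versions[-1][0] > txn["start_ts"]:
                return False  # write-write conflict, the transaction must abort
        self.commit_ts += 1
        for key, value in txn["writes"].items():
            self.versions.setdefault(key, []).append((self.commit_ts, value))
        return True


store = SnapshotStore()
t1, t2 = store.begin(), store.begin()  # both start from the same snapshot
store.write(t1, "x", 1)
store.write(t2, "x", 2)
assert store.commit(t1) is True
assert store.commit(t2) is False  # concurrent conflicting update: t2 must abort
```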

Causal consistency is one of the major memory consistency models. In concurrent programming, where concurrent processes are accessing a shared memory, a consistency model restricts which accesses are legal. This is useful for defining correct data structures in distributed shared memory or distributed transactions.

Paxos is a family of protocols for solving consensus in a network of unreliable or fallible processors. Consensus is the process of agreeing on one result among a group of participants. This problem becomes difficult when the participants or their communications may experience failures.

Operational transformation (OT) is a technology for supporting a range of collaboration functionalities in advanced collaborative software systems. OT was originally invented for consistency maintenance and concurrency control in collaborative editing of plain text documents. Its capabilities have been extended and its applications expanded to include group undo, locking, conflict resolution, operation notification and compression, group-awareness, HTML/XML and tree-structured document editing, collaborative office productivity tools, application-sharing, and collaborative computer-aided media design tools. In 2009 OT was adopted as a core technique behind the collaboration features in Apache Wave and Google Docs.
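
The core idea can be shown for the simplest case, two concurrent character insertions (an illustration of the transformation step only, not the algorithm of any particular OT system): each operation is shifted against the other before being applied, so both sites end up with the same document, with ties at equal positions broken by a site identifier.

```python
def apply_insert(doc, op):
    pos, ch, _site = op
    return doc[:pos] + ch + doc[pos:]


def transform(op, against):
    """Shift op's position if the concurrent insert 'against' lands at or before it."""
    pos, ch, site = op
    other_pos, _other_ch, other_site = against
    if other_pos < pos or (other_pos == pos and other_site < site):
        return (pos + 1, ch, site)
    return (pos, ch, site)


# Both sites start from "ac"; site 1 inserts "b" at index 1, site 2 inserts "d" at index 2.
doc = "ac"
op1 = (1, "b", 1)
op2 = (2, "d", 2)

site1 = apply_insert(apply_insert(doc, op1), transform(op2, op1))
site2 = apply_insert(apply_insert(doc, op2), transform(op1, op2))
assert site1 == site2 == "abcd"
```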

In concurrency control of databases, transaction processing, and other transactional distributed applications, global serializability is a property of a global schedule of transactions. A global schedule is the unified schedule of all the individual database schedules in a multidatabase environment. Complying with global serializability means that the global schedule is serializable, while each component database (module) has a serializable schedule as well. A collection of individually serializable components does not by itself guarantee overall system serializability. The need for correctness across databases in multidatabase systems makes global serializability a major goal for global concurrency control. With the proliferation of the Internet, cloud computing, grid computing, and small, portable, powerful computing devices, as well as increasing sophistication of systems management, the need for atomic distributed transactions, and thus for effective global serializability techniques to ensure correctness in and among distributed transactional applications, continues to grow.

Optimistic replication, also known as lazy replication, is a strategy for replication, in which replicas are allowed to diverge.

In theoretical computer science, the CAP theorem, also named Brewer's theorem after computer scientist Eric Brewer, states that any distributed data store can provide only two of the following three guarantees:

  1. Consistency: every read receives the most recent write or an error.
  2. Availability: every request receives a (non-error) response, without the guarantee that it contains the most recent write.
  3. Partition tolerance: the system continues to operate despite an arbitrary number of messages being dropped or delayed by the network between nodes.

A distributed file system for cloud is a file system that allows many clients to have access to data and supports operations on that data. Each data file may be partitioned into several parts called chunks. Each chunk may be stored on different remote machines, facilitating the parallel execution of applications. Typically, data is stored in files in a hierarchical tree, where the nodes represent directories. There are several ways to share files in a distributed architecture: each solution must be suitable for a certain type of application, depending on how complex the application is. Meanwhile, the security of the system must be ensured. Confidentiality, availability and integrity are the main keys for a secure system.

In distributed computing, a conflict-free replicated data type (CRDT) is a data structure that is replicated across multiple computers in a network, with the following features:

  1. The application can update any replica independently, concurrently and without coordinating with other replicas.
  2. An algorithm automatically resolves any inconsistencies that might occur.
  3. Although replicas may have different state at any particular point in time, they are guaranteed to eventually converge.
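
As an illustration of these three features, here is a sketch of one simple CRDT, a last-writer-wins element set (hypothetical code, not taken from any particular library): adds and removes carry timestamps, merging keeps the newest timestamp per element, and replicas agree once they have exchanged state.

```python
class LWWElementSet:
    """Last-writer-wins element set: a simple state-based CRDT over sets."""

    def __init__(self):
        self.adds = {}     # element -> timestamp of the latest add
        self.removes = {}  # element -> timestamp of the latest remove

    def add(self, element, ts):
        self.adds[element] = max(self.adds.get(element, 0), ts)

    def remove(self, element, ts):
        self.removes[element] = max(self.removes.get(element, 0), ts)

    def contains(self, element):
        # An element is present if its most recent operation was an add.
        return self.adds.get(element, 0) > self.removes.get(element, 0)

    def merge(self, other):
        for element, ts in other.adds.items():
            self.add(element, ts)
        for element, ts in other.removes.items():
            self.remove(element, ts)


# Replicas update independently (1), merging resolves the differences (2),
# and after exchanging state both replicas hold the same answer (3).
r1, r2 = LWWElementSet(), LWWElementSet()
r1.add("apple", ts=1)
r2.add("apple", ts=2)
r2.remove("apple", ts=3)
r1.merge(r2)
r2.merge(r1)
assert not r1.contains("apple") and not r2.contains("apple")
assert (r1.adds, r1.removes) == (r2.adds, r2.removes)
```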

In theoretical computer science, the PACELC theorem is an extension to the CAP theorem. It states that in case of network partitioning (P) in a distributed computer system, one has to choose between availability (A) and consistency (C), but else (E), even when the system is running normally in the absence of partitions, one has to choose between latency (L) and loss of consistency (C).

References

  1. Vogels, W. (2009). "Eventually consistent". Communications of the ACM. 52: 40–44. doi:10.1145/1435417.1435432.
  2. Vogels, W. (2008). "Eventually Consistent". Queue. 6 (6): 14–19. doi: 10.1145/1466443.1466448 .
  3. Terry, D. B.; Theimer, M. M.; Petersen, K.; Demers, A. J.; Spreitzer, M. J.; Hauser, C. H. (1995). "Managing update conflicts in Bayou, a weakly connected replicated storage system". Proceedings of the fifteenth ACM symposium on Operating systems principles - SOSP '95. p. 172. CiteSeerX 10.1.1.12.7323. doi:10.1145/224056.224070. ISBN 978-0897917155. S2CID 7834967.
  4. Petersen, K.; Spreitzer, M. J.; Terry, D. B.; Theimer, M. M.; Demers, A. J. (1997). "Flexible update propagation for weakly consistent replication". ACM SIGOPS Operating Systems Review. 31 (5): 288. CiteSeerX 10.1.1.17.555. doi:10.1145/269005.266711.
  5. Pritchett, D. (2008). "Base: An Acid Alternative". Queue. 6 (3): 48–55. doi: 10.1145/1394127.1394128 .
  6. Bailis, P.; Ghodsi, A. (2013). "Eventual Consistency Today: Limitations, Extensions, and Beyond". Queue. 11 (3): 20. doi: 10.1145/2460276.2462076 .
  7. Roe, Charles (March 2012). "ACID vs. BASE: The Shifting pH of Database Transaction Processing". DATAVERSITY. DATAVERSITY Education, LLC. Retrieved 29 August 2019.
  8. Pessach, Yaniv (2013). Distributed Storage: Concepts, Algorithms, and Implementations. Amazon. OL 25423189M. "Systems using Eventual Consistency result in decreased system load and increased system availability but result in increased cognitive complexity for users and developers."
  9. Demers, A.; Greene, D.; Hauser, C.; Irish, W.; Larson, J. (1987). "Epidemic algorithms for replicated database maintenance". Proceedings of the sixth annual ACM Symposium on Principles of distributed computing - PODC '87. p. 1. doi:10.1145/41840.41841. ISBN   978-0-89791-239-6. S2CID   1889203.
  10. Rockford Lhotka. "Concurrency techniques" Archived 2018-05-11 at the Wayback Machine . 2003.
  11. Olivier Mallassi (2010-06-09). "Let's play with Cassandra… (Part 1/3)". OCTO Talks!. Retrieved 2011-03-23. Of course, at a given time, chances are high that each node has its own version of the data. Conflict resolution is made during the read requests (called read-repair) and the current version of Cassandra does not provide a Vector Clock conflict resolution mechanisms [sic] (should be available in the version 0.7). Conflict resolution is so based on timestamp (the one set when you insert the row or the column): the higher timestamp win[s] and the node you are reading the data [from] is responsible for that. This is an important point because the timestamp is specified by the client, at the moment the column is inserted. Thus, all Cassandra clients' [sic] need to be synchronized...
  12. Shapiro, Marc; Preguiça, Nuno; Baquero, Carlos; Zawirski, Marek (2011-10-10). "Conflict-free replicated data types". SSS'11 Proceedings of the 13th International Conference on Stabilization, Safety, and the Security of Distributed Systems. Springer-Verlag Berlin, Heidelberg: 386–400.