Eventual consistency

Eventual consistency is a consistency model used in distributed computing to achieve high availability. Informally, it guarantees that, if no new updates are made to a given data item, all accesses to that item will eventually return the last updated value. [1] Eventual consistency, also called optimistic replication, [2] is widely deployed in distributed systems and has its origins in early mobile computing projects. [3] A system that has achieved eventual consistency is often said to have converged, or achieved replica convergence. [4] Eventual consistency is a weak guarantee: most stronger models, like linearizability, are trivially eventually consistent, but a system that is merely eventually consistent does not usually fulfill these stronger constraints.

Eventually-consistent services are often classified as providing BASE (Basically Available, Soft state, Eventual consistency) semantics, in contrast to traditional ACID (Atomicity, Consistency, Isolation, Durability) guarantees. [5] [6] In chemistry, a base is the opposite of an acid, which helps in remembering the acronym. [7] According to the same resource, these are the rough definitions of each term in BASE:

  - Basically available: basic reading and writing operations are available as much as possible (using all nodes of a database cluster), but without any kind of consistency guarantees (a write may not persist after conflicts are reconciled, and a read may not get the latest write).
  - Soft state: without consistency guarantees, after some amount of time we only have some probability of knowing the state, since it might not yet have converged.
  - Eventually consistent: if we execute some writes and then the system works long enough, we can know the state of the data; any further reads of that data item will return the same value.

Eventual consistency is sometimes criticized [8] as increasing the complexity of distributed software applications. This is partly because eventual consistency is purely a liveness guarantee (reads eventually return the same value) and does not make safety guarantees: an eventually consistent system can return any value before it converges.

Conflict resolution

In order to ensure replica convergence, a system must reconcile differences between multiple copies of distributed data. This consists of two parts:

  - exchanging versions or updates of data between servers (a process often known as anti-entropy, sketched below); [9] and
  - choosing an appropriate final state when concurrent updates have occurred, called reconciliation.
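
The anti-entropy part can be illustrated with a toy gossip exchange. The sketch below is a hedged illustration, not code from the cited systems: replicas are assumed to be plain Python dicts mapping a key to a (value, timestamp) pair, and the newer timestamp wins during an exchange.

```python
# Toy anti-entropy pass: two randomly chosen replicas exchange their
# newer versions of each key. The data layout (key -> (value, timestamp))
# is an illustrative assumption, not taken from the cited systems.
import random

def anti_entropy_pass(replicas):
    a, b = random.sample(replicas, 2)
    for key in a.keys() | b.keys():
        va, vb = a.get(key), b.get(key)
        if vb is None or (va is not None and va[1] > vb[1]):
            b[key] = va          # a's version is newer (or b lacks the key)
        elif va is None or vb[1] > va[1]:
            a[key] = vb          # b's version is newer (or a lacks the key)

stores = [{"x": ("a", 1)}, {"x": ("b", 2)}, {}]
for _ in range(20):              # repeated random pairings spread updates
    anti_entropy_pass(stores)
print(stores)                    # with high probability, all hold ("b", 2)
```
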
The most appropriate approach to reconciliation depends on the application. A widespread approach is "last writer wins". [1] Another is to invoke a user-specified conflict handler. [4] Timestamps and vector clocks are often used to detect concurrency between updates. Some people use "first writer wins" in situations where "last writer wins" is unacceptable. [10]
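
Where causality matters, vector clocks can tell an update that supersedes another apart from two truly concurrent updates, with a timestamp-based "last writer wins" rule breaking ties. The following sketch is illustrative only; the Versioned record and its fields are assumptions, not an API from the cited papers.

```python
# Vector-clock concurrency detection with a last-writer-wins tie-break.
from dataclasses import dataclass, field

@dataclass
class Versioned:
    value: str
    clock: dict = field(default_factory=dict)  # replica id -> event count
    wall_time: float = 0.0                     # client timestamp for LWW

def dominates(a: dict, b: dict) -> bool:
    """True if clock `a` has seen at least every event that `b` has."""
    return all(a.get(r, 0) >= n for r, n in b.items())

def reconcile(x: Versioned, y: Versioned) -> Versioned:
    if dominates(x.clock, y.clock):  # x causally supersedes y
        return x
    if dominates(y.clock, x.clock):  # y causally supersedes x
        return y
    # Neither dominates: the writes were concurrent, so fall back to
    # "last writer wins" on wall-clock time and merge the clocks.
    winner = x if x.wall_time >= y.wall_time else y
    merged = {r: max(x.clock.get(r, 0), y.clock.get(r, 0))
              for r in x.clock.keys() | y.clock.keys()}
    return Versioned(winner.value, merged, winner.wall_time)
```

Note that a purely timestamp-based variant makes correctness depend on client clock synchronization, exactly the caveat raised in the Cassandra discussion quoted in [11].
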

Reconciliation of concurrent writes must occur sometime before the next read, and can be scheduled at different instants: [3] [11]

  - Read repair: the correction is done when a read finds an inconsistency (sketched below). This slows down the read operation.
  - Write repair: the correction takes place during a write operation, if an inconsistency has been found, slowing down the write operation.
  - Asynchronous repair: the correction is not part of a read or write operation.
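
As an illustration of the first option, here is a minimal read-repair sketch under simplifying assumptions: replica stores are plain dicts, and the freshest version is chosen by a client-supplied timestamp, roughly matching the Cassandra behaviour quoted in [11].

```python
import time

replicas = [{}, {}, {}]   # three replica stores: key -> (value, timestamp)

def write(key, value, ts=None, nodes=(0,)):
    """Write to a subset of replicas; the others are left stale."""
    ts = ts if ts is not None else time.time()
    for n in nodes:
        replicas[n][key] = (value, ts)

def read_with_repair(key):
    """Return the newest replica version and push it back to any
    replica holding an older or missing version (read repair)."""
    versions = [(n, store.get(key)) for n, store in enumerate(replicas)]
    newest = max((v for _, v in versions if v is not None),
                 key=lambda v: v[1], default=None)
    if newest is None:
        return None
    for n, v in versions:
        if v != newest:              # stale or missing: repair in passing
            replicas[n][key] = newest
    return newest[0]

write("x", "a", ts=1, nodes=(0,))    # only replica 0 sees the first write
write("x", "b", ts=2, nodes=(1,))    # replica 1 gets a newer write
print(read_with_repair("x"))         # "b"; replicas 0 and 2 are repaired
```
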
Strong eventual consistency

Whereas eventual consistency is only a liveness guarantee (updates will be observed eventually), strong eventual consistency (SEC) adds the safety guarantee that any two nodes that have received the same (unordered) set of updates will be in the same state. If, furthermore, the system is monotonic, the application will never suffer rollbacks. Conflict-free replicated data types are a common approach to ensuring SEC. [12]
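
For instance, a state-based grow-only counter converges because its merge function is a pointwise maximum, which is commutative, associative, and idempotent: any two replicas that have merged the same set of states end up in the same state. A minimal sketch, assuming illustrative replica ids rather than any particular library's API:

```python
class GCounter:
    """State-based grow-only counter CRDT: one slot per replica."""
    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}                 # replica id -> local increments

    def increment(self, n=1):
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + n

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        # Pointwise maximum: order- and duplication-insensitive, so
        # replicas that saw the same updates reach the same state (SEC).
        for rid, n in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), n)

a, b = GCounter("a"), GCounter("b")
a.increment(2); b.increment(3)
a.merge(b); b.merge(a)                   # exchange states in any order
assert a.value() == b.value() == 5       # same update set -> same state
```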

See also

  - ACID
  - CAP theorem
  - Causal consistency
  - Concurrency control
  - Conflict-free replicated data type
  - Consistency model
  - Database transaction
  - Distributed file system for cloud
  - Global serializability
  - Google File System
  - Multi-master replication
  - Multiversion concurrency control
  - NewSQL
  - Non-blocking algorithm
  - Operational transformation
  - Optimistic replication
  - Paxos (computer science)
  - Replication (computing)
  - Scalability
  - Snapshot isolation

References

  1. Vogels, W. (2009). "Eventually consistent". Communications of the ACM. 52: 40–44. doi:10.1145/1435417.1435432.
  2. Vogels, W. (2008). "Eventually Consistent". Queue. 6 (6): 14–19. doi:10.1145/1466443.1466448.
  3. Terry, D. B.; Theimer, M. M.; Petersen, K.; Demers, A. J.; Spreitzer, M. J.; Hauser, C. H. (1995). "Managing update conflicts in Bayou, a weakly connected replicated storage system". Proceedings of the fifteenth ACM symposium on Operating systems principles - SOSP '95. p. 172. CiteSeerX 10.1.1.12.7323. doi:10.1145/224056.224070. ISBN 978-0897917155. S2CID 7834967.
  4. Petersen, K.; Spreitzer, M. J.; Terry, D. B.; Theimer, M. M.; Demers, A. J. (1997). "Flexible update propagation for weakly consistent replication". ACM SIGOPS Operating Systems Review. 31 (5): 288. CiteSeerX 10.1.1.17.555. doi:10.1145/269005.266711.
  5. Pritchett, D. (2008). "Base: An Acid Alternative". Queue. 6 (3): 48–55. doi:10.1145/1394127.1394128.
  6. Bailis, P.; Ghodsi, A. (2013). "Eventual Consistency Today: Limitations, Extensions, and Beyond". Queue. 11 (3): 20. doi:10.1145/2460276.2462076.
  7. Roe, Charles (March 2012). "ACID vs. BASE: The Shifting pH of Database Transaction Processing". DATAVERSITY. DATAVERSITY Education, LLC. Retrieved 29 August 2019.
  8. Pessach, Yaniv (2013). Distributed Storage: Concepts, Algorithms, and Implementations. Amazon. OL 25423189M. "Systems using Eventual Consistency result in decreased system load and increased system availability but result in increased cognitive complexity for users and developers."
  9. Demers, A.; Greene, D.; Hauser, C.; Irish, W.; Larson, J. (1987). "Epidemic algorithms for replicated database maintenance". Proceedings of the sixth annual ACM Symposium on Principles of distributed computing - PODC '87. p. 1. doi:10.1145/41840.41841. ISBN 978-0-89791-239-6. S2CID 1889203.
  10. Lhotka, Rockford (2003). "Concurrency techniques".
  11. Mallassi, Olivier (2010-06-09). "Let's play with Cassandra… (Part 1/3)". OCTO Talks!. http://blog.octo.com/en/. Retrieved 2011-03-23. "Of course, at a given time, chances are high that each node has its own version of the data. Conflict resolution is made during the read requests (called read-repair) and the current version of Cassandra does not provide a Vector Clock conflict resolution mechanisms [sic] (should be available in the version 0.7). Conflict resolution is so based on timestamp (the one set when you insert the row or the column): the higher timestamp win[s] and the node you are reading the data [from] is responsible for that. This is an important point because the timestamp is specified by the client, at the moment the column is inserted. Thus, all Cassandra clients' [sic] need to be synchronized..."
  12. Shapiro, Marc; Preguiça, Nuno; Baquero, Carlos; Zawirski, Marek (2011-10-10). "Conflict-free replicated data types". SSS'11: Proceedings of the 13th International Conference on Stabilization, Safety, and Security of Distributed Systems. Springer-Verlag Berlin, Heidelberg: 386–400.