Operational transformation (OT) is a technology for supporting a range of collaboration functionalities in advanced collaborative software systems. OT was originally invented for consistency maintenance and concurrency control in collaborative editing of plain text documents. Its capabilities have been extended and its applications expanded to include group undo, locking, conflict resolution, operation notification and compression, group-awareness, HTML/XML and tree-structured document editing, collaborative office productivity tools, application-sharing, and collaborative computer-aided media design tools. [1] In 2009 OT was adopted as a core technique behind the collaboration features in Apache Wave and Google Docs.
Operational Transformation was pioneered by C. Ellis and S. Gibbs [2] in the GROVE (GRoup Outline Viewing Edit) system in 1989. Several years later, some correctness issues were identified and several approaches [3] [4] [5] [6] were independently proposed to solve these issues, which was followed by another decade of continuous efforts of extending and improving OT by a community of dedicated researchers. In 1998, a Special Interest Group on Collaborative Editing [7] was set up to promote communication and collaboration among CE and OT researchers. Since then, SIGCE holds annual CE workshops in conjunction with major CSCW (Computer Supported Cooperative Work) conferences, such as ACM, CSCW, GROUP and ECSCW.
Collaboration systems utilizing Operational Transformations typically use replicated document storage, where each client has their own copy of the document; clients operate on their local copies in a lock-free, non-blocking manner, and the changes are then propagated to the rest of the clients; this ensures the client high responsiveness in an otherwise high-latency environment such as the Internet. When a client receives the changes propagated from another client, it typically transforms the changes before executing them; the transformation ensures that application-dependent consistency criteria (invariants) are maintained by all sites. This mode of operation results in a system particularly suited for implementing collaboration features, like simultaneous document editing, in a high-latency environment such as the web.
The basic idea of OT can be illustrated by using a simple text editing scenario as follows. Given a text document with a string "abc" replicated at two collaborating sites; and two concurrent operations:
generated by two users at collaborating sites 1 and 2, respectively. Suppose the two operations are executed in the order of O1 and O2 (at site 1). After executing O1, the document becomes "xabc". To execute O2 after O1, O2 must be transformed against O1 to become: O2' = Delete[3, "c"], whose positional parameter is incremented by one due to the insertion of one character "x" by O1. Executing O2' on "xabc" deletes the correct character "c" and the document becomes "xab". However, if O2 is executed without transformation, it incorrectly deletes character "b" rather than "c". The basic idea of OT is to transform (or adjust) the parameters of an editing operation according to the effects of previously executed concurrent operations so that the transformed operation can achieve the correct effect and maintain document consistency.
One functionality of OT is to support consistency maintenance in collaborative editing systems. A number of consistency models have been proposed in the research community, some generally for collaborative editing systems, and some specifically for OT algorithms.
In Ellis and Gibbs 1989 paper "Concurrency control in groupware systems", [2] two consistency properties are required for collaborative editing systems:
Since concurrent operations may be executed in different orders and editing operations are not commutative in general, copies of the document at different sites may diverge (inconsistent). The first OT algorithm was proposed in Ellis and Gibbs's paper [2] to achieve convergence in a group text editor; the state-vector (or vector clock in classic distributed computing) was used to preserve the precedence property.
The CCI model was proposed as a consistency management in collaborative editing systems. [4] [8] Under the CCI model, three consistency properties are grouped together:
The CCI model extends the CC model with a new criterion: intention preservation. The essential difference between convergence and intention preservation is that the former can always be achieved by a serialization protocol, but the latter may not be achieved by any serialization protocol if operations were always executed in their original forms. Achieving the nonserialisable intention preservation property has been a major technical challenge. OT has been found particularly suitable for achieving convergence and intention preservation in collaborative editing systems.
The CCI model is independent of document types or data models, operation types, or supporting techniques (OT, multi-versioning, serialization, undo/redo). It was not intended for correctness verification for techniques (e.g. OT) that are designed for specific data and operation models and for specific applications. In, [4] the notion of intention preservation was defined and refined at three levels: First, it was defined as a generic consistency requirement for collaborative editing systems; Second, it was defined as operation context-based pre- and post- transformation conditions for generic OT functions; Third, it was defined as specific operation verification criteria to guide the design of OT functions for two primitive operations: string-wise insert and delete, in collaborative plain text editors.
The condition of intention preservation was not formally specified in the CCI model for purposes of formal proofs. The SDT [9] and LBT [10] approaches try to formalize an alternative conditions that can be proved. The consistency model proposed in these two approaches consist of the following formal conditions:
The above CSM model requires that a total order of all objects in the system be specified. Effectively, the specification is reduced to new objects introduced by insert operations. However, specification of the total order entails application-specific policies such as those to break insertion ties (i.e., new objects inserted by two current operations at the same position). Consequently, the total order becomes application specific. Moreover, in the algorithm, the total order must be maintained in the transformation functions and control procedure, which increases time/space complexities of the algorithm.
Alternatively, the CA model is based on the admissibility theory. [11] The CA model includes two aspects:
These two conditions imply convergence. All cooperating sites converge in a state in which there is a same set of objects that are in the same order. Moreover, the ordering is effectively determined by the effects of the operations when they are generated. Since the two conditions also impose additional constraints on object ordering, they are actually stronger than convergence. The CA model and the design/prove approach are elaborated in the 2005 paper. [11] It no longer requires that a total order of objects be specified in the consistency model and maintained in the algorithm, which hence results in reduced time/space complexities in the algorithm.
OT is a system of multiple components. One established strategy of designing OT systems [2] [3] [4] [5] [12] [13] is to separate the high-level transformation control (or integration) algorithms from the low-level transformation functions.
OT Control Algorithms (determine which operations are transformed against others according to their concurrency/context relations) |
OT properties and conditions (divide responsibilities between algorithms and functions) |
OT Transformation Functions (determine how to transform a pair of primitive operations according to operation types, positions, and other parameters) |
The transformation control algorithm is concerned with determining:
The control algorithm invokes a corresponding set of transformation functions, which determine how to transform one operation against another according to the operation types, positions, and other parameters. The correctness responsibilities of these two layers are formally specified by a set of transformation properties and conditions. Different OT systems with different control algorithms, functions, and communication topologies require maintaining different sets of transformation properties. The separation of an OT system into these two layers allows for the design of generic control algorithms that are applicable to different kinds of application with different data and operation models.
The other alternative approach was proposed in. [11] In their approach, an OT algorithm is correct if it satisfies two formalized correctness criteria:
As long as these two criteria are satisfied, the data replicas converge (with additional constraints) after all operations are executed at all sites. There is no need to enforce a total order of execution for the sake of achieving convergence. Their approach is generally to first identify and prove sufficient conditions for a few transformation functions, and then design a control procedure to ensure those sufficient conditions. This way the control procedure and transformation functions work synergistically to achieve correctness, i.e., causality and admissibility preservation. In their approach, there is no need to satisfy transformation properties such as TP2 because it does not require that the (inclusive) transformation functions work in all possible cases.
There exist two underlying models in each OT system: the data model that defines the way data objects in a document are addressed by operations, and the operation model that defines the set of operations that can be directly transformed by OT functions. Different OT systems may have different data and operation models. For example, the data model of the first OT system [2] is a single linear address space; and its operation model consists of two primitive operations: character-wise insert and delete. The basic operation model has been extended to include a third primitive operation update to support collaborative Word document processing [14] and 3D model editing. [15] The basic OT data model has been extended into a hierarchy of multiple linear addressing domains, [16] [17] [18] which is capable of modeling a broad range of documents. A data adaptation process is often required to map application-specific data models to an OT-compliant data model. [19] [20]
There exist two approaches to supporting application level operations in an OT system:
Various OT functions have been designed for OT systems with different capabilities and used for different applications. OT functions used in different OT systems may be named differently, but they can be classified into two categories:
For example, suppose a type String with an operation ins(p, c, sid) where p is the position of insertion, c is the character to insert and sid is the identifier of the site that has generated the operation. We can write the following inclusion transformation function: [23]
T(ins(),ins()) :- if () return ins() else if ( and ) return ins() else return ins()
We can also write the following exclusion transformation function: [23]
(ins(),ins()) :- if () return ins() else if ( and ) return ins() else return ins()
Some OT systems use both IT and ET functions, and some use only IT functions. The complexity of OT function design is determined by various factors:
Various transformation properties for ensuring OT system correctness have been identified. These properties can be maintained by either the transformation control algorithm [4] [5] [13] [20] [28] [29] or by the transformation functions. [30] Different OT system designs have different division of responsibilities among these components. The specifications of these properties and preconditions of requiring them are given below.
The following two properties are related to achieving convergence.
The following three properties are related to achieving the desired group undo effect. They are:
Various OT control algorithms have been designed for OT systems with different capabilities and for different applications. The complexity of OT control algorithm design is determined by multiple factors. A key differentiating factor is whether an algorithm is capable of supporting concurrency control (do) and/or group undo. [3] [8] [12] [29] [31] In addition, different OT control algorithm designs make different tradeoffs in:
Most existing OT control algorithms for concurrency control adopt the theory of causality/concurrency as the theoretical basis: causally related operations must be executed in their causal order; concurrent operations must be transformed before their execution. However, it was well known that concurrency condition alone cannot capture all OT transformation conditions. [3] [4] [5] [8] [32] In a recent work, the theory of operation context has been proposed to explicitly represent the notion of a document state, which can be used to formally express OT transformation conditions for supporting the design and verification of OT control algorithms. [29]
The following table gives an overview of some existing OT control/integration algorithms
OT control/integration algorithms (systems) | Required transformation function types | Support OT-based Do? | Support OT-based Undo? | Transformation properties supported by control algorithm | Transformation properties supported by transformation functions | Transformation ordering and propagation constraints | Timestamp |
---|---|---|---|---|---|---|---|
dOPT [2] (GROVE) | T (IT) | Yes | No | None | CP1/TP1, CP2/TP2 | Causal order | State vector |
selective-undo [12] (DistEdit) | Transpose (IT and ET) | No | Selective Undo | NA | CP1/TP1, CP2/TP2, RP, IP1, IP2, IP3 | Causal order | ?? |
adOPTed [3] [31] (JOINT EMACS) | LTransformation (IT) | Yes | Chronological Undo | IP2, IP3 | CP1/TP1, CP2/TP2, IP1 | Causal order | State vector |
Jupiter [5] | xform (IT) | Yes | No | CP2/TP2 | CP1/TP1 | Causal order + Central transformation server | Scalar |
Google Wave OT [20] | transform and composition(IT) | Yes | No | CP2/TP2 | CP1/TP1 | Causal order + Central transformation server + stop'n'wait propagation protocol | Scalar |
GOT [4] (REDUCE) | IT and ET | Yes | No | CP1/TP1, CP2/TP2 | None | Causal order + Discontinuous total order | State vector |
GOTO [6] (REDUCE, CoWord, CoPPT, CoMaya) | IT and ET | Yes | No | None | CP1/TP1, CP2/TP2 | Causal order | State vector |
AnyUndo [8] (REDUCE, CoWord, CoPPT, CoMaya) | IT and ET | No | Undo any operation | IP2, IP3, RP | IP1, CP1/TP1, CP2/TP2 | Causal order | State vector |
SCOP [28] (NICE) | IT | Yes | No | CP2/TP2 | CP1/TP1 | Causal order + Central transformation server | Scalar |
COT [29] (REDUCE, CoWord, CoPPT, CoMaya) | IT | Yes | Undo any operation | CP2/TP2, IP2, IP3 | CP1/TP1, (no ET therefore no IP1 necessary) | Causal order + Discontinuous total order | Context vector |
TIBOT [33] | IT | Yes | No | CP2/TP2 | CP1/TP1 | Causal order | Scalar |
SOCT4 [13] | Forward transformation (IT) | Yes | No | CP2/TP2 | CP1/TP1 | Causal order + Continuous total order | Scalar |
SOCT2 [32] | Forward Transformation(IT) and Backward Transformation(ET) | Yes | No | None | CP1/TP1, CP2/TP2, RP | Causal order | State vector |
MOT2 [34] | Forward transformation (IT) | Yes | No | ?? | CP1/TP1, CP2/TP2 | ?? | scalar |
A continuous total order is a strict total order where it is possible to detect a missing element i.e. 1,2,3,4,... is a continuous total order, 1,2,3,5,... is not a continuous total order.
The transformation-based algorithms proposed in [10] [11] are based on the alternative consistency models "CSM" and "CA" as described above. Their approaches differ from those listed in the table. They use vector timestamps for causality preservation. The other correctness conditions are "single-"/"multi-" operation effects relation preservation or "admissibility" preservation. Those conditions are ensured by the control procedure and transformation functions synergistically. There is no need to discuss TP1/TP2 in their work. Hence they are not listed in the above table.
There exist some other optimistic consistency control algorithms that seek alternative ways to design transformation algorithms, but do not fit well with the above taxonomy and characterization. For example, Mark and Retrace [35]
The correctness problems of OT led to introduction of transformationless post-OT schemes, such as WOOT, [36] Logoot [37] and Causal Trees (CT). [38] "Post-OT" schemes decompose the document into atomic operations, but they workaround the need to transform operations by employing a combination of unique symbol identifiers, vector timestamps and/or tombstones.
While the classic OT approach of defining operations through their offsets in the text seems to be simple and natural, real-world distributed systems raise serious issues. Namely, that operations propagate with finite speed, states of participants are often different, thus the resulting combinations of states and operations are extremely hard to foresee and understand. As Li and Li put it, "Due to the need to consider complicated case coverage, formal proofs are very complicated and error-prone, even for OT algorithms that only treat two characterwise primitives (insert and delete)". [39]
Similarly, Joseph Gentle who is a former Google Wave engineer and an author of the Share.JS library wrote, "Unfortunately, implementing OT sucks. There's a million algorithms with different tradeoffs, mostly trapped in academic papers. […] Wave took 2 years to write and if we rewrote it today, it would take almost as long to write a second time." [40] But later he amends his comment with "I no longer believe that wave would take 2 years to implement now - mostly because of advances in web frameworks and web browsers." [41]
For OT to work, every single change to the data needs to be captured: "Obtaining a snapshot of the state is usually trivial, but capturing edits is a different matter altogether. […] The richness of modern user interfaces can make this problematic, especially within a browser-based environment." An alternative to OT is differential synchronization. [42]
Another alternative to OT is using sequence types of conflict-free replicated data type.
Collaborative software or groupware is application software designed to help people working on a common task to attain their goals. One of the earliest definitions of groupware is "intentional group processes plus software to support them."
Exclusive or, exclusive disjunction, exclusive alternation, logical non-equivalence, or logical inequality is a logical operator whose negation is the logical biconditional. With two inputs, XOR is true if and only if the inputs differ. With multiple inputs, XOR is true if and only if the number of true inputs is odd.
Constraint programming (CP) is a paradigm for solving combinatorial problems that draws on a wide range of techniques from artificial intelligence, computer science, and operations research. In constraint programming, users declaratively state the constraints on the feasible solutions for a set of decision variables. Constraints differ from the common primitives of imperative programming languages in that they do not specify a step or sequence of steps to execute, but rather the properties of a solution to be found. In addition to constraints, users also need to specify a method to solve these constraints. This typically draws upon standard methods like chronological backtracking and constraint propagation, but may use customized code like a problem-specific branching heuristic.
Constraint satisfaction problems (CSPs) are mathematical questions defined as a set of objects whose state must satisfy a number of constraints or limitations. CSPs represent the entities in a problem as a homogeneous collection of finite constraints over variables, which is solved by constraint satisfaction methods. CSPs are the subject of research in both artificial intelligence and operations research, since the regularity in their formulation provides a common basis to analyze and solve problems of many seemingly unrelated families. CSPs often exhibit high complexity, requiring a combination of heuristics and combinatorial search methods to be solved in a reasonable time. Constraint programming (CP) is the field of research that specifically focuses on tackling these kinds of problems. Additionally, the Boolean satisfiability problem (SAT), satisfiability modulo theories (SMT), mixed integer programming (MIP) and answer set programming (ASP) are all fields of research focusing on the resolution of particular forms of the constraint satisfaction problem.
Computer-supported cooperative work (CSCW) is the study of how people utilize technology collaboratively, often towards a shared goal. CSCW addresses how computer systems can support collaborative activity and coordination. More specifically, the field of CSCW seeks to analyze and draw connections between currently understood human psychological and social behaviors and available collaborative tools, or groupware. Often the goal of CSCW is to help promote and utilize technology in a collaborative way, and help create new tools to succeed in that goal. These parallels allow CSCW research to inform future design patterns or assist in the development of entirely new tools.
A race condition or race hazard is the condition of an electronics, software, or other system where the system's substantive behavior is dependent on the sequence or timing of other uncontrollable events, leading to unexpected or inconsistent results. It becomes a bug when one or more of the possible behaviors is undesirable.
In mathematics, iterated function systems (IFSs) are a method of constructing fractals; the resulting fractals are often self-similar. IFS fractals are more related to set theory than fractal geometry. They were introduced in 1981.
In concurrent programming, an operation is linearizable if it consists of an ordered list of invocation and response events, that may be extended by adding response events such that:
In machine learning, backpropagation is a gradient estimation method used to train neural network models. The gradient estimate is used by the optimization algorithm to compute the network parameter updates.
Eventual consistency is a consistency model used in distributed computing to achieve high availability that informally guarantees that, if no new updates are made to a given data item, eventually all accesses to that item will return the last updated value. Eventual consistency, also called optimistic replication, is widely deployed in distributed systems and has origins in early mobile computing projects. A system that has achieved eventual consistency is often said to have converged, or achieved replica convergence. Eventual consistency is a weak guarantee – most stronger models, like linearizability, are trivially eventually consistent.
A collaborative real-time editor is a type of collaborative software or web application which enables real-time collaborative editing, simultaneous editing, or live editing of the same digital document, computer file or cloud-stored data – such as an online spreadsheet, word processing document, database or presentation – at the same time by different users on different computers or mobile devices, with automatic and nearly instantaneous merging of their edits.
In mathematics, the Hjelmslev transformation is an effective method for mapping an entire hyperbolic plane into a circle with a finite radius. The transformation was invented by Danish mathematician Johannes Hjelmslev. It utilizes Nikolai Ivanovich Lobachevsky's 23rd theorem from his work Geometrical Investigations on the Theory of Parallels.
ACE is a platform-independent, collaborative real-time editor. It is a real-time cooperative editing system that allows multiple geographically dispersed users to view and edit a shared text document at the same time.
A fundamental problem in distributed computing and multi-agent systems is to achieve overall system reliability in the presence of a number of faulty processes. This often requires coordinating processes to reach consensus, or agree on some data value that is needed during computation. Example applications of consensus include agreeing on what transactions to commit to a database in which order, state machine replication, and atomic broadcasts. Real-world applications often requiring consensus include cloud computing, clock synchronization, PageRank, opinion formation, smart power grids, state estimation, control of UAVs, load balancing, blockchain, and others.
In computer science, coinduction is a technique for defining and proving properties of systems of concurrent interacting objects.
Richard Jay Lipton is an American computer scientist who is Associate Dean of Research, Professor, and the Frederick G. Storey Chair in Computing in the College of Computing at the Georgia Institute of Technology. He has worked in computer science theory, cryptography, and DNA computing.
CoWord is a software add-on to Microsoft Word to enable multiple users to edit the same document over the Internet with MS Word. It is a part of the CoOffice suite of collaboration tools for Microsoft Office.
In mathematics, the Johnson–Lindenstrauss lemma is a result named after William B. Johnson and Joram Lindenstrauss concerning low-distortion embeddings of points from high-dimensional into low-dimensional Euclidean space. The lemma states that a set of points in a high-dimensional space can be embedded into a space of much lower dimension in such a way that distances between the points are nearly preserved. In the classical proof of the lemma, the embedding is a random orthogonal projection.
Learning to rank or machine-learned ranking (MLR) is the application of machine learning, typically supervised, semi-supervised or reinforcement learning, in the construction of ranking models for information retrieval systems. Training data may, for example, consist of lists of items with some partial order specified between items in each list. This order is typically induced by giving a numerical or ordinal score or a binary judgment for each item. The goal of constructing the ranking model is to rank new, unseen lists in a similar way to rankings in the training data.
In distributed computing, a conflict-free replicated data type (CRDT) is a data structure that is replicated across multiple computers in a network, with the following features:
{{cite conference}}
: CS1 maint: multiple names: authors list (link){{cite conference}}
: CS1 maint: multiple names: authors list (link){{cite journal}}
: CS1 maint: multiple names: authors list (link){{cite journal}}
: CS1 maint: multiple names: authors list (link){{cite conference}}
: CS1 maint: multiple names: authors list (link){{cite conference}}
: CS1 maint: multiple names: authors list (link)