Safe semantics


Safe semantics is a computer hardware consistency model. It describes one type of guarantee that a data register provides when it is shared by several processors in a parallel computer or in a network of computers working together.


History

Safe semantics was first defined by Leslie Lamport in 1985. [1] It was formally defined in Lamport's "On Interprocess Communication" in 1986. [2]

Safe registers have been implemented in many distributed systems.

Description

Safe semantics are defined for a variable with a single writer but multiple readers (SWMR). A SWMR register is safe if each read operation satisfies these properties:

(Figure: safe register with no overlapping read and write)
  1. A read operation not concurrent with any write operation returns the value written by the latest write operation.
  2. A read operation that is concurrent with a write operation may return any value within the register's allowed range of values (for example, 0,1,2,...).
    (Figure: safe register with an overlapping read and write)

In particular, when a read is concurrent with a write, the read can return a value that was never written by any write; the returned value need only belong to the register's domain.

A binary safe register can be seen as modeling a flickering bit. Whatever the previous value of the register, its value may flicker while a write is in progress, so a read that overlaps a write can return either 0 or 1.
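
To make this concrete, here is a minimal Python sketch (not taken from the cited papers) of a binary safe single-writer register; the class SafeBitRegister and its begin_write/end_write/read methods are hypothetical names chosen for illustration, and an overlapping read simply returns a random value from the domain.

```python
import random

class SafeBitRegister:
    """Toy model of a binary safe single-writer register (illustration only)."""

    def __init__(self):
        self._value = 0        # value of the last completed write
        self._writing = False  # True while a write is in progress
        self._pending = 0      # value currently being written

    def begin_write(self, v):
        # The single writer starts a write; readers may now overlap it.
        self._writing = True
        self._pending = v

    def end_write(self):
        # The write completes and its value becomes the register's value.
        self._value = self._pending
        self._writing = False

    def read(self):
        if self._writing:
            # Safe semantics: a read concurrent with a write may return any
            # value in the domain -- the bit can "flicker" to 0 or 1.
            return random.choice([0, 1])
        # A read with no concurrent write returns the latest written value.
        return self._value


reg = SafeBitRegister()
reg.begin_write(1)
print(reg.read())  # overlapping read: may print 0 or 1
reg.end_write()
print(reg.read())  # non-overlapping read: prints 1
```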

Churn refers to the entry and exit of servers to/from a distributed system. Baldoni et al. show that no register can have the stronger property of regular semantics in a synchronous system under continuous churn. [3] However, a safe register can be implemented under continuous churn in a non-synchronous system. [4] Modeling and implementing this kind of storage (a safe register) under non-quiescent churn requires a system model that distinguishes clients from servers. [4] The client system contains a finite but arbitrary number of processes that read and write the register held by the server system, while the server system is responsible for serving those read and write operations correctly.

Implementation

Safe register implementation involves:

The safe register is maintained by the set of active servers.

Clients maintain no register information.

Eventually synchronous system

Quorums (sets of servers)

The size of the quorum on which read and write operations are executed is n − f − j, where n is the number of servers, j is the number of servers that may enter or exit (churn), and f is the number of Byzantine failures (see the sketch after this list).

Algorithms such as join, read, and write. [4]
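
As a small illustration of the quorum size above, the following sketch computes the reply threshold n − f − j used by the join, read, and write operations in the next sections; the function reply_threshold and its error handling are assumptions made for illustration, not part of the published protocol.

```python
def reply_threshold(n, f, j):
    """Reply threshold n - f - j used by join, read, and write.

    n: number of servers in the system
    f: maximum number of Byzantine servers
    j: maximum number of servers joining or leaving (churn)
    """
    threshold = n - f - j
    if threshold <= 0:
        raise ValueError("too much churn or too many failures for n servers")
    return threshold


print(reply_threshold(10, 1, 2))  # -> 7
```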

Join

A server si that wants to enter the server system broadcasts an inquiry message to the other servers, both to announce its entry and to request the current value of the register. When a server receives this inquiry, it sends a reply message back to si. si collects these replies into a reply set, waits until it has received enough of them (n − f − j), and then adopts the most frequently received value.
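
A rough Python sketch of this join step follows; the broadcast and receive_reply primitives, the message format, and the function name are assumptions made for illustration, not the protocol's actual pseudocode.

```python
from collections import Counter

def join(si_id, broadcast, receive_reply, n, f, j):
    """Sketch of a server si joining: request the register value, wait for
    n - f - j replies, and adopt the most frequently reported value.

    broadcast and receive_reply are assumed communication primitives."""
    broadcast({"type": "inquiry", "sender": si_id})
    replies = []                          # the reply set
    while len(replies) < n - f - j:
        replies.append(receive_reply())   # blocks until one reply arrives
    # With at most f Byzantine replies, the most frequent value is taken
    # as the current value of the register.
    counts = Counter(r["value"] for r in replies)
    return counts.most_common(1)[0][0]
```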

Read

The read algorithm is essentially a basic version of join; the difference lies in the broadcast mechanism used by the read operation. A client broadcasts an inquiry message to the system, and each server that receives the inquiry sends a reply message back to the client. Once the client has received enough replies (n − f − j), it stops sending the inquiry.
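
Under the same assumed primitives as the join sketch, a read could be drafted as below; the re-broadcasting loop, the try_receive_replies helper, and the message fields are illustrative guesses rather than the published algorithm.

```python
from collections import Counter

def read(client_id, broadcast, try_receive_replies, n, f, j):
    """Sketch of a client read under churn: re-broadcast the inquiry each
    round until n - f - j distinct servers have replied.

    broadcast and try_receive_replies are assumed communication primitives."""
    replies = {}                                 # server id -> reported value
    while len(replies) < n - f - j:
        broadcast({"type": "read_inquiry", "sender": client_id})
        for r in try_receive_replies():          # replies received this round
            replies[r["sender"]] = r["value"]
    # As in join, return the most frequently reported value.
    return Counter(replies.values()).most_common(1)[0][0]
```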


Write

The writing client (cw) sends its write request into the system in successive rounds and waits until it receives two acknowledgments (sn = sequence number).


The reason for requiring two acknowledgments is to tolerate failures: a server may crash immediately after sending an acknowledgment (ack), so a single acknowledgment does not give the client a dependable confirmation that the write has taken effect.
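
A hedged sketch of this write loop follows, again with assumed broadcast/try_receive_acks primitives and message fields; it waits for acknowledgments from two distinct servers, as described above.

```python
def write(cw_id, value, sn, broadcast, try_receive_acks):
    """Sketch of the write loop: broadcast the value tagged with sequence
    number sn each round until two distinct servers have acknowledged it.

    broadcast and try_receive_acks are assumed communication primitives."""
    acks = set()                                 # ids of acknowledging servers
    while len(acks) < 2:
        broadcast({"type": "write", "sender": cw_id, "value": value, "sn": sn})
        for ack in try_receive_acks():
            acks.add(ack["sender"])
    # Waiting for a second ack guards against a server that crashes right
    # after acknowledging, which would leave the client with no confirmation.
    return sn + 1                                # next sequence number
```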

The validity of the safe register (if a read is not concurrent with any write, it returns the last value written) was proved using the quorum system. [4] Given two quorums Qw and Qr, Qw is the set of servers that know about the latest written value, and Qr is the set of servers whose replies are used by the read. The size of each quorum is n − f − j. [4] Proving the safe register's validity requires proving

|(Qw ∩ Qr) \ B| > |Qr ∩ B|,

where B is the set of at most f Byzantine servers.

Proof: consider the two sets (Qw ∩ Qr) \ B (the correct servers lying in both quorums) and Qr ∩ B (the Byzantine servers in the read quorum). Since each quorum has n − f − j servers, the intersection Qw ∩ Qr contains at least 2(n − f − j) − n = n − 2f − 2j servers, and removing the at most f Byzantine servers leaves at least n − 3f − 2j correct servers. Therefore

|(Qw ∩ Qr) \ B| ≥ n − 3f − 2j

is strictly greater than f, and hence greater than |Qr ∩ B|.
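
The quorum arithmetic in this argument can be checked directly; the parameter values below (n = 10, f = 1, j = 2) are chosen only for illustration and are not taken from the paper.

```python
n, f, j = 10, 1, 2                     # illustrative parameters only

quorum = n - f - j                     # size of Qw and of Qr
intersection = 2 * quorum - n          # lower bound on |Qw ∩ Qr|
correct_in_both = intersection - f     # lower bound on |(Qw ∩ Qr) \ B|

print(quorum, intersection, correct_in_both)   # 7 4 3
assert correct_in_both == n - 3 * f - 2 * j
assert correct_in_both > f             # more correct witnesses than Byzantine replies
```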


Notes

  1. Lamport, Leslie (June 1986). "On interprocess communication: Part I: Basic formalism" (PDF). Distributed Computing. 1 (2): 77–85. doi:10.1007/BF01786227. ISSN 0178-2770. S2CID 22834932.
  2. Lamport, Leslie (June 1986). "On interprocess communication: Part I: Basic formalism". Distributed Computing. 1 (2): 77–85. doi:10.1007/BF01786227. ISSN 0178-2770. S2CID 22834932.
  3. Baldoni, Roberto; Bonomi, Silvia; Raynal, Michel (January 2012). "Implementing a Regular Register in an Eventually Synchronous Distributed System Prone to Continuous Churn". IEEE Transactions on Parallel and Distributed Systems. 23 (1): 102–109. doi:10.1109/TPDS.2011.97. ISSN 1045-9219. S2CID 12717004.
  4. Baldoni, Roberto; Bonomi, Silvia; Nezhad, Amir Soltani (November 2013). "A protocol for implementing byzantine storage in churn-prone distributed systems". Theoretical Computer Science. 512: 28–40. doi:10.1016/j.tcs.2013.04.005.

See also

Atomic semantics
Regular semantics