Berkeley algorithm

The Berkeley algorithm is a method of clock synchronisation in distributed computing which assumes no machine has an accurate time source. It was developed by Gusella and Zatti at the University of California, Berkeley, in 1989.[1] Like Cristian's algorithm, it is intended for use within intranets.

The algorithm

Unlike Cristian's algorithm, in which a client requests the time from a single server, the server process in the Berkeley algorithm, called the leader, periodically polls the follower processes. Generally speaking, the algorithm is:

  1. A leader is chosen via an election process such as the Chang and Roberts algorithm.
  2. The leader polls the followers, who reply with their clock time in a similar way to Cristian's algorithm.
  3. The leader observes the round-trip time (RTT) of the messages and uses it to estimate each follower's current time, along with its own.
  4. The leader then averages the clock times, ignoring any values that lie far outside the others.
  5. Instead of sending the updated current time back to the other processes, the leader sends out the amount (positive or negative) by which each follower must adjust its clock (see the sketch below). This avoids further uncertainty due to RTT at the follower processes.
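
The polling, averaging, and offset steps can be sketched in a few lines of Python. This is a minimal illustration rather than the TEMPO implementation: poll_follower stands for a hypothetical transport call that returns a follower's reported clock time, and the RTT compensation assumes network delay is symmetric.

    import time

    def estimate_follower_time(poll_follower):
        # Record the round trip and assume the reply spent about half
        # of it in flight, as in Cristian's algorithm.
        t_send = time.monotonic()
        follower_time = poll_follower()  # hypothetical transport call
        rtt = time.monotonic() - t_send
        return follower_time + rtt / 2

    def berkeley_round(leader_time, follower_estimates):
        # Average the leader's clock with the follower estimates, then
        # return the signed adjustment for each clock (leader first).
        # Sending offsets rather than absolute times means a second
        # network delay cannot corrupt the correction.
        clocks = [leader_time] + follower_estimates
        average = sum(clocks) / len(clocks)
        return [average - c for c in clocks]

For example, if the leader reads 10:00:00 and estimates two followers at 10:00:10 and 09:59:50, the average is 10:00:00 and the adjustments sent are 0, -10 s, and +10 s respectively.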

With this method the average cancels out the individual clocks' tendencies to drift. Gusella and Zatti reported results for 15 computers whose clocks were synchronised to within about 20-25 milliseconds using their protocol.

Computer systems normally avoid rewinding their clock when they receive a negative clock alteration from the leader. Doing so would break the property of monotonic time, which is a fundamental assumption in certain algorithms in the system itself or in programs such as make. A simple solution is to halt the clock for the duration specified by the leader, but this can itself cause problems, although less severe ones. For minor corrections, most systems instead slow the clock, applying the correction over a longer period of time (see the sketch below).
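
A minimal sketch of such slewing, assuming an illustrative maximum rate error of 0.05% (real systems, e.g. the Unix adjtime mechanism, impose comparable limits):

    def plan_slew(offset_seconds, max_rate_error=0.0005):
        # Return (clock_rate, duration_seconds) that absorbs the
        # correction gradually. A negative offset is applied by running
        # the clock slightly slow, never by rewinding it, so monotonic
        # time is preserved.
        if offset_seconds == 0:
            return 1.0, 0.0
        direction = 1.0 if offset_seconds > 0 else -1.0
        duration = abs(offset_seconds) / max_rate_error
        return 1.0 + direction * max_rate_error, duration

    # Example: absorbing a -0.1 s correction yields a rate of about
    # 0.9995 applied for about 200 seconds.
    rate, duration = plan_slew(-0.1)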

Often, any client whose clock value differs from the others by more than a given tolerance is disregarded when averaging the results. This prevents the overall system time from being drastically skewed by one erroneous clock.
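
The tolerance check might look like the following sketch, where the cutoff is taken relative to the median reading; the median-based rule is an illustrative choice, not necessarily the exact fault-tolerant average used by TEMPO.

    def fault_tolerant_average(clock_times, tolerance):
        # Keep only readings within `tolerance` seconds of the median,
        # so one wildly wrong clock cannot skew the average.
        ordered = sorted(clock_times)
        median = ordered[len(ordered) // 2]
        kept = [t for t in clock_times if abs(t - median) <= tolerance]
        return sum(kept) / len(kept)

    # A clock that is hours off is ignored; the rest average normally.
    times = [100.0, 101.0, 99.0, 7200.0]
    assert fault_tolerant_average(times, tolerance=10.0) == 100.0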

Related Research Articles

Distributed computing is a field of computer science that studies distributed systems. A distributed system is a system whose components are located on different networked computers, which communicate and coordinate their actions by passing messages to one another. The components interact with one another in order to achieve a common goal. Three significant characteristics of distributed systems are: concurrency of components, lack of a global clock, and independent failure of components. Examples of distributed systems vary from SOA-based systems to massively multiplayer online games to peer-to-peer applications.

The Transmission Control Protocol (TCP) is one of the main protocols of the Internet protocol suite. It originated in the initial network implementation in which it complemented the Internet Protocol (IP). Therefore, the entire suite is commonly referred to as TCP/IP. TCP provides reliable, ordered, and error-checked delivery of a stream of octets (bytes) between applications running on hosts communicating via an IP network. Major internet applications such as the World Wide Web, email, remote administration, and file transfer rely on TCP, which is part of the Transport Layer of the TCP/IP suite. SSL/TLS often runs on top of TCP.

Genetic algorithm: Competitive algorithm for searching a problem space

In computer science and operations research, a genetic algorithm (GA) is a metaheuristic inspired by the process of natural selection that belongs to the larger class of evolutionary algorithms (EA). Genetic algorithms are commonly used to generate high-quality solutions to optimization and search problems by relying on biologically inspired operators such as mutation, crossover and selection.

Load balancing (computing): Set of techniques to improve the distribution of workloads across multiple computing resources

In computing, load balancing refers to the process of distributing a set of tasks over a set of resources, with the aim of making their overall processing more efficient. Load balancing can optimize the response time and avoid unevenly overloading some compute nodes while other compute nodes are left idle.

Theoretical computer science: Subfield of computer science and mathematics

Theoretical computer science (TCS) is a subset of general computer science and mathematics that focuses on mathematical aspects of computer science such as the theory of computation, lambda calculus, and type theory.

FAST TCP is a TCP congestion avoidance algorithm especially targeted at long-distance, high-latency links, developed at the Netlab, California Institute of Technology, and commercialized by FastSoft, which was acquired by Akamai Technologies in 2012.

End-to-end delay or one-way delay (OWD) refers to the time taken for a packet to be transmitted across a network from source to destination. It is a common term in IP network monitoring, and differs from round-trip time (RTT) in that only the path in one direction, from source to destination, is measured.

Clock synchronization is a topic in computer science and engineering that aims to coordinate otherwise independent clocks. Even when initially set accurately, real clocks will differ after some amount of time due to clock drift, caused by clocks counting time at slightly different rates. There are several problems that occur as a result of clock rate differences and several solutions, some being more appropriate than others in certain contexts.

In data communications, flow control is the process of managing the rate of data transmission between two nodes to prevent a fast sender from overwhelming a slow receiver. It provides a mechanism for the receiver to control the transmission speed, so that the receiving node is not overwhelmed with data from the transmitting node. Flow control should be distinguished from congestion control, which is used for controlling the flow of data when congestion has actually occurred. Flow control mechanisms can be classified by whether or not the receiving node sends feedback to the sending node.

Clock skew is a phenomenon in synchronous digital circuit systems in which the same sourced clock signal arrives at different components at different times. The instantaneous difference between the readings of any two clocks is called their skew.

TCP Vegas is a TCP congestion avoidance algorithm that emphasizes packet delay, rather than packet loss, as a signal to help determine the rate at which to send packets. It was developed at the University of Arizona by Lawrence Brakmo and Larry L. Peterson and introduced in 1994.

The Precision Time Protocol (PTP) is a protocol used to synchronize clocks throughout a computer network. On a local area network, it achieves clock accuracy in the sub-microsecond range, making it suitable for measurement and control systems. PTP is currently employed to synchronize financial transactions, mobile phone tower transmissions, sub-sea acoustic arrays, and networks that require precise timing but lack access to satellite navigation signals.

Multilateration is a technique for determining a vehicle's position based on measurement of the times of arrival (TOAs) of energy waves having a known waveform and speed when propagating either from (navigation) or to (surveillance) multiple system stations. These stations are at known locations and have synchronized clocks. Prior to computing a solution, the time of transmission (TOT) of the waves is unknown to the receiver on the vehicle (navigation) or the receivers at the stations (surveillance); consequently, the wave time of flight (TOF) is also unknown.

A fundamental problem in distributed computing and multi-agent systems is to achieve overall system reliability in the presence of a number of faulty processes. This often requires coordinating processes to reach consensus, or agree on some data value that is needed during computation. Example applications of consensus include agreeing on what transactions to commit to a database in which order, state machine replication, and atomic broadcasts. Real-world applications often requiring consensus include cloud computing, clock synchronization, PageRank, opinion formation, smart power grids, state estimation, control of UAVs, load balancing, blockchain, and others.

The random neural network (RNN) is a mathematical representation of an interconnected network of neurons or cells which exchange spiking signals. It was invented by Erol Gelenbe and is linked to the G-network model of queueing networks as well as to Gene Regulatory Network models. Each cell state is represented by an integer whose value rises when the cell receives an excitatory spike and drops when it receives an inhibitory spike. The spikes can originate outside the network itself, or they can come from other cells in the network. Cells whose internal excitatory state has a positive value are allowed to send out spikes of either kind to other cells in the network according to specific cell-dependent spiking rates. The model has a mathematical solution in steady-state which provides the joint probability distribution of the network in terms of the individual probabilities that each cell is excited and able to send out spikes. Computing this solution is based on solving a set of non-linear algebraic equations whose parameters are related to the spiking rates of individual cells and their connectivity to other cells, as well as the arrival rates of spikes from outside the network. The RNN is a recurrent model, i.e. a neural network that is allowed to have complex feedback loops.

In distributed computing, leader election is the process of designating a single process as the organizer of some task distributed among several computers (nodes). Before the task has begun, all network nodes are either unaware which node will serve as the "leader" of the task, or unable to communicate with the current coordinator. After a leader election algorithm has been run, however, each node throughout the network recognizes a particular, unique node as the task leader.
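
A minimal Python simulation of the Chang and Roberts ring election mentioned in step 1 above, assuming every process initiates the election and messages travel one way around the ring:

    def chang_roberts(ring_ids):
        # ring_ids lists each process's unique ID in ring order. Each
        # process sends its own ID clockwise; a process forwards IDs
        # larger than its own, swallows smaller ones, and is elected
        # when its own ID completes a full lap.
        n = len(ring_ids)
        messages = [((i + 1) % n, uid) for i, uid in enumerate(ring_ids)]
        while messages:
            pos, uid = messages.pop(0)
            own = ring_ids[pos]
            if uid == own:
                return uid  # own ID survived a full lap: elected leader
            if uid > own:
                messages.append(((pos + 1) % n, uid))  # forward it
            # uid < own: discard; this process's own (larger) candidacy
            # is already circulating.
        return None

    # The process with the highest ID wins the election.
    assert chang_roberts([3, 7, 2, 5]) == 7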

Cristian's algorithm is a method for clock synchronization which can be used in many fields of distributed computing but is primarily used in low-latency intranets. Cristian observed that this simple algorithm is probabilistic, in that it only achieves synchronization if the round-trip time (RTT) of the request is short compared with the required accuracy. It also suffers in implementations that use a single server, making it unsuitable for many distributed applications where redundancy may be crucial.
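
This probabilistic behaviour can be made explicit by accepting a reading only when the measured round trip is short enough for the accuracy needed. A sketch, assuming a hypothetical request_server_time transport call:

    import time

    def cristian_time(request_server_time, required_accuracy, attempts=5):
        # Retry until a round trip is fast enough; the error bound on
        # the returned estimate is +/- rtt / 2.
        for _ in range(attempts):
            t0 = time.monotonic()
            server_time = request_server_time()
            rtt = time.monotonic() - t0
            if rtt / 2 <= required_accuracy:
                return server_time + rtt / 2
        raise TimeoutError("no round trip fast enough for the "
                           "required accuracy")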

Stochastic computing is a collection of techniques that represent continuous values by streams of random bits. Complex computations can then be computed by simple bit-wise operations on the streams. Stochastic computing is distinct from the study of randomized algorithms.

Ghosting is a visual artifact that occurs in magnetic resonance imaging (MRI) scans. These artifacts are a consequence of environmental factors or of the human body. Ghosting is a multidimensional artifact that occurs in MRI in the phase-encoded direction after applying the Fourier transform.

timed is an operating system program that maintains the system time in synchronization with time servers using the Time Synchronization Protocol (TSP) developed by Riccardo Gusella and Stefano Zatti. Gusella and Zatti had done earlier related work on their TEMPO algorithm. The Time Synchronization Protocol specification refers to an election algorithm and a synchronization mechanism specified in other technical reports listed as "to appear".

References

  1. Gusella, R.; Zatti, S. (1989), "The accuracy of the clock synchronization achieved by TEMPO in Berkeley UNIX 4.3BSD", IEEE Transactions on Software Engineering, IEEE, 15 (7): 847–853, doi:10.1109/32.29484