Population protocol

Last updated October 09, 2022

A population protocol is a distributed computing model formed by resource-limited mobile agents that meet at random according to an interaction graph. Functions are computed by updating the states of agents whenever they meet, based on their previous states, and the result of the computation can be read from the states of the agents once the computation has converged.

Model

There is a set $N=\{1,2,\ldots ,n\}$ of nodes. Each node is a finite automaton with $s$ states. An important class of population protocols is that of majority algorithms, whose goal is to compute the majority bit: each node starts with a belief bit in $\{0,1\}$, and the protocol must ensure that, once it has converged, the belief bit of every node equals the initial majority bit.

The discrete-time version of the model is as follows: at each point $t=1,2,\ldots$ in time, some node $i$ is selected uniformly at random. Then the node is matched with another node $j$, which is chosen uniformly at random from the set of neighbors of node $i$. Afterwards, nodes $i$ and $j$ exchange memory contents and update their states. Alternatively, one can consider a continuous-time model where each node $i$ has a Poisson clock that rings at unit rate. When the clock of a node rings, that node communicates with a random neighbor.
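The scheduling rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the adjacency-list dictionary `neighbors` and the helper names `pick_interaction`, `complete_graph` and `continuous_time_schedule` are hypothetical and do not come from the cited papers. The continuous-time variant relies on the standard fact that the superposition of $n$ independent unit-rate Poisson clocks is a Poisson process of rate $n$ whose ringing node is uniformly distributed.

```python
import random


def pick_interaction(neighbors, rng):
    """Pick one (initiator, responder) pair as in the discrete-time model.

    `neighbors` maps each node to the list of its neighbors in the
    interaction graph (an assumed adjacency-list representation).
    Every node is assumed to have at least one neighbor.
    """
    i = rng.choice(list(neighbors))  # node i chosen uniformly at random
    j = rng.choice(neighbors[i])     # j chosen uniformly among i's neighbors
    return i, j


def complete_graph(n):
    """Adjacency lists of the complete graph on nodes 1..n."""
    nodes = range(1, n + 1)
    return {i: [j for j in nodes if j != i] for i in nodes}


def continuous_time_schedule(neighbors, rng, horizon):
    """Yield (time, i, j) events up to `horizon` for unit-rate Poisson clocks.

    With n independent unit-rate clocks, the gaps between consecutive
    rings are exponential with rate n, and the ringing node is uniform.
    """
    n = len(neighbors)
    t = rng.expovariate(n)
    while t <= horizon:
        i, j = pick_interaction(neighbors, rng)
        yield t, i, j
        t += rng.expovariate(n)
```

For example, `pick_interaction(complete_graph(5), random.Random(0))` returns one ordered pair of distinct nodes. Under this sketch, one unit of continuous time corresponds to about $n$ interactions on average.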

Protocols are often designed to minimize the convergence time or the amount of memory required per node or both.^[1]

Three State Protocol

For the problem of computing the majority (consensus), there is a well-known protocol that requires only three memory states per node and has been analyzed for complete interaction graphs.^[2]^[3] This protocol works as follows. Let each node $i$ initialize its memory state to its initial belief bit $b_{i}\in \{0,1\}$. At each point in time, when two nodes communicate, they update their states according to the following table. The row labels give the initiator's state and the column labels the responder's state.

Interaction rules of 3-state protocol
	0	?	1
0	(0,0)	(0,0)	(0,?)
?	(?,0)	(?,?)	(?,1)
1	(1,?)	(1,1)	(1,1)

In words, if a node with belief $0$ gets matched with a node with belief $0$, then both nodes keep their belief; the same holds if both beliefs are $1$ or both are $?$. However, if the initiator's belief is $0$ and the responder's belief is $?$, then the responder updates its belief to $0$. If, on the other hand, the initiator has belief $0$ and the responder has belief $1$, then the responder changes its belief to $?$. Note that this protocol is one-way: every interaction changes at most the responder's state, so it can be implemented with one-way communication.
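The interaction rules can also be written down as a lookup table in code. The following Python sketch encodes the table entry by entry and, separately, the equivalent one-way rule that only computes the responder's new state. The names `RULES`, `interact` and `responder_update`, and the choice of the strings `"0"`, `"1"` and `"?"` as state labels, are illustrative assumptions.

```python
# Transition table of the three-state protocol, keyed by
# (initiator state, responder state) -> (new initiator, new responder).
# States are the strings "0", "1" and "?" (undecided).
RULES = {
    ("0", "0"): ("0", "0"), ("0", "?"): ("0", "0"), ("0", "1"): ("0", "?"),
    ("?", "0"): ("?", "0"), ("?", "?"): ("?", "?"), ("?", "1"): ("?", "1"),
    ("1", "0"): ("1", "?"), ("1", "?"): ("1", "1"), ("1", "1"): ("1", "1"),
}


def interact(initiator, responder):
    """Return the pair of states after one interaction."""
    return RULES[(initiator, responder)]


def responder_update(initiator, responder):
    """One-way form of the same rules: only the responder may change."""
    if initiator == "?" or initiator == responder:
        return responder   # an undecided or agreeing initiator changes nothing
    if responder == "?":
        return initiator   # an undecided responder adopts the initiator's belief
    return "?"             # opposite beliefs: the responder becomes undecided
```

Writing the rule in one-way form makes the property stated above explicit: the initiator's state never appears on the left-hand side of an assignment.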

Angluin, Aspnes, and Eisenstat^[2] showed that, from any initial configuration that does not consist of all "?"s, the three-state approximate majority protocol converges to either all nodes having belief $0$ or all nodes having belief $1$ within $O(n\cdot \log n)$ interactions with high probability. Additionally, the value chosen will be the majority non-"?" initial value, provided it exceeds the minority by a sufficient margin.

In an example run of the three-state protocol on a set of $n=500$ nodes, where one third of the nodes have initial belief bit $0$ and the remaining two thirds have initial belief bit $1$, the fraction of "?" nodes starts at zero, increases for a while, and then returns to zero.
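A run of this kind can be reproduced with a short simulation. The sketch below assumes a complete interaction graph and the discrete-time model, and it records the fraction of "?" nodes at regular intervals. The function name `simulate`, its parameters, and the sampling interval are hypothetical choices made for illustration. The number of interactions needed depends on the random seed.

```python
import random


def responder_update(initiator, responder):
    """One-way rule of the three-state protocol (states "0", "1", "?")."""
    if initiator == "?" or initiator == responder:
        return responder
    return initiator if responder == "?" else "?"


def simulate(n=500, zeros=None, seed=0, max_interactions=1_000_000,
             sample_every=500):
    """Run the protocol on a complete graph until all nodes agree.

    Returns the final state counts, the number of interactions used,
    and the sampled fractions of undecided ("?") nodes.
    """
    rng = random.Random(seed)
    zeros = n // 3 if zeros is None else zeros  # default: one third start at 0
    states = ["0"] * zeros + ["1"] * (n - zeros)
    counts = {"0": zeros, "1": n - zeros, "?": 0}
    undecided_fraction = [0.0]                  # no "?" nodes initially
    steps = 0
    while counts["0"] < n and counts["1"] < n and steps < max_interactions:
        # Ordered pair of distinct nodes: i is the initiator, j the responder.
        i, j = rng.sample(range(n), 2)
        new_state = responder_update(states[i], states[j])
        counts[states[j]] -= 1
        counts[new_state] += 1
        states[j] = new_state
        steps += 1
        if steps % sample_every == 0:
            undecided_fraction.append(counts["?"] / n)
    return counts, steps, undecided_fraction
```

Calling `counts, steps, fractions = simulate()` and plotting `fractions` gives a curve of the undecided fraction over time. The cap `max_interactions` is only a safeguard so that the loop always ends.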

History

Population protocols were introduced by Dana Angluin et al.^[4] as one of the first models of computation to be fully decentralized and to involve agents with highly limited resources, such as those found in sensor networks. Since then, this abstract computation model has found applications in robotics^[5] and chemistry.^[6]

References

[1] Alistarh, Dan; Aspnes, James; Eisenstat, David; Gelashvili, Rati; Rivest, Ronald L. (2017-01-16). "Time-space trade-offs in population protocols". SODA '17. Society for Industrial and Applied Mathematics: 2560–2579. arXiv:1602.08032. Bibcode:2016arXiv160208032A.
[2] Angluin, Dana; Aspnes, James; Eisenstat, David (2007). "A Simple Population Protocol for Fast Robust Approximate Majority". Distributed Computing. Lecture Notes in Computer Science, vol. 4731. Springer Berlin Heidelberg: 20–32. doi:10.1007/978-3-540-75142-7_5. ISBN 9783540751410.
[3] Perron, E.; Vasudevan, D.; Vojnovic, M. (April 2009). "Using Three States for Binary Consensus on Complete Graphs". IEEE INFOCOM 2009 - the 28th Conference on Computer Communications. IEEE: 2527–2535. doi:10.1109/infcom.2009.5062181. ISBN 9781424435128. S2CID 12683772.
[4] Angluin, Dana; Aspnes, James; Diamadi, Zoë; Fischer, Michael J.; Peralta, René (2006). "Computation in networks of passively mobile finite-state sensors". Distributed Computing.
[5] Dudek, Gregory; Jenkin, Michael. Computational Principles of Mobile Robotics, Chapter 10.
[6] Chen, Ho-Lin; Doty, David; Soloveichik, David (2014). "Deterministic function computation with chemical reaction networks". Natural Computing.

