Gossip protocol

Last updated November 26, 2024

A gossip protocol or epidemic protocol is a procedure or process of computer peer-to-peer communication that is based on the way epidemics spread.^[1] Some distributed systems use peer-to-peer gossip to ensure that data is disseminated to all members of a group. Some ad-hoc networks have no central registry and the only way to spread common data is to rely on each member to pass it along to their neighbors.

Communication

The concept of gossip communication can be illustrated by the analogy of office workers spreading rumors. Let's say each hour the office workers congregate around the water cooler. Each employee pairs off with another, chosen at random, and shares the latest gossip. At the start of the day, Dave starts a new rumor: he comments to Bob that he believes that Charlie dyes his mustache. At the next meeting, Bob tells Alice, while Dave repeats the idea to Eve. After each water cooler rendezvous, the number of individuals who have heard the rumor roughly doubles (though this doesn't account for gossiping twice to the same person; perhaps Dave tries to tell the story to Frank, only to find that Frank already heard it from Alice). Computer systems typically implement this type of protocol with a form of random "peer selection": with a given frequency, each machine picks another machine at random and shares any rumors.

Variants and styles

There are probably hundreds of variants of specific gossip-like protocols because each use-scenario is likely to be customized to the organization's specific needs.

For example, a gossip protocol might employ some of these ideas:

The core of the protocol involves periodic, pairwise, inter-process interactions.
The information exchanged during these interactions is of bounded size.
When agents interact, the state of at least one agent changes to reflect the state of the other.
Reliable communication is not assumed.
The frequency of the interactions is low compared to typical message latencies so that the protocol costs are negligible.
There is some form of randomness in the peer selection. Peers might be selected from the full set of nodes or from a smaller set of neighbors.
Due to the replication there is an implicit redundancy of the delivered information.

Protocol types

It is useful to distinguish two prevailing styles of gossip protocol:^[2]

Dissemination protocols (or rumor-mongering protocols). These use gossip to spread information; they basically work by flooding agents in the network, but in a manner that produces bounded worst-case loads:
1. Event dissemination protocols use gossip to carry out multicasts. They report events, but the gossip occurs periodically and events don't actually trigger the gossip. One concern here is the potentially high latency from when the event occurs until it is delivered.
2. Background data dissemination protocols continuously gossip about information associated with the participating nodes. Typically, propagation latency isn't a concern, perhaps because the information in question changes slowly or there is no significant penalty for acting upon slightly stale data.
Protocols that compute aggregates. These compute a network-wide aggregate by sampling information at the nodes in the network and combining the values to arrive at a system-wide value – the largest value for some measurement nodes are making, smallest, etc. The key requirement is that the aggregate must be computable by fixed-size pairwise information exchanges; these typically terminate after a number of rounds of information exchange logarithmic in the system size, by which time an all-to-all information flow pattern will have been established. As a side effect of aggregation, it is possible to solve other kinds of problems using gossip; for example, there are gossip protocols that can arrange the nodes in a gossip overlay into a list sorted by node-id (or some other attribute) in logarithmic time using aggregation-style exchanges of information. Similarly, there are gossip algorithms that arrange nodes into a tree and compute aggregates such as "sum" or "count" by gossiping in a pattern biased to match the tree structure.

Many protocols that predate the earliest use of the term "gossip" fall within this rather inclusive definition. For example, Internet routing protocols often use gossip-like information exchanges. A gossip substrate can be used to implement a standard routed network: nodes "gossip" about traditional point-to-point messages, effectively pushing traffic through the gossip layer. Bandwidth permitting, this implies that a gossip system can potentially support any classic protocol or implement any classical distributed service. However, such a broadly inclusive interpretation is rarely intended. More typically gossip protocols are those that specifically run in a regular, periodic, relatively lazy, symmetric and decentralized manner; the high degree of symmetry among nodes is particularly characteristic. Thus, while one could run a 2-phase commit protocol over a gossip substrate, doing so would be at odds with the spirit, if not the wording, of the definition.

The term convergently consistent is sometimes used to describe protocols that achieve exponentially rapid spread of information. For this purpose, a protocol must propagate any new information to all nodes that will be affected by the information within time logarithmic in the size of the system (the "mixing time" must be logarithmic in system size).

Examples

Suppose that we want to find the object that most closely matches some search pattern, within a network of unknown size, but where the computers are linked to one another and where each machine is running a small agent program that implements a gossip protocol.

To start the search, a user would ask the local agent to begin to gossip about the search string. (We're assuming that agents either start with a known list of peers, or retrieve this information from some kind of a shared store.)
Periodically, at some rate (let's say ten times per second, for simplicity), each agent picks some other agent at random, and gossips with it. Search strings known to A will now also be known to B, and vice versa. In the next "round" of gossip A and B will pick additional random peers, maybe C and D. This round-by-round doubling phenomenon makes the protocol very robust, even if some messages get lost, or some of the selected peers are the same or already know about the search string.
On receipt of a search string for the first time, each agent checks its local machine for matching documents.
Agents also gossip about the best match, to date. Thus, if A gossips with B, after the interaction, A will know of the best matches known to B, and vice versa. Best matches will "spread" through the network.

If the messages might get large (for example, if many searches are active all at the same time), a size limit should be introduced. Also, searches should "age out" of the network.

It follows that within logarithmic time in the size of the network (the number of agents), any new search string will have reached all agents. Within an additional delay of the same approximate length, every agent will learn where the best match can be found. In particular, the agent that started the search will have found the best match.

For example, in a network with 25,000 machines, we can find the best match after about 30 rounds of gossip: 15 to spread the search string and 15 more to discover the best match. A gossip exchange could occur as often as once every tenth of a second without imposing undue load, hence this form of network search could search a big data center in about three seconds.

In this scenario, searches might automatically age out of the network after, say, 10 seconds. By then, the initiator knows the answer and there is no point in further gossip about that search.

Gossip protocols have also been used for achieving and maintaining distributed database consistency or with other types of data in consistent states, counting the number of nodes in a network of unknown size, spreading news robustly, organizing nodes according to some structuring policy, building so-called overlay networks, computing aggregates, sorting the nodes in a network, electing leaders, etc.

Epidemic algorithms

Gossip protocols can be used to propagate information in a manner rather similar to the way that a viral infection spreads in a biological population. Indeed, the mathematics of epidemics are often used to model the mathematics of gossip communication. The term epidemic algorithm is sometimes employed when describing a software system in which this kind of gossip-based information propagation is employed.

Related Research Articles

<span class="mw-page-title-main">Client–server model</span> Distributed application structure in computing

The client–server model is a distributed application structure that partitions tasks or workloads between the providers of a resource or service, called servers, and service requesters, called clients. Often clients and servers communicate over a computer network on separate hardware, but both client and server may be on the same device. A server host runs one or more server programs, which share their resources with clients. A client usually does not share any of its resources, but it requests content or service from a server. Clients, therefore, initiate communication sessions with servers, which await incoming requests. Examples of computer applications that use the client–server model are email, network printing, and the World Wide Web.

Hyphanet is a peer-to-peer platform for censorship-resistant, anonymous communication. It uses a decentralized distributed data store to keep and deliver information, and has a suite of free software for publishing and communicating on the Web without fear of censorship. Both Freenet and some of its associated tools were originally designed by Ian Clarke, who defined Freenet's goal as providing freedom of speech on the Internet with strong anonymity protection.

Peer-to-peer (P2P) computing or networking is a distributed application architecture that partitions tasks or workloads between peers. Peers are equally privileged, equipotent participants in the network, forming a peer-to-peer network of nodes. In addition, a personal area network (PAN) is also in nature a type of decentralized peer-to-peer network typically between two devices.

Network architecture is the design of a computer network. It is a framework for the specification of a network's physical components and their functional organization and configuration, its operational principles and procedures, as well as communication protocols used.

A distributed hash table (DHT) is a distributed system that provides a lookup service similar to a hash table. Key–value pairs are stored in a DHT, and any participating node can efficiently retrieve the value associated with a given key. The main advantage of a DHT is that nodes can be added or removed with minimum work around re-distributing keys. Keys are unique identifiers which map to particular values, which in turn can be anything from addresses, to documents, to arbitrary data. Responsibility for maintaining the mapping from keys to values is distributed among the nodes, in such a way that a change in the set of participants causes a minimal amount of disruption. This allows a DHT to scale to extremely large numbers of nodes and to handle continual node arrivals, departures, and failures.

In computing, Chord is a protocol and algorithm for a peer-to-peer distributed hash table. A distributed hash table stores key-value pairs by assigning keys to different computers ; a node will store the values for all the keys for which it is responsible. Chord specifies how keys are assigned to nodes, and how a node can discover the value for a given key by first locating the node responsible for that key.

An anonymous P2P communication system is a peer-to-peer distributed application in which the nodes, which are used to share resources, or participants are anonymous or pseudonymous. Anonymity of participants is usually achieved by special routing overlay networks that hide the physical location of each node from other participants.

In distributed data storage, a P-Grid is a self-organizing structured peer-to-peer system, which can accommodate arbitrary key distributions, still providing storage load-balancing and efficient search by using randomized routing.

Delay-tolerant networking (DTN) is an approach to computer network architecture that seeks to address the technical issues in heterogeneous networks that may lack continuous network connectivity. Examples of such networks are those operating in mobile or extreme terrestrial environments, or planned networks in space.

A computer network is a set of computers sharing resources located on or provided by network nodes. Computers use common communication protocols over digital interconnections to communicate with each other. These interconnections are made up of telecommunication network technologies based on physically wired, optical, and wireless radio-frequency methods that may be arranged in a variety of network topologies.

Özalp Babaoğlu, is a Turkish computer scientist. He is currently professor of computer science at the University of Bologna, Italy. He received a Ph.D. in 1981 from the University of California at Berkeley. He is the recipient of 1982 Sakrison Memorial Award, 1989 UNIX InternationalRecognition Award and 1993 USENIX AssociationLifetime Achievement Award for his contributions to the UNIX system community and to Open Industry Standards. Before moving to Bologna in 1988, Babaoğlu was an associate professor in the Department of Computer Science at Cornell University. He has participated in several European research projects in distributed computing and complex systems. Babaoğlu is an ACM Fellow and has served as a resident fellow of the Institute of Advanced Studies at the University of Bologna and on the editorial boards for ACM Transactions on Computer Systems, ACM Transactions on Autonomous and Adaptive Systems and Springer-Verlag Distributed Computing.

A Sybil attack is a type of attack on a computer network service in which an attacker subverts the service's reputation system by creating a large number of pseudonymous identities and uses them to gain a disproportionately large influence. It is named after the subject of the book Sybil, a case study of a woman diagnosed with dissociative identity disorder. The name was suggested in or before 2002 by Brian Zill at Microsoft Research. The term pseudospoofing had previously been coined by L. Detweiler on the Cypherpunks mailing list and used in the literature on peer-to-peer systems for the same class of attacks prior to 2002, but this term did not gain as much influence as "Sybil attack".

Peer-to-peer file sharing (P2P) systems like Gnutella, KaZaA, and eDonkey/eMule, have become extremely popular in recent years, with the estimated user population in the millions. An academic research paper analyzed Gnutella and eMule protocols and found weaknesses in the protocol; many of the issues found in these networks are fundamental and probably common on other P2P networks. Users of file sharing networks, such as eMule and Gnutella, are subject to monitoring of their activity. Clients may be tracked by IP address, DNS name, software version they use, files they share, queries they initiate, and queries they answer to. Clients may also share their private files to the network without notice due to inappropriate settings.

Peer-to-peer SIP (P2P-SIP) is an implementation of a distributed voice over Internet Protocol (VoIP) or instant messaging communications application using a peer-to-peer (P2P) architecture in which session control between communication end points is facilitated with the Session Initiation Protocol (SIP).

Scalable Source Routing (SSR) is a routing protocol for unstructured networks such as mobile ad hoc networks, mesh networks, or sensor networks. It combines source routing with routing along a virtual ring, and is based on the idea of "pushing Chord into the underlay".

A distributed file system for cloud is a file system that allows many clients to have access to data and supports operations on that data. Each data file may be partitioned into several parts called chunks. Each chunk may be stored on different remote machines, facilitating the parallel execution of applications. Typically, data is stored in files in a hierarchical tree, where the nodes represent directories. There are several ways to share files in a distributed architecture: each solution must be suitable for a certain type of application, depending on how complex the application is. Meanwhile, the security of the system must be ensured. Confidentiality, availability and integrity are the main keys for a secure system.

In mathematics, an event that occurs with high probability is one whose probability depends on a certain number n and goes to 1 as n goes to infinity, i.e. the probability of the event occurring can be made as close to 1 as desired by making n big enough.

Social cloud computing, also peer-to-peer social cloud computing, is an area of computer science that generalizes cloud computing to include the sharing, bartering and renting of computing resources across peers whose owners and operators are verified through a social network or reputation system. It expands cloud computing past the confines of formal commercial data centers operated by cloud providers to include anyone interested in participating within the cloud services sharing economy. This in turn leads to more options, greater economies of scale, while bearing additional advantages for hosting data and computing services closer to the edge where they may be needed most.

Anne-Marie Kermarrec is a French computer scientist. She is Professor at EPFL, where she heads the Scalable Computing Systems Laboratory at the School of Computer and Communication Sciences. Her research concerns distributed computing, epidemic algorithms, peer-to-peer networks, and systematic support for machine learning.

Rachid Guerraoui is a Moroccan-Swiss-French computer scientist and a Full Professor in the School of Computer and Communication Sciences (IC) at EPFL, known for his contributions in the fields of concurrent and distributed computing. He is an ACM Fellow and an associate (area) editor of the Journal of the ACM. In 2018–19, he held the position of Chair in Informatics and Computational Science for Distributed Computing at the Collège de France.

References

↑ Demers, Alan; Greene, Dan; Hauser, Carl; Irish, Wes; Larson, John (1987). "Epidemic algorithms for replicated database maintenance". Proceedings of the sixth annual ACM Symposium on Principles of distributed computing - PODC '87. pp. 1–12. doi:10.1145/41840.41841. ISBN 978-0-89791-239-6. OCLC 8876960204. S2CID 1889203.
↑ Jelasity, Márk (2011-01-01). "Gossip" (PDF). In Serugendo, Giovanna Di Marzo; Gleizes, Marie-Pierre; Karageorgos, Anthony (eds.). Self-organising Software. Natural Computing Series. Springer Berlin Heidelberg. pp. 139–162. doi:10.1007/978-3-642-17348-6_7. ISBN 978-3-642-17347-9. S2CID 214970849.

Allavena, André; Demers, Alan; Hopcroft, John E. (2005). "Correctness of a gossip based membership protocol". Proceedings of the twenty-fourth annual ACM SIGACT-SIGOPS symposium on Principles of distributed computing - PODC '05. p. 292. doi:10.1145/1073814.1073871. ISBN 978-1-58113-994-5. OCLC 8876665695. S2CID 9378092.
Birman, Kenneth P.; Hayden, Mark; Ozkasap, Oznur; Xiao, Zhen; Budiu, Mihai; Minsky, Yaron (May 1999). "Bimodal multicast". ACM Transactions on Computer Systems. 17 (2): 41–88. doi: 10.1145/312203.312207 . S2CID 207744063.
Eugster, P. Th.; Guerraoui, R.; Handurukande, S. B.; Kouznetsov, P.; Kermarrec, A.-M. (November 2003). "Lightweight probabilistic broadcast" (PDF). ACM Transactions on Computer Systems. 21 (4): 341–374. doi:10.1145/945506.945507. S2CID 6875620.
Gupta, Indranil; Birman, Ken; Linga, Prakash; Demers, Al; Van Renesse, Robbert (2003). "Kelips: Building an Efficient and Stable P2P DHT through Increased Memory and Background Overhead". Peer-to-Peer Systems II. Lecture Notes in Computer Science. Vol. 2735. pp. 160–169. doi:10.1007/978-3-540-45172-3_15. ISBN 978-3-540-40724-9.
Systematic Design of P2P Technologies for Distributed Systems. Indranil Gupta, Global Data Management, eds: R. Baldoni, G. Cortese, F. Davide and A. Melpignano, 2006.
Leitao, Joao; Pereira, Jose; Rodrigues, Luis (2007). "HyParView: A Membership Protocol for Reliable Gossip-Based Broadcast". 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN'07). pp. 419–429. doi:10.1109/DSN.2007.56. hdl:1822/38895. ISBN 978-0-7695-2855-7. S2CID 9060122.
Gupta, I.; Kermarrec, A.-M.; Ganesh, A.J. (July 2006). "Efficient and adaptive epidemic-style protocols for reliable and scalable multicast". IEEE Transactions on Parallel and Distributed Systems. 17 (7): 593–605. doi:10.1109/TPDS.2006.85. S2CID 1148979.
Jelasity, Márk; Montresor, Alberto; Babaoglu, Ozalp (August 2009). "T-Man: Gossip-based fast overlay topology construction" (PDF). Computer Networks. 53 (13): 2321–2339. doi:10.1016/j.comnet.2009.03.013.
Leitao, Joao; Pereira, Jose; Rodrigues, Luis (2007). "Epidemic Broadcast Trees". 2007 26th IEEE International Symposium on Reliable Distributed Systems (SRDS 2007). pp. 301–310. doi:10.1109/SRDS.2007.27. hdl:1822/38894. ISBN 978-0-7695-2995-0. S2CID 7210467.
Jelasity, Márk; Montresor, Alberto; Babaoglu, Ozalp (August 2005). "Gossip-based aggregation in large dynamic networks" (PDF). ACM Transactions on Computer Systems. 23 (3): 219–252. doi:10.1145/1082469.1082470. S2CID 2608879.
Ordered slicing of very large overlay networks. Márk Jelasity and Anne-Marie Kermarrec. IEEE P2P, 2006.
Proximity-aware superpeer overlay topologies. Gian Paolo Jesi, Alberto Montresor, and Ozalp Babaoglu. IEEE Transactions on Network and Service Management, 4(2):74–83, September 2007.
X-BOT: A Protocol for Resilient Optimization of Unstructured Overlays. João Leitão, João Marques, José Pereira, Luís Rodrigues. Proc. 28th IEEE International Symposium on Reliable Distributed Systems (SRDS'09).
Spatial gossip and resource location protocols. David Kempe, Jon Kleinberg, Alan Demers. Journal of the ACM (JACM) 51: 6 (Nov 2004).
Gossip-Based Computation of Aggregate Information. David Kempe, Alin Dobra, Johannes Gehrke. Proc. 44th Annual IEEE Symposium on Foundations of Computer Science (FOCS). 2003.
Active and Passive Techniques for Group Size Estimation in Large-Scale and Dynamic Distributed Systems. Dionysios Kostoulas, Dimitrios Psaltoulis, Indranil Gupta, Ken Birman, Al Demers. Elsevier Journal of Systems and Software, 2007.
Build One, Get One Free: Leveraging the Coexistence of Multiple P2P Overlay Networks. Balasubramaneyam Maniymaran, Marin Bertier and Anne-Marie Kermarrec. Proc. ICDCS, June 2007.
Peer counting and sampling in overlay networks: random walk methods. Laurent Massoulié, Erwan Le Merrer, Anne-Marie Kermarrec, Ayalvadi Ganesh. Proc. 25th ACM PODC. Denver, 2006.
Chord on Demand. Alberto Montresor, Márk Jelasity, and Ozalp Babaoglu. Proc. 5th Conference on Peer-to-Peer Computing (P2P), Konstanz, Germany, August 2005.
Nielsen, Michael A. (2005). "Introduction to expander graphs" (PDF). Michael Nielsen. S2CID 3045708. Archived from the original (PDF) on 2011-07-15.
Building low-diameter P2P networks. G. Pandurangan, P. Raghavan, Eli Upfal. In Proceedings of the 42nd Symposium on Foundations of Computer Science (FOCS), 2001.
Van Renesse, Robbert; Birman, Kenneth P.; Vogels, Werner (May 2003). "Astrolabe: A robust and scalable technology for distributed system monitoring, management, and data mining". ACM Transactions on Computer Systems. 21 (2): 164–206. doi:10.1145/762483.762485. S2CID 6204358.
Voulgaris, S.; Kermarrec, A.-M.; Massoulie, L.; Van Steen, M. (2004). "Exploiting semantic proximity in peer-to-peer content searching". Proceedings. 10th IEEE International Workshop on Future Trends of Distributed Computing Systems, 2004. FTDCS 2004. pp. 238–243. doi:10.1109/FTDCS.2004.1316622. hdl:1871/12832. ISBN 0-7695-2118-5. S2CID 9464168.
Gupta, Ruchir; Singh, Yatindra Nath (2015). "Reputation Aggregation in Peer-to-Peer Network Using Differential Gossip Algorithm". IEEE Transactions on Knowledge and Data Engineering. 27 (10): 2812–2823. arXiv: 1210.4301 . doi:10.1109/TKDE.2015.2427793. S2CID 650473.
Bailey, Norman T. J. (1957). The Mathematical Theory of Epidemics. Hafner. ISBN 978-0-85264-113-2.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

[1] Demers, Alan; Greene, Dan; Hauser, Carl; Irish, Wes; Larson, John (1987). "Epidemic algorithms for replicated database maintenance". Proceedings of the sixth annual ACM Symposium on Principles of distributed computing - PODC '87. pp. 1–12. doi:10.1145/41840.41841. ISBN 978-0-89791-239-6. OCLC 8876960204. S2CID 1889203.

[2] Jelasity, Márk (2011-01-01). "Gossip" (PDF). In Serugendo, Giovanna Di Marzo; Gleizes, Marie-Pierre; Karageorgos, Anthony (eds.). Self-organising Software. Natural Computing Series. Springer Berlin Heidelberg. pp. 139–162. doi:10.1007/978-3-642-17348-6_7. ISBN 978-3-642-17347-9. S2CID 214970849.

[1]

[2]