Query flooding

Last updated

Query flooding is a method to search for a resource on a peer-to-peer network. It is simple and scales very poorly and thus is rarely used. Early versions of the Gnutella protocol operated by query flooding; newer versions use more efficient search algorithms.

Contents

Operation

A peer-to-peer network generally consists of a large number of nodes each connected to a small subset of the nodes and not all nodes in the network. If a node wants to find a resource on the network, which may be on a node it does not know about, it could simply broadcast its search query to its immediate neighbours. If the neighbours do not have the resource, it then asks its neighbours to forward the query to their neighbours in turn. This is repeated until the resource is found or all the nodes have been contacted, or perhaps a network-imposed hop limit is reached.

Query flooding is simple to implement and is practical for small networks with few requests. It contacts all reachable nodes in the network and so can precisely determine whether a resource can be found in the network (Freenet, for example, only returns a probabilistic result).

On the other hand, every request may cause every node to be contacted. Each node might generate a small number of queries; however, each such query floods the network. Thus, a larger network would generate far more traffic per node than a smaller one, making it inherently unscalable. Additionally, because a node can flood the network simply by issuing a request for a nonexistent resource, it could be possible to launch a denial-of-service attack on the network.

Alternatives

Version 0.6 of the Gnutella protocol mandates query routing. The query routing specification explains how the ideas of the original research are implemented. Other file-sharing networks, such as the Kad network, use distributed hash tables to index files and for keyword searches. BitTorrent creates individual overlay networks for sharing individual files (or archives). Searches are performed by other mechanisms, such as locating torrent files indexed on a website. A similar mechanism can be used on the Gnutella network with magnet links. For instance Bitzi provides a web interface to search for magnet links.

Earlier P2P networks, such as Napster, used a centralized database to locate files. This does not have a scaling problem, but the central server is a single point of failure.

See also

Related Research Articles

Gnutella is a peer-to-peer network protocol. Founded in 2000, it was the first decentralized peer-to-peer network of its kind, leading to other, later networks adopting the model.

<span class="mw-page-title-main">Peer-to-peer</span> Type of decentralized and distributed network architecture

Peer-to-peer (P2P) computing or networking is a distributed application architecture that partitions tasks or workloads between peers. Peers are equally privileged, equipotent participants in the network. They are said to form a peer-to-peer network of nodes.

<span class="mw-page-title-main">Shareaza</span> Peer-to-peer file sharing application

Shareaza is a peer-to-peer file sharing client running under Microsoft Windows which supports the gnutella, Gnutella2 (G2), eDonkey, BitTorrent, FTP, HTTP and HTTPS network protocols and handles magnet links, ed2k links, and the now deprecated gnutella and Piolet links. It is available in 30 languages.

eDonkey2000

eDonkey2000 was a peer-to-peer file sharing application developed by US company MetaMachine, using the Multisource File Transfer Protocol. This client supports both the eDonkey2000 network and the Overnet network.

giFT Internet File Transfer (giFT) was a computer software daemon that allows several file sharing protocols to be used with a simple client having a graphical user interface (GUI). The client dynamically loads plugins implementing the protocols, as they are required.

Scalability is the property of a system to handle a growing amount of work by adding resources to the system.

<span class="mw-page-title-main">Distributed hash table</span> Decentralized distributed system with lookup service

A distributed hash table (DHT) is a distributed system that provides a lookup service similar to a hash table: key–value pairs are stored in a DHT, and any participating node can efficiently retrieve the value associated with a given key. The main advantage of a DHT is that nodes can be added or removed with minimum work around re-distributing keys. Keys are unique identifiers which map to particular values, which in turn can be anything from addresses, to documents, to arbitrary data. Responsibility for maintaining the mapping from keys to values is distributed among the nodes, in such a way that a change in the set of participants causes a minimal amount of disruption. This allows a DHT to scale to extremely large numbers of nodes and to handle continual node arrivals, departures, and failures.

BitTorrent is a communication protocol for peer-to-peer file sharing (P2P), which enables users to distribute data and electronic files over the Internet in a decentralized manner.

gtk-gnutella

gtk-gnutella is a peer-to-peer file sharing application which runs on the gnutella network. gtk-gnutella uses the GTK+ toolkit for its graphical user interface. Released under the GNU General Public License, gtk-gnutella is free software.

<span class="mw-page-title-main">GNUnet</span> Framework for decentralized, peer-to-peer networking which is part of the GNU Project

GNUnet is a software framework for decentralized, peer-to-peer networking and an official GNU package. The framework offers link encryption, peer discovery, resource allocation, communication over many transports and various basic peer-to-peer algorithms for routing, multicast and network size estimation.

<span class="mw-page-title-main">Gnutella2</span>

Gnutella2, often referred to as G2, is a peer-to-peer protocol developed mainly by Michael Stokes and released in 2002.

Kademlia is a distributed hash table for decentralized peer-to-peer computer networks designed by Petar Maymounkov and David Mazières in 2002. It specifies the structure of the network and the exchange of information through node lookups. Kademlia nodes communicate among themselves using UDP. A virtual or overlay network is formed by the participant nodes. Each node is identified by a number or node ID. The node ID serves not only as identification, but the Kademlia algorithm uses the node ID to locate values.

<span class="mw-page-title-main">Magnet URI scheme</span> Scheme that defines the format of magnet links

Magnet is a URI scheme that defines the format of magnet links, a de facto standard for identifying files (URN) by their content, via cryptographic hash value rather than by their location.

GnucDNA was a software library for building peer-to-peer applications. It provides developers with a common layer to create their own Gnutella or Gnutella2 client or network. As a separate component, GnucDNA can be updated independently of the client, passing down improvements to the applications already using it.

<span class="mw-page-title-main">Merkle tree</span> Type of data structure

In cryptography and computer science, a hash tree or Merkle tree is a tree in which every "leaf" (node) is labelled with the cryptographic hash of a data block, and every node that is not a leaf is labelled with the cryptographic hash of the labels of its child nodes. A hash tree allows efficient and secure verification of the contents of a large data structure. A hash tree is a generalization of a hash list and a hash chain.

Tapestry is a peer-to-peer overlay network which provides a distributed hash table, routing, and multicasting infrastructure for distributed applications. The Tapestry peer-to-peer system offers efficient, scalable, self-repairing, location-aware routing to nearby resources.

Peer-to-peer file sharing (P2P) systems like Gnutella, KaZaA, and eDonkey/eMule, have become extremely popular in recent years, with the estimated user population in the millions. An academic research paper analyzed Gnutella and eMule protocols and found weaknesses in the protocol; many of the issues found in these networks are fundamental and probably common on other P2P networks. Users of file sharing networks, such as eMule and Gnutella, are subject to monitoring of their activity. Clients may be tracked by IP address, DNS name, software version they use, files they share, queries they initiate, and queries they answer to. Clients may also share their private files to the network without notice due to inappropriate settings.

<span class="mw-page-title-main">Phex</span>

Phex is a peer-to-peer file sharing client for the gnutella network, released under the terms of the GNU General Public License, so Phex is free software. Phex is based on Java SE 5.0 or later.

In the BitTorrent file distribution system, a torrent file or meta-info file is a computer file that contains metadata about files and folders to be distributed, and usually also a list of the network locations of trackers, which are computers that help participants in the system find each other and form efficient distribution groups called swarms. A torrent file does not contain the content to be distributed; it only contains information about those files, such as their names, folder structure, and sizes obtained via cryptographic hash values for verifying file integrity. The term torrent may refer either to the metadata file or to the files downloaded, depending on the context.

Mainline DHT is the name given to the Kademlia-based distributed hash table (DHT) used by BitTorrent clients to find peers via the BitTorrent protocol. The idea of using a DHT for distributed tracking in BitTorrent was first implemented in Azureus 2.3.0.0 in May 2005, from which it gained significant popularity. Unrelated but around the same time, BitTorrent, Inc. released a similar DHT into their client called Mainline DHT, and thus popularized the use of distributed tracking in the BitTorrent protocol. Measurement showed that by 2013, the concurrent number of users of Mainline DHT is from 16 million to 28 million, with intra-day changes of at least 10 million.