Gnutella2


Gnutella2, often referred to as G2, is a peer-to-peer protocol developed mainly by Michael Stokes and released in 2002.


While inspired by the gnutella protocol, G2 shares little of its design with the exception of its connection handshake and download mechanics. [1]

G2 adopts an extensible binary packet format and an entirely new search algorithm.

Furthermore, G2 has a related (but significantly different) network topology and an improved metadata system, which helps to reduce the number of fake files, such as viruses, on the network.

History

In November 2002, Michael Stokes announced the Gnutella2 protocol to the Gnutella Developers Forum. Some developers thought the goals stated for Gnutella2 were primarily to make a clean break with the gnutella 0.6 protocol and start over, so that some of gnutella's less clean parts would be done more elegantly, and found them generally impressive and desirable. Other developers, primarily those of LimeWire and BearShare, thought it to be a "cheap publicity stunt" and discounted its technical merits. Some still refuse to refer to the network as "Gnutella2" and instead refer to it as "Mike's Protocol" ("MP"). [2]

The Gnutella2 protocol still uses the old "GNUTELLA CONNECT/0.6" handshake string for its connections, [1] as defined in the gnutella 0.6 specifications. This backward-compatible handshake method was criticized by the Gnutella Developers Forum as an attempt to use the gnutella network for bootstrapping the new, unrelated network, while proponents of the network claimed that its intent was to remain backward-compatible with gnutella and to allow current gnutella clients to add Gnutella2 at their leisure.

With the developers entrenched in their positions, a flame war soon erupted, further cementing both sides' resolve. [3] [4] [5] [6]

The draft specifications were released on March 26, 2003, and more detailed specifications soon followed. G2 is not supported by many of the "old" gnutella network clients; however, many Gnutella2 clients still also connect to gnutella. Many Gnutella2 proponents claim that this lack of support is due to political reasons, while gnutella supporters claim that the drastic changes do not have enough merit to outweigh the cost of deep rewrites. [7]

Design

Gnutella2 divides nodes into two groups: Leaves and Hubs. Most Leaves maintain two connections to Hubs, [8] while Hubs accept hundreds of Leaf connections and maintain an average of 7 connections to other Hubs. When a search is initiated, the node obtains a list of Hubs, if needed, and contacts the Hubs in the list, noting which have been searched, until the list is exhausted or a predefined search limit has been reached. This allows a user to find a popular file easily without loading the network while, theoretically, retaining the ability to find a single file located anywhere on the network.

Hubs index the files a Leaf has by means of a Query Routing Table, which is filled with single-bit entries of hashes of keywords. The Leaf uploads this table to the Hub, and the Hub then combines it with all the hash tables its other Leaves have sent in order to create a version to send to its neighboring Hubs. This allows Hubs to reduce bandwidth greatly by simply not forwarding queries to Leaves and neighboring Hubs if the entries that match the search are not found in the routing tables.
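The routing-table idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table size, the use of SHA-1 as the keyword hash, and all function names are hypothetical and are not taken from the Gnutella2 specification.

```python
import hashlib

TABLE_BITS = 1 << 16  # assumed table size, chosen only for this sketch


def keyword_slot(keyword: str, table_bits: int = TABLE_BITS) -> int:
    """Map a keyword to a bit position (SHA-1 is a stand-in hash function)."""
    digest = hashlib.sha1(keyword.lower().encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % table_bits


def build_table(keywords, table_bits: int = TABLE_BITS) -> int:
    """Leaf side: one bit per keyword hash, stored in a Python int used as a bitmask."""
    table = 0
    for kw in keywords:
        table |= 1 << keyword_slot(kw, table_bits)
    return table


def merge_tables(tables) -> int:
    """Hub side: combine the Leaf tables with bitwise OR before passing them on."""
    merged = 0
    for t in tables:
        merged |= t
    return merged


def may_match(table: int, query_keywords, table_bits: int = TABLE_BITS) -> bool:
    """Forward a query only if every keyword bit is set (false positives are possible)."""
    return all((table >> keyword_slot(kw, table_bits)) & 1 for kw in query_keywords)
```

A Hub holding the merged table can call `may_match` before forwarding a query. A `False` result means that no Leaf behind that table can have a matching file, so the query need not be sent.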

Gnutella2 relies extensively on UDP, rather than TCP, for searches. The overhead of setting up a TCP connection would make a random walk search system, which requires contacting large numbers of nodes with small volumes of data, unworkable. However, UDP is not without its own drawbacks. Because UDP is connectionless, there is no standard method to inform the sending client that a message was received, so if a packet is lost, the sender has no way of knowing. Because of this, UDP packets in Gnutella2 have a flag to enable a reliability setting. When a UDP packet with the reliability flag enabled is received, the client responds with an acknowledgment packet to inform the sending client that its packet arrived at its destination. If the acknowledgment packet is not received, the reliable packet is retransmitted in an attempt to ensure delivery. Low-importance packets, which do not have the flag enabled, do not require an acknowledgment packet. This reduces reliability but also reduces overhead, as no acknowledgment packet needs to be sent and waited upon.
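The acknowledgment scheme described above can be modeled without real sockets. The following is a minimal sketch, not the actual Gnutella2 UDP transceiver: the class and field names, the sequence numbers, and the retry budget are assumptions made for illustration, and `transmit` is a hypothetical callback standing in for the network.

```python
from dataclasses import dataclass, field


@dataclass
class Packet:
    seq: int
    payload: bytes
    reliable: bool  # hypothetical name for the reliability flag


@dataclass
class Sender:
    max_retries: int = 3  # assumed retry budget; the text gives no number
    pending: dict = field(default_factory=dict)  # seq -> (packet, retries so far)
    next_seq: int = 0

    def send(self, payload: bytes, reliable: bool, transmit) -> Packet:
        pkt = Packet(self.next_seq, payload, reliable)
        self.next_seq += 1
        if reliable:
            # Only flagged packets are remembered for possible retransmission.
            self.pending[pkt.seq] = (pkt, 0)
        transmit(pkt)
        return pkt

    def on_ack(self, seq: int) -> None:
        # An acknowledgment arrived, so the packet no longer needs resending.
        self.pending.pop(seq, None)

    def on_timeout(self, transmit) -> None:
        """Retransmit every unacknowledged reliable packet, or give up on it."""
        for seq, (pkt, retries) in list(self.pending.items()):
            if retries >= self.max_retries:
                del self.pending[seq]
            else:
                self.pending[seq] = (pkt, retries + 1)
                transmit(pkt)


def receive(pkt: Packet, send_ack) -> None:
    """Receiver side: acknowledge only packets that asked for it."""
    if pkt.reliable:
        send_ack(pkt.seq)
```

In this model an unflagged packet costs exactly one transmission and is then forgotten, while a flagged packet stays in `pending` until `on_ack` is called or the retry budget runs out.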

Protocol features

Gnutella2 has an extensible binary packet format, comparable to an XML document tree, which was conceived as an answer for some of gnutella's less elegant parts. The packet format was designed so that future network improvements and individual vendor features could be added without worry of causing bugs in other clients on the network. [9]
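To show why a tree of named, length-prefixed packets is easy to extend, here is a small sketch. The byte layout below is deliberately simplified and hypothetical (it is not the real Gnutella2 wire format), and the packet names in the usage note are placeholders.

```python
import struct
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    payload: bytes = b""
    children: list = field(default_factory=list)


def encode(node: Node) -> bytes:
    # Assumed layout: [name_len:1][name][body_len:2][child_count:1][children...][payload]
    body = bytes([len(node.children)])
    body += b"".join(encode(c) for c in node.children)
    body += node.payload
    name = node.name.encode("ascii")
    return bytes([len(name)]) + name + struct.pack(">H", len(body)) + body


def decode(data: bytes, offset: int = 0):
    """Return (node, next_offset) for the packet starting at offset."""
    name_len = data[offset]
    offset += 1
    name = data[offset:offset + name_len].decode("ascii")
    offset += name_len
    (body_len,) = struct.unpack_from(">H", data, offset)
    offset += 2
    end = offset + body_len
    child_count = data[offset]
    offset += 1
    children = []
    for _ in range(child_count):
        child, offset = decode(data, offset)
        children.append(child)
    # Whatever remains of the body after the children is the payload.
    return Node(name, data[offset:end], children), end


def find_child(node: Node, name: str):
    """Look up a child by name; children with unknown names are simply ignored."""
    for c in node.children:
        if c.name == name:
            return c
    return None
```

For example, a client that only understands a `"DN"` child can decode `Node("Q2", b"", [Node("DN", b"ubuntu iso"), Node("XVENDOR", b"\x01\x02")])` and read `"DN"` while never looking at the vendor-specific child, which is the property the paragraph above describes.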

For the purposes of file identification and secure integrity checking, Gnutella2 employs SHA-1 hashes. To allow a file to be reliably downloaded in parallel from multiple sources, as well as to allow the reliable uploading of parts while the file is still being downloaded (swarming), Tiger tree hashes are used. [10]
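The value of a hash tree for swarming is that each block can be checked on its own as soon as it arrives. The sketch below illustrates this with a generic Merkle tree. Note the assumptions: Python's standard library has no Tiger implementation, so SHA-1 is used as a stand-in; the 1024-byte block size, the `0x00`/`0x01` prefixes, and the promotion of an unpaired node follow a common hash-tree convention and are not quoted from the Gnutella2 documents.

```python
import hashlib

BLOCK_SIZE = 1024  # assumed leaf block size for this sketch


def _h(data: bytes) -> bytes:
    # Stand-in hash: a real Tiger tree would use the Tiger function here.
    return hashlib.sha1(data).digest()


def leaf_hashes(data: bytes, block_size: int = BLOCK_SIZE):
    """Hash each fixed-size block of the file separately."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)] or [b""]
    return [_h(b"\x00" + b) for b in blocks]


def tree_root(level):
    """Combine hashes pairwise until one root hash remains."""
    level = list(level)
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level) - 1, 2):
            nxt.append(_h(b"\x01" + level[i] + level[i + 1]))
        if len(level) % 2:
            nxt.append(level[-1])  # an unpaired node moves up unchanged
        level = nxt
    return level[0]


def verify_block(block: bytes, index: int, trusted_leaves) -> bool:
    """Check one downloaded block against a leaf list whose root is already trusted."""
    return _h(b"\x00" + block) == trusted_leaves[index]
```

A downloader would first obtain the leaf hashes, confirm that `tree_root` of that list equals the root it trusts, and then call `verify_block` on each part as it arrives from any source, before offering that part to other peers.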

To create a more robust and complete system for searching, Gnutella2 also has a metadata system that allows more complete labeling, rating, and quality information to be given in the search results than could be gathered from the file names alone. [11] Nodes can even share this information after they have deleted the file, allowing users to mark viruses and worms on the network without having to keep a copy.

Gnutella2 also utilizes compression in its network connections to reduce the bandwidth used by the network. [10]

Shareaza has the additional feature of requesting previews of images and videos, though only FileScope takes limited advantage of this.

gtk-gnutella extended the protocol to further reduce the gap between Gnutella and G2. In particular, the semi-reliable UDP layer was enhanced to add cumulative and extended acknowledgments in a way that is backward compatible with legacy G2 clients. [12] Further extensions include the "A" string in /Q2/I [13] and the introduction of /QH2/H/ALT, /QH2/H/PART/MT, /QH2/HN, /QH2/BH and /QH2/G1 in the query hits. [14]

Differences from gnutella

Overall, the two networks are fairly similar, with the primary differences being in the packet format and the search methodology.

Protocol

Gnutella's packet format has been criticized because it was not originally designed with extensibility in mind and has had many additions over the years, leaving the packet structure cluttered and inefficient. [15] Gnutella2 learned from this: besides making many of gnutella's later additions standard, it was designed for future extensibility from the start.

Search algorithm

While gnutella uses a query flooding method of searching, Gnutella2 uses a random walk system, where a searching node gathers a list of Hubs and contacts them directly, one at a time. However, as Hubs organize themselves into so-called "Hub clusters", where each Hub mirrors the information stored by its neighbors, the Leaf is returned the information of the entire Hub cluster (usually 7 Hubs). This has several advantages over gnutella's query flooding system. It is more efficient, as continuing a search does not increase the network traffic exponentially and queries are not routed through as many nodes. It also increases the granularity of a search, allowing a client to stop once a predefined threshold of results has been obtained more effectively than in gnutella. However, the walk system also increases the complexity of the network and the network maintenance and management required, and it requires safeguards to prevent a malicious attacker from using the network for denial-of-service attacks.
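The walk described above can be sketched as a simple loop. This is an illustrative model only: `query_hub` is a hypothetical callback standing in for the network exchange, and the assumption that each answer reports which neighboring Hubs it already covers and which further Hubs to try is made for the sake of the example.

```python
from collections import deque


def walk_search(query, start_hubs, query_hub, max_results=50, max_hubs=100):
    """Contact Hubs one at a time until enough results arrive or the list runs out.

    query_hub(hub, query) is assumed to return (hits, covered_hubs, suggested_hubs).
    """
    queue = deque(start_hubs)
    searched = set()
    contacted = 0
    results = []
    while queue and len(results) < max_results and contacted < max_hubs:
        hub = queue.popleft()
        if hub in searched:
            continue  # already answered for by a Hub in the same cluster
        hits, covered, suggested = query_hub(hub, query)
        contacted += 1
        searched.add(hub)
        searched.update(covered)  # the answer covers the whole Hub cluster
        results.extend(hits)
        queue.extend(h for h in suggested if h not in searched)
    return results, searched
```

Because the loop checks `max_results` before every contact, a search for a popular file ends after a handful of Hubs, while a search for a rare file keeps walking until `max_hubs` is reached or no Hubs remain.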

Terminology

There is also a difference in terminology: the more capable nodes, which are used to condense the network, are referred to as Ultrapeers in gnutella but are called Hubs in Gnutella2, and they are also used slightly differently in the topology. In gnutella, Ultrapeers generally maintain as many Leaf connections as peer connections, while Gnutella2 Hubs maintain far more Leaf connections and fewer peer (Hub-to-Hub) connections. The reason for this is that the search methods of the two networks have different optimum topologies.

Clients

List

Free software Gnutella2 clients include:

Proprietary software implementations include:

Comparison

The following table compares general and technical information for a number of available applications supporting the G2 network.

| Client | Chat | Handles big files (>4 GB) | UKHL [20] | Unicode | UPnP port mapping | NAT traversal | Remote preview | Ability to search with hashes | Hub mode | Spyware/Adware/Malware-free | Other networks | Based on | OS | Other |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Foxy | Yes | No | No | Yes | Yes | No | No | Yes | Foxy only | No | | GnucDNA | Cross-platform | - |
| FileScope | Yes | No | No | No | No | No | Yes | Yes | Yes | Yes | gnutella, eD2k, OpenNap | - | Windows [21] | Development ended in 2014. [22] |
| Gnucleus | No | No | No | No | No | No | No | Yes | No | Yes | gnutella | GnucDNA | Windows | - |
| gtk-gnutella | No | Yes | No | Yes | Yes | Yes | No | Yes | No | Yes | gnutella | - | Cross-platform | - |
| Morpheus | Yes | No | No | No | Yes | No | No | Yes | No | No | gnutella, NEOnet | GnucDNA | Windows | Development and hosting of the client have been stopped |
| Shareaza | Yes | Yes | Yes | Yes | Yes | No | Yes | Yes | Yes | Yes | gnutella, eD2k, BitTorrent | - | Windows | Includes IRC support |


References

  1. "Developer discussion of similarities between Gnutella and Gnutella2". The Gnutella Developer Forum. Archived from the original on 2023-01-17. Retrieved 2006-05-10.
  2. "GDF Discussion on the Gnutella2 name". The Gnutella Developer Forum. Archived from the original on 2023-01-17. Retrieved 2006-05-10.
  3. "Part of the Gnutella/Gnutella2 Flame War (1)". The Gnutella Developer Forum. Archived from the original on 2009-02-12. Retrieved 2006-08-06.
  4. "Part of the Gnutella/Gnutella2 Flame War (2)". The Gnutella Developer Forum. Archived from the original on 2023-01-17. Retrieved 2006-08-06.
  5. "Part of the Gnutella/Gnutella2 Flame War (3)". The Gnutella Developer Forum. Archived from the original on 2023-01-17. Retrieved 2006-08-06.
  6. "Part of the Gnutella/Gnutella2 Flame War (4)". The Gnutella Developer Forum. Archived from the original on 2023-01-17. Retrieved 2006-08-06.
  7. "Developer discussion on migration to Gnutella2". The Gnutella Developer Forum. Archived from the original on 2023-01-17. Retrieved 2006-05-10.
  8. "Gnutella2 Network history". Trillinux crawler (G2paranha). Archived from the original on 2009-05-15. Retrieved 2009-04-12.
  9. "Packet Structure". Gnutella2 Wiki. Archived from the original on 2007-12-19. Retrieved 2007-11-07.
  10. "Gnutella2 Standard". Gnutella2 Wiki. Archived from the original on 2007-12-19. Retrieved 2007-11-07.
  11. "Simple Query Language and Metadata". Gnutella2 Wiki. Archived from the original on 2007-12-19. Retrieved 2007-11-07.
  12. "UDP Transceiver - Gnutella2". G2.doxu.org. Archived from the original on 2014-07-19. Retrieved 2014-08-06.
  13. "Q2 - Gnutella2". G2.doxu.org. 2014-02-25. Archived from the original on 2014-07-14. Retrieved 2014-08-06.
  14. "QH2 - Gnutella2". G2.doxu.org. 2014-03-12. Archived from the original on 2013-12-13. Retrieved 2014-08-06.
  15. "Developer discussion of Gnutella and Gnutella2 packet formats". The Gnutella Developer Forum. Archived from the original on 2023-01-17. Retrieved 2006-05-15.
  16. "Adagio download | SourceForge.net". Archived from the original on 2016-11-12. Retrieved 2016-11-11.
  17. "gtk-gnutella - The Graphical Unix Gnutella Client". Gtk-gnutella.sourceforge.net. Archived from the original on 2005-07-08. Retrieved 2014-08-06.
  18. "OtherNetworksSupported - MLDonkey". mldonkey.sourceforge.net. Archived from the original on 2016-11-12. Retrieved 2016-11-11.
  19. "Shareaza network share on the G2 network". Trillinux crawler (G2paranha). Archived from the original on 2009-01-05. Retrieved 2008-09-18.
  20. UKHL = UDP Known Hub List
  21. "FileScope website: Statement about cross-platforum compatibility". FileScope. Archived from the original on 2008-08-28. Retrieved 2008-08-22.
  22. "FileScope". SourceForge. Archived from the original on 2021-06-25. Retrieved 2021-06-25.