Tapestry (DHT)

Last updated
Tapestry
Written in C++
Type Peer-to-peer
Website current.cs.ucsb.edu/projects/chimera/

Tapestry is a peer-to-peer overlay network which provides a distributed hash table, routing, and multicasting infrastructure for distributed applications. [1] The Tapestry peer-to-peer system offers efficient, scalable, self-repairing, location-aware routing to nearby resources.

Contents

Introduction

The first generation of peer-to-peer applications, including Napster, Gnutella, had restricting limitations such as a central directory for Napster and scoped broadcast queries for Gnutella limiting scalability. To address these problems a second generation of P2P applications were developed including Tapestry, Chord, Pastry, and CAN. These overlays implement a basic key-based routing mechanism. This allows for deterministic routing of messages and adaptation to node failures in the overlay network. Of the named networks Pastry is very close to Tapestry as they both adopt the same routing algorithm by Plaxton et al.

Tapestry is an extensible infrastructure that provides decentralized object location and routing focusing on efficiency and minimizing message latency. This is achieved since Tapestry constructs locally optimal routing tables from initialization and maintains them in order to reduce routing stretch. Furthermore, Tapestry allows object distribution determination according to the needs of a given application. Similarly Tapestry allows applications to implement multicasting in the overlay network.

Algorithm

API

Each node is assigned a unique nodeID uniformly distributed in a large identifier space. Tapestry uses SHA-1 to produce a 160-bit identifier space represented by a 40 digit hex key. Application specific endpoints GUIDs are similarly assigned unique identifiers. NodeIDs and GUIDs are roughly evenly distributed in the overlay network with each node storing several different IDs. From experiments it is shown that Tapestry efficiency increases with network size, so multiple applications sharing the same overlay network increases efficiency. To differentiate between applications a unique application identifier is used. Tapestry uses best-effort to publish and route objects.

Routing

Routing mesh

Each identifier is mapped to a live node called the root. If a node's nodeID is G then it is the root else use the routing table's nodeIDs and IP addresses to find the nodes neighbors. At each hop a message is progressively routed closer to G by incremental suffix routing. Each neighbor map has multiple levels where each level contains links to nodes matching up to a certain digit position in the ID. The primary ith entry in the jth level is the ID and location of the closest node that begins with prefix (N, j-1)+i. This means that level 1 has links to nodes that have nothing in common, level 2 has the first digit in common, etc. Because of this, routing takes approximately hops in a network of size N and IDs of base B (hex: B=16). If an exact ID can not be found, the routing table will route to the closest matching node. For fault tolerance, nodes keep secondary links such that the routing table has size .

Object publication and location

Participants in the network can publish objects by periodically routing a publish message toward the root node. Each node along the path stores a pointer mapping the object. Multiple servers can publish pointers to the same object. The redundant links are prioritized by latency and/or locality.

Objects are located by routing a message towards the root of the object. Each node along the path checks the mapping and redirects the request appropriately. The effect of routing is convergence of nearby paths heading to the same destination....

Dynamic nodes

Node insertion

The new node becomes the root for its nodeID. The root finds the length of the longest prefix of the ID it shares. Then it sends a multicast message that reaches all existing nodes sharing the same prefix. These nodes then add the new node to their routing tables. The new node may take over being the root for some of the root's objects. The nodes will contact the new node to provide a temporary neighborhood list. The new node then performs an iterative nearest neighbor search to fill all levels in its routing table.

Node departure

To leave the network, a node broadcasts its intention of leaving and transmits the replacement node for each level in the routing tables of the other nodes. Objects at the leaving node are redistributed or replenished from redundant copies.

Node failure

Unexpected node failure is handled through redundancy in the network and backup pointers to reestablish damaged links.

Applications

Tapestry provides an overlay routing network that is stable under a variety of network conditions. This provides an ideal infrastructure for distributed applications and services. Applications based on tapestry are:

Developers

Tapestry was developed by Ben Y. Zhao, Ling Huang, Jeremy Stribling, Sean C. Rhea, Anthony D. Joseph and John D. Kubiatowicz.

See also

Related Research Articles

Gnutella is a peer-to-peer network protocol. Founded in 2000, it was the first decentralized peer-to-peer network of its kind, leading to other, later networks adopting the model.

<span class="mw-page-title-main">Multicast</span> Computer networking technique for transmission from one sender to multiple receivers

In computer networking, multicast is group communication where data transmission is addressed to a group of destination computers simultaneously. Multicast can be one-to-many or many-to-many distribution. Multicast should not be confused with physical layer point-to-multipoint communication.

<span class="mw-page-title-main">Peer-to-peer</span> Type of decentralized and distributed network architecture

Peer-to-peer (P2P) computing or networking is a distributed application architecture that partitions tasks or workloads between peers. Peers are equally privileged, equipotent participants in the network. They are said to form a peer-to-peer network of nodes.

<span class="mw-page-title-main">Distributed hash table</span> Decentralized distributed system with lookup service

A distributed hash table (DHT) is a distributed system that provides a lookup service similar to a hash table. Key–value pairs are stored in a DHT, and any participating node can efficiently retrieve the value associated with a given key. The main advantage of a DHT is that nodes can be added or removed with minimum work around re-distributing keys. Keys are unique identifiers which map to particular values, which in turn can be anything from addresses, to documents, to arbitrary data. Responsibility for maintaining the mapping from keys to values is distributed among the nodes, in such a way that a change in the set of participants causes a minimal amount of disruption. This allows a DHT to scale to extremely large numbers of nodes and to handle continual node arrivals, departures, and failures.

gtk-gnutella

gtk-gnutella is a peer-to-peer file sharing application which runs on the gnutella network. gtk-gnutella uses the GTK+ toolkit for its graphical user interface. Released under the GNU General Public License, gtk-gnutella is free software.

<span class="mw-page-title-main">GNUnet</span> Framework for decentralized, peer-to-peer networking which is part of the GNU Project

GNUnet is a software framework for decentralized, peer-to-peer networking and an official GNU package. The framework offers link encryption, peer discovery, resource allocation, communication over many transports and various basic peer-to-peer algorithms for routing, multicast and network size estimation.

In computing, Chord is a protocol and algorithm for a peer-to-peer distributed hash table. A distributed hash table stores key-value pairs by assigning keys to different computers ; a node will store the values for all the keys for which it is responsible. Chord specifies how keys are assigned to nodes, and how a node can discover the value for a given key by first locating the node responsible for that key.

Query flooding is a method to search for a resource on a peer-to-peer network. It is simple and scales very poorly and thus is rarely used. Early versions of the Gnutella protocol operated by query flooding; newer versions use more efficient search algorithms.

Kademlia is a distributed hash table for decentralized peer-to-peer computer networks designed by Petar Maymounkov and David Mazières in 2002. It specifies the structure of the network and the exchange of information through node lookups. Kademlia nodes communicate among themselves using UDP. A virtual or overlay network is formed by the participant nodes. Each node is identified by a number or node ID. The node ID serves not only as identification, but the Kademlia algorithm uses the node ID to locate values.

An overlay network is a computer network that is layered on top of another network.

In software architecture, publish–subscribe is a messaging pattern where senders of messages, called publishers, do not program the messages to be sent directly to specific receivers, called subscribers, but instead categorize published messages into classes without knowledge of which subscribers, if any, there may be. Similarly, subscribers express interest in one or more classes and only receive messages that are of interest, without knowledge of which publishers, if any, there are.

<span class="mw-page-title-main">RapidIO</span> Electrical connection technology

The RapidIO architecture is a high-performance packet-switched electrical connection technology. RapidIO supports messaging, read/write and cache coherency semantics. Based on industry-standard electrical specifications such as those for Ethernet, RapidIO can be used as a chip-to-chip, board-to-board, and chassis-to-chassis interconnect.

The content-addressable network (CAN) is a distributed, decentralized P2P infrastructure that provides hash table functionality on an Internet-like scale. CAN was one of the original four distributed hash table proposals, introduced concurrently with Chord, Pastry, and Tapestry.

BitVault is a content-addressable distributed storage system, developed by Microsoft Research in China. BitVault uses peer-to-peer technology to distribute the tasks of storing and managing data. As such, there is no central authority responsible for management of the system. Rather, it is self-managing, provides high availability, reliability and scales up in a self-organizing manner, with low administrative overhead, which is almost constant irrespective of the size of the distributed overlay network.

Pastry is an overlay network and routing network for the implementation of a distributed hash table (DHT) similar to Chord. The key–value pairs are stored in a redundant peer-to-peer network of connected Internet hosts. The protocol is bootstrapped by supplying it with the IP address of a peer already in the network and from then on via the routing table which is dynamically built and repaired. It is claimed that because of its redundant and decentralized nature there is no single point of failure and any single node can leave the network at any time without warning and with little or no chance of data loss. The protocol is also capable of using a routing metric supplied by an outside program, such as ping or traceroute, to determine the best routes to store in its routing table.

Peer-to-peer file sharing (P2P) systems like Gnutella, KaZaA, and eDonkey/eMule, have become extremely popular in recent years, with the estimated user population in the millions. An academic research paper analyzed Gnutella and eMule protocols and found weaknesses in the protocol; many of the issues found in these networks are fundamental and probably common on other P2P networks. Users of file sharing networks, such as eMule and Gnutella, are subject to monitoring of their activity. Clients may be tracked by IP address, DNS name, software version they use, files they share, queries they initiate, and queries they answer to. Clients may also share their private files to the network without notice due to inappropriate settings.

In peer-to-peer networks, Koorde is a distributed hash table (DHT) system based on the Chord DHT and the De Bruijn graph. Inheriting the simplicity of Chord, Koorde meets O(log n) hops per node, and hops per lookup request with O(log n) neighbors per node.

The BAlanced Tree Overlay Network (BATON) is a distributed tree structure designed for peer-to-peer (P2P) systems. Unlike other overlays that utilize a distributed hash table (DHT) like the Chord system, BATON organizes peers in a distributed tree to facilitate range search. Furthermore, BATON aims to maintain a balanced tree height, similar to the AVL tree, resulting in a bounded to for search exact and range queries as well as update operations (join/leave).

A bootstrapping node, also known as a rendezvous host, is a node in an overlay network that provides initial configuration information to newly joining nodes so that they may successfully join the overlay network. Bootstrapping nodes are predominantly found in decentralized peer-to-peer (P2P) networks because of the dynamically changing identities and configurations of member nodes in these networks.

Skip graphs are a kind of distributed data structure based on skip lists. They were invented in 2003 by James Aspnes and Gauri Shah. A nearly identical data structure called SkipNet was independently invented by Nicholas Harvey, Michael Jones, Stefan Saroiu, Marvin Theimer and Alec Wolman, also in 2003.

References

  1. Zhao, Ben Y.; Huang, Ling; Stribling, Jeremy; Rhea, Sean C.; Joseph, Anthony D.; Kubiatowicz, John D. (2004). "Tapestry: A Resilient Global-scale Overlay for Service Deployment" (PDF). IEEE Journal on Selected Areas in Communications. 22 (1): 41–53. CiteSeerX   10.1.1.71.2718 . doi:10.1109/JSAC.2003.818784. S2CID   689430 . Retrieved 13 January 2015.