Hash list

Last updated

In computer science, a hash list is typically a list of hashes of the data blocks in a file or set of files. Lists of hashes are used for many different purposes, such as fast table lookup (hash tables) and distributed databases (distributed hash tables).

Contents

A hash list with a top hash Hash list.svg
A hash list with a top hash

A hash list is an extension of the concept of hashing an item (for instance, a file). A hash list is a subtree of a Merkle tree.

Root hash

Often, an additional hash of the hash list itself (a top hash, also called root hash or master hash) is used. Before downloading a file on a p2p network, in most cases the top hash is acquired from a trusted source, for instance a friend or a web site that is known to have good recommendations of files to download. When the top hash is available, the hash list can be received from any non-trusted source, like any peer in the p2p network. Then the received hash list is checked against the trusted top hash, and if the hash list is damaged or fake, another hash list from another source will be tried until the program finds one that matches the top hash.

In some systems (for example, BitTorrent), instead of a top hash the whole hash list is available on a web site in a small file. Such a "torrent file" contains a description, file names, a hash list and some additional data.

Applications

Hash lists can be used to protect any kind of data stored, handled and transferred in and between computers. An important use of hash lists is to make sure that data blocks received from other peers in a peer-to-peer network are received undamaged and unaltered, and to check that the other peers do not "lie" and send fake blocks.

Usually a cryptographic hash function such as SHA-256 is used for the hashing. If the hash list only needs to protect against unintentional damage unsecured checksums such as CRCs can be used.

Hash lists are better than a simple hash of the entire file since, in the case of a data block being damaged, this is noticed, and only the damaged block needs to be redownloaded. With only a hash of the file, many undamaged blocks would have to be redownloaded, and the file reconstructed and tested until the correct hash of the entire file is obtained. Hash lists also protect against nodes that try to sabotage by sending fake blocks, since in such a case the damaged block can be acquired from some other source.

Protocols using hash lists

See also


Related Research Articles

<span class="mw-page-title-main">Hyphanet</span> Peer-to-peer Internet platform for censorship-resistant communication

Hyphanet is a peer-to-peer platform for censorship-resistant, anonymous communication. It uses a decentralized distributed data store to keep and deliver information, and has a suite of free software for publishing and communicating on the Web without fear of censorship. Both Freenet and some of its associated tools were originally designed by Ian Clarke, who defined Freenet's goal as providing freedom of speech on the Internet with strong anonymity protection.

<span class="mw-page-title-main">Peer-to-peer</span> Type of decentralized and distributed network architecture

Peer-to-peer (P2P) computing or networking is a distributed application architecture that partitions tasks or workloads between peers. Peers are equally privileged, equipotent participants in the network, forming a peer-to-peer network of nodes.

eDonkey2000

eDonkey2000 was a peer-to-peer file sharing application developed by US company MetaMachine, using the Multisource File Transfer Protocol. It supported both the eDonkey2000 network and the Overnet network.

<span class="mw-page-title-main">Distributed hash table</span> Decentralized distributed system with lookup service

A distributed hash table (DHT) is a distributed system that provides a lookup service similar to a hash table. Key–value pairs are stored in a DHT, and any participating node can efficiently retrieve the value associated with a given key. The main advantage of a DHT is that nodes can be added or removed with minimum work around re-distributing keys. Keys are unique identifiers which map to particular values, which in turn can be anything from addresses, to documents, to arbitrary data. Responsibility for maintaining the mapping from keys to values is distributed among the nodes, in such a way that a change in the set of participants causes a minimal amount of disruption. This allows a DHT to scale to extremely large numbers of nodes and to handle continual node arrivals, departures, and failures.

BitTorrent, also referred to as simply torrent, is a communication protocol for peer-to-peer file sharing (P2P), which enables users to distribute data and electronic files over the Internet in a decentralized manner. The protocol is developed and maintained by Rainberry, Inc., and was first released in 2001.

<span class="mw-page-title-main">Cryptographic hash function</span> Hash function that is suitable for use in cryptography

A cryptographic hash function (CHF) is a hash algorithm that has special properties desirable for a cryptographic application:

An anonymous P2P communication system is a peer-to-peer distributed application in which the nodes, which are used to share resources, or participants are anonymous or pseudonymous. Anonymity of participants is usually achieved by special routing overlay networks that hide the physical location of each node from other participants.

In cryptography, Tiger is a cryptographic hash function designed by Ross Anderson and Eli Biham in 1995 for efficiency on 64-bit platforms. The size of a Tiger hash value is 192 bits. Truncated versions can be used for compatibility with protocols assuming a particular hash size. Unlike the SHA-2 family, no distinguishing initialization values are defined; they are simply prefixes of the full Tiger/192 hash value.

<span class="mw-page-title-main">Magnet URI scheme</span> Scheme that defines the format of magnet links

Magnet is a URI scheme that defines the format of magnet links, a de facto standard for identifying files (URN) by their content, via cryptographic hash value rather than by their location.

The eDonkey Network is a decentralized, mostly server-based, peer-to-peer file sharing network created in 2000 by US developers Jed McCaleb and Sam Yagan that is best suited to share big files among users, and to provide long term availability of files. Like most sharing networks, it is decentralized, as there is no central hub for the network; also, files are not stored on a central server but are exchanged directly between users based on the peer-to-peer principle.

A BitTorrent tracker is a special type of server that assists in the communication between peers using the BitTorrent protocol.

<span class="mw-page-title-main">Merkle tree</span> Type of data structure

In cryptography and computer science, a hash tree or Merkle tree is a tree in which every "leaf" node is labelled with the cryptographic hash of a data block, and every node that is not a leaf is labelled with the cryptographic hash of the labels of its child nodes. A hash tree allows efficient and secure verification of the contents of a large data structure. A hash tree is a generalization of a hash list and a hash chain.

eMule Free peer-to-peer file sharing application for Microsoft Windows.

eMule is a free peer-to-peer file sharing application for Microsoft Windows. Started in May 2002 as an alternative to eDonkey2000, eMule now connects to both the eDonkey network and the Kad network. The distinguishing features of eMule are the direct exchange of sources between client nodes, fast recovery of corrupted downloads, and the use of a credit system to reward frequent uploaders. Furthermore, eMule transmits data in zlib-compressed form to save bandwidth.

In computing, eD2k links (ed2k://) are hyperlinks used to denote files stored on computers connected to the eDonkey filesharing P2P network.

This is a glossary of jargon related to peer-to-peer file sharing via the BitTorrent protocol.

In the BitTorrent file distribution system, a torrent file or meta-info file is a computer file that contains metadata about files and folders to be distributed, and usually also a list of the network locations of trackers, which are computers that help participants in the system find each other and form efficient distribution groups called swarms. Torrent files are normally named with the extension .torrent.

<span class="mw-page-title-main">Retroshare</span> Free software

Retroshare is a free and open-source peer-to-peer communication and file sharing app based on a friend-to-friend network built by GNU Privacy Guard (GPG). Optionally peers may exchange certificates and IP addresses to their friends and vice versa.

Torrent poisoning is intentionally sharing corrupt data or data with misleading file names using the BitTorrent protocol. This practice of uploading fake torrents is sometimes carried out by anti-infringement organisations as an attempt to prevent the peer-to-peer (P2P) sharing of copyrighted content, and to gather the IP addresses of downloaders.

<span class="mw-page-title-main">Twister (software)</span> Blog software

Twister is a decentralized, experimental peer-to-peer microblogging program which uses end-to-end encryption to safeguard communications. Based on BitTorrent- and Bitcoin-like protocols, it has been likened to a distributed version of Twitter.

<span class="mw-page-title-main">ZeroNet</span> Peer to peer web hosting

ZeroNet is a decentralized web-like network of peer-to-peer users, created by Tamas Kocsis in 2015, programming for the network was based in Budapest, Hungary; is built in Python; and is fully open source. Instead of having an IP address, sites are identified by a public key. The private key allows the owner of a site to sign and publish changes, which propagate through the network. Sites can be accessed through an ordinary web browser when using the ZeroNet application, which acts as a local webhost for such pages. In addition to using bitcoin cryptography, ZeroNet uses trackers from the BitTorrent network to negotiate connections between peers. ZeroNet is not anonymous by default, but it supports routing traffic through the Tor network.

References