Torrent file

Torrent files
Filename extension: .torrent
Internet media type: application/x-bittorrent
Standard: BEP-0003 (v1) [1], BEP-0052 (v2) [2]

In the BitTorrent file distribution system, a torrent file or meta-info file is a computer file that contains metadata about files and folders to be distributed, and usually also a list of the network locations of trackers, which are computers that help participants in the system find each other and form efficient distribution groups called swarms. [1] Torrent files are normally named with the extension .torrent.

A torrent file acts like a table of contents (index) that allows computers to find information through the use of a BitTorrent client. With the help of a torrent file, one can download small parts of the original file from computers that have already downloaded it. These "peers" allow for downloading of the file in addition to, or in place of, the primary server. A torrent file does not contain the content to be distributed; it only contains information about those files, such as their names, folder structure, sizes, and cryptographic hash values for verifying file integrity.

The BitTorrent system was created to ease the load on central servers: instead of having individual clients fetch files from the server, BitTorrent can crowd-source the bandwidth needed for the file transfer and reduce the time needed to download large files. Many free or freeware programs and operating systems, such as the various Linux distributions, offer a torrent download option for users seeking these benefits. Other large downloads, such as media files, are often distributed as torrents as well.

Background

Typically, Internet access is asymmetrical, supporting greater download speeds than upload speeds, limiting the bandwidth of each download, and sometimes enforcing bandwidth caps and periods where systems are not accessible. This creates inefficiency when many people want to obtain the same set of files from a single source; the source must always be online and must have massive outbound bandwidth. The BitTorrent protocol addresses this by decentralizing the distribution, leveraging the ability of people to network "peer-to-peer", among themselves.

Each file to be distributed is divided into small information chunks called pieces. Downloading peers achieve high download speeds by requesting multiple pieces from different computers simultaneously in the swarm. Once obtained, these pieces are usually immediately made available for download by others in the swarm. In this way, the burden on the network is spread among the downloaders, rather than concentrating at a central distribution hub or cluster. As long as all the pieces are available, peers (downloaders and uploaders) can come and go; no one peer needs to have all the chunks or to even stay connected to the swarm in order for distribution to continue among the other peers.
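The piece mechanism can be sketched in a few lines of Python. This is a minimal illustration, not a client implementation: it assumes the v1 convention of one 20-byte SHA-1 digest per piece, and the function names piece_hashes and verify_piece are hypothetical.

```python
import hashlib

def piece_hashes(data: bytes, piece_length: int) -> bytes:
    """Concatenate the SHA-1 digest of every piece of `data`."""
    digests = []
    for offset in range(0, len(data), piece_length):
        piece = data[offset:offset + piece_length]  # the last piece may be shorter
        digests.append(hashlib.sha1(piece).digest())
    return b"".join(digests)

def verify_piece(piece: bytes, index: int, hashes: bytes) -> bool:
    """Check a downloaded piece against the 20-byte digest stored for its index."""
    expected = hashes[index * 20:(index + 1) * 20]
    return hashlib.sha1(piece).digest() == expected

# Sample content: 2560 bytes split into pieces of 1024, 1024 and 512 bytes.
data = bytes(range(256)) * 10
hashes = piece_hashes(data, 1024)
second_piece_ok = verify_piece(data[1024:2048], 1, hashes)
```

Because each piece is checked on its own, a peer can accept pieces from many sources in any order and discard only the ones that fail verification.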

A small torrent file is created to represent a file or folder to be shared. The torrent file acts as the key to initiating downloading of the actual content. Someone interested in receiving the shared file or folder first obtains the corresponding torrent file, either by directly downloading it or by using a magnet link. The user then opens that file in a BitTorrent client, which automates the rest of the process. In order to learn the internet locations of peers who may be sharing pieces, the client connects to the trackers named in the torrent file, and/or achieves a similar result through the use of distributed hash tables. Then the client connects directly to the peers in order to request pieces and otherwise participate in a swarm. The client may also report progress to trackers, to help the tracker with its peer recommendations.

When the client has all the pieces, it assembles them into a usable form. It may also continue sharing the pieces, elevating its status to that of a seeder rather than an ordinary peer.

File structure

A torrent file contains a list of files and integrity metadata about all the pieces, and optionally contains a large list of trackers.

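A torrent file is a bencoded dictionary with the following keys (the keys in any bencoded dictionary are lexicographically ordered):

announce: the URL of the tracker
info: a dictionary whose keys depend on whether one or more files are being shared:
    files: a list of dictionaries, one per file (only when multiple files are being shared). Each has the following keys:
        length: size of the file in bytes
        path: a list of strings giving subdirectory names, the last of which is the actual file name
    length: size of the file in bytes (only when one file is being shared)
    name: suggested file name (single file) or suggested directory name (multiple files)
    piece length: number of bytes per piece
    pieces: a concatenation of each piece's 20-byte SHA-1 hash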

All strings must be UTF-8 encoded, except for pieces, which contains binary data.
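To make the encoding concrete, here is a minimal bencode decoder sketch. It assumes the four bencode types defined in BEP-0003 (integers as i...e, byte strings as length:bytes, lists as l...e, dictionaries as d...e). The names bdecode and _decode are hypothetical, and the sketch does not enforce key ordering or reject malformed integers.

```python
def bdecode(data: bytes):
    """Decode one complete bencoded value; strings are returned as bytes."""
    value, end = _decode(data, 0)
    if end != len(data):
        raise ValueError("trailing data after bencoded value")
    return value

def _decode(data: bytes, i: int):
    lead = data[i:i + 1]
    if lead == b"i":                      # integer: i<digits>e
        end = data.index(b"e", i)
        return int(data[i + 1:end]), end + 1
    if lead == b"l":                      # list: l<items>e
        i += 1
        items = []
        while data[i:i + 1] != b"e":
            item, i = _decode(data, i)
            items.append(item)
        return items, i + 1
    if lead == b"d":                      # dictionary: d<key><value>...e
        i += 1
        result = {}
        while data[i:i + 1] != b"e":
            key, i = _decode(data, i)
            value, i = _decode(data, i)
            result[key] = value
        return result, i + 1
    if lead.isdigit():                    # byte string: <length>:<bytes>
        colon = data.index(b":", i)
        length = int(data[i:colon])
        start = colon + 1
        return data[start:start + length], start + length
    raise ValueError(f"unexpected byte at offset {i}")

# A tiny, hand-written example; it omits the pieces key for brevity.
sample = (b"d8:announce35:http://tracker.example.com/announce"
          b"4:infod6:lengthi111e4:name7:111.txt12:piece lengthi262144eee")
torrent = bdecode(sample)
```

Returning raw bytes rather than text keeps the decoder safe for the pieces value, which is binary and not valid UTF-8 in general.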

A torrent is uniquely identified by an infohash, a SHA-1 hash calculated over the contents of the info dictionary in bencoded form. Changes to other portions of the torrent do not affect the hash. This hash is used to identify the torrent to other peers via DHT and to the tracker. It is also used in magnet links.
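The infohash calculation can be sketched as follows. The encoder below is a minimal illustration with hypothetical names (bencode, infohash), and the torrent dictionaries are made-up examples. Note one assumption: a real client would normally hash the exact bytes of the info dictionary as they appear in the file, since re-encoding a non-canonical file could yield different bytes.

```python
import hashlib

def bencode(value) -> bytes:
    """Encode ints, str/bytes, lists and dicts in bencode form."""
    if isinstance(value, bool):
        raise TypeError("bencode has no boolean type")
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Keys are byte strings and must be emitted in sorted order.
        raw = {(k.encode("utf-8") if isinstance(k, str) else k): v
               for k, v in value.items()}
        return b"d" + b"".join(bencode(k) + bencode(raw[k]) for k in sorted(raw)) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")

def infohash(torrent: dict) -> str:
    """SHA-1 over the bencoded info dictionary, as a hex string."""
    return hashlib.sha1(bencode(torrent["info"])).hexdigest()

info = {"length": 111, "name": "111.txt", "piece length": 262144,
        "pieces": hashlib.sha1(b"x" * 111).digest()}
a = {"announce": "http://tracker.example.com/announce", "info": info}
b = {"announce": "http://other.example.org/announce", "info": info}
same_hash = infohash(a) == infohash(b)   # only the info dictionary is hashed
```

Swapping the tracker URL leaves the infohash unchanged, while any change inside info (a different name, piece length or piece hash) produces a different torrent identity.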

BitTorrent v2

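The BitTorrent v2 protocol (BEP-0052) introduces a new definition of the torrent file. [2] The basic structure is:

announce: the URL of the tracker
info: a dictionary with the following keys:
    name: suggested file or directory name
    piece length: number of bytes per piece
    meta version: the integer 2
    file tree: a tree of dictionaries mirroring the directory structure; each file entry records its length and its pieces root (the Merkle root hash of that file)
piece layers: a dictionary mapping each pieces root to the concatenated hashes of one layer of that file's Merkle tree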

The new format uses SHA-256 in both the piece-hashing and the infohash, replacing the broken SHA-1 hash. The "btmh" magnet link would contain the full 32-byte hash, while communication with trackers and on the DHT uses the 20-byte truncated version to fit into the old message structure. [2] It is possible to construct a torrent file with only updated new fields for a "v2" torrent, or with both the old and new fields for a "hybrid" format. However, as a torrent would have different infohashes in v1 and v2 networks, two swarms would form, requiring special handling by the client to merge the two. [3]
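The relationship between the full and truncated v2 infohash can be sketched as below. The function names are hypothetical, the input bytes are only a placeholder rather than a valid v2 info dictionary, and the sketch assumes that truncation keeps the first 20 bytes and that the btmh value is a multihash whose prefix 1220 denotes SHA-256 with a 32-byte digest.

```python
import hashlib

def v2_infohash(bencoded_info: bytes) -> bytes:
    """Full 32-byte SHA-256 infohash of an already-bencoded info dictionary."""
    return hashlib.sha256(bencoded_info).digest()

def truncated_infohash(full: bytes) -> bytes:
    """20-byte form for tracker and DHT messages (assumed: the leading bytes)."""
    return full[:20]

def btmh_magnet(full: bytes) -> str:
    """Magnet link carrying the full hash in multihash form."""
    # 0x12 identifies SHA-256 and 0x20 is the digest length (32 bytes).
    return "magnet:?xt=urn:btmh:1220" + full.hex()

# Placeholder bytes standing in for a real bencoded v2 info dictionary.
placeholder_info = b"d12:meta versioni2e4:name4:demoe"
full = v2_infohash(placeholder_info)
short = truncated_infohash(full)
link = btmh_magnet(full)
```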

A core feature of the new format is its application of Merkle trees, allowing the 16 KiB blocks of a piece to be individually verified and re-downloaded. Each file now always occupies a whole number of pieces and has an independent Merkle root hash, so that it is possible to find duplicate files across unrelated torrent files of any piece length. The torrent file size is not reduced (assuming piece size stays the same; v2's tree structure allows larger pieces with fewer ill effects), but the size of the info dictionary required for magnet links is (only in v2-only torrents). [3]
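A per-file Merkle root of this kind can be sketched as follows. This is a simplified reading of BEP-0052, not a reference implementation: it assumes SHA-256 leaves over 16 KiB blocks, a binary tree, and padding of missing leaves with 32 zero bytes up to the next power of two. The name merkle_root is hypothetical.

```python
import hashlib

BLOCK_SIZE = 16 * 1024  # 16 KiB leaf blocks

def merkle_root(data: bytes) -> bytes:
    """Root hash of a binary SHA-256 tree built over 16 KiB blocks of `data`."""
    leaves = [hashlib.sha256(data[i:i + BLOCK_SIZE]).digest()
              for i in range(0, len(data), BLOCK_SIZE)]
    if not leaves:
        raise ValueError("an empty file has no root hash")
    # Pad the leaf layer with zero hashes so its width is a power of two.
    width = 1
    while width < len(leaves):
        width *= 2
    layer = leaves + [bytes(32)] * (width - len(leaves))
    # Hash adjacent pairs until a single root remains.
    while len(layer) > 1:
        layer = [hashlib.sha256(layer[i] + layer[i + 1]).digest()
                 for i in range(0, len(layer), 2)]
    return layer[0]

# The same content gives the same root wherever it appears, which is what
# lets clients spot duplicate files across unrelated torrents.
file_in_torrent_a = b"example payload " * 5000
file_in_torrent_b = b"example payload " * 5000
roots_match = merkle_root(file_in_torrent_a) == merkle_root(file_in_torrent_b)
```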

Extensions

A torrent file can also contain additional metadata defined in extensions to the BitTorrent specification. [4] These are known as "BitTorrent Enhancement Proposals." Examples of such proposals include metadata for stating who created the torrent, and when.

Accepted extensions

These extensions have been deployed in one or more implementations as well as having been proven useful through consistent and widespread use. While they may require minor revisions, they are largely considered to be complete, only awaiting the blessing of Bram Cohen in order to be elevated to the status of Final/Active Process.

Distributed hash tables

BEP-0005 [5] extends BitTorrent to support distributed hash tables, specifically Mainline DHT.

A trackerless torrent dictionary does not have an announce key. Instead, a trackerless torrent has a nodes key:

{
    # ...
    'nodes': [["<host>", <port>], ["<host>", <port>], ...],
    # ...
}

For example,

'nodes':[["127.0.0.1",6881],["your.router.node",4804]],

The specification recommends that nodes "should be set to the K closest nodes in the torrent generating client's routing table. Alternatively, the key could be set to a known good node such as one operated by the person generating the torrent."

Multiple trackers

BEP-0012 [6] extends BitTorrent to support multiple trackers.

A new key, announce-list, is placed in the top-most dictionary (i.e., with announce and info):

{
    # ...
    'announce-list': [['<tracker1-url>'], ['<tracker2-url>']],
    # ...
}
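How a client might turn this key into an ordered list of trackers can be sketched as below. The sketch assumes the tier semantics described in BEP-0012 (each inner list is a tier, tiers are tried in order, and URLs within a tier are shuffled) and falls back to announce when announce-list is absent. The function name and the example URLs are hypothetical.

```python
import random

def tracker_order(torrent: dict, rng=None) -> list:
    """Return tracker URLs in the order a client might try them."""
    rng = rng or random.Random()
    tiers = torrent.get("announce-list")
    if not tiers:
        # No multitracker key: fall back to the single announce URL, if any.
        announce = torrent.get("announce")
        return [announce] if announce else []
    ordered = []
    for tier in tiers:
        urls = list(tier)
        rng.shuffle(urls)       # spread load across trackers of the same tier
        ordered.extend(urls)    # later tiers act as backups
    return ordered

torrent = {
    "announce": "http://tracker1.example.com/announce",
    "announce-list": [
        ["http://tracker1.example.com/announce", "http://tracker2.example.com/announce"],
        ["http://backup.example.org/announce"],
    ],
}
order = tracker_order(torrent, random.Random(0))
```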

HTTP seeds

BEP-0019 [7] is one of two extensions allowing HTTP seeds to be used in BitTorrent.

In BEP-0019, a new key, url-list, is placed in the top-most dictionary. The client uses the links to assemble ordinary HTTP URLs; no server-side support is required. This feature is very commonly used by open source projects offering software downloads. Web seeds allow the client to select among mirror sites intelligently and to use them simultaneously, over P2P or HTTP(S). Doing so reduces the load on the project's servers while maximizing download speed. MirrorBrain automatically generates torrents with web seeds.
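The URL assembly can be sketched roughly as follows. This is a simplified reading of BEP-0019 and should be treated as an assumption: for a single-file torrent the name is appended only when the base URL ends in a slash, and for a multi-file torrent the base is treated as a root folder to which name and each file path are appended. The function name and mirror URL are hypothetical.

```python
from urllib.parse import quote

def web_seed_urls(base: str, info: dict) -> list:
    """Build the plain HTTP URLs a client could request from one url-list entry."""
    if "files" not in info:
        # Single file: the entry is the file URL unless it ends with a slash.
        if base.endswith("/"):
            return [base + quote(info["name"])]
        return [base]
    # Multiple files: root folder + torrent name + path of each file.
    root = base if base.endswith("/") else base + "/"
    urls = []
    for entry in info["files"]:
        parts = [info["name"], *entry["path"]]
        urls.append(root + "/".join(quote(part) for part in parts))
    return urls

multi_info = {
    "name": "directoryName",
    "files": [{"length": 111, "path": ["111.txt"]},
              {"length": 222, "path": ["222.txt"]}],
}
urls = web_seed_urls("https://mirror.example.com/pub/", multi_info)
```

Because the result is an ordinary URL, the client can fetch byte ranges of it with standard HTTP requests and verify them against the piece hashes like any other data.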

Private torrents

BEP-0027 [8] extends BitTorrent to support private torrents.

A new key, private, is placed in the info dictionary. This key's value is 1 if the torrent is private:

{
    # ...
    'info': {
        # ...
        'private': 1,
        # ...
    },
    # ...
}

Private torrents are to be used with a private tracker. Such a tracker restricts access to the torrents it tracks by checking the peer's IP address, refusing to provide a peer list if the address is unknown. The peer itself is usually registered with the tracker via a gated online community; the private tracker typically also keeps statistics of data transfer for use in the community.

Decentralized peer-discovery methods such as DHT, PEX, and LSD are disabled to maintain centralized control. A private torrent can be manually edited to remove the private flag, but doing so will change the info-hash (deterministically), forming a separate "swarm" of peers. On the other hand, changing the tracker list will not change the hash. The flag does not offer true privacy, instead operating as a gentlemen's agreement.
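A cooperating client's handling of the flag can be sketched as a simple check. The function name and the source labels are hypothetical; the sketch only illustrates that honoring the flag is a decision made by the client, which is why it cannot be enforced.

```python
ALL_SOURCES = {"tracker", "dht", "pex", "lsd"}

def allowed_peer_sources(info: dict) -> set:
    """Peer-discovery methods a well-behaved client would use for this torrent."""
    if info.get("private") == 1:
        # Private torrent: learn peers only from the tracker.
        return {"tracker"}
    return set(ALL_SOURCES)

private_info = {"name": "example", "piece length": 262144, "private": 1}
public_info = {"name": "example", "piece length": 262144}
private_sources = allowed_peer_sources(private_info)
public_sources = allowed_peer_sources(public_info)
```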

Draft extensions

These extensions are under consideration for standardization. Most are already widely adopted as de facto standards.

HTTP seeds

BEP-0017 [9] extends BitTorrent to support HTTP seeds, later more commonly termed "web seeds" to be inclusive of HTTPS.

In BEP-0017, a new key, httpseeds, is placed in the top-most dictionary (i.e., with announce and info). This key's value is a list of web addresses where torrent data can be retrieved. Special server support is required. It remains at Draft status.

{
    # ...
    'httpseeds': ['http://www.site1.com/source1.php', 'http://www.site2.com/source2.php'],
    # ...
}

Merkle trees

BEP-0030 [10] extends BitTorrent to support Merkle trees (originally implemented in Tribler). The purpose is to reduce the file size of torrent files, which reduces the burden on those that serve torrent files.

A torrent file using Merkle trees does not have a pieces key in the info dictionary. Instead, such a torrent file has a root hash key in the info dictionary. This key's value is the root hash of the Merkle hash tree:

{
    # ...
    'info': {
        # ...
        'root hash': <binary SHA1 hash>,
        # ...
    },
    # ...
}

BitTorrent v2 uses a different type of Merkle tree. [3]

Examples

Single file

A de-bencoded torrent file (with piece length 256 KiB = 262,144 bytes) for a file debian-503-amd64-CD-1.iso (whose size is 678 301 696 bytes) might look like:

{
    'announce': 'http://bttracker.debian.org:6969/announce',
    'info': {
        'length': 678301696,
        'name': 'debian-503-amd64-CD-1.iso',
        'piece length': 262144,
        'pieces': <binary SHA1 hashes>
    }
}

Note: pieces here would be a 51 KiB value (⌈678301696 / 262144⌉ × 20 bytes = 2588 × 20 bytes = 51,760 bytes).
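The size of the pieces value follows directly from the file length and piece length in this example. A short check, assuming one 20-byte SHA-1 digest per piece as in v1 torrents:

```python
length = 678_301_696       # file size in bytes
piece_length = 262_144     # 256 KiB

num_pieces = -(-length // piece_length)                      # ceiling division
last_piece_size = length - (num_pieces - 1) * piece_length   # the final piece is shorter
pieces_bytes = num_pieces * 20                               # one 20-byte SHA-1 digest per piece
pieces_kib = pieces_bytes / 1024

print(num_pieces, last_piece_size, pieces_bytes, round(pieces_kib, 1))
# prints: 2588 135168 51760 50.5
```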

Multiple files

A de-bencoded torrent file (with 'piece length' 256 KiB = 262144 B) for two files, 111.txt and 222.txt, might look like:

{
    'announce': 'http://tracker.example.com/announce',
    'info': {
        'files': [
            {'length': 111, 'path': ['111.txt']},
            {'length': 222, 'path': ['222.txt']}
        ],
        'name': 'directoryName',
        'piece length': 262144,
        'pieces': <binary SHA1 hashes>
    }
}

Hybrid, multiple files

References

  1. "BEP-0003: The BitTorrent Protocol Specification". Bittorrent.org. Archived from the original on 2019-07-26. Retrieved 2009-10-22.
  2. "BEP-0052: The BitTorrent Protocol Specification v2". Bittorrent.org. Archived from the original on 2020-11-12. Retrieved 2023-02-09.
  3. "BitTorrent v2". Libtorrent. September 2020. Archived from the original on 2020-10-30. Retrieved 2023-02-09.
  4. "BEP-0000: Index of BitTorrent Enhancement Proposals". Bittorrent.org. Archived from the original on 2010-02-11. Retrieved 2009-10-22.
  5. "BEP-0005: DHT Protocol". Bittorrent.org. Archived from the original on 2010-02-13. Retrieved 2009-10-22.
  6. "BEP-0012: Multitracker Metadata Extension". Bittorrent.org. Archived from the original on 2012-12-27. Retrieved 2009-10-22.
  7. "BEP-0019: WebSeed - HTTP/FTP Seeding (GetRight style)". Bittorrent.org.
  8. "BEP-0027: Private Torrents". Bittorrent.org. Archived from the original on 2013-03-24. Retrieved 2009-10-22.
  9. "BEP-0017: HTTP Seeding". Bittorrent.org. Archived from the original on 2013-12-13. Retrieved 2009-10-22.
  10. "BEP-0030: Merkle hash torrent extension". Bittorrent.org. Archived from the original on 2009-09-14. Retrieved 2009-10-22.