113-500: BitTorrent, also referred to simply as torrent, is a communication protocol for peer-to-peer file sharing (P2P), which enables users to distribute data and electronic files over the Internet in a decentralized manner. The protocol is developed and maintained by Rainberry, Inc., and was first released in 2001. To send or receive files, users use a BitTorrent client on their Internet-connected computer; clients are available for
226-565: A University at Buffalo alumnus, designed the protocol in April 2001, and released the first available version on 2 July 2001. Cohen and Ashwin Navin founded BitTorrent, Inc. (later renamed Rainberry, Inc. ) to further develop the technology in 2004. The first release of the BitTorrent client had no search engine and no peer exchange. Up until 2005, the only way to share files was by creating
339-448: A flood-like spreading of a file throughout many peer computer nodes. As more peers join the swarm, the likelihood of a successful download by any particular node increases. Relative to traditional Internet distribution schemes, this permits a significant reduction in the original distributor's hardware and bandwidth resource costs. Distributed downloading protocols in general provide redundancy against system problems, reduce dependence on
452-585: A protocol stack . Internet communication protocols are published by the Internet Engineering Task Force (IETF). The IEEE (Institute of Electrical and Electronics Engineers) handles wired and wireless networking and the International Organization for Standardization (ISO) handles other types. The ITU-T handles telecommunications protocols and formats for the public switched telephone network (PSTN). As
565-402: A tunneling arrangement to accommodate the connection of dissimilar networks. For example, IP may be tunneled across an Asynchronous Transfer Mode (ATM) network. Protocol layering forms the basis of protocol design. It allows the decomposition of single, complex protocols into simpler, cooperating protocols. The protocol layers each solve a distinct class of communication problems. Together,
678-540: A chance to join the swarm. Although "swarming" scales well to tolerate "flash crowds" for popular content, it is less useful for unpopular or niche market content. Peers arriving after the initial rush might find the content unavailable and need to wait for the arrival of a "seed" in order to complete their downloads. The seed arrival, in turn, may take long to happen (this is termed the "seeder promotion problem"). Since maintaining seeds for unpopular content entails high bandwidth and administrative costs, this runs counter to
791-447: A close analogy between protocols and programming languages: protocols are to communication what programming languages are to computations . An alternate formulation states that protocols are to communication what algorithms are to computation . Multiple protocols often describe different aspects of a single communication. A group of protocols designed to work together is known as a protocol suite; when implemented in software they are
904-669: A coarse hierarchy of functional layers defined in the Internet Protocol Suite . The first two cooperating protocols, the Transmission Control Protocol (TCP) and the Internet Protocol (IP) resulted from the decomposition of the original Transmission Control Program, a monolithic communication protocol, into this layered communication suite. The OSI model was developed internationally based on experience with networks that predated
1017-599: A computer environment (such as ease of mechanical parsing and improved bandwidth utilization). Network applications have various methods of encapsulating data. One method very common with Internet protocols is a text-oriented representation that transmits requests and responses as lines of ASCII text, terminated by a newline character (and usually a carriage return character). Examples of protocols that use plain, human-readable text for their commands are FTP (File Transfer Protocol), SMTP (Simple Mail Transfer Protocol), early versions of HTTP (Hypertext Transfer Protocol), and
1130-458: A data file treats the file as a number of identically sized pieces, usually with byte sizes of a power of 2, and typically between 32 KB and 16 MB each. The peer creates a hash for each piece, using the SHA-1 hash function, and records it in the torrent file. Pieces with sizes greater than 512 KB will reduce the size of a torrent file for a very large payload, but are claimed to reduce
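The piece-hashing step described above can be sketched in a few lines. This is a minimal illustration, not the official client's implementation: the function name `piece_hashes`, the 256 KB default piece length, and the zero-filled payload are assumptions chosen for the example.

```python
import hashlib


def piece_hashes(data: bytes, piece_length: int = 256 * 1024) -> list[bytes]:
    """Split data into fixed-size pieces and return the SHA-1 digest of each.

    The final piece may be shorter than piece_length.
    """
    if piece_length <= 0:
        raise ValueError("piece_length must be positive")
    return [
        hashlib.sha1(data[offset:offset + piece_length]).digest()
        for offset in range(0, len(data), piece_length)
    ]


# Hypothetical payload: 1,000,000 zero bytes.
payload = bytes(1_000_000)
hashes = piece_hashes(payload)
print(len(hashes), "pieces,", len(hashes[0]), "bytes per SHA-1 digest")
```

Because each SHA-1 digest is 20 bytes, a larger piece length means fewer digests to record, which is why larger pieces shrink the torrent file for a very large payload.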
1243-439: A de facto standard operating system like Linux does not have this negative grip on its market, because the sources are published and maintained in an open way, thus inviting competition. Bandwidth (computing) In computing , bandwidth is the maximum rate of data transfer across a given path. Bandwidth may be characterized as network bandwidth , data bandwidth , or digital bandwidth . This definition of bandwidth
1356-399: a decentralized network of nodes that route traffic to dynamic trackers. Most BitTorrent clients also use peer exchange (PEX) to gather peers in addition to trackers and DHT. Peer exchange checks with known peers to see if they know of any other peers. With the 3.0.5.0 release of Vuze, all major BitTorrent clients now have compatible peer exchange. Web "seeding" was implemented in 2006 as
1469-592: A draft on their website) that is incompatible with that of Azureus. In 2014, measurement showed concurrent users of Mainline DHT to be from 10 million to 25 million, with a daily churn of at least 10 million. Current versions of the official BitTorrent client, μTorrent , BitComet , Transmission and BitSpirit all share compatibility with Mainline DHT. Both DHT implementations are based on Kademlia . As of version 3.0.5.0, Azureus also supports Mainline DHT in addition to its own distributed database through use of an optional application plugin. This potentially allows
1582-522: A file, but also have a few consequences: As of December 2008, BitTorrent, Inc. was working with Oversi on new Policy Discover Protocols that query the ISP for capabilities and network architecture information. Oversi's ISP hosted NetEnhancer box is designed to "improve peer selection" by helping peers find local nodes, improving download speeds while reducing the loads into and out of the ISP's network. Protocol (computing) A communication protocol
1695-422: A given area, keeping internet speeds higher for all users in general, regardless of whether or not they use the BitTorrent protocol. The file being distributed is divided into segments called pieces . As each peer receives a new piece of the file, it becomes a source (of that piece) for other peers, relieving the original seed from having to send that piece to every computer or user wishing a copy. With BitTorrent,
1808-455: A good connection between them do not exchange data simply because neither of them takes the initiative. To counter these effects, the official BitTorrent client program uses a mechanism called "optimistic unchoking", whereby the client reserves a portion of its available bandwidth for sending pieces to random peers (not necessarily known good partners, or "preferred peers") in hopes of discovering even better partners and to ensure that newcomers get
1921-679: A link is limited by the Shannon–Hartley channel capacity for these communication systems, which is dependent on the bandwidth in hertz and the noise on the channel. The consumed bandwidth in bit/s, corresponds to achieved throughput or goodput , i.e., the average rate of successful data transfer through a communication path. The consumed bandwidth can be affected by technologies such as bandwidth shaping , bandwidth management , bandwidth throttling , bandwidth cap , bandwidth allocation (for example bandwidth allocation protocol and dynamic bandwidth allocation ), etc. A bit stream's bandwidth
2034-407: A list of all the torrents shared by the peers it connected to in the current session (or it can even maintain the list between sessions if instructed). At any time the user can search into that Torrent Collection list for a certain torrent and sort the list by categories. When the user chooses to download a torrent from that list, the .torrent file is automatically searched for (by info-hash value) in
2147-456: A machine rather than a human being. Binary protocols have the advantage of terseness, which translates into speed of transmission and interpretation. Binary protocols have been used in the normative documents describing modern standards like EbXML, HTTP/2, HTTP/3 and EDOC. An interface in UML may also be considered a binary protocol. Getting the data across a network is only part of the problem for
2260-480: A month measured in gigabytes per month. The more accurate phrase used for this meaning of a maximum amount of data transfer each month or given period is monthly data transfer . A similar situation can occur for end-user Internet service providers as well, especially where network capacity is limited (for example in areas with underdeveloped internet connectivity and on wireless networks). Edholm's law , proposed by and named after Phil Edholm in 2004, holds that
2373-457: A networking protocol, the protocol software modules are interfaced with a framework implemented on the machine's operating system. This framework implements the networking functionality of the operating system. When protocol algorithms are expressed in a portable programming language the protocol software may be made operating system independent. The best-known frameworks are the TCP/IP model and
2486-519: a number of very large messages through the network, measuring the end-to-end throughput. As with other bandwidths, the asymptotic bandwidth is measured in multiples of bits per second. Since bandwidth spikes can skew the measurement, carriers often use the 95th percentile method. This method continuously measures bandwidth usage and then removes the top 5 percent. Digital bandwidth may also refer to: multimedia bit rate or average bitrate after multimedia data compression (source coding), defined as
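The 95th percentile method mentioned above can be illustrated with a short sketch. It assumes a nearest-rank definition of the percentile and a list of periodic usage samples; actual carriers may sample and round differently, and the function name `percentile_95` is hypothetical.

```python
def percentile_95(samples: list[float]) -> float:
    """Return the highest sample left after discarding the top 5 percent.

    Uses the nearest-rank definition with integer arithmetic to avoid
    floating-point rounding in the rank computation.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n)
    return ordered[rank - 1]


# Hypothetical usage samples in Mbit/s: steady traffic with one short spike.
usage = [10.0] * 19 + [1000.0]
print(percentile_95(usage))
```

In this example the single spike falls in the discarded top 5 percent, so it does not affect the measured value.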
2599-417: A packet-switched network, rather than this being a service of the network itself. His team was the first to tackle the highly complex problem of providing user applications with a reliable virtual circuit service while using a best-effort service, an early contribution to what would become the Transmission Control Protocol (TCP). Bob Metcalfe and others at Xerox PARC outlined the idea of Ethernet and
2712-554: A protocol. The data received has to be evaluated in the context of the progress of the conversation, so a protocol must include rules describing the context. These kinds of rules are said to express the syntax of the communication. Other rules determine whether the data is meaningful for the context in which the exchange takes place. These kinds of rules are said to express the semantics of the communication. Messages are sent and received on communicating systems to establish communication. Protocols should therefore specify rules governing
2825-565: A reference model for communication standards led to the OSI model , published in 1984. For a period in the late 1980s and early 1990s, engineers, organizations and nations became polarized over the issue of which standard , the OSI model or the Internet protocol suite, would result in the best and most robust computer networks. The information exchanged between devices through a network or other media
2938-399: A response from a range of possible responses predetermined for that particular situation. The specified behavior is typically independent of how it is to be implemented . Communication protocols have to be agreed upon by the parties involved. To reach an agreement, a protocol may be developed into a technical standard . A programming language describes the same for computations, so there is
3051-478: A set of cooperating processes that manipulate shared data to communicate with each other. This communication is governed by well-understood protocols, which can be embedded in the process code itself. In contrast, because there is no shared memory , communicating systems have to communicate with each other using a shared transmission medium . Transmission is not necessarily reliable, and individual systems may use different hardware or operating systems. To implement
3164-401: A single download (for example, a 10 MB file may be transmitted as ten 1 MB pieces or as forty 256 KB pieces). Due to the nature of this approach, the download of any file can be halted at any time and be resumed at a later date, without the loss of previously downloaded information, which in turn makes BitTorrent particularly useful in the transfer of larger files. This also enables
3277-556: A slightly different approach is provided by the BitComet client through its "Torrent Exchange" feature. Whenever two peers using BitComet (with Torrent Exchange enabled) connect to each other they exchange lists of all the torrents (name and info-hash) they have in the Torrent Share storage (torrent files which were previously downloaded and for which the user chose to enable sharing by Torrent Exchange). Thus each client builds up
3390-458: A small text file called a " torrent ", that they would upload to a torrent index site. The first uploader acted as a seed , and downloaders would initially connect as peers . Those who wish to download the file would download the torrent, which their client would use to connect to a tracker which had a list of the IP addresses of other seeds and peers in the swarm. Once a peer completed a download of
3503-456: A standardization process. Such protocols are referred to as de facto standards . De facto standards are common in emerging markets, niche markets, or markets that are monopolized (or oligopolized ). They can hold a market in a very negative grip, especially when used to scare away competition. From a historical perspective, standardization should be seen as a measure to counteract the ill-effects of de facto standards. Positive exceptions exist;
BitTorrent - Misplaced Pages Continue
3616-405: A system is no longer downloading but only uploading data, and terminate its connection by injecting TCP RST (reset flag) packets. Another unofficial feature is an extension to the BitTorrent metadata format proposed by John Hoffman and implemented by several indexing websites. It allows the use of multiple trackers per file, so if one tracker fails, others can continue to support file transfer. It
3729-462: A torrent file could be hosted on one site and tracked by another unrelated site. Private sites operate like public ones except that they may restrict access to registered users and may also keep track of the amount of data each user uploads and downloads, in an attempt to reduce " leeching ". Web search engines allow the discovery of torrent files that are hosted and tracked on other sites; examples include The Pirate Bay and BTDigg . These sites allow
3842-506: A tracker. Azureus was the first BitTorrent client to implement such a system through the distributed hash table (DHT) method. An alternative and incompatible DHT system, known as Mainline DHT , was released in the Mainline BitTorrent client three weeks later (though it had been in development since 2002) and subsequently adopted by the μTorrent , Transmission , rTorrent , KTorrent , BitComet , and Deluge clients. After
3955-430: A transfer mechanism of a protocol is comparable to a central processing unit (CPU). The framework introduces rules that allow the programmer to design cooperating protocols independently of one another. In modern protocol design, protocols are layered to form a protocol stack. Layering is a design principle that divides the protocol design task into smaller steps, each of which accomplishes a specific part, interacting with
4068-417: A variety of computing platforms and operating systems , including an official client . BitTorrent trackers provide a list of files available for transfer and allow the client to find peer users, known as "seeds", who may transfer the files. BitTorrent downloading is considered to be faster than HTTP ("direct downloading") and FTP due to the lack of a central server that could limit bandwidth. BitTorrent
4181-439: A web publisher as creating a direct HTTP download. In addition, it would allow the "web seed" to be disabled if the swarm becomes too popular while still allowing the file to be readily available. This feature has two distinct specifications, both of which are supported by Libtorrent and the 26+ clients that use it. The first was created by John "TheSHAD0W" Hoffman, who created BitTornado. This first specification requires running
4294-410: A web service that serves content by info-hash and piece number, rather than filename. The other specification is created by GetRight authors and can rely on a basic HTTP download space (using byte serving ). In September 2010, a new service named Burnbit was launched which generates a torrent from any URL using webseeding. There are server-side solutions that provide initial seeding of the file from
4407-505: Is a system of rules that allows two or more entities of a communications system to transmit information via any variation of a physical quantity . The protocol defines the rules, syntax , semantics , and synchronization of communication and possible error recovery methods . Protocols may be implemented by hardware , software , or a combination of both. Communicating systems use well-defined formats for exchanging various messages. Each message has an exact meaning intended to elicit
4520-412: Is an alternative to the older single source, multiple mirror sources technique for distributing data, and can work effectively over networks with lower bandwidth . Using the BitTorrent protocol, several basic computers, such as home computers, can replace large servers while efficiently distributing files to many recipients. This lower bandwidth usage also helps prevent large spikes in internet traffic in
4633-1000: Is developing a similar torrent API that will provide the same features, and help bring the torrent community to Web 2.0 standards. Alongside this release is a first PHP application built using the API called PEP, which will parse any Really Simple Syndication (RSS 2.0) feed and automatically create and seed a torrent for each enclosure found in that feed. Since BitTorrent makes up a large proportion of total traffic, some ISPs have chosen to "throttle" (slow down) BitTorrent transfers. For this reason, methods have been developed to disguise BitTorrent traffic in an attempt to thwart these efforts. Protocol header encrypt (PHE) and Message stream encryption/Protocol encryption (MSE/PE) are features of some BitTorrent clients that attempt to make BitTorrent hard to detect and throttle. As of November 2015, Vuze , BitComet , KTorrent , Transmission , Deluge , μTorrent , MooPolice, Halite, qBittorrent , rTorrent , and
4746-453: Is governed by rules and conventions that can be set out in communication protocol specifications. The nature of communication, the actual data exchanged and any state -dependent behaviors, is defined by these specifications. In digital computing systems, the rules can be expressed by algorithms and data structures . Protocols are to communication what algorithms or programming languages are to computations. Operating systems usually contain
4859-409: Is implemented in several clients, such as BitComet , BitTornado, BitTorrent, KTorrent , Transmission , Deluge , μTorrent , rtorrent , Vuze , and Frostwire . Trackers are placed in groups, or tiers, with a tracker randomly chosen from the top tier and tried, moving to the next tier if all the trackers in the top tier fail. Torrents with multiple trackers can decrease the time it takes to download
4972-433: Is in contrast to the field of signal processing, wireless communications, modem data transmission, digital communications , and electronics , in which bandwidth is used to refer to analog signal bandwidth measured in hertz , meaning the frequency range between lowest and highest attainable frequency while meeting a well-defined impairment level in signal power. The actual bit rate that can be achieved depends not only on
5085-402: Is less than or equal to the actual channel capacity minus implementation overhead. The asymptotic bandwidth (formally asymptotic throughput ) for a network is the measure of maximum throughput for a greedy source , for example when the message size (the number of packets per second from a source) approaches close to the maximum amount. Asymptotic bandwidths are usually estimated by sending
5198-426: Is now hashed individually, enabling files in the swarm to be deduplicated, so that if multiple torrents include the same files, but seeders are only seeding the file from some, downloaders of the other torrents can still download the file. In addition, file hashes can be displayed on tracker, torrent indexing services, to search for swarms by searching for hashes of files contained in them. These hashes are different from
5311-562: Is one of the most common protocols for transferring large files, such as digital video files containing TV shows and video clips, or digital audio files. BitTorrent accounted for a third of all internet traffic in 2004, according to a study by Cachelogic. As recently as 2019 BitTorrent remained a significant file sharing protocol according to Sandvine , generating a substantial amount of Internet traffic, with 2.46% of downstream , and 27.58% of upstream traffic, although this share has declined significantly since then. Programmer Bram Cohen ,
5424-462: Is proportional to the average consumed signal bandwidth in hertz (the average spectral bandwidth of the analog signal representing the bit stream) during a studied time interval. Channel bandwidth may be confused with useful data throughput (or goodput). For example, a channel with x bit/s may not necessarily transmit data at x rate, since protocols, encryption, and other factors can add appreciable overhead. For instance, much internet traffic uses
5537-449: Is referred to as communicating sequential processes (CSP). Concurrency can also be modeled using finite state machines , such as Mealy and Moore machines . Mealy and Moore machines are in use as design tools in digital electronics systems encountered in the form of hardware used in telecommunication or electronic devices in general. The literature presents numerous analogies between computer communication and programming. In analogy,
5650-408: Is the synchronization of software for receiving and transmitting messages of communication in proper sequencing. Concurrent programming has traditionally been a topic in operating systems theory texts. Formal verification seems indispensable because concurrent programs are notorious for the hidden and sophisticated bugs they contain. A mathematical approach to the study of concurrency and communication
5763-515: The DHT Network and when found it is downloaded by the querying client which can subsequently create and initiate a downloading task. Users find a torrent of interest on a torrent index site or by using a search engine built into the client, download it, and open it with a BitTorrent client. The client connects to the tracker(s) or seeds specified in the torrent file, from which it receives a list of seeds and peers currently transferring pieces of
5876-423: the OSI model. At the time the Internet was developed, abstraction layering had proven to be a successful design approach for both compiler and operating system design and, given the similarities between programming languages and communication protocols, the originally monolithic networking programs were decomposed into cooperating protocols. This gave rise to the concept of layered protocols which nowadays forms
5989-638: The PARC Universal Packet (PUP) for internetworking. Research in the early 1970s by Bob Kahn and Vint Cerf led to the formulation of the Transmission Control Program (TCP). Its RFC 675 specification was written by Cerf with Yogen Dalal and Carl Sunshine in December 1974, still a monolithic design at this time. The International Network Working Group agreed on a connectionless datagram standard which
6102-547: The finger protocol . Text-based protocols are typically optimized for human parsing and interpretation and are therefore suitable whenever human inspection of protocol contents is required, such as during debugging and during early protocol development design phases. A binary protocol utilizes all values of a byte , as opposed to a text-based protocol which only uses values corresponding to human-readable characters in ASCII encoding. Binary protocols are intended to be read by
6215-492: The internet service provider of users participating in the swarms of files that are under copyright. In some jurisdictions, copyright holders may launch lawsuits against uploaders or downloaders for infringement, and police may arrest suspects in such cases. Various means have been used to promote anonymity. For example, the BitTorrent client Tribler makes available a Tor -like onion network , optionally routing transfers through other peers to obscure which client has requested
6328-478: The transmission control protocol (TCP), which requires a three-way handshake for each transaction. Although in many modern implementations the protocol is efficient, it does add significant overhead compared to simpler protocols. Also, data packets may be lost, which further reduces the useful data throughput. In general, for any effective digital communication, a framing protocol is needed; overhead and effective throughput depends on implementation. Useful throughput
6441-511: The Azureus/Vuze client to reach a bigger swarm. Another idea that has surfaced in Vuze is that of virtual torrents . This idea is based on the distributed tracker approach and is used to describe some web resource. Currently, it is used for instant messaging . It is implemented using a special messaging protocol and requires an appropriate plugin. Anatomic P2P is another approach, which uses
6554-521: The DHT was adopted, a "private" flag – analogous to the broadcast flag – was unofficially introduced, telling clients to restrict the use of decentralized tracking regardless of the user's desires. The flag is intentionally placed in the info section of the torrent so that it cannot be disabled or removed without changing the identity of the torrent. The purpose of this DRM is to prevent torrents from being shared with clients that do not have access to
6667-694: The PSTN and Internet converge , the standards are also being driven towards convergence. The first use of the term protocol in a modern data-commutation context occurs in April 1967 in a memorandum entitled A Protocol for Use in the NPL Data Communications Network. Under the direction of Donald Davies , who pioneered packet switching at the National Physical Laboratory in the United Kingdom, it
6780-468: The ability of BitTorrent clients to download torrent pieces from an HTTP source in addition to the "swarm". The advantage of this feature is that a website may distribute a torrent for a particular file or batch of files and make those files available for download from that same web server; this can simplify long-term seeding and load balancing through the use of existing, cheap, web hosting setups. In theory, this would make using BitTorrent almost as easy for
6893-399: The amount of memory and bandwidth required for digital signals, capable of achieving a data compression ratio of up to 100:1 compared to uncompressed media. In Web hosting service , the term bandwidth is often incorrectly used to describe the amount of data transferred to or from the website or server within a prescribed period of time, for example bandwidth consumption accumulated over
7006-456: the approval or support of a standards organization, which initiates the standardization process. The members of the standards organization agree to adhere to the work result on a voluntary basis. Often the members are in control of large market shares relevant to the protocol and in many cases, standards are enforced by law or the government because they are thought to serve an important public interest, so getting approval can be very important for
7119-677: The authorization of copyright holders, rendering those sites especially vulnerable to lawsuits. A BitTorrent index is a "list of .torrent files , which typically includes descriptions" and information about the torrent's content. Several types of websites support the discovery and distribution of data on the BitTorrent network. Public torrent-hosting sites such as The Pirate Bay allow users to search and download from their collection of torrent files. Users can typically also upload torrent files for content they wish to distribute. Often, these sites also run BitTorrent trackers for their hosted torrent files, but these two functions are not mutually dependent:
7232-409: The bandwidth of telecommunication networks double every 18 months, which has proven to be true since the 1970s. The trend is evident in the cases of Internet , cellular (mobile), wireless LAN and wireless personal area networks . The MOSFET (metal–oxide–semiconductor field-effect transistor) is the most important factor enabling the rapid increase in bandwidth. The MOSFET (MOS transistor)
7345-448: The basis of protocol design. Systems typically do not use a single protocol to handle a transmission. Instead they use a set of cooperating protocols, sometimes called a protocol suite . Some of the best-known protocol suites are TCP/IP , IPX/SPX , X.25 , AX.25 and AppleTalk . The protocols can be arranged based on functionality in groups, for instance, there is a group of transport protocols . The functionalities are mapped onto
7458-461: The beginning, BitTorrent's non-contiguous download methods made it harder to support "streaming playback". In 2014, the client Popcorn Time allowed for streaming of BitTorrent video files. Since then, more and more clients are offering streaming options. The BitTorrent protocol provides no way to index torrent files. As a result, a comparatively small number of websites have hosted a large majority of torrents, many linking to copyrighted works without
7571-459: The client to seek out readily available pieces and download them immediately, rather than halting the download and waiting for the next (and possibly unavailable) piece in line, which typically reduces the overall time of the download. This eventual transition from peers to seeders determines the overall "health" of the file (as determined by the number of times a file is available in its complete form). The distributed nature of BitTorrent can lead to
7684-432: The complete file, it could in turn function as a seed. These files contain metadata about the files to be shared and the trackers which keep track of the other seeds and peers. In 2005, first Vuze and then the BitTorrent client introduced distributed tracking using distributed hash tables which allowed clients to exchange data on swarms directly without the need for a torrent file. In 2006, peer exchange functionality
7797-442: The content being carried: text-based and binary. A text-based protocol or plain text protocol represents its content in human-readable format , often in plain text encoded in a machine-readable encoding such as ASCII or UTF-8 , or in structured text-based formats such as Intel hex format , XML or JSON . The immediate human readability stands in contrast to native binary protocols which have inherent benefits for use in
7910-609: The content provider, much higher redundancy, and much greater resistance to abuse or to " flash crowds " than regular server software . However, this protection, theoretically, comes at a cost: downloads can take time to rise to full speed because it may take time for enough peer connections to be established, and it may take time for a node to receive sufficient data to become an effective uploader. This contrasts with regular downloads (such as from an HTTP server, for example) that, while more vulnerable to overload and abuse, rise to full speed very quickly, and maintain this speed throughout. In
8023-586: The data. The exit node would be visible to peers in a swarm, but the Tribler organization provides exit nodes. One advantage of Tribler is that clearnet torrents can be downloaded with only a small decrease in download speed from one "hop" of routing. I2P provides a similar anonymity layer, although in that case one can only download torrents that have been uploaded to the I2P network. The BitTorrent client Vuze allows users who are not concerned about anonymity to take clearnet torrents, and make them available on
8136-493: The developers, and as such, v2 uses SHA-256 . To ensure backwards compatibility, the v2 .torrent file format supports a hybrid mode where the torrents are hashed through both the new method and the old method, with the intent that the files will be shared with peers on both v1 and v2 swarms. Another update to the specification is adding a hash tree to speed up time from adding a torrent to downloading files, and to allow more granular checks for file corruption. In addition, each file
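The idea of a hash tree built with SHA-256 can be sketched as follows. This is a simplified illustration, not the exact tree layout of the v2 specification: the block size, the padding of odd levels with an all-zero hash, and the function name `merkle_root` are assumptions made for this example.

```python
import hashlib
from typing import List


def merkle_root(blocks: List[bytes]) -> bytes:
    """Return the root of a binary SHA-256 hash tree over the given blocks."""
    if not blocks:
        raise ValueError("at least one block is required")
    # Leaf level: one hash per data block.
    level = [hashlib.sha256(block).digest() for block in blocks]
    while len(level) > 1:
        if len(level) % 2:
            # Assumption for this sketch: pad an odd level with a zero hash.
            level.append(bytes(32))
        # Each parent is the hash of its two children concatenated.
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]


blocks = [b"block-0", b"block-1", b"block-2"]
root = merkle_root(blocks)
tampered_root = merkle_root([b"block-0", b"block-X", b"block-2"])
print(root.hex())
print(root != tampered_root)
```

Because every parent depends on its children, a change in any single block changes the root, and a corrupted block can be located by comparing hashes along one path of the tree rather than rehashing everything.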
8249-463: The download taste of the user, and recommend additional content. In May 2007, researchers at Cornell University published a paper proposing a new approach to searching a peer-to-peer network for inexact strings, which could replace the functionality of a central indexing site. A year later, the same team implemented the system as a plugin for Vuze called Cubit and published a follow-up paper reporting its success. A somewhat similar facility but with
8362-422: The efficiency of the protocol. When another peer later receives a particular piece, the hash of the piece is compared to the recorded hash to test that the piece is error-free. Peers that provide a complete file are called seeders, and the peer providing the initial copy is called the initial seeder. The exact information contained in the torrent file depends on the version of the BitTorrent protocol. By convention,
8475-509: The feed for new items, and use them to start the download. Then, I could find a trusted publisher of an Alias RSS feed, and "subscribe" to all new episodes of the show, which would then start downloading automatically – like the "season pass" feature of the TiVo . The RSS feed will track the content, while BitTorrent ensures content integrity with cryptographic hashing of all data, so feed subscribers will receive uncorrupted content. One of
8588-673: The field of computer networking, it has been historically criticized by many researchers as abstracting the protocol stack in this way may cause a higher layer to duplicate the functionality of a lower layer, a prime example being error recovery on both a per-link basis and an end-to-end basis. Commonly recurring problems in the design and implementation of communication protocols can be addressed by software design patterns . Popular formal methods of describing communication syntax are Abstract Syntax Notation One (an ISO standard) and augmented Backus–Naur form (an IETF standard). Finite-state machine models are used to formally describe
8701-734: The file(s). The client connects to those peers to obtain the various pieces. If the swarm contains only the initial seeder, the client connects directly to it, and begins to request pieces. Clients incorporate mechanisms to optimize their download and upload rates. The effectiveness of this data exchange depends largely on the policies that clients use to determine to whom to send data. Clients may prefer to send data to peers that send data back to them (a "tit for tat" exchange scheme), which encourages fair trading. However, strict policies often result in suboptimal situations, such as when newly joined peers are unable to receive any data because they do not have any pieces yet to trade themselves or when two peers with
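A strict "tit for tat" policy can be sketched as ranking peers by how much data they have sent and uploading only to the top few. The function name `choose_unchoked`, the peer names, and the slot count are hypothetical; real clients use more elaborate policies.

```python
from typing import Dict, List


def choose_unchoked(received_bytes: Dict[str, int], slots: int) -> List[str]:
    """Pick the peers to upload to: those that sent us the most data.

    Ties are broken by peer name so the result is deterministic.
    """
    ranked = sorted(received_bytes, key=lambda peer: (-received_bytes[peer], peer))
    return ranked[:slots]


# Hypothetical swarm state: bytes received from each peer so far.
received = {"peer-a": 500, "peer-b": 0, "peer-c": 900, "peer-d": 100}
selected = choose_unchoked(received, slots=2)
print(selected)
```

In this sketch `peer-b`, which has just joined and has sent nothing, is never selected. That is the suboptimal situation described above: under a strict policy a newcomer with no pieces to trade receives no data.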
8814-444: The first and popular software clients (free and open source) for broadcatching is Miro. Other free software clients such as PenguinTV and KatchTV now also support broadcatching. The BitTorrent web service MoveDigital added the ability to make torrents available to any web application capable of parsing XML through its standard REST-based interface in 2006, though this has since been discontinued. Additionally, Torrenthut
8927-574: The goals of publishers that value BitTorrent as a cheap alternative to a client-server approach. This occurs on a huge scale; measurements have shown that 38% of all new torrents become unavailable within the first month. A strategy adopted by many publishers which significantly increases availability of unpopular content consists of bundling multiple files in a single swarm. More sophisticated solutions have also been proposed; generally, these use cross-torrent mechanisms through which multiple torrents can cooperate to better share content. The peer distributing
9040-426: The horizontal message flows (and protocols) are between systems. The message flows are governed by rules, and data formats specified by protocols. The blue lines mark the boundaries of the (horizontal) protocol layers. The software supporting protocols has a layered organization and its relationship with protocol layering is shown in figure 5. To send a message on system A, the top-layer software module interacts with
9153-491: The I2P network. Most BitTorrent clients are not designed to provide anonymity when used over Tor, and there is some debate as to whether torrenting over Tor acts as a drag on the network. "Private" torrent websites are usually invitation-only and require members to participate in uploading, but have the downside of a single centralized point of failure. Oink's Pink Palace and What.cd are examples of private torrent sites which have been shut down. Seedbox services download
9266-504: The intention of shaping traffic in a protocol-agnostic manner. Questions about the ethics and legality of Comcast's behavior have led to renewed debate about net neutrality in the United States . In general, although encryption can make it difficult to determine what is being shared, BitTorrent is vulnerable to traffic analysis . Thus, even with MSE/PE, it may be possible for an ISP to recognize BitTorrent and also to determine that
9379-643: The internet as a reference model for general communication with much stricter rules of protocol interaction and rigorous layering. Typically, application software is built upon a robust data transport layer. Underlying this transport layer is a datagram delivery and routing mechanism that is typically connectionless in the Internet. Packet relaying across networks happens over another layer that involves only network link technologies, which are often specific to certain physical layer technologies, such as Ethernet . Layering provides opportunities to exchange technologies when needed, for example, protocols are often stacked in
9492-470: The latest official BitTorrent client (v6) support MSE/PE encryption. In August 2007, Comcast was preventing BitTorrent seeding by monitoring and interfering with the communication between peers. Protection against these efforts is provided by proxying the client-tracker traffic via an encrypted tunnel to a point outside of the Comcast network. In 2008, Comcast called a "truce" with BitTorrent, Inc. with
9605-476: The layers make up a layering scheme or model. Computations deal with algorithms and data; communication involves protocols and messages; so the analog of a data flow diagram is some kind of message flow diagram. To visualize protocol layering and protocol suites, a diagram of the message flows in and between two systems, A and B, is shown in figure 3. The systems, A and B, both make use of the same protocol suite. The vertical flows (and protocols) are in-system and
9718-427: The layers, each layer solving a distinct class of problems relating to, for instance: application-, transport-, internet- and network interface-functions. To transmit a message, a protocol has to be selected from each layer. The selection of the next protocol is accomplished by extending the message with a protocol selector for each layer. There are two types of communication protocols, based on their representation of
9831-402: The module directly below it and hands over the message to be encapsulated. The lower module fills in the header data in accordance with the protocol it implements and interacts with the bottom module which sends the message over the communications channel to the bottom module of system B. On the receiving system B the reverse happens, so ultimately the message gets delivered in its original form to
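The hand-over between modules can be sketched as each layer prepending its own header on the sending side and removing it on the receiving side. The three layer names and the bracketed text headers are placeholders invented for this example; real protocol headers are structured and carry far more information.

```python
# Hypothetical layers below the application, listed from top to bottom.
LAYERS = ["transport", "internet", "link"]


def send(message: bytes) -> bytes:
    """Encapsulate a message: each lower module wraps what it is handed."""
    frame = message
    for layer in LAYERS:  # top to bottom
        header = f"[{layer}]".encode("ascii")
        frame = header + frame
    return frame


def receive(frame: bytes) -> bytes:
    """Reverse the process: each module strips its own header."""
    for layer in reversed(LAYERS):  # bottom to top
        header = f"[{layer}]".encode("ascii")
        if not frame.startswith(header):
            raise ValueError(f"missing {layer} header")
        frame = frame[len(header):]
    return frame


wire = send(b"hello")
print(wire)           # outermost header belongs to the lowest layer
print(receive(wire))  # the original message is recovered
```

The outermost header on the wire belongs to the bottom module, so the receiving system peels the headers in the opposite order to the one in which they were added.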
9944-603: The name of a torrent file has the suffix .torrent. Torrent files use the Bencode file format, and contain an "announce" section, which specifies the URL of the tracker, and an "info" section, containing (suggested) names for the files, their lengths, the piece length used, and a SHA-1 hash code for each piece, all of which are used by clients to verify the integrity of the data they receive. Though SHA-1 has shown signs of cryptographic weakness, Bram Cohen did not initially consider
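The structure of a Bencode document can be illustrated with a minimal decoder sketch. The sample below is a made-up, heavily reduced torrent-like dictionary (the tracker URL and file name are placeholders, and the per-piece hashes are omitted); the function name `bdecode` is likewise an assumption for this example.

```python
def bdecode(data: bytes, i: int = 0):
    """Decode one bencoded value starting at index i.

    Returns a (value, next_index) pair. Integers look like i42e, byte
    strings like 4:spam, lists like l...e and dictionaries like d...e.
    """
    c = data[i:i + 1]
    if c == b"i":
        end = data.index(b"e", i)
        return int(data[i + 1:end]), end + 1
    if c == b"l":
        i += 1
        items = []
        while data[i:i + 1] != b"e":
            item, i = bdecode(data, i)
            items.append(item)
        return items, i + 1
    if c == b"d":
        i += 1
        result = {}
        while data[i:i + 1] != b"e":
            key, i = bdecode(data, i)
            value, i = bdecode(data, i)
            result[key] = value
        return result, i + 1
    if c.isdigit():
        colon = data.index(b":", i)
        length = int(data[i:colon])
        start = colon + 1
        return data[start:start + length], start + length
    raise ValueError(f"invalid bencode at offset {i}")


# Hypothetical sample with an "announce" URL and a small "info" dictionary.
sample = (
    b"d8:announce26:http://tracker.example/ann"
    b"4:infod6:lengthi1024e4:name8:file.bin12:piece lengthi512eee"
)
torrent, end = bdecode(sample)
print(torrent[b"announce"])
print(torrent[b"info"][b"piece length"])
```

Keys and string values come back as bytes because Bencode strings are byte strings; a real torrent file would additionally carry the concatenated piece hashes inside the "info" dictionary.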
10057-476: The original distributor, and provide sources for the file which are generally transient and therefore there is no single point of failure as in one way server-client transfers. Though both ultimately transfer files over a network, a BitTorrent download differs from a one way server-client download (as is typical with an HTTP or FTP request, for example) in several fundamental ways: Taken together, these differences allow BitTorrent to achieve much lower cost to
10170-415: The other parts of the protocol only in a small number of well-defined ways. Layering allows the parts of a protocol to be designed and tested without a combinatorial explosion of cases, keeping each design relatively simple. The communication protocols in use on the Internet are designed to function in diverse and complex settings. Internet protocols are designed for simplicity and modularity and fit into
10283-417: The pieces received at other nodes. If a node starts with an authentic copy of the torrent descriptor, it can verify the authenticity of the entire file it receives. Pieces are typically downloaded non-sequentially, and are rearranged into the correct order by the BitTorrent client, which monitors which pieces it needs, and which pieces it has and can upload to other peers. Pieces are of the same size throughout
10396-457: The possible interactions of the protocol; communicating finite-state machines are used for the same purpose. For communication to occur, protocols have to be selected. The rules can be expressed by algorithms and data structures. Hardware and operating system independence is enhanced by expressing the algorithms in a portable programming language. Source independence of the specification provides wider interoperability. Protocol standards are commonly created by obtaining
10509-401: The protocol, creating incompatible versions on their networks. In some cases, this was deliberately done to discourage users from using equipment from other manufacturers. There are more than 50 variants of the original bi-sync protocol. One can assume that a standard would have prevented at least some of this from happening. In some cases, protocols gain market dominance without going through
10622-539: The protocol. The need for protocol standards can be shown by looking at what happened to the Binary Synchronous Communications (BSC) protocol invented by IBM . BSC is an early link-level protocol used to connect two separate nodes. It was originally not intended to be used in a multinode network, but doing so revealed several deficiencies of the protocol. In the absence of standardization, manufacturers and organizations felt free to enhance
10735-439: The risk big enough for a backward-incompatible change to, for example, SHA-3. As of BitTorrent v2, the hash function has been updated to SHA-256. In the early days, torrent files were typically published to torrent index websites, and registered with at least one tracker. The tracker maintained lists of the clients currently connected to the swarm. Alternatively, in a trackerless system (decentralized tracking) every peer acts as
10848-449: The signal bandwidth but also on the noise on the channel. The term bandwidth sometimes defines the net bit rate, peak bit rate, information rate, or physical layer useful bit rate, channel capacity, or the maximum throughput of a logical or physical communication path in a digital communication system. For example, bandwidth tests measure the maximum throughput of a computer network. The maximum rate that can be sustained on
10961-440: The task of distributing the file is shared by those who want it; it is entirely possible for the seed to send only a single copy of the file itself and eventually distribute to an unlimited number of peers. Each piece is protected by a cryptographic hash contained in the torrent descriptor. This ensures that any modification of the piece can be reliably detected, and thus prevents both accidental and malicious modifications of any of
11074-514: The top module of system B. Program translation is divided into subproblems. As a result, the translation software is layered as well, allowing the software layers to be designed independently. The same approach can be seen in the TCP/IP layering. The modules below the application layer are generally considered part of the operating system. Passing data between these modules is much less expensive than passing data between an application program and
11187-474: The torrent files first to the company's servers, allowing the user to directly download the file from there. One's IP address would be visible to the seedbox provider, but not to third parties. Virtual private networks encrypt transfers, and substitute a different IP address for the user's, so that anyone monitoring a torrent swarm will only see that address. On 2 May 2005, Azureus 2.3.0.0 (now known as Vuze)
11300-450: The total amount of data divided by the playback time. Due to the impractically high bandwidth requirements of uncompressed digital media , the required multimedia bandwidth can be significantly reduced with data compression. The most widely used data compression technique for media bandwidth reduction is the discrete cosine transform (DCT), which was first proposed by Nasir Ahmed in the early 1970s. DCT compression significantly reduces
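The "total amount of data divided by the playback time" calculation can be worked through in a few lines. The file size and duration below are made-up figures, and the function name is an assumption for this example.

```python
def average_bitrate_bps(total_bytes: int, playback_seconds: float) -> float:
    """Average bandwidth in bits per second: total data over playback time."""
    if playback_seconds <= 0:
        raise ValueError("playback time must be positive")
    return total_bytes * 8 / playback_seconds


# Hypothetical clip: 90,000,000 bytes played back over 10 minutes.
rate = average_bitrate_bps(90_000_000, 600)
print(rate / 1_000_000, "Mbit/s")
```

For this hypothetical clip the average works out to 1.2 Mbit/s; a more heavily compressed version of the same clip would have fewer total bytes and therefore a proportionally lower average.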
11413-643: The tracker. The flag was requested for inclusion in the official specification in August 2008, but has not been accepted yet. Clients that have ignored the private flag were banned by many trackers, discouraging the practice. BitTorrent does not, on its own, offer its users anonymity. One can usually see the IP addresses of all peers in a swarm in one's own client or firewall program. This may expose users with insecure systems to attacks. In some countries, copyright organizations scrape lists of peers, and send takedown notices to
11526-506: The transmission. In general, much of the following should be addressed: Systems engineering principles have been applied to create a set of common network protocol design principles. The design of complex protocols often involves decomposition into simpler, cooperating protocols. Such a set of cooperating protocols is sometimes called a protocol family or a protocol suite, within a conceptual framework. Communicating systems operate concurrently. An important aspect of concurrent programming
11639-406: The transport layer. The boundary between the application layer and the transport layer is called the operating system boundary. Strictly adhering to a layered model, a practice known as strict layering, is not always the best approach to networking. Strict layering can have a negative impact on the performance of an implementation. Although the use of protocol layering is today ubiquitous across
11752-547: The user to ask for content meeting specific criteria (such as containing a given word or phrase) and retrieve a list of links to torrent files matching those criteria. This list can often be sorted with respect to several criteria, relevance (seeders to leechers ratio) being one of the most popular and useful (due to the way the protocol behaves, the download bandwidth achievable is very sensitive to this value). Metasearch engines allow one to search several BitTorrent indices and search engines at once. The Tribler BitTorrent client
11865-448: The usual SHA-256 hash of files and can be obtained using tools. Magnet links for v2 also support a hybrid mode to ensure support for legacy clients. The BitTorrent protocol can be used to reduce the server and network impact of distributing large files. Rather than downloading a file from a single source server, the BitTorrent protocol allows users to join a "swarm" of hosts to upload and download from each other simultaneously. The protocol
11978-659: The web server via the standard BitTorrent protocol and, when the number of external seeders reaches a limit, they stop serving the file from the original source. A technique called broadcatching combines RSS feeds with the BitTorrent protocol to create a content delivery system, further simplifying and automating content distribution. Steve Gillmor explained the concept in a column for Ziff-Davis in December 2003. The discussion spread quickly among bloggers (Ernest Miller, Chris Pirillo, etc.). In an article entitled Broadcatching with BitTorrent, Scott Raymond explained: I want RSS feeds of BitTorrent files. A script would periodically check
12091-403: Was added, allowing clients to add peers based on the data found on connected nodes. In 2017, BitTorrent, Inc. released the BitTorrent v2 protocol specification. BitTorrent v2 is intended to work seamlessly with previous versions of the BitTorrent protocol. The main reason for the update was that the old cryptographic hash function, SHA-1, is no longer considered safe from malicious attacks by
12204-426: Was among the first to incorporate built-in search capabilities. With Tribler, users can find .torrent files held by random peers and taste buddies. It adds such an ability to the BitTorrent protocol using a gossip protocol , somewhat similar to the eXeem network which was shut down in 2005. The software includes the ability to recommend content as well. After a dozen downloads, the Tribler software can roughly estimate
12317-470: Was first implemented in 1970. The NCP interface allowed application software to connect across the ARPANET by implementing higher-level communication protocols, an early example of the protocol layering concept. The CYCLADES network, designed by Louis Pouzin in the early 1970s was the first to implement the end-to-end principle , and make the hosts responsible for the reliable delivery of data on
12430-588: Was presented to the CCITT in 1975 but was not adopted by the CCITT nor by the ARPANET. Separate international research, particularly the work of Rémi Després , contributed to the development of the X.25 standard, based on virtual circuits , which was adopted by the CCITT in 1976. Computer manufacturers developed proprietary protocols such as IBM's Systems Network Architecture (SNA), Digital Equipment Corporation's DECnet and Xerox Network Systems . TCP software
12543-524: Was redesigned as a modular protocol stack, referred to as TCP/IP. This was installed on SATNET in 1982 and on the ARPANET in January 1983. The development of a complete Internet protocol suite by 1989, as outlined in RFC 1122 and RFC 1123 , laid the foundation for the growth of TCP/IP as a comprehensive protocol suite as the core component of the emerging Internet . International work on
12656-436: Was released, utilizing a distributed database system. This system is a distributed hash table implementation which allows the client to use torrents that do not have a working BitTorrent tracker . A bootstrap server is instead utilized. The following month, BitTorrent, Inc. released version 4.2.0 of the Mainline BitTorrent client, which supported an alternative DHT implementation (popularly known as " Mainline DHT ", outlined in
12769-514: Was written by Roger Scantlebury and Keith Bartlett for the NPL network . On the ARPANET , the starting point for host-to-host communication in 1969 was the 1822 protocol , written by Bob Kahn , which defined the transmission of messages to an IMP. The Network Control Program (NCP) for the ARPANET, developed by Steve Crocker and other graduate students including Jon Postel and Vint Cerf ,