How does the BitTorrent system really work?

Saptashwa May 27, 2011

I already know the things that every website will list, like seeds, peers, et al. But I want to know the innards.

How does a tiny .torrent file contain all the tracker information needed to download a (theoretically) unlimited number of GBs of data? When we seed a file, does it get uploaded to the website (e.g. TPB)? If so, what's the use of uploading it to the same website multiple times? Is it so that the file stays available when some part is offline for maintenance (sometimes only 10 seeds provide more speed than 100)? Or is it so that several parts of the data can be downloaded in parallel?

I really wish to know how the guys behind-the-scenes do it.


  1. Kavi
    June 19, 2011 at 6:48 pm

    How does the torrent client actually "know" when a file is complete?

  2. Sahil Dave
    May 31, 2011 at 8:33 am

    An addition to this question from my side***

    If I am a seeder and, for some reason, the 'chunk' I am seeding is infected by a virus, a worm, or anything else, would the data uploaded by ME be infected?

    • Mike
      May 31, 2011 at 5:00 pm

      If you "add content/code/bits" (harmful or not), each piece containing this content would be bigger than the specified piece length and would therefore be thrown away.

      As for replacing content/code/bits, I don't recall how the validation of pieces is done, but in general you have to look at it this way: as with MD5 collisions, it is possible to infect an existing download without failing the final hash comparison, but the malware/virus would have to be crafted for this one specific torrent to cause the collision.

      So randomly infecting torrents with malware you got somewhere on the internet is highly unlikely (almost impossible).
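For the curious coder, here is a minimal sketch of how per-piece validation can work. In the original BitTorrent format the .torrent file stores a 20-byte SHA-1 hash for every piece, and a client discards any piece whose hash does not match. The tiny piece length and the sample data below are made up purely for illustration, and `piece_hashes` / `verify_piece` are hypothetical helper names, not part of any real client.

```python
import hashlib

PIECE_LEN = 16  # unrealistically small, just to keep the demo readable

def piece_hashes(data: bytes, piece_len: int) -> bytes:
    """Concatenate the SHA-1 digest of every piece, as a torrent creator would."""
    return b"".join(
        hashlib.sha1(data[i:i + piece_len]).digest()
        for i in range(0, len(data), piece_len)
    )

def verify_piece(index: int, piece: bytes, pieces_field: bytes) -> bool:
    """Compare a downloaded piece against the 20-byte hash stored for it."""
    expected = pieces_field[index * 20:(index + 1) * 20]
    return hashlib.sha1(piece).digest() == expected

original = b"The quick brown fox jumps over the lazy dog."
pieces_field = piece_hashes(original, PIECE_LEN)

good_piece = original[0:PIECE_LEN]
tampered_piece = b"X" + good_piece[1:]   # same length, one byte replaced

good_ok = verify_piece(0, good_piece, pieces_field)
tampered_ok = verify_piece(0, tampered_piece, pieces_field)

# A download counts as complete once every piece has passed its check.
chunks = [original[i:i + PIECE_LEN] for i in range(0, len(original), PIECE_LEN)]
complete = all(verify_piece(i, c, pieces_field) for i, c in enumerate(chunks))
print(good_ok, tampered_ok, complete)
```

So a seeder whose local copy gets modified would send pieces that fail this check on the receiving side, unless the modification was engineered to produce the same hash.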

  3. Mike
    May 30, 2011 at 6:05 pm

    There is an eBook called "An IT Forensic Examination of P2P Clients". Maybe you can find it somewhere on the Internet - the official source is locked down for Law Enforcement nowadays. It has an excellent description and overview about P2P networks.

    Otherwise it's pretty much like James said - a torrent file contains the following bencoded data:
    - list of trackers to contact/ask for the file (announce, announce-list)
    - name of the file (info name)
    - number of bytes per piece (piece length)
    - SHA-1 hashes of all pieces, 20 bytes each, which also gives the number of pieces (pieces)
    - the file size (length)

    The sizes are tied together: the number of pieces is the file length divided by the piece length, rounded up. Knowing two of the values therefore gives you the third, at least roughly, since the last piece is usually shorter than the others.
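As a rough illustration of the fields listed above, the sketch below decodes a hand-built, single-file torrent and checks how the sizes relate. The decoder is a minimal one written for this example (no error handling), `bstr` is a hypothetical helper, and the tracker URL, file name, and sizes are invented.

```python
import math

def bdecode(data: bytes, i: int = 0):
    """Decode one bencoded value at offset i; return (value, next_offset)."""
    kind = data[i:i + 1]
    if kind == b"i":                       # integer: i<digits>e
        end = data.index(b"e", i)
        return int(data[i + 1:end]), end + 1
    if kind in (b"l", b"d"):               # list: l...e, dictionary: d...e
        i += 1
        items = []
        while data[i:i + 1] != b"e":
            item, i = bdecode(data, i)
            items.append(item)
        if kind == b"l":
            return items, i + 1
        # A dictionary is stored as alternating keys and values.
        return dict(zip(items[0::2], items[1::2])), i + 1
    colon = data.index(b":", i)            # byte string: <length>:<bytes>
    size = int(data[i:colon])
    start = colon + 1
    return data[start:start + size], start + size

def bstr(s: bytes) -> bytes:
    """Hypothetical helper: bencode a byte string."""
    return str(len(s)).encode() + b":" + s

# Invented example: a 1,000,000-byte file split into 256 KiB pieces.
sample = (
    b"d" + bstr(b"announce") + bstr(b"http://tracker.example/announce")
    + bstr(b"info") + b"d"
    + bstr(b"length") + b"i1000000e"
    + bstr(b"name") + bstr(b"demo.iso")
    + bstr(b"piece length") + b"i262144e"
    + bstr(b"pieces") + bstr(bytes(80))    # 4 placeholder hashes of 20 bytes
    + b"e" + b"e"
)

meta, _ = bdecode(sample)
info = meta[b"info"]
num_pieces = len(info[b"pieces"]) // 20
expected_pieces = math.ceil(info[b"length"] / info[b"piece length"])
last_piece_len = info[b"length"] - (num_pieces - 1) * info[b"piece length"]
print(num_pieces, expected_pieces, last_piece_len)
```

The whole description fits in a few hundred bytes plus 20 bytes per piece, which is why the .torrent file stays tiny no matter how large the payload is.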

    A torrent client contacts the trackers listed in the file and asks each one for a list of clients known to have the file with that info hash (#hash). Those clients can be seeders (users who have 100% of the pieces) or leechers/downloaders (who only have some of the pieces). The list contains the users' IP addresses, which your client then uses to set up the peer-to-peer connections.
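To make the "#hash" part concrete: the identifier sent to the tracker is the SHA-1 of the bencoded info dictionary. The sketch below computes it and builds an HTTP announce URL without sending anything. The tracker URL, the peer ID, and the torrent contents are placeholders, and `bencode` is a minimal encoder written for this example; a real client would normally hash the exact bytes of the info dictionary as they appear in the .torrent file rather than re-encode them.

```python
import hashlib
from urllib.parse import urlencode

def bencode(value) -> bytes:
    """Minimal bencoder for ints, byte strings, lists and dicts."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must be written in sorted order.
        body = b"".join(bencode(k) + bencode(v) for k, v in sorted(value.items()))
        return b"d" + body + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")

# Invented info dictionary for a single 1,000,000-byte file.
info = {
    b"length": 1000000,
    b"name": b"demo.iso",
    b"piece length": 262144,
    b"pieces": bytes(80),          # placeholder for 4 SHA-1 hashes
}
info_hash = hashlib.sha1(bencode(info)).digest()

peer_id = b"-XX0001-000000000000"  # hypothetical 20-byte client ID
params = {
    "info_hash": info_hash,        # which torrent we are asking about
    "peer_id": peer_id,            # who we are
    "port": 6881,                  # where other peers can reach us
    "uploaded": 0,
    "downloaded": 0,
    "left": info[b"length"],       # bytes still missing
}
announce_url = "http://tracker.example/announce?" + urlencode(params)
print(info_hash.hex())
print(announce_url)
```

The tracker's reply is itself bencoded and carries the list of peers for that info hash.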

    Another technique is DHT (distributed hash table): instead of trackers (a fully decentralized network) or in addition to them (partially decentralized), each peer is asked for sources offering the file with that #hash.
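A very rough sketch of the core idea behind the DHT lookup: the mainline BitTorrent DHT is based on Kademlia, where node IDs and info hashes share the same 160-bit space and "closeness" is the XOR of two IDs. A lookup keeps asking the nodes closest to the info hash until it reaches the ones that store the peer list. The node names below are invented and the IDs are derived from them only to get deterministic sample values; no networking is shown.

```python
import hashlib

def xor_distance(a: bytes, b: bytes) -> int:
    """Kademlia-style distance: XOR of the two IDs, read as an integer."""
    return int.from_bytes(a, "big") ^ int.from_bytes(b, "big")

def closest_nodes(target: bytes, node_ids: list, k: int = 2) -> list:
    """Return the k node IDs nearest to the target under the XOR metric."""
    return sorted(node_ids, key=lambda n: xor_distance(n, target))[:k]

# Invented 160-bit IDs for the torrent and for four DHT nodes.
info_hash = hashlib.sha1(b"demo torrent").digest()
nodes = [hashlib.sha1(name).digest()
         for name in (b"node-a", b"node-b", b"node-c", b"node-d")]

nearest = closest_nodes(info_hash, nodes)
for node_id in nearest:
    print(node_id.hex())
```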

    *** This is just as much as I know, I don't give any guarantee on accuracy.

    • Mike
      May 30, 2011 at 6:10 pm

      I forgot - as soon as the client asks for the file at the tracker or another client it is automatically added as a source too (you "announce" yourself as a possible source).

  4. James Bruce
    May 27, 2011 at 7:37 pm

    The torrent file itself contains a list of trackers and a hash of the file or files you're downloading/uploading. When you "upload" a torrent for the first time, i.e. create a new one, the .torrent file gets uploaded so the tracker knows of the file's existence. You don't actually upload the file you are seeding. Now when you seed it, your torrent software reports to the trackers and says "hey, I've got that file, point them to me". When someone downloads a torrent, their client checks in with the tracker, sees that you have the file, then connects to you directly and downloads it.

    I don't quite understand the other parts of your question, I'm afraid. I'm guessing your question about parts refers to the fact that any file download is split into chunks of however many KB. Downloading means that you request a particular chunk from a user, and if they have it, they give it to you. If their download is still incomplete for that chunk, they say no and you ask the next user. Of course, all this happens for multiple chunks at the same time, with as many users as you can handle, which is why BitTorrent can be so fast when a file is popular.
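The chunk requests described above can be sketched as a small planning step: for each chunk you still need, pick a user who has it, and spread the requests so several users serve you at once. This is a simplification under invented names and data; real clients use more elaborate strategies (for example, preferring the rarest chunks first), and `plan_requests` is a hypothetical function.

```python
def plan_requests(needed: set, peers: dict) -> dict:
    """Assign each needed chunk to the least-busy peer that has it."""
    load = {name: 0 for name in peers}
    plan = {}
    for chunk in sorted(needed):
        candidates = [name for name, have in peers.items() if chunk in have]
        if not candidates:
            continue                  # nobody has this chunk yet; ask again later
        chosen = min(candidates, key=lambda name: load[name])
        plan[chunk] = chosen
        load[chosen] += 1
    return plan

# Invented swarm: which chunks each user currently holds.
peers = {
    "alice": {0, 1, 2, 3},            # has every chunk that exists so far
    "bob": {1, 3},                    # still downloading
    "carol": {2},                     # still downloading
}
needed = {0, 1, 2, 3, 4}

plan = plan_requests(needed, peers)
print(plan)
```

With this data the four available chunks end up split across three users, so they can be fetched in parallel, while chunk 4 has to wait until someone in the swarm has it.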

    Does that answer it all? 

  5. Tina
    May 27, 2011 at 5:52 pm

    Saptashwa,

    I know you're smart enough to search Google yourself, so I kinda wonder why you didn't.

    How BitTorrent Works
    Torrents 101: How Torrent Downloading Works

    • Saptashwa
      May 27, 2011 at 5:58 pm

      Actually, I wanted to get to the real innards, like the kind of code they use to maintain and connect all of these files, and the other questions I posed. What those articles list is very superficial: fine for the curious techie, but not for a coder who wants to know how it's done behind the scenes.

      • Tina
        May 27, 2011 at 6:01 pm

        Ok, I get it now!

        I would guess that the torrent file is just a link and the real information is on the servers.
