How do torrent clients reassemble and store pieces?

9

4

I was wondering how the pieces downloaded by torrent clients are stored and reassembled. Do they use metadata? That doesn't seem to be the case, since half-downloaded files can often still be played. So basically I'm asking: how are the pieces organized within the downloaded file? Is it just from first to last, or are there buffer spaces in between?

Cenoc

Posted 2010-12-08T02:34:56.813

Reputation: 497

Is there any way to download half-downloaded pieces after a recheck? My connection is like dial-up, so please help. – None – 2011-06-11T16:21:31.663

Answers

19

Welcome to the wonderful world of torrents! There are a few parts that make up the BitTorrent protocol. You have your file, legalthing.iso, and you want to distribute it to as many people as possible. So you create a "torrent" file, which describes legalthing.iso, and you distribute the torrent file through a website, or any other way you like. The torrent file can either point directly to your computer (and you'd be acting as the seed) or the torrent file can point to a "tracker", which is a server that connects "seeds" (users with the whole legalthing.iso file already) and "peers" (users who are actively downloading the file).

Getting closer to your question now. The file itself, legalthing.iso, is cryptographically hashed piece by piece, so that each person who reads the torrent file and begins downloading legalthing.iso can check each piece against its hash and ensure they're not downloading a piece that's been modified from the original. Pieces that fail hash checks are discarded.

Now pretend you're a computer downloading a file using BitTorrent. The client can choose pieces in one of two ways: either you'll download random pieces of the file, or you'll download the rarest pieces first. The latter approach increases the overall "health" (availability) of the torrent.
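To make the "rarest first" idea concrete, here is a minimal sketch in Python. It assumes each connected peer has told us which piece indices it holds (modelled here as a plain set per peer), and the function name `pick_rarest_piece` is made up for illustration. Real clients usually break ties randomly; this sketch picks the lowest index so the result is predictable.

```python
from collections import Counter


def pick_rarest_piece(have, peer_pieces):
    """Return the index of the least-available piece we still need.

    have        -- set of piece indices we already hold (hypothetical model)
    peer_pieces -- list of sets, one per peer, of the indices that peer holds
    Returns None if no connected peer has a piece we are missing.
    """
    # Count how many peers hold each piece.
    availability = Counter()
    for pieces in peer_pieces:
        for index in pieces:
            availability[index] += 1

    # Only pieces we lack, and that at least one peer can supply, qualify.
    candidates = [index for index in availability if index not in have]
    if not candidates:
        return None

    # Fewest holders wins; the lowest index breaks ties (an assumption made
    # here for determinism, real clients typically randomise this).
    return min(candidates, key=lambda index: (availability[index], index))


if __name__ == "__main__":
    peers = [{0, 1, 2}, {0, 1}, {1, 3}]
    print(pick_rarest_piece({0}, peers))
```

Pieces 2 and 3 are each held by only one peer in that example, so one of them is requested before the widely available piece 1.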

So what's in the actual torrent file? It varies based on the client used to make it, but generally it contains an "announce" section, which is the address of the tracker you're using, and a big list describing all the pieces of the file you want to download. Each piece is of a uniform size (32 KB, 512 KB, 4 MB, really any size you like; only the last piece may be shorter) and each piece has a hash associated with it. Every time a peer gets a piece, it computes the SHA-1 hash of that piece and compares it with the hash listed in the torrent file. That's how it figures out the pieces are good.
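Here is a rough sketch of that hash check in Python. In the torrent metadata the per-piece hashes are stored as one long byte string of concatenated 20-byte SHA-1 digests, so the first step is to cut it into 20-byte slices. The function names and the tiny 16-byte piece size are just assumptions for the demo, not anything a real client uses.

```python
import hashlib

SHA1_LEN = 20  # a SHA-1 digest is always 20 bytes


def split_piece_hashes(pieces_field):
    """Cut the concatenated digest string into one 20-byte hash per piece."""
    if len(pieces_field) % SHA1_LEN != 0:
        raise ValueError("pieces field length must be a multiple of 20")
    return [pieces_field[i:i + SHA1_LEN]
            for i in range(0, len(pieces_field), SHA1_LEN)]


def piece_is_valid(index, data, piece_hashes):
    """True if the downloaded bytes hash to the value listed for this piece."""
    return hashlib.sha1(data).digest() == piece_hashes[index]


if __name__ == "__main__":
    # Stand-in for the real file; the piece length is tiny to keep it readable.
    payload = b"legalthing.iso contents, shortened for the demo"
    piece_length = 16
    chunks = [payload[i:i + piece_length]
              for i in range(0, len(payload), piece_length)]

    # What the torrent creator would store: all digests back to back.
    pieces_field = b"".join(hashlib.sha1(chunk).digest() for chunk in chunks)
    hashes = split_piece_hashes(pieces_field)

    print(piece_is_valid(0, chunks[0], hashes))    # untouched piece
    print(piece_is_valid(0, b"tampered!", hashes))  # modified piece
```

A piece that fails this check is simply thrown away and requested again.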

Since the torrent file lists each piece of the file you're downloading, every time your client successfully downloads a piece and verifies its hash, it writes the piece to the correct position within the file on the hard disk. That position is simply the piece's index multiplied by the piece size. That's why, if you download a 1 GB file, a client that pre-allocates will set aside an empty block of space on your disk that's 1 GB in size, to accommodate the torrent pieces you'll be downloading.
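As a minimal sketch of that idea in Python, assuming a single-file torrent: the file is first created at its full size, and each verified piece is then written at byte offset index × piece length, in whatever order the pieces arrive. `preallocate` and `write_piece` are illustrative names, not any client's actual API. On most filesystems the regions not yet written read back as zeros.

```python
import os
import tempfile


def preallocate(path, total_size):
    """Create the target file at its final size before any piece arrives."""
    with open(path, "wb") as f:
        f.truncate(total_size)  # extends the empty file to total_size bytes


def write_piece(path, index, piece_length, data):
    """Write one verified piece at its fixed position in the file."""
    with open(path, "r+b") as f:  # r+b keeps the existing contents
        f.seek(index * piece_length)
        f.write(data)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "legalthing.iso")
        preallocate(target, 10)            # 10-byte "file", 4-byte pieces
        write_piece(target, 2, 4, b"ij")   # last (short) piece arrives first
        write_piece(target, 0, 4, b"abcd")
        with open(target, "rb") as f:
            print(f.read())                # piece 1 is still missing
```

Because every piece has a fixed position, no extra metadata is needed inside the file itself: the order in which pieces arrive doesn't matter, and a player can read whatever parts have been filled in so far.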

Now, some video players and other file viewers can deal with "corrupted" files. Of course, a half-downloaded torrent is not corrupted, but it is missing pieces, and to a program like VLC it just looks broken. So VLC will do the best it can to play whatever data it can find, and that's why files can be played while only partially downloaded.

There are lots more complicated aspects (google DHT, torrent write buffering, all that fun stuff) but that's the basics of how BitTorrent works.

geodave

Posted 2010-12-08T02:34:56.813

Reputation: 826

@geodave, can you also answer this please? https://superuser.com/questions/1033677/will-the-torrent-client-ask-trackers-for-the-pieces-matching-certain-hashes-or – David Refoua – 2016-02-18T10:25:30.380

Wow, that's a very full answer.... but I've never noticed it really setting aside a 1GB file? It seems that it grows incrementally? – Cenoc – 2010-12-08T03:37:24.100

@Cenoc It depends on the torrent client: some clients pre-allocate, some don't, and some have an additional preference to pre-allocate. – Sathyajith Bhat – 2010-12-08T03:43:44.050

Interesting, so at first is it just a file filled with '0's and then when the partially downloaded file is first checked, it is compared against the hashes? – Cenoc – 2010-12-08T03:46:05.033

It's an empty container for the eventual complete file, and as each piece is downloaded by the client, it's checked against the hash and then put in the appropriate place in the container until the full file is complete. – geodave – 2010-12-08T03:48:19.460

What do you mean by "empty container"? (sorry about the constant questions) – Cenoc – 2010-12-08T03:50:54.797

As Sathya mentioned, it depends on the client. Some will pre-allocate space on your hard drive for the entire file: the client reserves the space, and other programs won't be able to use that space on the drive until your file is downloaded. Other BitTorrent clients will put the completed pieces in a temporary storage location to save on drive space, and then assemble the full file once all the pieces have been downloaded. It depends on what settings you choose. – geodave – 2010-12-08T03:56:16.953