4

Many download managers support downloading a file over multiple parallel connections, one per thread. The idea is that each connection downloads a separate part of the file.
For example, if there are 5 connections, the first connection downloads the first 0-20% of the file, the second downloads the 20-40% portion, and so on.
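
To make the split concrete, here is a minimal sketch of how a client might divide a file into one byte range per connection. It assumes the server honours HTTP `Range` request headers, which the question does not state; `split_ranges` and `range_header` are hypothetical helper names used only for illustration.

```python
def split_ranges(file_size: int, connections: int) -> list[tuple[int, int]]:
    """Split a file of file_size bytes into contiguous, inclusive byte ranges."""
    if file_size <= 0 or connections <= 0:
        raise ValueError("file_size and connections must be positive")
    # Never create more ranges than there are bytes.
    connections = min(connections, file_size)
    base, extra = divmod(file_size, connections)
    ranges = []
    start = 0
    for i in range(connections):
        # The first `extra` ranges take one additional byte each.
        length = base + (1 if i < extra else 0)
        end = start + length - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


def range_header(start: int, end: int) -> str:
    """Format an inclusive byte range as an HTTP Range header value."""
    return f"bytes={start}-{end}"


if __name__ == "__main__":
    # A 1000-byte file over 5 connections: each connection asks for 20%.
    for start, end in split_ranges(1000, 5):
        print(range_header(start, end))
```

Each connection would send its own header value with its request and write the response into the matching offset of the output file.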

Similarly, on the server side there will be 5 threads, each reading its own 20% of the file in parallel.
But I thought that reading a single file concurrently from multiple threads would actually make the download significantly slower, since the read head of a mechanical disk has to do more seeks than before.
Even if we assume that the disk controller's queuing mechanism is intelligent enough to batch all 5 multipart requests for a single file into one sequential read, that gives us no advantage over doing the read in just one thread and then serving the file over just one HTTP connection.

So how are parallel downloads of a file faster?

  • Note that a download manager also makes it easier to recover from a lost connection, since the file is not read sequentially. – yagmoth555 May 01 '20 at 14:02
  • Note too that a download manager can have multiple sources, as a BitTorrent client does, so the document you linked about concurrent reads is less applicable. – yagmoth555 May 01 '20 at 14:03
  • I can understand why it would make sense for BitTorrent, because it has multiple connections to multiple servers, not just one server. – Anmol Singh Jaggi May 01 '20 at 15:42

2 Answers

2

My understanding is that downloading different file parts in parallel is only useful when the bottleneck is the network connection: either the upload bandwidth of the server from which you download, or the bandwidth of the network between the server and you. When these links are saturated, the available bandwidth will be divided up across the connections, and in some cases it could be divided evenly across connections. So if you have 5 connections open then you would be getting a larger share of the bandwidth than if you only have one.

Of course, this doesn't work if the server and network are sharing the bandwidth in a more clever way, e.g., by allocating the share between client IPs instead of connections.

When the bottleneck is disk IO on the server or client, then indeed this strategy will not help, and could even harm performance because the reads and writes will be less sequential. Also, when the bottleneck is the available bandwidth between your ISP and modem (which I'd say is maybe the most common case), then parallel downloads should neither harm nor help.

a3nm
  • Thanks; however, I didn't quite understand the first part of your answer: if the server has a bandwidth of 100 Mbps, won't it allocate all of it to a single connection if possible? Why do we need 5 connections to get a larger share of the bandwidth? – Anmol Singh Jaggi May 01 '20 at 15:41
  • Say you are competing with 9 other users for that 100 Mbps bandwidth, and the bandwidth is split equally between connections. If you and the 9 other users have one connection each, everyone gets 10 Mbps. If you have 5 connections open and the 9 other users have only one each, then you get a share of 5/(5+9) of the bandwidth, i.e., about 36 Mbps. (And of course the other users get less than they did before: about 7 Mbps each.) – a3nm May 01 '20 at 18:36
  • Ok.. Got it now. – Anmol Singh Jaggi May 01 '20 at 19:34
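
The arithmetic in the comment above can be checked with a short sketch. It assumes the idealised model described there, in which a saturated link is split equally between connections; real networks only approximate this, and `fair_share` is a hypothetical helper name.

```python
def fair_share(total_mbps: float, my_connections: int, other_users: int) -> tuple[float, float]:
    """Return (my bandwidth, bandwidth of each other user) in Mbps.

    Assumes the link is saturated, is split equally per connection,
    and every other user holds exactly one connection.
    """
    total_connections = my_connections + other_users
    per_connection = total_mbps / total_connections
    return my_connections * per_connection, per_connection


if __name__ == "__main__":
    # One connection each: 10 users share 100 Mbps equally.
    mine, others = fair_share(100, 1, 9)
    print(f"1 connection:  me {mine:.1f} Mbps, others {others:.1f} Mbps each")

    # Five connections for me, one each for the 9 others: 5/(5+9) of the link.
    mine, others = fair_share(100, 5, 9)
    print(f"5 connections: me {mine:.1f} Mbps, others {others:.1f} Mbps each")
```

Under this model the gain comes entirely at the expense of the other users, which is why per-IP allocation on the server removes it.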
0

In general, your system has a much faster connection to its disk drive than to the network. Even a slow hard drive that can write 50 megabytes per second will have no problem keeping up with multiple downloads over a 100 megabit connection.
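
As a rough check of that comparison, the sketch below converts the link speed from megabits to megabytes per second and compares it with the disk's write rate. It assumes 8 bits per byte and ignores protocol overhead; the function name is hypothetical.

```python
def megabits_to_megabytes(mbps: float) -> float:
    """Convert a rate in megabits per second to megabytes per second."""
    return mbps / 8


if __name__ == "__main__":
    link_mb_per_s = megabits_to_megabytes(100)  # 100 megabit connection
    disk_mb_per_s = 50                          # slow hard drive from the answer
    print(f"network: {link_mb_per_s} MB/s, disk: {disk_mb_per_s} MB/s")
    print(f"disk headroom: {disk_mb_per_s / link_mb_per_s:.0f}x")
```

With these figures the disk can absorb several times what the network delivers, however many connections share the link.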

Bert
  • But let's say the client's maximum bandwidth is 100 Mbps: why can't it just use one download connection to max out all 100 Mbps and write the data to the hard disk at 50 MB/s in one thread? – Anmol Singh Jaggi May 01 '20 at 14:09
  • Also, using multiple connections will make the server's disk reads slower because of random reads from 5 different threads, won't it? – Anmol Singh Jaggi May 01 '20 at 14:10