To some extent what happens is dependent on the OS and application. However, one can make the following sequence of predictions:
First, the TCP stack's receive window will fill up, at somewhat less than the full data rate of the network. It fills more slowly than line rate because of the TCP slow-start algorithm and other effects of the way TCP/IP stacks behave.
The TCP window can be up to 128 KiB (less 1 byte) on my Linux box. (Say sysctl net.core.rmem_max to get your box's value.) It is usually smaller than this maximum, however. The default is 4 KiB on my box. (Say sysctl net.ipv4.tcp_rmem to get that value.)
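The per-socket value the kernel actually chose can also be read from inside a program with getsockopt(). A minimal sketch, assuming sock is a connected TCP socket created elsewhere:

```c
/* Sketch: report the kernel's receive buffer size for an existing
 * TCP socket. "sock" is assumed to be created and connected elsewhere. */
#include <stdio.h>
#include <sys/socket.h>

void print_rcvbuf(int sock)
{
    int size = 0;
    socklen_t len = sizeof size;

    if (getsockopt(sock, SOL_SOCKET, SO_RCVBUF, &size, &len) == 0)
        printf("kernel receive buffer: %d bytes\n", size);
    else
        perror("getsockopt(SO_RCVBUF)");
}
```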
Your application will have some buffering of its own. It may be as little as 1 byte, but it can't be zero. Linux would need a zero-copy syscall like recvfile() to avoid the need for application buffering, and it lacks that.
The buffer size is totally up to the application programmer. In programs I've written, I've used anywhere from roughly a dozen bytes up to 64 KiB, depending on the needs of the application. I've inferred the use of much larger buffers (∼1 MiB) in other apps by observing how they behave.
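For concreteness, a receive loop of the sort described above might look like the sketch below. The 64 KiB buffer size is an arbitrary example value, not a recommendation:

```c
/* Sketch of an application-level receive buffer sitting between the
 * socket and the output file. Buffer size is an arbitrary example choice. */
#include <stdio.h>
#include <sys/socket.h>
#include <sys/types.h>

ssize_t copy_socket_to_file(int sock, FILE *out)
{
    char buf[64 * 1024];              /* the application's own buffer */
    ssize_t n, total = 0;

    while ((n = recv(sock, buf, sizeof buf, 0)) > 0) {
        /* fwrite() hands the data to stdio, which buffers it again
         * (the next layer discussed below). */
        if (fwrite(buf, 1, (size_t)n, out) != (size_t)n)
            return -1;
        total += n;
    }
    return n < 0 ? -1 : total;
}
```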
The application will almost certainly be using some kind of buffered I/O mechanism for writing the file, such as C's stdio. This is typically at least 1 KiB, and may be several KiB. On my box here, it appears to default to 8 KiB.
It is possible the application is using unbuffered I/O or is constantly flushing the I/O buffers to disk, but this is uncommon.
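If you want to see or change that stdio layer for yourself, something like the following works; the 1 MiB buffer and the file name are arbitrary example values:

```c
/* Sketch: print libc's default stdio buffer size and override it
 * for one stream. The 1 MiB size and file name are arbitrary. */
#include <stdio.h>

int main(void)
{
    printf("BUFSIZ (default stdio buffer): %d bytes\n", BUFSIZ);

    FILE *out = fopen("example.dat", "wb");
    if (!out)
        return 1;

    /* Must be called before any other I/O on the stream.
     * Pass _IONBF instead of _IOFBF to make it unbuffered. */
    static char big[1 << 20];
    setvbuf(out, big, _IOFBF, sizeof big);

    fputs("hello\n", out);
    fclose(out);
    return 0;
}
```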
The device driver for the storage device may have some buffering. It probably isn't much, but a 4 KiB single page buffer wouldn't be unreasonable.
The storage device itself almost certainly has some cache. Modern hard drives have caches on the order of a few dozen megabytes, for example. If you're writing to a RAID device, there might be an even bigger write-back cache, too.
All five of these buffers have to fill up before the raw I/O performance of the underlying storage device can have any effect. Because they could easily add up to 100 MiB or more, you will need to test with a transfer size larger than this if you want to be sure you're not just testing the combined behavior of these buffers.
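One way to keep those buffers from flattering a storage benchmark is to flush both stdio and the kernel's write-back cache before stopping the clock. A rough sketch, with the 256 MiB size and file name chosen arbitrarily (the drive's on-board cache may still absorb some of it):

```c
/* Sketch: time a large write, draining stdio and the kernel page
 * cache so the figure reflects the device more than the buffers.
 * Size and file name are arbitrary example values. */
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
    enum { CHUNK = 1 << 20, CHUNKS = 256 };      /* 256 MiB total */
    static char buf[CHUNK];
    memset(buf, 'x', sizeof buf);

    FILE *out = fopen("bigtest.dat", "wb");
    if (!out)
        return 1;

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);

    for (int i = 0; i < CHUNKS; i++)
        fwrite(buf, 1, CHUNK, out);

    fflush(out);                 /* drain stdio's buffer */
    fsync(fileno(out));          /* drain the kernel's write-back cache */

    clock_gettime(CLOCK_MONOTONIC, &t1);
    fclose(out);

    double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    printf("%d MiB in %.2f s = %.1f MiB/s\n", CHUNKS, secs, CHUNKS / secs);
    return 0;
}
```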
Having covered all that, I'll answer your top-level question: as long as you're using a network protocol with a flow control mechanism — e.g. TCP — there should be no problem resulting from the scenario you posit. However, if you're using an unreliable network protocol such as UDP, and the application protocol built on top of it doesn't provide its own flow control mechanism, the application could be forced to drop packets in this situation.
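To see that flow control in action, a deliberately slow receiver like the sketch below simply causes its advertised window to close, and a well-behaved sender stalls rather than losing data. The port number and the 100 ms delay per 4 KiB read are arbitrary choices:

```c
/* Sketch: a deliberately slow TCP receiver. As its socket buffer
 * fills, the advertised window shrinks and the sender is throttled;
 * nothing is dropped. Port and delay are arbitrary example values. */
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

int main(void)
{
    int lsock = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = { 0 };
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(5000);

    if (bind(lsock, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(lsock, 1) < 0)
        return 1;

    int conn = accept(lsock, NULL, NULL);
    char buf[4096];

    while (recv(conn, buf, sizeof buf, 0) > 0)
        usleep(100 * 1000);      /* simulate a slow "disk": ~40 KiB/s */

    close(conn);
    close(lsock);
    return 0;
}
```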
I edited this to make it less speculative. Questions are usually expected to be based on actual problems that you face. But this is under discussion right now on meta. – Nifle – 2012-07-25T14:04:45.170
"directly to a device" - There's no such thing. You always have application-level and system-level buffers sitting between you and the hardware. – Jonathon Reinhart – 2012-07-25T14:11:43.873
> Questions are usually expected to be based on actual problems that you face. Says who? Why? > But this is under discussion right now meta. Yes, it was asked just four hours ago, and you are the one that stated your opinion that only existing problems qualify. You don’t think that someone who has a problem would benefit from seeing an answer ready to go to a previously theoretical question rather than sitting around and waiting for one to eventually be posted? – Synetech – 2012-07-25T16:09:33.370
@Lewis, I have wondered this myself. Looking at the answers below, they don’t seem to address the obvious case. If you watch a large streaming video from a fast connection and (for some reason), the temp directory happens to be on a flash-drive (or floppy or whatever), it stands to reason that the video-player might just keep pulling data from the server without realizing that the drive is unable to write the buffer fast enough. – Synetech – 2012-07-25T16:12:47.767