Why does copying individual files take so much longer than one large file?

8

0

I was copying a few very large files to my computer: 28 files totaling roughly 200 GB (about 7 GB each on average). I noticed that the transfer ran at more or less the same speed the entire time:

[image: transfer-speed graph, large files]

However, transferring many small files shows much larger fluctuations in speed:

[image: transfer-speed graph, small files]

(Yes, this is from Google Drive, but the files are all downloaded locally to my computer.)

Jon

Posted 2014-09-07T01:59:47.323

Reputation: 8 089

Question was closed 2014-09-10T18:12:37.053

It's got to do with the overhead of file transfer. Small files may require random disk access and the transfer speed also depends on the disk fragmentation. More info here

– Vinayak – 2014-09-07T02:17:13.400

@Vinayak Why would you need a random disk access/what would it be used for? – Jon – 2014-09-07T02:29:51.540

Files aren't always stored sequentially. This video explains it quite well.

– Vinayak – 2014-09-07T02:36:28.590

I understand how defrag works, but why would small files require random disk accesses? – Jon – 2014-09-07T02:44:01.433

1 Because, to reduce disk fragmentation, the files will be written to wherever free space is still available on the HDD. For instance, there might be 10 MB of free space somewhere between two large files. Those 10 megabytes aren't enough to store a large 5 GB movie, but they can accommodate hundreds of small, 1-100 KB files (text files, INI configuration files, documents, GIFs, etc.). When that block fills up, the file system looks for other small blocks of free space to fill with data. – Vinayak – 2014-09-07T02:49:43.423

A better example here – Vinayak – 2014-09-07T03:23:30.243

Answers

-1

To reduce disk fragmentation, the files will be written to wherever free space is still available on the hard drive.

For instance, there might be a 10 MB contiguous free space somewhere between two large, sequentially stored files.

Those 10 megabytes aren't enough to store a large 5 gigabyte movie, but they can accommodate hundreds of small, 1-100 KB files (text files, INI configuration files, documents, GIFs, etc.).

When that block fills up, the file system looks for other small blocks of free space to fill with data, which causes the drive head to move around a lot.

That, along with the overhead mentioned in these linked articles, causes small files to take a long time to copy.
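
To get a feel for how much a fixed per-file cost can matter, here is a rough back-of-the-envelope model in Python. It is only a sketch: the function name `estimate_copy_seconds` and the numbers (100 MB/s sustained throughput, 10 ms of seek and bookkeeping work per file, two million small files) are made-up assumptions for illustration, not measurements from the question.

```python
def estimate_copy_seconds(file_count, total_bytes, throughput_bps, per_file_overhead_s):
    """Rough model: time to stream the bytes plus a fixed cost for every file."""
    streaming = total_bytes / throughput_bps      # pure data transfer
    overhead = file_count * per_file_overhead_s   # seeks, metadata updates, open/close
    return streaming + overhead


GB = 10**9
THROUGHPUT = 100 * 10**6   # assumed: 100 MB/s sustained
OVERHEAD = 0.010           # assumed: 10 ms of extra work per file

# Same 200 GB of data, split two different ways.
large = estimate_copy_seconds(28, 200 * GB, THROUGHPUT, OVERHEAD)
small = estimate_copy_seconds(2_000_000, 200 * GB, THROUGHPUT, OVERHEAD)

print(f"28 large files:        {large:.0f} s")
print(f"2,000,000 small files: {small:.0f} s")
```

Under these assumed numbers the data itself takes the same time to stream in both cases, and the difference comes entirely from the per-file term. Real drives and file systems will not behave this neatly, so treat it as an illustration of the idea only.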

Vinayak

Posted 2014-09-07T01:59:47.323

Reputation: 9 310

2 Your answer would seem to imply that writing small files should be faster than large files, since you think that the small files would not be fragmented. The OP's charts do not support your explanation. You have misidentified the reason for R/W head movement during file copying. – sawdust – 2014-09-07T09:31:54.640

1 File system fragmentation increases disk head movement or seeks, which are known to hinder throughput. Also, I never said small files aren't fragmented. And I have no idea what information you could've gleaned from the 'charts'. – Vinayak – 2014-09-07T19:37:41.350

1 Sure, file fragmentation reduces throughput. But fragmentation is not the reason why copying many small files takes longer than a large file. "I never said small files aren't fragmented" -- Then what's the point of your 3rd sentence/paragraph? If a contiguous space "can accommodate hundreds of small, 1-100 KB files", would these small files still be stored in fragments? You seem to make the case that large files are more likely to be fragmented, and therefore require more seeks and have lower data throughput. So tell us, what information did you glean from those charts? – sawdust – 2014-09-08T01:07:27.507

1

I think you meant, "fragmentation is not the only reason why [copying small files takes longer]". Fragmentation definitely affects write speeds and what do you think happens when files (large and small) are being copied? You don't seem to understand how files get fragmented and how de-fragmentation works. And yes, large files on average are more likely to be fragmented since contiguous free clusters might not always be available. You also don't seem to have read the articles I linked in my answer.

– Vinayak – 2014-09-08T06:02:53.737

1

If this is why you're here, please stop now. If not, add a relevant answer to the question.

– Vinayak – 2014-09-08T06:05:10.940

1 No, I already wrote what I meant; fragmentation does not explain the difference between copying a lot of small files versus a large file, especially since the large file is more likely to be fragmented. You persist in not expanding on your post. Instead you guess at what I don't know or understand or read. I've already provided a link to my answer in the 8th comment to the OP. – sawdust – 2014-09-08T06:35:25.977