Why is Linux 30x faster than Windows 10 in Copying files?


I have 20.3 GB of files and folders totaling 100k+ items. I duplicated that whole tree into another directory from Windows 10, and it took an excruciating 3 hours of copying to finish.

The other day, I booted into Fedora 24, recopied the same folder and bam! It took just 5 minutes to duplicate it on the same drive, into a different directory.

Why is Linux so fast, and Windows so painstakingly slow?

There is a similar question here

Is (Ubuntu) Linux file copying algorithm better than Windows 7?

But the accepted answer is quite lacking.

Jones G

Posted 2016-09-15T00:32:54.637

Reputation: 323

You don't use "Windows" or "Linux" to copy files, you use some specific program running in each of those operating systems. Programs vary widely in the methods they use, and the tradeoffs they make. Which ones were you using? And how? – kreemoweet – 2016-09-15T03:52:56.090

@kreemoweet: So do operating systems – Windows' NTFS is known to deal really poorly with lots of small files, compared to most other filesystems. – user1686 – 2016-09-15T04:31:51.190

And a nice downvote from a Windows fan, huh. You see, copying files, though simple, has many applications ranging from business data backup to scientific research. For example, at CERN there are petabytes of data to deal with; slow copying would be unacceptable. – Jones G – 2016-09-15T04:34:17.683

From that same link - check the 2nd answer from the bottom. Linux caches the files into available RAM and writes to disk when it can - hence why it looks faster (it only needs to read for now, and writes when it can). – Darius – 2016-09-15T05:03:54.967

@DominicGuana File systems do their part (ext3/ext4 can allocate chunks of 100 MB at once). Did you consider that antivirus under Windows can play a (slowing) role too? BTW, for similar problems with the SLAC data-acquisition flow (after the first-level trigger there was too much data) we learned to write to HDDs in parallel... – Hastur – 2016-09-15T06:29:08.253

Hi, no antivirus so far, and Windows Defender is disabled as well. Windows has been darn slow ever since I started dealing with terabytes of data. One time I backed up data from one external disk to another, and it took all night - 13 hours. Silly me: a few months later the data got corrupted on one disk, so I did the same thing in Linux, and it took just 3 hours. Oh yeah! – Jones G – 2016-09-20T00:02:40.563

I have had a similar experience with Windows being very slow and Linux being very fast at file transfer. This includes moving a folder containing many files from one location to another on the same drive. The Linux partition always performs faster. – Tim – 2018-04-02T21:09:56.480

Linux may be faster at file copying than Windows, but you can often waste more time fixing all the bugs in Linux. I'm talking about GUI Linux; I'm sure server Linux works fine for the most part. Usually with Windows, if I have a real problem I can Google it and find the answer. But with Linux, if you ask a question, people often just give you the answers you already tried from Googling - and the reason you're asking is that they don't work for you... – Mikey – 2019-05-04T05:10:46.590

Probably some issue with your system. Copying just 20 GB of data is not much; even copying it to a USB flash drive might be faster than that. I've copied bigger amounts of small files and I've never seen such a huge slowdown. – phuclv – 2019-05-04T06:51:58.163

Answers


The basics of it break down into a few key components of the total system: the UI element (the graphical part), the kernel itself (what talks to the hardware), and the format in which the data is stored (i.e. the file system).

Going backwards: NTFS has been the de facto file system for Windows for some time, while the de facto standard for the major Linux variants is the ext family. The NTFS format itself hasn't changed much since Windows XP (2001); many features added since (like partition shrinking/healing, transactional NTFS, etc.) are features of the OS (Windows Vista/7/8/10) and not of NTFS itself. The ext file system had its last major stable release (ext4) in 2008. Since the file system itself governs how and where files are accessed, if you're using ext4 there's a good chance you'll notice a speed improvement over NTFS; note, however, that if you used ext2 you might find it comparable in speed.

It could also be that one partition is formatted with smaller clusters than the other. The default for most systems is a 4096-byte cluster size, but if you formatted your ext4 partition with something like a 16 KiB block size, then each read on the ext4 system would fetch 4x the data of the NTFS system (which could mean 4x the files, depending on what's stored where, how, and how big it is). Fragmentation of the files can also play a role in speed. NTFS handles file fragmentation very differently from the ext file system, and with 100k+ files there's a good chance some fragmentation exists.
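Back-of-the-envelope, the cluster-size effect looks like this (a Python sketch; the 6 KiB file size and the 100k file count are made-up illustrative numbers, not measurements from the question):

```python
import math

def clusters_needed(file_sizes, cluster_size):
    """Total clusters (= minimum block reads) needed to fetch every file,
    since each file occupies a whole number of clusters on disk."""
    return sum(max(1, math.ceil(size / cluster_size)) for size in file_sizes)

# Hypothetical workload: 100,000 small files of 6 KiB each.
sizes = [6 * 1024] * 100_000

reads_4k = clusters_needed(sizes, 4096)        # 4 KiB clusters (common default)
reads_16k = clusters_needed(sizes, 16 * 1024)  # 16 KiB blocks

print(reads_4k, reads_16k)  # 200000 vs 100000 block reads for the same data
```

Halving the number of block reads doesn't halve wall-clock time (seeks, caching, and request merging all intervene), but it shows why cluster size alone can skew a comparison between two differently formatted partitions.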

The next component is the kernel itself (not the UI, but the code that actually talks to the hardware - the true OS). Here there honestly isn't much difference. Both kernels can be configured to do certain things, like disk caching/buffering, to speed up reads and perceived writes, but these configurations carry the same trade-offs regardless of OS: e.g. caching might seem to massively increase the speed of copying/saving, but if you lose power during the cache write (or pull the USB drive out), you will lose all data not actually written to disk and possibly even corrupt data already written to disk.

As an example, copy a lot of files to a FAT-formatted USB drive in Windows and in Linux. On Windows it might take 10 minutes while on Linux it takes 10 seconds; immediately after copying, safely remove the drive by ejecting it. On Windows it would be ejected almost immediately, so you could pull the drive from the USB port, while on Linux the eject might take 10 minutes before you could actually remove the drive. This is because of caching: Linux wrote the files to RAM and then wrote them to the disk in the background, while Windows (which by default disables write caching for removable drives) wrote the files straight to disk.
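That "safely remove" wait is just the write-back cache being flushed. A minimal sketch of forcing the flush yourself, using only the Python standard library (the paths in the usage comment are hypothetical):

```python
import os

def copy_with_flush(src, dst):
    """Copy a file and force it to stable storage before returning,
    so immediately pulling the drive cannot lose the data."""
    with open(src, "rb") as f_in, open(dst, "wb") as f_out:
        f_out.write(f_in.read())   # data lands in the OS page cache first
        f_out.flush()              # drain Python's userspace buffer
        os.fsync(f_out.fileno())   # ask the kernel to commit it to the device

# Usage (hypothetical paths):
# copy_with_flush("/home/me/photo.jpg", "/mnt/usb/photo.jpg")
```

On Linux the whole-system equivalent is running `sync` before unplugging; "eject" in a file manager does essentially the same flush, which is why it can take minutes after a "10-second" copy.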

Last is the UI (the graphical part the user interacts with). The UI might be a pretty window with cool graphs and nice bars that give a general idea of how many files are being copied, how big the job is, and how long it might take; or it might be a console that prints nothing until it's done. If the UI first has to walk every folder and file to count them and total their sizes, so it can show a rough estimate before the copy actually starts, then the copy can take longer because of that extra pass. Again, this is true regardless of OS.
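The pre-scan pass described above can be sketched like this (a hedged illustration of the idea, not any particular copier's actual code):

```python
import os

def prescan(root):
    """Walk the tree once to gather the totals a progress bar needs:
    file count and total bytes. This entire pass happens before the
    first byte is copied."""
    count, total = 0, 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            count += 1
            total += os.path.getsize(os.path.join(dirpath, name))
    return count, total

# A console tool like `cp -r` skips this pass entirely and just starts
# copying - one reason it can feel faster on trees with 100k+ files.
```

On a tree of 100k+ items this extra walk alone touches every directory entry twice (once to count, once to copy), which is noticeable on metadata-heavy workloads.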

You can configure some things to be equal (like disk caching or cluster size), but realistically it comes down to how all the parts tie together to make the system work, and more specifically how often those pieces of code actually get updated. Windows has come a long way since Windows XP, but the disk subsystem is an area that hasn't seen much TLC across versions for many years (compared to the Linux ecosystem, which seems to see a new file system or improvement rather frequently).

Hope that adds some clarity.

txtechhelp

Posted 2016-09-15T00:32:54.637

Reputation: 3 317

Horrible answer in my opinion; downvoted. You are introducing differences where there are none. Nobody asked how differently partitioned drives perform. Of course the question centers on the "all else being equal" premise. I can choose any fs for an 8-NVMe RAID 0 with native read speeds of over 16 gigabytes per second, and yet a Windows file copy maxes out at 1.4-1.5 gigabytes per second every time, all the time. It has nothing to do with caching, fs, or partitions, but more with Windows OS limitations. – Matthias Wolf – 2019-03-06T04:19:51.460

@Matt what file system are you formatting said RAID array with? If it's NTFS, that could explain the slowdown... but if you have more information to provide, you're free to add a relevant answer, especially if you have any source code (and not an assembly dump) from the core Windows OS that explains directly why said slowdown occurs (I for one would be especially interested in that!).

– txtechhelp – 2019-03-06T07:26:28.323

I use NTFS; what better option is there as a fs on a Windows server? – Matthias Wolf – 2019-03-06T10:16:21.993

I contacted MSFT, had many, many discussions, and tried many things over the years, and never got it to exceed 1.5 GB/second, despite having 100 Gb NICs on each machine; Mellanox profiling tools show all other traffic on those connections working perfectly fine at 94-95 Gb/sec throughput. No slowdowns between Linux machines, but as soon as a Windows machine is involved I see those bottlenecks. – Matthias Wolf – 2019-03-06T10:28:47.180

I am talking about single-file transfers, all single-threaded. There is no hardware bottleneck whatsoever; it's purely OS-based. – Matthias Wolf – 2019-03-06T10:29:35.667

I learned something, but I really don't understand why Windows Explorer takes forever to open a folder - you see the files, but it keeps adding to them. Like my Downloads folder or my Screenshots folder, which has a lot of PNGs. Sure, if there were only 10 files in the folder you'd barely notice it, but I have an M.2 SSD - it's kind of ridiculous. Also, file search sucks in Windows. I have to use a third-party tool that maintains an index, but even third-party tools that don't maintain one seem faster than searching with Windows, which does? – Mikey – 2019-05-04T05:04:48.287

@MatthiasWolf consider this: Linux is a system optimized by Linus Torvalds; Windows is a system not optimized by Linus Torvalds. Linus is a speed junkie, and I assume Windows devs just want things to work - and a paycheck. – hanshenrik – 2019-11-20T17:51:18.100