Why is an Ext4 disk check so much faster than NTFS?

I had a situation today where I restarted my computer and it said I needed to check the disk for consistency. About 10 minutes later (at "1%" complete), I gave up and decided to let it run when I go home.

For comparison, my home computer uses Ext4 for all of its partitions, and the disk checks (which run around once a week) only take a couple of seconds. I remember reading that having fast disk checks was a priority, but I don't know how they could do that.

So, how does Ext4 do disk checks so fast? Is there some huge breakthrough in doing this after NTFS came out (~10 years ago)?

Note: The NTFS disk is ~300 GB and the Ext4 disk is ~500 GB. Both are about half full.

Brendan Long

Posted 2011-03-11T18:08:24.937

Reputation: 1 728

I haven't had Windows chkdsk an NTFS volume on bootup since 2008 R2 was released. Even in a CSV cluster with multiple nodes accessing the same NTFS volume locking tens of thousands of Lucene index files. It's quite impressive. – Brain2000 – 2018-10-27T20:38:34.493

Answers

11

There are two main reasons for the performance difference, plus two possible contributing factors. First, the main reasons:


Increased Performance of ext4 vs. NTFS

Various benchmarks have concluded that the ext4 file system can perform a variety of read-write operations faster than an NTFS partition. While these benchmarks are not necessarily indicative of real-world performance, we can extrapolate from their results and use them as one reason.

Why ext4 actually performs better than NTFS can be attributed to a wide variety of reasons. For example, ext4 supports delayed allocation directly. Again, though, the performance gains depend strictly on the hardware you are using (and can be totally negated in certain cases).

Reduced Filesystem Checking Requirements

The ext4 filesystem is also capable of performing faster file system checks than other equivalent journaling filesystems (e.g. NTFS). According to the Wikipedia page:

In ext4, unallocated block groups and sections of the inode table are marked as such. This enables e2fsck to skip them entirely on a check and greatly reduces the time it takes to check a file system of the size ext4 is built to support. This feature is implemented in version 2.6.24 of the Linux kernel.
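You can see the metadata that makes this skipping possible with the e2fsprogs tools. Here's a minimal sketch using a throwaway file-backed image (no root or real disk needed; assumes `mkfs.ext4` and `dumpe2fs` are installed — they usually live in `/sbin`):

```shell
# e2fsprogs tools are often outside a normal user's PATH
export PATH="$PATH:/sbin:/usr/sbin"

# Create a small file-backed ext4 filesystem (the image name is arbitrary)
truncate -s 64M ext4.img
mkfs.ext4 -F -q ext4.img

# The superblock advertises the feature flags that let e2fsck skip
# unused block groups and uninitialized inode-table sections
dumpe2fs -h ext4.img | grep -i 'features'
```

Depending on your e2fsprogs version, the feature list will include `uninit_bg` or the newer `metadata_csum`, which subsumes it.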


And now, the two possible reasons:


File System Checking Utilities Themselves

Certain applications may run different routines on filesystems to actually perform the health "check". This can easily be seen if you compare the fsck utility set on Linux with the chkdsk utility on Windows. These applications are written on different operating systems for different file systems. The reason I bring this up as a possible factor is that the low-level system calls in each operating system are different, so you may not be able to directly compare utilities running on two different operating systems.
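On the Linux side, it's easy to measure what a full forced check costs. A sketch against a throwaway image (image name is arbitrary; assumes e2fsprogs is installed):

```shell
export PATH="$PATH:/sbin:/usr/sbin"

# Build a throwaway ext4 image — no root or real partition required
truncate -s 128M test.img
mkfs.ext4 -F -q test.img

# -f forces a full check even though the filesystem is marked clean;
# -n answers "no" to all prompts, so nothing is modified
time e2fsck -f -n test.img
```

There is no directly comparable one-liner for chkdsk, which is exactly the point: the two utilities do different work through different OS interfaces.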

Disk Fragmentation

This one is easy to understand, and also helps us to understand the differences between file systems. While all digital data held in a file is the same, how it gets stored on the hard drive differs from filesystem to filesystem. File fragmentation obviously increases access times, contributing further to the speed difference.

Breakthrough

Posted 2011-03-11T18:08:24.937

Reputation: 32 927

This answer is just a list of ext4 vs NTFS talking points with no relevance to the question. Journaled file systems never need to be checked in ordinary operation. An automatic check means something is seriously wrong. Without knowing what's wrong, it's impossible to know why the checking is so slow. Comparing it to ext4's weekly checks is comparing apples and oranges. – benrg – 2016-07-27T20:59:18.160

1What confuses me is that your second point initially seems like it would have the biggest effect, but my Ext4 partition has about as much used space as my NTFS partition has total -- instead of being much faster, they should be about the same speed. I guess it's likely that Ext4's performance improvements make it faster to check as well, but Ext4 isn't that much faster than NTFS (certainly not the several orders of magnitude difference I see in filesystem checks). – Brendan Long – 2011-03-11T19:16:15.313

I'm not sure what you mean... In general, file content takes up much more space than the indexes on most modern filesystems (ext4 and NTFS included). The filesystems just store the content differently, which (as I mentioned, in some cases) allows for higher performance. – Breakthrough – 2011-03-11T23:28:52.023

What confuses me is that the actually-checked part should be about the same size on both (since my Ext4 partition has about as much used space as the NTFS partition has total), but the Ext4 partition does its check in seconds, while the NTFS one takes hours. – Brendan Long – 2011-03-12T00:55:55.160

1@Brendan Long if you look at the first link in my answer, some people have found that file reads are actually quicker with a drive using ext4 versus NTFS. Even though the digital data held within the file is the same, it is not stored the same way on the disk. However, if you say the NTFS one takes hours, then you are possibly verifying each sector on the drive, so you might be skipping some alternative checks in the ext4 filesystem check (explaining the large speed difference). It's a lot faster to verify each file rather than the entire disk surface. – Breakthrough – 2011-03-12T14:42:00.177

3

From my understanding, ext4 tries to write data into the largest contiguous run of free blocks where no data currently resides. This greatly reduces latency when those files are later read, because for the most part the whole content of an individual file lies on a single contiguous stretch of the disk, so the drive's head has less seeking to do when finding every block of data that makes up that one file.

It (ext4) can still become fragmented, but much less so, and not necessarily in a way that hurts read/write performance as severely as on NTFS. On NTFS, data is written to the first open blocks in the path of the head.

So wherever the head happens to be, if there are open blocks, NTFS writes as much of the data as will fit there, then writes the rest wherever the head lands next, for example when it has to move to another part of the disk to access a different file for a program you just loaded while the first file was still being written.
This means that a large file is likely to end up spread out in blocks separated from each other on separate tracks, and is why defragmenting is needed so often on NTFS.
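You can inspect how ext4 lays a file out using debugfs from e2fsprogs, without even mounting anything. A sketch with a throwaway image (file names here are arbitrary):

```shell
export PATH="$PATH:/sbin:/usr/sbin"

# File-backed ext4 image; no mounting or root required
truncate -s 32M frag.img
mkfs.ext4 -F -q frag.img

# Copy a 4 MiB file into the image, then dump its extent map
dd if=/dev/zero of=payload.bin bs=1M count=4 status=none
debugfs -w -R "write payload.bin payload" frag.img >/dev/null 2>&1
debugfs -R "stat payload" frag.img 2>/dev/null | grep -A2 'EXTENTS'
```

On a freshly formatted filesystem the file typically occupies one or very few contiguous extents; on a long-lived, nearly full volume the same file could be scattered across many fragments.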

This is also why servers generally don't use NTFS, as there is heavier I/O on a server, where data is constantly being written to and read from disk 24/7.

Also, I'm not sure, but if chkdsk checks the integrity of each file (which I believe both it and fsck do), then it would also be slower in comparison, due to what I just described about fragmentation on NTFS.

jesse james

Posted 2011-03-11T18:08:24.937

Reputation: 31

Neither NTFS chkdsk nor ext4 fsck read file data. It would be pointless, because there is no checksum or any other way to verify its integrity. – benrg – 2016-07-27T20:53:18.770

0

Windows should never need to check an NTFS volume at startup. If it does, something has gone seriously wrong—something much worse than a mere BSOD or power outage. There is a significant chance that some of your data was also corrupted by whatever corrupted the filesystem metadata. The disk check can't detect that; its only purpose is to avoid further corruption.

KB2854570 lists some reasons that this can happen. One is hibernating an OS with a volume mounted, modifying the contents of the volume, then resuming from hibernation with the volume (re)attached. If you do that, there is a high probability of silent data corruption.

I don't know why your ext4 filesystem was checking itself once per week, but it was probably (hopefully) not due to a comparable crisis recurring weekly. It was probably just doing a routine sanity check, not a full consistency check.
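Those routine ext4 checks are driven by per-filesystem counters that you can inspect and tune with tune2fs. A sketch against a throwaway image (image name is arbitrary; assumes e2fsprogs is installed):

```shell
export PATH="$PATH:/sbin:/usr/sbin"

# File-backed ext4 image, no root needed
truncate -s 64M sched.img
mkfs.ext4 -F -q sched.img

# -1 and 0 here mean the periodic check is disabled (the modern default)
tune2fs -l sched.img | grep -E 'Mount count|Check interval'

# Example: force a check every 30 mounts or every 7 days, whichever comes first
tune2fs -c 30 -i 7d sched.img >/dev/null
tune2fs -l sched.img | grep -E 'Maximum mount count|Check interval'
```

A weekly check like the one described in the question would correspond to a short check interval or low mount count configured this way, and fsck only does the fast, routine pass unless the filesystem is actually marked dirty.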

benrg

Posted 2011-03-11T18:08:24.937

Reputation: 548