Recently we had a failure in our storage and needed to run fsck. The storage is about 1.2 TB, and the check took us more than 5 hours.

Is there an alternative to the ext3 filesystem, or one that is better than ext3? Suggestions with pros and cons are welcome.


Abi Aqil
  • 5 hours? That's a lot. What's your hardware? You don't say anything. There's no way you can get any decent help with such limited info. – niXar Aug 18 '09 at 01:30
  • 35k users, and the storage is used to store emails in Maildir format. As someone mentioned in the answers below, we have more than one million emails... – Abi Aqil Aug 18 '09 at 02:08
  • You don't mention what the "failure in storage" was... Is there a possibility that there's a problem with the drive (or a drive in a RAID) that can be causing it to need re-reading and resets, lengthening the time to check the drive? – Bart Silverstrim Aug 18 '09 at 11:02

6 Answers


There are a few LWN articles that might be of interest:


You've got a lot of options here. File systems, although they share a similar basic purpose, all behave differently depending on what type of workload you throw at them. It's practically a certainty that while one person may swear by the benefits of, say, ReiserFS, another will loathe it.

From an enterprise point of view the two file systems I'm most familiar with are JFS & XFS. Although it's not very widely used, I'm a fan of JFS: it has NEVER let me down, it gives good to high performance in a variety of workloads, it's very stable, and it is relatively tolerant of power failures. XFS will give you somewhat better performance, but it does have significant known risks of data loss or corruption if power is interrupted, so it's definitely worth using only if you have managed power.

In desktop land I'm now using ext4 exclusively, as it's MUCH faster than ext3 and causes less I/O overhead and CPU use, which is great for extending my laptop battery life :-)

Good references for more info: http://en.wikipedia.org/wiki/Comparison_of_file_systems

Edit: As others have also mentioned, in the event of a file system error [corruption etc.] some kind of repair is going to have to occur to fix the problem. Whether this is automated [like ZFS] or manual [virtually everything else], it will take time for your file system to get back into a clean state. Most of the time you are going to have to do these operations with the file system either unmounted or in a read-only state. How much time this takes is going to vary, mostly depending on the severity of the problem, the size and state of the metadata, and the speed of your disks. A horrible example I've been through was an XFS corruption requiring a complete 12 TB file system rebuild, which took around 12 hours to complete.


I suspect that you're going to see similar results for any filesystem when you have to run a full check of the filesystem. If you have 1,000,000 inodes in use, it doesn't really matter how they're organized if you have to check the consistency of all of them. Any way you cut it, you're going to be touching 1,000,000 files.
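To get a feel for how much a full check has to cover, here is a minimal Python sketch (not part of the original answer). It reads the inode counters of a mounted filesystem with `os.statvfs` and, separately, counts entries under a directory tree. The function names are made up for illustration, and some filesystems report zero for the inode counters.

```python
import os

def inodes_in_use(mount_point):
    """Return (used, total) inode counts for the filesystem holding mount_point."""
    st = os.statvfs(mount_point)
    # f_files is the total number of inodes, f_ffree the number still free.
    return st.f_files - st.f_ffree, st.f_files

def count_entries(root):
    """Count files and directories under root by walking the tree."""
    files = dirs = 0
    for _dirpath, dirnames, filenames in os.walk(root):
        dirs += len(dirnames)
        files += len(filenames)
    return files, dirs
```

With Maildir every message is its own file, so the file count tracks the number of emails stored.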

The things that will significantly speed this up are faster disks and more spindles. If you need 1.2 TB, you'll get significantly better performance out of 8× 300 GB SAS drives in RAID 10 than you will out of a single 1.2 TB SATA drive, independent of the filesystem. Sure, it will cost you more, but what does your downtime cost you? It still won't prevent filesystem errors, but it will reduce the recovery time.
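The capacity arithmetic behind that example can be sketched in Python. This assumes RAID 10 built from two-way mirrors, which is the common layout but an assumption here; the function names are illustrative only.

```python
def raid10_usable_gb(drive_count, drive_size_gb):
    """Usable capacity of a RAID 10 array built from two-way mirrors."""
    if drive_count < 4 or drive_count % 2:
        raise ValueError("RAID 10 needs an even number of drives, at least 4")
    # Every block is stored twice, so half the raw capacity is usable.
    return drive_count * drive_size_gb / 2

def mirror_pairs(drive_count):
    """Number of mirrored pairs that data is striped across."""
    return drive_count // 2

print(raid10_usable_gb(8, 300))  # 1200.0
print(mirror_pairs(8))           # 4
```

So eight 300 GB drives give the same 1.2 TB of usable space, but reads during a check can be spread over eight spindles instead of one.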

Something else to consider is whether the data on the failed volume changes very much. If it's mostly static and you have a good backup, it may be faster to re-mkfs the volume and restore from backup. You risk losing recent changes, but again, you have to weigh this against the cost of downtime.

James Sneeringer
  • Downvoted because he's asking about filesystems, not drive architectures. – Karl Katzke Aug 18 '09 at 02:47
  • Upvoted because he may think the primary issue is filesystems, but it's also possible that he didn't consider the effect of hardware on the filesystem check when there is specific checking for corruption of data... I didn't see anything wrong with proposing checking something someone may not have considered in pursuing the problem. – Bart Silverstrim Aug 18 '09 at 11:00

I'm not sure there are any really good alternatives for which fsck doesn't take a long time after corruption on a file system that size. XFS is typically my file system of choice on Linux, but any corruption on it also requires fsck. I'm not sure it's that much faster than ext3.

While this doesn't help with your current situation the days of fsck are numbered. The newer COW file systems, such as ZFS on Solaris/OpenSolaris, are never inconsistent and do not require fsck. I'm hoping that Btrfs will be similar in that regard on Linux once it's production ready. For now the best thing to do is try to limit the size of the file systems...not always an option in today's explosion of unstructured data.

Do you know what caused the file system corruption to begin with? If it was a sudden loss of power, then the best thing you can do is get a UPS.

  • HAHAHAHAHAHAHAHA... ZFS never requires a fsck? Pull the other one, it plays jingle bells. More things can cause filesystem corruption than just an inopportune power failure. – womble Aug 18 '09 at 01:31
  • ZFS doesn't even have a fsck utility. Due to the COW and checksumming, its metadata and data are always consistent. That doesn't mean that data not fully flushed to disk will be there or that hardware failures can't do funny things. But in general the checksumming should catch the hardware-related issues. http://opensolaris.org/os/community/zfs/faq/#whynofsck – 3dinfluence Aug 18 '09 at 01:41
  • "In general" doesn't mean "always"... so you're saying that "Most of the time the checksumming should catch hardware issues". Given that I know two separate people who have written programs to repair corruption in ZFS filesystems, your assertions hold little value for me. – womble Aug 18 '09 at 01:49
  • Just looked at the fsck situation with btrfs. It looks like it will have a fsck utility and it will be able to run on a mounted online filesystem. I'm not exactly sure how it differs from a scrub though given the checksumming on both the metadata and the data. – 3dinfluence Aug 18 '09 at 01:54
  • I hear you. I know of one such instance as well. But in that case I don't know that fsck would have helped. As we all know, fsck isn't perfect either. I've lost more ext3 file systems than file systems most people consider more aggressive, such as XFS. It's all just a roll of the dice, and at some point backups are the only solution. But with ZFS I'll still say that if you're doing proper maintenance on your pool (scrubbing), the chance of ever needing a fsck is near zero outside of truly bizarre situations where backups are your friend. – 3dinfluence Aug 18 '09 at 02:09

If you're on Linux I'd actually recommend taking a close look at btrfs.

Imo there is no filesystem suitable for large storage in Linux right now. XFS, JFS, Reiser and others each have their own set of problems. JFS's defrag, for example, hasn't been ported to Linux yet. XFS is only a good option when you can make sure that there will never be a power loss and your hardware is absolutely reliable. I'm not convinced that the latter is predictable. Reiser isn't really in active development anymore and has its fair share of bugs, sadly.

btrfs is an effort to bring ZFS-like functionality to Linux while avoiding some of the design shortcomings of ZFS. Keep in mind, though, that btrfs is currently still under active development, but it is maturing. I wouldn't recommend it for production use quite yet, but it may be a viable upgrade path for you in the future. While you wait you might actually want to stick with ext3/4. While those are far from perfect, they are imo your best choice on Linux right now.


Have you considered splitting your volume up? Then you can run parallel fscks (or fewer, if you have some clean FSs).

E.g. 3 × 400 GB volumes = 1.2 TB. Use symlinks or change homedir paths to put files in the right spot.

Mail is pretty good for this - you don't get very big files.
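One way the symlink approach could look, as a rough Python sketch rather than a tested deployment recipe. The mount points in `VOLUMES`, the hash-based assignment, and all function names are assumptions made up for illustration; nothing here comes from the original setup.

```python
import hashlib
import os

# Hypothetical mount points for the three 400 GB volumes.
VOLUMES = ["/srv/mail0", "/srv/mail1", "/srv/mail2"]

def volume_for(user, volumes=VOLUMES):
    """Pick a volume for a user from a stable hash of the user name."""
    digest = hashlib.sha1(user.encode("utf-8")).digest()
    return volumes[int.from_bytes(digest[:4], "big") % len(volumes)]

def maildir_path(user, volumes=VOLUMES):
    """Real location of the user's Maildir on its assigned volume."""
    return os.path.join(volume_for(user, volumes), user, "Maildir")

def link_home(user, home_root, volumes=VOLUMES):
    """Create home_root/<user>/Maildir as a symlink to the real location."""
    link = os.path.join(home_root, user, "Maildir")
    os.makedirs(os.path.dirname(link), exist_ok=True)
    os.symlink(maildir_path(user, volumes), link)
    return link
```

Hashing the user name keeps the assignment stable without a lookup table, but adding a fourth volume later would move most users; an explicit user-to-volume map avoids that at the cost of maintaining it.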
