I have a locally attached 5 TB RAID6 array (IBM DS3512). It will be used as storage for large data files that are written sequentially and then read back for processing. Eventually the data is deleted.

Directory traversal is not important, as we have our own indexing service.

Since this is an online system, availability and resilience against corruption are important, as is fast rebuild time.

Does XFS have particular advantages over EXT4 in this context?

Furthermore, how would I go about tuning the filesystem?

The target system runs RHEL 6.3.

asked by Asgeir S. Nilsen (edited by ewwhite)
  • What is "your own indexing service"? If it gives a filename, the directory traversal certainly matters, more so if there are lots of small(ish) files in few directories... – vonbrand Feb 20 '13 at 20:12
  • Yes, it is based on file names (full path). Files are typically 30 MB+ in size. We employ directory hashing to spread files across directories and keep the number of files per directory reasonable. – Asgeir S. Nilsen Feb 21 '13 at 14:11

2 Answers

I'd go with the default ext4, if only because it has shown in practice that it can take quite a beating, and in case of trouble there will probably be much more expertise at hand.

Oh, and before wishing you good luck: don't believe what colored squares with missing pieces tell you on random Internet sites. They might be spouting nonsense; in fact, they are spouting nonsense unless they know your problem and setup intimately. Set up an experiment with realistic data and workload, and measure. See if the difference really matters, and look for other inputs. Check what your operating system vendor recommends.
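
For example, fio (packaged for EL6 via EPEL) can approximate the write-then-read workload from the question. This is a minimal sketch; the target directory, file size, and job count are placeholders to adapt:

```
# Phase 1: sequential writes of large files (direct I/O bypasses the page cache)
fio --name=seqtest --directory=/mnt/test --rw=write \
    --bs=1M --size=4g --numjobs=4 --direct=1 --group_reporting

# Phase 2: read the same files back sequentially
# (reusing the same job name and size makes fio read the files from phase 1)
fio --name=seqtest --directory=/mnt/test --rw=read \
    --bs=1M --size=4g --numjobs=4 --direct=1 --group_reporting
```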

Oh, and good luck! Please do add an answer (or a comment) with your analysis and conclusions here, or contribute it to your distribution's documentation. You might even think of writing an article for LWN...

vonbrand

I think XFS can be tuned well for this purpose. It caches aggressively, handles large files well, scales to large file and directory counts, and is resilient.
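
To give that a concrete shape, here is a minimal sketch. The chunk size (128k), data-disk count (8), device name, and mount point are assumptions; substitute the actual geometry from your DS3512 configuration:

```
# Align XFS to the RAID6 stripe: su = controller chunk size,
# sw = number of data-bearing disks (total disks minus two for parity).
# 128k and 8 are assumed values; read the real ones from the array config.
mkfs.xfs -d su=128k,sw=8 /dev/sdb

# inode64 lets inodes live anywhere on the volume; nobarrier is only
# safe here because the controller's write cache is battery-backed.
mount -o inode64,nobarrier /dev/sdb /data
```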

On a RHEL 6.x system, you'll want to employ the tuned-adm framework and bias it towards the intended performance characteristics of your application.

Based on your description, it makes sense to consider the throughput-performance tuned profile.
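
On RHEL 6 that boils down to a short command sequence (tuned ships in the stock channel):

```
yum install tuned                         # if not already installed
tuned-adm list                            # show the available profiles
tuned-adm profile throughput-performance  # apply the profile
tuned-adm active                          # verify which profile is active
```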

Note:
There is a small load-related bug that impacts XFS on EL6 kernels from November 2012 onward... There is also a unique optimization feature that is now default in the in-kernel XFS version.

While I've been a big proponent of XFS for the past decade, I've been moving many installations to ZFS on Linux as a replacement, especially for large filesystems.

ewwhite
  • How would you reckon XFS fares in terms of recovery after a server crash? The RAID controller is battery-backed, so it's really only kernel crashes and uncontrolled shutdowns we need to care about. – Asgeir S. Nilsen Feb 21 '13 at 14:14
  • We have used ZFS before, as the systems ran Solaris before. Is ZFS on Linux compatible with a pre-existing ZFS from a Solaris host? – Asgeir S. Nilsen Feb 21 '13 at 14:15
  • @AsgeirS.Nilsen Absolutely. You can zpool export/import existing pools into the Linux version (see the sketch after these comments). – ewwhite Feb 21 '13 at 14:25
  • XFS still has some pretty serious memory needs. I recently set up a 10 TB XFS filesystem with a few million files (log archive storage) and was unable to run xfs_check on a system with 24 GB of RAM. I've read that this has been greatly improved in the recent past, but this was on CentOS 6.3, roughly the equivalent of RHEL 6.3. – Feb 23 '13 at 16:14
  • @yoonix that doesn't sound right. – ewwhite Feb 23 '13 at 16:55
  • Jan 31 19:03:42 backup01 kernel: xfs_db invoked oom-killer: gfp_mask=0x280da, order=0, oom_adj=0, oom_score_adj=0 –  Feb 23 '13 at 17:24
  • @AsgeirS.Nilsen With XFS, after losing part of the block device that the FS is on, you can end up with a lot of files with names like 3221831711 (where the name = the inode number, since the inode survives but the directory entry doesn't). You will also find fragments of your directory tree, with the top-level directory having a numeric name but everything under it being intact. (Probably some files missing if their inodes were on the lost part of the device; I forget.) I don't remember having many recovered files with corrupt data, but this was a while ago. Probably since XFS keeps inodes near extents. – Peter Cordes Dec 28 '14 at 22:39
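
Regarding the Solaris-to-Linux ZFS migration mentioned in the comments, here is a minimal sketch. The pool name tank is a placeholder, and the pool must be at a version ZFS on Linux can read (v28 or lower when coming from Solaris):

```
# On the Solaris host: cleanly detach the pool
zpool export tank

# After moving the disks to the Linux host:
zpool import            # scan attached devices and list importable pools
zpool import tank       # import the pool by name
zpool status tank       # verify pool health after the move
```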