
I have a system with a high throughput of small files on disk, i.e. a huge number of small files are created, written, and deleted within seconds.

Are there any reasonable ext2/ext3/ext4 mount options to improve the performance? I guess that metadata journalling results in a huge performance drop here.

Michael

2 Answers


I have a Synology NAS (DS1815+, DSM 5.2) running ext4 and I observed bad performance on directories with thousands of (small) files.

The Synology NAS did not have the dir_index feature enabled on its ext4 filesystem by default! I have no idea why Synology does that, because dir_index only kicks in for big directories. It creates an index that makes accessing, counting, iterating etc. much faster.
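To get a rough feel for the effect, you can create a directory with many small files and time a lookup in it. This is only an illustrative sketch; the mount point and file count below are made up:

# Create 100k empty files in one directory (path is a placeholder)
mkdir -p /mnt/volume/testdir && cd /mnt/volume/testdir
for i in $(seq 1 100000); do : > "f$i"; done
# Time a raw (unsorted) listing and a single-file lookup
time ls -f . > /dev/null
time stat f99999 > /dev/null

Without dir_index such lookups scan the directory linearly; with the hashed index they should be noticeably faster.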

To check whether the feature is enabled, use tune2fs -l /dev/yourDev:

tune2fs -l  /dev/vg1/volume_1  | grep features 
Filesystem features:      ... dir_index ...

If you do not see dir_index in the list of features, then you can add it to your ext3/4 filesystem:

  1. umount /dev/yourDev                 # Unmount the device
  2. tune2fs -O dir_index /dev/yourDev   # Add the feature to the filesystem
  3. e2fsck -D /dev/yourDev              # Create the indexes
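Afterwards you can verify that the flag took effect and remount; the device and mount point here are placeholders:

tune2fs -l /dev/yourDev | grep dir_index   # features line should now include dir_index
mount /dev/yourDev /your/mountpoint        # remount when done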

For a Synology NAS the steps are:

syno_poweroff_task -d
vgchange -ay
tune2fs -O dir_index /dev/vg1/volume_1
e2fsck -D /dev/vg1/volume_1

The dir_index feature is available for ext3 and ext4 and is usually a default when creating ext4 on modern Linux distributions. However, as this Synology NAS shows, it's worth checking. In my case, this feature increased the rsync rate for a folder with 100k (small) files to the NAS from 4 Mb/s to 14 Mb/s.
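If you want to reproduce such a before/after measurement yourself, something along these lines works; the host name and paths are purely illustrative:

# Copy the same tree of ~100k small files and compare the reported
# throughput before and after enabling dir_index on the target
time rsync -a --stats ./manyfiles/ admin@nas:/volume1/backup/manyfiles/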

Some quoting from www.debian-administration.org:

The most useful tweak you can perform is the way that directory indexes are scanned when looking for files. This is controlled by the option "dir_index". By default this wouldn't be enabled but you can add it by running:

mine:~# tune2fs -O dir_index /dev/sda1

Once you've done this you'll be able to see the updated filesystem flags which are in use:

mine:~# tune2fs -l /dev/sda1 | grep features
Filesystem features:      has_journal resize_inode dir_index filetype needs_recovery sparse_super large_file

Once you've done this you should find that listing the contents of directories with large numbers of files becomes faster, and that finding files in directories is also better.

alfonx

Yeah, metadata ops are going to absolutely kill you. The most important mount option I can imagine helping would be noatime, which turns off atime (or "last access time") updates on all files. That'll save one metadata write for every file access, which might cut your I/O anywhere from half (if you read/write each file once) to a factor of hundreds (if you write once, read many). noatime also implies nodiratime, which turns off atime updates just on directories. If that's a bit too brutal (you need atime sometimes), then consider relatime (mount(8) explains that one better than I can).
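For illustration, a minimal fstab entry with noatime could look like this; the device, mount point, and remaining fields are assumptions for your setup:

# /etc/fstab — device and mount point are placeholders
/dev/sdb1  /srv/scratch  ext4  noatime  0  2

# Or apply it to an already-mounted filesystem without rebooting:
mount -o remount,noatime /srv/scratch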

On a hardware level, seriously consider more RAM and a hardware RAID controller with a non-volatile cache. More RAM helps the kernel cache more data, which reduces (or can even eliminate) read I/O, and the NVRAM-cached RAID controller means that your data is safe and secure once it's been written to flash (which is fast), rather than all the way down to the spinning disks (which is slooooow). You can also go with SSDs, but they're still significantly slower than NVRAM.

womble
  • Nice answer. What about turning off journalling (via `tune2fs -O ^has_journal`) or using ext2? Would that improve performance? – Michael Aug 07 '12 at 09:28
  • I haven't benchmarked those, but I'd be wary of them because they reduce durability. If you can trade off durability for performance, there are a whole *range* of interesting things you can do... – womble Aug 07 '12 at 09:30