54

Which Linux filesystem would you choose for best speed in the following scenario:

  • a hundred million files
  • ~2k file size on average
  • >95% read access
  • pretty random access
  • high concurrency (>100 processes)

Note: The files are stored in a deep hierarchical tree to avoid large directories. Each leaf directory contains around one thousand files.

How would you benchmark it?
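
(As one possible starting point for the benchmarking part, and not a definitive recipe: fio can generate lots of small files and replay a concurrent random-read pattern. The mount point, file counts and job counts below are scaled-down assumptions, and fio lays its test files out flat rather than in a deep tree.)

# Illustrative fio run: 2 KB files, random reads, >100 concurrent workers.
# --directory must point at a mount of the filesystem under test.
fio --name=smallfile-randread \
    --directory=/mnt/fs-under-test \
    --rw=randread \
    --bs=2k \
    --filesize=2k \
    --nrfiles=1000 \
    --numjobs=128 \
    --group_reporting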

Mutantoe
  • 91
  • 6
bene
  • 2,214
  • 2
  • 19
  • 14
  • 3
    There's some additional info needed. For instance, are you storing all files in a flat directory, or in nested (sorted) directories? This can have a dramatic performance impact on file access times. Sifting through 100,000,000 entries in a "flat" arrangement will entail significant overhead regardless of the FS type; best case, you're looking at a tree search of some kind, which still requires multiple lookups to arrive at your file. If you categorize the files into subdirectories, access time will speed up significantly because there are fewer entries to search at each level. – Avery Payne May 10 '09 at 17:54
  • Are the files accessed serially or concurrently? – Steve Schnepp May 11 '09 at 08:45

7 Answers

23

In terms of random seeks Reiser wins, followed by EXT4, followed by JFS. I'm not sure if this correlates exactly to directory lookups, but it seems like a reasonable indicator; you'll have to run your own tests for that specifically. EXT2 beats the pants off everything for file creation times, likely due to its lack of a journal; still, EXT4 beats everything except Reiser, which you may not want to use given Hans Reiser's current status.
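
If you want to measure directory lookups and random reads on your actual tree rather than rely on generic seek benchmarks, a rough sketch would be the following (the path and sample size are placeholders; dropping the page cache requires root and should only be done on a test box):

# Sample 10,000 random files from the existing tree
find /mnt/fs-under-test -type f | shuf -n 10000 > /tmp/sample.lst
# Drop the page cache so the reads hit the disk
echo 3 > /proc/sys/vm/drop_caches
# Time cold reads with ~100 concurrent readers
time xargs -d '\n' -a /tmp/sample.lst -P 100 -n 1 cat > /dev/null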

You might want to look into drives that support NCQ, and make sure your installation is set up to use it. Under heavy seeking it should provide a speed boost.
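
To confirm NCQ is actually active (sda is just an example device), you can check the queue depth the kernel negotiated; a depth around 31 generally means NCQ is in use, while 1 means it is not:

# Negotiated queue depth for the device
cat /sys/block/sda/device/queue_depth
# The kernel also logs NCQ support at probe time
dmesg | grep -i ncq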

Lastly, make sure your machine has a ton of RAM. Since the files aren't updated often, Linux will end up caching most of them in RAM if it has free space. If your usage patterns are right, this will give you a massive speed boost.
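
A quick way to see how much of the data set is already being served from the page cache, and to pre-warm it if you like (the path is a placeholder; the pre-warm read is I/O heavy):

# Current page cache usage
free -h
grep -E '^(Cached|MemFree)' /proc/meminfo
# Optional: pre-warm the cache by reading the whole tree once
find /mnt/fs-under-test -type f -print0 | xargs -0 cat > /dev/null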

sysadmin1138
  • 131,083
  • 18
  • 173
  • 296
Andrew Cholakian
  • 876
  • 1
  • 6
  • 12
  • 1
    The problem with bonnie++ is that it doesn't even roughly test my usage scenario – bene May 10 '09 at 14:54
  • 2
    You've got a point about it not testing directory lookups, but honestly, if that's your choke point, you're better off dumping your data into a real database. Filesystems don't work nearly as well on the small objects most databases are designed to use – Andrew Cholakian May 12 '09 at 02:24
  • 7
    @AndrewCholakian Link is now dead. – Don Scott Nov 17 '15 at 04:13
8

I agree with most of what Andrew said, except that I would recommend Reiser4 or the older (but better supported) ReiserFS. As those tests (and the documentation for ReiserFS) indicate, it is designed for precisely the situation you are asking about (large numbers of small files or directories). I have used ReiserFS in the past with Gentoo and Ubuntu without any problems.
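
For reference, a minimal ReiserFS setup for a read-mostly small-file workload might look like the lines below; the device and mount point are placeholders, noatime just avoids metadata writes on every read, and ReiserFS's tail packing (which helps small files) is already on by default:

mkfs.reiserfs /dev/sdb1                     # or mkreiserfs, depending on your reiserfsprogs
mount -o noatime /dev/sdb1 /srv/smallfiles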

As to the status of Hans Reiser, I do not see it as a problem with the code or the stability of the file system itself. Reiser4 is even sponsored by both DARPA and Linspire, so while I agree that further development of the Reiser file system is uncertain, I do not think that should be the deciding factor in whether anyone uses it.

Mike
  • 404
  • 3
  • 7
  • 5
    I've used ReiserFS for a long time. Actually, I'm *still* using it on an older Gentoo server I haven't gotten around to reinstalling yet. This installation is 4 years old this May. What I *can* tell you is that it has slowed down significantly. That slowdown has occurred over time on every ReiserFS file system in active read+write use, on every machine that had one, without exception, so if you want to use it over a prolonged period of time it's something to keep in mind. I've moved away from it and now use XFS for big filesystems. – Mihai Limbăşan May 10 '09 at 00:54
4

I know this is not a direct answer to your question, but in cases like this I think a database might be a better fit. Small files can be stored in binary form in a database table and retrieved at will. The software that uses these files would need to support this, though...
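
As a rough illustration of that idea (the database name, table layout and paths are made up), even the sqlite3 command-line shell can stash small files as blobs, using the readfile()/writefile() helpers that recent sqlite3 shells ship with:

# Create a simple path -> blob table
sqlite3 files.db "CREATE TABLE IF NOT EXISTS files(path TEXT PRIMARY KEY, data BLOB);"
# Store one small file
sqlite3 files.db "INSERT INTO files VALUES('a/b/0001', readfile('/mnt/data/a/b/0001'));"
# Pull it back out to disk
sqlite3 files.db "SELECT writefile('/tmp/0001', data) FROM files WHERE path = 'a/b/0001';"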

Jeroen Landheer
  • 448
  • 1
  • 6
  • 14
  • 1
    What is a file system, if not just a hierarchical database? Your proposal adds layers of abstraction, complexity and software that are probably not warranted. Furthermore, the question's owner is accomplishing his task with the 'UNIX Philosophy', which I suspect you dislike, being more of a Windows guy? – Stu Thompson May 10 '09 at 10:14
  • 4
    First of all, I have nothing against Unix or anything else in that area. There are big differences between file systems and databases, and that's why both technologies were developed. Databases are designed to work with huge numbers of small entities, and they do a better job at that than most file systems. I was merely pointing out that there might be another road you can take with this. – Jeroen Landheer May 11 '09 at 19:16
  • 1
    And it is much easier to "clean/vacuum" a db file than to defrag a filesystem on Linux. Most (or all) of the filesystems don't provide that functionality, on the grounds that it isn't necessary. Judging by Mihai's comment above, though, that isn't strictly true. – Gringo Suave Jun 07 '13 at 22:50
3

Somebody over on the Unix StackExchange created a benchmark (with source) to test just this scenario:

Q: What is the most high-performance Linux filesystem for storing a lot of small files (HDD, not SSD)?

The best read performance seems to come from ReiserFS.

thenickdude
  • 340
  • 2
  • 9
  • Btrfs looks to have better or comparable results in everything but delete. But how often do you delete 300k files? I liked ReiserFS in the past, but Btrfs might be a better bet for the future. – Gringo Suave Jun 07 '13 at 22:57
3

In my experience, ext2 blows ext4 out of the water for small files. If you don't care about write integrity, it's great. For example, Subversion creates lots and lots and lots of small files, which ext4 and other filesystems (XFS) choke on (running a cron job that rsyncs the data from ext2 to ext4 every half hour or so virtually solves the problem).
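
The half-hourly rsync mentioned above could be as simple as a crontab entry like this (both paths are placeholders):

# m   h  dom mon dow  command
*/30  *  *   *   *    rsync -a --delete /mnt/ext2-live/ /mnt/ext4-copy/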

Running the following commands makes ext2 even faster (even though most of these options make the filesystem unstable after a crash unless you run sync before it crashes). They have almost no effect on ext4 with small files.

echo 15 > /proc/sys/vm/swappiness                    # swap application memory out less aggressively
echo 10 > /proc/sys/vm/vfs_cache_pressure            # hold on to dentry/inode caches longer
echo 99 > /proc/sys/vm/dirty_ratio                   # allow nearly all RAM to hold dirty pages
echo 50 > /proc/sys/vm/dirty_background_ratio        # delay background writeback until 50% of RAM is dirty
echo 360000 > /proc/sys/vm/dirty_expire_centisecs    # keep dirty pages in memory for up to an hour
echo 360000 > /proc/sys/vm/dirty_writeback_centisecs # wake the writeback threads only once an hour
echo "2000" > /proc/sys/vm/vfs_cache_pressure        # note: overrides the vfs_cache_pressure value set above
GregL
  • 9,030
  • 2
  • 24
  • 35
Jason Hall
  • 31
  • 2
  • Actually, if the files are tiny then ext4 would be better because it supports [inline files](https://lwn.net/Articles/469805/), which are stored directly in the inode (similar to [resident files](https://en.wikipedia.org/wiki/NTFS#Resident_vs._non-resident_attributes) in NTFS, which are stored in the MFT record). Since inodes are much smaller than MFT records by default, it may be useful to increase the inode size to fit more small files in the inodes – phuclv Apr 22 '21 at 04:58
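
If you want to try the inline-file approach from the comment above, the relevant knobs at mkfs time are the inode size and the inline_data feature (the device is a placeholder, and inline_data needs a reasonably recent e2fsprogs and kernel):

mkfs.ext4 -I 512 -O inline_data /dev/sdX1
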
1

I guess ext3 (or ext4), or maybe JFS, would be a nice solution. I'd be wary of ext4 and btrfs (filesystems are tricky; be prepared with backups if you want to use the latest, newest stuff).

There are also various parameters you can tweak at mkfs time to tune the filesystem to your liking.

I'd certainly recommend against XFS. Not because it's a bad filesystem, but because file creation/deletion is a costly operation on it.


To avoid problems with directory searches, use an intelligent naming scheme, for example:

<first letter of id>_<last letter of id>/<id>

or similar, more complicated schemes. This will speed up your directory searches and thus overall access speed. (It's an old Unix trick, dating back to V7 I think.)
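
A quick sketch of deriving such a path in shell (the id and base directory are just examples):

id=f8a31c27
dir="${id:0:1}_${id: -1}"        # first and last character of the id
mkdir -p "/mnt/data/$dir"
path="/mnt/data/$dir/$id"
echo "$path"                     # /mnt/data/f_7/f8a31c27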

  • 1
    What's the advantage of using the first and the last letter, rather than just the first n letters? – bene Jun 11 '09 at 14:55
  • It's just one of many possible schemes; whether it is an advantage depends on the "key" used for indexing. I had seen this particular scheme referenced for an application that stored data on people in an organisation, and they got better indexing that way. As always, you need to adapt it to your data and then profile until you find exact answers :) –  Jun 30 '09 at 18:31
1

Most filesystems will choke with more than 65K files in a directory; I think that is still true of ext4. The Reiser file systems do not have that limit (the folks at mp3.com paid to make sure of that). I'm not sure about anything else, but that is one of the usage scenarios ReiserFS was made for.
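
On ext3/ext4, the hashed-directory feature that mitigates the large-directory problem is dir_index; you can check for it and, if needed, enable it on an existing filesystem (the device below is only an example):

# Is dir_index already on?
tune2fs -l /dev/sda1 | grep -i 'filesystem features'
# Enable it, then rebuild the indexes for existing directories
tune2fs -O dir_index /dev/sda1
e2fsck -fD /dev/sda1             # only on an unmounted filesystem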

Khaled
  • 35,688
  • 8
  • 69
  • 98
Ronald Pottol
  • 1,683
  • 1
  • 11
  • 19
  • This weekend I had a dir on ext4 with 1,000,000 files in it. As long as you do not do `ls` or tab-completion, it works fast, probably due to the index. – Ole Tange Apr 14 '14 at 11:30
  • ext4 has a dir_index feature that speeds up directories containing many files. – alfonx Feb 22 '16 at 22:58