8

I am dealing with a hundred million files in a filesystem (distributed among many subdirectories), and I need to be able to list them very quickly, particularly in order to rsync them efficiently.

On the other hand, I don't really need the actual content of the files kept in cache.

I am constantly adding and removing files, but not that frequently (something like ten times per second).

Is there a way I can tell the OS (2.6.18-194.el5) to use the 24GB of available RAM more for inode caching than for file caching? I already looked at /proc/sys/vm/vfs_cache_pressure but it doesn't seem to be exactly what I am looking for...

john.doe

3 Answers

6

How about lowering the value for vfs_cache_pressure? According to Documentation for /proc/sys/vm/* this should do what you want:

vfs_cache_pressure

This percentage value controls the tendency of the kernel to reclaim the memory which is used for caching of directory and inode objects.

At the default value of vfs_cache_pressure=100 the kernel will attempt to reclaim dentries and inodes at a "fair" rate with respect to pagecache and swapcache reclaim. Decreasing vfs_cache_pressure causes the kernel to prefer to retain dentry and inode caches. When vfs_cache_pressure=0, the kernel will never reclaim dentries and inodes due to memory pressure and this can easily lead to out-of-memory conditions. Increasing vfs_cache_pressure beyond 100 causes the kernel to prefer to reclaim dentries and inodes.

Increasing vfs_cache_pressure significantly beyond 100 may have negative performance impact. Reclaim code needs to take various locks to find freeable directory and inode objects. With vfs_cache_pressure=1000, it will look for ten times more freeable objects than there are.

seeker
2

You can use these two commands to do the same job.

updatedb (updates the database of file and directory locations for the whole drive)

locate / (lists all files in the whole OS, which is lightning fast because it reads them from the database)

Farhan
  • Yup, I run updatedb from cron to keep the inode cache 'warm.' It works well; take a look at your slabtop to see statistics. Also, running `strace -c updatedb` can give you some insight into how much gets updated. – Marcin Dec 06 '11 at 13:00
  • 1
    Thank you, but how does that help me improve rsync performance? Edit: OK, just read Marcin's comment, but regarding the warmness of the cache, how is it different from running a simple find /? – john.doe Dec 06 '11 at 13:03
  • In some implementations, the `updatedb` shell script actually runs `find /` to get the initial file list. On Mac OS X it still works this way. – Ladadadada Dec 06 '11 at 14:38
  • 4
    The OP scans the filesystem multiple times per second, anyway. I don't see any advantage in scanning it even more often. – hagello Oct 06 '15 at 23:57
1

Complementary to seeker's answer I would like to add/highlight a few things about vfs_cache_pressure.

In short, it

influences the tendency of the system to reclaim the memory used for VFS caches (dentries and inodes), versus pagecache and swap.

-> doc of /proc/sys/vm/

Some important values:

  • =0: The kernel will never reclaim dentries and inodes due to memory pressure (this can easily lead to out-of-memory conditions)
  • =100: Reclaim at a "fair" rate (= default)

To apply a change temporarily, write the new value to /proc/sys/vm/vfs_cache_pressure. Reading the file shows the value currently in effect:

$ cat /proc/sys/vm/vfs_cache_pressure 
15
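
The temporary change can be sketched as a pair of small shell helpers. This is only an illustration: the function names and the optional path argument are hypothetical (the path argument exists so the helpers can be tried on an ordinary file), and writing to the real /proc file requires root. The value is lost at reboot.

```shell
#!/bin/sh
# Hypothetical helpers, not part of any distribution.
VFS_PRESSURE_FILE=/proc/sys/vm/vfs_cache_pressure

# Print the value currently in effect.
get_vfs_cache_pressure() {
    cat "${1:-$VFS_PRESSURE_FILE}"
}

# Write a new value; rejects anything that is not a non-negative integer.
# Needs root when the target is the real /proc file.
set_vfs_cache_pressure() {
    value=$1
    file=${2:-$VFS_PRESSURE_FILE}
    case $value in
        ''|*[!0-9]*)
            echo "value must be a non-negative integer" >&2
            return 1
            ;;
    esac
    echo "$value" > "$file"
}
```

Run as root, `set_vfs_cache_pressure 15` should have the same effect as `sysctl -w vm.vfs_cache_pressure=15`.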

For a permanent change (applied at boot):

Either add a line to /etc/sysctl.conf or (better) create a new file matching /etc/sysctl.d/*.conf. E.g.:

$ cat 10-vfs-cache-pressure.conf 
vm.vfs_cache_pressure=10
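
A minimal sketch of creating such a file and loading it without a reboot. The helper name is hypothetical, and whether /etc/sysctl.d is read at boot depends on the distribution and its age, so on an older system the line may have to go into /etc/sysctl.conf instead.

```shell
#!/bin/sh
# Hypothetical helper: write a one-line sysctl drop-in file.
# The second argument overrides the target path (default needs root).
write_vfs_pressure_conf() {
    value=$1
    conf=${2:-/etc/sysctl.d/10-vfs-cache-pressure.conf}
    printf 'vm.vfs_cache_pressure=%s\n' "$value" > "$conf"
}

# As root, write the file and then load it immediately:
#   write_vfs_pressure_conf 10
#   sysctl -p /etc/sysctl.d/10-vfs-cache-pressure.conf
```

`sysctl -p FILE` applies the settings in the given file right away, so the new value can be checked before relying on it across reboots.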

For me, decreasing it to somewhere between 10 and 15 resulted in good performance; however, this of course depends a lot on your system and on the number of users and services running on it. I think there is no substitute for experimenting and carefully observing the impact.


You might want to take a closer look at slabtop.

slabtop will help you investigate the consequences of changing this and related kernel parameters (dentry and *inode_cache are most probably what you are looking for -> this answer might also help here).
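
For a quick non-interactive check of the same numbers, the dentry and inode slab caches can be summarized straight from /proc/slabinfo. The helper below is a hypothetical sketch that assumes the slabinfo 2.x column layout (name, active_objs, num_objs, objsize, ...) and estimates memory as num_objs * objsize; reading /proc/slabinfo usually requires root.

```shell
#!/bin/sh
# Hypothetical helper: print object count and approximate size of the
# dentry and *inode_cache slabs. Matches both "dentry" and "dentry_cache",
# since the slab name differs between kernel versions.
slab_inode_usage() {
    awk '$1 ~ /^dentry/ || $1 ~ /inode_cache$/ {
        printf "%s %d objects %.1f MiB\n", $1, $3, $3 * $4 / 1048576
    }' "${1:-/proc/slabinfo}"
}
```

Running `slab_inode_usage` before and after changing vfs_cache_pressure (and after a full rsync or find pass) gives a rough idea of how many entries stay cached. `slabtop -o` prints a comparable one-shot snapshot.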