Best available technology for a layered disk cache in Linux



I've just bought a 6-core Phenom with 16 GB of RAM. I use it primarily for compiling and video encoding (and occasional web/db work). I'm finding that all of these activities become disk-bound and I just can't keep all 6 cores fed. I'm buying an SSD RAID to sit between the HDD and tmpfs.

I want to set up a "layered" filesystem where reads are cached on tmpfs but writes safely go through to the SSD. I want files (or blocks) on the SSD that haven't been read lately to then be written back to an HDD using a compressed FS or block layer.

So basically, reads:

  • Check tmpfs
  • Check SSD
  • Check HD

And writes:

  • Straight to SSD (for safety), then tmpfs (for speed)

And periodically, or when space gets low:

  • Move least frequently accessed files down one layer.
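The policy above can be modelled in a few lines of Python. This is only an illustrative in-memory sketch, not a filesystem: the class name `LayeredStore`, the capacities counted in items rather than bytes, and the use of least-recently-used eviction (a simpler stand-in for "least frequently accessed") are all assumptions made for the example.

```python
from collections import OrderedDict


class LayeredStore:
    """Toy model of the tmpfs -> SSD -> HDD layering (hypothetical names)."""

    def __init__(self, ram_capacity, ssd_capacity):
        self.ram_capacity = ram_capacity  # max items held in the "tmpfs" layer
        self.ssd_capacity = ssd_capacity  # max items held in the "SSD" layer
        self.ram = OrderedDict()  # read cache, oldest entry first
        self.ssd = OrderedDict()  # durable layer, oldest entry first
        self.hdd = {}             # unbounded bottom layer

    def write(self, key, data):
        # Writes land on the SSD first (safety), then in RAM (speed).
        self.hdd.pop(key, None)   # drop any stale copy on the bottom layer
        self.ssd[key] = data
        self.ssd.move_to_end(key)
        self._cache_in_ram(key, data)
        self._demote_from_ssd()

    def read(self, key):
        # Reads check tmpfs, then SSD, then HDD; returns (data, layer hit).
        if key in self.ram:
            self.ram.move_to_end(key)
            return self.ram[key], "tmpfs"
        if key in self.ssd:
            self.ssd.move_to_end(key)
            data = self.ssd[key]
            self._cache_in_ram(key, data)
            return data, "ssd"
        if key in self.hdd:
            data = self.hdd[key]
            self._cache_in_ram(key, data)
            return data, "hdd"
        raise KeyError(key)

    def _cache_in_ram(self, key, data):
        self.ram[key] = data
        self.ram.move_to_end(key)
        while len(self.ram) > self.ram_capacity:
            # Safe to discard: a lower layer always holds the durable copy.
            self.ram.popitem(last=False)

    def _demote_from_ssd(self):
        # When the SSD is over capacity, move its oldest entries down a layer.
        while len(self.ssd) > self.ssd_capacity:
            old_key, old_data = self.ssd.popitem(last=False)
            self.hdd[old_key] = old_data
```

With `LayeredStore(ram_capacity=1, ssd_capacity=2)`, writing three keys in a row pushes the first one down to the HDD layer, and a later read of that key is served from the HDD and then cached in RAM. A real implementation would also have to decide whether an HDD hit is promoted back to the SSD, which this sketch does not do.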

I've seen a few projects of interest. CacheFS, cachefsd, bcache seem pretty close but I'm having trouble determining which are practical. bcache seems a little risky (early adoption), cachefs seems tied to specific network filesystems.

There are also the "union" projects, unionfs and aufs, that let you mount filesystems over each other (usually a USB device over a DVD), but both are distributed as patches and I get the impression this sort of "transparent" mounting was going to become a kernel feature rather than a FS.

I know the kernel has a built-in disk cache but it doesn't seem to work well with compiling. I see a 20x speed improvement when I move my source files to tmpfs. I think it's because the standard buffers are dedicated to a specific process and compiling creates and destroys thousands of processes during a build (just guessing there). It looks like I really want those files precached.
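A minimal sketch of that kind of precaching, assuming it is enough to read every file in the source tree once so the kernel keeps its blocks in the page cache. The function name `warm_page_cache` and the 1 MiB chunk size are made up for the example, and nothing here guarantees the pages stay cached under memory pressure.

```python
import os


def warm_page_cache(root, chunk_size=1 << 20):
    """Read every regular file under root once and return the bytes read."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.isfile(path):
                continue  # skip broken symlinks, sockets, and similar
            try:
                with open(path, "rb") as f:
                    while True:
                        chunk = f.read(chunk_size)
                        if not chunk:
                            break
                        total += len(chunk)
            except OSError:
                continue  # unreadable file: ignore and carry on
    return total
```

Calling `warm_page_cache("/path/to/source")` before a build (the path is a placeholder) would be a cheap way to check whether the tmpfs speedup comes from having the files in memory at all.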

I've read tmpfs can use virtual memory. In that case is it practical to create a giant tmpfs with swap on the SSD?

I don't need to boot off the resulting layered filesystem. I can load grub, kernel and initrd from elsewhere if needed.

So that's the background. The question has several components I guess:

  • Recommended FS and/or block layer for the SSD and compressed HDD.
  • Recommended mkfs parameters (block size, options etc...)
  • Recommended cache/mount technology to bind the layers transparently
  • Required mount parameters
  • Required kernel options / patches, etc.


Posted 2010-10-17T03:26:35.653

Reputation: 282



Right now there is nothing production-ready that does this.

Here are the two options I would consider though:

  • bcache is a kernel patch to use an SSD as a cache for random reads and writes. It can be used with any filesystem.
  • KQInfotech has a closed beta of native ZFS 2 for Linux. The general availability date was supposed to be in early December 2010, but it was pushed back to January 5th and then again to January 14th.

I have not used either of those options on Linux. I have used ZFS on OpenSolaris and FreeBSD though.

Jeff Strunk

Posted 2010-10-17T03:26:35.653

Reputation: 331