
Quick and dirty summary: I want something like a write cache that drains only when the system isn't busy.

I want something along the lines of the question "RAM disk and physical RAID", with a slight twist.

For a particular build I run often, I would like to keep build output in a ramdisk that eventually gets written out to disk. One target I'm building is really a bunch of cp, tar, gzip and such, so I'm disk bound. In a tight debug cycle I want it to be fast, and I don't care much if the build output is destroyed by a power-off. However, it would be nice if the ramdisk were synced out to disk when the system isn't busy. I can imagine doing this with a cron job, but I'm asking you all on the off chance there's a more coherent solution that combines the characteristics of the other question (ramdisk that overflows to disk) with this new twist (given enough idle disk time, the contents of the ramdisk make it out to disk too). Ideally, the whole thing looks like a single mountpoint, where I set the total size and the amount of RAM to use.

--- updated ---

I don't think the page cache does what I want because I really want pretty fast write performance.

kbyrd
  • Which OS is this, for a start? Sounds Linux-y; perhaps ionice on your rsync script? – Ronald Pottol Oct 20 '09 at 20:55
  • You may be able to use the `batch` command (which schedules jobs to be run by atd) to run the sync operation only when the system load drops below a certain threshold; see `man batch`. – Anthony Chivetta Oct 20 '09 at 21:45

3 Answers


but I'm asking you all on the off chance there's a more coherent solution that combines the characteristics of the other question (ramdisk that overflows to disk) with this new twist (given enough idle disk time, the contents of the ramdisk make it out to disk too).

What you're asking for is a deferred write mechanism, i.e. writes to the ramdisk receive priority over writes to permanent storage, but all data eventually writes to disk, correct?

Ideally, the whole thing looks like a single mountpoint, where I set the total size and the amount of RAM to use.

As funny as it sounds, you might be able to get away with using an LVM mirror to accomplish this.

  1. Make the ramdisk a member of a volume group with a physical drive.

  2. Mirror the ramdisk to your hard drive. Note that LVM mirrors are direction-specific, i.e. data flows from one PV to another unidirectionally.

  3. Mount the LVM volume somewhere as a unified filesystem.

Writes are spooled up and written to the LVM-based ramdisk (and by virtue of the mirror, the physical drive as well). Unlike a RAID-1 where the writes are synchronous and parallel (both drives write out at once), an LVM mirror is asynchronous and sequential (the primary drive receives the write, then LVM pushes the write to the mirror). This comes close (but not quite 100%) to the behaviour you're looking for. Keep in mind that LVM does put pressure on pending writes to clear to disk, so any "idling" you see will be measured in seconds at best, partial seconds at worst.
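The three steps above could be laid out roughly as follows. This is only a sketch that prints the commands rather than running them, since they are destructive and need root. The device names (`/dev/ram0`, `/dev/sdb1`), the volume group and volume names, the size, the filesystem, and the mountpoint are all placeholders I made up for illustration, and I have not verified which leg LVM treats as the primary in this layout.

```python
import shlex


def lvm_mirror_plan(ram_dev="/dev/ram0", disk_dev="/dev/sdb1",
                    vg="vg_build", lv="build", size="2G",
                    mountpoint="/mnt/build"):
    """Return the commands (as argv lists) for a ramdisk + disk LVM mirror.

    Every default here is a hypothetical placeholder, not a value from
    the thread. Nothing is executed by this function.
    """
    lv_path = f"/dev/{vg}/{lv}"
    return [
        # Step 1: put the ramdisk and the physical partition in one volume group.
        ["pvcreate", ram_dev, disk_dev],
        ["vgcreate", vg, ram_dev, disk_dev],
        # Step 2: create a logical volume with one mirror copy across both PVs.
        # "--mirrorlog core" keeps the mirror log in memory.
        ["lvcreate", "--type", "mirror", "-m", "1", "--mirrorlog", "core",
         "-L", size, "-n", lv, vg, ram_dev, disk_dev],
        # Step 3: make a filesystem (ext4 is an assumption) and mount it.
        ["mkfs.ext4", lv_path],
        ["mount", lv_path, mountpoint],
    ]


if __name__ == "__main__":
    # Dry run: print the plan so it can be reviewed before anything is run.
    for argv in lvm_mirror_plan():
        print(shlex.join(argv))
```

Printing the plan first makes it easy to check the device names before committing to anything that wipes a partition.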

The flip side of this arrangement is that you now have a very nice persistence mechanism. When you start up, create your ramdisk and mirror the existing drive to it; once the mirror is complete, break the mirror, and reverse the direction (ramdisk -> hdd). This means every restart will result in your data being put into the ramdisk, and just before shutdown, written from ramdisk back to a hard drive. It could probably be scripted in perl or a shell script.

I'm sure that there are other ways to do this, but this is the quick'n'dirty version. I'll think about it a bit more and see what I can figure out.

Avery Payne
  • Very nice! I was willing to get much worse write sync deferral, even waiting minutes or more if I had to. I'm saying I want all writes for build output to be at RAM speed, sacrificing reliability. But, if I can get some no-guarantees reliability by using the system when it's otherwise idle, I'll take it. – kbyrd Oct 20 '09 at 22:06
  • I forgot to mention, the write-back is not exactly "lazy" - with very high I/O pressure, the LVM will try to write out as much as it can, so it might not entirely solve your issue. But the same follow-up edit above also talks about how it can make your data "persistent" with just a small amount of overhead. – Avery Payne Oct 20 '09 at 22:24

What you describe is exactly what your operating system's disk cache is supposed to do. Modern operating systems are very good at this if they have enough memory at their disposal.

In my opinion: Give your machine enough RAM and let the OS do the hard work.

Ludwig Weinzierl
  • That's true for reading, but not for writing. The build is doing a bunch of writing, and my impression is that the kernel tries pretty hard to get those now-dirty pages written out to disk, eventually slowing down the writes until it can catch up. With a few GB in a ramdisk, the kernel no longer tries to write out this data. I'm cool with losing it, but I would like it to eventually (minutes or hours later is fine) hit disk if my system isn't otherwise occupied. – kbyrd Oct 20 '09 at 17:01
  • Play around with some of the options in /proc/sys/vm/, in particular, dirty_background_ratio. – David Pashley Oct 20 '09 at 17:33
  • I did a cursory look, I'll look more later. But, I want to do this for a particular subset of my system, a mountpoint or directory or whatever. I want pretty normal behavior on everything else. – kbyrd Oct 20 '09 at 18:22
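
To see what the writeback settings mentioned in the comments are currently set to, something like the following could be used. It is a minimal Python sketch assuming a Linux system; the list of knobs and the `read_vm_knobs` helper are my own illustrative choices, not something from the thread. Note that these settings are system-wide, so they do not give the per-mountpoint behaviour asked for above.

```python
from pathlib import Path

# Writeback-related files under /proc/sys/vm on Linux.
KNOBS = (
    "dirty_background_ratio",
    "dirty_ratio",
    "dirty_expire_centisecs",
    "dirty_writeback_centisecs",
)


def read_vm_knobs(base="/proc/sys/vm"):
    """Return {knob_name: int_value} for each knob file found under base.

    Knobs that are missing (e.g. on a non-Linux system) are skipped.
    """
    values = {}
    for name in KNOBS:
        path = Path(base) / name
        if path.is_file():
            values[name] = int(path.read_text().strip())
    return values


if __name__ == "__main__":
    for name, value in read_vm_knobs().items():
        print(f"{name} = {value}")
```

The `base` parameter exists only so the function can be pointed at a different directory for testing.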

A couple of wacky ideas. First, Puppy Linux kinda does this; it syncs your working space to permanent storage every so often. It seems to do this with smart scripts and simple copy commands.

Second ... what if you scheduled an rsync between ramdisk->real disk every so often?
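
A rough sketch of that second idea, in Python: run rsync under ionice's idle class, and only when the 1-minute load average is below a threshold. The function names, the `max_load` default, and the example paths are hypothetical; the `run` and `loadavg` parameters are there only so the logic can be exercised without touching the disk.

```python
import os
import subprocess


def build_sync_command(src, dst):
    """Return argv for an idle-I/O-priority rsync from src into dst."""
    # "ionice -c 3" selects the idle I/O scheduling class.
    # "rsync -a" is archive mode; trailing slashes copy directory contents.
    # No --delete: an empty ramdisk after a reboot must not wipe the disk copy.
    return ["ionice", "-c", "3", "rsync", "-a",
            src.rstrip("/") + "/", dst.rstrip("/") + "/"]


def system_is_idle(max_load, loadavg=None):
    """True if the 1-minute load average is below max_load."""
    if loadavg is None:
        loadavg = os.getloadavg()
    return loadavg[0] < max_load


def sync_if_idle(src, dst, max_load=1.0, run=subprocess.run, loadavg=None):
    """Run the sync only when the system looks idle; return True if it ran."""
    if not system_is_idle(max_load, loadavg):
        return False
    run(build_sync_command(src, dst), check=True)
    return True


if __name__ == "__main__":
    # Example paths are placeholders; this only prints the command.
    print(" ".join(build_sync_command("/mnt/ramdisk/out", "/srv/build-out")))
```

Calling `sync_if_idle` from a cron job every few minutes would give the "how long am I willing to wait" knob, with the load check and ionice keeping it out of the way of a running build.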

JamesR
  • I ended up doing this; it let me define how long I was willing to wait, plus I could ionice the rsync. – kbyrd Oct 23 '09 at 15:17