4

I have read a lot on Server Fault and searched Google about write caching, but I still can't find the answer.

I have an HP ProLiant DL380 G5 with hardware RAID and a 512 MB battery-backed cache. I use Debian Linux.

There are three write caches:

  1. OS write cache
  2. HW RAID with 512 MB battery-backed cache
  3. HDD caches

My question is: how do I configure them properly so that no data is lost during a power loss?

I was thinking that disabling the OS write cache and the HDD caches would solve the problem, and that the system would still perform well because of the HW RAID cache. Am I right?

The second question is about the HW RAID cache read/write ratio. Since the OS RAM is used as a read cache, I was thinking it would be better to change the ratio of the HW RAID cache to 0/100 or maybe 20/80 (read/write), so that it is utilized better than at 50/50, which I think is the default read/write cache ratio. What are the optimal values for this ratio?

Thank you

ewwhite
Jozko

2 Answers

3

These systems are designed to just plug in and go. Here's how each tier handles I/O.

OS

Writes are cached briefly (as dirty pages) in RAM while the I/O subsystem actually commits them. Once a write is committed, the page is then cached in case it is immediately read again. The OS cache does not maintain a pool of uncommitted writes; it maintains a pool of already committed writes that may need to be read again. It is, in effect, a 100% read cache.
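To see how much data is sitting in dirty pages at a given moment, Linux exposes `Dirty` and `Writeback` counters in `/proc/meminfo`. The following is a minimal sketch, not part of the original answer; the function names are made up for illustration, and it parses text you pass in rather than reading the file itself.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field_name: integer_value}.

    Most fields are reported in kB; the unit suffix is ignored here.
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def pending_writes_kb(text):
    """Return (dirty, writeback): data waiting to be written out,
    and data currently being written out, as reported by the kernel."""
    info = parse_meminfo(text)
    return info.get("Dirty", 0), info.get("Writeback", 0)
```

On a live system you would call `pending_writes_kb(open("/proc/meminfo").read())`. Running `sync` first and then checking again should show the `Dirty` figure drop.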

RAID Controller

The battery-backed cache (BBC) of the RAID controller receives the write from the OS. Depending on the cache policy of the volume being written to (write-through vs. write-back), the RAID controller may report the write as committed at this time. It will then queue the write for committing to the actual disk.

Disk

Some RAID cards actually do disable the HD cache; others don't. I don't remember how HP does theirs, but I would not be surprised if the HD cache is disabled and the write-optimization logic is pushed up into the RAID controller itself; there is a reason HP uses custom firmware on their drives.


Operating systems, and the filesystems they support, know very well that sudden power-loss is a failure mode that can kill writes between the time the OS determines that it needs to happen and when the storage system reports it is done. We've been doing this a while now, and we're pretty good at defending against it.
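From the application side, the usual way to cooperate with these defenses is to ask explicitly for a write to be committed before relying on it. Here is a minimal sketch of the common write-temp-file, `fsync`, rename pattern, assuming a POSIX system such as the Debian install in the question. The function name `durable_write` is hypothetical and this is not taken from the original answer.

```python
import os
import tempfile


def durable_write(path, data):
    """Atomically replace `path` with `data` (bytes) and flush it to storage."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so the rename stays
    # within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()             # push Python's buffer to the OS
            os.fsync(tmp.fileno())  # ask the OS to push it down the storage stack
        os.replace(tmp_path, path)  # atomic rename on POSIX filesystems
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Flush the directory entry so the rename itself survives a crash.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

After a crash, a reader sees either the old file or the new one, never a half-written mix. How far down `fsync` reaches still depends on the layers below it, which is where the controller's battery-backed cache comes in.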

The XFS filesystem has a bad reputation for survivability in sudden power-loss situations due to how it handles metadata writes. But then, its intended environment is one where power is presumed to be adequately redundant. Other filesystems, such as the ext series, btrfs, and of course ZFS, survive sudden power loss just fine.


If you're operating in an environment with known bad power, to ensure no data loss during power outages:

  • Use a filesystem known to be robust for sudden power loss (basically, anything but XFS)

And that's it. The BBC on the RAID card ensures the RAID cache is preserved until power is restored. The disk caches are likely disabled. No need to tune the RAID card cache to be all-read. No need to disable the OS block caches.

Really.

sysadmin1138
  • IMHO the disk's internal cache uses the same write-through logic as the OS cache. – DukeLion Feb 10 '13 at 10:53
  • I am not sure about Windows but Linux may defer non-sync writes up to the sysadmin's configured time. Defaults are 5 or 60 seconds depending on the filesystem. – Zan Lynx Nov 22 '14 at 20:26
1

A typical setup for the HP ProLiant server and controller you have is to leave OS write caching on and enable the RAID controller's caching (set to a 25:75 read:write ratio). Whether to enable the individual disks' caches is your choice, but it is better to leave them off to be safe in this setup. The warning in the array controller configuration utility says to enable drive write caching only if you have stabilized/protected facility power. While HP uses specific firmware on their disks, there's no performance difference between them and the raw Seagate (or other OEM) disks they use.
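As a rough illustration of applying those settings from a script, here is a minimal sketch that only builds the command lines for HP's `hpacucli` tool. The `cacheratio=` and `drivewritecache=` keywords and the `slot=0` default are assumptions based on commonly documented `hpacucli` usage, and the helper names are hypothetical. Check `hpacucli help` and the output of `hpacucli ctrl all show` on your own system before running anything.

```python
def cache_ratio_args(read_pct, write_pct, slot=0):
    """Build an hpacucli argument list that sets the controller's
    read/write cache ratio. Assumes the 'cacheratio=R/W' syntax."""
    if not (0 <= read_pct <= 100 and 0 <= write_pct <= 100):
        raise ValueError("percentages must be between 0 and 100")
    if read_pct + write_pct != 100:
        raise ValueError("read and write percentages must sum to 100")
    return ["hpacucli", "ctrl", f"slot={slot}", "modify",
            f"cacheratio={read_pct}/{write_pct}"]


def drive_write_cache_args(enabled, slot=0):
    """Build an hpacucli argument list that turns the physical drives'
    own write cache on or off. Assumes 'drivewritecache=enable|disable'."""
    state = "enable" if enabled else "disable"
    return ["hpacucli", "ctrl", f"slot={slot}", "modify",
            f"drivewritecache={state}"]
```

The resulting lists can be passed to `subprocess.run(args, check=True)` as root. Building them separately makes it easy to print and review the exact command first. The controller may accept only certain ratio steps, so a value that passes the check here can still be rejected by the tool.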

You'll want the latest firmware for the Smart Array P400i controller embedded in the system as well.

XFS is a perfectly fine and stable filesystem option for enterprise (server-class) hardware, especially if you have a battery-backed or flash-backed cache on your array controller.

You shouldn't be expecting sudden, uncontrollable power-loss for your server. If you are, please use a well-sized uninterruptible power supply (UPS).

In addition, there are numerous questions here addressing the issues you bring up:

What is the memory module on a RAID card needed for?

Incredibly low disk performance on HP ProLiant DL385 G7

Non-volatile cache RAID controllers: what kind of protection is there against NVCACHE failure?

ewwhite