2

Is a BBU necessary when you have A+B power?

Would the data on the RAID Card Cache and Drive Cache be lost when a kernel panic happens and you are forced to do a cold reset?

ispirto
  • What is "A+B power"? This is not a term I'm familiar with. – Evan Anderson Nov 22 '14 at 19:26
  • @EvanAnderson Datacenter-provided [discrete A and B power feeds](http://onr.com/wp-content/uploads/2013/03/Power-Infrastructure1.png) distributed along separate paths and delivered to the cabinet. – ewwhite Nov 22 '14 at 20:49

3 Answers

4

Some of this has been covered here before... and here, too. I can't think of any situation where you wouldn't want a battery-backed or flash-backed cache unit on your hardware RAID controller. It's what makes write caching possible.

See: BBWC: in theory a good idea but has one ever saved your data?

If your system panics suddenly, what happens to in-flight disk transactions depends on the nature of the crash, when it happens, the filesystem in use, and your storage subsystem. I've had data corruption in some cases, and I've also had the RAID controller cache save the day.

Cache Status Details: The current array controller had valid data stored in its battery/capacitor backed write cache the last time it was reset or was powered up. This indicates that the system may not have been shut down gracefully. The array controller has automatically written, or has attempted to write, this data to the drives. This message will continue to be displayed until the next reset or power-cycle of the array controller.

With regard to A/B power feeds, it's nice that your datacenter or facility provides them, but that should have no bearing on your RAID controller caching decision.


ewwhite
  • I think the RAID card's software doesn't recognize if there is an OS crash so it won't force itself to write all the data in the cache to the disks. Instead, it'll keep the data like nothing happened to the OS and BBU will power the cache during a cold reboot and then after the reboot, the RAID card will write the data on the cache to the disks. – ispirto Nov 23 '14 at 19:29
  • Which may or may not be successful... There's no promise that there won't be data loss or corruption. But that doesn't matter. It makes sense to have a flash or battery-backed cache on hardware RAID controllers, regardless of the datacenter power situation. – ewwhite Nov 23 '14 at 19:30
  • What about the drive cache, though? I think it's not safe to use drive caches. – ispirto Nov 23 '14 at 19:37
  • @ispirto Don't use them. – ewwhite Nov 23 '14 at 19:38
3

A "real" hardware RAID controller (not a "fake" RAID that relies on the host CPU) is a freestanding computer separate from the host computer in which it's installed. A hardware RAID controller will handle reading/writing from the disks as the operating system makes requests, but it does not specifically rely on anything running inside the host computer operating system to function. The controller's operating system will continue running (and flushing cache, etc) even if the host computer's operating system crashes.

Edit:

I didn't mention the battery backed cache at all. I'm so used to RAID controllers like Dell's PERC series that disable write-back caching when there's no battery that I just consider a battery backup to be an integral part of any serious RAID controller.

re: the kernel panic scenario - It's worth noting that your RAID controller isn't going to save you at all if your operating system or applications aren't leaving the filesystem or their data files in a consistent state at the end of each write. If you're using a journaling filesystem or database applications that are ACID-compliant then your odds of losing data are a lot lower than if you're using filesystems or applications that are capable of leaving their on-disk data structures in an inconsistent state.
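As a rough illustration of what "leaving data files in a consistent state at the end of each write" can look like at the application level, here is a minimal Python sketch of the common write-to-temp, fsync, then rename pattern. The function name `atomic_write` is hypothetical, and the sketch assumes a POSIX system. Note that `fsync` only asks the OS to push the data down the storage stack; whether it then survives a power loss still depends on the controller cache being protected and on how the drive caches are configured.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Replace `path` with `data` so a crash leaves either the old or the new file.

    Hypothetical helper for illustration; assumes a POSIX filesystem.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so the final
    # rename never has to cross a filesystem boundary.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            # Ask the OS to hand the file contents to the storage stack.
            os.fsync(tmp.fileno())
        # Rename over the target; on POSIX this swap is atomic.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # fsync the directory so the rename itself is recorded (POSIX only).
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Calling `atomic_write("state.bin", b"new contents")` either leaves the previous `state.bin` untouched or replaces it entirely; a reader should not see a half-written file. Journaling filesystems and ACID-compliant databases apply the same idea with far more care.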

Evan Anderson
  • So if I understand correctly: when a panic happens, even if the filesystem has finished its writes, some of the data it wrote may still be in the RAID card's cache. If we give the RAID card enough time before the cold reset, wouldn't it write the cached data to the disks? – ispirto Nov 23 '14 at 19:24
  • It looks like there are no guarantees that the RAID card will flush its cache to the disks. So, yeah, a BBU is a must. Thank you. – ispirto Nov 23 '14 at 19:35
2

Yes, redundant power, while reducing the need for a BBU, does not eliminate that need.

Consider the case where you wire up A/B power incorrectly, for example.

The additional cost of a BBU is usually worth it wherever your data matters, e.g. on storage nodes or database servers.

dmourati
  • The machine's power distribution board or motherboard might also fail. Although personally I've seen more RAID card failures. – Zan Lynx Nov 22 '14 at 20:22