5

I am installing an Intel Xserve (quad-core Xeon) with Snow Leopard Server (10.6) on two 80GB 7200rpm SATA HDs.

I created a mirrored RAID set using Disk Utility with those two drives, all went fine.

I then asked myself whether this is really a good idea. I know that a hardware RAID system would be better, but what about this software RAID?

Do you have any feedback on this? Will it keep working if one HD breaks down? Does it affect performance?

[UPDATE]

In short: Hardware RAID is better than software RAID which is better than none.

Thank you all for the answers, they were very helpful, especially Gordon's script to monitor failures, as Apple's software RAID is pretty silent about a drive failure.

Arko

6 Answers

19

I'll second SvenW's warning about silent failures; if anything, it's a little too good at surviving a drive failure. I've seen the aftermath of a couple of servers that had one drive drop out of a software mirror for some reason (I suspect not coming ready in time after a reboot); everything works fine off the remaining drive until, several months later, something goes wrong with THAT drive -- and it switches back to the drive that glitched the first time, and the last few months have vanished.

Here's a short shell script I whipped up to fix this. Substitute in your email address, save it as something like /etc/periodic/daily/150.check-raid, make it executable, and it should mail you a warning (at 3:15 the next morning) if the RAID ever degrades. To test it (strongly recommended in case of spam blocks, etc.), plug in a couple of disposable drives (USB keychain drives, whatever), mirror them, unplug one, leave the other overnight and see if you have a warning in your mailbox in the morning.

#!/bin/sh

# This script checks for any degraded/offline/failed/whatever software
# RAIDs, and if any are found emails a note to an admin.  To use it,
# replace the ADMIN_EMAIL value with your own email address, drop it in
# /etc/periodic/daily, and change the owner to root.  This'll make it
# run its check every morning at 3:15am.
#
# Warning: this script doesn't check anything other than software RAIDs
# built with the Apple (i.e. Disk Utility) RAID tools.  It does not check
# any hardware RAIDs (including Apple's RAID card), or even any third-party
# software RAIDs.  If "diskutil listraid" doesn't list it, it's not going
# to be checked.
#

ADMIN_EMAIL="user@example.com"

if diskutil listraid | grep "^Status:" | grep -qv "Online$"; then
    diskutil listraid | mail -s 'RAID problem detected' "$ADMIN_EMAIL"
fi
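
If you want to sanity-check the detection logic before pulling a drive, here is a minimal sketch that wraps the same grep pipeline in a function so it can be fed text directly. The function name raid_degraded is made up for illustration, and any lines you feed it by hand are stand-ins, not real diskutil output.

```shell
#!/bin/sh

# Hypothetical helper (not part of the original script): reads a RAID
# listing on stdin and exits 0 if any line starting with "Status:"
# does not end in "Online".  If there are no "Status:" lines at all,
# it exits non-zero, i.e. nothing is reported as degraded.
raid_degraded() {
    grep "^Status:" | grep -qv "Online$"
}

# Intended use on a real machine (not run here):
#   diskutil listraid | raid_degraded && echo "RAID problem detected"
```

Feeding it a hand-written line such as "Status: Degraded" with printf should make it succeed, while "Status: Online" should not. This only checks the pattern matching; the unplug-a-drive test is still the one that proves the mail actually arrives.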
Gordon Davisson
  • This... is exactly what I was looking for! Thank you. Wish I was able to give you an upvote. I'm feeling much better now that I have a way to automatically monitor that software RAID. – Arko Jun 24 '10 at 12:46
  • Nice script... easy to implement. I'm trying to figure out how to add the output of `diskutil listraid` to the body of the email. Any ideas? Also, if I try to run the `mail -s ...` command directly from command line, it expects me to do CTRL+D to send the mail. Does the script not expect this? – churnd Jun 24 '10 at 16:49
  • @churnd: actually, the script should already do this by piping the output of diskutil listraid into the mail command. mail reads the body of the message from stdin; if you don't have the pipeline there, it'll read from the terminal, ending with ^D. – Gordon Davisson Jun 24 '10 at 18:28
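
To make the stdin point concrete, here is a small sketch. The wrapper name send_report is hypothetical; the only real command it relies on is mail with its -s (subject) option, as used in the script above. Whatever is piped into the function becomes the message body, so no ^D is needed.

```shell
#!/bin/sh

# Hypothetical wrapper: $1 is the recipient address; the message body
# is whatever arrives on stdin, which mail reads until end of input.
send_report() {
    mail -s 'RAID problem detected' "$1"
}

# Intended use (not run here):
#   diskutil listraid | send_report "user@example.com"
```

Run without a pipeline from a terminal, the same function would wait for typed input and a ^D, which is the behaviour churnd saw.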
3

My personal preference is always for hardware RAID, but I would use software RAID over none. I am also aware there are some who have the opposing view. Yes, it will continue to function if one of the drives fails. That is the main reason to use RAID. The "R" stands for redundant.

Software RAID must impact performance because the CPU has to do more. However, in reality, if that difference becomes noticeable you're pushing the server far too hard, which will result in more problems than the loss of a few CPU cycles. For example, heat will become a serious concern.

One question I would ask is why such tiny drives? For very little more you can get much larger drives. If nothing else, 80GB drives are becoming quite hard to source, which may be an issue from the future maintenance point of view.

John Gardeniers
  • Not long ago, 80GB was the standard size for Xserve drives, and Apple charges much more for larger drives (currently, you pay $200 for 1TB instead of 160GB and $450 for 2TB). It's financially attractive to go with the base model and buy your drives elsewhere. – Sven Jun 23 '10 at 10:59
  • Could be used or an inherited system, so that's why they're tiny. Or it's for a specific purpose that doesn't require huge drives. – Bart Silverstrim Jun 23 '10 at 11:40
  • If you just need an OD master, 80GB is plenty and OS X Server is the only way to get it. – MDMarra Jun 23 '10 at 11:48
  • Thank you for your answer, I was thinking of it the same way: better to have a software RAID than none. As @SvenW and @Bart commented, 80GB drives were the standard ones included with the Xserve, and yes, it is an inherited system. I am planning to leave it like this for the system drive and eventually add a third HD for data if needed, as it will be used as a light web server for dev and staging purposes. – Arko Jun 23 '10 at 12:54
3

I have had good experience with the software RAID, but I only use it for the system drive. Be sure to use Server Monitor or Disk Utility to check the drive status, though, as at least in Mac OS X 10.4 the system is quite silent about a failed drive. I am not sure if this got better in 10.5/10.6; this is something on my test list. One thing I really miss, though, is RAID 5: when you have more than two drives, all you can do is striping.

In my case, performance was not noticeably affected, but again, it's used only as a system drive, with an FC RAID for data.

While you are still in the testing process, make sure you test failure/recovery by removing a drive so you know how it's done in case of a real drive failure.

Sven
  • RAID 5 is bad for large drives! Ugh! Unrecoverable errors...silent...I've spouted about this before when I hit it (Dell PERC where a drive failed in a RAID 5, then discovered that a "good" drive had an unrecoverable error on it that was never reported and couldn't be fixed). Now I'm finding articles online about this becoming a bigger problem as drives get larger and error tolerances from manufacturers are getting fudgier. RAID 5 should really not be an option anymore. – Bart Silverstrim Jun 23 '10 at 11:38
  • Good advice about testing failure and recovery. Will definitely do it. – Arko Jun 23 '10 at 12:57
  • Failures are still silent in 10.5 and 10.6; see my answer... – Gordon Davisson Jun 24 '10 at 15:47
2

The Xserve RAID card is $699. Although I much prefer hardware RAID, I'd say that OS X's software RAID is good enough to seriously undermine the justification for such an expensive hardware controller.

Basically, don't worry about it; spend a fraction of this money on a >=80GB USB disk and leave it to Time Machine to give you a recoverable backup if you lose your mirror.

Chopper3
2

Modified version of Gordon's script to check Apple's hardware RAID card.

#!/bin/sh

# This script checks for a degraded/offline/failed/whatever RAID on
# Apple's hardware RAID card (via raidutil), and if a problem is found
# emails a note to an admin.  To use it, replace the ADMIN_EMAIL value
# with your own email address, drop it in /etc/periodic/daily, and
# change the owner to root.  This'll make it run its check every
# morning at 3:15am.


ADMIN_EMAIL="example@example.com"

if raidutil list status | grep "^General" | grep -qv "Good$"; then
     raidutil list status | mail -s 'RAID problem detected' "$ADMIN_EMAIL"
fi
Daniel
1

The best utility to monitor any OS X RAID (hardware or software) is RAID Monitor. I use it on all of my OS X boxes that have RAIDs and it's great stuff, something that Apple should have included in their OS.