0

I have a 24-disk enclosure using RAID 5 with one hot spare. Here is what it looks like in Dell's OpenManage software.

[Screenshot: the enclosure's disks as shown in Dell OpenManage]

As you can see, one of the disks is predicted to fail. I have a replacement disk with the exact same specifications. Is it as simple as removing the 'bad' disk and replacing it? Will the hot spare take over, or do I have to reconfigure anything?

jwillis0720

2 Answers

1
  • Remove bad disk.
  • Replace bad disk with new disk.
  • Monitor the rebuild process.

See page 31 of the manual.
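If you would rather script the monitoring step than watch the GUI, here is a minimal Python sketch that parses `omreport storage pdisk controller=0` style text into one dictionary per disk. The sample text is hand-written in the "key : value" layout I would expect from OpenManage, not captured from a real system, and the function name `parse_pdisks` is made up for this example. Check the field names against the output of your own OMSA version.

```python
def parse_pdisks(report_text):
    """Split 'key : value' report text into one dict per physical disk.

    Assumes each disk's section starts with an 'ID' line.
    """
    disks, current = [], None
    for line in report_text.splitlines():
        # Split at the first colon only, so IDs like 0:0:1 stay intact.
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "ID":
            current = {}
            disks.append(current)
        if current is not None:
            current[key] = value
    return disks


# Hand-written sample in the assumed layout (not real output).
SAMPLE = """\
ID                : 0:0:0
State             : Online
Failure Predicted : No
ID                : 0:0:1
State             : Online
Failure Predicted : Yes
"""

flagged = [d["ID"] for d in parse_pdisks(SAMPLE)
           if d.get("Failure Predicted") == "Yes"]
print(flagged)
```

The same dictionaries could be polled for the `State` field while the rebuild runs, assuming your version reports it the same way.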

ewwhite
  • If the drive is spun up and I need to "prepare for removal" from the controller software, do I have to reboot the system to access the PERC controller? – jwillis0720 Oct 07 '16 at 01:08
  • No need to reboot into the PERC BIOS - all the needed steps can be performed with the use of OpenManage Server Administrator. – JimNim Oct 10 '16 at 16:50
1

Simply removing the bad disk and inserting the replacement will get the job done, but it's not the safest method. You're using RAID5, so your data is already at high risk of corruption or loss as it is.
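To show why RAID5 leaves so little margin, here is a toy Python sketch of XOR parity on a single stripe. It is only an illustration of the principle, not how the PERC controller is implemented. The block contents and the helper name `xor_blocks` are made up.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Toy stripe: three data blocks plus one parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# One disk lost: XOR of the survivors and the parity restores the block.
rebuilt = xor_blocks([data[0], data[2], parity])
assert rebuilt == data[1]

# Two disks lost: what remains only yields the XOR of the two missing
# blocks, so neither block can be recovered on its own.
two_lost = xor_blocks([data[0], parity])
assert two_lost not in data
```

A rebuild has to read every surviving disk to do this for every stripe, which is why an unreadable block on a second disk during a rebuild is enough to lose data.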

Check out the "Replacing A Physical Disk Receiving SMART Alerts" section of the Server Administrator Storage Management User's Guide for the recommended procedure.

I would strongly recommend that you perform a consistency check before replacement - this helps reduce the risk of encountering data corruption during the drive replacement process, especially in the event of a rebuild. The guide I linked notes that "failure to perform a check consistency can result in data loss." After a consistency check, you could move forward with the steps listed in that section of the document (the same steps that ewwhite suggested).
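The consistency check can also be started from the OMSA command line. Below is a small Python sketch that only builds the command and prints it. The `omconfig storage vdisk action=checkconsistency` syntax is as I remember it from the OMSA CLI guide, and the controller and virtual disk IDs are placeholders, so verify both against your installation before running anything.

```python
def check_consistency_cmd(controller, vdisk):
    """Build the omconfig argument list for a virtual disk consistency check."""
    return [
        "omconfig", "storage", "vdisk",
        "action=checkconsistency",
        f"controller={controller}",
        f"vdisk={vdisk}",
    ]


# Placeholder IDs; look up the real ones with omreport first.
cmd = check_consistency_cmd(controller=0, vdisk=0)
print(" ".join(cmd))
```

Printing the command instead of executing it lets you review it first; passing the list to `subprocess.run` would be the next step once you trust it.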

A potentially safer method than manually failing the problematic drive (which forces your RAID5 into a degraded state) is the procedure in the "Virtual Disk Task: Replace Member Disk" section. It essentially copies the data from the problem drive over to a spare without putting the array in a degraded state during the process. The benefit is that if a different drive fails during the process, you do not lose access to your data. This method also improves your odds of avoiding double-fault conditions that result from bad blocks and lead to corruption (punctured stripes).
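For reference, Replace Member Disk is also exposed through the OMSA CLI. The Python sketch below just assembles the command. The `action=replacememberdisk` syntax with `source` and `destination` is as I recall it from the CLI guide, and the disk IDs shown are hypothetical, so confirm them with `omreport storage pdisk` on your own system.

```python
def replace_member_cmd(controller, vdisk, source, destination):
    """Build the omconfig argument list for a Replace Member Disk task.

    source is the disk predicted to fail; destination is the disk that
    will take over its data.
    """
    return [
        "omconfig", "storage", "vdisk",
        "action=replacememberdisk",
        f"controller={controller}",
        f"vdisk={vdisk}",
        f"source={source}",
        f"destination={destination}",
    ]


# Hypothetical IDs: 0:0:5 is the failing disk, 0:0:23 the spare.
replace_cmd = replace_member_cmd(0, 0, source="0:0:5", destination="0:0:23")
print(" ".join(replace_cmd))
```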

JimNim