We have a ProLiant server with Smart Array P440ar controller.

For about three weeks, the "don't remove" lights on some of the SAS drives have been on. This may have started after a power loss.

When we reboot the server, the POST screen shows:

Embedded RAID : Smart Array P440ar Controller - Configuration Required

  • 1786-Slot 0 Drive Array Recovery Needed. The following SAS/SATA drive(s) need Automatic Data Recovery (Rebuild): Port 1I, box:6, bay: 4 (SAS)

The Smart Storage Administrator software shows the status "Ready for Rebuild" on our RAID5 Array. We have two arrays: A RAID1 with two disks (obviously) and a RAID5 with three disks. The error status is given for the RAID5. It does not show errors on any specific drive, just on the array.

It's been like this for weeks. We shut down the virtual servers for a weekend to give the server some "rest", because I read somewhere that the rebuild might not start if the drives are under too much load. However, the rebuild did not start with the virtual servers shut down either.

Here's the link to the ADU Report: http://strategyplayer.net/downloads/offtopic/ADUReport1.zip

I don't know enough to get anything useful from it.

Any help appreciated!

david-c

1 Answer

"Ready for rebuild" means that the rebuild process is blocked. If it's on the RAID5 array, you have probably run into an unrecoverable read error (URE) situation: one disk has failed and a second disk is failing. The failing disk likely has read errors and can't be read reliably enough to complete the reconstruction of the failed disk.
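To illustrate why a read error on a surviving disk can block the rebuild, here is a minimal sketch of single-parity (XOR) reconstruction. It is a toy model, not how the Smart Array firmware is implemented: the function names `xor_blocks` and `rebuild_missing`, the 4-byte blocks, and the three-disk stripe layout are all assumptions made for the example.

```python
from functools import reduce
from typing import Optional, Sequence


def xor_blocks(blocks: Sequence[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(stripe: Sequence[Optional[bytes]]) -> bytes:
    """Rebuild the one unreadable block (None) in a single-parity stripe.

    Hypothetical helper: with XOR parity, exactly one missing block per
    stripe can be recovered. Any other count raises ValueError.
    """
    missing = [i for i, block in enumerate(stripe) if block is None]
    if len(missing) != 1:
        raise ValueError(
            f"cannot rebuild: {len(missing)} unreadable blocks in stripe"
        )
    return xor_blocks([block for block in stripe if block is not None])


# Toy stripe across three disks: two data blocks plus their parity.
d0 = b"\x01\x02\x03\x04"
d1 = b"\x10\x20\x30\x40"
parity = xor_blocks([d0, d1])

# One failed disk: its block is recovered from the other two.
assert rebuild_missing([d0, None, parity]) == d1

# One failed disk plus a read error on a survivor: nothing to XOR against.
try:
    rebuild_missing([None, None, parity])
except ValueError as err:
    print(err)
```

In this model, every stripe needs all surviving blocks to be readable, so a single bad sector on the second disk is enough to stall the reconstruction at that stripe.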

You should run a proper backup NOW, then try powering the hardware off (so the drives spin down) and powering it up again. Watch the controller prompts and see if that jumpstarts the rebuild process.

Edit:

The ADU report shows that two of the three 300GB disks have hard read errors. A URE is the likely situation. Power loss alone wouldn't necessarily cause this. Is this system connected to a backup battery?

ewwhite
  • Thanks for your reply. So you say that two of the disks are failing now (apparently started failing at the same time)? So what can we do about it? As far as I understand we could just replace the failing disk if ONE disk in a RAID5 is failing. But what do we do if TWO are failing? – david-c Apr 18 '18 at 11:06
  • Take a backup while the systems are up. Do what I said... turn the system off. See if the drives begin to rebuild. If they don't you'll have to rebuild the array. There's no recovery option at that point. – ewwhite Apr 18 '18 at 11:38
  • We're taking a separate bare metal recovery backup right now. (We have a scheduled backup every night.) After this we will try powering the device down and up again. We already did that some days ago and got a prompt from the controller (I mentioned the prompt in the original question). May I ask for clarification about your definition of "rebuild" and "recovery" of the array? I've read quite a few conflicting definitions online (meaning: not everyone seems to mean the same thing by "rebuild"), so I want to make sure that I understand your message correctly. – david-c Apr 18 '18 at 12:32