I have a HP Server with SmartArray P400 controller (incl. 256 MB Cache/Battery Backup) with a logicaldrive with replaced failed physicaldrive that does not rebuild.
This is how it looked when I detected the error:
~# /usr/sbin/hpacucli ctrl slot=0 show config Smart Array P400 in Slot 0 (Embedded) (sn: XXXX) array A (SATA, Unused Space: 0 MB) logicaldrive 1 (698.6 GB, RAID 1, OK) physicaldrive 1I:1:1 (port 1I:box 1:bay 1, SATA, 750 GB, OK) physicaldrive 1I:1:2 (port 1I:box 1:bay 2, SATA, 750 GB, OK) array B (SATA, Unused Space: 0 MB) logicaldrive 2 (2.7 TB, RAID 5, Failed) physicaldrive 1I:1:3 (port 1I:box 1:bay 3, SATA, 750 GB, OK) physicaldrive 1I:1:4 (port 1I:box 1:bay 4, SATA, 750 GB, OK) physicaldrive 2I:1:5 (port 2I:box 1:bay 5, SATA, 750 GB, OK) physicaldrive 2I:1:6 (port 2I:box 1:bay 6, SATA, 750 GB, Failed) physicaldrive 2I:1:7 (port 2I:box 1:bay 7, SATA, 750 GB, OK) unassigned physicaldrive 2I:1:8 (port 2I:box 1:bay 8, SATA, 750 GB, OK) ~#
I thought that I had drive 2I:1:8 configured as a spare for Array A and Array B, but it seems this was not the case :-(. I noticed the problem due to I/O errors on the host, even if only 1 physicaldrive of the RAID5 is failed.
Does someone know why this could happen? The logicaldrive should go into "Degraded" mode but still be fully accessible from the host os!?
I first tried to add the unassigned drive 2I:1:8 as a spare to logicaldrive 2, but this was not possible:
~# /usr/sbin/hpacucli ctrl slot=0 array B add spares=2I:1:8 Error: This operation is not supported with the current configuration. Use the "show" command on devices to show additional details about the configuration. ~#
Interestingly it is possible to add the unassigned drive to the first array without problems. I thought maybe the controller put the array into "failed" state due to the missing spare and protects failed arrays from modification. So I tried was to reenable the logicaldrive (to add the spare afterwards):
~# /usr/sbin/hpacucli ctrl slot=0 ld 2 modify reenable Warning: Any previously existing data on the logical drive may not be valid or recoverable. Continue? (y/n) y Error: This operation is not supported with the current configuration. Use the "show" command on devices to show additional details about the configuration. ~#
But as you can see, re-enabling the logicaldrive this was not possible.
Now I replaced the failed drive by hotswapping it with the unassigned drive. The status now looks like this:
~# /usr/sbin/hpacucli ctrl slot=0 show config Smart Array P400 in Slot 0 (Embedded) (sn: XXXX) array A (SATA, Unused Space: 0 MB) logicaldrive 1 (698.6 GB, RAID 1, OK) physicaldrive 1I:1:1 (port 1I:box 1:bay 1, SATA, 750 GB, OK) physicaldrive 1I:1:2 (port 1I:box 1:bay 2, SATA, 750 GB, OK) array B (SATA, Unused Space: 0 MB) logicaldrive 2 (2.7 TB, RAID 5, Failed) physicaldrive 1I:1:3 (port 1I:box 1:bay 3, SATA, 750 GB, OK) physicaldrive 1I:1:4 (port 1I:box 1:bay 4, SATA, 750 GB, OK) physicaldrive 2I:1:5 (port 2I:box 1:bay 5, SATA, 750 GB, OK) physicaldrive 2I:1:6 (port 2I:box 1:bay 6, SATA, 750 GB, OK) physicaldrive 2I:1:7 (port 2I:box 1:bay 7, SATA, 750 GB, OK) ~#
The logical drive is still not accessible. Why is it not rebuilding?
What can I do?
FYI, this is the configuration of my controller:
~# /usr/sbin/hpacucli ctrl slot=0 show Smart Array P400 in Slot 0 (Embedded) Bus Interface: PCI Slot: 0 Serial Number: XXXX Cache Serial Number: XXXX RAID 6 (ADG) Status: Enabled Controller Status: OK Chassis Slot: Hardware Revision: Rev E Firmware Version: 5.22 Rebuild Priority: Medium Expand Priority: Medium Surface Scan Delay: 15 secs Surface Analysis Inconsistency Notification: Disabled Raid1 Write Buffering: Disabled Post Prompt Timeout: 0 secs Cache Board Present: True Cache Status: OK Accelerator Ratio: 25% Read / 75% Write Drive Write Cache: Disabled Total Cache Size: 256 MB No-Battery Write Cache: Disabled Cache Backup Power Source: Batteries Battery/Capacitor Count: 1 Battery/Capacitor Status: OK SATA NCQ Supported: True ~#
Thanks for you help in advance.