I have a PowerEdge T310 with a PERC S100 and one RAID 5 array of three drives. The status of the virtual disk is Failed, but the server boots and everything appears to work.

All drives in Server Administrator are Online with a green check, but one has no available tasks.

I've also seen a disagreement between OMSA and the BIOS. If I boot into the BIOS, one drive is status ready, one is status online, and one is status spare.

I've upgraded the BMC, BIOS and PERC S100 drivers to the latest versions, and the problem persists.

Is this a common problem? Is there anything I can do to remedy this? If a drive has actually failed I wouldn't know as I'm effectively flying blind.

HopelessN00b
Jeremy

2 Answers

This is most likely a simple drive failure, combined with some OMSA/controller trouble in displaying the proper status.

The problem drive is probably the one you're seeing in a "ready" state in the BIOS. That state just means the drive still functions when the controller tries to initialize it, but it is no longer an active member of the RAID set because of whatever problem it initially had.

However...

...one drive is status ready, one is status online, and one is status spare

How many total hard drives are there on this controller? If there are only the 3 you're listing, and the controller BIOS menu only reports 1 of 3 drives as a healthy member of the RAID5 (which is clearly incorrect, as you have data access), then we can't trust the information that the controller and OMSA are giving us regarding HDD health/status.

I've upgraded the BMC, BIOS and PERC S100 drivers

Did you update the firmware for the PERC S100 as well? I would say that is equally important, and a likely culprit for the misreporting you're seeing.

status of the virtual disk is failed

Is this from OMSA or from the controller BIOS? It would be good to know the VD status from both sides.
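For the OMSA side, one way to capture the state outside the GUI is the omreport CLI that ships with Server Administrator. Below is a rough sketch, not a tested tool: it assumes the controller ID is 0, that omreport is on the PATH, and that the output is the usual "Key : Value" layout with each record starting at an "ID" line. The helper names and the SAMPLE text are made up for illustration and are not real output from your server.

```python
import subprocess

# Illustrative text only (an assumption about the layout, not real output).
SAMPLE = """\
ID     : 0:0:0
Status : Ok
Name   : Physical Disk 0:0:0
State  : Online

ID     : 0:0:1
Status : Ok
Name   : Physical Disk 0:0:1
State  : Ready
"""


def run_omreport(kind, controller=0):
    """Run 'omreport storage <kind> controller=<n>' and return its text.

    kind is 'pdisk' or 'vdisk'; controller=0 is an assumption.
    """
    cmd = ["omreport", "storage", kind, f"controller={controller}"]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


def parse_omreport(text):
    """Split 'Key : Value' output into one dict per record.

    A new record starts at each 'ID' line; lines before the first
    'ID' line and lines without a colon are ignored.
    """
    records, current = [], None
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if not sep or not key:
            continue
        if key == "ID":
            current = {}
            records.append(current)
        if current is not None:
            current[key] = value
    return records


def not_in_state(records, expected):
    """Return the records whose 'State' field is not in the expected set."""
    return [r for r in records if r.get("State") not in expected]


if __name__ == "__main__":
    # Parse the sample; on the server, swap in run_omreport("pdisk").
    for disk in not_in_state(parse_omreport(SAMPLE), {"Online"}):
        print(disk.get("Name"), "->", disk.get("State"))
```

Running the same parser over the vdisk output and comparing both against what the controller BIOS shows would give you the two views side by side. Which states count as healthy is up to you to pass in; the set used above is only an example.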

Get the controller firmware updated if you haven't already done so. Otherwise, there may not be much else you can do aside from deleting and recreating the RAID. Contacting Dell support would be advisable at that point.

JimNim

It is worth noting that I replaced a drive in this array a few weeks ago (the array was listed as degraded and not failed), verified it started the rebuild, and never checked it again. Now I see in the logs that the rebuild failed due to data errors on the original disks. I expect if I could get the rebuild to complete I'd be in good shape.

The S100 doesn't appear to have firmware. Is it a software-only controller? At least, I can't find any firmware downloads for it. I am using the latest version of the driver, however.

On other, higher-end PERC controllers I've been able to start a consistency check of the array. I'd love to be able to do that here and then rebuild it, but I don't see that option in OMSA on this controller.

If I can't get the rebuild to complete, is my only option to recreate the RAID set and restore from backup, so that OMSA, the controller BIOS, and the disks themselves are consistent again?

Jeremy