
I have a server with an Adaptec 6405 RAID controller and 4 disks in a RAID 5 configuration. Staff in the data center called me because they noticed a red LED was turned on in one of the drive bays.

I then checked the status using 'arcconf getconfig 1' and got the status message 'Logical devices/Failed/Degraded: 2/0/1'.

The status of the logical devices was listed as 'Rebuilding'. However, nothing suspicious was reported for the affected physical device: the S.M.A.R.T. setting was 'no', the S.M.A.R.T. warnings were '0', and 'arcconf getsmartstatus 1' returned no problems for any of the disk drives.

The 'arcconf getlogs 1 events tabular' command gives lots of output (sorry, can't paste the log file here as I only have remote console access, I could post a screenshot though). Here are some sample entries:

eventtype FSA_EM_EXPANDED_EVENT
grouptype FSA_EXE_SCSI_GROUP
subtype FSA_EXE_SCSI_SENSE_DATA
subtypecode 12
cdb 28 00 17 c4 74 00 00 02 00 00 00 00
data 70 00 06 00 00 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00 00 00 00 00 00 0

The 'arcconf getlogs 1 device tabular' command reports mediumErrors 1 for two of the disks.

Today, I checked the status of the controller again. Everything is back to normal: the controller status is now 'Logical devices/Failed/Degraded: 2/0/0', and the logical devices are all back to 'Optimal'. I was not able to check the LED status, but my guess is that the red LED is off again.

Now I have a lot of questions:

  • What is a possible cause of the medium error, and why is it not reported in the SMART log too?
  • Should I replace the disk drives? They were purchased just a month ago.
  • The rebuilding process took one or two days, is that normal? The disks are 2 TByte each and the storage system is mostly idling.
  • The timestamps in the logs seem to show the moment of the log retrieval, not the moment of the incident.

Please advise; any help is much appreciated.

nn4l
  • If a single drive is blinking and the array is degraded, just put in a new drive; if the system is idling, it will be fine until the copy to the new drive has finished. Also check that these are not "green" drives, ha ha ha – Andrew Smith Jul 08 '12 at 08:03

1 Answer


What is a possible cause of the medium error, and why is it not reported in the SMART log too?

It COULD be an error that is not SMART-related. Depending on the cabling, it could also be a SAS incompatibility.
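For what it's worth, the sample log entry in the question can be decoded by hand. The following is only a rough sketch: it assumes the 'cdb' field is a standard SCSI READ(10) command block and that the 'data' field is fixed-format sense data (response code 0x70), which is my reading of the log, not something the Adaptec documentation confirms here. The helper names are made up for illustration.

```python
# Sketch: decode the 'cdb' and 'data' fields from the sample event.
# Assumption: 'cdb' is a SCSI READ(10) CDB and 'data' is fixed-format
# sense data. Function names are hypothetical helpers, not arcconf APIs.

SENSE_KEYS = {
    0x0: "NO SENSE",
    0x1: "RECOVERED ERROR",
    0x2: "NOT READY",
    0x3: "MEDIUM ERROR",
    0x4: "HARDWARE ERROR",
    0x5: "ILLEGAL REQUEST",
    0x6: "UNIT ATTENTION",
    0x7: "DATA PROTECT",
    0xB: "ABORTED COMMAND",
}


def decode_read10(cdb: bytes) -> tuple[int, int]:
    """Return (starting LBA, number of blocks) of a READ(10) CDB."""
    if cdb[0] != 0x28:
        raise ValueError("not a READ(10) command")
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2..5: logical block address
    blocks = int.from_bytes(cdb[7:9], "big")  # bytes 7..8: transfer length
    return lba, blocks


def decode_fixed_sense(sense: bytes) -> str:
    """Return the sense key name of fixed-format sense data."""
    if sense[0] & 0x7F not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")
    key = sense[2] & 0x0F                     # low nibble of byte 2
    return SENSE_KEYS.get(key, f"key 0x{key:X}")


if __name__ == "__main__":
    cdb = bytes.fromhex("28 00 17 c4 74 00 00 02 00 00 00 00")
    sense = bytes.fromhex("70 00 06 00 00 00 00 00")  # first 8 bytes only
    print(decode_read10(cdb), decode_fixed_sense(sense))
```

Under those assumptions, this particular entry is a read of 512 blocks starting at LBA 398750720 that came back with sense key 6 (UNIT ATTENTION), not sense key 3 (MEDIUM ERROR). So this sample is probably not the medium error itself; the entries with sense key 3 are the ones worth looking for in the full log.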

Should I replace the disk drives? They were purchased just a month ago

Oh man, you ask that? They are under full warranty now - what do you gain by NOT replacing them and waiting until the warranty expires?

The rebuilding process took one or two days, is that normal? The disks are 2 TByte each and the storage system is mostly idling.

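Well, yes. Be happy it worked. See, RAID 5 with 2-3 TB discs = no protection; RAID 5 starts failing above 1 TB, because a rebuild has to read every surviving disk in full and a single unrecoverable read error along the way can make it fail. Welcome to a world of pain - if you value your data, better use RAID 6.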
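To put a rough number on the RAID 5 risk: a rebuild must read all surviving disks end to end, so the chance of hitting at least one unrecoverable read error (URE) grows with capacity. The sketch below assumes a URE rate of 1 per 10^14 bits, a figure commonly quoted on consumer drive datasheets; the real rate of these particular drives is not known, and the model treats errors as independent, which is a simplification.

```python
import math

# Sketch: probability of at least one unrecoverable read error (URE)
# while reading all surviving disks during a RAID 5 rebuild.
# Assumption: URE rate of 1e-14 per bit, errors independent.


def p_ure_during_rebuild(surviving_disks: int, disk_bytes: float,
                         ure_per_bit: float = 1e-14) -> float:
    bits_read = surviving_disks * disk_bytes * 8
    # 1 - (1 - p)^n, computed with log1p/expm1 to stay accurate for tiny p
    return -math.expm1(bits_read * math.log1p(-ure_per_bit))


if __name__ == "__main__":
    # 4-disk RAID 5 with 2 TB disks: 3 surviving disks are read in full.
    print(f"{p_ure_during_rebuild(3, 2e12):.0%}")
```

With these assumed numbers the result is on the order of one in three per rebuild, which is why RAID 6 (which tolerates a second failure during the rebuild) is the safer choice at this disk size.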

They are big slow drives that take ages to rebuild, yes.
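As a sanity check on the rebuild time, here is the average write rate it implies. This is a back-of-the-envelope sketch that assumes the controller rewrites one full 2 TB disk (taken as 2 * 10^12 bytes) and that the rebuild ran continuously for the whole period.

```python
# Sketch: average rebuild throughput implied by a given rebuild duration.
# Assumption: one full disk of 2e12 bytes is rewritten at a steady rate.


def rebuild_rate_mb_s(disk_bytes: float, hours: float) -> float:
    return disk_bytes / (hours * 3600) / 1e6


if __name__ == "__main__":
    for hours in (24, 48):
        print(f"{hours} h -> {rebuild_rate_mb_s(2e12, hours):.1f} MB/s")
```

That works out to roughly 12 to 23 MB/s, well below what such a drive can stream sequentially, which suggests the controller runs the rebuild at a throttled background priority rather than at full speed. Slow, but not unusual.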

The timestamps in the logs seem to show the moment of the log retrieval, not the moment of the incident.

Possible.

TomTom