I'm maintaining a pre-configured storage system for a project team. The storage consists of 36 x 8 TB HGST disks (~240 TB net) in RAID60 (2 x 18 disks in RAID6). The controller is a MegaRAID SAS 9361-8i and the OS is Scientific Linux 7.2.

storcli64 /c0 /v1 show
DG/VD TYPE   State Access Consist Cache Cac sCC       Size Name
0/1   RAID60 Optl  RW     Yes     RWBD  -   ON  232.860 TB      
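
As a rough sanity check on the reported size, the usable capacity can be estimated from the layout. This is only a sketch: it assumes nominal 8 TB (decimal) disks and assumes that storcli reports "TB" in binary units (TiB). The helper name `raid60_usable_bytes` is made up for this example.

```python
TB = 10**12   # decimal terabyte, as printed on disk labels
TIB = 2**40   # binary tebibyte (assumption: this is what storcli calls "TB")


def raid60_usable_bytes(spans, disks_per_span, disk_bytes):
    """Usable bytes of a RAID60: each RAID6 span gives up two disks to parity."""
    return spans * (disks_per_span - 2) * disk_bytes


usable = raid60_usable_bytes(spans=2, disks_per_span=18, disk_bytes=8 * TB)
print(f"{usable / TB:.1f} TB (decimal)")
print(f"{usable / TIB:.1f} TiB (binary)")
```

With these assumptions the result is 256 TB decimal, or about 232.8 TiB, which is close to the 232.860 TB shown for the virtual drive.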

When running the weekly scheduled consistency checks (or when I trigger them manually), I get a lot of inconsistency warnings, but they never seem to get fixed. The same warnings reappear during the next consistency check (cc):

[root@faith ~]# grep 22649241 /var/log/message*
/var/log/messages:Oct 25 20:54:01 faith MR_MONITOR[1604]: <MRMON063> Controller ID:  0   Consistency Check found inconsistent parity on VD#012    strip:  #012    ( VD   =   1,   strip#012      =   22649241)
/var/log/messages-20161002:Sep 25 22:34:03 faith MR_MONITOR[1604]: <MRMON063> Controller ID:  0   Consistency Check found inconsistent parity on VD#012    strip:  #012    ( VD   =   1,   strip#012      =   22649241)
/var/log/messages-20161024:Oct 23 22:35:46 faith MR_MONITOR[1604]: <MRMON063> Controller ID:  0   Consistency Check found inconsistent parity on VD#012    strip:  #012    ( VD   =   1,   strip#012      =   22649241)
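
To see how many strips recur instead of grepping for them one by one, the MRMON063 messages could be tallied per strip. This is a minimal sketch that assumes the message format shown above (with syslog's `#012` newline escapes); the function names are made up for this example and the log paths are passed on the command line.

```python
import re
import sys
from collections import Counter

# After "#012" is replaced by a space, the message ends in "( VD = 1, strip = N)".
VD_RE = re.compile(r"VD\s*=\s*(\d+)")
STRIP_RE = re.compile(r"strip\s*=\s*(\d+)")


def parse_inconsistency(line):
    """Return (vd, strip) for an MRMON063 parity warning, else None."""
    if "MRMON063" not in line:
        return None
    text = line.replace("#012", " ")
    vd = VD_RE.search(text)
    strip = STRIP_RE.search(text)
    if vd is None or strip is None:
        return None
    return int(vd.group(1)), int(strip.group(1))


def count_strips(lines):
    """Count how often each (vd, strip) pair is reported."""
    counts = Counter()
    for line in lines:
        key = parse_inconsistency(line)
        if key is not None:
            counts[key] += 1
    return counts


if __name__ == "__main__":
    totals = Counter()
    for path in sys.argv[1:]:
        with open(path, errors="replace") as handle:
            totals.update(count_strips(handle))
    for (vd, strip), n in totals.most_common():
        print(f"VD {vd} strip {strip}: reported {n} times")
```

Strips that show up once per check across all rotated logs would confirm that the same locations are being flagged each time.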

What would be the best way to get the array consistent? I guess a rebuild would likely fail in the current state? Or might this be a side effect of the RAID60?

  • What's "a lot"? Ten? Ten thousand? Do the same strip values ever repeat? And putting 18 **8TB** disks in RAID6? I hope that data isn't important, or you have good backups - and a lot of time to restore. – Andrew Henle Oct 25 '16 at 21:59
  • The eventlog reports (always) 24 corrections, I checked a few values and they repeat constantly in every consistency check from the very first (e.g. strip 199999 -> grep 199999 event.log | wc -l = 29). Maybe the cc is confused by the raid60 (since a cc couldn't apply for the raid0 component) - but I assume the cc runs on every raid6 array, not on the raid60 disk group? – alphabricks Oct 26 '16 at 06:49
  • Is the HBA firmware up-to-date? See http://www.avagotech.com/support/download-search I'd also dig through the support docs there, looking for known issues. These kinds of issues are one technical reason why high-end storage systems tend to use only vendor-supplied disks of limited variety instead of allowing you to plug in any old disk. *Somebody* has to spend the time to make sure the entire system works for those customers willing to pay for "It **WILL** work" for systems storing their critical data. – Andrew Henle Oct 27 '16 at 10:15
  • @Andrew Henle - no, it's just a marketing deal. But using disks not designed for RAID in a RAID can have drawbacks. – Overmind Jan 04 '18 at 07:38

0 Answers