
What happens when I flag a sector on an HDD in a RAID setup as `defective (GLIST)`?

Will the data be written to the replacement sector right away, or does this depend on the actual setup/settings (software/hardware RAID)?

Example: RAID 5 - 4 drives - Linux hardware RAID

On HDD 1, sector 0x123456 breaks and is flagged as defective. This causes the data in this sector to be considered lost, and the sector will now point to vendor-specific data. But as the RAID holds a redundant copy, the valid data could be restored.

At which moment will the data on the broken drive be restored, so that there are two sets of valid data again?

I imagine it to be one of these:

  • repair on read (data is written to the replacement sector the next time the data is read)
  • repair on flag (data is written to the replacement sector right after the sector is flagged as defective)
  • repair must be triggered manually (a command triggers the rebuild; see the Linux md sketch below)
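
For the Linux software-RAID case (md, not a hardware controller like the P800), the "triggered manually" variant looks roughly like the sketch below. The array name md0 is only an example, and this illustrates md's scrub interface, not the controller behaviour asked about:

```python
# Trigger a manual scrub ("check") of a Linux md array via sysfs and
# read back the result. Assumes a software-RAID device named md0 and
# root privileges; a hardware controller such as the P800 does this
# internally in firmware instead.
from pathlib import Path

MD = Path("/sys/block/md0/md")               # example array name

(MD / "sync_action").write_text("check\n")   # read-only scrub; "repair" also rewrites

# mismatch_cnt holds the number of inconsistent blocks found by the last check.
print("sync_action :", (MD / "sync_action").read_text().strip())
print("mismatch_cnt:", (MD / "mismatch_cnt").read_text().strip())
```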

If it is indeed an individual issue/setting, then I would be especially interested in the Smart Array P800.

But please feel free to share anything you know about this.

PS: If you found this via Google, the smartmontools site is a great starting point, e.g. http://smartmontools.sourceforge.net/badblockhowto.html#bb


1 Answer


Depends.

In day-to-day operation, your hard disk writes a checksum and some ECC information for every sector it writes, and verifies this data during read operations.

If the error is small enough (e.g. a flipped bit or another minor defect) to be covered by your hard disk's ECC capabilities, the disk may recover from it on its own. The corrected error may still be visible in the SMART output, but neither the operating system nor the hardware RAID controller will notice a read error.
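
From the OS side you can watch for this kind of trouble by polling the drive's SMART attributes. A minimal sketch, assuming smartmontools is installed, /dev/sda is an ATA disk the OS can address directly (behind a hardware RAID controller you would need the controller-specific `-d` option), and that your vendor's firmware uses these common attribute names:

```python
# Minimal sketch: print the SMART attributes that indicate sector trouble.
# Assumes smartmontools is installed and /dev/sda is an ATA disk the OS
# can address directly.
import subprocess

def sector_health(device="/dev/sda"):
    out = subprocess.run(["smartctl", "-A", device],
                         capture_output=True, text=True, check=False).stdout
    watch = ("Reallocated_Sector_Ct",      # sectors already remapped
             "Current_Pending_Sector",     # unreadable, waiting for a rewrite
             "Offline_Uncorrectable")      # found bad during an offline scan
    for line in out.splitlines():
        if any(name in line for name in watch):
            print(line.strip())

sector_health()
```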

Otherwise, the hard disk will report an unrecoverable read error to your controller and internally mark the sector as broken. An attempt to write data to the same (logical) sector makes your hard disk allocate a replacement sector from a list of reserved spares and transparently map accesses to the logical sector onto the new (replacement) physical sector. Your write request will be stored on a different physical sector, fixing the error for you.

If the disk is out of replacement sectors, this will also fail, and you can no longer recover just by rewriting the same logical sector.
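
Rewriting the affected logical sector is also the manual fix described in the smartmontools bad-block HOWTO linked in the question. A destructive sketch, assuming 512-byte logical sectors and a made-up device/sector number; whatever was stored in that sector is overwritten, so on a RAID member you would let the array resynchronise the contents afterwards:

```python
# DESTRUCTIVE sketch: force the drive to reallocate a bad/pending logical
# sector by overwriting it with zeroes. Device path and sector number are
# placeholders -- double-check both, and verify the logical sector size
# (e.g. with `blockdev --getss`) before running anything like this.
import os

DEVICE = "/dev/sdb"       # hypothetical failing RAID member
SECTOR = 0x123456         # the LBA reported as unreadable
SECTOR_SIZE = 512         # assumed logical sector size

fd = os.open(DEVICE, os.O_WRONLY | os.O_SYNC)
try:
    os.pwrite(fd, b"\x00" * SECTOR_SIZE, SECTOR * SECTOR_SIZE)
finally:
    os.close(fd)
```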

Hardware RAID controllers typically try to discover such failed sectors earlier than a regular read access would, by performing background media scans and scheduled self-tests and by verifying the stored RAID parity.

Whether the error then gets fixed by rewriting the same sector is a different story; the field is largely undocumented and mostly a matter of personal experience. Just from my 15 years of experience with tens of thousands of servers running dozens of hardware RAID controllers from half a dozen different vendors:

  • some vendors always perform background media scans and silently attempt to fix bad blocks automatically; HP/Compaq is on that side.
  • some vendors make the permanent background media scan an option which has to be explicitly turned on (and defaults to "off" after power-up).
  • some vendors offer the background media scan only as a one-time operation, which has to be triggered manually via an admin interface or CLI.
  • some vendors break even more.

As an example of "break even more": around 10 years ago I had serious issues with a RAID 10 configuration on a specific controller type. Occasionally, file system and application data became damaged. Closer investigation and the introduction of an application-level checksum showed that sometimes zeroes were read where non-zero data was expected.

Culprit: when reading from a bad block, the controller logged this as an error, but didn't recover from the working copy at all. Instead, it reported the surrounding 8k stripe of data as a stripe of zeroes and the read operation as successful. The behaviour was reproducible on more than 100 controllers, and the vendor's customer support even stated this to be perfectly acceptable, as RAID was only meant to recover from full disk failures, not from the failure of individual blocks.

In a RAID 4/RAID 5 configuration, the same controller would recover from RAID redundancy and deliver the recovered stripe to the OS, but wouldn't recover the bad block on disk automatically. In order to recover from the bad block, one had to either rewrite the same logical block at OS level or issue a "regenerate parity" operation in the admin interface. The latter would scan all disks, verify the RAID parity checksums and attempt to recover bad blocks by rewriting any block with a read error or failed RAID parity.
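
On Linux software RAID, the closest analogue to that "regenerate parity" operation is the md "repair" scrub, which rewrites data/parity for stripes it cannot read or finds inconsistent. A hedged one-liner, again with md0 standing in for your array (this is not how the hardware controller itself is driven):

```python
# md analogue of "regenerate parity": scrub the array and rewrite anything
# unreadable or inconsistent. The array name md0 is only an example.
from pathlib import Path

Path("/sys/block/md0/md/sync_action").write_text("repair\n")
```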

At the other extreme, Compaq/HP have been performing background scans on their RAID controllers for ages, and if a block/sector can't be recovered automatically from parity, or something else looks fishy, the controller logs this, starts blinking the LEDs of the affected drives and tries to alert the admin (e.g. with a nagging message screen during POST). I haven't heard of any bad-block trouble on our current fleet of around 10k HP Smart Array controllers, including around 1,100 P800s. However, that's just my experience.

  • I hope you told that RAID controller vendor that you weren't going to use any more of their controllers. – user253751 Aug 18 '17 at 02:41
  • The vendor's support statement on our findings was certainly a reason to decommission those systems as soon as possible and replace them with a different (otherwise troublesome) controller from a different vendor. – knoepfchendruecker Aug 19 '17 at 03:01