mdadm stalls rebuilding a RAID5 array at 99.9%

Question

I recently installed three new disks in my QNAP TS-412 NAS.

These three new disks should be combined with the already present disk into a 4 disk RAID5 array, so I started the migration process.

After multiple tries (each taking about 24 hours) the migration seemed to work but resulted in a non-responsive NAS.

At that point I reset the NAS. Everything went downhill from there:

The NAS boots but marks the first disk as failed and removes it from all arrays, leaving them degraded.
I ran checks on the disk and can't find any issues with it (which would be weird anyway, as it's almost new).
The admin interface didn't offer any recovery options, so I figured I'd just do it manually.

I've successfully rebuilt all QNAP internal RAID1 arrays using mdadm (/dev/md4, /dev/md13 and /dev/md9), leaving only the RAID5 array, /dev/md0.

I've tried rebuilding it multiple times now, using these commands:

mdadm -w /dev/md0

(Required as the array was mounted read-only by the NAS after removing /dev/sda3 from it. Can't modify the array in RO mode).

mdadm /dev/md0 --re-add /dev/sda3

After this the array starts rebuilding. It stalls at 99.9% though, while the system is extremely slow and/or unresponsive. (Logging in over SSH fails most of the time.)

Current state of things:

[admin@nas01 ~]# cat /proc/mdstat                            
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] 
md4 : active raid1 sdd2[2](S) sdc2[1] sdb2[0]
      530048 blocks [2/2] [UU]

md0 : active raid5 sda3[4] sdd3[3] sdc3[2] sdb3[1]
      8786092608 blocks super 1.0 level 5, 64k chunk, algorithm 2 [4/3] [_UUU]
      [===================>.]  recovery = 99.9% (2928697160/2928697536) finish=0.0min speed=110K/sec

md13 : active raid1 sda4[0] sdb4[1] sdd4[3] sdc4[2]
      458880 blocks [4/4] [UUUU]
      bitmap: 0/57 pages [0KB], 4KB chunk

md9 : active raid1 sda1[0] sdd1[3] sdc1[2] sdb1[1]
      530048 blocks [4/4] [UUUU]
      bitmap: 2/65 pages [8KB], 4KB chunk

unused devices: <none>

(It's stalled at 2928697160/2928697536 for hours now)
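To put the stall in perspective, here is a minimal sketch that reads the progress counter from `/proc/mdstat`-style text and prints how much work is left. It assumes the two numbers in parentheses are 1 KiB blocks, as in the rest of the mdstat output; `remaining_blocks` is a helper name made up for this illustration.

```shell
# Print how many 1 KiB blocks a recovery still has to process.
# Reads /proc/mdstat-style text on stdin; lines without a (cur/total)
# progress counter are ignored.
remaining_blocks() {
    sed -n 's|.*(\([0-9][0-9]*\)/\([0-9][0-9]*\)).*|\1 \2|p' |
    while read -r cur total; do
        echo $(( total - cur ))
    done
}

# The recovery line quoted above:
echo '[===>.]  recovery = 99.9% (2928697160/2928697536) finish=0.0min speed=110K/sec' |
    remaining_blocks    # prints 376
```

On the NAS itself this would be run as `remaining_blocks < /proc/mdstat`. Only 376 KiB is outstanding, which at the reported 110K/sec is a few seconds of work, so the hours-long stall points at something blocking rather than at slow progress.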

[admin@nas01 ~]# mdadm -D /dev/md0
/dev/md0:
        Version : 01.00.03
  Creation Time : Thu Jan 10 23:35:00 2013
     Raid Level : raid5
     Array Size : 8786092608 (8379.07 GiB 8996.96 GB)
  Used Dev Size : 2928697536 (2793.02 GiB 2998.99 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Mon Jan 14 09:54:51 2013
          State : clean, degraded, recovering
 Active Devices : 3
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 1

         Layout : left-symmetric
     Chunk Size : 64K

 Rebuild Status : 99% complete

           Name : 3
           UUID : 0c43bf7b:282339e8:6c730d6b:98bc3b95
         Events : 34111

    Number   Major   Minor   RaidDevice State
       4       8        3        0      spare rebuilding   /dev/sda3
       1       8       19        1      active sync   /dev/sdb3
       2       8       35        2      active sync   /dev/sdc3
       3       8       51        3      active sync   /dev/sdd3

After inspecting /mnt/HDA_ROOT/.logs/kmsg it turns out that the actual issue appears to be with /dev/sdb3 instead:

<6>[71052.730000] sd 3:0:0:0: [sdb] Unhandled sense code
<6>[71052.730000] sd 3:0:0:0: [sdb] Result: hostbyte=0x00 driverbyte=0x08
<6>[71052.730000] sd 3:0:0:0: [sdb] Sense Key : 0x3 [current] [descriptor]
<4>[71052.730000] Descriptor sense data with sense descriptors (in hex):
<6>[71052.730000]         72 03 00 00 00 00 00 0c 00 0a 80 00 00 00 00 01 
<6>[71052.730000]         5d 3e d9 c8 
<6>[71052.730000] sd 3:0:0:0: [sdb] ASC=0x0 ASCQ=0x0
<6>[71052.730000] sd 3:0:0:0: [sdb] CDB: cdb[0]=0x88: 88 00 00 00 00 01 5d 3e d9 c8 00 00 00 c0 00 00
<3>[71052.730000] end_request: I/O error, dev sdb, sector 5859367368
<4>[71052.730000] raid5_end_read_request: 27 callbacks suppressed
<4>[71052.730000] raid5:md0: read error not correctable (sector 5857246784 on sdb3).
<4>[71052.730000] raid5: some error occurred in a active device:1 of md0.
<4>[71052.730000] raid5:md0: read error not correctable (sector 5857246792 on sdb3).
<4>[71052.730000] raid5: some error occurred in a active device:1 of md0.
<4>[71052.730000] raid5:md0: read error not correctable (sector 5857246800 on sdb3).
<4>[71052.730000] raid5: some error occurred in a active device:1 of md0.
<4>[71052.730000] raid5:md0: read error not correctable (sector 5857246808 on sdb3).
<4>[71052.730000] raid5: some error occurred in a active device:1 of md0.
<4>[71052.730000] raid5:md0: read error not correctable (sector 5857246816 on sdb3).
<4>[71052.730000] raid5: some error occurred in a active device:1 of md0.
<4>[71052.730000] raid5:md0: read error not correctable (sector 5857246824 on sdb3).
<4>[71052.730000] raid5: some error occurred in a active device:1 of md0.
<4>[71052.730000] raid5:md0: read error not correctable (sector 5857246832 on sdb3).
<4>[71052.730000] raid5: some error occurred in a active device:1 of md0.
<4>[71052.730000] raid5:md0: read error not correctable (sector 5857246840 on sdb3).
<4>[71052.730000] raid5: some error occurred in a active device:1 of md0.
<4>[71052.730000] raid5:md0: read error not correctable (sector 5857246848 on sdb3).
<4>[71052.730000] raid5: some error occurred in a active device:1 of md0.
<4>[71052.730000] raid5:md0: read error not correctable (sector 5857246856 on sdb3).
<4>[71052.730000] raid5: some error occurred in a active device:1 of md0.
<4>[71052.730000] raid5: some error occurred in a active device:1 of md0.
<4>[71052.730000] raid5: some error occurred in a active device:1 of md0.
<4>[71052.730000] raid5: some error occurred in a active device:1 of md0.
<4>[71052.730000] raid5: some error occurred in a active device:1 of md0.
<4>[71052.730000] raid5: some error occurred in a active device:1 of md0.
<4>[71052.730000] raid5: some error occurred in a active device:1 of md0.
<4>[71052.730000] raid5: some error occurred in a active device:1 of md0.
<4>[71052.730000] raid5: some error occurred in a active device:1 of md0.
<4>[71052.730000] raid5: some error occurred in a active device:1 of md0.
<4>[71052.730000] raid5: some error occurred in a active device:1 of md0.
<4>[71052.730000] raid5: some error occurred in a active device:1 of md0.
<4>[71052.730000] raid5: some error occurred in a active device:1 of md0.
<4>[71052.730000] raid5: some error occurred in a active device:1 of md0.
<4>[71052.730000] raid5: some error occurred in a active device:1 of md0.

The above sequence is repeated at a steady rate for various (random?) sectors in the 585724XXXX range.
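As a cross-check of the log, the sketch below decodes the READ(16) command block (opcode 0x88, bytes 2-9 are the LBA, bytes 10-13 the transfer length) and estimates how far into the member the md-reported sectors lie. It assumes 512-byte kernel sectors and that the sdb3 sector numbers count from the start of the member; `member_position` is a hypothetical helper name.

```shell
# READ(16) CDB from the log: 88 00 | 00 00 00 01 5d 3e d9 c8 | 00 00 00 c0 | 00 00
lba=$(( 0x000000015d3ed9c8 ))   # bytes 2-9: starting LBA on the whole disk
len=$(( 0x000000c0 ))           # bytes 10-13: number of sectors requested
echo "disk LBA $lba, $len sectors"   # same LBA as "dev sdb, sector 5859367368"

# Position of a sector within the member, in thousandths of a percent.
# $1 = sector reported by md, $2 = Used Dev Size in 1 KiB blocks
# (one 1 KiB block is two 512-byte sectors).
member_position() {
    echo $(( $1 * 100000 / ($2 * 2) ))
}

member_position 5857246784 2928697536   # prints 99997, i.e. about 99.997%
```

Under those assumptions the unreadable sectors sit in the last few thousandths of a percent of sdb3, which would be consistent with a recovery that runs fine for hours and then hangs at 99.9%.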

My questions are:

Why is it stalled so close to the end, while still using so many resources that the system stalls (the md0_raid5 and md0_resync processes are still running)?
Is there any way to see what is causing it to fail/stall? <-- Likely due to the sdb3 errors.
How can I get the operation to complete without losing my 3TB of data? (Like skipping the troublesome sectors on sdb3, but keeping the intact data?)

Did you check the system logs `/var/log/syslog` and `/var/log/messages`? — Khaled, Jan 14 '13 at 09:35
@Khaled good call! Both logs are empty on the NAS, but `/mnt/HDA_ROOT/.logs/kmsg` has some troubling messages. I'll update question with them. — Remco Overdijk, Jan 14 '13 at 09:45

score 3 · Answer 1 · answered Jan 14 '13 at 10:30

The obvious approach would be to replace the faulty disk, re-create the arrays and restore the backup you took before the array extension operation.

But since you appear not to have this option, this would be the next best thing to do:

get a Linux system with enough space to accommodate all your disks' raw space (12 TB, if I got the numbers right)
copy the data off your disks to this system; the destinations may be files or block devices, which does not matter all that much for mdraid. For your defective sdb3 device you might need to use ddrescue instead of a simple dd to copy the data.
try to re-assemble and rebuild the array from there on
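The copy step above could be planned roughly like this. It is only a sketch: it prints the commands instead of running them, it assumes GNU ddrescue is installed on the rescue system, and both the `/mnt/rescue` destination and the `plan_copy` helper are names invented for the example.

```shell
# Hypothetical destination directory on the rescue system.
DEST=${DEST:-/mnt/rescue}

# Print (do not run) one imaging command per member partition.
# $1 is the member with read errors, the remaining arguments are all members.
# Healthy members get a plain dd; the bad one gets ddrescue, which keeps
# going past unreadable areas and records them in a map file.
plan_copy() {
    bad=$1
    shift
    for part in "$@"; do
        name=${part##*/}
        if [ "$part" = "$bad" ]; then
            echo "ddrescue -d -r3 $part $DEST/$name.img $DEST/$name.map"
        else
            echo "dd if=$part of=$DEST/$name.img bs=1M"
        fi
    done
}

plan_copy /dev/sdb3 /dev/sda3 /dev/sdb3 /dev/sdc3 /dev/sdd3
```

Here `-d` asks ddrescue for direct disk access and `-r3` allows three retry passes over bad areas. Printing first makes it easy to check the device names before anything touches the disks.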

Also, take a look at this blog page for some hints on what can be done for assessing the situation in a multiple device failure for a RAID 5 array.

score 2 · Accepted Answer · answered Jan 14 '13 at 10:05

It is likely stalling before finishing because it requires the faulty disk to return some sort of status, but it's not getting it.

Regardless, all your data is (or should be) intact with only 3 out of 4 disks.

You say it ejects the faulty disk from the array - so it should still be running, albeit in degraded mode.

Can you mount it?

You can force the array to run by performing the following:

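print out the details of the array: `mdadm -D /dev/md0`
stop the array: `mdadm --stop /dev/md0`
re-create the array and force md to accept it: `mdadm -C /dev/md0 -e 1.0 -l 5 -n 4 -c 64 --assume-clean /dev/sd[abcd]3` (metadata version, level, device count and chunk size taken from the `mdadm -D` output in the question)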

This last step is totally safe as long as:

you don't write to the array, and
you used the exact same creation parameters as before.

The `--assume-clean` flag will prevent a rebuild and skip any integrity tests.
You should then be able to mount it and recover your data.
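Since everything hinges on using the same creation parameters, one way to avoid typos is to derive them from the saved `mdadm -D` output. The sketch below only prints a candidate command and never runs it; `create_cmd_from_detail` is a made-up helper, it assumes the `mdadm -D` layout shown in the question, and it makes no judgment about whether each member's contents can be trusted.

```shell
# Build a candidate "mdadm --create" line from "mdadm -D" text on stdin.
# Members are ordered by their RaidDevice slot, not by device name.
create_cmd_from_detail() {
    awk '
        NF == 1 && /^\/dev\/.*:$/ { array = $1; sub(/:$/, "", array) }
        /Version :/      { split($NF, v, "."); meta = (v[1] + 0) "." (v[2] + 0) }
        /Raid Level :/   { level = $NF }
        /Raid Devices :/ { n = $NF }
        /Chunk Size :/   { chunk = $NF; sub(/K$/, "", chunk) }
        /Layout :/       { layout = $NF }
        NF >= 6 && $NF ~ /^\/dev\// { slot[$4] = $NF }   # member table rows
        END {
            devs = ""
            for (i = 0; i < n; i++) devs = devs " " slot[i]
            printf "mdadm --create %s --assume-clean --metadata=%s --level=%s --raid-devices=%d --chunk=%s --layout=%s%s\n", array, meta, level, n, chunk, layout, devs
        }
    '
}

# Trimmed copy of the output from the question:
create_cmd_from_detail <<'EOF'
/dev/md0:
        Version : 01.00.03
     Raid Level : raid5
   Raid Devices : 4
     Chunk Size : 64K
         Layout : left-symmetric
       4       8        3        0      spare rebuilding   /dev/sda3
       1       8       19        1      active sync   /dev/sdb3
       2       8       35        2      active sync   /dev/sdc3
       3       8       51        3      active sync   /dev/sdd3
EOF
```

Compare the printed line by eye against the saved details before running anything, and keep the array read-only afterwards as described above.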

adaptr

Well, it ejected `sda3` from the array which it marked as faulty, but according to the logs `sdb3` seems to be the faulty disk instead, which it left in the array. So `sda3` was erased by the rebuild and the faulty `sdb3` remained in the array, which totals 2 missing disks in a 4 disk array according to my calculations :(. Something tells me I'll end up with corrupt data. Do your steps still work in this scenario, enabling me to recover at least some (hopefully the crucial part) of the data on the array? – Remco Overdijk Jan 14 '13 at 10:12
The steps I outlined will not touch the existing data as long as you don't write to the array, so I would advise you to try that, and then mount it read-only to see what's what. – adaptr Jan 14 '13 at 10:43
