I have a Linux software RAID10 device on md0. It's made up of four 1 TB disks, sd[abcd]. Yesterday SMART emailed me to say a disk was going bad (seek errors going up and reallocated sectors). I rebooted with a new drive and added it to the array. /proc/mdstat showed it was re-syncing. Sometime mid-morning, errors started flying about a "media error" on ANOTHER disk in the array. I checked /var/log/messages and saw a ton of Emask 0x49 (media error) entries for another drive in the same array. Thanks, Murphy.

I replaced the newly failed drive but had no luck starting the array. mdadm also tells me sdc is busy. Anyone know why? sdc is the newest drive:

    # mdadm  -S /dev/md0
    mdadm: stopped /dev/md0

    # mdadm --assemble /dev/md0 /dev/sda /dev/sdb /dev/sdc /dev/sdd -fv
    mdadm: looking for devices for /dev/md0
    mdadm: /dev/sda is identified as a member of /dev/md0, slot 1.
    mdadm: /dev/sdb is identified as a member of /dev/md0, slot -1.
    mdadm: /dev/sdc is identified as a member of /dev/md0, slot -1.
    mdadm: /dev/sdd is identified as a member of /dev/md0, slot 0.
    mdadm: added /dev/sda to /dev/md0 as 1
    mdadm: no uptodate device for slot 2 of /dev/md0
    mdadm: no uptodate device for slot 3 of /dev/md0
    mdadm: added /dev/sdb to /dev/md0 as -1
    mdadm: failed to add /dev/sdc to /dev/md0: Device or resource busy
    mdadm: added /dev/sdd to /dev/md0 as 0
    mdadm: /dev/md0 assembled from 2 drives and 1 spare - not enough to start the array.

    # cat /proc/mdstat
    Personalities : [raid10]
    md0 : inactive sdd[4](S) sdb[6](S) sda[5](S)
          2930287104 blocks super 1.0

    unused devices: <none>

    # for d in a b c d; do mdadm -E /dev/sd$d; done
    /dev/sda:
              Magic : a92b4efc
            Version : 1.0
        Feature Map : 0x0
         Array UUID : 24edfbfb:f97149e1:93e019e7:fc7b3f03
               Name : bach:0
      Creation Time : Thu Sep 30 13:50:40 2010
         Raid Level : raid10
       Raid Devices : 4

     Avail Dev Size : 1953524896 (931.51 GiB 1000.20 GB)
         Array Size : 3907049472 (1863.03 GiB 2000.41 GB)
      Used Dev Size : 1953524736 (931.51 GiB 1000.20 GB)
       Super Offset : 1953525152 sectors
              State : clean
        Device UUID : fc75bc5b:e32851bb:9725e0ce:aeaa1680

        Update Time : Thu Dec 27 09:28:13 2012
           Checksum : 3a03b8e1 - correct
             Events : 7314

             Layout : near=1, far=2
         Chunk Size : 256K

        Array Slot : 5 (failed, failed, failed, failed, 0, 1, failed)
       Array State : uU__ 5 failed


    /dev/sdb:
              Magic : a92b4efc
            Version : 1.0
        Feature Map : 0x0
         Array UUID : 24edfbfb:f97149e1:93e019e7:fc7b3f03
               Name : bach:0
      Creation Time : Thu Sep 30 13:50:40 2010
         Raid Level : raid10
       Raid Devices : 4

     Avail Dev Size : 1953524896 (931.51 GiB 1000.20 GB)
         Array Size : 3907049472 (1863.03 GiB 2000.41 GB)
      Used Dev Size : 1953524736 (931.51 GiB 1000.20 GB)
       Super Offset : 1953525152 sectors
              State : clean
        Device UUID : adbb2437:931c08fc:0e5428b8:a6d0d47d

        Update Time : Thu Dec 27 09:28:13 2012
           Checksum : 3d2946ab - correct
             Events : 7306

             Layout : near=1, far=2
         Chunk Size : 256K

        Array Slot : 6 (failed, failed, failed, failed, 0, 1)
       Array State : uu__ 4 failed


    /dev/sdc:
              Magic : a92b4efc
            Version : 1.0
        Feature Map : 0x0
         Array UUID : 24edfbfb:f97149e1:93e019e7:fc7b3f03
               Name : bach:0
      Creation Time : Thu Sep 30 13:50:40 2010
         Raid Level : raid10
       Raid Devices : 4

     Avail Dev Size : 1953524896 (931.51 GiB 1000.20 GB)
         Array Size : 3907049472 (1863.03 GiB 2000.41 GB)
      Used Dev Size : 1953524736 (931.51 GiB 1000.20 GB)
       Super Offset : 1953525152 sectors
              State : clean
        Device UUID : 5c216a06:c17d4e4f:9dc5c09b:b3f7d72f

        Update Time : Thu Dec 27 09:28:13 2012
           Checksum : f5508998 - correct
             Events : 0

             Layout : near=1, far=2
         Chunk Size : 256K

        Array Slot : 6 (failed, failed, failed, failed, 0, 1)
       Array State : uu__ 4 failed


    /dev/sdd:
              Magic : a92b4efc
            Version : 1.0
        Feature Map : 0x0
         Array UUID : 24edfbfb:f97149e1:93e019e7:fc7b3f03
               Name : bach:0
      Creation Time : Thu Sep 30 13:50:40 2010
         Raid Level : raid10
       Raid Devices : 4

     Avail Dev Size : 1953524896 (931.51 GiB 1000.20 GB)
         Array Size : 3907049472 (1863.03 GiB 2000.41 GB)
      Used Dev Size : 1953524736 (931.51 GiB 1000.20 GB)
       Super Offset : 1953525152 sectors
              State : clean
        Device UUID : 69a39c8f:0b25b888:0b4e1848:42aed006

        Update Time : Thu Dec 27 09:28:13 2012
           Checksum : 3b3d0e7c - correct
             Events : 7314

             Layout : near=1, far=2
         Chunk Size : 256K

        Array Slot : 4 (failed, failed, failed, failed, 0, 1, failed)
       Array State : Uu__ 5 failed

I've got backups of the array, but it'll take a full day to restore. Any way to get this thing online?

Server Fault
  • Was the raid array entirely rebuilt before you inserted the second drive? –  Dec 28 '12 at 11:29
  • Eric -- no, not sure. It was at 37% at one point but beyond that would be sheer imagination. – Server Fault Dec 28 '12 at 12:34
  • If the second hard drive failed before the RAID was rebuilt, and it was mirroring the first drive's data, then Murphy is evil and all you can do, AFAIK, is use your backup. –  Dec 28 '12 at 12:36

1 Answer

Well, as a last-ditch attempt, I tried re-creating the array with the newly failed disk and the mdadm --assume-clean option to see what it would do. It came up, but there was no data to be found. Eh well.. yay for backups.
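
For anyone in a similar spot, it may help to compare the Events counters from mdadm -E before re-creating anything, since a member with a lower count than the others is stale. What follows is only a minimal sketch, not what was run here: md_events is a hypothetical helper name, and it assumes the mdadm -E output format shown in the question (a "/dev/...:" header line followed by an "Events : N" line per device).

```shell
# Hypothetical helper (not part of mdadm): read "mdadm -E" output on stdin
# and print one "device events" line per member, so stale members stand out.
md_events() {
    awk '
        # Header line such as "/dev/sda:" starts a new device section.
        /^\/dev\// { dev = $1; sub(/:$/, "", dev) }
        # "Events : 7314" carries the event counter as its last field.
        /^ *Events *:/ { print dev, $NF }
    '
}

# Usage (as root), reusing the loop from the question:
#   for d in a b c d; do mdadm -E /dev/sd$d; done | md_events
```

Fed the output from the question, this would list sda and sdd at 7314, sdb at 7306, and sdc at 0, which matches mdadm counting only two up-to-date drives.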

Server Fault