mdadm shows inactive raid0 array instead of degraded raid6 after disk failure


I've been running an Ubuntu 18.04 system with an 8-disk raid 6 array, which crashed while one of the disks was faulty (I only noticed the faulty disk after the crash). The array has survived multiple Ubuntu installs, so this isn't my first faulty disk; usually I keep the array running until the replacement disk arrives, but this time I can't get it running again. My plan was to first get it running again and only order the replacement once I'm sure the array is recoverable.

I've disconnected the faulty drive and the other 7 drives still seem to be working, but mdadm now thinks those 7 drives are part of a raid 0 array instead of a raid 6 array. So at the moment I'm not brave enough to try anything potentially destructive without at least some confirmation that it's more likely to work than to destroy data. I've made intermittent backups over the years, but there are probably some photos on there that haven't been backed up yet. (Yes, I know I need a better backup strategy, and as always it will be the first thing I do once this problem is either fixed or conclusively unfixable.)

When I run mdadm --assemble --scan I get the following output:

mdadm: /dev/md127 assembled from 7 drives - not enough to start the array while not clean - consider --force.

I would try the --force option, but the output of mdadm --detail /dev/md127 is as follows:

/dev/md127:
           Version : 1.2
        Raid Level : raid0
     Total Devices : 7
       Persistence : Superblock is persistent

             State : inactive
   Working Devices : 7

              Name : Ares:RaidStorage  (local to host Ares)
              UUID : 8f7270c4:ec9e19f5:afa4d7c7:2fcb1ee1
            Events : 1627931

    Number   Major   Minor   RaidDevice

       -       8       32        -        /dev/sdc
       -       8        0        -        /dev/sda
       -       8      112        -        /dev/sdh
       -       8       80        -        /dev/sdf
       -       8       48        -        /dev/sdd
       -       8       16        -        /dev/sdb
       -       8       96        -        /dev/sdg

And since mdadm --assemble --help says that --force involves modifying superblocks, I'm afraid that running --force will overwrite the superblocks with the information for a raid 0 array.
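In the meantime my plan is to at least dump the current superblock information of every member to a file, so I have a record of the raid 6 layout before anything touches the superblocks. That is essentially the same --examine loop as shown below, just redirected to files:

for i in a b c d f g h; do mdadm --examine /dev/sd$i > /root/superblock-sd$i.txt; done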

I've run mdadm --examine on the member devices, and they all still think they're part of an 8-disk raid 6 array (which I hope means there's still a chance of recovery):

root@Ares:/# for i in a b c d f g h; do mdadm --examine /dev/sd$i; done
/dev/sda:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 8f7270c4:ec9e19f5:afa4d7c7:2fcb1ee1
           Name : Ares:RaidStorage  (local to host Ares)
  Creation Time : Mon Jun 25 18:19:09 2012
     Raid Level : raid6
   Raid Devices : 8

 Avail Dev Size : 5860531120 (2794.52 GiB 3000.59 GB)
     Array Size : 17581590528 (16767.11 GiB 18003.55 GB)
  Used Dev Size : 5860530176 (2794.52 GiB 3000.59 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
   Unused Space : before=1968 sectors, after=944 sectors
          State : active
    Device UUID : 1e104c8a:529eb411:a7fd472a:5854d356

    Update Time : Fri Mar  1 21:50:02 2019
       Checksum : 712f8115 - correct
         Events : 1627931

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 0
   Array State : AAAAAAA. ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdb:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 8f7270c4:ec9e19f5:afa4d7c7:2fcb1ee1
           Name : Ares:RaidStorage  (local to host Ares)
  Creation Time : Mon Jun 25 18:19:09 2012
     Raid Level : raid6
   Raid Devices : 8

 Avail Dev Size : 5860531120 (2794.52 GiB 3000.59 GB)
     Array Size : 17581590528 (16767.11 GiB 18003.55 GB)
  Used Dev Size : 5860530176 (2794.52 GiB 3000.59 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
   Unused Space : before=1968 sectors, after=944 sectors
          State : active
    Device UUID : d3bb43b7:9f39be47:102328fa:2bab3f5e

    Update Time : Fri Mar  1 21:50:02 2019
       Checksum : ab7d4456 - correct
         Events : 1627931

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 2
   Array State : AAAAAAA. ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdc:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 8f7270c4:ec9e19f5:afa4d7c7:2fcb1ee1
           Name : Ares:RaidStorage  (local to host Ares)
  Creation Time : Mon Jun 25 18:19:09 2012
     Raid Level : raid6
   Raid Devices : 8

 Avail Dev Size : 5860531120 (2794.52 GiB 3000.59 GB)
     Array Size : 17581590528 (16767.11 GiB 18003.55 GB)
  Used Dev Size : 5860530176 (2794.52 GiB 3000.59 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
   Unused Space : before=1968 sectors, after=944 sectors
          State : active
    Device UUID : 325a0adf:3d917a47:977edea3:db21d42a

    Update Time : Fri Mar  1 21:50:02 2019
       Checksum : 494b0c89 - correct
         Events : 1627931

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 1
   Array State : AAAAAAA. ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdd:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 8f7270c4:ec9e19f5:afa4d7c7:2fcb1ee1
           Name : Ares:RaidStorage  (local to host Ares)
  Creation Time : Mon Jun 25 18:19:09 2012
     Raid Level : raid6
   Raid Devices : 8

 Avail Dev Size : 5860531120 (2794.52 GiB 3000.59 GB)
     Array Size : 17581590528 (16767.11 GiB 18003.55 GB)
  Used Dev Size : 5860530176 (2794.52 GiB 3000.59 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
   Unused Space : before=1968 sectors, after=944 sectors
          State : active
    Device UUID : 6c0200a0:37b50833:683a868b:ebfb9e94

    Update Time : Fri Mar  1 21:50:02 2019
       Checksum : 47416ea1 - correct
         Events : 1627931

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 3
   Array State : AAAAAAA. ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdf:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 8f7270c4:ec9e19f5:afa4d7c7:2fcb1ee1
           Name : Ares:RaidStorage  (local to host Ares)
  Creation Time : Mon Jun 25 18:19:09 2012
     Raid Level : raid6
   Raid Devices : 8

 Avail Dev Size : 5860531120 (2794.52 GiB 3000.59 GB)
     Array Size : 17581590528 (16767.11 GiB 18003.55 GB)
  Used Dev Size : 5860530176 (2794.52 GiB 3000.59 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
   Unused Space : before=1968 sectors, after=944 sectors
          State : active
    Device UUID : b91d04d3:3f1508ad:687bb30f:7d6fc687

    Update Time : Fri Mar  1 21:50:02 2019
       Checksum : 6b999e8b - correct
         Events : 1627931

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 4
   Array State : AAAAAAA. ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdg:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 8f7270c4:ec9e19f5:afa4d7c7:2fcb1ee1
           Name : Ares:RaidStorage  (local to host Ares)
  Creation Time : Mon Jun 25 18:19:09 2012
     Raid Level : raid6
   Raid Devices : 8

 Avail Dev Size : 5860531120 (2794.52 GiB 3000.59 GB)
     Array Size : 17581590528 (16767.11 GiB 18003.55 GB)
  Used Dev Size : 5860530176 (2794.52 GiB 3000.59 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
   Unused Space : before=1960 sectors, after=944 sectors
          State : active
    Device UUID : 64ba7519:7d47e97c:21c5622a:18df9eca

    Update Time : Fri Mar  1 21:50:02 2019
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : df7c2710 - correct
         Events : 1627931

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 5
   Array State : AAAAAAA. ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdh:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 8f7270c4:ec9e19f5:afa4d7c7:2fcb1ee1
           Name : Ares:RaidStorage  (local to host Ares)
  Creation Time : Mon Jun 25 18:19:09 2012
     Raid Level : raid6
   Raid Devices : 8

 Avail Dev Size : 5860531120 (2794.52 GiB 3000.59 GB)
     Array Size : 17581590528 (16767.11 GiB 18003.55 GB)
  Used Dev Size : 5860530176 (2794.52 GiB 3000.59 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
   Unused Space : before=1968 sectors, after=944 sectors
          State : active
    Device UUID : 493cfa55:b00800db:40c8fbc4:c94dabbb

    Update Time : Fri Mar  1 21:50:02 2019
       Checksum : 5b4dbb3 - correct
         Events : 1627931

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 6
   Array State : AAAAAAA. ('A' == active, '.' == missing, 'R' == replacing)

For completeness' sake, here's the output of some other commands that are probably relevant:

root@Ares:/# mdadm --examine --scan
ARRAY /dev/md/RaidStorage  metadata=1.2 UUID=8f7270c4:ec9e19f5:afa4d7c7:2fcb1ee1 name=Ares:RaidStorage

root@Ares:/# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
unused devices: <none>

root@Ares:/# cat /etc/mdadm/mdadm.conf
# mdadm.conf
#
# !NB! Run update-initramfs -u after updating this file.
# !NB! This will ensure that initramfs has an uptodate copy.
#
# Please refer to mdadm.conf(5) for information about this file.
#

# by default (built-in), scan all partitions (/proc/partitions) and all
# containers for MD superblocks. alternatively, specify devices to scan, using
# wildcards if desired.
#DEVICE partitions containers

# automatically tag new arrays as belonging to the local system
HOMEHOST <system>

# instruct the monitoring daemon where to send mail alerts
MAILADDR root

# definitions of existing MD arrays
ARRAY /dev/md/RaidStorage  metadata=1.2 UUID=8f7270c4:ec9e19f5:afa4d7c7:2fcb1ee1 name=Ares:RaidStorage

# This configuration was auto-generated on Mon, 01 Oct 2018 20:35:13 +0200 by mkconf

To rule out a misconfiguration in my Kubuntu installation, I've also tried mdadm --assemble --scan from a Kubuntu USB stick, but that had exactly the same effect as running it from my normal install.

So my questions are as follows:

  1. How is it possible that all drives think they're part of my 8-disk raid 6 array, yet mdadm --assemble --scan still results in an inactive raid 0 array?
  2. Can I safely run mdadm --assemble --scan --force? (The full sequence I have in mind is sketched after this list.)
  3. If the answer to 2 is no: how can I convince mdadm to see my 7 disks as part of an 8-disk raid 6 array?
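
For reference, the exact sequence I have in mind for question 2 is roughly this (assuming the half-assembled array keeps showing up as /dev/md127):

mdadm --stop /dev/md127
mdadm --assemble --scan --force

or, if listing the seven remaining members explicitly is safer:

mdadm --stop /dev/md127
mdadm --assemble --force /dev/md127 /dev/sd[abcdfgh]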

D.F.


Answers


This worked for me:

  • Go into /etc/mdadm/mdadm.conf.
  • Edit the ARRAY line for your array and add the raid level and the number of disks. Mine was

    level=raid6 num-devices=6

    Obviously in your case you'd need to say 8 devices. :) (A full example line for your array is sketched below.)
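
Applied to the array in your question, I'd expect the resulting line to look roughly like this; the level= and num-devices= parts are the additions, and the UUID and name are copied from your question (don't forget the update-initramfs -u step mentioned in the comments at the top of that file):

    ARRAY /dev/md/RaidStorage  metadata=1.2 level=raid6 num-devices=8 UUID=8f7270c4:ec9e19f5:afa4d7c7:2fcb1ee1 name=Ares:RaidStorage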

In my case it looks like the system had massively overheated (two boxes in a tight space) and the disks went haywire. One disk dropped out, which I initially thought was what was breaking the array, and another disappeared entirely. But when I checked the event counts, I had 5 disks with the same count, and the first disk to drop out was behind.

After updating mdadm.conf, I restarted the system and checked that all the disks were detected. The assemble command then needed the --force option (scary), but it worked fine and the array started checking itself. Then I added the first disk to drop out back in, and it looks like mdadm just updated it; it's all clean now. The rough sequence is sketched below.
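
For reference, the rough command sequence was something like the following; the /dev/md127 and /dev/sdX names are placeholders, so substitute your own array device and whichever disk dropped out first:

    mdadm --stop /dev/md127
    mdadm --assemble --scan --force
    cat /proc/mdstat                          # watch until the check/resync has finished
    mdadm --manage /dev/md127 --add /dev/sdX  # re-add the disk that dropped out first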

Dave
