Metadata of the assembled RAID inconsistent with the metadata of the individual drives?

We have a RAID 10 array with two failed drives; one drive from each set is still functional.

When booting into the rescue system, the metadata seems to be fine and consistent with the expected state.

The metadata from mdadm --detail of the md device is as follows:

  Version : 1.1
  Creation Time : Mon Mar 16 15:53:57 2015
     Raid Level : raid10
  Used Dev Size : 975581184 (930.39 GiB 999.00 GB)
   Raid Devices : 4
  Total Devices : 2
    Persistence : Superblock is persistent

    Update Time : Mon May 28 08:52:58 2018
          State : active, FAILED, Not Started 
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

         Layout : near=2
     Chunk Size : 512K

           Name : 2
           UUID : 34f4a5fa:4b8e03fa:3119b353:f45188a0
         Events : 8618632

    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync set-A   /dev/sda1
       1       8       17        1      active sync set-B   /dev/sdb1
       4       0        0        4      removed
       6       0        0        6      removed

The init system can't assemble the RAID; the kernel claims that there are not enough mirrors.

(...)
md/raid10:md2: not enough operational mirrors.
md: pers->run() failed ...
dracut: mdadm: failed to start array /dev/md2: Input/output error
(...)
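
For context, our understanding of the near=2 layout (a hedged sketch; the slot pairing is assumed from the standard md raid10 near layout, where adjacent device slots hold copies of the same chunks):

```shell
# In a 4-device raid10 with near=2, each chunk has two copies on
# adjacent device slots, so the mirror pairs are (0,1) and (2,3).
for slot in 0 1 2 3; do
  echo "slot $slot is mirrored with slot $(( slot ^ 1 ))"
done
```

If both slots of one pair are missing, the kernel has no operational mirror for those chunks, which would match the message above.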

Trying to assemble the RAID manually (mdadm --assemble --readonly --force /dev/md2 /dev/sd[ab]1) yields the following:

/dev/md2:
        Version : 1.1
     Raid Level : raid0
  Total Devices : 1
    Persistence : Superblock is persistent

          State : inactive

           Name : 2
           UUID : 34f4a5fa:4b8e03fa:3119b353:f45188a0
         Events : 8618632

    Number   Major   Minor   RaidDevice

       -       8        1        -        /dev/sda1

Checking the metadata of the participating drives with --examine (both before and after the manual assembly attempt) gives output consistent with the expected state:

/dev/sda1:
          Magic : a92b4efc
        Version : 1.1
    Feature Map : 0x1
     Array UUID : 34f4a5fa:4b8e03fa:3119b353:f45188a0
           Name : 2
  Creation Time : Mon Mar 16 15:53:57 2015
     Raid Level : raid10
   Raid Devices : 4

 Avail Dev Size : 1951162368 (930.39 GiB 999.00 GB)
     Array Size : 1951162368 (1860.77 GiB 1997.99 GB)
    Data Offset : 262144 sectors
   Super Offset : 0 sectors
   Unused Space : before=262064 sectors, after=0 sectors
          State : clean
    Device UUID : 89288c87:2cf8f6cd:483328b4:fffb3db6

Internal Bitmap : 8 sectors from superblock
    Update Time : Mon May 28 08:52:58 2018
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : eaf59503 - correct
         Events : 8618632

         Layout : near=2
     Chunk Size : 512K

   Device Role : Active device 0
   Array State : AAA. ('A' == active, '.' == missing, 'R' == replacing)
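
This is the inconsistency in a nutshell: counting the 'A' slots in the per-member Array State string gives three expected active devices, while --detail above reports only two. A trivial sketch of that count (the state string is copied verbatim from the --examine output above):

```shell
# Array State string as recorded in /dev/sda1's superblock
member_state="AAA."
# Keep only the 'A' (active) slots and count them
active=$(printf '%s' "$member_state" | tr -cd 'A')
echo "superblock expects ${#active} active devices"
# prints: superblock expects 3 active devices
# yet mdadm --detail after boot reports only 2 active devices
```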

We are aware that the third active drive has been removed, but this shouldn't be the root of the issue.

So our two main questions are:

  1. Why is the state of the array inconsistent with the individual drives?
  2. How can this be resolved?

For the record: CentOS 6, kernel version 2.6.32-696.30.1.el6.x86_64.

user910763

Posted 2018-06-01T10:40:12.847

Reputation: 31

@grawity: Not that we know, we will check that the next time we go to the system. – user910763 – 2018-06-01T11:14:11.170

No answers