mdadm RAID underlying an LVM gone after reboot


The main part of my problem has been discussed here before ... but I have two specific questions that haven't been answered yet.

The situation: I upgraded my hardware, installed two new disks, and changed the setup.

This is the new setup

PROMPT> blkid
/dev/nvme0n1p3: UUID="29cd2fd5-2cd5-455c-9ac0-7cb278e46ee3" TYPE="swap" PARTUUID="dc3d569d-363f-4f68-87ac-56e1bdb4f29d"
/dev/nvme0n1p1: UUID="D2F8-117C" TYPE="vfat" PARTUUID="cdc50870-bddc-47bf-9835-369694713e41"
/dev/nvme0n1p2: UUID="e36edd7b-d3fd-45f4-87a8-f828238aab08" TYPE="ext4" PARTUUID="25d98b84-55a1-4dd8-81c1-f2cfece5c802"
/dev/sda: UUID="2d9225e8-edc1-01fc-7d54-45fcbdcc8020" UUID_SUB="fec59176-0f3a-74d4-1947-32b508978749" LABEL="fangorn:0" TYPE="linux_raid_member"
/dev/sde: UUID="2d9225e8-edc1-01fc-7d54-45fcbdcc8020" UUID_SUB="efce71d0-3080-eecb-ce2f-8a166b2e4441" LABEL="fangorn:0" TYPE="linux_raid_member"
/dev/sdf: UUID="dddf84e3-2ec3-4156-8d18-5f1dce7be002" UUID_SUB="be5e9e9f-d627-2968-837c-9f656d2f62ba" LABEL="fangorn:1" TYPE="linux_raid_member"
/dev/sdg: UUID="dddf84e3-2ec3-4156-8d18-5f1dce7be002" UUID_SUB="1e73b358-a0c8-1c2d-34bd-8663e7906e6f" LABEL="fangorn:1" TYPE="linux_raid_member"
/dev/sdh1: UUID="6588304d-6098-4f80-836e-0e4832e2de8f" TYPE="ext4" PARTUUID="000533fc-01"
/dev/md0: UUID="7Nyd6C-oG50-b3jJ-aGkL-feIE-pAFc-5uM7vy" TYPE="LVM2_member"
/dev/md1: UUID="MtJAdS-Jdbn-2MR7-6evR-XCvL-wEm5-PUkg8p" TYPE="LVM2_member"
/dev/mapper/fg00-Data: LABEL="data" UUID="a26b7a38-d24f-4d28-ab8d-89233db95be6" TYPE="ext4"

PROMPT> mdadm -Es 
ARRAY /dev/md/1  metadata=1.2 UUID=dddf84e3:2ec34156:8d185f1d:ce7be002 name=fangorn:1
ARRAY /dev/md/0  metadata=1.2 UUID=2d9225e8:edc101fc:7d5445fc:bdcc8020 name=fangorn:0

Strange: there are no /dev/md/0 and /dev/md/1 device files, only /dev/md0 and /dev/md1.

PROMPT> mdadm --detail /dev/md[01]
           Version : 1.2
     Creation Time : Wed Oct 16 16:51:48 2019
        Raid Level : raid1
        Array Size : 3906886464 (3725.90 GiB 4000.65 GB)
     Used Dev Size : 3906886464 (3725.90 GiB 4000.65 GB)
      Raid Devices : 2
     Total Devices : 2
       Persistence : Superblock is persistent

     Intent Bitmap : Internal

       Update Time : Wed Oct 16 16:57:03 2019
             State : clean 
    Active Devices : 2
   Working Devices : 2
    Failed Devices : 0
     Spare Devices : 0

Consistency Policy : bitmap

              Name : fangorn:0  (local to host fangorn)
              UUID : 2d9225e8:edc101fc:7d5445fc:bdcc8020
            Events : 2

    Number   Major   Minor   RaidDevice State
       0       8        0        0      active sync   /dev/sda
       1       8       64        1      active sync   /dev/sde
           Version : 1.2
     Creation Time : Wed Oct 16 16:51:56 2019
        Raid Level : raid1
        Array Size : 1953382464 (1862.89 GiB 2000.26 GB)
     Used Dev Size : 1953382464 (1862.89 GiB 2000.26 GB)
      Raid Devices : 2
     Total Devices : 2
       Persistence : Superblock is persistent

     Intent Bitmap : Internal

       Update Time : Wed Oct 16 16:51:56 2019
             State : clean 
    Active Devices : 2
   Working Devices : 2
    Failed Devices : 0
     Spare Devices : 0

Consistency Policy : bitmap

              Name : fangorn:1  (local to host fangorn)
              UUID : dddf84e3:2ec34156:8d185f1d:ce7be002
            Events : 1

    Number   Major   Minor   RaidDevice State
       0       8       80        0      active sync   /dev/sdf
       1       8       96        1      active sync   /dev/sdg

On top of these two RAID1 arrays I built a volume group to glue them together into one single logical volume, which I mount as /data.

The problem

Whenever I reboot, the RAID arrays are lost. There is not a single trace left of them.

Yes, I did edit /etc/mdadm/mdadm.conf:

PROMPT> cat /etc/mdadm/mdadm.conf 
# mdadm.conf
# !NB! Run update-initramfs -u after updating this file.
# !NB! This will ensure that initramfs has an uptodate copy.
# Please refer to mdadm.conf(5) for information about this file.

# by default (built-in), scan all partitions (/proc/partitions) and all
# containers for MD superblocks. alternatively, specify devices to scan, using
# wildcards if desired.
#DEVICE partitions containers
DEVICE /dev/sda /dev/sde /dev/sdf /dev/sdg

# automatically tag new arrays as belonging to the local system
HOMEHOST <system>

# instruct the monitoring daemon where to send mail alerts
MAILADDR m++++@++++++.++

# definitions of existing MD arrays

#ARRAY /dev/md1 metadata=1.2 name=fangorn:1 UUID=7068062b:a1a9265e:a7b5dc00:586d9f1b
#ARRAY /dev/md0 metadata=1.2 name=fangorn:0 UUID=9ab9aecd:cdfd3fe8:04587007:892edf3e
ARRAY /dev/md0 level=raid1 num-devices=2 metadata=1.2 name=fangorn:0 UUID=7368baaf:9b08df19:d9362975:bf70eb1f devices=/dev/sda,/dev/sde
ARRAY /dev/md1 level=raid1 num-devices=2 metadata=1.2 name=fangorn:1 UUID=dc218d09:18f63682:78b5ab94:6aa53459 devices=/dev/sdf,/dev/sdg

Yes, I ran update-initramfs -u.

The only way to bring back my data is to recreate the arrays after every reboot:
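One thing that can be checked at this point is whether the UUIDs in mdadm.conf still match the superblocks on disk. In the outputs above they do not, presumably because every mdadm --create writes a new superblock with a fresh UUID. The following is only a sketch: the helper name array_uuids is hypothetical, and the sample input is copied (shortened) from the listings above.

```shell
#!/bin/sh
# Hypothetical helper: print the UUID of every active (uncommented) ARRAY line on stdin.
array_uuids() {
    grep '^ARRAY' | grep -o 'UUID=[0-9a-f:]*' | sort
}

# ARRAY lines as found in /etc/mdadm/mdadm.conf above (shortened).
conf_uuids=$(array_uuids <<'EOF'
#ARRAY /dev/md0 metadata=1.2 name=fangorn:0 UUID=9ab9aecd:cdfd3fe8:04587007:892edf3e
ARRAY /dev/md0 metadata=1.2 name=fangorn:0 UUID=7368baaf:9b08df19:d9362975:bf70eb1f
ARRAY /dev/md1 metadata=1.2 name=fangorn:1 UUID=dc218d09:18f63682:78b5ab94:6aa53459
EOF
)

# ARRAY lines as printed by "mdadm -Es" above.
scan_uuids=$(array_uuids <<'EOF'
ARRAY /dev/md/1  metadata=1.2 UUID=dddf84e3:2ec34156:8d185f1d:ce7be002 name=fangorn:1
ARRAY /dev/md/0  metadata=1.2 UUID=2d9225e8:edc101fc:7d5445fc:bdcc8020 name=fangorn:0
EOF
)

if [ "$conf_uuids" = "$scan_uuids" ]; then
    echo "mdadm.conf matches the on-disk superblocks"
else
    echo "mdadm.conf is stale: UUIDs differ from the on-disk superblocks"
fi
```

On a live system the two inputs would be /etc/mdadm/mdadm.conf and the output of mdadm --examine --scan instead of the embedded samples.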

mdadm --create /dev/md0 --assume-clean --level=raid1 --verbose --raid-devices=2 /dev/sda /dev/sde
mdadm --create /dev/md1 --assume-clean --level=raid1 --verbose --raid-devices=2 /dev/sdf /dev/sdg
  • Note the --assume-clean switches.
  • The LVM reappears immediately and can be mounted.
  • No data is lost.

BUT: how can I make the system reassemble the arrays on reboot?

There is already quite a lot of data on the disks, so I would not want to repartition the underlying hardware unless there is a way to do so without losing data.

Can I access the data without the arrays and LVM being up and running?

Additional Information

  • added 2019-10-22

Right after reboot, i.e. when the reassembly has failed and I am in single-user mode, I get the following output from mdadm (I deleted mdadm.conf in the meantime to see whether that would help; it didn't):

PROMPT> mdadm --assemble --scan -v 
mdadm: looking for devices for further assembly
mdadm: cannot open device /dev/sr0: No medium found
mdadm: no recogniseable superblock on /dev/sdh1
mdadm: Cannot assemble mbr metadata on /dev/sdh
mdadm: Cannot assemble mbr metadata on /dev/sdg
mdadm: Cannot assemble mbr metadata on /dev/sdf
mdadm: Cannot assemble mbr metadata on /dev/sde
mdadm: Cannot assemble mbr metadata on /dev/sda
mdadm: no recogniseable superblock on /dev/nvme0n1p3
mdadm: no recogniseable superblock on /dev/nvme0n1p2
mdadm: Cannot assemble mbr metadata on /dev/nvme0n1p1
mdadm: Cannot assemble mbr metadata on /dev/nvme0n1
mdadm: No arrays found in config file or automatically

After that I recreated the arrays as described above and got this output:

PROMPT> mdadm --create /dev/md0 --assume-clean --level=raid1 --verbose --raid-devices=2 /dev/sda /dev/sde
mdadm: partition table exists on /dev/sda
mdadm: partition table exists on /dev/sda but will be lost or
       meaningless after creating array
mdadm: Note: this array has metadata at the start and
    may not be suitable as a boot device.  If you plan to
    store '/boot' on this device please ensure that
    your boot-loader understands md/v1.x metadata, or use
mdadm: partition table exists on /dev/sde
mdadm: partition table exists on /dev/sde but will be lost or
       meaningless after creating array
mdadm: size set to 3906886464K
mdadm: automatically enabling write-intent bitmap on large array
Continue creating array? 
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started.

PROMPT> mdadm --create /dev/md1 --assume-clean --level=raid1 --verbose --raid-devices=2 /dev/sdf /dev/sdg
mdadm: partition table exists on /dev/sdf
mdadm: partition table exists on /dev/sdf but will be lost or
       meaningless after creating array
mdadm: Note: this array has metadata at the start and
    may not be suitable as a boot device.  If you plan to
    store '/boot' on this device please ensure that
    your boot-loader understands md/v1.x metadata, or use
mdadm: partition table exists on /dev/sdg
mdadm: partition table exists on /dev/sdg but will be lost or
       meaningless after creating array
mdadm: size set to 1953382464K
mdadm: automatically enabling write-intent bitmap on large array
Continue creating array?
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md1 started.

Did another lsblk afterwards:

sda             3,7T linux_raid_member disk  
└─md0           3,7T LVM2_member       raid1 
  └─fg00-Data   5,5T ext4              lvm   /data
sde             3,7T linux_raid_member disk  
└─md0           3,7T LVM2_member       raid1 
  └─fg00-Data   5,5T ext4              lvm   /data
sdf             1,8T linux_raid_member disk  
└─md1           1,8T LVM2_member       raid1 
  └─fg00-Data   5,5T ext4              lvm   /data
sdg             1,8T linux_raid_member disk  
└─md1           1,8T LVM2_member       raid1 
  └─fg00-Data   5,5T ext4              lvm   /data
sdh           119,2G                   disk  
└─sdh1        119,2G ext4              part  /home
sr0            1024M                   rom   
nvme0n1         477G                   disk  
├─nvme0n1p1     300M vfat              part  /boot/efi
├─nvme0n1p2   442,1G ext4              part  /
└─nvme0n1p3    34,6G swap              part  [SWAP]

Is this a hint someone can make sense of?

PROMPT> fdisk -l

The primary GPT table is corrupt, but the backup appears OK, so that will be used.
Disk /dev/sda: 3,65 TiB, 4000787030016 bytes, 7814037168 sectors
Disk model: ST4000DM004-2CV1
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 513C284A-5CC0-4888-8AD0-83C4291B3D78

The primary GPT table is corrupt, but the backup appears OK, so that will be used.
Disk /dev/sde: 3,65 TiB, 4000787030016 bytes, 7814037168 sectors
Disk model: ST4000DM004-2CV1
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 437D10E3-E679-4062-9321-E8EE1A1AA2F5

The primary GPT table is corrupt, but the backup appears OK, so that will be used.
Disk /dev/sdf: 1,84 TiB, 2000398934016 bytes, 3907029168 sectors
Disk model: ST2000DL003-9VT1
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: E07D5DB8-7253-45DE-92C1-255B7F3E56C8

The primary GPT table is corrupt, but the backup appears OK, so that will be used.
Disk /dev/sdg: 1,84 TiB, 2000398934016 bytes, 3907029168 sectors
Disk model: Hitachi HDS5C302
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: E07D5DB8-7253-45DE-92C1-255B7F3E56C8

Disk /dev/md0: 3,65 TiB, 4000651739136 bytes, 7813772928 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes

Disk /dev/md1: 1,84 TiB, 2000263643136 bytes, 3906764928 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes

Disk /dev/mapper/fg00-Data: 5,47 TiB, 6000908173312 bytes, 11720523776 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes

Another bit of information:

PROMPT> gdisk /dev/sda
GPT fdisk (gdisk) version 1.0.4

Caution! After loading partitions, the CRC doesn't check out!
Warning! Main partition table CRC mismatch! Loaded backup partition table
instead of main partition table!

Warning! One or more CRCs don't match. You should repair the disk!
Main header: OK
Backup header: OK
Main partition table: ERROR
Backup partition table: OK

Partition table scan:
  MBR: protective
  BSD: not present
  APM: not present
  GPT: damaged

Caution: Found protective or hybrid MBR and corrupt GPT. Using GPT, but disk
verification and recovery are STRONGLY recommended.
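The gdisk report (main header OK, main partition table damaged, backup OK) would be consistent with the md superblock having been written into the area of the primary GPT entry array. The following is a rough calculation, not something read from the disks. It assumes the usual layout: 512-byte logical sectors (as reported by fdisk above), the GPT header in LBA 1, 128 entries of 128 bytes starting at LBA 2, and the v1.2 md superblock located 4 KiB from the start of the member device.

```shell
#!/bin/sh
sector=512                                   # logical sector size (assumed, see fdisk output)
gpt_header_start=$((1 * sector))             # GPT header lives in LBA 1
gpt_header_end=$((2 * sector - 1))
gpt_entries_start=$((2 * sector))            # entry array usually starts at LBA 2
gpt_entries_end=$((gpt_entries_start + 128 * 128 - 1))
md_sb_start=$((4 * 1024))                    # v1.2 superblock: 4 KiB from device start

# in_range VALUE LOW HIGH: succeed if LOW <= VALUE <= HIGH
in_range() { [ "$1" -ge "$2" ] && [ "$1" -le "$3" ]; }

# report NAME LOW HIGH: say whether the md superblock starts inside the region
report() {
    if in_range "$md_sb_start" "$2" "$3"; then
        echo "GPT $1 (bytes $2-$3): hit by md superblock at byte $md_sb_start"
    else
        echo "GPT $1 (bytes $2-$3): not touched by md superblock at byte $md_sb_start"
    fi
}

report "header" "$gpt_header_start" "$gpt_header_end"
report "entry array" "$gpt_entries_start" "$gpt_entries_end"
```

Under these assumptions the header (bytes 512-1023) is untouched while the entry array (bytes 1024-17407) is hit, which matches "Main header: OK" and "Main partition table: ERROR". Anything that repairs the primary table from the intact backup would then overwrite the md superblock, which may be why the arrays vanish after a reboot.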

Martin L.

Posted 2019-10-16T15:40:27.077


This Question is quite similar but without any solution:

– Martin L. – 2019-10-22T07:05:15.733

In another thread I found the following note:

On a different note: If you don’t need to boot from the RAID5, there is no need to add the configuration to the /etc/mdadm/mdadm.conf file. Ubuntu will automatically start the RAID, since the configuration is available in the RAID superblock.

... which exactly is what does not happen in my case.

– Martin L. – 2019-10-22T07:10:57.760

Strange observation: mdadm --assemble --scan -v tells me "mdadm: /dev/sd[aefg] has wrong uuid". – Martin L. – 2019-10-22T07:20:12.707

A comment on this question by @jmlnik makes me think I got the RAID completely wrong...

– Martin L. – 2019-10-23T07:08:05.013

– Martin L. – 2019-10-23T07:25:49.597



I'd like to present another variant of Martin L.'s solution. It differs in that it introduces much less downtime, because the data migration onto the new arrays can be done transparently while the system is running. You will only experience reduced disk performance during the migration.

Follow his answer up to the point where he suggests creating a new VG.

Don't create a new VG. Create new PVs on the newly created arrays, and extend your existing VG with these PVs: vgextend fg00 /dev/md-NEW.

Then move the logical volumes from the old PVs to the new ones with pvmove /dev/md-OLD. This can be done even while the file systems are mounted and in use. It will take a long time, but it will finish eventually. I'd run this within screen, and verbosely: screen pvmove -vi5 /dev/md-OLD, so that it is not interrupted if the SSH session closes and it shows progress every 5 seconds.
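As a rough sketch, the two steps above (extending the VG, then moving the extents) could be scripted as follows. The run helper and the DRY_RUN variable are hypothetical conveniences, not LVM features; by default the script only prints the commands. /dev/md-NEW and /dev/md-OLD are the placeholders used above, not real device names.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print the command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# migrate_pv VG OLD_PV NEW_PV
migrate_pv() {
    run pvcreate "$3"          # put a PV label on the new array
    run vgextend "$1" "$3"     # add the new PV to the existing VG
    run pvmove -v -i 5 "$2"    # move all extents off the old PV, report every 5 s
}

migrate_pv fg00 /dev/md-OLD /dev/md-NEW
```

With DRY_RUN=0 the same function would execute the commands for real; wrapping the call in screen, as described above, keeps pvmove alive if the SSH session drops.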

There may not be enough PEs in the new PVs to do this. Because you now use partitions instead of whole drives, the usable space, and thus the array size, is slightly smaller. If so, you have to shrink one LV: for example, unmount the FS, shrink it (with resize2fs), and then reduce the LV size. This takes longer, but is still faster than copying a busy file system file by file.
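A sketch of the shrink order, which matters: the file system has to be made smaller before the LV, never the other way round. The target size 5400G is purely hypothetical; /data and /dev/mapper/fg00-Data are taken from the question. The run helper is a hypothetical dry-run wrapper that only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print the command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# shrink_lv MOUNTPOINT LV_DEVICE NEW_SIZE
shrink_lv() {
    run umount "$1"
    run e2fsck -f "$2"           # resize2fs wants a fresh check before an offline shrink
    run resize2fs "$2" "$3"      # 1. shrink the ext4 file system
    run lvreduce -L "$3" "$2"    # 2. shrink the LV to the same size
    run mount "$2" "$1"
}

shrink_lv /data /dev/mapper/fg00-Data 5400G
```

Both resize2fs and lvreduce interpret the G suffix as GiB here, so the file system and the LV end up the same size.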

When the old PVs are empty (pvmove has finished), remove them from the VG, remove the PV labels, and remove the old arrays. Zap the now unused drives, partition them, and add them to the running arrays. The array resync will also be done in the background, and you will only experience reduced disk performance until it completes.
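The cleanup could look roughly like this. It is a sketch under assumptions: /dev/sdX stands for a freed whole-disk member, the 100 MiB left unused at the end of the disk is an arbitrary choice, and the run helper is a hypothetical dry-run wrapper that only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print the command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# retire_old_array VG OLD_MD FREED_DISK NEW_MD
retire_old_array() {
    run vgreduce "$1" "$2"                   # drop the now empty PV from the VG
    run pvremove "$2"                        # wipe the PV label
    run mdadm --stop "$2"                    # stop the old array
    run mdadm --zero-superblock "$3"         # erase the md superblock on the old member
    run sgdisk --zap-all "$3"                # clear leftover GPT/MBR structures
    run sgdisk -n 1:0:-100M -t 1:fd00 "$3"   # one Linux RAID partition, slightly smaller than the disk
    run mdadm --manage "$4" --add "${3}1"    # add the partition; the resync starts
}

retire_old_array fg00 /dev/md-OLD /dev/sdX /dev/md-NEW
```

The zero-superblock and partitioning steps have to be repeated for every disk that was a member of the old array.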

Now, don't forget to fix booting, i.e. mdadm --examine --scan >> /etc/mdadm/mdadm.conf, update-initramfs and so on.
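Appending with >> leaves any old ARRAY lines in place, and stale entries are exactly what the question's mdadm.conf suffered from. A possible alternative, sketched here with a hypothetical helper rebuild_conf and made-up sample contents, is to drop the old ARRAY lines and write the fresh ones in a single pass.

```shell
#!/bin/sh
# rebuild_conf OLD_CONF SCAN_OUTPUT
# Print OLD_CONF without its ARRAY lines, followed by the ARRAY lines from SCAN_OUTPUT.
rebuild_conf() {
    grep -v '^ARRAY' "$1"
    grep '^ARRAY' "$2"
}

# Demo with temporary files; the contents are made-up placeholders.
tmp=$(mktemp -d)
printf '%s\n' 'HOMEHOST <system>' 'ARRAY /dev/md0 UUID=old' > "$tmp/mdadm.conf"
printf '%s\n' 'ARRAY /dev/md/2 metadata=1.2 UUID=new name=host:2' > "$tmp/scan.txt"
new_conf=$(rebuild_conf "$tmp/mdadm.conf" "$tmp/scan.txt")
rm -r "$tmp"

echo "$new_conf"
```

On a real system the second file would hold the output of mdadm --examine --scan, the result would replace /etc/mdadm/mdadm.conf, and update-initramfs -u would follow.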

Nikita Kipriyanov

Posted 2019-10-16T15:40:27.077



@nh2 gives an easy but possibly dangerous solution in his answer to "What's the difference between creating mdadm array using partitions or the whole disks directly":

By the way, if this happens to you, your data is not lost. You can most likely just sgdisk --zap the device, and then recreate the RAID with e.g. mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdc /dev/sdd (mdadm will tell you that it already detects past data, and asks you if you want to continue re-using that data). I tried this multiple times and it worked, but I still recommend taking a backup before you do it.

After some lengthy research I managed to find a solution.

Here is what I did.

First, some status information:

PROMPT> df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/fg00-Data 5,4T  1,5T  3,8T  28% /data

Then unmount the partition

PROMPT> umount /data

PROMPT> cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md1 : active raid1 sdg[1] sdf[0]
      1953382464 blocks super 1.2 [2/2] [UU]
      bitmap: 0/15 pages [0KB], 65536KB chunk

md0 : active raid1 sde[1] sda[0]
      3906886464 blocks super 1.2 [2/2] [UU]
      bitmap: 0/30 pages [0KB], 65536KB chunk

unused devices: <none>

Now I degrade the two arrays

PROMPT > mdadm --manage /dev/md0 --fail /dev/sde
mdadm: set /dev/sde faulty in /dev/md0

PROMPT > mdadm --manage /dev/md1 --fail /dev/sdg
mdadm: set /dev/sdg faulty in /dev/md1

PROMPT > cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md1 : active raid1 sdg[1](F) sdf[0]
      1953382464 blocks super 1.2 [2/1] [U_]
      bitmap: 0/15 pages [0KB], 65536KB chunk

md0 : active raid1 sde[1](F) sda[0]
      3906886464 blocks super 1.2 [2/1] [U_]
      bitmap: 0/30 pages [0KB], 65536KB chunk

unused devices: <none>

Remove the disks from the array

PROMPT > mdadm --manage /dev/md0 --remove /dev/sde 
mdadm: hot removed /dev/sde from /dev/md0
PROMPT > mdadm --manage /dev/md1 --remove /dev/sdg
mdadm: hot removed /dev/sdg from /dev/md1

PROMPT > cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md1 : active raid1 sdf[0]
      1953382464 blocks super 1.2 [2/1] [U_]
      bitmap: 0/15 pages [0KB], 65536KB chunk

md0 : active raid1 sda[0]
      3906886464 blocks super 1.2 [2/1] [U_]
      bitmap: 0/30 pages [0KB], 65536KB chunk

unused devices: <none>

Now /dev/sde and /dev/sdg are free to be (re)partitioned.

  • So I created new partitions on /dev/sde and /dev/sdg, as suggested a few MB smaller than the available space.
  • Created new 2-disk RAID1 arrays with one active disk and one "missing".
  • Built a new LVM volume group with those new arrays as physical volumes.
  • Created a logical volume on top of it (same size as the old one, minus the few MB I lost when creating the partitions).
  • Copied all data from the old LV to the new one.
  • Destroyed the old RAID arrays and added the disks' partitions to the new ones.

Here is the new status:
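The first four steps of the list above can be sketched as follows. The names md2, md3, fg01 and Data are taken from the lsblk output of the final state; the 100 MiB left unused per disk, the use of 100%FREE for the LV size, and the run helper (a dry-run wrapper that only prints the commands unless DRY_RUN=0 is set) are assumptions for illustration, not the exact commands that were run.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print the command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# new_half_array DISK MD: partition DISK and create a degraded RAID1 on its first partition.
new_half_array() {
    run sgdisk --zap-all "$1"                 # clear the damaged GPT
    run sgdisk -n 1:0:-100M -t 1:fd00 "$1"    # one Linux RAID partition, a bit smaller than the disk
    run mdadm --create "$2" --level=1 --raid-devices=2 "${1}1" missing
}

new_half_array /dev/sde /dev/md2
new_half_array /dev/sdg /dev/md3
run pvcreate /dev/md2 /dev/md3
run vgcreate fg01 /dev/md2 /dev/md3
run lvcreate -n Data -l 100%FREE fg01
run mkfs.ext4 -L data /dev/fg01/Data
```

Copying the data and adding the remaining disks' partitions (mdadm --manage /dev/md2 --add /dev/sda1 and so on) are not shown.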

NAME              SIZE FSTYPE            TYPE  MOUNTPOINT
sda               3,7T                   disk  
└─sda1            3,7T linux_raid_member part  
  └─md2           3,7T LVM2_member       raid1 
    └─fg01-Data   5,5T ext4              lvm   /data
sde               3,7T                   disk  
└─sde1            3,7T linux_raid_member part  
  └─md2           3,7T LVM2_member       raid1 
    └─fg01-Data   5,5T ext4              lvm   /data
sdf               1,8T                   disk  
└─sdf1            1,8T linux_raid_member part  
  └─md3           1,8T LVM2_member       raid1 
    └─fg01-Data   5,5T ext4              lvm   /data
sdg               1,8T                   disk  
└─sdg1            1,8T linux_raid_member part  
  └─md3           1,8T LVM2_member       raid1 
    └─fg01-Data   5,5T ext4              lvm   /data
sdh             119,2G                   disk  
└─sdh1          119,2G ext4              part  /home
sr0              1024M                   rom   
nvme0n1           477G                   disk  
├─nvme0n1p1       300M vfat              part  /boot/efi
├─nvme0n1p2     442,1G ext4              part  /
└─nvme0n1p3      34,6G swap              part  [SWAP]

PROMPT > cat /proc/mdstat 
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md3 : active raid1 sdf1[1] sdg1[0]
      1953381376 blocks super 1.2 [2/1] [U_]
      [==>..................]  recovery = 10.0% (196493504/1953381376) finish=224.9min speed=130146K/sec
      bitmap: 0/15 pages [0KB], 65536KB chunk

md2 : active raid1 sda1[1] sde1[0]
      3906884608 blocks super 1.2 [2/1] [U_]
      [=>...................]  recovery =  6.7% (263818176/3906884608) finish=429.0min speed=141512K/sec
      bitmap: 2/30 pages [8KB], 65536KB chunk

unused devices: <none>
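To keep an eye on the resync without reading the whole file, the recovery lines can be filtered out of /proc/mdstat. A small sketch; the function name recovery_status is hypothetical, and the sample input is copied from the output above.

```shell
#!/bin/sh
# recovery_status: read mdstat-formatted text on stdin and print
# "<array> <percent>" for every array that is currently recovering.
recovery_status() {
    awk '/^md[0-9]+ :/ { dev = $1 }
         /recovery =/  { for (i = 1; i <= NF; i++) if ($i ~ /%$/) print dev, $i }'
}

status=$(recovery_status <<'EOF'
md3 : active raid1 sdf1[1] sdg1[0]
      1953381376 blocks super 1.2 [2/1] [U_]
      [==>..................]  recovery = 10.0% (196493504/1953381376) finish=224.9min speed=130146K/sec
      bitmap: 0/15 pages [0KB], 65536KB chunk

md2 : active raid1 sda1[1] sde1[0]
      3906884608 blocks super 1.2 [2/1] [U_]
      [=>...................]  recovery =  6.7% (263818176/3906884608) finish=429.0min speed=141512K/sec
      bitmap: 2/30 pages [8KB], 65536KB chunk
EOF
)

echo "$status"
```

On the running system the input would simply be /proc/mdstat: recovery_status < /proc/mdstat.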

Martin L.

Posted 2019-10-16T15:40:27.077
