
I am running Debian 9 with mdadm RAID 1 on it. Previously, both hard drives could boot into the OS; now only one of them can.

I have recently had a faulty disk which needed to be replaced, so replace it I did.

First, I ran

mdadm --add /dev/md0 /dev/sda

This worked well. Right after that, I ran

grub-install /dev/sda

This gave me the following output:

Installing for i386-pc platform.
grub-install: warning: Couldn't find physical volume `(null)'. Some modules may be missing from core image..
grub-install: error: unable to identify a filesystem in hd0; safety check can't be performed.

Here is my output from lsblk:

NAME          MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda           8:0    0 447.1G  0 disk
└─md0         9:0    0 232.8G  0 raid1
  ├─md0p1   259:0    0  14.9G  0 md    [SWAP]
  ├─md0p2   259:1    0     1K  0 md
  ├─md0p3   259:2    0   216G  0 md    /
  └─md0p5   259:3    0   1.9G  0 md    /boot
sdb           8:16   0   5.5T  0 disk
├─sdb1        8:17   0   5.5T  0 part
└─sdb9        8:25   0     8M  0 part
sdc           8:32   0   5.5T  0 disk
├─sdc1        8:33   0   5.5T  0 part
└─sdc9        8:41   0     8M  0 part
sdd           8:48   0 232.9G  0 disk
└─sdd1        8:49   0 232.9G  0 part
  └─md0       9:0    0 232.8G  0 raid1
    ├─md0p1 259:0    0  14.9G  0 md    [SWAP]
    ├─md0p2 259:1    0     1K  0 md
    ├─md0p3 259:2    0   216G  0 md    /
    └─md0p5 259:3    0   1.9G  0 md    /boot

And here is the output from mdadm --detail /dev/md0:

/dev/md0:
        Version : 1.2
  Creation Time : Wed Dec 12 15:26:35 2018
     Raid Level : raid1
     Array Size : 244066304 (232.76 GiB 249.92 GB)
  Used Dev Size : 244066304 (232.76 GiB 249.92 GB)
   Raid Devices : 2
  Total Devices : 2
    Persistence : Superblock is persistent

  Intent Bitmap : Internal

    Update Time : Thu May 28 18:59:51 2020
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           Name : localserver:0  (local to host localserver)
           UUID : 83d5a386:66110e10:e5f3c600:734423a8
         Events : 5339803

    Number   Major   Minor   RaidDevice State
       2       8        0        0      active sync   /dev/sda
       1       8       49        1      active sync   /dev/sdd1

I have tried booting with just /dev/sda, but to no avail. I have also tried running blockdev --flushbufs /dev/sda, as recommended in some posts, but that did not help either.

When running GParted, I can see that /dev/sdd1 has the boot and raid flags set, while /dev/sda has none. I can also see there that /dev/sda's first sector starts at 0, while /dev/sdd1's starts at 2048.
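For reference, I believe roughly the same comparison can be made from the command line; the commands below are only illustrative, not taken from my actual session:

# Dump each disk's partition table so the layouts can be compared side by side
sfdisk -d /dev/sdd
sfdisk -d /dev/sda

# Show the start sector and flags of the partitions on the working disk
parted /dev/sdd unit s print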

Can anyone suggest a way I can solve this?

I don't mind detaching the "weird" hard-drive, formatting it, and reattaching it.


1 Answer


OK, so this is how I solved it. The fact that /dev/sdd has a partition called /dev/sdd1, whose first sector starts at 2048, was of great help.

This article on the Arch Wiki was of great help as well. The key is to have both drives partitioned in exactly the same way.

  1. Let's remove the disk we couldn't install GRUB on from the software RAID array:
mdadm --fail /dev/md0 /dev/sda
mdadm --remove /dev/md0 /dev/sda
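An optional sanity check before repartitioning, purely as an example: the removed disk should no longer be listed by either of these.

# Confirm the disk has actually left the array
cat /proc/mdstat
mdadm --detail /dev/md0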
  2. Here comes the magical bit. Using sfdisk, let's save the partitioning information of our working disk, and then repartition the problematic disk:
sfdisk -d /dev/sdd > raidinfo-partitions.sdd
sfdisk /dev/sda < raidinfo-partitions.sdd

Voila!
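If you want to double-check the copy before re-adding the disk, the two dumps should now describe the same layout (as far as I know, the sfdisk dump also carries the bootable flag, so /dev/sda1 should pick that up too). For example:

# Compare the partition tables of the working disk and the freshly repartitioned one
sfdisk -d /dev/sdd
sfdisk -d /dev/sda

# The new partition should now show up here as well
lsblk /dev/sda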

  3. Let's re-add the disk to our software RAID array:

mdadm --add /dev/md0 /dev/sda1
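The rebuild takes a while; you can follow the resync progress with something like:

# One-off status, or refresh it every few seconds
cat /proc/mdstat
watch -n 5 cat /proc/mdstat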

  4. Finally, when both drives are synced, let's install GRUB:

grub-install /dev/sda
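If everything went well, grub-install should finish with something like "Installation finished. No error reported." On Debian it may also be worth regenerating the GRUB configuration afterwards; this is just a precaution on my part, not something the steps above strictly require:

update-grub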
