I have a Debian server with MD RAID 1 arrays (two active devices and one spare each):

Personalities : [raid1] 
md1 : active raid1 sdc2[0] sdb2[1] sda2[2](S)
      1068224 blocks [2/2] [UU]

md0 : active raid1 sdc1[2](S) sdb1[1] sda1[0]
      487315584 blocks [2/2] [UU]
      bitmap: 5/233 pages [20KB], 1024KB chunk

unused devices: <none>

Whenever I boot this server, the array becomes degraded and starts syncing the spare disk. The cause seems to be an attached USB disk, currently /dev/sdd: the server boots fine when this disk is not present. /dev/sdd1, its only partition, has no md superblock, and its partition type is Linux, not RAID autodetect.
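
One way to double-check that /dev/sdd1 really carries no md superblock is to ask mdadm and blkid directly. This is only a sketch: `has_md_superblock` is a hypothetical helper name, the device path is taken from the description above, and it assumes that `mdadm --examine` exits non-zero when it finds no superblock.

```shell
# Hypothetical helper: succeeds if the given block device carries an md superblock.
# Assumption: `mdadm --examine` exits non-zero when no superblock is detected.
has_md_superblock() {
    mdadm --examine "$1" >/dev/null 2>&1
}

# Assumption: the USB partition is /dev/sdd1, as described above. Run as root.
dev=/dev/sdd1
if [ -b "$dev" ]; then
    if has_md_superblock "$dev"; then
        echo "$dev has an md superblock"
    else
        echo "$dev has no md superblock"
    fi
    # blkid reports TYPE="linux_raid_member" for md member devices.
    blkid "$dev" || true
fi
```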

This is the mirror device detail for md0:

mdadm --detail /dev/md0
/dev/md0:
        Version : 0.90
  Creation Time : Sun Jun  8 04:10:39 2008
     Raid Level : raid1
     Array Size : 487315584 (464.74 GiB 499.01 GB)
  Used Dev Size : 487315584 (464.74 GiB 499.01 GB)
   Raid Devices : 2
  Total Devices : 3
Preferred Minor : 0
    Persistence : Superblock is persistent

  Intent Bitmap : Internal

    Update Time : Tue Sep 15 09:23:33 2015
          State : active 
 Active Devices : 2
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 1

           UUID : 9e408fbb:563a5459:f999b789:24d3b44e
         Events : 0.83145

    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       8       17        1      active sync   /dev/sdb1

       2       8       33        -      spare   /dev/sdc1

The details of /dev/sdc1 do show that it is a spare:

mdadm --examine /dev/sdc1
/dev/sdc1:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 9e408fbb:563a5459:f999b789:24d3b44e
  Creation Time : Sun Jun  8 04:10:39 2008
     Raid Level : raid1
  Used Dev Size : 487315584 (464.74 GiB 499.01 GB)
     Array Size : 487315584 (464.74 GiB 499.01 GB)
   Raid Devices : 2
  Total Devices : 3
Preferred Minor : 0

    Update Time : Sat Sep 12 21:09:59 2015
          State : clean
Internal Bitmap : present
 Active Devices : 2
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 1
       Checksum : 7761bb13 - correct
         Events : 83145


      Number   Major   Minor   RaidDevice State
this     2       8       33        2      spare   /dev/sdc1

   0     0       8        1        0      active sync   /dev/sda1
   1     1       8       17        1      active sync   /dev/sdb1
   2     2       8       33        2      spare   /dev/sdc1

Really nothing out of the ordinary.

Any idea?

Edit:

The relevant content of /etc/mdadm/mdadm.conf:

ARRAY /dev/md0 level=raid1 num-devices=2 UUID=9e408fbb:563a5459:f999b789:24d3b44e
   spares=1
ARRAY /dev/md1 level=raid1 num-devices=2 UUID=e4578e57:9e0fd9e9:c7736f30:0e251564
   spares=1

This sort of matches the output of mdadm --detail --scan:

ARRAY /dev/md0 metadata=0.90 spares=1 UUID=9e408fbb:563a5459:f999b789:24d3b44e
ARRAY /dev/md1 metadata=0.90 spares=1 UUID=e4578e57:9e0fd9e9:c7736f30:0e251564

Is it perhaps the newline?

  • Kernel 3.2.0-4-686-pae.
  • Debian 7.8
  • mdadm - v3.2.5 - 18th May 2012
A.L
Halfgaar
  • Can you share your /etc/mdadm/mdadm.conf? I suspect you have used native device names there, not UUIDs. – asdmin Sep 15 '15 at 07:36
  • "Any line that starts with white space (space or tab) is treated as though it were a continuation of the previous line." So it's not the newline. – asdmin Sep 15 '15 at 08:07

1 Answer

I think you might have an outdated mdadm.conf in your initramfs, and/or mdadm gets confused during the discovery and initialization of the arrays.
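
To test the first suspicion, the copy of mdadm.conf inside the initramfs can be compared with the one in /etc. The following is a rough sketch, not a definitive procedure: `initramfs_mdadm_conf` is a hypothetical helper, and it assumes GNU cpio, an image that is a plain gzip-compressed cpio archive (as is typical for Debian 7), and that the file is stored as etc/mdadm/mdadm.conf inside it.

```shell
# Hypothetical helper: print the mdadm.conf stored inside an initramfs image.
# Assumptions: plain gzip+cpio image, GNU cpio, file stored as etc/mdadm/mdadm.conf.
initramfs_mdadm_conf() {
    zcat "$1" | cpio --quiet -i --to-stdout 'etc/mdadm/mdadm.conf'
}

image="/boot/initrd.img-$(uname -r)"
if [ -r "$image" ] && [ -r /etc/mdadm/mdadm.conf ]; then
    copy=$(mktemp)
    initramfs_mdadm_conf "$image" > "$copy" \
        || echo "could not read $image as gzip+cpio" >&2
    # No output from diff means the two copies are identical.
    diff -u "$copy" /etc/mdadm/mdadm.conf || true
    rm -f "$copy"
fi
```

If the path inside the image differs, `lsinitramfs` from initramfs-tools can be used to list the image contents and find it.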

Try telling mdadm to consider only disks on the PCI bus, by adding the following line in mdadm.conf:

DEVICE /dev/disk/by-path/pci*
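
Before relying on a pattern like this, it may be worth checking which devices it actually matches. USB disks attached through a PCI host controller usually also get pci-* names under /dev/disk/by-path (with "usb" somewhere in the name), so the pattern may not exclude them. A small sketch, where `list_matches` is a hypothetical helper and the directories are the standard udev ones:

```shell
# Hypothetical helper: list_matches DIR PATTERN
# Prints every symlink in DIR whose name matches the shell glob PATTERN,
# together with the device it points to.
list_matches() {
    dir=$1
    pattern=$2
    # $pattern is left unquoted on purpose so that the shell expands the glob.
    for link in "$dir"/$pattern; do
        [ -e "$link" ] || [ -L "$link" ] || continue
        printf '%s -> %s\n' "${link##*/}" "$(readlink "$link")"
    done
}

# What would "DEVICE /dev/disk/by-path/pci*" pick up on this machine?
list_matches /dev/disk/by-path 'pci*'
# An alternative pattern that should only match ATA/SATA disks:
list_matches /dev/disk/by-id 'ata-*'
```

Any line of the first listing that contains "usb" points at a device that the DEVICE pattern would still hand to mdadm.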

Going a step further, you can directly specify the disks themselves. Make sure that you use order-independent addressing (for example, by UUID), and that you put the spare as the last entry:

DEVICE /dev/disk/by-uuid/<uuid1>
DEVICE /dev/disk/by-uuid/<uuid2>
DEVICE /dev/disk/by-uuid/<uuid3>

To go even further, after the previous step you can also add a devices= attribute to each ARRAY line, giving mdadm the exact layout of your RAID. Mind the order of the devices here too.
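
As an illustration only, an ARRAY entry with devices= might look like the fragment below. The UUID is the one from the question; the `<id-of-disk-N>` parts are placeholders for real names under /dev/disk/by-id/, and the sketch assumes the DEVICE lines use the same naming scheme, since devices= is matched against the device names mdadm scans.

```
# Sketch: <id-of-disk-N> are placeholders, not real device names.
ARRAY /dev/md0 UUID=9e408fbb:563a5459:f999b789:24d3b44e
   devices=/dev/disk/by-id/<id-of-disk-1>-part1,/dev/disk/by-id/<id-of-disk-2>-part1,/dev/disk/by-id/<id-of-disk-3>-part1
```

The indented second line is a continuation of the ARRAY line, as in the existing configuration.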

After this, don't forget to update the initramfs, as mdadm.conf is also part of the boot initialization process:

sudo update-initramfs -k all -u
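
After the next reboot, a quick look at /proc/mdstat shows whether the arrays still come up degraded. The following sketch uses a hypothetical helper, `degraded_arrays`, which assumes the usual /proc/mdstat layout in which the member status field (for example [UU] or [U_]) is the last field on the "blocks" line.

```shell
# Hypothetical helper: read /proc/mdstat-style text on stdin and print the name
# of every array whose member status field contains "_", which marks a missing
# or failed member (e.g. [U_] instead of [UU]).
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }
        /\[[U_]+\]/ && name != "" {
            if (index($NF, "_") > 0) print name
            name = ""
        }
    '
}

if [ -r /proc/mdstat ]; then
    degraded=$(degraded_arrays < /proc/mdstat)
    if [ -n "$degraded" ]; then
        echo "degraded arrays: $degraded"
    else
        echo "no degraded arrays"
    fi
fi
```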
asdmin
  • I ended up putting `DEVICE /dev/disk/by-id/ata-*` in it; the raw partitions are not named by uuid, and the usb disk is present in the PCI listing. I hope the order doesn't matter, because the alphabetical order is wrong. Besides, when the spare is activated at some point, it should keep working, so I assume the order is not relevant. And, it works with any of the disks as spare when I boot the server without the usb disk, so I think this will work. I will let you know when I reboot the server. – Halfgaar Sep 15 '15 at 08:28
  • I am happy if I could help. Thanks for the interesting question! :) – asdmin Sep 15 '15 at 08:31
  • Finally rebooted the machine (not by choice; power outage...). The problem was fixed. – Halfgaar Oct 26 '15 at 15:31