ZFS on LUKS not recognized at boot

Question

I've got 6 physical drives in RAID-Z2, which I intend to one-by-one convert to dm-crypt devices.

My process was roughly:

1. dd if=/dev/zero of=/dev/sdf
2. Create keyfile /etc/crypttab.d/crypt-1.key
3. cryptsetup luksFormat /dev/sdf
4. Append crypt-1 <raw-disk-uuid> /etc/crypttab.d/crypt-1.key luks to /etc/crypttab
5. cryptsetup luksOpen /dev/sdf crypt-1
6. zpool replace my_pool <raw-disk-uuid> /dev/mapper/crypt-1

Once the resilvering was finished (which worked fine), I rebooted the machine to verify the setup before continuing to other disks. What I found, however, was that ZFS labeled crypt-1 as UNAVAIL.

ls /dev/mapper verified that dm-crypt had activated the LUKS container correctly. Running zpool online my_pool crypt-1 causes ZFS to begin resilvering, which completes in a matter of seconds, after which the pool resumes healthy operation.

I'm guessing the dm-crypt device is simply not loaded when ZFS first tries to access my_pool? Is it a question of load order, or do I need to be using a different identifier for the LUKS device in /etc/crypttab? How do I ensure that ZFS sees these LUKS devices on reboot?

This is a systemd box (Arch) if that matters.

Thanks!

EDIT 1:

During cryptsetup creation, I used the kernel device names (e.g. /dev/sdf) to initialize the device with LUKS. However, in /etc/crypttab I'm specifying the devices through the UUID of the underlying physical disk. Is the cryptsetup utility sensitive to how you identify targets? In other words, do I need to re-do my cryptsetup and pass it the disk UUID instead of the device name?

EDIT 2:

I see the following ls -alsvh /dev/disk/by-id:

0 lrwxrwxrwx 1 root root  10 Jul  8 08:18 dm-uuid-CRYPT-LUKS1-6bed03ceaafe4539a375536d11309ff0-locker-1 -> ../../dm-0

From what I know, if it's in /dev/disk/by-id it is - by definition? - not subject to change (even across reboots). I will replace the definition of the dm-crypt-name locker-1 in my zpool with the id-name /dev/disk/by-id/dm-uuid-CRYPT-LUKS1-6bed03ceaafe4539a375536d11309ff0-locker-1 and report back. Same drive, same LUKS container, just a different way of addressing it.

EDIT 3:

My proposal from edit #2 above did not work. I had to wipe the drive and re-run cryptsetup on the device because ZFS would not allow me to replace the device with itself. After resilvering was complete I rebooted, and zpool status shows the pool as DEGRADED and the device dm-uuid-CRYPT-LUKS1-71e12fa7dc034d919e800ba89aec3b17-locker-1 as UNAVAIL.

It's worth noting that locker-1 does appear in ls /dev/disk/by-id as well as lsblk, so it is being loaded correctly. I can verify this by running:

zpool online inground dm-uuid-CRYPT-LUKS1-71e12fa7dc034d919e800ba89aec3b17-locker-1

Which exits cleanly, and brings the device back into the pool.

Perhaps this is due to the load order of different modules during boot? Maybe the activation of dm-crypt devices is done such that ZFS begins importing pools before the LUKS container is properly open?

Agreed, and I've been using UUIDs in all of my ZFS configurations - are `/dev/mapper` entries not considered WWNs? If not, how can I retrieve a UUID or WWN for the dm-crypt devices in `/dev/mapper`? — Chris Tonkinson, Jul 05 '14 at 17:09
I've just updated step #6 - when I ran the command I fully qualified the path to `/dev/mapper/crypt-1` but neglected to reflect that in my example above. — Chris Tonkinson, Jul 05 '14 at 17:11

score 0 · Answer 1 · answered Aug 29 '14 at 09:59

You could try to export your pool, then symlink the device nodes of all constituent devices into e.g. /dev/vdevs, and run

zpool import -d /dev/vdevs poolname

If the vdev is found this way, then as a workaround you can make the symlinking occur before the zpool import in the boot process (either via udev or via a script).
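
The script variant could look roughly like the sketch below. It is untested against a real pool; the function name link_vdevs and the device paths in the usage comment are placeholders I made up, so substitute your own. It only creates the symlinks and refuses to continue if a device node is missing (for example a LUKS mapping that has not been opened yet); the zpool import itself still has to be run afterwards.

```shell
#!/bin/sh
# Sketch only: collect the pool's member devices as symlinks in one directory
# so that "zpool import -d DIR" scans just that directory.
# link_vdevs is a hypothetical helper name, not part of ZFS or cryptsetup.
#
# Usage: link_vdevs DIR DEVICE...
link_vdevs() {
    dir=$1
    shift
    mkdir -p "$dir" || return 1
    for dev in "$@"; do
        # Fail early if a device node does not exist yet,
        # e.g. because the LUKS container has not been opened.
        if [ ! -e "$dev" ]; then
            echo "missing: $dev" >&2
            return 1
        fi
        # -f replaces a stale link, -n avoids descending into an existing link.
        ln -sfn "$dev" "$dir/$(basename "$dev")" || return 1
    done
}

# Example (assumed paths, run as root with the pool exported):
#   link_vdevs /dev/vdevs /dev/mapper/crypt-1 /dev/sdb /dev/sdc
#   zpool import -d /dev/vdevs poolname
```

Because the function returns non-zero when a device is absent, a boot script can use that as the signal to wait and retry instead of importing a degraded pool.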
