I've got 6 physical drives in RAID-Z2, which I intend to convert to dm-crypt devices one at a time.
My process was roughly:
- dd if=/dev/zero of=/dev/sdf
- Create keyfile /etc/crypttab.d/crypt-1.key
- cryptsetup luksFormat /dev/sdf
- Append crypt-1 <raw-disk-uuid> /etc/crypttab.d/crypt-1.key luks to /etc/crypttab
- cryptsetup luksOpen /dev/sdf crypt-1
- zpool replace my_pool <raw-disk-uuid> /dev/mapper/crypt-1
Once the resilvering was finished (which worked fine), I rebooted the machine to verify the setup before continuing to other disks. What I found, however, was that ZFS labeled crypt-1 as UNAVAIL.
ls /dev/mapper verified that dm-crypt activated the LUKS container correctly. Running zpool online my_pool crypt-1 causes ZFS to begin resilvering, which completes within a matter of seconds, after which the pool resumes healthy operation.
I'm guessing the dm-crypt device is simply not loaded when ZFS first tries to access my_pool? Is it a question of load order, or do I need to be using a different identifier for the LUKS device in /etc/crypttab? How do I ensure that ZFS sees these LUKS devices on reboot?
This is a systemd box (Arch) if that matters.
Thanks!
EDIT 1:
During cryptsetup creation, I used the kernel device name (e.g. /dev/sdf) to initialize the device with LUKS. However, in /etc/crypttab I'm specifying the device by the UUID of the underlying physical disk. Is the cryptsetup utility sensitive to how you identify targets? In other words, do I need to redo my cryptsetup and pass it the disk UUID instead of the device name?
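To make the identifier question concrete, here is a minimal sketch of how a crypttab entry could be keyed on the LUKS header UUID (as printed by cryptsetup luksUUID) rather than on a device name. The helper name crypttab_line is hypothetical and only there for illustration; I don't know whether switching identifiers has any bearing on the UNAVAIL problem.

```shell
#!/bin/sh
# Hypothetical helper (not part of cryptsetup): print one crypttab entry
# that identifies the container by its LUKS header UUID.
# Arguments: <mapping name> <LUKS UUID> <key file path>
crypttab_line() {
    printf '%s UUID=%s %s luks\n' "$1" "$2" "$3"
}

# Usage on the real box (needs root; /dev/sdf is the disk from my steps):
#   uuid=$(cryptsetup luksUUID /dev/sdf)
#   crypttab_line crypt-1 "$uuid" /etc/crypttab.d/crypt-1.key >> /etc/crypttab
```

The UUID= form means the entry does not depend on whether the disk shows up as /dev/sdf on the next boot.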
EDIT 2:
I see the following in the output of ls -alsvh /dev/disk/by-id:
0 lrwxrwxrwx 1 root root 10 Jul 8 08:18 dm-uuid-CRYPT-LUKS1-6bed03ceaafe4539a375536d11309ff0-locker-1 -> ../../dm-0
From what I know, if it's in /dev/disk/by-id it is - by definition? - not subject to change (even across reboots). I will replace the definition of the dm-crypt-name locker-1 in my zpool with the id-name /dev/disk/by-id/dm-uuid-CRYPT-LUKS1-6bed03ceaafe4539a375536d11309ff0-locker-1 and report back. Same drive, same LUKS container, just a different way of addressing it.
EDIT 3:
My proposal from edit #2 above did not work. I had to wipe the drive and re-run cryptsetup on it because ZFS would not allow me to replace the device with itself. After resilvering was complete I rebooted, and zpool status shows the pool as DEGRADED and the device dm-uuid-CRYPT-LUKS1-71e12fa7dc034d919e800ba89aec3b17-locker-1 as UNAVAIL.
It's worth noting that locker-1 does appear in ls /dev/disk/by-id as well as lsblk, so it is being loaded correctly. I can verify this by running:
zpool online inground dm-uuid-CRYPT-LUKS1-71e12fa7dc034d919e800ba89aec3b17-locker-1
Which exits cleanly, and brings the device back into the pool.
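For now, the manual workaround after each boot could be scripted roughly like this. It is only a sketch of what I do by hand, not a fix: unavail_vdevs is a hypothetical helper, and it assumes the usual zpool status layout where the device name is the first column and the state is the second.

```shell
#!/bin/sh
# Hypothetical helper: read `zpool status` output on stdin and print the
# name of every vdev whose STATE column reads UNAVAIL.
# The "state:" summary line is skipped so that only config rows match.
unavail_vdevs() {
    awk '$2 == "UNAVAIL" && $1 != "state:" { print $1 }'
}

# Usage on the real box (needs root; "inground" is my pool name):
#   zpool status inground | unavail_vdevs | while read -r dev; do
#       zpool online inground "$dev"
#   done
```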
Perhaps this is due to the load order of different modules during boot? Maybe the activation of dm-crypt devices is done such that ZFS begins importing pools before the LUKS container is properly open?
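One way I could test this guess is to compare boot-time timestamps in the journal. The sketch below assumes the journalctl -o short-monotonic layout, where each line starts with seconds since boot in square brackets. first_ts is a hypothetical helper, and the search patterns in the usage comments are assumptions about what the cryptsetup and ZFS import messages look like on my system.

```shell
#!/bin/sh
# Hypothetical helper: read journal lines on stdin and print the monotonic
# timestamp (seconds since boot) of the first line matching the pattern.
# Expects lines shaped like: "[    5.123456] host unit[pid]: message"
first_ts() {
    awk -v pat="$1" '
        $0 ~ pat {
            # The first decimal number on the line is the bracketed timestamp.
            if (match($0, /[0-9]+\.[0-9]+/)) {
                print substr($0, RSTART, RLENGTH)
                exit
            }
        }'
}

# Usage on the real box (patterns are guesses, adjust to the actual log):
#   journalctl -b -o short-monotonic > /tmp/boot.log
#   first_ts 'systemd-cryptsetup' < /tmp/boot.log
#   first_ts 'Import ZFS'         < /tmp/boot.log
```

If the ZFS import timestamp comes out smaller than the cryptsetup one, that would support the idea that the pool is imported before the LUKS container is open.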