
I created a raidz1 pool with three devices. Two were added by their /dev/disk/by-id names, and somehow I decided to use /dev/sdg1 for the third one.
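
Reconstructed from memory, so the exact command is an assumption on my part, the creation went roughly like this, mixing persistent and non-persistent names:

# zpool create safe00 raidz1 \
      /dev/disk/by-id/ata-ST3500418AS_9VM89VGD \
      /dev/sdg1 \
      /dev/disk/by-id/ata-WDC_WD40EFRX-68WT0N0_WD-WCC4E1NYTHJF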

After a reboot years later, I can't get all three devices online again. Here's the current status:

# zpool status safe00
  pool: safe00
 state: DEGRADED
status: One or more devices has been taken offline by the administrator.
    Sufficient replicas exist for the pool to continue functioning in a
    degraded state.
action: Online the device using 'zpool online' or replace the device with
    'zpool replace'.
  scan: scrub repaired 0 in 2h54m with 0 errors on Sun Jan 12 03:18:13 2020
config:

    NAME                                          STATE     READ WRITE CKSUM
    safe00                                        DEGRADED     0     0     0
      raidz1-0                                    DEGRADED     0     0     0
        ata-ST3500418AS_9VM89VGD                  ONLINE       0     0     0
        13759036004139463181                      OFFLINE      0     0     0  was /dev/sdg1
        ata-WDC_WD40EFRX-68WT0N0_WD-WCC4E1NYTHJF  ONLINE       0     0     0

errors: No known data errors

The drives in this machine are:

# lsblk -f 
NAME   FSTYPE     LABEL      UUID                                 MOUNTPOINT
sda                                                               
├─sda1 ext4       Ubuntu LTS 8a2a3c19-580a-474d-b248-bf0822cacab6 /
├─sda2 vfat                  B55A-693E                            /boot/efi
└─sda3 swap       swap       7d1cf001-07a6-4534-9624-054d70a562d5 [SWAP]
sdb    zfs_member dump       11482263899067190471                 
├─sdb1 zfs_member dump       866164895581740988                   
└─sdb9 zfs_member dump       11482263899067190471                 
sdc                                                               
sdd                                                               
├─sdd1 zfs_member dump       866164895581740988                   
└─sdd9                                                            
sde    zfs_member dump       866164895581740988                   
├─sde1 zfs_member safe00     6143939454380723991                  
└─sde2 zfs_member dump       866164895581740988                   
sdf                                                               
├─sdf1 zfs_member dump       866164895581740988                   
└─sdf9                                                            
sdg                                                               
├─sdg1 zfs_member safe00     6143939454380723991                  
└─sdg9                                                            
sdh                                                               
├─sdh1 zfs_member safe00     6143939454380723991                  
└─sdh9   

Which is to say, safe00 should contain the three partitions sde1, sdg1 and sdh1.

And just to get the mapping between device paths and by-id names:

# cd /dev/disk/by-id
# ls -la ata* | cut -b 40- | awk '{split($0, a, " "); print a[3],a[2],a[1]}' | sort -h
../../sda1 -> ata-INTEL_SSDSC2KW120H6_BTLT712507HK120GGN-part1
../../sda2 -> ata-INTEL_SSDSC2KW120H6_BTLT712507HK120GGN-part2
../../sda3 -> ata-INTEL_SSDSC2KW120H6_BTLT712507HK120GGN-part3
../../sda -> ata-INTEL_SSDSC2KW120H6_BTLT712507HK120GGN
../../sdb1 -> ata-WDC_WD20EARX-00PASB0_WD-WCAZAE573068-part1
../../sdb9 -> ata-WDC_WD20EARX-00PASB0_WD-WCAZAE573068-part9
../../sdb -> ata-WDC_WD20EARX-00PASB0_WD-WCAZAE573068
../../sdc -> ata-SAMSUNG_HD204UI_S2H7JD1ZA21911
../../sdd1 -> ata-WDC_WD40EFRX-68WT0N0_WD-WCC4E0416553-part1
../../sdd9 -> ata-WDC_WD40EFRX-68WT0N0_WD-WCC4E0416553-part9
../../sdd -> ata-WDC_WD40EFRX-68WT0N0_WD-WCC4E0416553
../../sde1 -> ata-ST6000VN0033-2EE110_ZAD5S9M9-part1
../../sde2 -> ata-ST6000VN0033-2EE110_ZAD5S9M9-part2
../../sde -> ata-ST6000VN0033-2EE110_ZAD5S9M9
../../sdf1 -> ata-WDC_WD10EADS-00L5B1_WD-WCAU4C151323-part1
../../sdf9 -> ata-WDC_WD10EADS-00L5B1_WD-WCAU4C151323-part9
../../sdf -> ata-WDC_WD10EADS-00L5B1_WD-WCAU4C151323
../../sdg1 -> ata-ST3500418AS_9VM89VGD-part1
../../sdg9 -> ata-ST3500418AS_9VM89VGD-part9
../../sdg -> ata-ST3500418AS_9VM89VGD
../../sdh1 -> ata-WDC_WD40EFRX-68WT0N0_WD-WCC4E1NYTHJF-part1
../../sdh9 -> ata-WDC_WD40EFRX-68WT0N0_WD-WCC4E1NYTHJF-part9
../../sdh -> ata-WDC_WD40EFRX-68WT0N0_WD-WCC4E1NYTHJF
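
(In hindsight, lsblk can print much the same mapping directly; noting the simpler command here as an aside:)

# lsblk -o NAME,MODEL,SERIAL,SIZE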

And the zdb output (with minor annotations by me):

# zdb -C safe00

MOS Configuration:
        version: 5000
        name: 'safe00'
        state: 0
        txg: 22826770
        pool_guid: 6143939454380723991
        errata: 0
        hostname: 'filserver'
        vdev_children: 1
        vdev_tree:
            type: 'root'
            id: 0
            guid: 6143939454380723991
            children[0]:
                type: 'raidz'
                id: 0
                guid: 9801294574244764778
                nparity: 1
                metaslab_array: 33
                metaslab_shift: 33
                ashift: 12
                asize: 1500281044992
                is_log: 0
                create_txg: 4
                children[0]:
                    type: 'disk'
                    id: 0
                    guid: 135921832921042063
                    path: '/dev/disk/by-id/ata-ST3500418AS_9VM89VGD-part1'
                    whole_disk: 1
                    DTL: 58
                    create_txg: 4
                children[1]:         ### THIS CHILD USED TO BE sdg1
                    type: 'disk'
                    id: 1
                    guid: 13759036004139463181
                    path: '/dev/sdg1'
                    whole_disk: 0
                    not_present: 1   ### THIS IS sde1 NOW
                    DTL: 52
                    create_txg: 4
                    offline: 1
                children[2]:         ### THIS CHILD IS NOW sdg1
                    type: 'disk'
                    id: 2
                    guid: 2522190573401341943
                    path: '/dev/disk/by-id/ata-WDC_WD40EFRX-68WT0N0_WD-WCC4E1NYTHJF-part1'
                    whole_disk: 1
                    DTL: 57
                    create_txg: 4
        features_for_read:
            com.delphix:hole_birth
            com.delphix:embedded_data
space map refcount mismatch: expected 178 != actual 177

Summary for the pool safe00:

offline: sde1 --> ata-ST6000VN0033-2EE110_ZAD5S9M9-part1  <-- this likely was sdg1 before reboot
online:  sdg1 --> ata-ST3500418AS_9VM89VGD
online:  sdh1 --> ata-WDC_WD40EFRX-68WT0N0_WD-WCC4E1NYTHJF

Trying to online the device that's offline:

# zpool online safe00 ata-ST6000VN0033-2EE110_ZAD5S9M9-part1
cannot online ata-ST6000VN0033-2EE110_ZAD5S9M9-part1: no such device in pool
# zpool online safe00 /dev/sde1
cannot online /dev/sde1: no such device in pool
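
As a sanity check, the on-disk vdev label, which carries the guid and path fields that zdb -C reports, can be read straight off the partition. I'm noting the command here without its output; it should confirm that sde1 is the member with GUID 13759036004139463181:

# zdb -l /dev/sde1 | grep -E 'guid|path'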

I also tried to replace the offline device with the real one:

# zpool replace safe00 13759036004139463181 ata-ST6000VN0033-2EE110_ZAD5S9M9-part1
invalid vdev specification
use '-f' to override the following errors:
/dev/disk/by-id/ata-ST6000VN0033-2EE110_ZAD5S9M9-part1 is part of active pool 'safe00'
# zpool replace safe00 /dev/sdg1 ata-ST6000VN0033-2EE110_ZAD5S9M9-part1
invalid vdev specification
use '-f' to override the following errors:
/dev/disk/by-id/ata-ST6000VN0033-2EE110_ZAD5S9M9-part1 is part of active pool 'safe00'

So, finally, I tried to online the missing device using its GUID:

# zpool online safe00 13759036004139463181
warning: device '13759036004139463181' onlined, but remains in faulted state
use 'zpool replace' to replace devices that are no longer present

This happily put the disk in the FAULTED state, and a scrub was started.

# zpool status safe00
  pool: safe00
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
    invalid.  Sufficient replicas exist for the pool to continue
    functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://zfsonlinux.org/msg/ZFS-8000-4J
  scan: scrub in progress since Sun Feb 23 11:19:00 2020
    14.3G scanned out of 1.09T at 104M/s, 3h0m to go
    0 repaired, 1.29% done
config:

    NAME                                          STATE     READ WRITE CKSUM
    safe00                                        DEGRADED     0     0     0
      raidz1-0                                    DEGRADED     0     0     0
        ata-ST3500418AS_9VM89VGD                  ONLINE       0     0     0
        13759036004139463181                      FAULTED      0     0     0  was /dev/sdg1
        ata-WDC_WD40EFRX-68WT0N0_WD-WCC4E1NYTHJF  ONLINE       0     0     0

errors: No known data errors

What should I do to prevent this from happening again? How do I change the device's "path" property, as shown by zdb, so that it doesn't rely on Linux's enumeration of disks at boot?

1 Answer

The most reliable method is to create pools using GUIDs or GPT labels, and personally I think GPT labels are the better solution, as mentioned in one of the posts under Best practice for specifying disks (vdevs) for ZFS pools in 2021. A label following that advice looks like this:

data-1-sces3-3tb-Z1Y0P0DK

<pool>-<pool-id>-<disk-vendor-and-model-name>-<size-of-disk>-<disk-serial-number>

Naming disks this way will help you (a concrete sketch follows this list):

  1. Easily understand the topology of the defined pools.
  2. Easily find the vendor and model name of the drives used.
  3. Easily find the disk capacities.
  4. Easily identify and locate a bad disk in the drive cage, since the serial number printed on the drive is part of the GPT label.
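
For illustration only, here is what that could look like for the drives in this question. The label strings are my own made-up examples following the scheme above, I assume the data partition is partition 1 on each disk, and note that zpool create builds a new pool, so this is for setting up from scratch rather than fixing the existing one:

# sgdisk --change-name=1:safe00-1-st6000vn-6tb-ZAD5S9M9 /dev/sde
# sgdisk --change-name=1:safe00-1-st3500418as-500gb-9VM89VGD /dev/sdg
# sgdisk --change-name=1:safe00-1-wd40efrx-4tb-WCC4E1NYTHJF /dev/sdh
# zpool create safe00 raidz1 \
      /dev/disk/by-partlabel/safe00-1-st6000vn-6tb-ZAD5S9M9 \
      /dev/disk/by-partlabel/safe00-1-st3500418as-500gb-9VM89VGD \
      /dev/disk/by-partlabel/safe00-1-wd40efrx-4tb-WCC4E1NYTHJF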

Other persistent methods of identifying disks exist, such as various hardware IDs, but they are not intuitive on their own: you can't easily find a disk based on its electronic ID alone, you have to map the ID to its physical location yourself.

I also found a post that might help if you want to remap the disks in the pool, Mixed gptid and dev names in zpool status:

# zpool import -d /dev/gptid tank
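
That /dev/gptid directory is a FreeBSD path. On Linux, the equivalent approach, which should also answer the "path" question above, is to export the pool and re-import it from a stable device directory; ZFS updates each vdev's recorded path on import:

# zpool export safe00
# zpool import -d /dev/disk/by-id safe00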