
My system was configured and running well until a drive in my ZFS (raidz2) array failed. I swapped that drive, but the new one would not join the pool. After I rebooted, the system would not boot unless one of the array drives was disconnected (I think it is the new one, but I am not sure which).

I have since booted (by disconnecting the drives, then reconnecting them early in the boot process), replaced the ZFS drive successfully, and now have a working system. However, I still need to fix the boot issue.

Looking at fstab, the UUIDs appear to be correct, so I can't see what the hang-up is.

UUID=bbc69fc6-12fa-499a-a0c6-e0f65e248ce2 /                       xfs     defaults        0 0
UUID=226e836d-7b8e-424c-b0a0-0397ee458c7c /boot                   xfs     defaults        0 0
UUID=60c94586-7d6a-4e8a-b350-04719990cb69 /home                   xfs     defaults        0 0
UUID=4d91f3bb-8c97-43c8-acea-fb1dd1fe0ed7 swap                    swap    defaults        0 0

Here is the blkid output:

/dev/sda1: LABEL="san" UUID="6838649739541725191" UUID_SUB="4029408817980194900" TYPE="zfs_member" PARTLABEL="zfs-288cf7ef18c79daa" PARTUUID="ec08031c-df8f-cd4b-9e38-010b5e967cab"
/dev/sdb1: LABEL="System Reserved" UUID="A2885ECD885EA019" TYPE="ntfs"
/dev/sdb2: UUID="5E2E62DB2E62ABA9" TYPE="ntfs"
/dev/sdb3: UUID="226e836d-7b8e-424c-b0a0-0397ee458c7c" TYPE="xfs"
/dev/sdb5: UUID="60c94586-7d6a-4e8a-b350-04719990cb69" TYPE="xfs"
/dev/sdb6: UUID="4d91f3bb-8c97-43c8-acea-fb1dd1fe0ed7" TYPE="swap"
/dev/sdb7: UUID="bbc69fc6-12fa-499a-a0c6-e0f65e248ce2" TYPE="xfs"
/dev/sdc1: LABEL="san" UUID="6838649739541725191" UUID_SUB="13087102930353693443" TYPE="zfs_member" PARTLABEL="zfs-1e90ee20c4627577" PARTUUID="00e53f8e-9545-844d-9a0e-6c8746643114"
/dev/sdd1: LABEL="san" UUID="6838649739541725191" UUID_SUB="2133500285998926230" TYPE="zfs_member" PARTLABEL="zfs-19ae99cec015d0db" PARTUUID="440f2613-f23b-3c4e-bd90-ce2ef28f3e9f"
/dev/sde1: LABEL="san" UUID="6838649739541725191" UUID_SUB="7987608574075307207" TYPE="zfs_member" PARTLABEL="zfs-8427c3bf89616cda" PARTUUID="6792f785-4803-1643-888b-a98fd6f6743e"
/dev/sdf1: LABEL="san" UUID="6838649739541725191" UUID_SUB="676738182062217510" TYPE="zfs_member" PARTLABEL="zfs-061b31fabbe106cb" PARTUUID="1f50712e-0c01-d445-9ad7-381d08307c2b"
/dev/sdg1: LABEL="san" UUID="6838649739541725191" UUID_SUB="10361692541083745258" TYPE="zfs_member" PARTLABEL="zfs-5d020760c598b14c" PARTUUID="eaae6308-64b3-004d-a7c8-be4e55c8c859"
/dev/sda9: PARTUUID="4aa5c270-b2c6-4342-aea0-5ae7f4a1eba4"
/dev/sdc9: PARTUUID="c06d2bcf-5c87-f24c-8782-aed395d053d7"
/dev/sdd9: PARTUUID="ec587856-71ad-5d42-9ad0-8251ee74f151"
/dev/sde9: PARTUUID="80203adf-4e65-5e42-8e9b-2a6ccf0eafca"
/dev/sdf9: PARTUUID="ea6c550c-f1a7-4a48-bf51-72c4ba44ab00"
/dev/sdg9: PARTUUID="b0e178b5-12ec-ac44-a5a8-1a05228e2015"
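
The fstab check described above can also be done mechanically. The following is only a sketch: the function names `parse_blkid` and `check_fstab` are made up for illustration, and the sample data is a subset of the fstab and blkid output pasted above. It resolves each `UUID=` entry in fstab to the device node that blkid reports for it.

```python
import re

# Sample data copied from the fstab and blkid output in this post.
FSTAB = """\
UUID=bbc69fc6-12fa-499a-a0c6-e0f65e248ce2 /      xfs  defaults 0 0
UUID=226e836d-7b8e-424c-b0a0-0397ee458c7c /boot  xfs  defaults 0 0
UUID=60c94586-7d6a-4e8a-b350-04719990cb69 /home  xfs  defaults 0 0
UUID=4d91f3bb-8c97-43c8-acea-fb1dd1fe0ed7 swap   swap defaults 0 0
"""

BLKID = """\
/dev/sda1: LABEL="san" UUID="6838649739541725191" UUID_SUB="4029408817980194900" TYPE="zfs_member"
/dev/sdb3: UUID="226e836d-7b8e-424c-b0a0-0397ee458c7c" TYPE="xfs"
/dev/sdb5: UUID="60c94586-7d6a-4e8a-b350-04719990cb69" TYPE="xfs"
/dev/sdb6: UUID="4d91f3bb-8c97-43c8-acea-fb1dd1fe0ed7" TYPE="swap"
/dev/sdb7: UUID="bbc69fc6-12fa-499a-a0c6-e0f65e248ce2" TYPE="xfs"
/dev/sda9: PARTUUID="4aa5c270-b2c6-4342-aea0-5ae7f4a1eba4"
"""


def parse_blkid(text):
    """Map each filesystem UUID to the list of device nodes that carry it."""
    by_uuid = {}
    for line in text.splitlines():
        device, sep, rest = line.partition(": ")
        if not sep:
            continue
        # Match the UUID field only, not PARTUUID or UUID_SUB.
        match = re.search(r'(?:^|\s)UUID="([^"]+)"', rest)
        if match:
            by_uuid.setdefault(match.group(1), []).append(device)
    return by_uuid


def check_fstab(fstab_text, blkid_text):
    """Map each fstab mount point that uses UUID= to its matching devices."""
    by_uuid = parse_blkid(blkid_text)
    resolved = {}
    for line in fstab_text.splitlines():
        fields = line.split()
        if len(fields) < 2 or not fields[0].startswith("UUID="):
            continue  # skips blank lines, comments and non-UUID entries
        uuid = fields[0][len("UUID="):]
        resolved[fields[1]] = by_uuid.get(uuid, [])
    return resolved


for mount_point, devices in check_fstab(FSTAB, BLKID).items():
    print(mount_point, "->", devices if devices else "NOT FOUND")
```

An entry that resolves to an empty list would point to a stale UUID in fstab. Here every entry resolves to exactly one partition on the boot disk, which suggests fstab is not what stops the boot.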

The symptom, when all the drives are connected, is that the machine hangs around POST, before the Linux boot menu ever loads, with only a cursor on the screen. Normally I see this cursor blink, hop down a couple of rows, and then I get the boot selection.

Now as it hops, it stops, lol.

Looking closer, I see an NTFS entry in there (sdb1). What is that, and could it be the issue? It may well be something I was working with last year when I set this all up.

Where do I start to debug this?

Per request: @Michael Hampton

The new drive is currently /dev/sdg, and the boot drive is /dev/sda. Before I did the ZFS replace, I remember the boot drive would sometimes switch from /dev/sda to /dev/sdb at random but still boot. That might be part of the problem when I boot with all drives connected lately.

Here are my partition tables:

$ fdisk -l

Disk /dev/sda: 500.1 GB, 500107862016 bytes, 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x44fdfe06

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *        2048      206847      102400    7  HPFS/NTFS/exFAT
/dev/sda2          206848   256002047   127897600    7  HPFS/NTFS/exFAT
/dev/sda3       256002048   257026047      512000   83  Linux
/dev/sda4       257026048   976773119   359873536    5  Extended
/dev/sda5       257028096   467412991   105192448   83  Linux
/dev/sda6       467415040   479737855     6161408   82  Linux swap / Solaris
/dev/sda7       479739904   563625983    41943040   83  Linux
WARNING: fdisk GPT support is currently new, and therefore in an experimental phase. Use at your own discretion.

Disk /dev/sdb: 2000.4 GB, 2000398934016 bytes, 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk label type: gpt
Disk identifier: 18C699EF-38E1-3B4F-8D2A-07F0101E7B11


#         Start          End    Size  Type            Name
 1         2048   3907012607    1.8T  Solaris /usr &  zfs-1e90ee20c4627577
 9   3907012608   3907028991      8M  Solaris reserve
WARNING: fdisk GPT support is currently new, and therefore in an experimental phase. Use at your own discretion.

Disk /dev/sdc: 2000.4 GB, 2000398934016 bytes, 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk label type: gpt
Disk identifier: 182F023B-C53D-4949-8CA9-209E34A8DCE3


#         Start          End    Size  Type            Name
 1         2048   3907012607    1.8T  Solaris /usr &  zfs-19ae99cec015d0db
 9   3907012608   3907028991      8M  Solaris reserve
WARNING: fdisk GPT support is currently new, and therefore in an experimental phase. Use at your own discretion.

Disk /dev/sdd: 2000.4 GB, 2000398934016 bytes, 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk label type: gpt
Disk identifier: BA7402DC-461B-6A4D-8611-DE3C7889E4F5


#         Start          End    Size  Type            Name
 1         2048   3907012607    1.8T  Solaris /usr &  zfs-8427c3bf89616cda
 9   3907012608   3907028991      8M  Solaris reserve
WARNING: fdisk GPT support is currently new, and therefore in an experimental phase. Use at your own discretion.

Disk /dev/sde: 2000.4 GB, 2000398934016 bytes, 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk label type: gpt
Disk identifier: 3252FCB6-A509-EE45-9A2B-6F6EC7612239


#         Start          End    Size  Type            Name
 1         2048   3907012607    1.8T  Solaris /usr &  zfs-061b31fabbe106cb
 9   3907012608   3907028991      8M  Solaris reserve
WARNING: fdisk GPT support is currently new, and therefore in an experimental phase. Use at your own discretion.

Disk /dev/sdf: 2000.4 GB, 2000398934016 bytes, 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk label type: gpt
Disk identifier: 7DFD1DFA-E3D1-4D4C-BE65-3C971B422D61


#         Start          End    Size  Type            Name
 1         2048   3907012607    1.8T  Solaris /usr &  zfs-5d020760c598b14c
 9   3907012608   3907028991      8M  Solaris reserve
WARNING: fdisk GPT support is currently new, and therefore in an experimental phase. Use at your own discretion.

Disk /dev/sdg: 2000.4 GB, 2000398934016 bytes, 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: gpt
Disk identifier: 8E166E43-4E09-F44B-976F-CB2E0ED93945

blkid again

$ blkid
/dev/sda7: UUID="bbc69fc6-12fa-499a-a0c6-e0f65e248ce2" TYPE="xfs"
/dev/sda3: UUID="226e836d-7b8e-424c-b0a0-0397ee458c7c" TYPE="xfs"
/dev/sda6: UUID="4d91f3bb-8c97-43c8-acea-fb1dd1fe0ed7" TYPE="swap"
/dev/sda1: LABEL="System Reserved" UUID="A2885ECD885EA019" TYPE="ntfs"
/dev/sda2: UUID="5E2E62DB2E62ABA9" TYPE="ntfs"
/dev/sda5: UUID="60c94586-7d6a-4e8a-b350-04719990cb69" TYPE="xfs"
/dev/sdb1: LABEL="san" UUID="6838649739541725191" UUID_SUB="13087102930353693443" TYPE="zfs_member" PARTLABEL="zfs-1e90ee20c4627577" PARTUUID="00e53f8e-9545-844d-9a0e-6c8746643114"
/dev/sdb9: PARTUUID="c06d2bcf-5c87-f24c-8782-aed395d053d7"
/dev/sdc1: LABEL="san" UUID="6838649739541725191" UUID_SUB="2133500285998926230" TYPE="zfs_member" PARTLABEL="zfs-19ae99cec015d0db" PARTUUID="440f2613-f23b-3c4e-bd90-ce2ef28f3e9f"
/dev/sdc9: PARTUUID="ec587856-71ad-5d42-9ad0-8251ee74f151"
/dev/sdd1: LABEL="san" UUID="6838649739541725191" UUID_SUB="7987608574075307207" TYPE="zfs_member" PARTLABEL="zfs-8427c3bf89616cda" PARTUUID="6792f785-4803-1643-888b-a98fd6f6743e"
/dev/sdd9: PARTUUID="80203adf-4e65-5e42-8e9b-2a6ccf0eafca"
/dev/sde1: LABEL="san" UUID="6838649739541725191" UUID_SUB="676738182062217510" TYPE="zfs_member" PARTLABEL="zfs-061b31fabbe106cb" PARTUUID="1f50712e-0c01-d445-9ad7-381d08307c2b"
/dev/sde9: PARTUUID="ea6c550c-f1a7-4a48-bf51-72c4ba44ab00"
/dev/sdf1: LABEL="san" UUID="6838649739541725191" UUID_SUB="10361692541083745258" TYPE="zfs_member" PARTLABEL="zfs-5d020760c598b14c" PARTUUID="eaae6308-64b3-004d-a7c8-be4e55c8c859"
/dev/sdf9: PARTUUID="b0e178b5-12ec-ac44-a5a8-1a05228e2015"
/dev/sdg1: LABEL="san" UUID="6838649739541725191" UUID_SUB="4029408817980194900" TYPE="zfs_member" PARTLABEL="zfs-288cf7ef18c79daa" PARTUUID="ec08031c-df8f-cd4b-9e38-010b5e967cab"
/dev/sdg9: PARTUUID="4aa5c270-b2c6-4342-aea0-5ae7f4a1eba4"

smartctl logs

As mentioned in the comment below, I keep some smartctl logs using a script that I wrote to check the health of the drives.

sda log; notice that it switched drives on Dec 28, for example.

$ tail -n 80 sda.log
Reallocated sectors -  - 0"
Pending sectors-  - 24"

Mon Oct  2 21:35:53 PDT 2017
                Model Number:       ST2000DM001-1E6164
Temp-  - 35 (0 15 0 0 0)"
Hours-  - 26052"
Reallocated sectors -  - 2136"
Pending sectors-  - 840"

Sun Nov 26 21:17:10 PST 2017
                Model Number:       WDC WD5000AACS-00ZUB0
Temp-  - 37"
Hours-  - 21298"
Reallocated sectors -  - 0"
Pending sectors-  - 0"

Sun Nov 26 21:53:14 PST 2017
                Model Number:       WDC WD5000AACS-00ZUB0
Temp-  - 38"
Hours-  - 21299"
Reallocated sectors -  - 0"
Pending sectors-  - 0"

Sun Nov 26 22:32:53 PST 2017
                Model Number:       WDC WD5000AACS-00ZUB0
Temp-  - 39"
Hours-  - 21299"
Reallocated sectors -  - 0"
Pending sectors-  - 0"

Sun Nov 26 23:24:36 PST 2017
                Model Number:       WDC WD5000AACS-00ZUB0
Temp-  - 40"
Hours-  - 21300"
Reallocated sectors -  - 0"
Pending sectors-  - 0"

Thu Nov 30 18:46:03 PST 2017
                Model Number:       WDC WD5000AACS-00ZUB0
Temp-  - 35"
Hours-  - 21392"
Reallocated sectors -  - 0"
Pending sectors-  - 0"

Tue Dec  5 17:31:57 PST 2017
                Model Number:       ST2000NM0011
Temp-  - 34 (0 25 0 0 0)"
Hours-  - 217"
Reallocated sectors -  - 438"
Pending sectors-  - 0"

Thu Dec 28 00:08:09 PST 2017
                Model Number:       WDC WD5000AACS-00ZUB0
Temp-  - 40"
Hours-  - 22037"
Reallocated sectors -  - 0"
Pending sectors-  - 0"

Tue Jan  2 13:05:22 PST 2018
                Model Number:       WDC WD5000AACS-00ZUB0
Temp-  - 38"
Hours-  - 22170"
Reallocated sectors -  - 0"
Pending sectors-  - 0"

Tue Jan  2 16:46:34 PST 2018
                Model Number:       WDC WD5000AACS-00ZUB0
Temp-  - 39"
Hours-  - 22174"
Reallocated sectors -  - 0"
Pending sectors-  - 0"

Tue Jan  2 23:09:37 PST 2018
                Model Number:       WDC WD5000AACS-00ZUB0
Temp-  - 40"
Hours-  - 22180"
Reallocated sectors -  - 0"
Pending sectors-  - 0"

sdb log

$ tail -n 80 sdb.log
Reallocated sectors -  - 0"
Pending sectors-  - 0"

Mon Oct  2 21:35:55 PDT 2017
                Model Number:       WDC WD5000AACS-00ZUB0
Temp-  - 42"
Hours-  - 19982"
Reallocated sectors -  - 0"
Pending sectors-  - 0"

Sun Nov 26 21:17:11 PST 2017
                Model Number:       ST2000DM001-9YN164
Temp-  - 34 (0 17 0 0 0)"
Hours-  - 70405"
Reallocated sectors -  - 0"
Pending sectors-  - 0"

Sun Nov 26 21:53:16 PST 2017
                Model Number:       ST2000DM001-9YN164
Temp-  - 37 (0 17 0 0 0)"
Hours-  - 70406"
Reallocated sectors -  - 0"
Pending sectors-  - 0"

Sun Nov 26 22:32:55 PST 2017
                Model Number:       ST2000DM001-9YN164
Temp-  - 38 (0 17 0 0 0)"
Hours-  - 70406"
Reallocated sectors -  - 0"
Pending sectors-  - 0"

Sun Nov 26 23:24:37 PST 2017
                Model Number:       ST2000DM001-9YN164
Temp-  - 38 (0 17 0 0 0)"
Hours-  - 70407"
Reallocated sectors -  - 0"
Pending sectors-  - 0"

Thu Nov 30 18:46:04 PST 2017
                Model Number:       ST2000DM001-9YN164
Temp-  - 31 (0 17 0 0 0)"
Hours-  - 70498"
Reallocated sectors -  - 0"
Pending sectors-  - 0"

Tue Dec  5 17:31:58 PST 2017
                Model Number:       WDC WD5000AACS-00ZUB0
Temp-  - 38"
Hours-  - 21510"
Reallocated sectors -  - 0"
Pending sectors-  - 0"

Thu Dec 28 00:08:10 PST 2017
                Model Number:       WDC WD20EZRX-00DC0B0
Temp-  - 36"
Hours-  - 35324"
Reallocated sectors -  - 0"
Pending sectors-  - 0"

Tue Jan  2 13:05:23 PST 2018
                Model Number:       WDC WD20EZRX-00DC0B0
Temp-  - 34"
Hours-  - 35457"
Reallocated sectors -  - 0"
Pending sectors-  - 0"

Tue Jan  2 16:46:34 PST 2018
                Model Number:       WDC WD20EZRX-00DC0B0
Temp-  - 34"
Hours-  - 35460"
Reallocated sectors -  - 0"
Pending sectors-  - 0"

Tue Jan  2 23:09:37 PST 2018
                Model Number:       WDC WD20EZRX-00DC0B0
Temp-  - 36"
Hours-  - 35467"
Reallocated sectors -  - 0"
Pending sectors-  - 0"
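
The drive switches are easier to spot if the log is reduced to the points where the model number changes. This is a rough sketch that assumes the log layout shown above (a timestamp line directly followed by a "Model Number:" line); the function name `model_changes` is made up, and the sample is trimmed from the sda log.

```python
# Trimmed sample taken from the sda log above.
SDA_LOG = """\
Thu Nov 30 18:46:03 PST 2017
                Model Number:       WDC WD5000AACS-00ZUB0
Temp-  - 35"
Hours-  - 21392"

Tue Dec  5 17:31:57 PST 2017
                Model Number:       ST2000NM0011
Temp-  - 34 (0 25 0 0 0)"
Hours-  - 217"

Thu Dec 28 00:08:09 PST 2017
                Model Number:       WDC WD5000AACS-00ZUB0
Temp-  - 40"
Hours-  - 22037"
"""


def model_changes(log_text):
    """Return (timestamp, model) for every entry whose model differs
    from the previous entry. The first entry is always reported."""
    changes = []
    last_model = None
    prev_line = ""
    for raw in log_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("Model Number:"):
            model = line.split(":", 1)[1].strip()
            if model != last_model:
                # The timestamp is the non-empty line just before this one.
                changes.append((prev_line, model))
                last_model = model
        prev_line = line
    return changes


for timestamp, model in model_changes(SDA_LOG):
    print(timestamp, "->", model)
```

Every change after the first reported entry means that a different physical disk was sitting behind the same /dev/sdX name when that entry was logged.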
Brian Thomas

1 Answer


Problem Solved.

This turned out to be a BIOS issue.

In the BIOS, I checked the drive order: with the new drive connected, the correct boot disk was no longer set as disk 1. I had to go into the drives section of the BIOS and set the first drive to the correct disk. I also disabled the external drive as the first boot disk, just in case.

Brian Thomas
  • Remember not to create zpools using `/dev/sd*` device names, as these may change. Instead, use the names in `/dev/disk/by-id`, which never change. – Michael Hampton Feb 19 '18 at 00:45
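
To see which persistent names belong to which `/dev/sd*` node at the moment, the symlinks in `/dev/disk/by-id` can be resolved. A minimal sketch, assuming a Linux system with udev; the function name `stable_names` is made up for illustration.

```python
import os


def stable_names(by_id_dir="/dev/disk/by-id"):
    """Map each kernel device node to the by-id symlink names pointing at it."""
    mapping = {}
    if not os.path.isdir(by_id_dir):
        return mapping  # not Linux, or udev did not populate the directory
    for name in sorted(os.listdir(by_id_dir)):
        link = os.path.join(by_id_dir, name)
        if os.path.islink(link):
            # Resolve the symlink to the device node it currently points at.
            target = os.path.realpath(link)
            mapping.setdefault(target, []).append(name)
    return mapping


for device, names in sorted(stable_names().items()):
    print(device, "->", ", ".join(names))
```

For an existing pool, the usual way to switch to these names is to export the pool and re-import it with `zpool import -d /dev/disk/by-id`. Check that against the zpool man page for your version before running it on a live pool.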