
Good morning all,

I have some NAS software which - for the most part - works great, but when something happens to the Hyper-V host it often fails spectacularly. Today is one of those days.

I'm not great at under-the-hood stuff, but I've Googled around and come to the conclusion that the whole thing is confused! I only have one RAID 5 with 7 disks in total, which seems to correspond to the mdadm --detail /dev/md2 output below.

Can anyone give me any guidance on how to bring the volume back to life?

mdadm --detail /dev/md3
/dev/md3:
Version : 1.2
Creation Time : Sun Apr  1 20:22:22 2018
Raid Level : raid5
Array Size : 3906971648 (3725.98 GiB 4000.74 GB)
Used Dev Size : 976742912 (931.49 GiB 1000.18 GB)
Raid Devices : 5
Total Devices : 4
Persistence : Superblock is persistent

Update Time : Fri Dec 21 07:52:22 2018
      State : clean, degraded
Active Devices : 4
Working Devices : 4
Failed Devices : 0
Spare Devices : 0

Layout : left-symmetric
Chunk Size : 64K

Name : SUPERSYNO:3
UUID : e3f844f6:4bd70c27:c40439c2:8d1a29e9
Events : 12216

Number   Major   Minor   RaidDevice State
0       8       70        0      active sync   /dev/sde6
1       0        0        1      removed
4       8      150        2      active sync   /dev/sdj6
5       8      118        3      active sync   /dev/sdh6
2       8      134        4      active sync   /dev/sdi6

mdadm --detail /dev/md0
/dev/md0:
Version : 0.90
Creation Time : Sat Jan  1 00:04:42 2000
Raid Level : raid1
Array Size : 2490176 (2.37 GiB 2.55 GB)
Used Dev Size : 2490176 (2.37 GiB 2.55 GB)
Raid Devices : 12
Total Devices : 8
Preferred Minor : 0
Persistence : Superblock is persistent

Update Time : Fri Dec 21 08:08:09 2018
    State : clean, degraded
Active Devices : 8
Working Devices : 8
Failed Devices : 0
Spare Devices : 0

UUID : 26aa5fea:f6cb55b4:3017a5a8:c86610be
Events : 0.5720

Number   Major   Minor   RaidDevice State
0       8       33        0      active sync   /dev/sdc1
1       8       49        1      active sync   /dev/sdd1
2       8      145        2      active sync   /dev/sdj1
3       8      129        3      active sync   /dev/sdi1
4       8      113        4      active sync   /dev/sdh1
5       8       97        5      active sync   /dev/sdg1
6       8       81        6      active sync   /dev/sdf1
7       8       65        7      active sync   /dev/sde1
8       0        0        8      removed
9       0        0        9      removed
10       0        0       10      removed
11       0        0       11      removed

mdadm --detail /dev/md1
/dev/md1:
Version : 0.90
Creation Time : Thu Dec 20 22:46:39 2018
Raid Level : raid1
Array Size : 2097088 (2048.28 MiB 2147.42 MB)
Used Dev Size : 2097088 (2048.28 MiB 2147.42 MB)
Raid Devices : 12
Total Devices : 8
Preferred Minor : 1
Persistence : Superblock is persistent

Update Time : Thu Dec 20 22:47:37 2018
      State : active, degraded
Active Devices : 8
Working Devices : 8
Failed Devices : 0
Spare Devices : 0

UUID : 3637e4b4:d5bb08ce:dca69c88:18a34d86 (local to host SuperSyno)
Events : 0.19

Number   Major   Minor   RaidDevice State
0       8       34        0      active sync   /dev/sdc2
1       8       50        1      active sync   /dev/sdd2
2       8       66        2      active sync   /dev/sde2
3       8       82        3      active sync   /dev/sdf2
4       8       98        4      active sync   /dev/sdg2
5       8      114        5      active sync   /dev/sdh2
6       8      130        6      active sync   /dev/sdi2
7       8      146        7      active sync   /dev/sdj2
8       0        0        8      removed
9       0        0        9      removed
10       0        0       10      removed
11       0        0       11      removed

mdadm --detail /dev/md2
/dev/md2:
Version : 1.2
Creation Time : Sun Apr  1 20:22:22 2018
Raid Level : raid5
Array Size : 20478048192 (19529.39 GiB 20969.52 GB)
Used Dev Size : 2925435456 (2789.91 GiB 2995.65 GB)
Raid Devices : 8
Total Devices : 7
Persistence : Superblock is persistent

Update Time : Thu Dec 20 22:34:47 2018
      State : clean, degraded
Active Devices : 7
Working Devices : 7
Failed Devices : 0
Spare Devices : 0

Layout : left-symmetric
Chunk Size : 64K

Name : SUPERSYNO:2
UUID : 7d8b04cf:aebb8d01:6034359b:c4bb62db
Events : 280533

Number   Major   Minor   RaidDevice State
9       8       53        0      active sync   /dev/sdd5
1       8       69        1      active sync   /dev/sde5
2       0        0        2      removed
7       8      149        3      active sync   /dev/sdj5
8       8      117        4      active sync   /dev/sdh5
5       8      133        5      active sync   /dev/sdi5
4       8      101        6      active sync   /dev/sdg5
3       8       85        7      active sync   /dev/sdf5

mdadm --detail /dev/md3
/dev/md3:
Version : 1.2
Creation Time : Sun Apr  1 20:22:22 2018
Raid Level : raid5
Array Size : 3906971648 (3725.98 GiB 4000.74 GB)
Used Dev Size : 976742912 (931.49 GiB 1000.18 GB)
Raid Devices : 5
Total Devices : 4
Persistence : Superblock is persistent

Update Time : Fri Dec 21 08:01:56 2018
      State : clean, degraded
Active Devices : 4
Working Devices : 4
Failed Devices : 0
Spare Devices : 0

Layout : left-symmetric
Chunk Size : 64K

Name : SUPERSYNO:3
UUID : e3f844f6:4bd70c27:c40439c2:8d1a29e9
Events : 12218

Number   Major   Minor   RaidDevice State
0       8       70        0      active sync   /dev/sde6
1       0        0        1      removed
4       8      150        2      active sync   /dev/sdj6
5       8      118        3      active sync   /dev/sdh6
2       8      134        4      active sync   /dev/sdi6


cat /proc/mdstat

Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md1 : active raid1 sdj2[7] sdi2[6] sdh2[5] sdg2[4] sdf2[3] sde2[2] sdd2[1] sdc2[0]
  2097088 blocks [12/8] [UUUUUUUU____]

md2 : active raid5 sdd5[9] sdf5[3] sdg5[4] sdi5[5] sdh5[8] sdj5[7] sde5[1]
  20478048192 blocks super 1.2 level 5, 64k chunk, algorithm 2 [8/7] [UU_UUUUU]

md3 : active raid5 sde6[0] sdi6[2] sdh6[5] sdj6[4]
  3906971648 blocks super 1.2 level 5, 64k chunk, algorithm 2 [5/4] [U_UUU]

md0 : active raid1 sdj1[2] sdi1[3] sdh1[4] sdg1[5] sdf1[6] sde1[7] sdd1[1] sdc1[0]
  2490176 blocks [12/8] [UUUUUUUU____]

unused devices: <none>

cat /proc/partitions
major minor  #blocks  name

8       32   10485760 sdc
8       33    2490240 sdc1
8       34    2097152 sdc2
8       48 2930266584 sdd
8       49    2490240 sdd1
8       50    2097152 sdd2
8       53 2925436480 sdd5
8       64 3907018584 sde
8       65    2490240 sde1 
8       66    2097152 sde2
8       69 2925436480 sde5
8       70  976743952 sde6
8       80 2930266584 sdf
8       81    2490240 sdf1
8       82    2097152 sdf2
8       85 2925436480 sdf5
8       96 2930266584 sdg
8       97    2490240 sdg1
8       98    2097152 sdg2
8      101 2925436480 sdg5
8      112 3907018584 sdh
8      113    2490240 sdh1
8      114    2097152 sdh2
8      117 2925436480 sdh5
8      118  976743952 sdh6
8      128 3907018584 sdi
8      129    2490240 sdi1
8      130    2097152 sdi2
8      133 2925436480 sdi5
8      134  976743952 sdi6
8      144 3907018584 sdj
8      145    2490240 sdj1
8      146    2097152 sdj2
8      149 2925436480 sdj5
8      150  976743952 sdj6
9        0    2490176 md0
251        0    2430976 zram0
9        3 3906971648 md3
9        2 20478048192 md2
253        0 24385015808 dm-0
9        1    2097088 md1

Edit: with the 8th drive back in

cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md2 : active raid5 sde5[9] sdf5[3] sdg5[4] sdj5[5] sdh5[8] sdk5[7] sdd5[1]
  20478048192 blocks super 1.2 level 5, 64k chunk, algorithm 2 [8/7] [UU_UUUUU]

md3 : active raid5 sdd6[0] sdj6[2] sdh6[5] sdk6[4]
  3906971648 blocks super 1.2 level 5, 64k chunk, algorithm 2 [5/4] [U_UUU]

md1 : active raid1 sdk2[8] sdj2[7] sdi2[6] sdh2[5] sdg2[4] sdf2[3] sde2[2] sdd2[1] sdc2[0]
  2097088 blocks [12/9] [UUUUUUUUU___]

md0 : active raid1 sdk1[2] sdj1[3] sdh1[4] sdg1[5] sdf1[7] sde1[8] sdd1[1] sdc1[0] sdi1[6]
  2490176 blocks [12/9] [UUUUUUUUU___]

unused devices: <none>

cat /proc/partitions
major minor  #blocks  name

8       32    5242880 sdc
8       33    2490240 sdc1
8       34    2097152 sdc2
8       48 3907018584 sdd
8       49    2490240 sdd1
8       50    2097152 sdd2
8       53 2925436480 sdd5
8       54  976743952 sdd6
8       64 2930266584 sde
8       65    2490240 sde1
8       66    2097152 sde2
8       69 2925436480 sde5
8       80 2930266584 sdf
8       81    2490240 sdf1
8       82    2097152 sdf2
8       85 2925436480 sdf5
8       96 2930266584 sdg
8       97    2490240 sdg1
8       98    2097152 sdg2
8      101 2925436480 sdg5
8      112 3907018584 sdh
8      113    2490240 sdh1
8      114    2097152 sdh2
8      117 2925436480 sdh5
8      118  976743952 sdh6
8      128 3907018584 sdi
8      129    2490240 sdi1
8      130    2097152 sdi2
8      133 2925436480 sdi5
8      134  976743952 sdi6
8      144 3907018584 sdj
8      145    2490240 sdj1
8      146    2097152 sdj2
8      149 2925436480 sdj5
8      150  976743952 sdj6
8      160 3907018584 sdk
8      161    2490240 sdk1
8      162    2097152 sdk2
8      165 2925436480 sdk5
8      166  976743952 sdk6
9        0    2490176 md0
9        1    2097088 md1
251        0    2430976 zram0
9        3 3906971648 md3
9        2 20478048192 md2
253        0 24385015808 dm-0

Edit @ 07:19am, post consistency check

cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md2 : active raid5 sdd5[9] sdf5[3] sdg5[4] sdj5[5] sdh5[8] sdk5[7] sdi5[10] sde5[1]
  20478048192 blocks super 1.2 level 5, 64k chunk, algorithm 2 [8/8] [UUUUUUUU]

md3 : active raid5 sde6[0] sdj6[2] sdh6[5] sdk6[4] sdi6[6]
  3906971648 blocks super 1.2 level 5, 64k chunk, algorithm 2 [5/5] [UUUUU]

md1 : active raid1 sdc2[0] sdd2[2] sde2[1] sdf2[3] sdg2[4] sdh2[5] sdi2[6] sdj2[7] sdk2[8]
  2097088 blocks [12/9] [UUUUUUUUU___]

md0 : active raid1 sdc1[0] sdd1[8] sde1[1] sdf1[7] sdg1[5] sdh1[4] sdi1[6] sdj1[3] sdk1[2]
  2490176 blocks [12/9] [UUUUUUUUU___]

unused devices: <none>
  • Could you include the output of `cat /proc/mdstat` and `cat /proc/partitions`? And which distribution is that machine running? – kasperd Dec 21 '18 at 11:43
  • WHY are you using R5 in 2018 - and with 7 disks too! It's dangerous, it'll kill your data - please move onto R1/10 or 6/60 asap. – Chopper3 Dec 21 '18 at 11:47
  • @kasperd - have added the results as requested. This is Xpenology 5.2. This doesn't have SUDO or IPKG on it so trying to get them on now - unless there's another way to get around it? – Graham Jordan Dec 21 '18 at 12:10
  • @Chopper3 - It's Synology SHR (JBOD from what I gather). If I can get through this unscathed I'll be doing whatever's needed to make it more resilient! – Graham Jordan Dec 21 '18 at 12:10
  • Looks like there used to be one more disk in the NAS. You probably need to get an 8th disk and install it. And do it soon as you are just one disk failure away from losing all your data. That 8th disk needs to be at least 4TB. Was the defective disk removed from the NAS already? Or is the defective disk still in there next to the 7 good ones? – kasperd Dec 21 '18 at 12:19
  • @kasperd - looks like during a move a SATA cable came dislodged. Have plugged back in, reconnected it to Xpenology. DSM tried to repair again but still in a crashed state. Have edited my original post with fresh output. What was the giveaway in there? – Graham Jordan Dec 21 '18 at 13:04
  • @kasperd - I've been using this link (https://forum.synology.com/enu/viewtopic.php?f=39&t=102148) as a spring board and came to a conclusion I needed to add sdi6. It's running a parity check as we speak. The volume is still in a crashed state but at least it knows the right drive is in. Will update once completed (probably tomorrow!) Just a heart filled thank you for this. I mean that with all sincerity. Thank you so very much – Graham Jordan Dec 21 '18 at 14:04
  • @GrahamJordan There are two separate RAID-5. You need to add `sdi5` to one and `sdi6` to the other. What does `/proc/mdstat` look like at the moment? – kasperd Dec 21 '18 at 14:07
  • @kasperd - Morning, The consistency check finished early hours of this morning, have updated the main post. It completed successfully but no volumes or anything. In Xpenology it's still showing as crashed. Would I be right in thinking those two rogue 12 drive raid configs are confusing my NAS software? – Graham Jordan Dec 22 '18 at 07:28
  • @GrahamJordan To me those two looks like they are intentionally setup as degraded just so it will be easy to add more mirrors later. I don't know that Xpenology software. At the `md` layer everything looks just fine now. – kasperd Dec 22 '18 at 10:04
  • @kasperd - The website I linked to previously, the story there seems to suggest the drive order is wrong. Would I be right in thinking something similar on mine? In the very first post the letters aren't in chronological order. Does this make sense to you? As in the options? Not sure what the -l6 is. `syno_poweroff_task -d mdadm --stop /dev/md2 mdadm -Cf /dev/md2 -e1.2 -n5 -l6 /dev/sdd5 /dev/sde5 /dev/sdf5 /dev/sdg5 /dev/sdh5 /dev/sdj5 /dev/sdi5 /dev/sdk5 -7d8b04cf:aebb8d01:6034359b:c4bb62db` – Graham Jordan Dec 22 '18 at 10:14
  • @GrahamJordan As far as I know `mdadm` does not care about the order of the drives. And those commands you listed will reformat the RAID, that's what `-C` is for. You can do some non-destructive sanity tests with `file -`… – kasperd Dec 22 '18 at 10:25

1 Answer


From the cat /proc/mdstat output we can see that sdi has been re-added to the md0 and md1 arrays. These are two small RAID-1 arrays with a mirror on every disk; with that many mirrors, they were not at risk of data loss.

Unfortunately, sdi was not re-added to md2 and md3, which are the RAID-5 arrays currently at risk of data loss.
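
Before re-adding anything, it may be worth confirming non-destructively that the sdi partitions still carry superblocks belonging to those two arrays; a minimal check, assuming the device names shown in the question:

mdadm --examine /dev/sdi5    # Array UUID should match md2 (7d8b04cf:aebb8d01:6034359b:c4bb62db)
mdadm --examine /dev/sdi6    # Array UUID should match md3 (e3f844f6:4bd70c27:c40439c2:8d1a29e9)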

Why it was not automatically re-added I don't know. If you want to do it manually, the commands to do so are:

mdadm --add /dev/md2 /dev/sdi5
mdadm --add /dev/md3 /dev/sdi6
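
Once the partitions are added, each array resyncs onto them; progress can be followed with the standard tools (the grep filter is just a convenience):

cat /proc/mdstat                                # shows a recovery percentage while rebuilding
mdadm --detail /dev/md2 | grep -iE 'state|rebuild'
mdadm --detail /dev/md3 | grep -iE 'state|rebuild'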

Do note that while I have experience with mdadm, I have no experience with Synology, so I cannot tell you whether there is any Synology-specific risk in running these commands. If there are Synology-specific tools that achieve the same effect, they may be preferable to the raw mdadm command-line tool.

kasperd
  • Thank you again Kasperd. I'm still struggling to mount the volume but at least I'm in a clean state. Appreciate it. Have started a fresh question in the hope of raising fresh eyebrows. Have a good Christmas – Graham Jordan Dec 22 '18 at 22:25