
ITNOA

I have two SuperMicro servers with ESXi 6.0.0 installed on both of them. I created a vSAN cluster across them and put all VMs on the vsanStorage datastore. Each server has two SSDs in RAID 1 and two HDDs in RAID 1. After a power failure in my data center, all VMs on one server became orphaned and all VMs on the other became inaccessible. While investigating the problem, I found that one of my servers could not initialize vSAN and logged many errors like the ones below:

865)CMMDS: MasterAddNodeToMembership:4982: Added node 5777c24c-2568-7ec6-4dd8-005056bb8703 to the cluster membership
0:07:29.240Z cpu27:34329)VSAN Device Monitor: Checking VSAN device latencies and congestion.
519)ScsiDeviceIO: 2651: Cmd(0x439e17f1ca00) 0x1a, CmdSN 0x1 from world 34314 to dev "naa.600304801cb841001f08f1ce0cfa04ce" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.
519)ScsiDeviceIO: 2651: Cmd(0x439e17f1ca00) 0x1a, CmdSN 0x2 from world 34314 to dev "naa.600304801cb841001f08f1ce0cfa04ce" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.
519)ScsiDeviceIO: 2651: Cmd(0x439e17f1ca00) 0x1a, CmdSN 0x3 from world 34314 to dev "naa.600304801cb841001f08f1ce0cfa04ce" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.
519)ScsiDeviceIO: 2651: Cmd(0x439e17f1ca00) 0x1a, CmdSN 0x4 from world 34314 to dev "naa.600304801cb841001f08f1ce0cfa04ce" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.
519)ScsiDeviceIO: 2651: Cmd(0x439e17f1ca00) 0x1a, CmdSN 0x5 from world 34314 to dev "naa.600304801cb841001f08f1ce0cfa04ce" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.
519)ScsiDeviceIO: 2651: Cmd(0x439e17f1ca00) 0x1a, CmdSN 0x6 from world 34314 to dev "naa.600304801cb841001f08f1ce0cfa04ce" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.
4357)Tracing: dropped 707185 traces (707185 total)
3520)ScsiDeviceIO: 2651: Cmd(0x43a580c2a780) 0x1a, CmdSN 0x6bf from world 0 to dev "naa.600304801cb841001f08f1ce0cfa04ce" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.
3520)ScsiDeviceIO: 2651: Cmd(0x43a580c2a780) 0x1a, CmdSN 0x6c4 from world 0 to dev "naa.600304801cb841001f08f209107cfabe" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.
3520)ScsiDeviceIO: 2651: Cmd(0x43a580c2a780) 0x1a, CmdSN 0x6ca from world 0 to dev "naa.600304801cb841001f08f22c1296cd81" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.
3520)ScsiDeviceIO: 2651: Cmd(0x43a580c2a780) 0x1a, CmdSN 0x6d0 from world 0 to dev "naa.600304801cb841001f08f19809c8d99a" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.
3520)ScsiDeviceIO: 2651: Cmd(0x43a580c2a780) 0x1a, CmdSN 0x6d5 from world 0 to dev "naa.600304801cb841001f08f19809c8d99a" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.
3520)ScsiDeviceIO: 2651: Cmd(0x43a580c2a780) 0x1a, CmdSN 0x6da from world 0 to dev "naa.600304801cb841001f08f19809c8d99a" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.
3520)NMP: nmp_ThrottleLogForDevice:3231: last error status from device naa.600304801cb841001f08f19809c8d99a repeated 80 times
3520)ScsiDeviceIO: 2651: Cmd(0x43a580c2a780) 0x1a, CmdSN 0x6df from world 0 to dev "naa.600304801cb841001f08f19809c8d99a" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.
3520)ScsiDeviceIO: 2651: Cmd(0x43a580c2a780) 0x1a, CmdSN 0x6e4 from world 0 to dev "naa.600304801cb841001f08f19809c8d99a" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.
3520)ScsiDeviceIO: 2651: Cmd(0x43a580c2a780) 0x1a, CmdSN 0x6e9 from world 0 to dev "naa.600304801cb841001f08f22c1296cd81" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.
4465)PLOG: PLOGProbeDevice:5213: Probed plog device <naa.600304801cb841001f08f22c1296cd81:1> 0x4305394dd770 exists.. continue with old entry
3520)ScsiDeviceIO: 2651: Cmd(0x43a580c2a600) 0x1a, CmdSN 0x6ef from world 0 to dev "naa.600304801cb841001f08f209107cfabe" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.
3520)ScsiDeviceIO: 2651: Cmd(0x43a580c2a600) 0x1a, CmdSN 0x6f5 from world 0 to dev "naa.600304801cb841001f08f1ce0cfa04ce" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.
4465)PLOG: PLOGProbeDevice:5213: Probed plog device <naa.600304801cb841001f08f1ce0cfa04ce:1> 0x4305390d9630 exists.. continue with old entry
3520)ScsiDeviceIO: 2651: Cmd(0x43a580c2a480) 0x1a, CmdSN 0x6fa from world 0 to dev "naa.600304801cb841001f08f1ce0cfa04ce" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.
4465)PLOG: PLOGProbeDevice:5213: Probed plog device <naa.600304801cb841001f08f1ce0cfa04ce:2> 0x4305390da670 exists.. continue with old entry
3520)ScsiDeviceIO: 2651: Cmd(0x43a580c2a480) 0x1a, CmdSN 0x6ff from world 0 to dev "naa.600304801cb841001f08f22c1296cd81" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.
4465)PLOG: PLOGProbeDevice:5213: Probed plog device <naa.600304801cb841001f08f22c1296cd81:2> 0x4305394de7b0 exists.. continue with old entry
3520)ScsiDeviceIO: 2651: Cmd(0x43a580c2a480) 0x1a, CmdSN 0x705 from world 0 to dev "naa.600304801cb841001f08f19809c8d99a" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.
4465)WARNING: LSOMCommon: LSOM_DiskGroupCreate:1448: Disk group already created uuid: 521ae5f3-eac3-cfa7-e10d-01b2f379762c
4465)LSOMCommon: SSDLOG_AddDisk:723: Existing ssd found naa.600304801cb841001f08f1ce0cfa04ce:2
4465)PLOG: PLOGAnnounceSSD:6570: Successfully added VSAN SSD (naa.600304801cb841001f08f1ce0cfa04ce:2) with UUID 521ae5f3-eac3-cfa7-e10d-01b2f379762c
4465)VSAN: Initializing SSD: 521ae5f3-eac3-cfa7-e10d-01b2f379762c Please wait...
2959)PLOG: PLOGNotifyDisks:4010: MD 0 with UUID 52f0ac26-c7b0-8f0f-6dbb-3aeddcae32f2 with state 0 formatVersion 4 backing SSD 521ae5f3-eac3-cfa7-e10d-01b2f379762c notified
2959)WARNING: PLOG: PLOGNotifyDisks:4036: Recovery on SSD 521ae5f3-eac3-cfa7-e10d-01b2f379762c had failed earlier, SSD not published
2959)WARNING: PLOG: PLOGRecoverDeviceLogsDispatch:4220: Error Failure from PLOGNotifyDisks() for SSD naa.600304801cb841001f08f1ce0cfa04ce
4465)WARNING: PLOG: PLOGCheckRecoveryStatusForOneDevice:6682: Recovery failed for disk 521ae5f3-eac3-cfa7-e10d-01b2f379762c
4465)VSAN: Initialization for SSD: 521ae5f3-eac3-cfa7-e10d-01b2f379762c Failed
4465)WARNING: PLOG: PLOGInitAndAnnounceMD:6901: Recovery failed for the disk group.. deferring publishing of magnetic disk naa.600304801cb841001f08f22c1296cd81
3520)ScsiDeviceIO: 2651: Cmd(0x43a580c2a480) 0x1a, CmdSN 0x70a from world 0 to dev "naa.600304801cb841001f08f19809c8d99a" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.
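For anyone decoding the repeated ScsiDeviceIO lines above: the failure signature is identical on every device, and the codes in it are standard SCSI values (from the T10 SPC tables), not vSAN-specific. A small sketch that spells them out, using an abridged sample line from the log:

```shell
#!/bin/sh
# Decode the recurring ScsiDeviceIO failure signature from the vmkernel log.
# 0x1a, D:0x2 and sense 0x5/0x24/0x0 are standard T10 SCSI codes.
line='Cmd(0x439e17f1ca00) 0x1a, CmdSN 0x1 from world 34314 failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.'

case "$line" in *' 0x1a,'*)
  echo "opcode 0x1a     = MODE SENSE(6)" ;;
esac
case "$line" in *'H:0x0 D:0x2'*)
  echo "host status 0x0 = OK, device status 0x2 = CHECK CONDITION" ;;
esac
case "$line" in *'sense data: 0x5 0x24 0x0'*)
  echo "sense key 0x5   = ILLEGAL REQUEST, ASC/ASCQ 0x24/0x0 = INVALID FIELD IN CDB" ;;
esac
```

In other words, the RAID controller is rejecting a MODE SENSE(6) page it does not support. By itself this is usually harmless noise; the fatal part of the log is the redo-log recovery failure further down.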








2018-07-15T21:56:58.882Z cpu25:33315)ScsiDeviceIO: 8409: Get VPD 86 Inquiry for device "naa.600304801cb841001f08f22c1296cd81" from Plugin "NMP" failed. Not supported
2018-07-15T21:56:58.882Z cpu25:33315)ScsiDeviceIO: 7030: Could not detect setting of QErr for device naa.600304801cb841001f08f22c1296cd81. Error Not supported.
2018-07-15T21:56:58.882Z cpu25:33315)ScsiDeviceIO: 7544: Could not detect setting of sitpua for device naa.600304801cb841001f08f22c1296cd81. Error Not supported.
2018-07-15T21:56:58.883Z cpu32:33526)ScsiDeviceIO: 2636: Cmd(0x43bd80c5edc0) 0x1a, CmdSN 0x9 from world 0 to dev "naa.600304801cb841001f08f22c1296cd81" failed H:0x0 D:0x2 P:0x0 Valid
2018-07-15T21:56:58.883Z cpu25:33315)ScsiEvents: 300: EventSubsystem: Device Events, Event Mask: 40, Parameter: 0x4302972eff40, Registered!
2018-07-15T21:56:58.883Z cpu25:33315)ScsiDevice: 3905: Successfully registered device "naa.600304801cb841001f08f22c1296cd81" from plugin "NMP" of type 0


2018-07-15T21:57:09.321Z cpu20:33315)PLOG: PLOG_InitDevice:262: Initialized device naa.600304801cb841001f08f22c1296cd81:2 0x4305644ed110 quiesceTask 0x4305644ee150 on SSD 00000000-00
2018-07-15T21:57:09.322Z cpu20:33315)PLOG: PLOG_InitDevice:262: Initialized device naa.600304801cb841001f08f1ce0cfa04ce:2 0x4305644ef770 quiesceTask 0x4305644ee620 on SSD 00000000-00
2018-07-15T21:57:09.323Z cpu20:33315)VSANServer: VSANServer_InstantiateServer:2885: Instantiated VSANServer 0x4305644eeb58
2018-07-15T21:57:09.323Z cpu20:33315)PLOG: PLOG_InitDevice:262: Initialized device naa.600304801cb841001f08f1ce0cfa04ce:1 0x4305644f07b0 quiesceTask 0x4305644f17f0 on SSD 521ae5f3-ea
2018-07-15T21:57:09.323Z cpu20:33315)PLOG: PLOG_InitDevice:262: Initialized device naa.600304801cb841001f08f1ce0cfa04ce:2 0x4305644f1c70 quiesceTask 0x4305644f2cb0 on SSD 521ae5f3-ea
2018-07-15T21:57:09.323Z cpu20:33315)PLOG: PLOG_FreeDevice:325: PLOG in-mem device 0x4305644ef770 naa.600304801cb841001f08f1ce0cfa04ce:2 0x1 00000000-0000-0000-0000-000000000000 is b
2018-07-15T21:57:09.323Z cpu20:33315)PLOG: PLOG_FreeDevice:496: Throttled: Waiting for ops to complete on device: 0x4305644ef770 naa.600304801cb841001f08f1ce0cfa04ce:2
2018-07-15T21:57:09.336Z cpu20:33315)PLOG: PLOGCreateGroupDevice:592: Allocated 65536 trace entries for 521ae5f3-eac3-cfa7-e10d-01b2f379762c
2018-07-15T21:57:09.336Z cpu20:33315)PLOG: PLOGCreateGroupDevice:611: PLOG disk group for SSD 0x4305644f07b0 521ae5f3-eac3-cfa7-e10d-01b2f379762c is created
2018-07-15T21:57:09.337Z cpu20:33315)PLOG: PLOG_InitDevice:262: Initialized device naa.600304801cb841001f08f22c1296cd81:1 0x4305644ef770 quiesceTask 0x4305648f5120 on SSD 521ae5f3-ea
2018-07-15T21:57:09.337Z cpu20:33315)PLOG: PLOG_InitDevice:262: Initialized device naa.600304801cb841001f08f22c1296cd81:2 0x4305648f55a0 quiesceTask 0x4305648f65e0 on SSD 521ae5f3-ea
2018-07-15T21:57:09.337Z cpu20:33315)PLOG: PLOG_FreeDevice:325: PLOG in-mem device 0x4305644ed110 naa.600304801cb841001f08f22c1296cd81:2 0x1 00000000-0000-0000-0000-000000000000 is b
2018-07-15T21:57:09.350Z cpu20:33315)LSOMCommon: LSOM_DiskGroupCreate:1461: Creating diskgroup uuid: 521ae5f3-eac3-cfa7-e10d-01b2f379762c (Read cache size: 207773478912, Write buffer
2018-07-15T21:57:09.350Z cpu20:33315)LSOMCommon: LSOMGlobalMemInit:1257: Initializing LSOM's global memory


2018-07-15T21:57:25.776Z cpu30:32970)PLOG: PLOG_Recover:882: Doing plog recovery on SSD naa.600304801cb841001f08f1ce0cfa04ce:2
2018-07-15T21:57:26.168Z cpu6:33577)Created VSAN Slab PLOGRecovSlab_0x4305644f1c70 (objSize=40960 align=64 minObj=32769 maxObj=32769 overheadObj=1310 minMemUsage=1499476k maxMemUsage
2018-07-15T21:57:26.184Z cpu10:33562)PLOG: PLOGHandleLogEntry:320: Recovering SSD state for MD 52f0ac26-c7b0-8f0f-6dbb-3aeddcae32f2
2018-07-15T21:58:39.226Z cpu0:33525)WARNING: LSOMCommon: SSDLOG_EnumLogCB:1450: SSD corruption detected. device: naa.600304801cb841001f08f1ce0cfa04ce:2
2018-07-15T21:58:39.226Z cpu10:33562)WARNING: PLOG: PLOGEnumLogCB:411: Log enum CB failed with Corrupt RedoLog
2018-07-15T21:58:39.226Z cpu10:33562)LSOMCommon: SSDLOG_EnumLogHelper:1401: Throttled: Waiting for 1 outstanding reads
2018-07-15T21:58:39.226Z cpu0:33525)LSOMCommon: SSDLOG_IsValidLogBlk:132: Invalid version device: naa.600304801cb841001f08f1ce0cfa04ce:2
2018-07-15T21:58:39.226Z cpu0:33525)WARNING: LSOMCommon: SSDLOG_EnumLogCB:1450: SSD corruption detected. device: naa.600304801cb841001f08f1ce0cfa04ce:2
2018-07-15T21:58:39.337Z cpu7:33578)Destroyed VSAN Slab PLOGRecovSlab_0x4305644f1c70 (maxCount=32769 failCount=0)
2018-07-15T21:58:39.337Z cpu22:33742)PLOG: PLOGRecDisp:823: PLOG recovery complete 521ae5f3-eac3-cfa7-e10d-01b2f379762c:Processed 2271342 entries, Took 73154 ms
2018-07-15T21:58:39.337Z cpu22:33742)PLOG: PLOGRecDisp:832: Recovery for naa.600304801cb841001f08f1ce0cfa04ce:2 completed with Corrupt RedoLog
2018-07-15T21:58:39.337Z cpu37:33315)WARNING: PLOG: PLOGCheckRecoveryStatusForOneDevice:6702: Recovery failed for disk 521ae5f3-eac3-cfa7-e10d-01b2f379762c
2018-07-15T21:58:39.337Z cpu37:33315)VSAN: Initialization for SSD: 521ae5f3-eac3-cfa7-e10d-01b2f379762c Failed
2018-07-15T21:58:39.337Z cpu37:33315)WARNING: PLOG: PLOGInitAndAnnounceMD:6921: Recovery failed for the disk group.. deferring publishing of magnetic disk naa.600304801cb841001f08f22
2018-07-15T21:58:39.371Z cpu37:33315)Vol3: 2687: Could not open device 'naa.600304801cb841001f08f1ce0cfa04ce:2' for probing: No underlying device for major,minor
2018-07-15T21:58:39.372Z cpu37:33315)Vol3: 2687: Could not open device 'naa.600304801cb841001f08f1ce0cfa04ce:2' for probing: No underlying device for major,minor
2018-07-15T21:58:39.374Z cpu37:33315)Vol3: 1078: Could not open device 'naa.600304801cb841001f08f1ce0cfa04ce:2' for volume open: No underlying device for major,minor
2018-07-15T21:58:39.375Z cpu37:33315)Vol3: 1078: Could not open device 'naa.600304801cb841001f08f1ce0cfa04ce:2' for volume open: No underlying device for major,minor
2018-07-15T21:58:39.375Z cpu37:33315)FSS: 5353: No FS driver claimed device 'naa.600304801cb841001f08f1ce0cfa04ce:2': No underlying device for major,minor
2018-07-15T21:58:39.376Z cpu37:33315)Vol3: 1023: Couldn't read volume header from : I/O error
2018-07-15T21:58:39.377Z cpu37:33315)Vol3: 1023: Couldn't read volume header from : I/O error
2018-07-15T21:58:39.380Z cpu37:33315)Vol3: 1023: Couldn't read volume header from naa.600304801cb841001f08f22c1296cd81:1: I/O error
2018-07-15T21:58:39.381Z cpu37:33315)Vol3: 1023: Couldn't read volume header from naa.600304801cb841001f08f22c1296cd81:1: I/O error
2018-07-15T21:58:39.381Z cpu37:33315)FSS: 5353: No FS driver claimed device 'naa.600304801cb841001f08f22c1296cd81:1': No filesystem on the device
2018-07-15T21:58:39.386Z cpu32:33526)ScsiDeviceIO: 2636: Cmd(0x43bd80c20b80) 0x1a, CmdSN 0x147 from world 0 to dev "naa.600304801cb841001f08f19809c8d99a" failed H:0x0 D:0x2 P:0x0 Val
2018-07-15T21:58:39.399Z cpu37:33315)Vol3: 2687: Could not open device 'naa.600304801cb841001f08f1ce0cfa04ce:1' for probing: No underlying device for major,minor
2018-07-15T21:58:39.400Z cpu37:33315)Vol3: 2687: Could not open device 'naa.600304801cb841001f08f1ce0cfa04ce:1' for probing: No underlying device for major,minor
2018-07-15T21:58:39.401Z cpu32:33526)ScsiDeviceIO: 2636: Cmd(0x43bd80c1da80) 0x1a, CmdSN 0x19c from world 0 to dev "naa.600304801cb841001f08f1ce0cfa04ce" failed H:0x0 D:0x2 P:0x0 Val
2018-07-15T21:58:39.402Z cpu37:33315)Vol3: 1078: Could not open device 'naa.600304801cb841001f08f1ce0cfa04ce:1' for volume open: No underlying device for major,minor
2018-07-15T21:58:39.403Z cpu37:33315)Vol3: 1078: Could not open device 'naa.600304801cb841001f08f1ce0cfa04ce:1' for volume open: No underlying device for major,minor
2018-07-15T21:58:39.403Z cpu37:33315)FSS: 5353: No FS driver claimed device 'naa.600304801cb841001f08f1ce0cfa04ce:1': No underlying device for major,minor
2018-07-15T21:58:39.404Z cpu37:33315)VC: 3551: Device rescan time 90053 msec (total number of devices 7)
2018-07-15T21:58:39.404Z cpu37:33315)VC: 3554: Filesystem probe time 35 msec (devices probed 7 of 7)
2018-07-15T21:58:39.404Z cpu37:33315)VC: 3556: Refresh open volume time 0 msec


2018-07-15T21:58:46.797Z cpu32:33315)WARNING: MemSched: 15593: Group vsanperfsvc: Requested memory limit 0 KB insufficient to support effective reservation 22436 KB
2018-07-15T21:58:46.797Z cpu32:33315)ALERT: Unable to restore Resource Pool settings for host/vim/vmvisor/vsanperfsvc. It is possible hardware or memory constraints have changed. Ple
2018-07-15T21:58:46.797Z cpu32:33315)WARNING: MemSched: 15593: Group vsanperfsvc: Requested memory limit 0 KB insufficient to support effective reservation 22436 KB
[the ALERT/WARNING pair above repeats several more times]
2018-07-15T21:58:46.836Z cpu18:34102)Loading module vmkapei ...





2018-07-15T21:58:51.789Z cpu10:34486)WARNING: lsi_mr3: mfi_Discover:339: Physical disk vmhba2:C0:T0:L0 hidden from upper layer.
2018-07-15T21:58:51.789Z cpu10:34486)WARNING: ScsiScan: 1651: Failed to add path vmhba2:C0:T0:L0 : No connection
2018-07-15T21:58:51.789Z cpu10:34486)WARNING: lsi_mr3: mfi_Discover:339: Physical disk vmhba2:C0:T1:L0 hidden from upper layer.
2018-07-15T21:58:51.789Z cpu10:34486)WARNING: ScsiScan: 1651: Failed to add path vmhba2:C0:T1:L0 : No connection
2018-07-15T21:58:51.789Z cpu10:34486)WARNING: lsi_mr3: mfi_Discover:339: Physical disk vmhba2:C0:T2:L0 hidden from upper layer.
2018-07-15T21:58:51.789Z cpu10:34486)WARNING: ScsiScan: 1651: Failed to add path vmhba2:C0:T2:L0 : No connection
2018-07-15T21:58:51.789Z cpu10:34486)WARNING: lsi_mr3: mfi_Discover:339: Physical disk vmhba2:C0:T3:L0 hidden from upper layer.
2018-07-15T21:58:51.789Z cpu10:34486)WARNING: ScsiScan: 1651: Failed to add path vmhba2:C0:T3:L0 : No connection
2018-07-15T21:58:52.346Z cpu4:34694)Config: 681: "SIOControlFlag1" = 0, Old Value: 0, (Status: 0x0)
2018-07-15T21:58:52.774Z cpu14:34849)VisorFSRam: 700: hostdstats with (0,1303,0,0,755)

vCenter Server runs on these same two servers, and the vSAN Witness appliance is hosted on one of them.

UPDATE:

I checked the cluster via RVC with vsan.disks_stats and saw the results below:

/172.16.0.10/Tehran-Datacenter/computers/Cluster-1> vsan.disks_stats .
+--------------------------------------+-------------+-------+------+------------+---------+----------+---------+
|                                      |             |       | Num  | Capacity   |         |          | Status  |
| DisplayName                          | Host        | isSSD | Comp | Total      | Used    | Reserved | Health  |
+--------------------------------------+-------------+-------+------+------------+---------+----------+---------+
| naa.600304801cb841001f08f1ce0cfa04ce | 172.16.0.11 | SSD   | 0    | 276.43 GB  | 0.00 %  | 0.00 %   | OK (v3) |
+--------------------------------------+-------------+-------+------+------------+---------+----------+---------+
| naa.600304801cb8a3001f08ea0914333933 | 172.16.0.12 | SSD   | 0    | 276.43 GB  | 0.00 %  | 0.00 %   | OK (v3) |
| naa.600304801cb8a3001f08ea8b1bef44fa | 172.16.0.12 | MD    | 56   | 1645.87 GB | 48.72 % | 4.75 %   | OK (v3) |
+--------------------------------------+-------------+-------+------+------------+---------+----------+---------+

As you can see, the first server's magnetic disk (MD) is missing from this list, so I think this disk has left the vSAN. How can I rejoin it to the vSAN?
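For reference, on ESXi 6.0 detaching and re-attaching a capacity disk is done through the esxcli vsan storage namespace. The sketch below only echoes the commands (device names taken from the output above) because "remove" destroys the vSAN components on that disk; with a corrupt redo log, that step relies entirely on the replica on the other host, so review carefully before running anything on 172.16.0.11:

```shell
#!/bin/sh
# Sketch only: the esxcli calls that detach/re-attach a capacity disk on ESXi 6.0.
# Echoed instead of executed; "remove" wipes the vSAN components on that disk.
SSD="naa.600304801cb841001f08f1ce0cfa04ce"   # cache-tier SSD of the disk group
HDD="naa.600304801cb841001f08f22c1296cd81"   # capacity HDD missing from the stats

echo "esxcli vsan storage list"                  # confirm current disk-group membership
echo "esxcli vsan storage remove -d $HDD"        # DESTRUCTIVE: drops the HDD from the group
echo "esxcli vsan storage add -s $SSD -d $HDD"   # re-add the HDD under the same cache SSD
```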

I checked the storage on the first server (172.16.0.11) with esxcli vsan storage list and saw the results below:

[root@esxi-1:/etc] esxcli vsan storage list
naa.600304801cb841001f08f1ce0cfa04ce
   Device: naa.600304801cb841001f08f1ce0cfa04ce
   Display Name: naa.600304801cb841001f08f1ce0cfa04ce
   Is SSD: true
   VSAN UUID: 521ae5f3-eac3-cfa7-e10d-01b2f379762c
   VSAN Disk Group UUID: 521ae5f3-eac3-cfa7-e10d-01b2f379762c
   VSAN Disk Group Name: naa.600304801cb841001f08f1ce0cfa04ce
   Used by this host: true
   In CMMDS: true
   On-disk format version: 3
   Deduplication: false
   Compression: false
   Checksum: 5051104294654162127
   Checksum OK: true
   Is Capacity Tier: false

naa.600304801cb841001f08f22c1296cd81
   Device: naa.600304801cb841001f08f22c1296cd81
   Display Name: naa.600304801cb841001f08f22c1296cd81
   Is SSD: false
   VSAN UUID: 52f0ac26-c7b0-8f0f-6dbb-3aeddcae32f2
   VSAN Disk Group UUID: 521ae5f3-eac3-cfa7-e10d-01b2f379762c
   VSAN Disk Group Name: naa.600304801cb841001f08f1ce0cfa04ce
   Used by this host: true
   In CMMDS: false
   On-disk format version: 3
   Deduplication: false
   Compression: false
   Checksum: 13462963856806851387
   Checksum OK: true
   Is Capacity Tier: true

As you can see, "In CMMDS" is false for the HDD, whereas I expected true, as on the other server.

ANOTHER UPDATE:

I removed vsanStorage from 172.16.0.11 and recreated it. Afterwards, running cmmds-tool find -f python | grep CONFIG_STATUS -B 4 -A 6 | grep 'uuid\|content' | grep -o 'state\\\":\ [0-9]*' | sort | uniq -c shows the results below:

 44 state\": 28
 13 state\": 7

Do you think it is possible to see the virtual machines in vsanStorage again?

The data on these VMs is very important to me.

sorosh_sabz

2 Answers


You can start recovery from backup, as that is the most straightforward and frictionless way to get your data back.

As for vSAN, consider a more reliable solution such as StarWind or HPE StoreVirtual, combined with Veeam and off-site VM backup storage. Without a 3-2-1 backup plan, you will sooner or later have no data.

A.Newgate

This doesn't answer your question, but: don't run your SSDs or HDDs in RAID 1. The storage controller requirements for vSAN are a SAS or SATA host bus adapter (HBA), or a RAID controller in passthrough mode or RAID 0 mode.
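A quick way to check what your hosts are actually seeing (a sketch, again just echoing the commands): with the controller in RAID mode, the device Model/Display Name strings show the logical RAID volume rather than the individual SSDs and HDDs.

```shell
#!/bin/sh
# Sketch: inspect what the controller presents to ESXi. In RAID mode the
# devices appear as logical volumes, not as the physical SSDs/HDDs.
CHECK_ADAPTERS="esxcli storage core adapter list"   # which driver (here lsi_mr3) claims the HBA
CHECK_DEVICES="esxcli storage core device list"     # per-device Model / Display Name / Is SSD

echo "$CHECK_ADAPTERS"
echo "$CHECK_DEVICES | grep -E 'Display Name|Model|Is SSD'"
```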

Mario Lenz