
I ran into an interesting issue. We have a 3-node vSAN cluster (all three nodes contribute compute and storage); call the nodes esxi01, esxi02, and esxi03. Users reported errors, and on investigating I noticed the following:

  1. vCenter was unavailable
  2. Host esxi01 was completely hung
  3. I was able to log in to esxi02/03 directly, and a chunk of the VMs... our vCenter VM was showing as invalid. After I unregistered and re-registered it, it now shows its name (vcenter-server).
  4. I shut down esxi01/02/03 and started the servers back up.

At this point esxi02 and esxi03 have formed a vSAN cluster and esxi01 is in a cluster of its own (per esxcli vsan cluster get). I tried leaving the cluster on esxi01 (esxcli vsan cluster leave) and rejoining (esxcli vsan cluster join -u <Sub-Cluster UUID from the esxi02/03 cluster>). The join command does not fail, but when I run esxcli vsan cluster get on esxi01 afterwards, it shows itself as the sub-cluster master with only itself in the cluster.
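To make the symptom concrete, here is a minimal Python sketch of how the `esxcli vsan cluster get` output from each host could be compared to confirm the partition. It assumes the output has been captured as text and uses the `Key: Value` field names the command prints (`Local Node State`, `Sub-Cluster UUID`, `Sub-Cluster Member UUIDs`). The sample text, the UUID strings, and the helper names `parse_cluster_get` and `group_by_members` are made-up placeholders, not real output from these hosts.

```python
def parse_cluster_get(text):
    """Parse 'Key: Value' lines from captured `esxcli vsan cluster get` output."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(": ")
        if sep:  # skip lines without a 'Key: Value' shape, e.g. the heading
            info[key] = value.strip()
    return info


def group_by_members(outputs):
    """Group host names by the set of member UUIDs each host reports."""
    groups = {}
    for host, text in outputs.items():
        info = parse_cluster_get(text)
        raw = info.get("Sub-Cluster Member UUIDs", "")
        members = frozenset(m.strip() for m in raw.split(",") if m.strip())
        groups.setdefault(members, []).append(host)
    return groups


# Placeholder output: same Sub-Cluster UUID everywhere, different member lists.
SAMPLE_ESXI01 = """Cluster Information
   Enabled: true
   Local Node UUID: uuid-esxi01
   Local Node State: MASTER
   Sub-Cluster UUID: uuid-cluster
   Sub-Cluster Member UUIDs: uuid-esxi01
"""

SAMPLE_ESXI02 = """Cluster Information
   Enabled: true
   Local Node UUID: uuid-esxi02
   Local Node State: MASTER
   Sub-Cluster UUID: uuid-cluster
   Sub-Cluster Member UUIDs: uuid-esxi02, uuid-esxi03
"""

SAMPLE_ESXI03 = """Cluster Information
   Enabled: true
   Local Node UUID: uuid-esxi03
   Local Node State: BACKUP
   Sub-Cluster UUID: uuid-cluster
   Sub-Cluster Member UUIDs: uuid-esxi02, uuid-esxi03
"""

partitions = group_by_members({
    "esxi01": SAMPLE_ESXI01,
    "esxi02": SAMPLE_ESXI02,
    "esxi03": SAMPLE_ESXI03,
})

for members, hosts in partitions.items():
    print(sorted(hosts), "see members:", sorted(members))
```

More than one group in `partitions` means the hosts disagree about membership, which is what I see here: esxi01 reports the same Sub-Cluster UUID after the join but lists only itself as a member.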

I have verified that there is no firewall in between blocking traffic, all NICs are online, and the vmk for vSAN traffic can communicate between all three hosts. I also ran tcpdump on esxi01 and can see traffic on port 12321.

Any thoughts on what could be causing this?

IT_User
