I'm having a very odd issue with a single ESXi host.
I have two identical hosts, each with a Core i3, 6 NICs, and 16 GB of RAM. Four of the NICs are used for Management, vMotion, and VM Network traffic, all on different VLANs. They all go to an HP ProCurve 24-port gigabit switch in a static trunk. The other two NICs are for iSCSI.
There are two standard vSwitches (VSS): one with the four NICs, and a second with just the other two NICs, carrying iSCSI traffic.
The configuration and hardware are identical on both hosts. Both hosts are running at about 30% utilization for both CPU and memory. They are running ESXi 5.1.
What is happening is that all of a sudden host 2 will drop out of vCenter (vCenter is hosted on a physical machine). There is no error, it just loses connection.
When this happens I cannot ping the host from the vCenter server. From my workstation I can ping it most of the time, and I can SSH into it. If I run "Test Management Network" from the DCUI, the host can ping the gateway and the DNS servers. If I restart the management network, I still cannot reach the host from vCenter.
If I do a services.sh restart, it completes with no errors but doesn't help: the host still cannot register with vCenter, and vCenter still cannot ping it.
The only thing that has remedied this so far is completely restarting the host. I did a log export, but I'm not really sure what to look for at this point. What logs should I be looking at? The only other piece of information I can add is that this seems to happen at the same time of day, early in the morning. Nothing is running at that time: no backup jobs, etc.