
I've implemented vSAN in my home lab and I'm trying to understand why I'm receiving a failed "Disk Space Utilization" alert.

The cluster consists of a pair of servers and a Witness appliance. Each of the two servers has a 500 GB SSD and a 6 TB SATA drive. The SATA drives show a capacity of 5.46 TB, and the total raw capacity of the vSAN Datastore is reported as 10.81 TB. Everything was healthy when I set up vSAN (well, except for the hardware compatibility checks, but as I said, this is a home lab).

After adding a fair amount of data to a thin-provisioned VM, I received the Disk Space Utilization alert. The Summary tab on the Datastore reports 7.29 TB of 10.81 TB used, which I take to mean that the actual raw storage taken by my VMs (which all use thin disks) is 7.29 TB. I'm using the default Storage Policy, so I think 7.29 TB is twice what the VMs would consume without vSAN (i.e. RAID 1), which works out to roughly 3.64 TB on each host. However, the alert says I am at 134% utilization (7465 GB of 5533 GB). What's going on here?

Here are some screenshots of my setup and the alert:

Datastore Summary

Storage Policy used by my VMs

Disk Space Utilization Alert

Note that the Cluster warning in the last screenshot is complaining about Disk Balance, which I am also troubleshooting but believe is unrelated to this issue.

Mario Lenz
stdout

2 Answers


I'm not familiar with this product, but it says "number of disk failures to tolerate" is 1. The only way to do that in a two-disk system is to keep two copies. Therefore, whatever you store will take twice as much space.

longneck
  • Yes, sorry, maybe my question wasn't clear about the base amount of storage consumed versus what vSAN will require. I.e. normally the VMs should require 3.64 TB, and with the vSAN policy of RAID-1 they should require 7.29 TB, but spread across the two nodes. I updated my question to make that clearer. – stdout Dec 30 '16 at 16:07
  • Maybe that "Disk space utilization" figure represents the storage that would be required if it were not thin provisioned? – longneck Dec 30 '16 at 16:51

Ok, after stumbling upon this I think I know what's going on (sorry for the Google webcache link but the VMware forums are down right now for maintenance).

With the Storage Policy I've told vSAN to tolerate one failure, which of course means keeping two copies of the data (with the default failure tolerance method, that is). To vSAN, "tolerate" apparently means still maintaining two copies of the data even after a host fails (so really RAID 1 + spare). I guess that is nice if you have several vSAN hosts, but with only two hosts it appears that the health check tries to make sure there is enough capacity to put two copies of the data on a single host. That seems odd, and it requires that you stay below 50% of your usable capacity (below 25% of your raw capacity) or the warning will trigger.
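As a rough sanity check, the alert figures can be reproduced with a few lines of Python. This is only a sketch of my reading of the health check, not its documented formula: it assumes the check divides the space currently used (both mirror copies) by the raw capacity that would remain after losing one host. The function name and the 1024 GB-per-TB conversion are my own assumptions.

```python
# Figures are taken from the question; the interpretation of the
# health check is an assumption, not documented vSAN behaviour.
GB_PER_TB = 1024  # assumed conversion used by the UI

raw_capacity_tb = 10.81  # total raw capacity of the vSAN datastore
used_tb = 7.29           # used space shown on the datastore Summary tab
data_hosts = 2           # hosts contributing storage (the Witness holds no VM data)


def utilization_after_host_failure(used_tb, raw_capacity_tb, data_hosts):
    """Used space as a fraction of the capacity left if one data host is lost."""
    remaining_tb = raw_capacity_tb * (data_hosts - 1) / data_hosts
    return used_tb / remaining_tb


remaining_gb = raw_capacity_tb * (data_hosts - 1) / data_hosts * GB_PER_TB
ratio = utilization_after_host_failure(used_tb, raw_capacity_tb, data_hosts)

print(f"used:                     {used_tb * GB_PER_TB:.0f} GB")
print(f"left after one host lost: {remaining_gb:.0f} GB")
print(f"utilization:              {ratio:.1%}")
```

With these inputs the ratio comes out a little under 135%, in line with the 134% in the alert. The remaining-capacity figure lands within a couple of GB of the 5533 GB shown, which I assume is down to rounding in the TB values the UI reports.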

I'm willing to accept that there is only one copy of my data if one of my two hosts goes down, so my solution was to disable the vSAN Health Checks. Which is not fantastic, but I won't abide a red X on my Cluster all the time. That's no way to live.

Note the docs do say:

When the fault tolerance method is mirroring: to tolerate "n" failures, "n+1" copies of the object are created and "2n+1" hosts contributing to storage are required...

I didn't think that was applicable to a two-node vSAN cluster, but it is, with the +1 being the Witness Appliance.
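The quoted rule is easy to spell out for the mirroring case. The helper below is a hypothetical illustration of the "n+1 copies, 2n+1 hosts" arithmetic from the docs, not a vSAN API:

```python
def mirroring_requirements(failures_to_tolerate):
    """Copies and hosts needed under mirroring, per the quoted rule."""
    n = failures_to_tolerate
    return {"copies": n + 1, "hosts": 2 * n + 1}


# Tolerating one failure: two copies of each object and three hosts,
# which in a two-node cluster means the two servers plus the Witness.
req = mirroring_requirements(1)
print(req)
```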

stdout