
I'm planning a small standalone ESXi server to host a number of VMs I use here. I haven't decided which facilities to use from the wider vSphere system. The underlying VM storage is local HDD RAID using enterprise HDDs and an LSI MegaRAID card, with the card's onboard battery-backed RAM and SSD caching enabled.

My concern relates to data corruption and bit rot in the VM store over time - I don't really know what my options are, and I'd like to be sure that stored VMDKs and snapshots, and other VM files, don't get corrupted over time and can be set to be periodically scrubbed and any bit-level corruption (within reason) detected and repaired.

As background, for casual desktop use, I've tended to use RAID 1 (mirroring) rather than higher levels (reasons: fast read speeds, complete portability of drives without tie-in to specific brands or cards, no disruption if a drive fails). For my file server I use ZFS on a mirrored volume. But ESXi and VMware's suite use their own datastore design for local storage, so I don't know how resilient a setup would be against silent corruption "out of the box", especially when it holds many TBs of large files that might sometimes only be accessed years later, and with a local store rather than a dedicated separate storage system. I also gather that VMFS is a journaled file system, but not one with the self-correcting capability of ZFS.

Are the inbuilt capabilities of ESXi (and if necessary other parts of their suite) sufficient to protect against routine data corruption concerns? If not, what are my options for peace of mind?

Update

@mzhaase - I didn't feel confident about passing through to a second server that would act as a file store, because then every file access and snapshot has to be done remotely across a LAN or a second device. Even if 10G was used (which is still cost-prohibitive for most home setups), the slowdown would be a major concern.

Part of the whole reason for getting this specific card is to get true cache-on-write for speed, so that bulk writes or rollbacks are less likely to slow everything down by "chugging" the main HDDs, which should be helpful whatever the file store location. Latency problems sound like they would also happen with any remote data store, whether a server appliance or a home build such as a second FreeNAS box (although if I had to choose, I'd use a second FreeNAS).

I'm perhaps overlooking using a dedicated NIC port and multiple parallel 1Gb ports as a way round this, but latency and traffic implications for snapshots and rollbacks are a big concern. I'm also possibly overlooking running a FreeNAS VM on a small dedicated disk, which serves the main VM store array off the RAID card as a passthrough device, which keeps it local. (Meaning that ESXi can boot and load the FreeNAS VM off one disk, and once that's running it can act as a ZFS-based file server for any other VMs with, hopefully, low latency.) But running the file server virtualized might increase latency more than keeping it local would reduce it, and latency and disk bottlenecks are already an issue I'm trying to overcome.

I will, however, look up the LSI card info. Also, can you install file integrity checking/repairing software on the underlying ESXi platform itself to check and repair VM files? I didn't know that. And would iSCSI add so much latency that a remote store becomes unusable? Once a VM is up and running, how much does access speed/latency to the VM store affect the running of ESXi or other VMs currently running on it?

Stilez

1 Answer


ESXi by itself will not do this. You have a few options here: install third-party software to monitor file integrity; get a SAN that has file integrity monitoring, for example one built on ZFS; or rely on the LSI card's 'data protection' capability. As mentioned here, this is exactly what 'data protection' is supposed to do.

You could also pass the drives through to FreeNAS and then expose the FreeNAS storage to ESXi via iSCSI, but then the LSI card was a waste of money.
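To illustrate the "monitor file integrity" option, here is a minimal sketch of a checksum manifest in Python. It is not an ESXi or VMware feature; the function names (`sha256_of`, `build_manifest`, `verify`) are hypothetical, and it assumes it runs somewhere that can read the VM files as ordinary files. It only detects corruption; repair still needs a known-good copy (backup or mirror). It is also only meaningful for files that should not change, such as disks of powered-off VMs, since a running VM rewrites its VMDKs constantly.

```python
import hashlib
from pathlib import Path

CHUNK = 1024 * 1024  # read 1 MiB at a time so multi-GB disk files don't fill RAM


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for block in iter(lambda: fh.read(CHUNK), b""):
            digest.update(block)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict:
    """Map each file under root (by relative path) to its checksum."""
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root: Path, manifest: dict) -> list:
    """Re-hash every file in the manifest and list the ones that differ."""
    problems = []
    for rel, expected in manifest.items():
        p = root / rel
        if not p.is_file():
            problems.append((rel, "missing"))
        elif sha256_of(p) != expected:
            problems.append((rel, "checksum mismatch"))
    return problems
```

In practice you would build the manifest once while the files are known to be good, store it somewhere other than the datastore being checked, and run `verify` on a schedule as a poor man's scrub.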

mzhaase
  • Question updated with comments on this (which might be a bit of a brain-dump, so I hope they make sense!) – Stilez Aug 11 '16 at 16:23