
I have the following environment:

  • VMware ESXi 5.5 U3 free, with datastore using VAAI
  • Linux VM
  • CentOS 6.8, 2 thin provisioned disks (PVSCSI thin VMDKs)
  • 1st disk has the reference OS partitioning (a boot partition, then LVM with the usual root/home/swap LVs)
  • 1st disk is a 100GB thin provisioned VMDK (about 12GB physical)
  • 2nd disk is a 2TB thin provisioned VMDK that I added
  • /opt folder is currently empty
  • OS was recently installed, so no cruft

My intention is to use the second disk to host the /opt mountpoint, as I will be adding data there that may need to grow in the future.

The problem is adding the second disk via the GUI LVM manager. The LVM manager sees the disk fine. Adding the disk to the existing volume group, or to a new one, seems to work fine. Adding a new logical volume without a filesystem also works without causing the underlying VMDK file to expand.
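For reference, the command-line equivalent of what the GUI does up to this point would look roughly like the sketch below (the device name /dev/sdb and the vg_opt/lv_opt names are illustrative assumptions, not necessarily what my system uses):

    # mark the new disk as an LVM physical volume
    pvcreate /dev/sdb

    # either extend an existing volume group (vgextend <existing_vg> /dev/sdb)
    # or, as here, create a dedicated one for /opt
    vgcreate vg_opt /dev/sdb

    # create the logical volume; no filesystem is written yet,
    # so the thin VMDK does not inflate at this stage
    lvcreate -n lv_opt -l 100%FREE vg_opt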

But the moment I try to configure an ext4 filesystem and hit OK, the LVM manager GUI stops responding, and the underlying VMDK file starts to expand. By all appearances it is doing a full format of some kind, writing across the thin provisioned disk and causing it to inflate.

Is there a way to avoid increasing the physical size of the underlying VMDK file when adding the LV? And is the expansion due to a full format, or is LVM trying to squeeze the /opt LV into the middle of the filesystem, effectively copying OS data to the end of the partition (which would imply that if I simply wait, it will eventually finish without fully inflating the disk)?

update-1

Did some experiments. For a 2TB thin provisioned VMDK, the ext4 format is both slow and caused approximately 32GB of inflation, so at least it is not full inflation. With the VM stopped, a punchzero operation using vmkfstools recovered approximately 31GB and shrank the VMDK; following that up with a VAAI UNMAP also completely frees the 31GB from the backend datastore storage. The cleanup actions are IO/IOPS intensive, so prepare for datastore pain. It is a lot of manual steps, but may be worth it if you have a throwaway datastore and are building VM templates.
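Roughly, the cleanup sequence looks like this, run from the ESXi shell with the VM powered off (the datastore name and VMDK path are placeholders; adjust to your environment):

    # deallocate zeroed regions of the thin VMDK (slow: it has to read all allocated blocks)
    vmkfstools --punchzero /vmfs/volumes/datastore1/myvm/myvm_1.vmdk

    # release the freed VMFS blocks back to the VAAI backend storage;
    # the default is 200 x 1MB blocks per UNMAP cycle, bumped to 5000 here to speed things up
    esxcli storage vmfs unmap -l datastore1 -n 5000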

A partial solution may be a sparse_super format of the LVM LV, since (as I understand it) that lets ext4 do more of the format work on the fly as blocks are written, but that just postpones the pain, and there is no way to do it from the LVM GUI.
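From a shell, that format would be done directly against the LV device with mkfs.ext4. A minimal sketch, reusing the assumed vg_opt/lv_opt names from above, and assuming the e2fsprogs and kernel in use support the lazy init extended options (which are a separate mke2fs mechanism from sparse_super for deferring the zero writes):

    # -O sparse_super stores backup superblocks in only some block groups rather than all of them;
    # lazy_itable_init/lazy_journal_init (if supported) defer zeroing of the inode tables and
    # journal until after the filesystem is mounted, avoiding most of the up-front writes
    mkfs.ext4 -O sparse_super -E lazy_itable_init=1,lazy_journal_init=1 /dev/vg_opt/lv_opt

    # mount it at /opt (and add a matching /etc/fstab entry for persistence)
    mount /dev/vg_opt/lv_opt /opt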

  • What do you mean by "configure an ext4 filesystem and hit OK"? This doesn't make sense. – Michael Hampton Oct 22 '16 at 17:29
  • LVM manager GUI allows selecting an area within a configured volume group to be used for a new logical volume. The new logical volume configuration dialog box allows setting the name of the LV, the amount of space it will use, the filesystem to be used within the LV, the mountpoint it will be attached to, and whether it will be mounted now and on reboot. The filesystem selector is a dropdown, with the choice of none, or other supported filesystems. I am choosing ext4 to match the existing filesystems of the preexisting LVs for swap, /home, and the filesystem root. – Asteroza Oct 23 '16 at 01:08
  • OK, what GUI is this? Where did you find it? How did it get onto the system? It's extremely unusual to be doing anything like this from a GUI. – Michael Hampton Oct 23 '16 at 01:46
  • CentOS standard system management GUI for LVM, though I think in typical default installs it isn't included for some reason. It shows under the GNOME menus as System:Administration:Logical Volume Management. The process name shows as system-config-lvm, which I believe is also the yum package name. – Asteroza Oct 23 '16 at 03:16
  • What happens when you run `fstrim` on the newly-created filesystem, does your kernel propagate the discards to VMware? – Josip Rodin Oct 23 '16 at 19:49
  • That is unfortunately a known problem with Linux VMs on VMware, as VMware does only SCSI 2 commands, and Linux will only send SCSI UNMAP commands to SCSI 5 compliant hardware (which means it would only send that to remote storage it was directly attached to, not intermediated by VMware). Direct guest initiated UNMAP also seems to only be correctly supported on VMware ESXi 6.0 regardless. I could run an experiment to see how far it would inflate, but the VAAI storage is slow, and having to run a zeroing operation within the VM and an unmap operation at the datastore level will take a long time. – Asteroza Oct 23 '16 at 20:55
  • @Asteroza Have you tried using the Ext4 `sparse_super` option to reduce the number of superblocks written across the thin LV? Or perhaps another kind of a filesystem, perhaps XFS? – Josip Rodin Oct 24 '16 at 14:41
  • As I understand it, sparse_super is simply delaying the inevitable in terms of formatting a block device for ext4, as it ends up doing partial format work on the fly as blocks are written. The descriptions seem to indicate the intention is to allow faster initial usage, rather than waiting for a format to finish. – Asteroza Oct 26 '16 at 00:09
  • Based on some experiments, I have determined that (at least in my environment) the GUI initiated ext4 format is not a full format. It did not cause full VMDK file inflation. However, it did cause approximately 32GB of inflation (it was also quite slow, but that may be related to the specific environment and may be faster for others). The light at the end of the tunnel is that with the VM stopped, running the ESXi CLI command `vmkfstools --punchzero test.vmdk` recovered about 31GB. This is also a slow process, as it needs to do a full read of at least the 32GB. – Asteroza Oct 26 '16 at 00:38
  • For VAAI users, you would also have to follow up with an UNMAP such as `esxcli storage vmfs unmap -l datastore1 -n 5000` as the backend storage has not yet been freed. Doing punchzero and UNMAP are heavy operations that should be done during a maintenance window, and UNMAP block operation size should be fitted to your environment (default is 200 1MB VMFS blocks per SCSI UNMAP cycle within the full unmap operation, here I used 5000 to speed it up a bit). – Asteroza Oct 26 '16 at 00:45
