Very strange problem with VMDK thin/thick disks

Question

I have some ESXi 4.1 virtual machines that were initially created with thin-provisioned disks but were then physically moved to different datastores (all of them iSCSI). Since these are stand-alone hosts, there is no vCenter Server available to manage them, so the move was done from the ESXi command line using the mv command.

Now the VMs exhibit a rather interesting behaviour: the disk format in the VM settings is shown as "thick", the VMDK files do not have the ddb.thinProvisioned = "1" line, but the actual file size is much less than the virtual disk size. When examined via the Datastore Browser, it shows two different columns for "Size" and "Provisioned Size", just as it would do with a thin-provisioned disk.
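To make the descriptor check above concrete, here is a minimal Python sketch that reads a VMDK descriptor (the small text .vmdk file, not the -flat file), reports whether the ddb.thinProvisioned flag is present, and computes the provisioned size from the extent lines. The sample descriptor, the file name myvm-flat.vmdk and the helper parse_descriptor are made up for illustration; a real descriptor may contain more fields than this handles.

```python
import re

SECTOR_BYTES = 512  # extent sizes in a VMDK descriptor are given in 512-byte sectors

# Hypothetical descriptor contents, for illustration only.
SAMPLE_DESCRIPTOR = '''# Disk DescriptorFile
version=1
CID=fffffffe
parentCID=ffffffff
createType="vmfs"

# Extent description
RW 83886080 VMFS "myvm-flat.vmdk"

# The Disk Data Base
ddb.adapterType = "lsilogic"
ddb.virtualHWVersion = "7"
'''

# Extent line: access mode, size in sectors, extent type, quoted file name.
EXTENT_RE = re.compile(r'^(RW|RDONLY|NOACCESS)\s+(\d+)\s+(\w+)\s+"([^"]+)"')
# key=value or key = "value" lines.
KEYVAL_RE = re.compile(r'^([\w.]+)\s*=\s*"?([^"]*)"?\s*$')


def parse_descriptor(text):
    """Return (settings dict, list of extent dicts) parsed from descriptor text."""
    settings, extents = {}, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        m = EXTENT_RE.match(line)
        if m:
            access, sectors, kind, filename = m.groups()
            extents.append({
                "access": access,
                "bytes": int(sectors) * SECTOR_BYTES,
                "type": kind,
                "file": filename,
            })
            continue
        m = KEYVAL_RE.match(line)
        if m:
            settings[m.group(1)] = m.group(2)
    return settings, extents


settings, extents = parse_descriptor(SAMPLE_DESCRIPTOR)
is_thin = settings.get("ddb.thinProvisioned") == "1"
provisioned = sum(e["bytes"] for e in extents)
print("marked thin in descriptor:", is_thin)
print("provisioned size (GiB):", provisioned / 2**30)
```

Comparing the provisioned figure with what the Datastore Browser reports as "Size" shows the same mismatch described above: a disk that is not flagged as thin but occupies far less space than it is provisioned for.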

However, this doesn't seem to be a problem, as the machines are working fine.

Then, another copy of them was taken for backup purposes; this copy was also made from the command line, using the cp command, between two datastores on the same host (again, both of them iSCSI).

Then we lost the original VMs, and were in need of the backed-up ones.

And these don't work anymore, complaining about corruption of the VMDK files.

So, to recap:

VM created with thin-provisioned disk -> working
VM physically moved between two datastores -> disk shows as thick but behaves as thin, yet VM works fine
VM physically copied between two datastores -> disk behaves the same, VM doesn't work anymore

I tried manually editing the VMDK file to add the ddb.thinProvisioned = "1" line, but this didn't fix the problem. I tried inflating the virtual disk, cloning it and converting it: nothing works, every command complains about the disk being corrupted.

I'm in quite a struggle to bring these VMs up again; can someone please help?

score 4 · Accepted Answer · answered Feb 04 '12 at 22:49

Looks like physically copying thin-provisioned disks is a bad idea. VMFS does strange things with thin-provisioned disks, things that cp or mv can't cope with. Corruption ensues.

Don't do it. vmkfstools handles them much better (and more safely).
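As a rough illustration of the vmkfstools route, the sketch below only builds and prints a clone command; it does not run anything. It assumes the usual clone form of vmkfstools (-i for the source descriptor, -d for the target disk format). The datastore paths and the helper name vmkfstools_clone_cmd are hypothetical.

```python
def vmkfstools_clone_cmd(src_vmdk, dst_vmdk, disk_format="thin"):
    """Build the argv for cloning a virtual disk with vmkfstools.

    src_vmdk is the descriptor file (myvm.vmdk), not the -flat file.
    """
    return ["vmkfstools", "-i", src_vmdk, "-d", disk_format, dst_vmdk]


# Hypothetical datastore paths, for illustration only.
cmd = vmkfstools_clone_cmd(
    "/vmfs/volumes/datastore1/myvm/myvm.vmdk",
    "/vmfs/volumes/datastore2/myvm/myvm.vmdk",
)
print(" ".join(cmd))
```

The printed line is what would be typed in the ESXi shell in place of cp or mv; the destination directory has to exist beforehand.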

Massimo


the-wabbit · Answer 2 · 2011-09-14T11:43:14.460

I suspect your storage is a VMFS datastore and not NFS?

You could try copying the VMDK in question somewhere safe, where you could then work on it using vmware-vdiskmanager or vmware-mount, both of which are part of VMware Server / Workstation or, IIRC, available as a standalone download.

If all else fails, try third-party tools: qemu-img from the QEMU package or VBoxManage clonehd from VirtualBox can convert VMDKs to raw files, which you might then simply dd back to your ESX guest.
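For reference, the two conversions mentioned above would look roughly like the commands this sketch prints. It only assembles the command lines and runs nothing. The file names are hypothetical, and whether either tool can read a descriptor that is already reported as corrupted is not something this example can promise.

```python
def qemu_img_to_raw_cmd(src_vmdk, dst_raw):
    """argv for converting a VMDK to a raw image with qemu-img."""
    return ["qemu-img", "convert", "-f", "vmdk", "-O", "raw", src_vmdk, dst_raw]


def vboxmanage_to_raw_cmd(src_vmdk, dst_raw):
    """argv for the equivalent conversion with VirtualBox's VBoxManage."""
    return ["VBoxManage", "clonehd", src_vmdk, dst_raw, "--format", "RAW"]


# Hypothetical file names, for illustration only.
qemu_cmd = qemu_img_to_raw_cmd("myvm.vmdk", "myvm.raw")
vbox_cmd = vboxmanage_to_raw_cmd("myvm.vmdk", "myvm.raw")
print(" ".join(qemu_cmd))
print(" ".join(vbox_cmd))
```

Both commands should be run against a copy of the VMDK files on a workstation, never against the only remaining copy.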

There is also an alternative disk driver for Windows called "vdk" (the original author's page seems to be down, but downloads of older builds and a 64-bit build are around), which is rumored to read partially corrupted VMDK files rather well, although I have yet to try that out.

As for the possible cause of corruption, I'd speculate that the "cp" command was not the culprit. There are plenty of ways to get data corrupted - using iSCSI targets which do not support / implement SCSI reservations is one of them. Anyway, I think the best documented way for a copy process is using vmkfstools - you should stick with that.

All involved storage was VMFS on iSCSI, but a third copy was made on an NFS datastore on a Windows system, too; I didn't mention this because I know NFS (at least on Windows) doesn't support thin-provisioned disks, so I'm just assuming that's the least salvageable copy I have. — Massimo, Sep 14 '11 at 12:27
I already tried vmware-mount, too; it says the virtual disk is corrupted, too. — Massimo, Sep 14 '11 at 12:27
