
Today I wanted to increase the disk size of a VM, so I did what I have done before:

qemu-img resize diskimage.qcow2 +22GB

After that the image was broken and the VM no longer starts. I tried booting the VM from a CD to adjust the partitioning, but the system can no longer read the disk. Checking the image gives:

qemu-img check -r all diskimage.qcow2
tcmalloc: large alloc 389841715200 bytes == (nil) @  0x7fdb4ea66bf3 0x7fdb4ea88488 0x7fdb4e5674a6 0x7fdb50236a37 0x7fdb50236bc8 0x7fdb50237011 0x7fdb5023941e 0x7fdb5023d891 0x7fdb5027848b 0x7fdb5027c196 0x7fdb491efb35 0x7fdb5021ee4d (nil)
No errors were found on the image.

No errors? Sounds good, but virsh start vm does not work and the logs say:

2017-05-21T10:02:30.755824Z qemu-system-x86_64: -drive file=/.../diskimage.qcow2,format=qcow2,if=none,id=drive-virtio-disk0: could not open disk image /.../diskimage.qcow2: qcow2: Image is corrupt; cannot be opened read/write

I tried converting to raw but the conversion fails (exit 1):

qemu-img convert -f qcow2 diskimage.qcow2 -O raw diskimage.raw
qcow2: Image is corrupt: L2 table offset 0x2d623039326500 unaligned (L1 index: 0); further non-fatal corruption events will be suppressed
qemu-img: error while reading block status of sector 0: Input/output error

The process creates a 354334801920 byte file (much larger than it should have been with +22GB) but it is apparently unusable - when I try to convert it back to qcow2 I get a 200kB file.
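
To put those numbers in perspective, here is a quick sanity check of the two byte counts above. It is only a sketch and assumes a shell with 64-bit integer arithmetic (bash has it); `bytes_to_gib` is a made-up helper name.

```shell
# Convert a byte count into whole GiB (1 GiB = 1024^3 bytes, result truncated).
bytes_to_gib() {
    echo $(( $1 / (1024 * 1024 * 1024) ))
}

bytes_to_gib 354334801920   # raw file written by qemu-img convert -> 330
bytes_to_gib 389841715200   # allocation tcmalloc complained about  -> 363
```

So the raw file is exactly 330 GiB, and the failed allocation during the check was about 363 GiB.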

Is there a way to extract data from the qcow2 file, or mount it read-write somehow even if there is corruption? I do not have the nbd kernel module on the machine.

Ned64

2 Answers


Did you run "qemu-img resize diskimage.qcow2 +22GB" while the QEMU process was still running with the same disk open? If so, that would certainly explain the data corruption: two processes could have been writing to the qcow2 file at the same time, and if both writes required qcow2 metadata allocations, they could have corrupted the file's internal data structures.
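
A guard along these lines would avoid that situation. This is only a sketch: `safe_resize` is a hypothetical helper, it assumes the VM is managed by libvirt (so `virsh domstate` is available) and that GNU `cp` is installed, and the names `vm` and `diskimage.qcow2` are simply the ones from the question.

```shell
# Hypothetical helper: resize an image only if libvirt reports the domain as shut off.
safe_resize() {
    domain=$1
    image=$2
    size=$3

    # virsh domstate prints the current state, e.g. "running" or "shut off"
    state=$(virsh domstate "$domain") || return 1
    if [ "$state" != "shut off" ]; then
        echo "domain $domain is '$state'; not touching $image" >&2
        return 1
    fi

    # Keep a copy first so a failed resize cannot damage the only image
    cp --sparse=always "$image" "$image.bak" || return 1
    qemu-img resize "$image" "$size"
}

# Usage: safe_resize vm diskimage.qcow2 +22G
```

The check is deliberately strict: any state other than "shut off" (including a shutdown that is still in progress) aborts before the image is touched.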

The "qemu-img check" result looks very bogus. In particular, tcmalloc is complaining that it can't allocate a block of roughly 363 GiB (389841715200 bytes). It looks like qemu-img is misinterpreting this failure as success and printing the bogus message "No errors were found on the image." This is a bug you should certainly report to QEMU.

The 'convert' error looks like a follow-up to the same problem that tcmalloc hit.

Unfortunately I don't have any suggestions to fix the problem - I was just going to recommend "check -r" to try to fix it. Your only likely remaining chance is to mail qemu-devel and see if any of the qcow2 maintainers have suggestions.

DanielB
  • Thanks. I issued the resize after `virsh shutdown vm`, but thinking back, if the shutdown took a long time the VM may still have been running. Of course that was unwise, but it ought not to result in a full loss of 150GB of data?! Thanks for the recommendation with the mailing list, I might just do that. – Ned64 May 22 '17 at 18:51
  • Thanks for posting this. Not that I want to blame anyone but myself, but I'd assumed that it would have no problems resizing while the image was in use - or perhaps that it would abort with an error. It barely occurred to me that it might cause immediate disk corruption. – mwfearnley Jan 02 '21 at 14:59
  • Note that current QEMU releases have built-in protection for the qcow2 file format. QEMU acquires locks on the file in an attempt to prevent users from shooting themselves in the foot like this. – DanielB Jan 12 '21 at 10:11

Treat qcow2 corruption like a hard drive with bad blocks.

Shut down that VM.

Then do:

modprobe nbd
qemu-nbd --connect=/dev/nbd0 diskimage.qcow2
ddrescue /dev/nbd0 new_diskimage.raw
qemu-nbd --disconnect /dev/nbd0
qemu-img convert -O qcow2 new_diskimage.raw new_diskimage.qcow2

Now try to boot and pray. Hopefully it will get you as far as rescue mode, where you can run fsck on that disk.
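
If the image refuses to open read/write, the same steps can be tried with the export forced to read-only. This is a sketch under assumptions, not a guaranteed recovery: `rescue_readonly` is a hypothetical wrapper, the device and file names are placeholders, and a badly damaged image may still yield no usable data.

```shell
# Hypothetical wrapper around the steps above, with a read-only NBD export.
rescue_readonly() {
    image=$1    # damaged qcow2 file
    device=$2   # e.g. /dev/nbd0
    out=$3      # raw file to write the rescued data to

    modprobe nbd || return 1
    # --read-only keeps qemu-nbd from writing to the damaged image
    qemu-nbd --read-only --format=qcow2 --connect="$device" "$image" || return 1
    # The third argument is ddrescue's mapfile, which lets an interrupted run resume
    ddrescue "$device" "$out" "$out.map"
    status=$?
    # Always detach the device, even if ddrescue failed
    qemu-nbd --disconnect "$device"
    return $status
}

# Usage: rescue_readonly diskimage.qcow2 /dev/nbd0 new_diskimage.raw
```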

Danila Ladner
  • Thanks. I have tried that, but `qemu-nbd` fails: `qemu-nbd -c /dev/nbd1 /my/diskimage.qcow2` yields `Failed to blk_new_open '/my/diskimage.qcow2': qcow2: Image is corrupt; cannot be opened read/write` :-( In fact, it cannot be opened, period. I would settle for read-only! (`-r` does not complain, but no partition table can be found). – Ned64 Jun 09 '17 at 09:54