Manual clone/recreate virtual disk

7

I am trying to clone a virtual disk image in a fairly manual fashion. The overview of my methodology so far is as follows:

  1. Create virtual machine in VirtualBox with 120GB HDD (hypervisor and HDD size don't matter, mostly included for completeness and consistency with the rest of my question, e.g. partition sizes)
  2. Install Ubuntu 12.04.3 on virtual machine
  3. Close virtual machine
  4. Mount virtual hard disk associated with virtual machine
  5. Extract operating system files and data to store in a directory
  6. Save virtual hard disk metadata
  7. Create fresh virtual disk and restore partitions and boot information from (6)
  8. Restore data from (5) to the correct partition

The problem

My duplicated VM won't quite boot. Grub seems to copy, and appears to acknowledge my root partition (with Ubuntu installed on it). I can boot past Grub once and get a purple screen, as if Ubuntu is about to load. Then it stops. After that, I can boot into Grub, select my OS, then I get a blinking command line cursor. No input possible. I suspect there's something I'm missing in the cloning process (see below for more detail). Note: I am using grub2, not legacy.

Why are you doing this?

As part of a contractual requirement, I need to store the virtual disk in version control. Having an enormous binary blob (virtual disk) in version control is a pain, mostly for clone(git)/checkout(svn), but also for diffs. I have considered compressing to multiple files, but I need to be able to manipulate the OS/data extracted in (5) above. Note that my VCS repository still needs all the information required to build a complete VM.

Detail

Detailed instructions to reproduce what I've described:

  1. Create a VM and boot the Ubuntu Live CD
  2. Choose "Try Ubuntu"
  3. Open a terminal
  4. Create an msdos partition table: sudo parted /dev/sda mklabel msdos
  5. Create a 2GB swap partition: sudo parted /dev/sda mkpart primary linux-swap 2048s 4198399s
  6. Use the rest of the drive for the root partition: sudo parted /dev/sda mkpart primary ext4 4198400s 100%
  7. Reboot the machine, choose "Install Ubuntu"
  8. Choose the advanced partitioning option
  9. Double-click the swap partition, choose to use it as swap
  10. Double-click the root partition, choose to format it and use it for root (/) mount point

Now, perform the following to clone the disk:

# Set up some parameters
ORIG_DEV="/dev/nbd0"
ORIG_MNT=$(mktemp -d)
ORIG_IMG="orig.vdi" 
CLONE_DEV="/dev/nbd1"
CLONE_MNT=$(mktemp -d)
CLONE_IMG="clone.vdi"
qemu-img info $ORIG_IMG # save the "virtual size" output (in bytes) in the
                        # VIRT_SIZE variable in the next command
VIRT_SIZE="128849018880"

# Create the clone disk
qemu-img create -f vdi $CLONE_IMG $VIRT_SIZE

# Use qemu to make both disks accessible
modprobe nbd
qemu-nbd -c $ORIG_DEV $ORIG_IMG
qemu-nbd -c $CLONE_DEV $CLONE_IMG

# Set up the clone disk partition table and partitions
parted $CLONE_DEV mklabel msdos
parted $CLONE_DEV mkpart primary linux-swap 2048s 4198399s
parted $CLONE_DEV mkpart primary ext4 4198400s 100%

# Format the clone disk partitions and clone the UUIDs
# (braces are required: $CLONE_DEVp1 would expand an unset variable named CLONE_DEVp1)
mkswap -U "$(blkid -s UUID -o value "${ORIG_DEV}p1")" "${CLONE_DEV}p1"
mkfs.ext4 -U "$(blkid -s UUID -o value "${ORIG_DEV}p2")" "${CLONE_DEV}p2"

# Mount both disks and copy root from the original to the clone
mount "${CLONE_DEV}p2" "$CLONE_MNT"
mount "${ORIG_DEV}p2" "$ORIG_MNT"
# "src/." copies everything in the directory, including dotfiles, and copes with spaces in names
cp -a "$ORIG_MNT"/. "$CLONE_MNT"/
umount $ORIG_MNT
umount $CLONE_MNT

# Copy the boot sector and partition table from the original
dd if=$ORIG_DEV of=$CLONE_DEV bs=$((2048*512)) count=1

# Disconnect the disks
qemu-nbd -d $CLONE_DEV
qemu-nbd -d $ORIG_DEV
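
The manual step of copying the "virtual size" figure into VIRT_SIZE could probably be scripted. The sketch below assumes the human-readable output of qemu-img info contains a line of the form "virtual size: 120G (128849018880 bytes)"; the exact wording may differ between qemu versions, so treat the pattern as an assumption. The helper name virt_size_bytes is hypothetical.

```shell
#!/bin/sh
# virt_size_bytes
# Read `qemu-img info` output on stdin and print the virtual size in bytes,
# taken from the first line that looks like:
#   virtual size: 120G (128849018880 bytes)
virt_size_bytes() {
    sed -n 's/^virtual size:.*(\([0-9][0-9]*\) bytes).*/\1/p' | head -n 1
}

# Intended use in the script above (requires qemu-img):
#   VIRT_SIZE=$(qemu-img info "$ORIG_IMG" | virt_size_bytes)
```

Parsing from stdin keeps the helper independent of qemu-img itself, so it can be tried against saved output before being wired into the clone script.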

What else have you tried?

  1. grub-install --root-directory=/path/to/clone/device/boot/ /dev/clone_device. This installed Grub on the correct device, but with my host's device details. The VM would not boot.
  2. chroot into the clone disk, then grub-install. Encountered trouble because I must be able to use 64 bit hosts to clone 32 bit guests. This seems like a hopeful avenue to investigate, but I'm stuck as to how to achieve this.
  3. Mount the virtual disk, move all files off the data partition using mv, zero the data and swap partitions (dd if=/dev/zero of=/dev/nbd0p2) and compress the virtual disk (using VBoxManage modifyhd clone.vdi --compress). The disk began to expand on my host file system as this was filling it with empty space (hah!). I stopped dd when I realised this was happening, then compressed the disk image. It was still over 3GB. (I haven't tried using gzip/bzip, I'll begin to attempt this this evening. I will also attempt letting the dd wipe run to completion, but I'd prefer a less time-consuming solution, even if that works).
  4. e2image. See my other question: e2image restore file system metadata. I have not resolved this. Note that the steps I provide in the Detail section, including partition creation, formatting, and boot sector copy, but before I copy the root partition, produce a very similar sized image file to that created by e2image.
  5. Booting into another VM to chroot into this one to run grub-install. I haven't actually done this, but I've included it here in case someone suggests it. For my users, I need the recombination of the virtual machine to be scriptable; which precludes an involved setup process.
  6. Install extlinux instead of Grub. While unsuccessful, this exercise indicates that (I think!) the bootloader is successfully loading the ram disk from my partition, but gets stuck at this point.

If you've come this far, thank you already! Any suggestions for avenues of investigation, however undetailed, will be much appreciated. Thanks in advance.

mkingston

Posted 2013-09-06T17:20:24.717

Reputation: 372

+1 for the completeness; I wish I could help further though – Canadian Luke – 2013-09-08T17:46:18.843

1

The e2image man page mentions the -r parameter rather than -Q. You could try to contact its creator, Theodore Ts'o, who could surely help you out.

– harrymc – 2013-09-08T18:42:29.397

Thanks @CanadianLuke, so do I! @harrymc, I hadn't considered using e2image -r because in my experience Windows doesn't make a very good job of storing sparse files, meaning a checkout of the repository would take a lot of space if a dev wants to use Windows. However, my aforementioned "experience" with sparse files in Windows is very limited, I have barely pursued this and it certainly appears to warrant further investigation; thanks very much for the suggestion, I'll have a look into it and let you know how it goes. – mkingston – 2013-09-09T08:47:11.883

Answers

4

I have an alternate proposal that avoids the need to extract and recreate your virtual disk's contents.

If you are using git, you can directly work on the mounted virtual disk and have your .git directory somewhere else. Only thing is you probably need to have your .gitignore (if any) in the root dir of the root partition on your virtual disk.

EDIT:
For cloning, you can use the normal mechanism of VirtualBox after initial installation. Whenever you need to restore a specific version, create another clone from the original, then mount it and do a git checkout.
As long as the grub version does not differ, it's all you need to do. If grub version differs, you will need to boot the VM from your 12.04.3.iso and do a grub-install.

This way, the alternate workflow is (added new step 4, modified step 5):

  1. Create virtual machine in VirtualBox with 120GB HDD
  2. Install Ubuntu 12.04.3 on virtual machine
  3. Close virtual machine
  4. Clone virtual machine, put original aside
  5. Mount virtual hard disk of original or first clone (e.g. in /media/virtual)
  6. cd /media/virtual
  7. git --git-dir=/somewhere/else/virtual.git --work-tree=. init
  8. git --git-dir=/somewhere/else/virtual.git --work-tree=. add .
  9. git --git-dir=/somewhere/else/virtual.git --work-tree=. commit -m "Initial import"
  10. ... any other git tasks ...

If you don't want to always add --git-dir=/somewhere/else/virtual.git --work-tree=., there is a question on Stackoverflow that explains how to get rid of it: Can I store the .git folder outside the files I want tracked?
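
One possible way to avoid retyping both options is a small shell function. This is a minimal sketch, not necessarily what the linked question recommends: the name vgit is made up, and it uses the absolute mount point /media/virtual from step 5 instead of "." so it works from any directory. Git also honours the GIT_DIR and GIT_WORK_TREE environment variables, which would be an alternative.

```shell
#!/bin/sh
# Paths taken from the steps above; adjust to your setup.
VIRTUAL_GIT_DIR=/somewhere/else/virtual.git
VIRTUAL_WORK_TREE=/media/virtual

# vgit: run git against the mounted disk without repeating both options.
vgit() {
    git --git-dir="$VIRTUAL_GIT_DIR" --work-tree="$VIRTUAL_WORK_TREE" "$@"
}

# Examples:
#   vgit add .
#   vgit commit -m "Initial import"
```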

Not exactly what you asked for, but your problem description gives me the impression you are more interested in getting your job done than in the exact way of doing it.

Markus N.

Posted 2013-09-06T17:20:24.717

Reputation: 565

When writing this answer, I forgot to mention the cloning. I added this now. – Markus N. – 2013-09-15T01:29:18.033

Thank you for your answer. I apologise for the delay in responding. I may be mistaken, but I don't think this quite solves my problem. It will only store OS/data files in version control, not the VHD itself (correct me if I'm mistaken). I was trying to reconstruct the VHD because it must be in version control as well as the OS and data. As a "shell" containing basic partition information and the boot sector, it's a manageable size, but once ext4 metadata and(/or) OS files are on it, it's far too large. +1 for a clever idea in any case, thanks again. – mkingston – 2013-09-15T22:39:42.133

Yes, that's right ... it only stores files. Maybe I misunderstood you, but I thought that is what you want in order to avoid handling the whole virtual disk as blob. – Markus N. – 2013-09-15T22:46:00.470

The communication error is my own. Within the VCS repository, I need available all the information to create a working OVA. Ideally this will include a copy of the OS/data (as individual files), a very small "shell" VHD and a script to recombine the VHD and OS/data into a working VM. I think my question has the relevant information to deduce this requirement, however it is less explicit than it could be. I will update the question to reflect this. Thanks again for a clever idea and the effort you've put into answering the question, I really do appreciate it. – mkingston – 2013-09-15T22:54:18.963

I've awarded you the bounty, not explicitly because your answer solves my problem (I haven't awarded you the answer!) but because you were to receive half of it anyway, and just as a thanks for your clever thinking and your suggestion. – mkingston – 2013-09-16T17:30:47.680

3

Maybe you've already tried this, maybe you haven't. But have you tried re-installing Grub2 from a Live CD within the "duplicated VM?" Everything I read sounded like you were installing Grub2 from the host machine.

  • Once you get to the point where you have a duplicated VM that won't boot
  • Mount a Live CD such as the Ubuntu Installing disk to the VM
  • Boot the VM and hit F12 to boot from the Live CD
  • Re-install grub from the command line (inside the VM)

If you want to automate the process you could use VBoxManage to mount a custom Ubuntu Live CD that runs a script to re-install Grub2 upon startup.

VBoxManage storageattach "io" --storagectl "IDE Controller" \
--port 1 --device 0 --type dvddrive --medium debian-6.0.2.1-i386-CD-1.iso

Example Source

Hopefully not too outdated, there is a guide on help.ubuntu.com for customizing the Live CD, and a Stack Exchange question/answer on askubuntu.com about adding a startup script to a customized Live CD.

Drew Chapin

Posted 2013-09-06T17:20:24.717

Reputation: 4 839

I absolutely expect that would work; but I need to be able to execute the solution with a script (see point 5 under What else have you tried?), so that when a user transfers the repository to their machine (git clone or svn checkout) they can very easily build the vm from the constituent files. I suspect it would be possible to script this, but would probably require more work than further pursuing some of my current efforts. If you think that's incorrect I'd love to know. Thanks very much for your contribution in any case. – mkingston – 2013-09-09T08:43:05.943

I must have misunderstood what you meant by "another VM." Something that comes to mind though is using the VBoxManage command to mount the Live CD, and creating a custom Live CD that runs a script at startup to re-install Grub2. The obvious challenge here is the custom Live CD, but there are several tutorials and tools online that let you build custom Live CDs very easily. There used to be an online tool that would let you create an OpenSUSE Live CD, allowing you to select what packages you want, and upload a startup script. – Drew Chapin – 2013-09-09T13:59:58.620

If I get time, I'll look around later today and make a suggestion on how to go about making the custom Live CD. – Drew Chapin – 2013-09-09T14:00:56.157

Just wanted to mention that I've seen your update and comments and haven't had time to take any action yet. I'll do some reading/experimentation as soon as I can. Thanks for that. – mkingston – 2013-09-10T10:23:20.287

Unfortunately I wasn't able to really get around to this until today. As it turns out, I can't actually get grub to install successfully from the live CD. I think it's a problem with some of the file system metadata. Thanks for your suggestion anyway. – mkingston – 2013-09-16T17:29:20.037

3

It sounds like you may not be familiar with Ubuntu mount points. Have you considered separating the VM into multiple partitions?

From the link below:

http://www.easy-ubuntu-linux.com/ubuntu-installation-606-12.html

A basic install needs only 4 GB, plus 1 GB for swap and 2 GB for /tmp; you can then make separate partitions for /usr, /var, /home and /opt.

Just make each of them 1 GB to begin with; you can grow them dynamically with VirtualBox when needed.

reference:

http://www.ubuntugeek.com/linux-or-ubuntu-directory-structure.html

Afterwards, you can determine your VCS scope: only user files, or logs, or the OS? That should make versioning much less painful.

In most cases, you can simply make /, /usr and /opt read-only and unchangeable after setup, and thus reduce the overall versioning headache.

One thing to bear in mind: since your contract states that you need to store your virtual disk, I guess storing data extracted from the virtual disk doesn't satisfy it, but storing multiple virtual disks does. That's why I suggest creating mount points and then making some of them read-only (so there is only one version of those).

A further note: remember to check whether you can switch off recording of the file access date on the mount point. (Some very high security setups do not allow that to be turned off, for computer forensics.)

The access date is actually written to disk when access-time recording is turned on. So a virtual disk / mount point that is accessed on day 2 is a different version from the one on day 1, even if nothing was written.

reference (no access time):

https://askubuntu.com/questions/59179/how-do-i-make-noatime-mounts-default
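
To see whether access-time updates are already off for a given mount point, the mount options can be inspected. The sketch below assumes the /proc/mounts layout (device, mount point, type, options, ...); the helper name has_noatime is hypothetical.

```shell
#!/bin/sh
# has_noatime MOUNTPOINT
# Read /proc/mounts-style lines on stdin and succeed (exit 0) if MOUNTPOINT
# is mounted with the noatime option, fail (exit 1) otherwise.
has_noatime() {
    awk -v mp="$1" '
        $2 == mp && $4 ~ /(^|,)noatime(,|$)/ { found = 1 }
        END { exit found ? 0 : 1 }
    '
}

# Intended use inside the guest:
#   has_noatime / < /proc/mounts && echo "atime updates are off for /"
```

The pattern matches noatime only as a whole option, so relatime or nodiratime on their own do not count.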

user218473

Posted 2013-09-06T17:20:24.717

Reputation:

In this scenario, I would rather suggest multiple virtual disks than one virtual disk with multiple partitions. Otherwise, you cannot separate them from the view of a VCS. – Markus N. – 2013-09-15T16:03:17.410

This by itself doesn't solve the problem of transferral of large files during checkout. However, using multiple VHDs containing different partitions might do. This might be a little dependent on my release requirements (which I'm not yet that clear about). Thank you, and +1 for your suggestion. – mkingston – 2013-09-15T21:22:45.240

0

One more thought on that: I would actually look into /etc/fstab. How are your partitions defined in there? If they are defined by their UUID, then perhaps there is a chance that this information doesn't get replicated by the cloning process. You may want to define them by device name instead, like /dev/sda1 for example.
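
A quick way to check this is to list the UUIDs that fstab refers to and compare them by eye with what blkid reports for the clone's partitions. This is a sketch only; the helper name fstab_uuids is made up, and it assumes unquoted UUID=... entries as written by the Ubuntu installer.

```shell
#!/bin/sh
# fstab_uuids
# Read an fstab on stdin and print "UUID MOUNTPOINT" for every non-comment
# entry whose device field has the form UUID=...
fstab_uuids() {
    awk '$1 ~ /^UUID=/ { sub(/^UUID=/, "", $1); print $1, $2 }'
}

# Intended use with the clone mounted (variables from the question's script):
#   fstab_uuids < "$CLONE_MNT/etc/fstab"
#   blkid "${CLONE_DEV}p1" "${CLONE_DEV}p2"
```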

One other point: why not run git from inside the virtual machine itself? I can think of having a directory on the host that holds your repository. The virtual machine can then mount this directory and run git itself.

Bichoy

Posted 2013-09-06T17:20:24.717

Reputation: 238

Thank you for your suggestion. I do indeed clone the UUIDs, and have confirmed that fstab references these instead of device names. Unfortunately, the VHD itself needs to be in the VCS repository. Unless I've misunderstood your answer, running git from inside the VM doesn't mitigate the problem arising from clone/checkout/diff of large binary files in the repository. – mkingston – 2013-09-15T21:30:50.370