
I'm not quite sure how to phrase this question (hence the poor title), so let me provide an example of what I'm trying to do.

On my (old) Xen host, I'm able to present LVM filesystems directly to each guest. These filesystems are created and formatted on the host and passed directly through. E.g., for one of my guests using separate tmp and swap partitions, I define the storage like this:

disk = [
'phy:/dev/vg1/guest1-swap,sda1,w',
'phy:/dev/vg1/guest1-disk,sda2,w',
'phy:/dev/vg1/guest1-tmp,sda3,w',
]

So, guest1-swap is formatted as a swap partition, guest1-disk and guest1-tmp are formatted with ext4, and from the guest's perspective it simply sees them as three formatted partitions under /dev/sda.

(This may sound like a lot of work, but there are provisioning scripts, such as the awesome xen-tools, that automate pretty much everything.)

This provides some really useful capabilities, two of which I'm especially interested in figuring out for KVM:

  • Mount the guest filesystems from the host OS. I can do a read-only mount of any guest filesystem at any time, even while the guest is running. This has the side benefit of allowing me to create LVM snapshots of any existing volume while the guest is running. This way, I'm able to centrally back up all my guests, while running, from the host.

  • Online volume resizing. Because the volumes contain standard Linux filesystems, I can use a combination of lvextend and resize2fs to grow my guest filesystems, again while they're online.
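For concreteness, those two workflows look roughly like this on the Xen host (a sketch only; the volume names, snapshot size, and mount/backup paths are illustrative, and all of this needs root):

```shell
# Snapshot-based backup of a running guest's filesystem.
lvcreate --snapshot --size 1G --name guest1-disk-snap /dev/vg1/guest1-disk
mount -o ro /dev/vg1/guest1-disk-snap /mnt/guest1
tar -czf /backups/guest1-disk.tar.gz -C /mnt/guest1 .
umount /mnt/guest1
lvremove -f /dev/vg1/guest1-disk-snap

# Online grow of a guest filesystem (ext4 supports growing while mounted).
lvextend --size +5G /dev/vg1/guest1-disk
resize2fs /dev/vg1/guest1-disk
```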

I'm currently setting up a KVM host that will replace the Xen host. Similar to the Xen setup, I'm leveraging LVM to provide direct filesystem access, but KVM/qemu behaves differently in that it always treats the guest's storage as a whole-disk image, even when it's backed by an LVM volume. From the guest's perspective, it sees an unpartitioned disk, and it's up to the guest to apply a partition label and then create the partitions and filesystems.

From a guest perspective that's fine, but from a server/management perspective it seems far less flexible than the Xen setup I described. I'm still new to KVM, so I may (hopefully) be missing something.

I ran into this problem when trying to re-implement my former backup solution on the KVM host: the mount command choked when I tried to mount one of the guest's filesystems. Addressing that is my current concern, but it also made me worry about resizing, because I'm sure that issue will come up at some point as well.

So, here are my questions:

  1. Is there any way to have kvm/qemu use LVM volume filesystems directly as I described for my Xen setup? I use libvirt for management if that makes a difference.

  2. If not, what can I do to get similar mounting/backup functionality under KVM? I've seen discussions about using libguestfs w/ FUSE to do this, but is that really the best option? I'd prefer to stick with a native filesystem mount if at all possible.

  3. Also if not, is it possible to do an online filesystem resize under KVM? I've found several discussions/howtos about this, but the answers seem to be all over the place with no clear, and definitely no straightforward, solutions.

Sorry for the long post; I just wanted to make sure it was clear. Please let me know if I can provide any other info that would be helpful. Looking forward to the discussion. :-)

Jared
  • I just logged in to set a bounty on my version of this question: http://serverfault.com/questions/409543/kvm-booting-off-image-kernel-and-existing-partition. Let's see if you save me 50 points :) – Bittrance Jul 23 '12 at 07:42

4 Answers

  1. qemu-kvm can use LVs as virtual disks instead of files. This is actually quite a common use case.
  2. libguestfs (also look at the set of virt-* tools) can provide access to guest filesystems in a cleaner way than anything you mount on the host directly, though both are possible.
  3. Online FS resizing is not a feature of KVM but something the guest OS should be capable of. resize2fs will work in a VM just as it does on physical hardware; the only problem is the guest re-detecting the size change. Try virt-resize as the standard tool, but lvresize and qemu-img can also easily be used (though in offline mode, usually requiring a guest restart).

I think lvresize with resize2fs will actually work without a guest restart, but I haven't tried it yet.
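For point 1, the libvirt side is just a block-type disk element in the domain XML; something like this (a sketch using the volume names from the question; adjust the VG and target device to taste):

```xml
<disk type='block' device='disk'>
  <driver name='qemu' type='raw'/>
  <source dev='/dev/vg1/guest1-disk'/>
  <target dev='vda' bus='virtio'/>
</disk>
```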

dyasny
  • Thanks for the reply. "qemu-kvm can use LVs as virtual disks instead of files." Do you know if this is true for libvirt/virsh as well? I have seen some things alluding to doing this with qemu (though nothing definite), but nothing for libvirt, which I'm using for domain management. – Jared Jul 23 '12 at 08:33
  • 1
    qemu doesn't really care whether you provide a block device or a file as the backing store for the virtual disk. block devs are actually better because this way qemu reaches the actual blocks faster than by going through a filesystem. libvirt is not amazing at managing storage, but it does support LVM based block access, a bit hard to get to through `virsh` but easy enough through `virt-manager`. The more serious systems like RHEV/oVirt actually use LVM all the time for FC/iSCSI based storage – dyasny Jul 23 '12 at 08:41
  • @Jared: libvirt/virsh definitely does support this; we use it for all our VM storage. – womble Jul 23 '12 at 11:03
  • dyasny, womble - appreciate the comments, but I still can't get this to work. Even tried manually editing the domain config XML based on the libvirt [reference](http://libvirt.org/formatdomain.html#elementsDisks), but I can't get the machine to boot when using a root filesystem as I've described. The best I've done is use `attach-disk` to connect it dynamically, but this is not permanent and I can't get it to work for /. Can you point to any documentation for this, or provide specific hints? Thanks! – Jared Jul 23 '12 at 14:20
  • what errors are you seeing? Does the VM boot at all? I would start the VM with a liveCD ISO attached and investigate what it sees and what it doesn't see; the culprit is probably the fact that the disk interfaces changed from Xen's /dev/xvdX to kvm's /dev/vdX (unless you chose IDE, in which case it's /dev/hdX. If you did, don't :) ) – dyasny Jul 23 '12 at 17:45
  • It just doesn't boot - no kernel messages, no display, nothing, and I get a few messages like "daemonStreamEvent:237 : stream had unexpected termination" in the libvirt log file. If you've done this before, can you please share how? Or point to relevant documentation? The virsh/virt-manager stuff doesn't matter; I'll be happy hand-editing the config file. I just want to see if/how it actually works. – Jared Jul 24 '12 at 02:46
  • Have you tried booting the kvm VM with a liveCD ISO attached and investigating the reasons why it doesn't boot? As I said, you probably have a change in the disk access paths, and you probably need to replace the kernel if you were using a Xen kernel before. If you have a second host, I'd suggest you simply use virt-v2v or virt-p2v to automate this process – dyasny Jul 24 '12 at 08:22

I use qemu-kvm+libvirt with exactly the configuration you're asking about, for the reasons you listed, but additionally because I get much better performance without the KVM host's filesystem layer in scope. If you add the VG as a 'storage pool' in virt-manager, you can create such VMs using its user-friendly wizard. (But I just write the XML by hand these days using an existing VM as a template).

Here's sanitised output of 'virsh dumpxml' for one of my guests:

<domain type='kvm'>
  <name>somevm</name>
  <uuid>f173d3b5-704c-909e-b597-c5a823ad48c9</uuid>
  <description>Windows Server 2008 R2</description>
  <memory unit='KiB'>4194304</memory>
  <currentMemory unit='KiB'>4194304</currentMemory>
  <vcpu placement='static'>2</vcpu>
  <os>
    <type arch='x86_64' machine='pc-1.1'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <pae/>
  </features>
  <cpu mode='custom' match='exact'>
    <model fallback='allow'>Nehalem</model>
    <vendor>Intel</vendor>
    <feature policy='require' name='tm2'/>
    <feature policy='require' name='est'/>
    <feature policy='require' name='monitor'/>
    <feature policy='require' name='smx'/>
    <feature policy='require' name='ss'/>
    <feature policy='require' name='vme'/>
    <feature policy='require' name='dtes64'/>
    <feature policy='require' name='rdtscp'/>
    <feature policy='require' name='ht'/>
    <feature policy='require' name='ds'/>
    <feature policy='require' name='pbe'/>
    <feature policy='require' name='tm'/>
    <feature policy='require' name='pdcm'/>
    <feature policy='require' name='vmx'/>
    <feature policy='require' name='ds_cpl'/>
    <feature policy='require' name='xtpr'/>
    <feature policy='require' name='acpi'/>
  </cpu>
  <clock offset='localtime'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/bin/kvm</emulator>
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw'/>
      <source dev='/dev/vg1/somevm'/>
      <target dev='hda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </disk>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <target dev='hdc' bus='ide'/>
      <readonly/>
      <address type='drive' controller='0' bus='1' target='0' unit='0'/>
    </disk>
    <controller type='usb' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <controller type='ide' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'/>
    <interface type='bridge'>
      <mac address='00:00:00:00:00:00'/>
      <source bridge='br0'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </interface>
    <serial type='pty'>
      <target port='0'/>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
    </console>
    <input type='tablet' bus='usb'/>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
    <graphics type='vnc' port='-1' autoport='yes'/>
    <video>
      <model type='vga' vram='9216' heads='1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </memballoon>
  </devices>
  <seclabel type='none' model='none'/>
</domain>

Another thought (not relevant to your question, but it might help): if you can, make sure you're using the 'paravirtualised' network, block, random-number, clock, etc. drivers - they're significantly faster than the fully virtualised ones. This is the "model=virtio" stuff above. You have to load driver modules such as virtio_net into the guest's kernel.

Here is output of 'virsh pool-dumpxml vg1':

<pool type='logical'>
  <name>vg1</name>
  <uuid>9e26648e-64bc-9221-835f-140f6def0556</uuid>
  <capacity unit='bytes'>3000613470208</capacity>
  <allocation unit='bytes'>1824287358976</allocation>
  <available unit='bytes'>1176326111232</available>
  <source>
    <device path='/dev/md1'/>
    <name>vg1</name>
    <format type='lvm2'/>
  </source>
  <target>
    <path>/dev/vg1</path>
    <permissions>
      <mode>0700</mode>
    </permissions>
  </target>
</pool>
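With the pool defined like that, new guest volumes can be carved out through libvirt instead of raw lvcreate; a sketch (the volume name and size here are examples, and these need a running libvirtd):

```shell
# Re-scan the VG so libvirt sees any LVs created outside it.
virsh pool-refresh vg1
# Create a new 20G LV in the pool for a guest.
virsh vol-create-as vg1 guest1-disk 20G
# Confirm it shows up.
virsh vol-list vg1
```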

I don’t know of a way of exactly replicating the Xen behaviour you describe. However, you can use kpartx to expose the partitions within an LV that contains a whole-disk image as block devices on the host, which you can then mount, etc.
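A sketch of that workflow (the mapper names are illustrative; kpartx derives them from the LV name and partition number, so check its output):

```shell
# Map the partitions inside the LV to /dev/mapper entries.
kpartx -av /dev/vg1/guest1-disk
# e.g. might create /dev/mapper/vg1-guest1--disk1, ...2, etc.
mount -o ro /dev/mapper/vg1-guest1--disk1 /mnt/guest1
# ...back up or inspect the filesystem...
umount /mnt/guest1
# Tear the mappings down again.
kpartx -dv /dev/vg1/guest1-disk
```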

  • Thanks for the comment, Richard. I've actually come across that option already, as well as losetup, which works similarly. The issue there is that I have to shutdown the guest first in order to mount its filesystems from the host. If I try to mount read-only, if complains about filesystem corruption, wants to run a fsck, and then aborts because it's read-only. I haven't tried mounting it read-write, because that could well _cause_ corruption. This is a great tip for anyone wanting to do this with qemu images in general, though, without the online requirement. – Jared Jul 25 '12 at 00:51

See my answer to my own question on this issue at KVM booting off-image kernel and existing partition. In short, getting virt-install to create a config for this is pretty straightforward, given a slight modification of the guest's /etc/fstab.
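The fstab modification amounts to this: with one filesystem per LV, the guest sees each one as a bare virtio disk, so /etc/fstab refers to the whole devices rather than partition numbers (a sketch; the device names assume virtio and the three-volume layout from the question):

```
# guest /etc/fstab (sketch)
/dev/vda   /      ext4   defaults   0 1
/dev/vdb   none   swap   sw         0 0
/dev/vdc   /tmp   ext4   defaults   0 2
```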

Bittrance
  • Just tried that. It's another great idea, but still doesn't work, at least not for new guests. The CentOS installer actually sees vda and vdb as formatted with ext4 and swap, but it still insists on treating them as disks rather than partitions and won't use them directly, so I can't complete the install. I suppose I could install "normally", then dump the filesystems out to separate volumes and fiddle with grub/fstab like you mentioned to get it to work, but that's not really a usable solution for deploying guests. I'm beginning to resign myself to the fact that this simply won't work. – Jared Jul 25 '12 at 03:44
  • I misunderstood. I'm using yum --installroot to build the partitions directly from the host without involving pesky installers. My use case is to get guests that are as similar as possible, while remaining up-to-date. This is why I want partitions rather than disks. – Bittrance Jul 27 '12 at 11:53