
We are planning to virtualize our existing infrastructure with an open-source virtualization solution. KVM and Xen are on the short list. The big cloud players still use Xen, but we found that KVM is gaining popularity and has been adopted by quite a few VPS providers.

Our biggest concern is stability. So the question is: is KVM stable enough for production use in 2011?

cooldfish
  • I think Xen will stay on top for a little while longer because Xen will be included in the kernel from now on (Linux 3.0) – Bart De Vos Jun 04 '11 at 10:43

3 Answers


KVM is OK for production. We run a bunch of Windows and Linux VMs, including remote desktops, databases (MS SQL and MySQL), a router, a firewall, even backups (guest to guest), and everything runs fine.

What I like about KVM is the ability to scale the management layer. I actually prefer managing the lot I have without libvirt, adjusting (and learning about) every single parameter kvm/qemu accepts. Others use the libvirt-based tools, and if you need full-scale cloud management, there's OpenStack and friends.
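To illustrate what "managing without libvirt" looks like, here is a minimal sketch of launching a guest straight from the qemu-kvm command line. All names and paths (the VM name, the LVM volume, the tap interface) are hypothetical and would need to match your own host setup:

```shell
# Hypothetical example: launch a guest directly with qemu-kvm, no libvirt.
# /dev/vg0/vm1 and tap0 are assumptions for illustration.
qemu-kvm \
  -name vm1 \
  -m 2048 -smp 2 \
  -drive file=/dev/vg0/vm1,if=virtio,cache=none \
  -net nic,model=virtio -net tap,ifname=tap0 \
  -vnc :1 \
  -daemonize
```

Every flag here is something libvirt would otherwise generate for you from its XML, which is exactly the layer you skip with this approach.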

There are some settings to stay away from, though. Use the default cache=writethrough, do not enable native async I/O, and stay away from qcow, qed, or similar file formats. Give your machines LVM volumes instead.
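As a sketch of the LVM approach: carve a logical volume per guest and hand the raw block device to qemu. The volume group name and sizes below are assumptions; note that a later comment in this thread walks back the writethrough advice in favour of cache=none:

```shell
# Hypothetical: one raw LVM volume per guest instead of a qcow/qed file.
# "vg0" and the 20G size are made-up values.
lvcreate -L 20G -n vm1-disk vg0

# Attach it as a raw virtio disk (cache mode per the later correction
# in this thread; adjust to taste):
qemu-kvm ... -drive file=/dev/vg0/vm1-disk,if=virtio,format=raw,cache=none
```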

korkman
  • Which Linux distro are you using? I know RHEL/CentOS might be the best choice. How about Debian and Ubuntu? – cooldfish Jun 04 '11 at 20:12
  • Using Debian stable for the base install, but added unstable to sources.list for an up-to-date kvm and kernel image. Recent versions of KVM show network performance gains with the vhost_net module (6 to 20 Gbit/s guest to guest on the same host). – korkman Jun 06 '11 at 08:21
  • @korkman Are you using shared storage for your KVM hosts? – Joel E Salas Jun 04 '13 at 00:15
  • @JoelESalas: I currently prefer DRBD below the VMs and cache=none. cache=writethrough was actually bad advice, and it is no longer the default. cache=writeback is OK for low-I/O machines, but heavy writers run into problems with it, so cache=none is the best option now. – korkman Jun 11 '13 at 11:41
  • @korkman Are you doing live migration with that setup? How do you do a live migration with DRBD + LVM? – Joel E Salas Jun 12 '13 at 18:01
  • @JoelESalas: Live migration is possible with dual-primary mode, but I do not use it. I assume DRBD protocol C is mandatory in that scenario. DRBD behaves much like shared storage in dual primary. I have LVM below DRBD and create one DRBD pair per VM disk. If you put LVM on top instead, use LVM's cluster support. – korkman Jun 15 '13 at 22:43
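The "one DRBD pair per VM disk" layout from the comments above can be sketched as a DRBD resource definition. Everything here (hostnames, addresses, device numbers, the volume group) is hypothetical; protocol C and allow-two-primaries reflect the dual-primary/live-migration scenario discussed, not a tested configuration:

```
# Hypothetical /etc/drbd.d/vm1.res: one DRBD resource per VM disk,
# backed by an LVM volume on each host (DRBD 8.3-era syntax).
resource vm1-disk {
  protocol C;                 # synchronous replication; assumed mandatory for dual primary
  net {
    allow-two-primaries;      # only needed if you want live migration
  }
  on hostA {
    device    /dev/drbd1;
    disk      /dev/vg0/vm1-disk;
    address   10.0.0.1:7701;
    meta-disk internal;
  }
  on hostB {
    device    /dev/drbd1;
    disk      /dev/vg0/vm1-disk;
    address   10.0.0.2:7701;
    meta-disk internal;
  }
}
```

The guest then gets /dev/drbd1 as its disk instead of the raw LV, so writes replicate to the peer host synchronously.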

I agree with dyasny on stability, although I'm not sure KVM's feature set compares with Xen/VMware/etc. at this point.

I know they have live migration with and without shared storage ("vMotion" and "Storage vMotion" in VMware parlance), but I'm not sure if they have HA/clustering and load balancing ("Distributed Resource Scheduler") or distributed switches ("Distributed vSwitch") at this point. What makes it tricky is that this could all change, terminology-wise, once the marketing drones get hold of it.

Also, I suspect the centralized management is not quite there yet ("vCenter"), but again, you'd have to do more research, or possibly even venture into development/beta versions, to get these bits of functionality.

Hopefully someone with more KVM experience/knowledge can chime in here.

gravyface
  • HA/clustering/load balancing are there, but those are not hypervisor features. Those features are implemented in the management tools, the most mature and feature-rich of which is probably RHEV. – dyasny Jun 04 '11 at 15:27
  • as for "distributed vswitch", if you explain the use case, I'm sure the same goals can be achieved without proprietary technologies – dyasny Jun 04 '11 at 15:28
  • @dyasny: VMware's distributed vSwitch spans across hypervisor hosts allowing you to have consistent networking across your hypervisor host clusters. Think Cisco's Nexus modular switches. – gravyface Jun 04 '11 at 15:30
  • Well, under RHEV this is already the case, without any marketing palaver... The system simply manages networking configs for hosts on a per-cluster basis. – dyasny Jun 04 '11 at 15:35
  • @dyasny: good points, and I think they should go into your answer; maturity and stability go hand in hand. – gravyface Jun 04 '11 at 15:40

Yes, it is stable enough, and a lot of cloud providers (IBM leading the list) and other types of users have been running it in production for years now.

dyasny