
When virtualisation was new, we tried to virtualise everything, but we then noticed use cases where our virtual machines were much slower than bare metal.

We use the following rules when deciding not to virtualise:

  • Network IO intensive applications (i.e. many interrupts/packets per second)
  • Disk IO intensive applications (if not on SAN storage)
  • RAM intensive applications (RAM is the most precious resource)

We have had these experiences with Xen and DRBD, as well as with Hyper-V's shared-nothing setup on DAS. Is this the case with all hypervisors?

What (other) metrics should I look for when deciding to virtualise an application/server or not?

Mark Henderson
Nils
  • Have you tried VMWare or KVM? – ewwhite Nov 23 '12 at 22:02
  • @ewwhite no. But to me it does not seem relevant which virtualization technique is being used - at least this is my suspicion - which led to this question. – Nils Nov 24 '12 at 21:54
  • But it *does* matter... You've skipped two major players in the virtualization space. Not all hypervisors are the same. Nor are their management features. In the VMWare world, for example, resource pools, storage I/O control and good planning open the workloads you've mentioned up to being virtualized. – ewwhite Nov 24 '12 at 22:00
  • @ewwhite I see now, why this question has been closed... :-/ – Nils Nov 24 '12 at 22:04
  • FWIW, we virtualized everything but our internet gateway, PDC, DB, a file server, and some priority phone system equipment that is really just a stock server with special software. Likely when the PDC hardware is up for replacement I'll let that go virtual, too. – Joel Coel Nov 27 '12 at 23:02
  • That's a reasonable list you have already. I'd virtualise anything that sits idle most of the time. – hookenz Nov 28 '12 at 03:26
  • One great thing about virtualisation is you can redeploy a VM onto another machine should the hardware have a fault. In some instances, live migrate so you have virtually no downtime. – hookenz Nov 28 '12 at 03:36

2 Answers

6

You've hit the major metrics in your question (there's a quick measurement sketch after this list):

  • Network IO
    You want to be sure that your proposed virtualization workload won't saturate your host system's network connection. In these days of 10Gbit NICs this is less of an issue for larger enterprises, and smaller enterprises can often get the performance they need from gigabit (or teamed/aggregated gigabit) NICs.

  • Disk IO
    You want to be sure that your disk subsystem (local disks, SAN, NAS) can handle the disk I/O you're proposing.
    When sizing this bear in mind that your SAN fabric (switches, etc.) needs to be able to handle the load too -- you may have an über-SAN storage system that can push terabits-per-second to its disks, but if that monster is connected to a lousy 100Mbit iSCSI fabric you'll saturate your network before the storage device breaks a sweat.

  • RAM
    More specifically "active" RAM (because the inactive stuff may be swapped out by the hypervisor and nobody will notice). Ideally you have enough physical RAM that your hypervisor doesn't need to swap. In reality you'll probably find a happy medium of overcommitment.
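If you want hard numbers for those three metrics before you commit, sampling the candidate physical server is cheap. Here is a minimal sketch; it assumes the third-party `psutil` package is installed, and the 60-second window is purely illustrative (days of data from your existing monitoring are far more useful):

```python
#!/usr/bin/env python3
"""Rough pre-virtualization sizing sample: average network rate, disk IOPS
and active RAM on the candidate server over a short window.

Assumptions: the third-party psutil package (pip install psutil) and a
60-second sample window chosen purely for illustration.
"""
import time

import psutil

SAMPLE_SECONDS = 60

net_0 = psutil.net_io_counters()
disk_0 = psutil.disk_io_counters()
time.sleep(SAMPLE_SECONDS)
net_1 = psutil.net_io_counters()
disk_1 = psutil.disk_io_counters()

# Average network throughput (Mbit/s) and packet rate over the window
net_mbit = ((net_1.bytes_sent + net_1.bytes_recv)
            - (net_0.bytes_sent + net_0.bytes_recv)) * 8 / SAMPLE_SECONDS / 1e6
pkts = ((net_1.packets_sent + net_1.packets_recv)
        - (net_0.packets_sent + net_0.packets_recv)) / SAMPLE_SECONDS

# Average disk IOPS over the window
iops = ((disk_1.read_count + disk_1.write_count)
        - (disk_0.read_count + disk_0.write_count)) / SAMPLE_SECONDS

# "Active" RAM is what the hypervisor really has to keep resident; not every
# platform exposes it, so fall back to plain "used".
mem = psutil.virtual_memory()
active_gib = getattr(mem, "active", mem.used) / 2**30

print(f"network: {net_mbit:8.1f} Mbit/s, {pkts:8.0f} packets/s")
print(f"disk:    {iops:8.0f} IOPS")
print(f"RAM:     {active_gib:8.1f} GiB active of {mem.total / 2**30:.1f} GiB")
```

Compare those figures against what the proposed host (its NICs, its storage path and its RAM budget) can deliver with all of its other guests running.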

Some others to consider:

  • CPU (and Workload Patterns)
    If you have a bunch of systems that do CPU-intensive tasks you might have trouble if they're all clamoring for the host system's processor at the same time. (e.g. if you have 1 host CPU and 3 VMs that all want to crunch numbers at midnight, each VM is only going to see ~1/3 of the host CPU's performance as the hypervisor tries to split the contested resource between them -- the sketch after this list works through the arithmetic).
    The flip side of this is if you have a bunch of systems that do CPU-intensive tasks at different times (say midnight, 3AM, and 6 AM, and always finishing before the next guy starts) you can virtualize them and they'll never know the difference.

  • Custom Hardware
    Some hypervisors (like VMWare) allow PCI and Storage Pass-thru. Others may not.
    If you need access to hardware on the host (like a graphics card or direct disk access) you need to factor that in when planning your virtualization.

  • Timekeeping
    Hypervisors have gotten better at this, but precision timekeeping tasks are still better suited to dedicated physical servers. Your organization's primary NTP server, for example, should be a physical host (or a router if your routers are capable of acting as NTP servers). A quick way to check a guest's clock drift is sketched after this list.

  • Things that generally don't virtualize well
    There's a lot of anecdotal data about this, so do a little research before you virtualize something.
    As a few examples, the timekeeping issues I mentioned above, VOIP systems (like an Asterisk PBX), and heavily-used databases are generally bad candidates for virtualization (the first two due to the timekeeping precision issues, the databases generally because they cause, and suffer from, resource contention more than other workloads).
    Every company amasses a local list of things they know they can't virtualize -- as you find your items make sure you document them (including the reason, in case one day you get a hypervisor that solves the problem).
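To put rough numbers on the CPU contention point above, here is a trivial sketch. The guest counts and vCPU figures are hypothetical, and it assumes fair-share scheduling with no hypervisor overhead:

```python
#!/usr/bin/env python3
"""Crude CPU-contention estimate for guests whose busy periods overlap.

All figures are hypothetical; the point is the arithmetic from the example
above: when the combined vCPU demand exceeds the host's physical cores,
each guest only sees roughly cores / total_demand of what it asked for.
"""
HOST_CORES = 1            # the example above: a single host CPU
BUSY_VCPUS = [1, 1, 1]    # three VMs all crunching numbers at midnight

total_demand = sum(BUSY_VCPUS)
scale = min(1.0, HOST_CORES / total_demand)

for i, vcpus in enumerate(BUSY_VCPUS, start=1):
    share = vcpus * scale   # CPU the VM actually gets while everyone is busy
    print(f"VM{i}: wants {vcpus} vCPU, gets ~{share:.2f} ({scale:.0%} of requested)")

# Stagger the busy windows (midnight, 3AM, 6AM) and total_demand never
# exceeds HOST_CORES at any one moment, so each VM sees ~100% while it runs.
```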
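For the timekeeping point, a quick way to see how a guest's clock behaves under your hypervisor is to compare it against an NTP reference periodically. A minimal sketch, assuming the third-party `ntplib` package and that `pool.ntp.org` is reachable from the guest:

```python
#!/usr/bin/env python3
"""Report how far this machine's clock is from an NTP reference.

Assumptions: the third-party ntplib package (pip install ntplib) and network
access to pool.ntp.org. Run it periodically inside a guest; an offset that
keeps growing between syncs is the classic symptom of poor VM timekeeping.
"""
import ntplib

client = ntplib.NTPClient()
response = client.request("pool.ntp.org", version=3)

# response.offset is the estimated difference (in seconds) between the local
# clock and the NTP server's clock.
print(f"clock offset vs pool.ntp.org: {response.offset * 1000:+.1f} ms")
```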

voretaq7
  • I've had asterisk virtualized for over a year with no timekeeping issues (it does need to have `dahdi` loaded to get access to a reasonable timer though). It performs perfectly well. – Michael Hampton Nov 28 '12 at 18:25
  • @MichaelHampton I've had the exact opposite experience (lots of call drops and out-of-order audio issues) -- this was pre-`DAHDI` though, using the old software timer. Soured me on the whole idea of virtualized telecom. – voretaq7 Nov 28 '12 at 18:34
2

As has been pointed out in the comments, not all virtualisation software is equal.

http://wiki.openvz.org/FAQ#What_is_the_performance_overhead.3F

What is the performance overhead? Near zero. There is no emulation layer, only security isolation, and all checking is done on the kernel level without context switching.

What are performance expectations? Peak performance is achieved when only one container has active tasks. In this case, it could use 100% of available resources: all CPUs, all physical memory, all disk and network bandwidth. OpenVZ is not limiting you to a single-CPU virtual machine.

While this might feel like a non-answer, there are no blanket conditions where you shouldn't use virtualisation. I'm now in the habit of deploying hardware with just one OpenVZ container: containers are dead easy to migrate using the tools provided, because of the hardware abstraction that virtualisation inherently provides. As a small side effect, software license costs are generally cheaper, too.
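As a concrete illustration of that "one container per host, migrate when needed" workflow, here is a minimal sketch driving OpenVZ's own migration tool from Python. The container ID and destination hostname are placeholders, and it assumes the legacy vzctl/vzmigrate tooling is installed with SSH keys exchanged between the two hardware nodes:

```python
#!/usr/bin/env python3
"""Live-migrate a single OpenVZ container to another hardware node.

Assumptions: legacy OpenVZ tooling (vzctl/vzmigrate) on both nodes, SSH key
authentication between them, and that container 101 is the one container on
this host. CTID and DEST below are placeholders for illustration.
"""
import subprocess

CTID = "101"               # hypothetical container ID
DEST = "hn2.example.com"   # hypothetical destination hardware node

# --online keeps the container running during the move (live migration);
# drop it if a brief outage is acceptable.
subprocess.run(["vzmigrate", "--online", DEST, CTID], check=True)
```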

Jay