66

Running anything inside a virtual machine incurs some performance hit, but how much does virtualization really affect the performance of a database system?

I found this academic reference paper with some interesting benchmarks, but it was a limited test using Xen and PostgreSQL only. The conclusion was that using a VM does "not come at a high cost in performance" (although you may think the actual data says otherwise).

What are the technical, administrative, and other drawbacks associated with running a database within a virtual machine?

Please post answers that can be backed up by objective facts. I'm not interested in speculation or semi-religious arguments (geek passion is good in many ways, but it won't help us here).

That being said,

  • What issues show up when running a database in a virtual machine? (please post references)
  • Are those issues significant?
    • Are they only significant under certain scenarios?
  • What are the workarounds?
makerofthings7
Russ
  • +1 I'm primarily interested in hearing feedback about SQL Server and Windows 2008 R2 scenarios – makerofthings7 Sep 01 '11 at 21:35
  • 4
    @Shane Madden - Can you please explain the closure a bit? I expect that the motivation was driven by one non-specific *answer* (which then got derailed in the comments), not the question itself. Regarding the question, 44 votes and 12 favorites within roughly one day of pre-closure existence implies to me that it was a good question with useful answers/information (especially compared to what seems to be typical for ServerFault question traffic). This is what the various SE sites are aiming at. Would you have preferred a more specific question phrasing, vs the loose "how bad is it?". – Russ Sep 06 '11 at 02:25
  • 1
    @ErikA, Shane, Womble, mikeyb, Ben - I made a community edit that may make this question more constructive. Do consider reopening this, or posting a similar question on a new/clean question. – makerofthings7 Sep 07 '11 at 14:40
  • Almost 10 years later, I'm curious about how things look today. – Alex Jul 16 '20 at 17:09

7 Answers

40

Though many DB vendors were very slow to do this, nearly all of them now officially support their software running in a virtualized environment.

We run many Oracle 11g instances on Linux on top of ESXi, and it is certainly possible to get very good performance. As with all hardware scaling, you just need to make sure that the virtualization host has plenty of resources (RAM, CPU), and that your disk layer is up to the task of delivering whatever IO performance you require.
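One way to check whether the disk layer is up to the task is to time small synchronous writes on the physical box and again inside the VM, then compare. The following is a minimal sketch, not a substitute for a real database benchmark: the function name `fsync_latencies`, the 8 KB block size, and the sample count are all assumptions chosen for illustration.

```python
import os
import tempfile
import time


def fsync_latencies(directory, writes=50, block_size=8192):
    """Return per-write latencies (seconds) for small synchronous writes.

    block_size and writes are illustrative defaults, not tuned values.
    """
    payload = os.urandom(block_size)
    latencies = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(writes):
            start = time.perf_counter()
            os.write(fd, payload)
            os.fsync(fd)  # ask the OS to flush this write to the device
            latencies.append(time.perf_counter() - start)
    finally:
        os.close(fd)
        os.remove(path)
    return latencies


if __name__ == "__main__":
    # Point this at the volume that will hold the database files.
    samples = sorted(fsync_latencies(tempfile.gettempdir()))
    median = samples[len(samples) // 2]
    print(f"median fsync latency: {median * 1000:.2f} ms")
    print(f"worst fsync latency:  {samples[-1] * 1000:.2f} ms")
```

Run it against the same storage from both the physical host and the guest. A large gap in the median or worst-case figure suggests the virtual disk path deserves attention before the database moves.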

EEAA
  • 7
    +1 As noted, it is critical that resources be up to the task. Disk has been the big bottleneck for us, and careful planning is needed. – Dave M Sep 01 '11 at 14:41
  • 2
    +1 You need to do your homework on the database **usage** ahead of time. If your physical box is getting hammered above 40% utilization then your advantages for vm'ing it start to dissolve. That being said we have tons of small application-specific isolated sql's running on vm's with no problem. But our large heavy-usage machines have dedicated hardware because of the lack of advantage. – Nate Sep 01 '11 at 15:04
  • 5
    Definitely Disk IO is the big culprit, and what virtualised environments tend to be flaky at. – lynxman Sep 01 '11 at 15:13
  • 1
    @lynxman - Agreed. We run all of our Oracle instances on our Tier 1 SAN disks, which are 15k SAS. From what I can tell, we get *very* close to near native performance. – EEAA Sep 01 '11 at 15:19
  • I assume you need to dedicate SAN LUNs to each VM and potentially a single fibre port to each VM as well? VT-d is probably key too, correct? – AngerClown Sep 01 '11 at 17:05
  • Our database servers typically get their own VMFS lun, but the FC ports are shared with all the other VMs that happen to be on the host. Single LUN-per-VM doesn't matter much in our case anyway, though, as our SAN (Compellent) stripes LUNs across all the disks, so they're all shared anyway. – EEAA Sep 01 '11 at 17:42
  • 10
    "An ounce of test is worth a pound of guess." – Chris B. Behrens Sep 01 '11 at 21:05
21

As ErikA says, this is becoming more and more common. I'm in the SQL Server camp and don't personally have any production systems running in VM's, but I would not be hesitant to (after a little more study on the topic). There are definitely some things to take into consideration before you go down that path, though (at least for SQL Server). Disk IO (as others have mentioned) and memory allocation are just 2 examples. Things will be different between different hypervisors as well.

Brent Ozar is a recognized expert in virtualizing SQL Server, specifically on VMware. I would highly recommend reading through his material.

http://www.brentozar.com/community/virtualization-best-practices/

squillman
11

There is can and then there is should. A Corvette can go 150 mph, but should you drive it that fast on public highways? You can harm yourself unnecessarily.

Databases are guest operating systems. By design, when they start they grab blocks of a resource and manage them directly for performance reasons. As soon as you make the core operating system of the database server a guest in a virtualized hosting environment, you place an arbitration layer (the hypervisor) between the block-allocated disk and RAM and the database server. It will slow down. The more inefficient your queries, the more it will slow. These inefficiencies may be masked today on dedicated hardware, but as soon as you introduce arbitration to your dependent resources you are going to find out real fast.

What a lot of bean counters who are demanding virtualization fail to recognize is that database servers, as guest operating systems, offer their own consolidation layer. There is no reason why you cannot consolidate multiple logical database instances on one physical server, even to the point of moving IP addresses, setting up additional host names, etc., to allow this natural coalescing of services to take place. With this model, not only do you retain the cost savings that management is pushing for through a reduced number of physical hosts, but you also retain block access to physical resources without the impingement of the arbitrary hypervisor, which can make beneficial decisions sometimes and not others.

The same holds true for other guest operating systems, like Java. Virtualization solutions are typically busy environments and the hypervisor has to make lots of decisions on who "gets the token" on a resource. Anytime you can eliminate that layer you are going to be better off.

Coalesce multiple instances using the natural guest operating system layer first. Odds are you will be able to hit your platform consolidation and performance targets easier.

James Pulley
  • 4
    Interesting definition of "guest operating system." While your point is taken with regard to pure, unadulterated performance, how often do your databases really bottleneck at the CPU? I/O is much more likely, and for higher performance applications you're already sharing I/O time at a SAN. I would hope that you'd reconsider your virtualization philosophy when a security issue with one application compromises all of your consolidated databases' password hashes, or when one process running within your JVM consumes every byte of available heap space. – Shane Madden Sep 01 '11 at 23:03
  • 5
    To be clear, I agree completely that a finely tuned, massively busy, high performance database server should have its own physical hardware. But those are not the norm, and the other benefits of virtualization tend to outweigh the performance hit, which is indistinguishable with most workloads. – Shane Madden Sep 01 '11 at 23:07
  • 3
    I disagree with your point about always going to the existing consolidation layers first. Sometimes that makes sense. But look, for instance, at the cost tradeoff in rebalancing resources between consolidating multiple databases on a single OS and consolidating multiple database/OS combinations on top of a hypervisor. The first is more efficient. The second is much easier to rebalance. Migrating an OS/database to a new host is much less disruptive than migrating a database to a new OS. – Jake Oshins Sep 02 '11 at 00:43
  • My comments come from direct in-the-field observations of successful and failed migrations to virtualization solutions over the past decade as a performance engineer. There are tons of bad database apps out there whose promiscuous use of hardware masks performance issues. Add virtualization and those issues come to light. If you have an app which demands a precise clock for timing or audit purposes, then with the clock float in software virtualization you are out of the hunt. – James Pulley Sep 02 '11 at 12:56
  • With the bean counters making the push, the trend is oversubscription on the virtual machine hosts, which makes the hypervisor's resource-allocation decisions almost universally poor for all of the guests. The hypervisor layer is also not as robust on the throughput front as the standard OS drivers, so you do suffer a loss in maximal throughput vs the standard non-virtualized interface. – James Pulley Sep 02 '11 at 13:01
  • Databases, JVM's, etc... are defined as guest operating systems for the fact that they provide their own namespace for access to resources, they manage resources directly in a block and can run software defined for these environments. Databases also tend to have their own file systems for the storage of data. I am not totally virtualization averse as I use the technology every day as a part of a services delivery practice, but where performance is the primary concern (as it is in my field) I do not recommend or deploy virtualized solutions. – James Pulley Sep 02 '11 at 13:04
  • For many organizations the use of virtualization, particularly in Microsoft environments, is a lazy solution for platform consolidation. They do not see a clear path to retain the domain or internet namespaces resident on client computers for access to services on remote hosts and so virtualization is an easy solution. Where you have the right knowledge you can easily roll up dozens of individual computers to a single host without virtualization, retain the namespaces, even retain the IP addresses if you wish, and keep client computers blissfully ignorant of the change. – James Pulley Sep 02 '11 at 13:06
  • Unfortunately, neither Microsoft nor Novell fully embraced the directory model for both administrator and user namespaces that Banyan pioneered; if they had, platform consolidation would have been very easy. An admin could then simply migrate a service from one host to another and retain the same logical namespace as resolved by the directory server, without users having any knowledge of systems being changed in the background. – James Pulley Sep 02 '11 at 13:09
  • 1
    Wow, just wow James. I don't have the time nor patience to trash all of the points you made in your answer and subsequent comments, but I just felt I needed to drop a comment here for anyone that might happen upon this answer. James's views are, well, his own, and don't reflect what is truly possible. If you're oversubscribed then *of course* you're going to have poor performance. So don't oversubscribe. It's perfectly possible to have a very high-performing virtualization environment. It's folly to make a blanket recommendation against it because it "performs poorly". – EEAA Sep 02 '11 at 14:56
  • "I am not totally virtualization averse as I use the technology every day as a part of a services delivery practice, but where performance is the *primary concern (as it is in my field)* I do not recommend or deploy virtualized solutions" – James Pulley Sep 02 '11 at 15:07
  • @James But you're making the generalization that every virtualized environment is oversubscribed due to overzealous bean-counters, and making the assertion that the few-percentage-points difference from native performance is a deal-breaker for most database loads. I understand where you're coming from, but your assertions don't apply well to the modern IT industry as a whole. – Shane Madden Sep 02 '11 at 15:20
  • I only have a decade worth of in-the-field observations to draw from. The situations I am generally called into fall primarily into two categories of poor performance: environments that are horribly oversubscribed, with IT decisions being driven by accounting managers unfamiliar with technology, and applications that are poorly designed or have high clock dependencies and are not well sorted for performance. In all cases the delta from physical to virtual is more than a few %. In some cases both are present. My observations may be biased by spending most of my time fixing these issues. – James Pulley Sep 02 '11 at 15:32
  • Do not get me wrong, I am not trying to "blame" virtualization here. I am a very happy user of the same technology in specific areas of my IT infrastructure. There are bad IT decisions chasing cost savings being led by non-IT folks, and bad applications, that are leading the way to poor performance in virtualized/cloud environments. These bad decisions keep my organization very busy. – James Pulley Sep 02 '11 at 15:55
6

There are two things to realize here:

  • Unit of DB performance per unit of Hardware is a bit lower for a virtualized db. This means you need to buy a little more hardware to get the same level of performance.
  • That doesn't mean the same level or a desired level of performance is unobtainable. The gains you get from improved management and other benefits (like easier HA) often way more than offset the marginally increased hardware costs.
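The first point can be put as simple arithmetic: if virtualization costs some fraction of per-host throughput, the hardware needed to hold performance constant grows by 1 / (1 - overhead). The sketch below assumes that simple proportional model; the function name `hardware_needed` and the 10% figure are hypothetical and should be replaced with a number measured on your own workload.

```python
def hardware_needed(baseline_units, overhead_fraction):
    """Units of hardware needed to match bare-metal throughput.

    overhead_fraction is the share of per-unit throughput lost to the
    hypervisor (0.10 means 10%). This is an assumed, simplified model.
    """
    if not 0 <= overhead_fraction < 1:
        raise ValueError("overhead_fraction must be in [0, 1)")
    return baseline_units / (1 - overhead_fraction)


if __name__ == "__main__":
    # Hypothetical: 10 physical hosts and an assumed 10% overhead.
    print(f"{hardware_needed(10, 0.10):.1f} hosts needed")
```

Whether the extra hardware is worth it then comes down to weighing that figure against the management and HA gains in the second point.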

That said, where I work our SQL Server installation is one of only two servers that I have no intention of virtualizing any time soon (the other is the primary DC).

Joel Coel
4

Running SQL Server in a VM will be fine, provided that you can provide enough resources to the VM to run your application. If in the physical world you need 24 cores and 256 Gigs of RAM then you need to provide 24 vCPUs and 256 Gigs of RAM in the virtual world.

I just wrote an article in last month's SQL Server Magazine all about running SQL Server under VMware's vSphere.
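A sizing rule like this is easy to check mechanically before a migration. The sketch below compares what the workload needs with what the VM has been given and reports any gap; `resource_shortfalls` and the resource names are hypothetical, and the 24 vCPU / 256 GB figures are simply the ones from the example above.

```python
def resource_shortfalls(required, provisioned):
    """Return {resource: missing_amount} for every under-provisioned resource."""
    shortfalls = {}
    for name, need in required.items():
        have = provisioned.get(name, 0)  # a resource not listed counts as 0
        if have < need:
            shortfalls[name] = need - have
    return shortfalls


if __name__ == "__main__":
    required = {"vcpus": 24, "ram_gb": 256}     # what the physical box uses
    provisioned = {"vcpus": 16, "ram_gb": 256}  # hypothetical VM allocation
    print(resource_shortfalls(required, provisioned))
```

An empty result means the VM matches the physical requirement on paper. It says nothing about whether the host can actually deliver those resources under contention.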

mrdenny
2

I run two databases, one PostgreSQL and the other MySQL, in a virtual environment (Xen) where the dom0s are highly available. domU file systems are all located on an iSCSI SAN LUN, carved up with LVM2 logical volumes. The MySQL database is solely for Cacti, and so does not see very much usage at all, and is located on the iSCSI LUN as well.

The PostgreSQL database is the database for our staging environment, and therefore sees higher utilization than the MySQL db. For this reason, the database is located on a local RAID10 set, and DRBD replicated to the second cluster node. However, in terms of real load, this staging database doesn't see very high load at all, which, in my opinion, makes it a good candidate to virtualize.

Some of the benefits to our organization were reduced power consumption, saved rack space, and less hardware administrative overhead.

Our main production database, on the other hand, I cannot imagine going virtual....

Kendall
2

I work with MSSQL and MySQL on numerous servers. A couple of years ago I was hesitant to start setting up SQL servers on VMs because I had heard about the performance issues of running a SQL server on a VM. However, I was surprised after I set up my first couple of SQL servers and saw no change in performance. More and more of the servers I work on are on VMs, and almost all of the larger enterprise clients I work for have virtualized SQL servers.

Yes, the VM does add some overhead cost, and if you are going to be hosting multiple VMs on a single box you are going to need a nice beefy server. A common resource problem to look out for is adding additional VMs and thinning out the available resources. It's common practice to plan for some growth, but if you bought your server to host 2 or 3 VMs and now it's running 10 VMs, you're probably going to see a performance hit.
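A quick way to see this thinning-out is to track the ratio of allocated vCPUs to physical cores on the host. This is a rough sketch only: `vcpu_overcommit_ratio` is a hypothetical helper, the 16-core host and 4-vCPU guests are made-up numbers, and what ratio is acceptable depends on the workload and hypervisor.

```python
def vcpu_overcommit_ratio(vm_vcpus, physical_cores):
    """Ratio of vCPUs allocated across all guests to physical cores."""
    if physical_cores <= 0:
        raise ValueError("physical_cores must be positive")
    return sum(vm_vcpus) / physical_cores


if __name__ == "__main__":
    cores = 16  # hypothetical host
    planned = [4, 4, 4]  # the 3 VMs the host was bought for
    current = [4] * 10   # the 10 VMs it runs now
    print(f"planned ratio: {vcpu_overcommit_ratio(planned, cores):.2f}")
    print(f"current ratio: {vcpu_overcommit_ratio(current, cores):.2f}")
```

The same bookkeeping applies to RAM and disk IO. A ratio that has crept well above what the host was sized for is a good first suspect when a virtualized SQL server slows down.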

I would be lying if I said I have never seen performance issues running a SQL server on a VM. But, I have learned that if you are seeing poor performance, there probably is something wrong with the environment.

Chris