I found that there is no simple, absolute answer to questions like yours. Each virtualization solution behaves differently on specific performance tests. Also, a test like disk I/O throughput can be split into many different tests (read, write, rewrite, ...), and the results will vary from solution to solution and from scenario to scenario. This is why it is not trivial to name one solution as the fastest for disk I/O, and why there is no absolute answer for a label like "overhead for disk I/O".
It gets more complex when trying to relate different benchmark tests to one another. None of the solutions I tested performed well on micro-operation tests. For example, inside a VM a single call to "gettimeofday()" took, on average, 11.5 times more clock cycles to complete than on hardware. The hypervisors are optimized for real-world applications and do not perform well on micro-operations. This may not be a problem for your application, which may be closer to a real-world workload. By micro-operation I mean any operation that takes fewer than 1,000 clock cycles to finish (on a 2.6 GHz CPU, 1,000 clock cycles take about 385 nanoseconds, or 3.85e-7 seconds).
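The cycle-to-time conversion used in that definition can be checked with a few lines of Python. This is only a sketch of the arithmetic; the function names are hypothetical and do not come from rdtscbench or any other tool mentioned here.

```python
def cycles_to_ns(cycles: float, clock_ghz: float) -> float:
    """Convert a clock-cycle count to nanoseconds for a given CPU frequency."""
    # A CPU running at clock_ghz GHz executes clock_ghz cycles per nanosecond.
    return cycles / clock_ghz


def is_micro_operation(cycles: float, threshold: float = 1000) -> bool:
    """Apply the definition used in this answer: fewer than 1,000 cycles."""
    return cycles < threshold


if __name__ == "__main__":
    ns = cycles_to_ns(1000, 2.6)
    print(f"{ns:.0f} ns = {ns * 1e-9:.2e} s")
```

Running it for 1,000 cycles at 2.6 GHz reproduces the figure of roughly 385 nanoseconds quoted above.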
I did extensive benchmark testing on the four main solutions for data center consolidation on the x86 architecture. I ran almost 3,000 tests comparing performance inside VMs with performance on the hardware. I call 'overhead' the difference between the maximum performance measured inside the VM(s) and the maximum performance measured on hardware.
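As a rough illustration of that definition, the overhead can be written as the gap between the two maxima. Expressing it as a percentage of the hardware result is my assumption for this sketch, as is the restriction to "higher is better" metrics such as throughput; the function name and the sample numbers are hypothetical.

```python
def overhead_percent(hardware_max: float, vm_max: float) -> float:
    """Performance lost inside the VM, as a percentage of the hardware result.

    Assumes a "higher is better" metric such as throughput.
    """
    if hardware_max <= 0:
        raise ValueError("hardware_max must be positive")
    return (hardware_max - vm_max) / hardware_max * 100.0


if __name__ == "__main__":
    # Hypothetical numbers for illustration only, not measured results.
    print(overhead_percent(hardware_max=100.0, vm_max=90.0))
```

For metrics where lower is better (latency or clock cycles, for example), the subtraction would have to be reversed.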
The solutions:
- VMware ESXi 5
- Microsoft Hyper-V Windows 2008 R2 SP1
- Citrix XenServer 6
- Red Hat Enterprise Virtualization 2.2
The guest OSs:
- Microsoft Windows 2008 R2 64 bits
- Red Hat Enterprise Linux 6.1 64 bits
Test Info:
- Servers: 2X Sun Fire X4150 each with 8GB of RAM, 2X Intel Xeon E5440 CPU, and four gigabit Ethernet ports
- Disks: 6X 136GB SAS disks over iSCSI over gigabit Ethernet
Benchmark Software:
- CPU and Memory: Linpack benchmark for both 32 and 64 bits. This is CPU and memory intensive.
- Disk I/O and Latency: Bonnie++
- Network I/O: Netperf: TCP_STREAM, TCP_RR, TCP_CRR, UDP_RR and UDP_STREAM
- Micro-operations: rdtscbench: system calls, inter-process pipe communication
The averages are calculated with the parameters:
- CPU and Memory: AVERAGE(HPL32, HPL64)
- Disk I/O: AVERAGE(put_block, rewrite, get_block)
- Network I/O: AVERAGE(tcp_crr, tcp_rr, tcp_stream, udp_rr, udp_stream)
- Micro-operations: AVERAGE(getpid(), sysconf(), gettimeofday(), malloc[1M], malloc[1G], 2pipes[], simplemath[])
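The grouping above can be sketched in Python as follows. I am assuming AVERAGE means a plain arithmetic mean of the per-test values; the `CATEGORIES` table mirrors the list above, while the function name and the placeholder values are hypothetical.

```python
from statistics import mean

# Test groupings as listed in this answer.
CATEGORIES = {
    "CPU and Memory": ["HPL32", "HPL64"],
    "Disk I/O": ["put_block", "rewrite", "get_block"],
    "Network I/O": ["tcp_crr", "tcp_rr", "tcp_stream", "udp_rr", "udp_stream"],
    "Micro-operations": ["getpid()", "sysconf()", "gettimeofday()",
                         "malloc[1M]", "malloc[1G]", "2pipes[]", "simplemath[]"],
}


def category_averages(per_test: dict[str, float]) -> dict[str, float]:
    """Average the per-test values within each category (arithmetic mean)."""
    return {
        category: mean(per_test[name] for name in tests)
        for category, tests in CATEGORIES.items()
    }


if __name__ == "__main__":
    # Placeholder values only: every test gets the same made-up number.
    placeholder = {name: 5.0 for tests in CATEGORIES.values() for name in tests}
    for category, value in category_averages(placeholder).items():
        print(f"{category}: {value}")
```

Note that a plain mean gives every test in a category the same weight, so one very slow micro-operation can dominate its category's average.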
For my test scenario, using my metrics, the averages of the results of the four virtualization solutions are:
VM layer overhead, Linux guest:
VM layer overhead, Windows guest:
Please note that these values are generic averages and do not reflect any specific scenario.
Please take a look at the full article: http://petersenna.com/en/projects/81-performance-overhead-and-comparative-performance-of-4-virtualization-solutions