
Does anyone have hands-on experience to share with regard to software RAID performance under XenServer 5.5?

I've recently moved from VMware Server 1.x on Ubuntu to XenServer 5.5, hoping for an increase in performance from its paravirtualization technology. Unfortunately I'm seeing very low performance on my md-based software RAID.

While the XenServer host is seeing numbers above 100 MB/s, the paravirtualized guests are unable to get much more than 20 MB/s.
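For reference, this kind of raw host-versus-guest comparison can be reproduced with something as simple as dd, run first in dom0 and then inside a guest. The path and sizes below are placeholders, and this is only a rough sketch rather than the exact test behind the numbers above:

    # Sequential write; oflag=direct bypasses the page cache so the figure
    # reflects what actually reaches the disks.
    dd if=/dev/zero of=/mnt/test/ddfile bs=1M count=4096 oflag=direct

    # Sequential read; drop the caches first so the reads hit the disks.
    echo 3 > /proc/sys/vm/drop_caches
    dd if=/mnt/test/ddfile of=/dev/null bs=1M iflag=direct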

Is this to be expected, or should I look for a problem in my configuration?

Edit:

I realize that software RAID isn't ideal on a virtualization host, but for a home lab I can't justify a true hardware RAID controller, and I still want some level of storage redundancy.
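For context, this is a plain md mirror, the sort of thing you would build roughly as follows (the device names are just examples, not my actual layout):

    # Two-disk md RAID 1 mirror (placeholder device names).
    mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1

    # Check member health and resync progress.
    cat /proc/mdstat
    mdadm --detail /dev/md0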

The host is not showing high CPU usage, high system load, or even excessive I/O wait cycles. This, combined with the fact that VMware Server 1.x gave at least twice the I/O performance, suggests that there is some issue with the paravirtualization, possibly due to a lack of certain functionality in my hardware.
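The monitoring behind that statement is nothing fancy, roughly the following watched on the host while a guest benchmark runs (the 5-second interval is arbitrary, and iostat comes from the sysstat package):

    iostat -x 5    # per-device utilisation, await and throughput
    vmstat 5       # run queue, %iowait, %idle
    top            # per-process CPU usage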

Since the full hardware virtualization of XenServer isn't all that great, I guess I'll be going back to VMware Server on Ubuntu, giving me a chance to try out the new 2.x version.

Roy

4 Answers


Linux software RAID is damn good: it beats low-end RAID controllers and usually matches the performance of mid-range ones.
I recently did some performance tests on a couple of virtualization technologies. The disk I/O performance loss in Xen VMs (XenServer 5.5, for that matter) was about 70%. I used iozone, which tests 10+ read/write patterns. The machine had 2x 160 GB SATA-II drives in software RAID 1. Note that the 70% speed penalty can vary depending on the type of disk operations.
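The iozone runs were along these lines (the sizes and record length are illustrative; adjust them so the file comfortably exceeds RAM):

    # Write/rewrite (-i 0), read/re-read (-i 1) and random read/write (-i 2)
    # on a file large enough to defeat the page cache.
    iozone -s 16g -r 64k -i 0 -i 1 -i 2 -f /mnt/test/iozone.tmp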
One thing you can do in XenServer is to set a higher or lower priority on certain storage resources (click around, it should be in there), which helps I/O-intensive VMs a little. But that's pretty much it: if you want VMs, you must pay the price :)
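From the command line the same knob is exposed through the VBD QoS fields, roughly as below. The UUIDs are placeholders and I'm going from memory on the parameter names, so verify them with xe vbd-param-list on your box first:

    # Find the guest's VBD, then give it an ionice-based disk priority.
    xe vbd-list vm-name-label=myvm
    xe vbd-param-set uuid=<vbd-uuid> qos_algorithm_type=ionice
    xe vbd-param-set uuid=<vbd-uuid> qos_algorithm_params:sched=be qos_algorithm_params:class=2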

If you want to run Linux VMs on Linux hosts, using containers would get you better performance. For example, OpenVZ's disk I/O performance loss is around 7%.

  • Thanks. Your numbers are very helpful, and similar to mine. I ended up using VMware Server 2.x on Ubuntu 9.04 x86_64. Large sequential I/O from an Ubuntu guest now averages 70 MB/s, with random reads nearing 3500 IOPS. I guess there must be some SATA- or md-related issue with XenServer. – Roy Oct 19 '09 at 09:38
  • Weird. XenServer or Xen community-version paravirtualized guests are waaay faster on all types of I/O (network/memory/disk) than VMware Server 1.x/2.x. Are you sure your Xen guest was booting with a domU-patched kernel (i.e. "accelerated" or paravirtualized)? Otherwise it defaults to hardware-virtualized mode, which is slow... – Oct 20 '09 at 02:51

I'm going to disagree with tptech on this one.

Software RAID isn't that bad, especially given its very mature implementations in Linux and Windows Server operating systems.

In fact, software RAID can offer better performance than an el-cheapo RAID controller, as a cheap controller will use your CPU for its parity calculations anyway.

In rebuttal to tptech:

  • Harddrives are slow enough as it is, RAID adds overhead.

Sure. But this applies to ANY RAID, not just software RAID. It also isn't true of RAID 10, where you get the benefit of striping along with 100% redundancy.

  • Your virtualized environment is sharing CPU and I/O between VM's and software RAID is stealing both CPU and I/O cycles.

This is true, and it would be a concern for a host with very I/O-heavy VMs. But if your VMs are that I/O-heavy then a VM is probably not the right place for them anyway, or they should have access to a SAN (which makes software RAID moot altogether).

The fact of the matter is that the host alone is getting 100 MB/s of bandwidth; it's only the VMs that are suffering, and I can't see how having software RAID would cause performance to drop to 1/5th of the host's just because the request is coming from a VM and not the host.

I would suggest that it's probably something in the VM configuration, or maybe even in the guest machine. It's also worth mentioning that if your host CPU or your guest OS does not support paravirtualisation, it will cause performance degradation. Not all guests support it, and AFAIK no 64-bit guest kernels support it.
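A quick way to check whether a Linux guest is actually running paravirtualised, rather than having fallen back to full hardware virtualisation, is something along these lines (a rough sketch, run inside the guest):

    # A PV domU normally runs a Xen-aware kernel and sees paravirtual block
    # devices (xvd*) rather than emulated IDE disks (hd*).
    uname -r                    # look for a -xen flavoured kernel
    ls /dev/xvd* /dev/hd* 2>/dev/null
    cat /sys/hypervisor/type    # prints "xen" when the kernel has Xen support (if the file exists at all)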

Mark Henderson

I have no experience of Xen, but I would suspect that the performance problems you are seeing are not directly due to the RAID arrangement. Do you have a non-RAID volume that you can copy a VM to, so you can rerun the performance tests and compare the results?

If you are testing write performance specifically, then there is an option in VMware Server that may explain a difference between it and Xen: you can tell VMware Server to let the host OS cache disk access as it sees fit, instead of insisting that all writes the guest OS thinks are physical writes are performed synchronously. With this option on you will usually see improved write performance in the VM, sometimes massively improved, but there is a greater risk of corruption in the VM's filesystems in the event of a power cut or an error that causes the hypervisor to halt the VM unceremoniously. Someone with specific knowledge of Xen may be able to tell you how it behaves in this area by default (and whether that behaviour can be tweaked).
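You can get a feel for how large that caching effect is with a crude dd test (dd is just a stand-in for a proper benchmark, and the file size is arbitrary):

    # Apparent speed when the cache is allowed to absorb the writes.
    dd if=/dev/zero of=testfile bs=1M count=1024

    # Speed when the data must actually be on disk before dd reports a figure.
    dd if=/dev/zero of=testfile bs=1M count=1024 conv=fdatasync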

Another issue that may (I say "may" here as really I'm only guessing) explain a difference is that even if the guest's writes are delivered to the host's I/O stack synchronously (so they are sent from cache to disk before the I/O call returns to the VM's OS), the way VMware Server accesses the vdisk files allows their content to be held in the host OS's cache and buffers, reducing the need for physical I/O operations if the host machine has an abundance of RAM available for cache+buffers. This may not be the case with Xen's arrangement, though you'll need to check that with someone who knows more about Xen's specifics than I do.

Additional idea:

One other thing to check: how is the guest kernel accessing the virtual device? If it is running in PIO mode rather than accessing the device with DMA or UDMA, then it will be chewing many more CPU cycles than it needs to for each write operation, which will have a noticeable effect even if the I/O operations are translated to DMA-based ones by the time they reach the physical devices.
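Inside the guest, hdparm will tell you whether DMA is enabled on an (emulated) IDE disk and give you a crude read timing to compare before and after any change (the device name is an example; I don't know offhand exactly what XenServer presents to a guest):

    hdparm -d /dev/hda     # "using_dma = 1 (on)" is what you want to see
    hdparm -tT /dev/hda    # rough cached vs. buffered read timings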

David Spillett
  • When testing with iozone we generally use files so large that caching has little influence. In this case I'm writing a 16 GB file on a host with 4 GB of memory. Additionally, I monitor the I/O on the host while the test is running to see what is actually written to the disks. 30-second averages using VMware ranged from 50 to 80 MB/s but were stable at 20-22 MB/s using XenServer. VMware Server is configured to not cache writes on the host. – Roy Oct 19 '09 at 10:54
  • Enabling cached writes (I know you aren't in either case, but for future reference...) can give a performance boost even if operating over datasets larger than physical RAM, as it can remove latency from write operations (instead of waiting for a block write to complete, the process can be streaming the next block(s) ready for when the first write completes). The overall effect can be quite noticeable over long operations (though I wouldn't expect it to make the 3- or 4-fold difference you are seeing here). – David Spillett Oct 19 '09 at 12:15
  • Ah, yes. The write cache could certainly have a positive effect on the write and rewrite; after all, that's why one would employ a write-back cache in different layers of the I/O subsystem. Notably, a write-back cache could also have a negative effect on certain I/O profiles, which is why we sometimes disable it for database servers. I think we can be pretty sure, however, that after writing a 16 GB file the first 4 GB will have left any of the caches, so when we start the read test the only positive cache effect should be from read-ahead. – Roy Oct 19 '09 at 17:27
  • Your point about DMA access sounds very interesting. After all, VMware presents the virtual disks through a SCSI interface, while XenServer seems to use an IDE interface. – Roy Oct 19 '09 at 17:31

One word for software RAID - don't use it. Well actually that was three words. But you get my drift.

  • Harddrives are slow enough as it is, RAID adds overhead.
  • Your virtualized environment is sharing CPU and I/O between VM's and software RAID is stealing both CPU and I/O cycles.

A good hardware RAID controller can improve performance many times over by offloading RAID computation from the CPU and buffering disk I/O; the more disk drives you then add to your RAID, the better performance you (usually) get!

tplive
  • Well, yes and no. While software RAID adds overhead, it also provides significant performance gains when utilizing striping. In the case of RAID 5 there is some CPU overhead involved, but lack of CPU is less of a problem these days. Naturally a nice hardware RAID controller would be beneficial, but it would also double the cost of my server. – Roy Oct 07 '09 at 07:39