
I'm running a number of tests to understand why some of our workloads deliver a really poor user experience with DB-related activities.

We have an HP DL160 Gen9 server with a B140i controller (4 LFF 7.2k rpm disks). The host is CentOS 7; the virtualization engine is KVM, version 105.el7_2.4 (1.5.3) from the CentOS Virt SIG. Disks are thick LVM volumes on top of the fake-RAID 5 generated by the B140i controller. tuned is set to the performance profile (switching to the virt-guest profile doesn't change much).
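For reference, a sketch of the tuned switching involved (assuming "performance" above maps to the stock throughput-performance profile and "virt guest" to virtual-guest):

tuned-adm active                          # show the currently active profile
tuned-adm profile throughput-performance  # the "performance" setting mentioned above
tuned-adm profile virtual-guest           # the virt-oriented profile that changed little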

We run both Windows and Linux VMs. In all cases the storage is local, attached to the VM via virtio with cache=none and io=native.
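For completeness, the disk stanza in the libvirt XML looks like the following sketch (the LV path and target device are illustrative, not our actual names):

<disk type='block' device='disk'>
  <driver name='qemu' type='raw' cache='none' io='native'/>
  <source dev='/dev/vg0/vm_disk'/>
  <target dev='vda' bus='virtio'/>
</disk>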

As our DB files are small, I'm simulating the workloads with:

iozone -i 0 -i 8 -s 4m -t X

where X is either 1 (the simple case) or 25 (our actual user base).
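For clarity, the annotated 25-thread invocation:

# -i 0: write/rewrite test; -i 8: mixed random read/write workload
# -s 4m: 4 MB file per process; -t 25: throughput mode with 25 parallel threads
iozone -i 0 -i 8 -s 4m -t 25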

These are the results on the host:

Children see throughput for 1 mixed workload    = 1768421.25 kB/sec
Parent sees throughput for 1 mixed workload     =  293368.05 kB/sec
Min throughput per process          = 1768421.25 kB/sec 
Max throughput per process          = 1768421.25 kB/sec
Avg throughput per process          = 1768421.25 kB/sec
Min xfer                    =    4096.00 kB

Children see throughput for 25 mixed workload   = 12061204.12 kB/sec
Parent sees throughput for 25 mixed workload    =   39519.61 kB/sec
Min throughput per process          = 1017999.00 kB/sec 
Max throughput per process          = 1237047.00 kB/sec
Avg throughput per process          =  482448.16 kB/sec
Min xfer                    =    2212.00 kB

The Linux VM (Ubuntu 14.04 LTS) shows:

Children see throughput for 1 mixed workload    = 1901520.62 KB/sec
Parent sees throughput for 1 mixed workload     =  176180.65 KB/sec
Min throughput per process          = 1901520.62 KB/sec 
Max throughput per process          = 1901520.62 KB/sec
Avg throughput per process          = 1901520.62 KB/sec
Min xfer                    =    4096.00 KB

Children see throughput for 25 mixed workload   = 5338608.75 KB/sec
Parent sees throughput for 25 mixed workload    =   15434.67 KB/sec
Min throughput per process          =       0.00 KB/sec 
Max throughput per process          = 2675395.75 KB/sec
Avg throughput per process          =  213544.35 KB/sec
Min xfer                    =       0.00 KB

Windows 10 64-bit Pro VM results:

 Children see throughput for 1 mixed workload    =  496220.16 KB/sec
 Parent sees throughput for 1 mixed workload     =  162133.06 KB/sec
 Min throughput per process                      =  496220.16 KB/sec
 Max throughput per process                      =  496220.16 KB/sec
 Avg throughput per process                      =  496220.16 KB/sec
 Min xfer                                        =    4096.00 KB

 Children see throughput for 25 mixed workload   = 1298231.58 KB/sec
 Parent sees throughput for 25 mixed workload    =    7626.09 KB/sec
 Min throughput per process                      =       0.00 KB/sec
 Max throughput per process                      =  285706.31 KB/sec
 Avg throughput per process                      =   51929.26 KB/sec
 Min xfer                                        =       0.00 KB

I see a number of issues here:

  • Linux performance at 1 thread seems legitimate given the virtio overhead, but threaded concurrent access looks like a mess; in particular, the minimum of 0.00 KB/sec per process is strange

  • while the Linux VM seems to scale badly, Windows (and the related virtio drivers) seems a total mess even at 1 thread; I expected some overhead, but people have reported better results here

Of course, Windows 10 is new and supported only by the latest (117) drivers, but Windows 8.1 gives similar results (omitted, as this post is getting long).

I'm open to any documentation or suggestions that can point me in the right direction: my understanding was that cache=none, raw LVM volumes and the correct tuned-adm profile were enough to get good results.
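For what it's worth, this is how I verify that those settings actually reach qemu (the VM name is illustrative):

virsh dumpxml erp-vm | grep -A2 '<driver'                          # should show cache='none' io='native'
ps -ef | grep '[q]emu-kvm' | tr ',' '\n' | grep -E '^(cache|aio)='  # should show cache=none and aio=native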

Updated info

kernel (stock): Linux dl190g9 3.10.0-327.18.2.el7.x86_64 #1 SMP Thu May 12 11:03:55 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

qemu-kvm (CentOS Virt SIG): 105.el7_2.4 (1.5.3)

libvirtd: 1.2.17

When I say "virtio drivers" I mean that the libvirt XML files report virtio as the bus.

  • Windows shows the disk as "Red Hat virtIO SCSI Disk Device"

  • Linux reports the disk as "00:06.0 SCSI storage controller: Red Hat, Inc Virtio block device"

Update 2

  • I previously said that the 1-thread performance on Windows was unexpectedly bad. It actually fits results posted here: a ratio of about 1:3 between native and a Windows VM in throughput.

  • This leaves both scenarios with the bad-scaling issue: throughput drops roughly 1:3 on the host vs 1:10 in the VMs

As per the comments, I'm hoping for improvements in 7.3, which, according to Red Hat's release cadence, should arrive fairly soon (releases are roughly six months apart).

Thanks

matteo nunziati
    Hmm. If you can wait for RHEL 7.3 there will be some updates which significantly improve virtual disk performance, especially if you switch from virtio-blk to virtio-scsi. – Michael Hampton May 24 '16 at 16:36
  • well, the biggest issue is with Windows, as we should migrate our "ERP" onto it. We have a time frame of 3-4 weeks, then we need a plan B. Any pointers to the 7.3 release date and/or the cited improvements? Thank you. – matteo nunziati May 24 '16 at 16:44
  • Yes, the biggest performance improvements are with Windows guests. I've been testing the upcoming stuff as it's already in Fedora and, frankly, Windows performance used to suck. Now it's near native. As for when Red Hat will release it, I have no idea. They don't share their schedule with me. – Michael Hampton May 24 '16 at 16:47
  • just a last comment: do you think it would be worth using nested KVM to test it in a Fedora VM (which version)?! This could really change our plans! – matteo nunziati May 24 '16 at 16:48
  • All the necessary bits are in Fedora 24, which is in beta right now. But I wouldn't bother testing a nested VM (yet). If you can run it on bare metal, that might make a good test. – Michael Hampton May 24 '16 at 16:49

0 Answers