I've set up several KVM-based networks before and never encountered this issue; I can't for the life of me think what I'd have set up differently previously.
Setup
Basically, I've got an entirely Dell stack:
- 2x Dell N2024s (stacked gigabit switches)
- Several Dell R720s for KVM hypervisors
- 2x Dell R320s for gateway/firewalls
All machines run CentOS 6.5; the hypervisors are basically a standard install with a few sysctl tweaks.
At the moment I've got a few test VMs set up, configured similarly to their hosts (CentOS 6.x, base install with basic Puppet-driven configuration). All VMs are:
- Bridged to one of two physically separated networks (i.e. each hypervisor has two Ethernet connections: one for a public/DMZ bridged LAN, the other for a private one)
All VMs use virtio for network and block devices (basically the bog-standard result of running virt-install), e.g. this example libvirt config:
    <interface type='bridge'>
      <mac address='52:54:00:11:a7:f0'/>
      <source bridge='dmzbr0'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
All VMs have between 2 and 8 vCPUs, 8 to 64GB of RAM, and drives backed by LVM volumes on the host machine.
Simple file copies within the VMs, and dd tests, yield perfectly acceptable results (300MB/s - 800MB/s in these small-scale synthetic tests).
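For reference, the dd tests were along these lines; the path and size here are illustrative, not the exact ones I used:

    # Sequential write inside the guest; fdatasync forces a flush so
    # the page cache doesn't inflate the result
    dd if=/dev/zero of=/var/tmp/ddtest bs=1M count=4096 conv=fdatasync
    # Sequential read of the same file, dropping caches first
    echo 3 > /proc/sys/vm/drop_caches
    dd if=/var/tmp/ddtest of=/dev/null bs=1M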
Network Performance between Physical Machines
I've left jumbo frame/MTU configuration alone for now, and server-to-server transfers will quite happily max out the gigabit connection, or thereabouts (a flat 100MB/s - 118MB/s over several large-file tests to/from each machine).
Network Performance between a Physical Machine and VM (and VM to VM)
Rsync/SSH transfer rates fluctuate constantly (unstable), but always sit between 24MB/s and a maximum of about 38MB/s.
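For what it's worth, rsync over SSH adds cipher overhead, so a raw TCP test with something like iperf would isolate the network path itself. A minimal sketch; the address is an example, not my actual topology:

    # On the VM, run an iperf server
    iperf -s
    # On the physical machine, run a 30-second TCP test against it
    # (10.0.0.5 stands in for the VM's address)
    iperf -c 10.0.0.5 -t 30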
I've performed several other tests:
- between a physical machine's IP on one bridge and a VM on another bridge
- between a physical machine's IP on one bridge and a VM on the same bridge
- starting the VMs with e1000 device drivers instead of virtio
Nothing seems to have worked. Has anyone encountered this much performance degradation before? I've just checked my older network (hosted at another DC) and, apart from the fact that it uses a different switch (a much cheaper, older PowerConnect 2824), VM network performance there is closer to 80-90% of raw network performance, not less than half.
If I can provide any setup/configs or extra information, I'm more than happy to!
Update (14/08/2014)
Tried a few things:
- Enabled jumbo frames/MTU 9000 on the host bridge, adapter, and VMs (marginal performance improvement; average now above 30MB/s)
- Tested GSO, LRO, and TSO off/on on the host (no noticeable effect)
- Tested further sysctl optimisations (tweaking rmem/wmem; a sustained 1-2% performance increase)
- Tested the vhost_net driver (small increase in performance)
- Enabled the vhost_net driver (as above) with the same sysctl optimisations (at least a 10-20% performance jump on previous results)
- Enabled multiqueue, which Red Hat's performance optimisation guide mentions can help, though I noticed no difference
(rough command sketches for the MTU/offload, sysctl, and vhost_net/multiqueue steps follow below)
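The MTU and offload toggles were roughly the following; interface and bridge names are examples (mine differ), and each offload was tested both on and off:

    # Jumbo frames: the MTU has to match on the physical NIC, the
    # bridge, and the guest interface (and the switch ports)
    ip link set eth0 mtu 9000
    ip link set dmzbr0 mtu 9000
    # Offload toggles on the host NIC
    ethtool -K eth0 gso off
    ethtool -K eth0 tso off
    ethtool -K eth0 lro off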
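The sysctl tweaks were along these lines; the values are ones I experimented with, not recommendations:

    # Raise socket buffer limits and TCP autotuning ranges
    sysctl -w net.core.rmem_max=16777216
    sysctl -w net.core.wmem_max=16777216
    sysctl -w net.ipv4.tcp_rmem="4096 87380 16777216"
    sysctl -w net.ipv4.tcp_wmem="4096 65536 16777216"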
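And the vhost_net/multiqueue steps, roughly; the queue count is illustrative, and multiqueue needs new enough qemu and guest kernel support, which may be why I saw no difference:

    # Host: load vhost_net so qemu's tap devices can use it
    modprobe vhost_net
    lsmod | grep vhost_net
    # Multiqueue: via `virsh edit`, add a child element like
    #   <driver name='vhost' queues='4'/>
    # to the guest's <interface>, then enable the queues in the guest:
    ethtool -L eth0 combined 4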
During transfers, the host seems to sit at 125% CPU (for the VM's qemu-kvm process). Could this have something to do with assigning too many vCPUs to the guest, or with CPU/NUMA affinity?
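One thing I haven't tried yet on that front: pinning vCPUs to host cores on the NIC's NUMA node. A hedged sketch, where the domain name and core numbers are examples:

    # Inspect the host's NUMA layout
    numactl --hardware
    # Pin the guest's vCPUs 0 and 1 to host cores 2 and 3 (live)
    virsh vcpupin testvm 0 2
    virsh vcpupin testvm 1 3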
However, after all that, I seem to have increased the average sustained rate from 25-30MB/s to 40-45MB/s. It's a decent improvement, but I'm sure I can get closer to bare-metal performance (it's still well under half at the moment).
Any other ideas?