I am seeing latency on a Hyper-V host and its guests lately. When I run ping -f from a Linux prompt, I see the dots "pulsate" like a heartbeat indicating 2 successive latency spikes within 1 second followed by about 1-1.5 seconds of "normal" operation.

Details

This is what it looks like for ICMP echo request/reply packets sent at a 50-ms interval (total length of the sequence is 5.5 seconds):

# time ping -c 100 -i 0.05 192.168.111.199
PING 192.168.111.199 (192.168.111.199) 56(84) bytes of data.
64 bytes from 192.168.111.199: icmp_seq=1 ttl=64 time=1.03 ms
64 bytes from 192.168.111.199: icmp_seq=2 ttl=64 time=0.480 ms
64 bytes from 192.168.111.199: icmp_seq=3 ttl=64 time=0.911 ms
64 bytes from 192.168.111.199: icmp_seq=4 ttl=64 time=3.01 ms
64 bytes from 192.168.111.199: icmp_seq=5 ttl=64 time=0.473 ms
64 bytes from 192.168.111.199: icmp_seq=6 ttl=64 time=0.439 ms
64 bytes from 192.168.111.199: icmp_seq=7 ttl=64 time=0.896 ms
64 bytes from 192.168.111.199: icmp_seq=8 ttl=64 time=0.425 ms
64 bytes from 192.168.111.199: icmp_seq=9 ttl=64 time=0.654 ms
64 bytes from 192.168.111.199: icmp_seq=10 ttl=64 time=2.58 ms
64 bytes from 192.168.111.199: icmp_seq=11 ttl=64 time=0.598 ms
64 bytes from 192.168.111.199: icmp_seq=12 ttl=64 time=0.511 ms
64 bytes from 192.168.111.199: icmp_seq=13 ttl=64 time=0.609 ms
64 bytes from 192.168.111.199: icmp_seq=14 ttl=64 time=0.621 ms
64 bytes from 192.168.111.199: icmp_seq=15 ttl=64 time=0.451 ms
64 bytes from 192.168.111.199: icmp_seq=16 ttl=64 time=1.05 ms
64 bytes from 192.168.111.199: icmp_seq=17 ttl=64 time=0.447 ms
64 bytes from 192.168.111.199: icmp_seq=18 ttl=64 time=0.462 ms
64 bytes from 192.168.111.199: icmp_seq=19 ttl=64 time=0.635 ms
64 bytes from 192.168.111.199: icmp_seq=20 ttl=64 time=1.90 ms
64 bytes from 192.168.111.199: icmp_seq=21 ttl=64 time=0.654 ms
64 bytes from 192.168.111.199: icmp_seq=22 ttl=64 time=0.634 ms
64 bytes from 192.168.111.199: icmp_seq=23 ttl=64 time=794 ms      <<<---- SPIKE
64 bytes from 192.168.111.199: icmp_seq=24 ttl=64 time=742 ms
64 bytes from 192.168.111.199: icmp_seq=25 ttl=64 time=690 ms
64 bytes from 192.168.111.199: icmp_seq=26 ttl=64 time=638 ms
64 bytes from 192.168.111.199: icmp_seq=27 ttl=64 time=586 ms
64 bytes from 192.168.111.199: icmp_seq=28 ttl=64 time=534 ms
64 bytes from 192.168.111.199: icmp_seq=29 ttl=64 time=482 ms
64 bytes from 192.168.111.199: icmp_seq=30 ttl=64 time=430 ms
64 bytes from 192.168.111.199: icmp_seq=31 ttl=64 time=379 ms
64 bytes from 192.168.111.199: icmp_seq=32 ttl=64 time=327 ms
64 bytes from 192.168.111.199: icmp_seq=33 ttl=64 time=275 ms
64 bytes from 192.168.111.199: icmp_seq=34 ttl=64 time=223 ms
64 bytes from 192.168.111.199: icmp_seq=35 ttl=64 time=171 ms
64 bytes from 192.168.111.199: icmp_seq=36 ttl=64 time=119 ms
64 bytes from 192.168.111.199: icmp_seq=37 ttl=64 time=66.8 ms
64 bytes from 192.168.111.199: icmp_seq=38 ttl=64 time=15.1 ms
64 bytes from 192.168.111.199: icmp_seq=39 ttl=64 time=0.657 ms
64 bytes from 192.168.111.199: icmp_seq=40 ttl=64 time=0.755 ms
64 bytes from 192.168.111.199: icmp_seq=41 ttl=64 time=0.744 ms
64 bytes from 192.168.111.199: icmp_seq=42 ttl=64 time=0.652 ms
64 bytes from 192.168.111.199: icmp_seq=43 ttl=64 time=0.507 ms
64 bytes from 192.168.111.199: icmp_seq=44 ttl=64 time=0.338 ms
64 bytes from 192.168.111.199: icmp_seq=45 ttl=64 time=0.455 ms
64 bytes from 192.168.111.199: icmp_seq=46 ttl=64 time=0.755 ms
64 bytes from 192.168.111.199: icmp_seq=47 ttl=64 time=0.530 ms
64 bytes from 192.168.111.199: icmp_seq=48 ttl=64 time=0.730 ms
64 bytes from 192.168.111.199: icmp_seq=49 ttl=64 time=0.385 ms
64 bytes from 192.168.111.199: icmp_seq=50 ttl=64 time=0.664 ms
64 bytes from 192.168.111.199: icmp_seq=51 ttl=64 time=0.440 ms
64 bytes from 192.168.111.199: icmp_seq=52 ttl=64 time=0.721 ms
64 bytes from 192.168.111.199: icmp_seq=53 ttl=64 time=0.963 ms
64 bytes from 192.168.111.199: icmp_seq=54 ttl=64 time=0.452 ms
64 bytes from 192.168.111.199: icmp_seq=55 ttl=64 time=0.535 ms
64 bytes from 192.168.111.199: icmp_seq=56 ttl=64 time=0.467 ms
64 bytes from 192.168.111.199: icmp_seq=57 ttl=64 time=0.591 ms
64 bytes from 192.168.111.199: icmp_seq=58 ttl=64 time=0.749 ms
64 bytes from 192.168.111.199: icmp_seq=59 ttl=64 time=0.440 ms
64 bytes from 192.168.111.199: icmp_seq=60 ttl=64 time=0.498 ms
64 bytes from 192.168.111.199: icmp_seq=61 ttl=64 time=816 ms      <<<---- SPIKE
64 bytes from 192.168.111.199: icmp_seq=62 ttl=64 time=764 ms
64 bytes from 192.168.111.199: icmp_seq=63 ttl=64 time=712 ms
64 bytes from 192.168.111.199: icmp_seq=64 ttl=64 time=660 ms
64 bytes from 192.168.111.199: icmp_seq=65 ttl=64 time=607 ms
64 bytes from 192.168.111.199: icmp_seq=66 ttl=64 time=556 ms
64 bytes from 192.168.111.199: icmp_seq=67 ttl=64 time=504 ms
64 bytes from 192.168.111.199: icmp_seq=68 ttl=64 time=452 ms
64 bytes from 192.168.111.199: icmp_seq=69 ttl=64 time=400 ms
64 bytes from 192.168.111.199: icmp_seq=70 ttl=64 time=348 ms
64 bytes from 192.168.111.199: icmp_seq=71 ttl=64 time=296 ms
64 bytes from 192.168.111.199: icmp_seq=72 ttl=64 time=244 ms
64 bytes from 192.168.111.199: icmp_seq=73 ttl=64 time=192 ms
64 bytes from 192.168.111.199: icmp_seq=74 ttl=64 time=140 ms
64 bytes from 192.168.111.199: icmp_seq=75 ttl=64 time=88.6 ms
64 bytes from 192.168.111.199: icmp_seq=76 ttl=64 time=36.6 ms
64 bytes from 192.168.111.199: icmp_seq=77 ttl=64 time=0.466 ms
64 bytes from 192.168.111.199: icmp_seq=78 ttl=64 time=0.559 ms
64 bytes from 192.168.111.199: icmp_seq=79 ttl=64 time=0.655 ms
64 bytes from 192.168.111.199: icmp_seq=80 ttl=64 time=0.555 ms
64 bytes from 192.168.111.199: icmp_seq=81 ttl=64 time=2.16 ms
64 bytes from 192.168.111.199: icmp_seq=82 ttl=64 time=0.660 ms
64 bytes from 192.168.111.199: icmp_seq=83 ttl=64 time=0.452 ms
64 bytes from 192.168.111.199: icmp_seq=84 ttl=64 time=0.454 ms
64 bytes from 192.168.111.199: icmp_seq=85 ttl=64 time=0.608 ms
64 bytes from 192.168.111.199: icmp_seq=86 ttl=64 time=0.471 ms
64 bytes from 192.168.111.199: icmp_seq=87 ttl=64 time=0.506 ms
64 bytes from 192.168.111.199: icmp_seq=88 ttl=64 time=0.435 ms
64 bytes from 192.168.111.199: icmp_seq=89 ttl=64 time=0.450 ms
64 bytes from 192.168.111.199: icmp_seq=90 ttl=64 time=0.582 ms
64 bytes from 192.168.111.199: icmp_seq=91 ttl=64 time=0.537 ms
64 bytes from 192.168.111.199: icmp_seq=92 ttl=64 time=0.452 ms
64 bytes from 192.168.111.199: icmp_seq=93 ttl=64 time=0.533 ms
64 bytes from 192.168.111.199: icmp_seq=94 ttl=64 time=0.638 ms
64 bytes from 192.168.111.199: icmp_seq=95 ttl=64 time=1.28 ms
64 bytes from 192.168.111.199: icmp_seq=96 ttl=64 time=1.86 ms
64 bytes from 192.168.111.199: icmp_seq=97 ttl=64 time=0.770 ms
64 bytes from 192.168.111.199: icmp_seq=98 ttl=64 time=0.822 ms
64 bytes from 192.168.111.199: icmp_seq=99 ttl=64 time=0.452 ms
64 bytes from 192.168.111.199: icmp_seq=100 ttl=64 time=385 ms      <<<---- SPIKE

--- 192.168.111.199 ping statistics ---
100 packets transmitted, 100 received, 0% packet loss, time 5112ms
rtt min/avg/max/mdev = 0.338/137.383/816.359/237.291 ms, pipe 16

real    0m5.502s
user    0m0.028s
sys     0m0.028s

From the ping result it looks like a full stop of data transmission for about 800 ms followed by a period of resumed operations for 23-24 request/reply sequences (around 1200 ms).
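The "full stop" reading can be checked arithmetically: within a spike, each RTT is about 52 ms shorter than the previous one, which is roughly the send interval, so the delayed replies should all have arrived at about the same moment. Below is a minimal Python sketch of that check. It assumes the requests left at a fixed interval of 5112 ms / 99 (taken from the summary line), and the name `arrival_times` is hypothetical, introduced only for illustration.

```python
import re

# Hypothetical helper for illustration; not part of the original post.
REPLY_RE = re.compile(r"icmp_seq=(\d+) ttl=\d+ time=([\d.]+) ms")

# Assumption: requests left at a fixed interval. The summary line reports
# 100 packets over 5112 ms, i.e. 99 gaps between consecutive requests.
INTERVAL_MS = 5112 / 99

# A few reply lines copied from the first spike in the output above.
SAMPLE = """\
64 bytes from 192.168.111.199: icmp_seq=22 ttl=64 time=0.634 ms
64 bytes from 192.168.111.199: icmp_seq=23 ttl=64 time=794 ms
64 bytes from 192.168.111.199: icmp_seq=24 ttl=64 time=742 ms
64 bytes from 192.168.111.199: icmp_seq=30 ttl=64 time=430 ms
64 bytes from 192.168.111.199: icmp_seq=37 ttl=64 time=66.8 ms
64 bytes from 192.168.111.199: icmp_seq=38 ttl=64 time=15.1 ms
64 bytes from 192.168.111.199: icmp_seq=39 ttl=64 time=0.657 ms
"""


def arrival_times(text, interval_ms):
    """Map icmp_seq to the estimated reply arrival time in ms.

    Arrival is measured from the moment the first request was sent:
    the send offset of the request plus its reported round-trip time.
    """
    arrivals = {}
    for match in REPLY_RE.finditer(text):
        seq = int(match.group(1))
        rtt_ms = float(match.group(2))
        arrivals[seq] = (seq - 1) * interval_ms + rtt_ms
    return arrivals


for seq, arrival in sorted(arrival_times(SAMPLE, INTERVAL_MS).items()):
    print(f"icmp_seq={seq:3d}  reply arrived at ~{arrival:7.1f} ms")
```

Under that fixed-interval assumption, the estimated arrival times for icmp_seq 23 through 38 fall within a few milliseconds of each other, which fits replies being held back and then released in one burst rather than each packet being delayed independently.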

History and environment

This seems to be new and associated with a reboot 2 days ago; I have not seen it before on this host. The host is running Hyper-V 2008 R2 on a recent HP DL165 G7 (AMD Opteron 6238, 4x Intel 82576). The host's management interface as well as all guests (connected on a separate interface) are uniformly affected. The connection to the testing machine is either a local Ethernet LAN or a routed local IP network (Ethernet throughout).

Diagnostics so far:

The network itself is "clean":

# ping -f 192.168.112.187
PING 192.168.112.187 (192.168.112.187) 56(84) bytes of data.
....^C
--- 192.168.112.187 ping statistics ---
8633 packets transmitted, 8629 received, 0% packet loss, time 11208ms
rtt min/avg/max/mdev = 0.330/1.096/25.797/1.643 ms, pipe 2, ipg/ewma 1.298/1.499 ms

To rule out a hardware issue, I also tried a different switch port / interface combination, without any effect: the problem seems to persist on any interface.

Hyper-V guests of this host which are on the same network (and get their traffic switched by the Hyper-V virtual switch) do not seem to have any latency issues.

I've read about Hyper-V timing issues with AMD CPUs, but these appeared to affect only guest systems. Also, people occasionally see latencies for aggregated/bonded Ethernet channels, but we do not have those.

the-wabbit

1 Answer

I've updated the NIC drivers to version 16.8.1, the most recent available from the Intel website, and the problem went away (at least for now). So apparently it was a driver issue.

the-wabbit