
Bug:

I run a qemu-kvm guest on Ubuntu (host: 14.04.5 LTS, kernel 3.13.0-95, cannot upgrade; guest: Ubuntu 18.04.03 LTS, kernel 4.15.0-65). The guest has two network interfaces, each directly (VEPA) connected to a separate host physical interface. A program running on the guest receives a lot of traffic via its LAN interface. After about 1 hour, this LAN interface permanently drops off the network, and a guest CPU core locks up handling soft IRQ load from that interface. During actual traffic, before the interface locks up, the IRQ load is insignificant.

HOST:


$ top
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
14393 libvirt+  20   0  0.101t 7.821g   9524 S 124.7  5.5   1187:32 qemu-system-x86


$ strace -f -p 14393
[pid 14401] ioctl(9, KVM_IRQ_LINE_STATUS, 0x7f2749ffaa90) = 0
[pid 14401] ioctl(9, KVM_IRQ_LINE_STATUS, 0x7f2749ffaac0) = 0


$ apt show qemu-system-x86
Package: qemu-system-x86
Version: 2.0.0+dfsg-2ubuntu1.46

Configuration of the interface in question:

<interface type='direct'>
  <mac address='52:54:00:68:df:1e'/>
  <source dev='eth1' mode='vepa'/>
  <target dev='macvtap1'/>
  <model type='rtl8139'/>
  <alias name='net1'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
</interface>

GUEST:


$ top

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
   52 root      20   0       0      0      0 R  99.7  0.0   1055:23 softirqd/7

Nothing else is running on the guest.


$ tail /var/log/kern.log

Oct 5 09:54:07 bigram kernel: [67536.228012] watchdog: BUG: soft lockup - CPU#7 stuck for 23s! [ksoftirqd/7:52]
Oct 5 09:54:07 bigram kernel: [67536.228054] Modules linked in: kvm irqbypass input_leds joydev mac_hid serio_raw sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear cirrus ttm drm_kms_helper syscopyarea sysfillrect sysimgblt 8139too psmouse fb_sys_fops floppy drm 8139cp virtio_blk mii i2c_piix4 pata_acpi
Oct 5 09:54:07 bigram kernel: [67536.228054] CPU: 7 PID: 52 Comm: ksoftirqd/7 Tainted: G L 4.15.0-65-generic #74-Ubuntu


root@bigram:~$ cat /proc/interrupts 
           CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7
  0:         44          0          0          0          0          0          0          0   IO-APIC   2-edge      timer
  1:          0          0          0          0          0          0          0          9   IO-APIC   1-edge      i8042
  6:          0          3          0          0          0          0          0          0   IO-APIC   6-edge      floppy
  8:          1          0          0          0          0          0          0          0   IO-APIC   8-edge      rtc0
  9:          0          0          0          0          0          0          0          0   IO-APIC   9-fasteoi   acpi
 10:          0          0        113          0          0          0          0 2912922455   IO-APIC  10-fasteoi   virtio1, ens6
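
The counters above are cumulative, so a single snapshot does not show whether IRQ 10 is still firing after the interface drops. One way to check is to sample /proc/interrupts twice and compare. The following is only a minimal sketch of that idea, not something I have run on this guest; the function names (parse_interrupts, irq_rates, sample_rates) and the 5-second interval are my own illustrative choices.

```python
import time


def parse_interrupts(text):
    """Parse /proc/interrupts text into {irq_label: (total_count, description)}."""
    lines = text.strip("\n").splitlines()
    ncpu = len(lines[0].split())  # header row lists CPU0, CPU1, ...
    table = {}
    for line in lines[1:]:
        if ":" not in line:
            continue  # skip blank or malformed lines
        label, _, rest = line.partition(":")
        fields = rest.split()
        counts = []
        # Per-CPU counters come first; some rows (e.g. ERR) have fewer columns.
        for field in fields[:ncpu]:
            if not field.isdigit():
                break
            counts.append(int(field))
        description = " ".join(fields[len(counts):])
        table[label.strip()] = (sum(counts), description)
    return table


def irq_rates(before, after, seconds):
    """Interrupts per second for every IRQ present in both samples."""
    return {
        irq: (after[irq][0] - before[irq][0]) / seconds
        for irq in after
        if irq in before
    }


def sample_rates(path="/proc/interrupts", interval=5.0):
    """Read the file twice, `interval` seconds apart, and return per-IRQ rates."""
    with open(path) as f:
        first = parse_interrupts(f.read())
    time.sleep(interval)
    with open(path) as f:
        second = parse_interrupts(f.read())
    return irq_rates(first, second, interval), second
```

Calling sample_rates() on the guest and looking at the entry for "10" would show how many interrupts per second the virtio1/ens6 line is taking at that moment, which can then be compared between the healthy and the locked-up state.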


root@bigram:~$ ifconfig -v ens6
ens6: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.18.8.0  netmask 255.255.0.0  broadcast 10.18.255.255
        inet6 fe80::5054:ff:fe68:df1e  prefixlen 64  scopeid 0x20<link>
        ether 52:54:00:68:df:1e  txqueuelen 1000  (Ethernet)
        RX packets 15019446  bytes 22384515076 (22.3 GB)
        RX errors 0  dropped 581076  overruns 0  frame 0
        TX packets 660384  bytes 95322423 (95.3 MB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 3301735

Alec Matusis
You're asking for help with an end-of-life distribution. You need to upgrade. I've seen this before and, unfortunately for you, it was a bug in old versions of KVM. You need to upgrade. If you really "can't upgrade" then you'll just have to shut down and start the affected VMs until you can upgrade. – Michael Hampton Oct 06 '19 at 12:19
