I'm running my application on Azure Ubuntu 16.04 VMs (packaged as a VHD image). In several cases I have observed that the SR-IOV VF interfaces were missing after the VMs started. The issue is not consistently reproducible, but it has happened on several Standard_F8s_v2 instances deployed in our labs.
Each VM has 3 network interfaces attached, 2 of which have accelerated networking enabled. The instances are deployed using Terraform scripts.
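For anyone who wants to check a running VM for the same symptom, here is a minimal sketch. It assumes that each VF interface carries the same MAC address as its synthetic `ethN` counterpart, which is what the Hw-Address column in the console output further down suggests. The function names `interfaces_by_mac` and `missing_vfs` are made up for illustration.

```python
import os
from collections import defaultdict


def interfaces_by_mac(sysfs_net="/sys/class/net"):
    """Map each MAC address to the sorted list of interface names that carry it."""
    by_mac = defaultdict(list)
    for name in sorted(os.listdir(sysfs_net)):
        addr_file = os.path.join(sysfs_net, name, "address")
        try:
            with open(addr_file) as fh:
                mac = fh.read().strip().lower()
        except OSError:
            # Skip entries that have no readable "address" file.
            continue
        by_mac[mac].append(name)
    return dict(by_mac)


def missing_vfs(by_mac, accelerated_macs):
    """Return the MACs of accelerated NICs that have no second (VF) interface.

    Assumption: a healthy accelerated NIC shows up as two interfaces
    (synthetic + VF) sharing one MAC address.
    """
    return [m for m in accelerated_macs if len(by_mac.get(m.lower(), [])) < 2]
```

On the VM itself, calling `missing_vfs(interfaces_by_mac(), ["00:0d:3a:38:c3:06", "00:0d:3a:38:c9:4a"])` with the MACs of the two accelerated NICs should return an empty list when both VFs are present.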
Below is a serial console snippet in which the "enp2s2" and "enp3s3" VF interfaces (used through DPDK's net_failsafe driver) are missing, followed by a console snippet from another reboot in which they are present. I suspected that the issue was related to the initialization sequence of the Mellanox kernel modules, but the startup statistics I gathered from many VMs did not point to this as the root cause. In addition, I tried shutting down/restarting the devices from the Azure portal and rebooting the VMs via ssh, but the problem still occurred.
Snippet with the missing interfaces (they are absent from the "ci-info: Net device info" list):
[ 1.961519] random: systemd-udevd: uninitialized urandom read (16 bytes read)
[ 1.961992] random: udevadm: uninitialized urandom read (16 bytes read)
[ 1.966624] random: systemd-udevd: uninitialized urandom read (16 bytes read)
[ 2.013574] hv_vmbus: registering driver hv_netvsc
[ 2.017652] hv_utils: Registering HyperV Utility Driver
[ 2.021791] hv_vmbus: registering driver hv_util
[ 2.026534] hidraw: raw HID events driver (C) Jiri Kosina
[ 2.031197] hv_vmbus: registering driver hyperv_keyboard
[ 2.035793] hv_vmbus: registering driver hid_hyperv
[ 2.041572] hv_vmbus: registering driver hyperv_fb
[ 2.048342] AVX2 version of gcm_enc/dec engaged.
[ 2.052076] AES CTR mode by8 optimization enabled
[ 4.344565] hv_utils: Shutdown IC version 3.0
[ 4.348061] input: AT Translated Set 2 keyboard as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A03:00/device:07/VMBUS:01/d34b2567-b9b6-42b9-8778-0a4ec0b955bf/serio2/input/input3
[ 4.358187] input: Microsoft Vmbus HID-compliant Mouse as /devices/0006:045E:0621.0001/input/input4
[ 4.364222] hid 0006:045E:0621.0001: input: <UNKNOWN> HID v0.01 Mouse [Microsoft Vmbus HID-compliant Mouse] on
[ 4.371502] hyperv_fb: Screen resolution: 1152x864, Color depth: 32
[ 4.378396] Console: switching to colour frame buffer device 144x54
[ 4.383835] hv_utils: VSS IC version 5.0
[ 5.560104] hv_utils: Heartbeat IC version 3.0
Begin: Loading essential drivers ... [ 6.624015] raid6: sse2x1 gen() 11383 MB/s
[ 6.672008] raid6: sse2x1 xor() 8608 MB/s
[ 6.720013] raid6: sse2x2 gen() 14144 MB/s
[ 6.768008] raid6: sse2x2 xor() 9824 MB/s
[ 6.816009] raid6: sse2x4 gen() 15697 MB/s
[ 6.864007] raid6: sse2x4 xor() 11120 MB/s
[ 6.912010] raid6: avx2x1 gen() 20281 MB/s
[ 6.960008] raid6: avx2x1 xor() 16369 MB/s
[ 7.008009] raid6: avx2x2 gen() 25744 MB/s
[ 7.056007] raid6: avx2x2 xor() 17842 MB/s
[ 7.104010] raid6: avx2x4 gen() 27707 MB/s
[ 7.152008] raid6: avx2x4 xor() 19589 MB/s
[ 7.200010] raid6: avx512x1 gen() 25916 MB/s
[ 7.248010] raid6: avx512x1 xor() 15400 MB/s
[ 7.296008] raid6: avx512x2 gen() 31486 MB/s
[ 7.344010] raid6: avx512x2 xor() 19259 MB/s
[ 7.392009] raid6: avx512x4 gen() 32802 MB/s
[ 7.440008] raid6: avx512x4 xor() 19748 MB/s
[ 7.442786] raid6: using algorithm avx512x4 gen() 32802 MB/s
[ 7.446455] raid6: .... xor() 19748 MB/s, rmw enabled
[ 7.449751] raid6: using avx512x2 recovery algorithm
[ 7.454113] xor: automatically using best checksumming function avx
[ 7.459720] async_tx: api initialized (async)
done.
Begin: Running /scripts/init-premount ... done.
Begin: Mounting root file system ... Begin: Running /scripts/local-top ... done.
Begin: Running /scripts/local-premount ... [ 7.500972] Btrfs loaded, crc32c=crc32c-intel
Scanning for Btrfs filesystems
done.
Warning: fsck not present, so skipping root file system
[ 7.621580] EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: (null)
done.
Begin: Running /scripts/local-bottom ... done.
Begin: Running /scripts/init-bottom ... done.
[ 7.716053] random: crng init done
[ 7.718135] random: 7 urandom warning(s) missed due to ratelimiting
[ 7.754028] EXT4-fs (sda1): re-mounted. Opts: discard
Cloud-init v. 19.2-24-ge7881d5c-0ubuntu1~16.04.1 running 'init-local' at Sun, 06 Oct 2019 11:06:31 +0000. Up 8.04 seconds.
cloud-init-nonet[8.59]: static networking is now up
Cloud-init v. 19.2-24-ge7881d5c-0ubuntu1~16.04.1 running 'init' at Sun, 06 Oct 2019 11:06:32 +0000. Up 8.80 seconds.
ci-info: ++++++++++++++++++++++++++++++++++++++Net device info+++++++++++++++++++++++++++++++++++++++
ci-info: +--------+------+-----------------------------+---------------+--------+-------------------+
ci-info: | Device | Up | Address | Mask | Scope | Hw-Address |
ci-info: +--------+------+-----------------------------+---------------+--------+-------------------+
ci-info: | eth0 | True | 10.29.50.32 | 255.255.255.0 | global | 00:0d:3a:38:c1:f7 |
ci-info: | eth0 | True | fe80::20d:3aff:fe38:c1f7/64 | . | link | 00:0d:3a:38:c1:f7 |
ci-info: | eth1 | True | 10.29.115.10 | 255.255.255.0 | global | 00:0d:3a:38:c3:06 |
ci-info: | eth1 | True | fe80::20d:3aff:fe38:c306/64 | . | link | 00:0d:3a:38:c3:06 |
ci-info: | eth2 | True | 10.29.211.10 | 255.255.255.0 | global | 00:0d:3a:38:c9:4a |
ci-info: | lo | True | 127.0.0.1 | 255.0.0.0 | host | . |
ci-info: | lo | True | ::1/128 | . | host | . |
ci-info: +--------+------+-----------------------------+---------------+--------+-------------------+
Snippet from another reboot, where the interfaces are present:
[ 1.990765] random: systemd-udevd: uninitialized urandom read (16 bytes read)
[ 1.991261] random: udevadm: uninitialized urandom read (16 bytes read)
[ 1.995831] random: systemd-udevd: uninitialized urandom read (16 bytes read)
[ 2.042179] hv_vmbus: registering driver hv_netvsc
[ 2.042182] hv_utils: Registering HyperV Utility Driver
[ 2.050908] hv_vmbus: registering driver hv_util
[ 2.055113] hv_vmbus: registering driver hyperv_keyboard
[ 2.059757] hidraw: raw HID events driver (C) Jiri Kosina
[ 2.066167] hv_vmbus: registering driver hid_hyperv
[ 2.073141] hv_vmbus: registering driver hyperv_fb
[ 2.082086] AVX2 version of gcm_enc/dec engaged.
[ 2.086060] AES CTR mode by8 optimization enabled
[ 3.200601] hv_utils: Heartbeat IC version 3.0
[ 3.204903] input: AT Translated Set 2 keyboard as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A03:00/device:07/VMBUS:01/d34b2567-b9b6-42b9-8778-0a4ec0b955bf/serio2/input/input3
[ 3.219127] input: Microsoft Vmbus HID-compliant Mouse as /devices/0006:045E:0621.0001/input/input4
[ 3.226095] hid 0006:045E:0621.0001: input: <UNKNOWN> HID v0.01 Mouse [Microsoft Vmbus HID-compliant Mouse] on
[ 3.234239] hyperv_fb: Screen resolution: 1152x864, Color depth: 32
[ 3.241442] Console: switching to colour frame buffer device 144x54
[ 3.388646] hv_utils: Shutdown IC version 3.0
[ 3.391537] hv_utils: VSS IC version 5.0
Begin: Loading essential drivers ... [ 4.688013] raid6: sse2x1 gen() 11377 MB/s
[ 4.736006] raid6: sse2x1 xor() 8608 MB/s
[ 4.784008] raid6: sse2x2 gen() 14143 MB/s
[ 4.832012] raid6: sse2x2 xor() 9799 MB/s
[ 4.880013] raid6: sse2x4 gen() 15731 MB/s
[ 4.928012] raid6: sse2x4 xor() 11084 MB/s
[ 4.976012] raid6: avx2x1 gen() 20223 MB/s
[ 5.024009] raid6: avx2x1 xor() 16357 MB/s
[ 5.072009] raid6: avx2x2 gen() 25798 MB/s
[ 5.120011] raid6: avx2x2 xor() 17831 MB/s
[ 5.168009] raid6: avx2x4 gen() 27715 MB/s
[ 5.216010] raid6: avx2x4 xor() 18555 MB/s
[ 5.264012] raid6: avx512x1 gen() 27142 MB/s
[ 5.312011] raid6: avx512x1 xor() 15413 MB/s
[ 5.360011] raid6: avx512x2 gen() 31448 MB/s
[ 5.408010] raid6: avx512x2 xor() 19311 MB/s
[ 5.456012] raid6: avx512x4 gen() 32843 MB/s
[ 5.504011] raid6: avx512x4 xor() 20650 MB/s
[ 5.506896] raid6: using algorithm avx512x4 gen() 32843 MB/s
[ 5.510878] raid6: .... xor() 20650 MB/s, rmw enabled
[ 5.514425] raid6: using avx512x2 recovery algorithm
[ 5.519051] xor: automatically using best checksumming function avx
[ 5.525076] async_tx: api initialized (async)
done.
Begin: Running /scripts/init-premount ... done.
Begin: Mounting root file system ... Begin: Running /scripts/local-top ... done.
Begin: Running /scripts/local-premount ... [ 5.563208] Btrfs loaded, crc32c=crc32c-intel
Scanning for Btrfs filesystems
done.
Warning: fsck not present, so skipping root file[ 5.680481] EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: (null)
system
done.
Begin: Running /scripts/local-bottom ... done.
Begin: Running /scripts/init-bottom ... done.
[ 5.803547] EXT4-fs (sda1): re-mounted. Opts: discard
[ 5.825023] Adding 614396k swap on /var/cache/swap/swapfile. Priority:-2 extents:8 across:1114108k FS
[ 5.980054] random: crng init done
[ 5.982620] random: 7 urandom warning(s) missed due to ratelimiting
Cloud-init v. 19.2-24-ge7881d5c-0ubuntu1~16.04.1 running 'init-local' at Sun, 06 Oct 2019 12:06:17 +0000. Up 6.10 seconds.
cloud-init-nonet[6.73]: static networking is now up
Cloud-init v. 19.2-24-ge7881d5c-0ubuntu1~16.04.1 running 'init' at Sun, 06 Oct 2019 12:06:18 +0000. Up 6.97 seconds.
ci-info: ++++++++++++++++++++++++++++++++++++++Net device info+++++++++++++++++++++++++++++++++++++++
ci-info: +--------+------+-----------------------------+---------------+--------+-------------------+
ci-info: | Device | Up | Address | Mask | Scope | Hw-Address |
ci-info: +--------+------+-----------------------------+---------------+--------+-------------------+
ci-info: | **enp2s2** | **True** | . | . | . | 00:0d:3a:38:c3:06 |
ci-info: | **enp3s3** | **True** | . | . | . | 00:0d:3a:38:c9:4a |
ci-info: | eth0 | True | 10.29.50.32 | 255.255.255.0 | global | 00:0d:3a:38:c1:f7 |
ci-info: | eth1 | True | 10.29.115.10 | 255.255.255.0 | global | 00:0d:3a:38:c3:06 |
ci-info: | eth1 | True | fe80::20d:3aff:fe38:c306/64 | . | link | 00:0d:3a:38:c3:06 |
ci-info: | eth2 | True | 10.29.211.10 | 255.255.255.0 | global | 00:0d:3a:38:c9:4a |
ci-info: | eth2 | True | fe80::20d:3aff:fe38:c94a/64 | . | link | 00:0d:3a:38:c9:4a |
ci-info: | lo | True | 127.0.0.1 | 255.0.0.0 | host | . |
ci-info: | lo | True | ::1/128 | . | host | . |
ci-info: +--------+------+-----------------------------+---------------+--------+-------------------+
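To compare many boots without reading the tables by eye, the "ci-info" rows can be parsed from saved console logs. This is only a sketch: it assumes the six-column table layout shown above, and the helper names `ci_info_devices` and `missing_devices` are hypothetical.

```python
import re

# Matches table rows such as "ci-info: | eth0 | True | ... |".
# Border lines ("ci-info: +----") and the title line do not match.
CI_ROW = re.compile(r"^ci-info: \|(.+)\|\s*$")


def ci_info_devices(console_text):
    """Return {device name: MAC} parsed from the cloud-init net device table."""
    devices = {}
    for line in console_text.splitlines():
        match = CI_ROW.match(line.strip())
        if not match:
            continue
        # Strip padding and any "**" emphasis markers around the cell text.
        cells = [c.strip().strip("*").strip() for c in match.group(1).split("|")]
        if len(cells) != 6 or cells[0] == "Device":
            continue  # not a data row of the six-column table
        devices.setdefault(cells[0], cells[5])
    return devices


def missing_devices(reference_log, suspect_log):
    """Devices listed in the reference boot but absent from the suspect boot."""
    reference = set(ci_info_devices(reference_log))
    suspect = set(ci_info_devices(suspect_log))
    return sorted(reference - suspect)
```

Passing the second snippet as `reference_log` and the first as `suspect_log` would be expected to report `enp2s2` and `enp3s3`.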