
I'm worried about the attack surface that the Linux kernel networking stack, including NIC drivers and packet filtering, offers to a remote attacker. So I'm planning to isolate as much of the networking code (drivers, packet filtering, etc.) as possible in a KVM virtual machine (basically what Qubes OS does, but with KVM instead of the Xen type 1 hypervisor).

AFAIK, there are two ways to achieve this: either pass through the host's Ethernet device using PCI passthrough, which has the advantage that the machine's IOMMU hardware can be used to isolate DMA, or use an Ethernet device attached via USB and pass the USB device through to the virtual machine. In both cases, it's then the virtual machine's kernel that actually does the low-level networking. I would use PCI passthrough of the mainboard's NIC, but for several reasons that turns out not to be an option, so I'm looking at USB passthrough of an Ethernet-to-USB dongle.

  1. Am I right in assuming that with both methods (PCI & USB passthrough), I'm actually reducing the attack surface of the host kernel with regard to networking? In my mind, with that solution, the host kernel simply passes data through to the virtual machine and none of the network traffic touches any part of the host kernel's networking code. Is that right?
  2. Are there glaring problems with passing through a device via USB that render the isolation of networking in the virtual machine useless/pointless? I'm not worried about the physical security of the machine, e.g. I don't have to defend against someone plugging in a rogue USB adapter that would convince the host's USB 3 controller to do anything it wasn't supposed to do. My main goal is to make remote exploits more difficult.
  3. Is the whole endeavour a lost cause? E.g. am I worrying about holes in a tried-and-true part of the kernel while adding a larger attack surface to the system by using virtualization? Am I actually decreasing the overall security of the system by using KVM?
Out of Band
  • I just found out about MirageOS and its unikernels, which can run entire TCP/IP stacks on KVM or Xen and are written in a functional language. Didn't try it yet but it looks promising. – PPP Feb 24 '20 at 22:45

2 Answers


Yes, this can reduce the host kernel's attack surface. It can also be useful for isolating devices in case you have a broken DMAR table and cannot otherwise protect against DMA attacks. This is something I have done in the past for most PCI devices. Some things to consider:

Use VFIO, not PCI passthrough

VFIO is newer and will properly isolate devices in their IOMMU groups, whereas legacy PCI passthrough can potentially be bypassed if any devices in the same IOMMU group are not passed through. From the kernel's VFIO documentation:

IOMMU groups try to describe the smallest sets of devices which can be considered isolated from the perspective of the IOMMU. [...] Legacy KVM device assignment will allow a user to assign these devices separately, but the configuration is guaranteed to fail. VFIO is governed by IOMMU groups and therefore prevents configurations which violate this most basic requirement of IOMMU granularity.
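To make this concrete, here is a minimal sketch of how to inspect IOMMU groups and bind a NIC to vfio-pci from the host via sysfs. The PCI address 0000:03:00.0, the host driver r8169, and the vendor/device ID 10ec 8168 are hypothetical placeholders; substitute whatever lspci reports for your hardware.

```
# List every IOMMU group and the devices it contains
find /sys/kernel/iommu_groups/ -type l | sort

# Show which group a specific device belongs to (placeholder address)
readlink /sys/bus/pci/devices/0000:03:00.0/iommu_group

# Unbind the device from its current host driver (placeholder driver name)
echo 0000:03:00.0 > /sys/bus/pci/drivers/r8169/unbind

# Bind it to vfio-pci by vendor/device ID (placeholder ID, from `lspci -nn`)
modprobe vfio-pci
echo "10ec 8168" > /sys/bus/pci/drivers/vfio-pci/new_id
```

Every device in the NIC's group has to be passed through (or bound to a stub driver) together, which is exactly the constraint the quoted documentation describes.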

Beware option ROMs

There is one thing you should keep in mind, though. When you pass a PCI device directly through to a guest, the guest gets raw access to the PCI configuration space, including read/write access to the option ROM. An option ROM is a bit of firmware stored on certain devices which is executed by the BIOS during early boot. If the guest kernel is hijacked, then even if it cannot escape into the host, it will be able to write to the option ROM on the physical device and wait until you next reboot, at which point the option ROM is called by the BIOS, subjecting you to a BIOS rootkit.
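As a crude detection measure (not a substitute for the mitigation described next), you can dump the device's option ROM from the host via sysfs and compare its hash across reboots. This is a sketch; the PCI address is a placeholder, and not every device actually exposes a ROM through this interface.

```
# Dump and hash a device's option ROM (placeholder address)
cd /sys/bus/pci/devices/0000:03:00.0
echo 1 > rom                    # enable reading the ROM through sysfs
cat rom > /root/nic-oprom.bin
echo 0 > rom                    # disable ROM access again
sha256sum /root/nic-oprom.bin   # compare against a known-good hash
```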

There is a way to mitigate this risk, however. Intel TXT allows you to verify option ROMs before they are called. This requires properly setting up tboot, which enables various TXT features at boot; see the official tboot documentation for the details. This is very important, so don't ignore it!
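For orientation only, a GRUB 2 entry that loads tboot before the kernel looks roughly like the sketch below; this is not taken from the tboot documentation, and the kernel version, root device, command line, and SINIT ACM file name are all placeholders that depend on your platform.

```
menuentry "Linux (tboot / Intel TXT)" {
    # tboot is loaded as the multiboot kernel; Linux and the SINIT ACM follow as modules
    multiboot /boot/tboot.gz logging=serial,memory,vga
    module /boot/vmlinuz-4.9.0 root=/dev/sda1 ro intel_iommu=on
    module /boot/initrd.img-4.9.0
    module /boot/SINIT.BIN
}
```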

Harden userspace components

By far, the most vulnerabilities will be present in the userspace component (QEMU, or maybe kvmtool), so securing it should be a high priority. Assuming you will be using QEMU, you should ensure it restricts its filesystem access to an empty directory with the -chroot option, drops to a dedicated, unprivileged user with the -runas option, and enables a seccomp sandbox with the -sandbox option. Attack surface can be reduced further by disabling unnecessary features (ACPI, RTC, etc.) and setting resource limits. Using a MAC such as AppArmor or SELinux is also recommended. If you will be compiling QEMU from source, you can further harden it at compile time with the right gcc flags. A sketch of such an invocation follows.
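A minimal sketch, assuming a hypothetical empty directory /var/empty/qemu and a hypothetical unprivileged user qemu-net; note that the -sandbox sub-options beyond plain `on` require a reasonably recent QEMU.

```
# One-time setup: dedicated user and empty chroot directory (names are placeholders)
useradd -r -s /usr/sbin/nologin qemu-net
mkdir -p /var/empty/qemu

# Launch the network VM with filesystem, privilege, and syscall confinement
qemu-system-x86_64 \
    -enable-kvm \
    -nodefaults \
    -m 512 \
    -chroot /var/empty/qemu \
    -runas qemu-net \
    -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny \
    -nographic
```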

Harden both kernels

It's very important, if you want reduced attack surface, to use grsecurity for both the host and the guest. Additionally, grsecurity provides PaX, which has components that harden userspace programs like QEMU. If QEMU is used with -enable-kvm, it is safe to use the strictest PaX settings with it. The commercial version of grsecurity is not necessary; it is just a stable kernel that updates less often, which only matters if you are running high-availability servers.
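If your distribution manages per-binary PaX flags with paxctl, checking (and, if missing, creating) the PT_PAX header on the QEMU binary is a one-liner each; this is a sketch, and the binary path is a placeholder. Because -enable-kvm means QEMU does not need to generate code at runtime, you shouldn't need to relax MPROTECT or any other PaX protection for it.

```
# View the current PaX flags on the QEMU binary (placeholder path)
paxctl -v /usr/bin/qemu-system-x86_64

# If the binary lacks a PT_PAX header, add one, leaving all protections enabled
paxctl -c /usr/bin/qemu-system-x86_64
```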

guest
  • As I said, I can't use PCI passthrough - nor VFIO - for various reasons, the most important one being that I can't wake the host from suspend-to-RAM when I'm working with VFIO, and I want to be able to suspend to RAM. This may have something to do with having to assign the memory controller to VFIO because it lives in the same IOMMU group as the Ethernet card. Some good thoughts in the rest of your answer though, thank you! I've thought about using the grsecurity patches and will research the QEMU options further. – Out of Band Feb 28 '17 at 10:42

This is a tough question to answer, since measuring attack surface is difficult. But let's assume that we 'measure' it by the amount of processing the untrusted data goes through in some part of the networking stack.

Am I right in assuming that with both methods (PCI & USB passthrough), I'm actually reducing the attack surface of the host kernel with regard to networking? In my mind, with that solution, the host kernel simply passes data through to the virtual machine and none of the network traffic touches any part of the host kernel's networking code. Is that right?

As far as I know, yes. Passthrough methods reduce the processing the host OS does on the data. Evidence for this can be found, for example, in VirtualBox's documentation:

this feature allows to directly use physical PCI devices on the host by the guest even if host doesn't have drivers for this particular device.

This means that the host OS doesn't even use a driver to handle the (PCI) passthrough data.
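You can check the same property on a KVM host: after passthrough is set up, lspci should show a passthrough stub bound to the device rather than a real NIC driver. The PCI address below is a placeholder.

```
# Which host driver, if any, is bound to the NIC?
lspci -nnk -s 03:00.0
#   ...
#   Kernel driver in use: vfio-pci   <- passthrough stub, not e1000e/r8169/etc.
```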

Are there glaring problems with passing through a device via USB that render the isolation of networking in the virtual machine useless/pointless? I'm not worried about the physical security of the machine, e.g. I don't have to defend against someone plugging in a rogue USB adapter that would convince the host's USB 3 controller to do anything it wasn't supposed to do. My main goal is to make remote exploits more difficult.

As the VirtualBox documentation on USB support describes, VirtualBox uses a host-side driver to manage USB passthrough. This means that the host attack surface is as big as the surface that driver exposes, and this driver lives in your host. That is the tradeoff: either you have the host run the network stack, or you push it down to a VM and have the host run only the required passthrough drivers. Note that it seems (at least for VirtualBox) that these drivers do not handle the actual data, but I am still counting them as attack surface, just in case.
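For comparison, the KVM/QEMU side of USB passthrough is small at the configuration level: the host's USB host-controller driver still shuttles raw USB transfers, but only the guest interprets them as network traffic. A sketch, where the vendor/product ID 0bda:8153 is a placeholder you would read off lsusb for your dongle:

```
# Find the dongle's vendor:product ID on the host
lsusb
# e.g. "Bus 002 Device 004: ID 0bda:8153 ..." (placeholder)

# Hand the device to the guest through an emulated xHCI controller
qemu-system-x86_64 \
    -enable-kvm -m 512 \
    -device qemu-xhci,id=xhci \
    -device usb-host,bus=xhci.0,vendorid=0x0bda,productid=0x8153
```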

Is the whole endeavour a lost cause? E.g. am I worrying about holes in a tried-and-true part of the kernel while adding a larger attack surface to the system by using virtualization? Am I actually decreasing the overall security of the system by using KVM?

I think you are reducing the attack surface of the host overall, but the price you are paying (complexity, runtime penalties) has to be considered. It is a design decision you need to make according to your requirements.

To sum up my thoughts:

It seems that your solution will reduce attack surface at the price of performance. The attack surface left on the host will either not exist (PCI passthrough) or will consist only of a driver handling the VM's 'claims' over USB devices (USB passthrough).

MiaoHatola