
Short question

Would checking if the faulting address for every page fault points to kernel memory reliably detect an attempted Meltdown exploit, on systems that lack Intel TSX (and thus cannot suppress exceptions)? The Linux kernel do_page_fault() function is called on every page fault, and the address of the faulting memory access is available in that function, so it should be possible to check if that address points to a range of memory specific to the kernel. For a threat model where KPTI is not an option and where trading off availability (unprivileged DoS) for confidentiality is acceptable, is this idea correct?


Long question

The recently disclosed Meltdown vulnerability in Intel processors leverages the out-of-order execution property of all modern processors to read arbitrary physical memory. This relies on being able to repeatedly attempt accesses to kernel memory, despite the fact that the access attempts will fail. These failed access attempts trigger a #PF, or page fault, resulting in the userspace process being sent SIGSEGV due to its attempt to access kernel memory. The Meltdown attack takes advantage of the fact that the cache state still reflects the privileged data despite the access being denied. The next stage of the attack is to use a timing attack to recover that data from the cache state. Without TSX support, however, it seems impossible (to the best of my knowledge) to inhibit these page faults. On such systems, Meltdown necessarily triggers a large number of page faults caused by attempts to access kernel memory. This can be detected by the kernel.

From the paper, three steps are described for performing the attack:

Meltdown combines the two building blocks discussed in Section 4. First, an attacker makes the CPU execute a transient instruction sequence which uses an inaccessible secret value stored somewhere in physical memory (cf. Section 4.1). The transient instruction sequence acts as the transmitter of a covert channel (cf. Section 4.2), ultimately leaking the secret value to the attacker.

Meltdown consists of 3 steps:

Step 1  The content of an attacker-chosen memory location, which is inaccessible to the attacker, is loaded into a register.

Step 2  A transient instruction accesses a cache line based on the secret content of the register.

Step 3  The attacker uses Flush+Reload to determine the accessed cache line and hence the secret stored at the chosen memory location.

By repeating these steps for different memory locations, the attacker can dump the kernel memory, including the entire physical memory.

After reading about a supposed mitigation utilizing the kernel tracing subsystem, it seems to me that detecting an abnormally large number of page faults caused by accesses to kernel memory would reliably detect Meltdown. Is my thinking correct? Such a mitigation would simply check at each page fault whether CR2 (the register holding the faulting address) is greater than 0xffff000000000000, which would indicate an attempt to access kernel memory. The system could then induce a kernel panic, preventing the attack from continuing.
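
As a rough illustration of the address test described above, here is a minimal user-space sketch in C. It is not kernel code: the function names and the KERNEL_HALF_START constant are hypothetical, and it assumes 4-level paging (48-bit virtual addresses), where the kernel half of the address space begins at 0xffff800000000000. Under that assumption, addresses between the 0xffff000000000000 threshold and the start of the kernel half are non-canonical, so an access to them raises a general protection fault rather than a page fault, and the two boundaries classify page faults identically.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Threshold taken from the text above. */
#define FAULT_THRESHOLD   UINT64_C(0xffff000000000000)
/* Assumption: start of the kernel half with 48-bit virtual addresses. */
#define KERNEL_HALF_START UINT64_C(0xffff800000000000)

/* Every kernel-half address must lie above the threshold for the check to hold. */
static_assert(KERNEL_HALF_START > FAULT_THRESHOLD,
              "kernel half must lie above the fault threshold");

/* Hypothetical helper: true if bits 63..47 are all zero or all one. */
bool is_canonical48(uint64_t address)
{
    uint64_t top = address >> 47; /* the upper 17 bits */
    return top == 0 || top == UINT64_C(0x1ffff);
}

/* Hypothetical helper: the check proposed for the page fault handler. */
bool fault_targets_kernel(uint64_t address)
{
    return address > FAULT_THRESHOLD;
}
```

With 5-level paging the kernel half starts lower, so the threshold would need to be adjusted for such systems.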

From the kernel source, the do_page_fault() function is defined as:

dotraplinkage void notrace
do_page_fault(struct pt_regs *regs, unsigned long error_code)
{
    unsigned long address = read_cr2(); /* Get the faulting address */
    enum ctx_state prev_state;

    prev_state = exception_enter();
    if (trace_pagefault_enabled())
        trace_page_fault_entries(address, regs, error_code);

    __do_page_fault(regs, error_code, address);
    exception_exit(prev_state);
}
NOKPROBE_SYMBOL(do_page_fault);

This function is called every time there is a page fault, with the faulting address saved in address. Without TSX support, to the best of my knowledge, the attack cannot avoid entering this function. It seems to me that adding a simple check for whether the address is in a dangerous range, something along the lines of BUG_ON(address > 0xffff000000000000), would reliably detect 100% of Meltdown attacks under these constraints. This could be made less likely to trigger false positives by panicking only after a sufficient number of violations occur and by ignoring a violation if address is less than mmap_min_addr (which would imply a benign NULL pointer dereference), but that would only be necessary if a given workload triggers false positives.
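
The thresholded variant could look roughly like the following user-space sketch. The struct, its field names, and detector_on_fault() are hypothetical and only model the decision logic; a real implementation would live in the page fault path and would need per-process or per-CPU accounting. The low-address exemption is included as described, although an address below mmap_min_addr is already below the kernel threshold and would not be counted either way.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Threshold taken from the text above. */
#define KERNEL_THRESHOLD UINT64_C(0xffff000000000000)

/* Hypothetical bookkeeping for the proposed detection. */
struct fault_detector {
    uint64_t mmap_min_addr;   /* faults below this count as NULL dereferences */
    unsigned panic_threshold; /* number of violations that triggers a reaction */
    unsigned violations;      /* suspicious faults seen so far */
};

/*
 * Records one page fault. Returns true once the number of faults on
 * kernel addresses has reached panic_threshold, which is the point at
 * which the proposal would panic.
 */
bool detector_on_fault(struct fault_detector *d, uint64_t address)
{
    assert(d != NULL);

    if (address < d->mmap_min_addr)
        return false;             /* benign NULL pointer dereference */
    if (address <= KERNEL_THRESHOLD)
        return false;             /* ordinary user-space fault */

    d->violations++;
    return d->violations >= d->panic_threshold;
}
```

Setting panic_threshold to 1 reproduces the plain BUG_ON() behaviour, where the first fault on a kernel address is treated as an attack.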

This mitigation would not work if any of the following are true:

  • Spectre, which causes no page faults, can perform Meltdown without abusing eBPF.
  • There is another way to use Meltdown without triggering a page fault, on systems without TSX.
  • The do_page_fault() function can be avoided when a page fault occurs.

When re-reading the paper, it mentions exception suppression using other methods than TSX, but it is not entirely clear to me. It makes it sound as if a page fault could be avoided like this:

if (condition_mispredicted_as_false)
    access_kernel_memory();

If this is true, then this detection mechanism (and the one by Capsule8) would not work.

Would this mitigation work on systems without TSX, assuming natural false positives are not an issue and ignoring the fact that it would open up unprivileged DoS bugs?

forest

1 Answer


This will prevent two out of three ways of running a Meltdown attack. Unfortunately for you, it does nothing for the third.

The simplest way to do a Meltdown attack is to perform an illegal read with a sufficiently long pipeline delay that the CPU speculatively acts on the read before the page fault is triggered, then catch the fault. Your proposal will spot and stop that before the attacker can perform more than a few reads.

The faster way is to use TSX to suppress the fault, but you've specified that TSX isn't present on the system.

The third way is to combine Meltdown with Spectre's branch-misprediction: train the branch predictor to take the "read memory" branch with innocuous addresses, then branch the other way while specifying a protected address. This is far slower than either of the other ways of reading protected memory, but since the illegal read gets discarded when the branch misprediction is spotted, no page fault is ever raised. As a result, the reads are completely invisible to your proposed code.

Mark
  • If I remember correctly, one version of Spectre is mitigated by retpoline (which obviously does not apply here as a malicious program does not need to use any particular compiler instrumentation), and the other is mitigated by a kernel patch to make use of new microcode. Would the version of Spectre that could be abused for a stealthy Meltdown implementation be mitigated by this microcode? – forest Jan 10 '18 at 07:00
  • Also, causing branch mispredictions like this so often in order to avoid triggering a page fault would result in an increase in LLC cache misses which could be detected with PEBS. Though I think Flush+Flush does not affect LLC_MISS and the Spectre paper uses Flush+Reload (which does), but I don't know if it would be possible to do it with Flush+Flush instead. – forest Jan 10 '18 at 07:10
  • Retpoline is about armoring specific indirect branches against Spectre; every microcode update I've seen is about disabling indirect branch prediction. Neither one does anything against direct branches: those are much harder to make use of when the attacker doesn't control the program, so anti-Spectre efforts mostly ignore them, but if the attacker *does* control the program, they work just fine. And focusing on the cache is a red herring -- *any* change in CPU behavior can be used (eg. timing changes because of a different number of ALU instructions issued), cache is just the easiest. – Mark Jan 10 '18 at 08:12
  • So if they do control the program, then all bets are off wrt Spectre, even Spectre when used to implement exceptionless Meltdown? Perhaps I need to understand more how Spectre which, supposedly, cannot access memory outside of the affected process' address space, could instead access the kernel. I will need to read the Spectre paper again. As for the cache being a red herring, Meltdown (even when used with Spectre) will necessarily trigger cache misses, since that is how the specific side-channel works (i.e. it's specific to the cache, not a generic timing attack), unless I am mistaken. – forest Jan 10 '18 at 08:19
  • The Meltdown proof-of-concepts use the cache as a side channel because it's easy and has a high signal-to-noise ratio. There are other side channels you can use: for example, the `IDIV` instruction takes a varying number of cycles based on the value being divided. – Mark Jan 10 '18 at 08:47
  • I suppose the best this technique could do is prevent naive attacks or force an attacker to use the slower Spectre-based method. Probably not worth it. I wonder how grsecurity's x64 UDEREF mitigates this without the perf hit of KPTI... – forest Mar 07 '18 at 06:06