My question regards whether or not the mitigations I use are appropriate for my threat model. Please don't jump to conclusions and say "you need to use locks" or "you can't leave your computer unattended" without first reading at least my threat model. I'm not defending from a janitor who got bribed $200 to nab a hard disk from a cheap 1U server. Additionally, exploitation of most network-facing software such as browsers is out of scope. I have sufficient protections on the software level for that to be a non-issue (strong privilege separation with custom seccomp filters in place to reduce kernel attack surface). If the answer turns out to be "there is absolutely no solution which does not involve custom-designed hardware", I will be disappointed, but I will accept it.
I have a workstation computer which I must leave on 24/7, so it is unattended for many hours a day. It also has a high computational demand, so I cannot replace components with significantly lower power versions (e.g. a self contained Intel Edison is surely more resistant to memory acquisition, but it is far too weak for my purposes). Most people who look for physical server or workstation security assume an attacker with very brief or intermittent access, where physical locks could keep them out. Unfortunately in my situation, that is almost completely useless, though I do lock my doors, of course. Recently I've been thinking of some more paranoid solutions, and I'd like some advice to make sure it covers my threat model correctly so I can be sure that I am not putting too much effort in an area that I do not need to worry about, or ignoring an area which I have left wide open. Yes, I am aware of risk assessment, and this is hypothetical for the most part. I would not be putting myself in a risky situation unless I was already familiar enough with the subject that I would not need to ask here.
Threat model
My adversary is capable of:
- Bypassing all physical deterrence measures, given no hard limit on time.
- Denying me access to my own hardware immediately at the onset of the attack.
- Transporting my hardware to a remote facility for extended analysis, without it losing power.
- Possessing state-of-the-art forensic hardware such as bus analyzers and high-grade freezing spray.
- Potentially accessing trade secret design documents or datasheets for hardware components I may use in order to look for bugs which may be used to gain access (who knows how secure that ASPEED chip is?). I do not know how likely this is.
- Observing my public and online behavior for extended periods to customize their attack.
However their limitations are:
- I will always be aware when they attack, so they have one shot. As a result, I do not need tamper evidence.
- They cannot force me to provide them access (no $5 wrenches allowed).
- If the computer is powered on but locked, they cannot guess the password. If the computer is powered down, they cannot break the encryption key.
- They are not quite at the level of the NSA, so attacks which are not yet practical such as power analysis attacks or hardware backdoors are also not allowed.
Their goals are to:
- Obtain a partial or complete but forensically useful memory image of the running system.
- Obtain the full plaintext contents of all attached storage devices.
My goals are ANY of the following:
- Harden the system such that physical access is not sufficient for my adversary to accomplish their goal.
- Have the system shut itself down within several seconds of unauthorized physical access, resulting in the physical memory vanishing.
- Have the system introduce massive corruption to memory upon unauthorized physical access, making any subsequent memory dump forensically useless.
Examples of memory acquisition methods that become possible if physical access is attained without the system noticing and shutting itself off:
- Sniffing exposed buses such as PCI, QPI, etc.
- Exploiting the exposed GPU hardware to gain DMA over PCI (e.g. resetting the GPU processor and then using JTAG?).
- Getting the JTAG SDK from Intel and then directly hijacking my motherboard (so far, I cannot think of any solid mitigations for this other than de-soldering it, but I will try to find some).
- Exploiting peripherals which I have not confined and which I do not know are at risk.
- Somehow hard restarting the CPU such that the debug registers are not cleared, and reading them (to steal TRESOR keys). I believe the standard states that in all resets, an Intel CPU should clear debug registers, but there may be some exceptions which I do not know about.
In other words, they are a state-level adversary, but not quite at the level of the NSA. I have a few mitigations in place. If you don't want to read the following wall of text, a tl;dr with potentially inaccurate simplifications is:
- I am protected from DMA attacks from most compromised PCI devices.
- My storage encryption key is protected from cold boot attacks.
- Certain high-risk processes have their memory partially encrypted, with the key located outside of RAM as well.
- The entire memory is lightly scrambled, although it is probably easy to break (Edit: yup, the scrambler is based on an LFSR, which is broken).
- The system will power down if the chassis is opened.
- If I am removed from the system while it is unlocked, it will shut down.
- The memory will wipe itself if it is hit with freezing spray.
- If the system is shut down improperly in an emergency, the encryption key will become harder to crack.
- The hard drives can in theory be modified to detect hardware write-blockers, and wipe themselves when one is used on them.
- Live BIOS modification will be detected and defeated.
- The computer watches itself with a camera, and shuts off if it detects motion.
These are the mitigations in more detail, along with what they are supposed to mitigate:
DMA protection with VFIO
Because the attacker will only get one shot, I don't have to worry about them taking out some PCI device and replacing it with a malicious one which will mount a DMA attack. However they may be able to exploit an existing and trusted PCI device. Because of this, I've confined most sensitive PCI devices using VFIO. Essentially, I've bound an IOMMU group containing untrusted PCI devices to a very small live system in QEMU, and had QEMU forward all communication to the host. In the case that one of those PCI devices is compromised, it will only be able to see the 32 MiB which QEMU has been allocated. So far, all USB controllers are isolated this way. The network goes through USB as well, instead of Ethernet, so going through the Intel Management Engine is mostly avoided. The LPC's DMA ability is disabled too, though on many motherboards, its ability to become bus master is disabled in hardware. Other PCI devices are simply disabled as well. SATA controllers and the GPU are not yet protected, though it's possible in theory and I'm working on it. While the GPU is pretty much safe (it's only exposed through /dev/dri/*, unless EDID headers and such are parsed by the GPU's own hardware at all), the SATA controllers really should be, considering they are so complex and NCQ does support client->host DMA, if the host allows it. If many types of peripherals are inserted at runtime (excluding some harmless ones like PS/2 and serial ports), a custom kernel patch triggers a kernel panic, and a pseudo-hardware (BIOS) watchdog shuts the system down shortly after.
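Since VFIO hands devices to a guest one whole IOMMU group at a time, it helps to check which devices share a group with the ones I consider untrusted. Below is a minimal sketch of that check, assuming the standard Linux sysfs layout under /sys/kernel/iommu_groups. The function names and the PCI addresses in the usage note are made up for illustration.

```python
import os

def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Map each IOMMU group number to the sorted PCI addresses it contains."""
    groups = {}
    if not os.path.isdir(root):
        return groups  # IOMMU disabled, or not a Linux host
    for name in os.listdir(root):
        devdir = os.path.join(root, name, "devices")
        if name.isdigit() and os.path.isdir(devdir):
            groups[int(name)] = sorted(os.listdir(devdir))
    return groups

def co_grouped(groups, untrusted):
    """Return devices that sit in the same group as any untrusted device.

    These devices are not isolated from the untrusted ones by the IOMMU,
    so they would have to be confined (or disabled) together with them.
    """
    untrusted = set(untrusted)
    exposed = set()
    for devices in groups.values():
        if untrusted & set(devices):
            exposed |= set(devices) - untrusted
    return sorted(exposed)
```

For example, `co_grouped(iommu_groups(), ["0000:00:14.0"])` would list everything that shares a group with a (hypothetical) USB controller at that address.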
TRESOR
Disk encryption keys are stored in the x86 debug registers using a Linux kernel patch called TRESOR. This ensures that the key itself never hits RAM, which completely mitigates cold boot attacks and passive DMA attacks against the key. Access to the debug registers is disabled with this patch to complete the protection. The downside is that a hard reset, such as one triggered by a triple fault, may preserve the debug registers such that the operating system being booted into can access them. And of course, ring 0 can access them as well. Unfortunately, TRESOR protects only the encryption keys; unencrypted process data, kernel data, file system cache, etc. still exist in RAM, so TRESOR is far from a complete solution. I suppose I could create /dev/ram0, encrypt it with TRESOR, then format it with a filesystem that supports DAX (direct access, a filesystem feature which completely bypasses the page cache), but that would not be a complete solution either.
RamCrypt
A modified version of TRESOR was created recently called RamCrypt, which encrypts most of a target process' memory, leaving by default only 4 pages unencrypted. While 4 pages is only 16 KiB of unencrypted memory on most hardware, which is quite good, pages which are marked with VM_SHARED, VM_IO, or VM_PFNMAP are not encrypted. This means that information which may be forensically useful can still remain unencrypted. Additionally, RamCrypt only encrypts individual processes, but not metadata of those processes or the process' task_struct in the kernel, or anything else like that. So while Firefox may be mostly encrypted, the slabs in the kernel dealing with TCP may still give away what websites have been viewed, considering it's the networking stack-related slabs that are the ones which are deferred to the RCU for destruction, so they linger around the longest. If that weren't bad enough, RamCrypt also suffers from a severe performance impact in the default and most secure configuration.
Memory scrambling
Modern DDR3 and DDR4 memory controllers support a feature called memory scrambling, which is designed to reduce excessive di/dt on adjacent lines in memory (in other words, it prevents successive 1s or 0s from causing electromagnetic interference in the memory bus). The scrambling seed is re-initialized at every boot, probably by UEFI. It is strong enough that the reverse engineer Igor Skochinsky apparently could not trivially crack it, but I don't know if he even tried. Memory scrambling may mitigate simple cold boot attacks, but the seed is likely not cryptographically secure, especially considering the goal is only to increase the distribution of 1s and 0s. If memory serves correctly, a quick read through part of the Coreboot source code made it seem like it may be only 32 bits anyway. It looks like there are no full memory encryption solutions on the market, sadly. PrivateCore claims to have VPSes which fully encrypt memory (their vCage product line), and Xbox supposedly encrypts its memory to frustrate RE, but that's about all.
Edit: Just as I thought, the scrambling is not cryptographically secure. It does seem like the steps for recovering the seed are rather complex, especially due to interleaving, which increases the amount of lost data and may provide a small amount of protection. And there is no analysis yet on DDR4 memory, which may use a stronger seed. Hopefully, in the future, Intel will use a very fast cipher such as Simon in their MCH for memory scrambling.
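To illustrate why an LFSR-based scrambler gives little protection, here is a toy sketch. It is not the real memory controller algorithm: the 16-bit register, the tap positions, and the function names are all assumptions chosen for the demonstration. The point is only that the keystream does not depend on the data, so any region whose plaintext is known (a zero-filled page, for instance) leaks the keystream for everything scrambled with it.

```python
def lfsr_stream(seed, nbytes):
    """Keystream from a toy 16-bit Fibonacci LFSR (taps 16, 14, 13, 11)."""
    state = seed & 0xFFFF
    if state == 0:
        raise ValueError("an LFSR seed must be non-zero")
    out = bytearray()
    for _ in range(nbytes):
        byte = 0
        for _ in range(8):
            byte = (byte << 1) | (state & 1)           # emit the low bit
            fb = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
            state = (state >> 1) | (fb << 15)          # shift in the feedback bit
        out.append(byte)
    return bytes(out)

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def scramble(data, seed):
    """XOR data with the keystream; applying it twice restores the data."""
    return xor(data, lfsr_stream(seed, len(data)))

def recover(scrambled_known, known, scrambled_target):
    """Unscramble a target using a region whose plaintext is known."""
    keystream = xor(scrambled_known, known)
    return xor(scrambled_target, keystream)
```

In this toy model, `recover(scramble(bytes(n), seed), bytes(n), scramble(secret, seed))` returns `secret` without ever learning `seed`. A 16-bit seed could also simply be brute-forced.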
On-line chassis intrusion detection
My BIOS and hardware have chassis intrusion detection built in. While I have not implemented this yet, I believe it may be possible to poll /dev/nvram once every 0.5 seconds or so and parse it for whatever value stores the chassis intrusion count, and shut down the system immediately upon detection of an intrusion event. If it's not possible for the operating system to obtain that information, then I might have to actually modify the hardware and have it use GPIO or something, but I'm not so familiar with that.
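The polling idea could look roughly like the sketch below. It assumes the intrusion counter lives in a single byte at a known offset of /dev/nvram, which I have not verified; the offset is board-specific and would need to be found by experiment. The `on_change` callback, `max_polls`, and the injectable `sleep` are placeholders I added so the loop can be exercised against an ordinary file.

```python
import time

def read_byte(path, offset):
    """Read one byte at the given offset of a file or device node."""
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read(1)
    if len(data) != 1:
        raise ValueError("offset is past the end of %s" % path)
    return data[0]

def watch(path, offset, on_change, interval=0.5, max_polls=None, sleep=time.sleep):
    """Poll one byte and call on_change() as soon as it differs from the
    value seen at startup. Returns True if a change was detected, or
    False if max_polls was reached first (None means poll forever)."""
    baseline = read_byte(path, offset)
    polls = 0
    while max_polls is None or polls < max_polls:
        sleep(interval)
        polls += 1
        if read_byte(path, offset) != baseline:
            on_change()
            return True
    return False
```

In real use, `on_change` would trigger the emergency shutdown, and the watcher itself would need to be protected from being killed or paused.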
Wrist strap
Tinfoil time! I plan to make a wrist strap connected to a device on my desk which can be pulled out with only a small amount of force. In the case that I am forcibly removed from that area, the strap would be yanked out and the system would shut itself down. While this seems overkill, it would allow me to be almost completely safe during the most sensitive times: my computer unlocked with a root prompt sitting in front of me just waiting for someone to insmod ./crashdev.ko and read all the physical memory from /dev/crash. During all other times when the wrist strap is not in use, the system would be locked using vlock, which is designed in a way that makes it almost impossible to have bugs. If the vlock program crashes, you are simply locked out of your computer, compared to most graphical lock screens where a crashed lock process gives you access back.
RAM temperature polling
As far as I am aware, cold boot attacks can be conducted in two ways: 1) A system can be reset, and made to boot into a live system which extracts memory contents, or 2) memory modules are cooled to a low temperature, removed (and optionally cooled further), then inserted into a different motherboard or bus analyzer to be refreshed and directly read. The former can be partially mitigated with a BIOS password, but that can be fairly easily defeated by removing the CMOS battery, or just shorting the right pins. The latter may be defeated by repeatedly polling the DMI table for memory module temperature, and wiping memory then shutting down if a sudden, inexplicable drop in temperature is experienced. I currently do this with a simple C program. In the future, I may have it directly wipe the key from memory by calling crypt_wipe_key() from a kernel module, and issuing HLT.
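The decision logic of the temperature check is simple enough to sketch. This is not my actual C program, just an illustration of the idea in Python: compare each new reading against the peak of a short window of recent readings and trip if it falls too far below it. The class name, window size, and threshold are assumptions, and how the temperature is actually read is left out.

```python
from collections import deque

class DropDetector:
    """Flags a reading that falls sharply below the recent peak."""

    def __init__(self, window=10, max_drop_c=5.0):
        self.samples = deque(maxlen=window)  # most recent readings, in deg C
        self.max_drop_c = max_drop_c

    def update(self, temp_c):
        """Record a reading; return True if it is more than max_drop_c
        below the highest reading in the current window."""
        tripped = bool(self.samples) and (max(self.samples) - temp_c) > self.max_drop_c
        self.samples.append(temp_c)
        return tripped
```

A short window means slow, natural cooling (such as the load dropping overnight) does not trip it, while a blast of freezing spray between two polls does. The polling interval has to be short compared to how fast the modules can be chilled and pulled.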
Hardening from improper shutdowns
The system should never shut down improperly, unless an attack is occurring. I can take advantage of this with two LUKS keyslots, both with the same password. The first keyslot takes 5 seconds of PBKDF2 time to process, and the second takes an obscenely long time (e.g. 72 hours). When the system boots, an init script copies the first keyslot to tmpfs and wipes it from the on-disk LUKS header. When the system shuts down properly, it writes the keyslot back to the LUKS header. If the system is ever shut down in an emergency, that keyslot is lost for good, and the only one remaining is the one which takes an obscenely long time to hash. The worst case scenario is I accidentally type poweroff -f or something, and I have to wait 72 hours before I know if I made a typo in my password. Best case scenario is my adversary will be almost completely unable to attack the system, because any time it is on, the physical disk will be encrypted with a key that can be guessed at a rate of one try every few days. On a side-note, I might also be able to make use of the ephemeral nature of /dev/nvram, assuming it is true NVRAM (which should be the case if /dev/nvram has a size of 144 bytes) and not CMOS EEPROM or some sort of emulated NVRAM. Much of its memory is not utilized, so it could be (ab)used as a sort of poor-man's SED, instead of relying on poorly designed SED inside the closed source firmware of "enterprise" nearline SATA drives.
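To put rough numbers on the difference between the two keyslots, here is a back-of-the-envelope sketch. The 5 second and 72 hour figures are the ones above. The `attacker_speedup` parameter is an assumption I added, because the PBKDF2 time is measured on my hardware and an adversary with faster or parallel hardware will do better.

```python
YEAR = 365 * 24 * 3600  # seconds

def guesses_in(seconds_available, seconds_per_guess, attacker_speedup=1):
    """Password guesses possible in the given time, where seconds_per_guess
    is the KDF cost on my machine and attacker_speedup is how many times
    faster (or more parallel) the adversary's hardware is assumed to be."""
    return (seconds_available * attacker_speedup) // seconds_per_guess

fast_slot = guesses_in(YEAR, 5)          # the normal 5 s keyslot
slow_slot = guesses_in(YEAR, 72 * 3600)  # the 72 h emergency keyslot

print("5 s keyslot: ", fast_slot, "guesses per year")
print("72 h keyslot:", slow_slot, "guesses per year")
```

Even with a large speedup factor, the slow keyslot cuts the guess rate by the same ratio (72 h / 5 s = 51840), which is what matters if the password is of middling strength.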
Defeating hardware write-blockers
One common way to obtain a forensically-sound disk image is to use a hardware write-blocker, which is a small device that attaches a hard drive to a computer and drops all writes going from the computer to the drive. Normally, there would be no way to prevent this. However, hard drives contain multiple powerful CPUs, and most of the boards they are on support JTAG, which is a method to control a CPU like a puppet. This means that a small device could be put inside a hard drive and attached to the JTAG interface, injecting code into the hard drive's memory to change its behavior. Injecting into memory this way would be preferable to writing to the hard drive's persistent firmware because that would require closed source SDKs which I do not have access to. The behavior could be modified so that the drive initiates ATA Security Erase if more than a certain threshold of sectors is read in a row, which would indicate that the drive is being imaged, most likely through a hardware write-blocker. Or alternatively, the drive could initiate erasure if a certain combination of sectors isn't directly read from (a sort of analogue to port knocking... sector knocking?). This is a bit tinfoil hat, but would make an interesting project to harden hard drives from forensic analysis. This isn't a new idea and people have done interesting things with hard drives over JTAG.
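The "too many sectors read in a row" heuristic can be modelled in a few lines. This is only a host-side simulation of the logic, not drive firmware: the class name, the threshold, and the idea of being called once per read command are all assumptions for illustration.

```python
class SequentialReadTripwire:
    """Trips once an unbroken run of sequential reads reaches a threshold."""

    def __init__(self, threshold):
        self.threshold = threshold  # sectors read back-to-back before tripping
        self.next_lba = None        # LBA that would continue the current run
        self.run = 0                # length of the current run, in sectors

    def on_read(self, lba, count=1):
        """Account for a read of `count` sectors starting at `lba`.
        Returns True when the sequential run has reached the threshold."""
        if lba == self.next_lba:
            self.run += count       # continues exactly where the last read ended
        else:
            self.run = count        # a seek elsewhere starts a new run
        self.next_lba = lba + count
        return self.run >= self.threshold
```

A disk imager reading from LBA 0 upward trips this almost immediately, while normal filesystem activity keeps resetting the run. The threshold would have to sit above the largest legitimate sequential read (a big file copy, a scrub), which is the weak point of this heuristic and one reason the sector-knocking variant may be more robust.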
Continuously scanning the BIOS for tampering
Cold boot attacks are getting impractical, especially when many other mitigations are in place. However, modification of the BIOS on a running system, and then resetting the system into the new BIOS, can have interesting consequences. In one such case, the BIOS was modified directly over SPI, then the system was warm reset over LPC into the new BIOS, which promptly began to export the entire contents of memory slowly over serial to the investigator's computer. A mitigation to this would be to have the OS scan the BIOS in a continuous loop and verify that it has not been modified since the last read. As writing to the BIOS is much slower than reading from it, this will likely detect any tampering as it is occurring. The computer can then take defensive action, like shutting down before the write is complete. I've heard someone mention that EEPROM apparently cannot handle millions of reads (not a typo, I said reads), but luckily most modern BIOSes are stored on SPI NOR flash, for which reads are not a significant source of wear, so the system should be able to read in a continuous loop indefinitely, making this mitigation practical.
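The scanning loop itself is straightforward; the hard part is reading the flash. In this sketch, `read_image` is a hypothetical callable that returns the current firmware contents as bytes; how it gets them (a mapped region, a vendor tool, something else) is deliberately left out because I do not want to guess at an interface. `on_tamper` and `max_passes` are likewise placeholders.

```python
import hashlib

def digest(read_image):
    """SHA-256 of whatever the reader callable returns right now."""
    return hashlib.sha256(read_image()).digest()

def scan(read_image, on_tamper, max_passes=None):
    """Re-read the firmware in a loop and compare against the first read.

    Calls on_tamper() and returns the pass number on the first mismatch,
    or returns None if max_passes complete cleanly (None means forever).
    """
    baseline = digest(read_image)
    passes = 0
    while max_passes is None or passes < max_passes:
        passes += 1
        if digest(read_image) != baseline:
            on_tamper()
            return passes
    return None
```

Hashing the whole image means a change is only noticed at the end of a pass. Comparing block by block against a stored copy would react sooner, which matters if the goal is to shut down before the attacker's write completes.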
Motion sensitive camera
Pretty simple, but superior to generic chassis intrusion detection. I have a camera pointing at the workstation, hooked up to the workstation, being monitored with the motion program. In essence, the computer is watching itself to make sure no one gets near it. If anyone gets near it, it will take a predetermined action, such as shutting down. This is much harder to circumvent than chassis intrusion detection switches, because it requires fooling the camera. The only way to defeat this would be to cause the camera to freeze with the current image it has in place.
To re-iterate, my question is: what other or more effective methods for protecting an unattended workstation in this threat model have I not thought of? Specifically in the domain of detecting unauthorized access and making chassis intrusion detection more robust.