When most Linux users hear "root", they think of the maximum possible privilege on a computer. Some even think that root runs in ring 0. But in reality, root is just a regular user running in ring 3, albeit one which the kernel trusts (many sensitive kernel operations are guarded with checks along the lines of if (!uid_eq(current_uid(), GLOBAL_ROOT_UID)) return -EPERM; to prevent abuse, which simply returns an error if uid != 0).
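As a rough userspace illustration of the check quoted above, the sketch below mirrors its logic: any caller other than uid 0 gets -EPERM. The function name guard_root_only is hypothetical and is not a kernel API. Note also that many in-kernel checks test a capability via capable() rather than comparing the uid directly.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Userspace analogue of the kernel-side uid check.
 * guard_root_only() is a made-up name used only for illustration.
 */
static int guard_root_only(uid_t uid)
{
    if (uid != 0)
        return -EPERM;  /* kernel convention: negative errno on refusal */
    return 0;           /* uid 0: the guarded operation may proceed */
}
```

Calling guard_root_only(geteuid()) from an unprivileged process would return -EPERM, while the same call from a process with effective uid 0 returns 0.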
On most Linux systems, the kernel trusts root so much that root can easily subvert the kernel and gain access to ring 0 at will. However, security tools such as grsecurity and SELinux can often prevent this by reducing root's dangerous abilities and enforcing those restrictions at the kernel level.
I would like to enumerate a list of methods by which root could exploit the kernel to gain access to kernel mode and ring 0. So far, these are the methods I have thought or learned of by which this could happen, along with potential mitigations:
- ioperm() and iopl() can set I/O port permissions and can be abused to write to arbitrary regions of memory, including memory where the kernel resides. These syscalls can be disabled by removing them from the syscall table or with grsecurity.
- Root can just modify the kernel image in /boot or through the block device. A MAC can restrict root's access to both of these.
- /dev/{k,}mem are designed to allow rw access to arbitrary memory. These can be disabled completely in the kernel config, by using grsecurity, or with a MAC.
- Some MSRs can be used to write to arbitrary memory. Denying writing to MSRs either by disabling them in the kernel config or with grsecurity mitigates this issue.
- kexec allows root to select an alternate kernel to boot into. This is an optional kernel feature, so simply compiling the kernel without kexec support is enough to make this a non-issue.
- sysfs provides low level access to much of the hardware, which can hijack a poorly locked-down BIOS/UEFI on many vulnerable systems to gain ring 0, or even ring -2 access. A MAC can restrict access to /sys, and various tools can detect vulnerabilities in UEFI/BIOS.
- If root is allowed to load ACPI tables at runtime (DSDT, SSDT, etc), it can cause the kernel to execute AML, which is ACPI bytecode, and change how the kernel behaves and reacts to the hardware it runs on. I know little about ACPI and AML, but this sounds like an absolute recipe for disaster. Disabling loadable ACPI table support in the kernel should mitigate this.
- Loading malicious kernel modules can directly hijack the kernel. This can be trivially defeated by requiring module signing, or by building a kernel without module support.
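To make the list above a little more concrete, here is a minimal sketch of a check that reports which of these interfaces appear to be exposed on a running system. It is only a rough heuristic under stated assumptions: the helper names (path_exists, read_int_file, report_surfaces) are made up for this example, the paths are the usual locations on mainstream distributions, and a node being present does not prove it is usable (for instance, /dev/mem may exist but be restricted by the kernel config, and /dev/cpu/0/msr only appears when the msr driver is available).

```c
#include <stdio.h>
#include <unistd.h>

/* Returns 1 if the path exists, 0 otherwise. */
static int path_exists(const char *path)
{
    return access(path, F_OK) == 0;
}

/* Reads one integer from a procfs-style file; -1 if missing or unparsable. */
static int read_int_file(const char *path)
{
    FILE *f = fopen(path, "r");
    int value = -1;

    if (f == NULL)
        return -1;
    if (fscanf(f, "%d", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}

/* Prints a rough report and returns how many surfaces look exposed. */
static int report_surfaces(void)
{
    /* Assumed typical locations; adjust for the system being audited. */
    static const char *const nodes[] = {
        "/dev/mem", "/dev/kmem", "/dev/port", "/dev/cpu/0/msr",
        "/sys/kernel/config/acpi/table", /* runtime ACPI table loading */
    };
    /* For these sysctls, 1 means the feature is switched off. */
    static const char *const switches[] = {
        "/proc/sys/kernel/modules_disabled",
        "/proc/sys/kernel/kexec_load_disabled",
    };
    int exposed = 0;

    for (size_t i = 0; i < sizeof nodes / sizeof nodes[0]; i++) {
        int present = path_exists(nodes[i]);
        printf("%-40s %s\n", nodes[i], present ? "present" : "absent");
        exposed += present;
    }
    for (size_t i = 0; i < sizeof switches / sizeof switches[0]; i++) {
        int v = read_int_file(switches[i]);
        /* A missing file usually means the feature is not built in. */
        printf("%-40s %s\n", switches[i],
               v == 0 ? "feature enabled" : v == 1 ? "disabled" : "n/a");
        exposed += (v == 0);
    }
    return exposed;
}
```

Calling report_surfaces() from main() prints one line per interface and returns the count of those that look reachable. A result of zero does not mean the system is safe, since this only covers a few of the items listed above and says nothing about MAC policy.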
Obviously, I will be using the principle of least privilege for root, rather than trying to blacklist all the possible ways it can break out of its chains. That doesn't change the fact that it is still extremely useful, and interesting, to understand all the ways it can abuse the kernel's trust.
And so that brings me to my question: Are there any other methods which root can use to gain access to ring 0, without using 0days and without exploiting opsec mistakes, which I have not covered here?