Let me address the "why" part.
One of the main reasons to have a modern OS is to let multiple programs (processes) run at the same time on one system. If you want to do this safely, the following needs to happen:
Unless you have special needs, you probably don't want to divvy up the RAM in the system in a fixed fashion. For example, a fixed 256MB per process limits you to 8 processes on a 2GB system. You'd like each process to be able to "ask" for memory and return it when it's done.
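To make the arithmetic concrete, here is a minimal Python sketch contrasting the fixed split with an on-demand pool. The `MemoryPool` class and its method names are hypothetical, made up for illustration; it only tracks sizes, and a real kernel allocator is far more involved.

```python
TOTAL_MB = 2048        # the 2GB system from the example
FIXED_SLICE_MB = 256   # fixed per-process share

# Fixed partitioning: the process count is capped regardless of actual need.
max_fixed_processes = TOTAL_MB // FIXED_SLICE_MB


class MemoryPool:
    """Toy on-demand allocator (hypothetical): tracks sizes, not addresses."""

    def __init__(self, total_mb):
        self.free_mb = total_mb
        self.owned = {}  # pid -> MB currently held

    def request(self, pid, mb):
        """Grant mb to pid if enough memory is free; return True on success."""
        if mb > self.free_mb:
            return False
        self.free_mb -= mb
        self.owned[pid] = self.owned.get(pid, 0) + mb
        return True

    def release(self, pid):
        """Return everything pid holds to the pool."""
        self.free_mb += self.owned.pop(pid, 0)


pool = MemoryPool(TOTAL_MB)
# Twenty small processes fit when each asks only for what it needs.
granted = [pool.request(pid, 50) for pid in range(20)]
print(max_fixed_processes, all(granted), pool.free_mb)
```

With a fixed split, the ninth process is refused even if the first eight are mostly idle; with the pool, the limit is total demand rather than process count.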
You also don't want to divvy up I/O devices in a fixed fashion among processes. Typically you want some or all of the hardware, like memory, to be a shared resource, or at least only temporarily exclusive to specific processes at specific times. This requires that processes not try to do I/O on their own, but "go through" something that schedules and coordinates the I/O. The scheduling is important because most I/O is much slower than the CPU, so the CPU can do work for other processes while one process is waiting on I/O, even on a single-core system.
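The benefit of overlapping I/O waits with CPU work can be sketched with a small back-of-the-envelope model. This is only an illustration under simplifying assumptions: each job is one CPU burst followed by one I/O wait, the I/O waits of different jobs don't contend with each other, and the timings are made up.

```python
def total_time_serial(jobs):
    """The CPU idles during every I/O wait: all phases run back to back."""
    return sum(cpu + io for cpu, io in jobs)


def total_time_overlapped(jobs):
    """Single core: while one job waits on I/O, the CPU runs the next job."""
    cpu_clock = 0  # time at which the CPU finishes the bursts issued so far
    finish = 0     # time at which the last job (including its I/O) completes
    for cpu, io in jobs:
        cpu_clock += cpu                      # CPU bursts still run one at a time
        finish = max(finish, cpu_clock + io)  # the I/O completes in the background
    return finish


# (cpu_ms, io_ms) per process; hypothetical numbers.
jobs = [(5, 50), (5, 50), (5, 50)]
serial = total_time_serial(jobs)
overlapped = total_time_overlapped(jobs)
print(serial, overlapped)  # 165 65
```

Even with one core, coordinating the I/O centrally lets the three jobs finish in well under half the serial time in this toy model.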
To do the above right, the OS needs to take advantage of several CPU hardware features. One of these is the MMU; another is protected mode. Could two or more OSes share these hardware features cooperatively and run side by side?
Sure, but there is nothing in hardware able to stop one OS from stomping all over the other OS's memory. If the CPU is in kernel mode (and it only has one kernel mode), any code can do anything. It's 100% possible for code from one OS to overwrite the code or data of the other OS. And we know operating systems have had vulnerabilities in the past and will have more in the future, so this is very bad for security.
Now, wouldn't it be cool if you could put another "layer" over this and have the hardware support needed for multiple OSes? That's exactly what the hardware virtualization features do: they put a hardware barrier between multiple running OSes, and there is a top-level "interface" for them called a hypervisor. You can only have one hypervisor. And yes, processes running under either OS must go through three layers to do I/O (process - local kernel - hypervisor).
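The three-layer I/O path can be pictured with a toy Python model. All class and method names here (`Hypervisor`, `GuestKernel`, `hypercall_write`, `sys_write`) are hypothetical and chosen only for illustration; real hypervisors may trap and emulate device accesses, use paravirtual drivers, or pass devices through, none of which this sketch attempts to model.

```python
class Hypervisor:
    """Top layer: the only code that touches the (pretend) device."""

    def __init__(self):
        self.device_log = []  # stands in for a real disk or NIC

    def hypercall_write(self, guest_name, data):
        self.device_log.append((guest_name, data))
        return len(data)


class GuestKernel:
    """Middle layer: one per OS; it forwards I/O instead of doing it itself."""

    def __init__(self, name, hypervisor):
        self.name = name
        self.hypervisor = hypervisor

    def sys_write(self, pid, data):
        # The guest kernel would schedule and check permissions for pid here.
        return self.hypervisor.hypercall_write(self.name, data)


class Process:
    """Bottom layer: an ordinary program asking its own kernel for I/O."""

    def __init__(self, pid, kernel):
        self.pid = pid
        self.kernel = kernel

    def write(self, data):
        return self.kernel.sys_write(self.pid, data)


hv = Hypervisor()
guest_a = GuestKernel("guest-a", hv)
guest_b = GuestKernel("guest-b", hv)
written_a = Process(1, guest_a).write(b"hello")
written_b = Process(1, guest_b).write(b"hi")
print(hv.device_log)
```

Both guests can have a process with PID 1 without conflict, because each kernel only sees its own processes and the hypervisor only sees guests.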
On a mainframe with partitioned memory... – Fiasco Labs – 2013-12-27T08:04:21.993