Swaping, Paging, Segmentation, and Virtual memory on x86 PM architecture

Well, this may seem a common or already asked question but after searching various books, online tutorials and even here on SU, I am still puzzled at how these four beasts work together on a x86 protected mode system.

What is the correct terminology that can be used when discussing about these things?

As far as I understand, all these 4 concepts are completely different, but got related when we talk about protecting memory. This is where it got messed up for me!

I' ll begin with swapping first.

Swaping:

A process must be in physical memory for execution. A process can be swapped temporarily out of physical memory to a backing store and then brought back into memory for continued execution.

This applies specifically to multitasking environments where multiple processes are to be executed at the same time, and hence a cpu scheduler is implemented to decide which process to swap to the backing store.

Paging: aka simple paging:

Suppose a process have all addresses that it use/access in the range 0 to 16MB, say. We can call it the logical address space of the process, as the addresses are generated by the process.

Note that by this definiton, logical address space of a process can be different from that of another process, as the process may be larger or smaller.

Now we divide this logical address space of a process into blocks of same size called pages. Also divide physical memory into fixed-size blocks called frames.

By def. logical address = page# : offset in that page

When a process is chosen to be executed by the cpu scheduler, its pages are loaded from backing store into any available memory frames.

Note that all of the pages that belong to this process are loaded in memory, before control is transfered to this process from scheduler. When this process is to be swapped to backing store, all of its pages are stored on the backing store.

The backing store is divided into fixed size blocks that are of same size as physical memory frames.

This eases the swapping process, as we swap a page and not bytes. This decreases the fragmentation on backing store, as we need not find space for some bytes, instead we look whether the space is available for a page or not.

The paging technique also decreases the fragmentation of physical memory, as we keep a page in memory.

The main memory must have space for all of the pages that belong to a process in order to load that process in memory for execution. If there is space only for a few pages of this process then some other process(ie all pages belonging to process) must be swapped to backing store and then only all pages of the process to be executed, must be loaded in memory.

Thus paging technique gives better performance than simple swapping.

Thus swapping allows us to run multiple processes w/o purchasing too much memory, instead we can work with small amount of memory(this amount must be such that all of the pages of the largest program/process that is to be run on a PC can be loaded in memory - ie you must know how much memory your program requires before running it.) plus an additional backing store usually disk, which has very less cost for much larger capacity that main memory.

So swapping + paging allows efficient management of memory so that multiple processes can be run on a system.

Demand-paging:

But the physical memory installed in a system need not be same as the requirement of the process. Also multiple processes need to be run.

The solution is to load only some pages of a process into memory, and when the process accesses an address in a page that is not in memory then a page fault is generated and the OS loads that page on demand so that the process can continue executing. This save the time to loadd all pages of that process before transferring control to it - as was case in paging + swapping.

This technique of keeping only parts of a process in memory, and rest on backing store, such as disk, is called demand paging.

Thus demand-paging = paging + swapping + only keep some pages(not all) of a process in memory.

This is all about paging and swapping that I know. Please feel to correct me I am wrong in some place above.

Now my questions are:

how exactly virtual memory and virtual address space(aka linear address space) terms are related to demand-paging in the context of x86 protected mode.
Is "virtual memory of a process" is a correct term or virtual memory is defined for all processes currently running in a multitasking system ?
Am I right in : virtual memory available to a process == highest address in the Virtual address space(aka Linear address space) of a process + 1 ?
This is about segmentation: In a x86 protected mode, we are told that each process can have a 4GB virtual address space(VAS), this means that since segmentation is present on the x86 architecture, we can divide this VAS into two or more segments. In x86 Flat model, we create segments in the VAS of a process s.t they all overlap exactly, so effectively segmentation is disabled - there are no segments. But then if say at a virtual address in the VAS of some process, some cpu instructions are present, then it is possible that we overwrite these instructions while allocating memory(in this VAS) or when we create variables or arrays. How do we ensure that this does not occur. The protection bits in descriptor does not distinguish b/w the regions as in flat mode all segments overlap. These bit can only prevent reading code or executing data, and that too only becoz the segments are accessed via selectors.
or, is it something like each segment is treated as its own VAS. But in that case the total virtual memory(or total VAS) available to a process in a flat mode, would then be : " no. of segments belonging to a process x virtual memory for a single segment ". For a x86 protected mode, this would translate to 6 x 4GB = 24GB of VAS ! assuming 6 segments pointed by CS, DS, ES, GS, FS, SS registers. Is this correct ?
How does a environment that support simple paging(not demand-paging) but not virtual memory, will ensure protection b/w various segments in flat memory model? We have two cases here - a single tasking system and a multi-tasking system.

UPDATE: on 2012-07-29

So if I understand it correctly:

Virtual memory is a concept and it is implemented on x86 architecture by using demand-paging technique + some protection bits(U bit and W bit specifically).

IOWs, the VAS of a process is divided into pages, which are then used in demand-paging.

Virtual memory mechanism has basically two uses in a multi-tasking environemnt:

Size of the program may exceed the amount of physical memory available for it. The operating system keeps those parts of the program currently in use in main memory, and the rest on the disk. This is implemented by demand-paging with each page having an associated 'present bit' and 'accessed bit' in its page table entry.
To provide memory protection by giving each process its own virtual address space, so one process can't access other process's VAS. This is implemented by having some protection bits associated with each page. Specifically, 'User/Supervisor bit - U bit', read/write bit W bit' in the page table entry are used for page access protection.

Virtual memory is useful in both single-tasking system and multi-tasking system. For single-tasking systems, only Use#1 is relevant.

Page access protection has 2 aspects: privledge level protection and write protection. These are implemented by U bit(for prviledge) and W bit(for write) respectively. These bits are present in page table entry for that page.

Memory protection has 2 aspects: protecting programs from accessing each other and protecting programs from overwriting itself, in case segments overlap in VAS of that process/program.

Now former problem is solved by VAS or virtual memory concept, but what about latter ?

The page access protection scheme doesn't prevent the latter as far as I know. IOWs, virtual memory technique doesn't prevent the progams from overwriting itself, in case segments overlap in VAS of a process.

But it seems to me that even segment-level protection can't prevent the latter (overwriting itself) issue of memory protection.

x86 cpu always evaluates segment-level protection before performing the page-level protection check - no matter whether it is flat or multi-segment model - as there is no way to disable segmentation on x86 cpu.

Consider a flat model scenario:

Consider a virtual address referred to by CS:off. Now the DS:off will also refer to the same virtual address as referred by CS:off, if 'off' value is exactly same in both cases. This is true for SS:off also.

This also means that the page in which this virtual/linear address lies, is viewed by paging unit as simply a page as it doesn't know about segmentation.

Assume all segments of a program, in flat mode belong to same privilege level, say ring0.

Now what will happen if we try to write or execute data at CS:off = DS:off = SS:off.

Assume that this address does not belong to the OS code mapped in VAS of process - please just keep aside OS for simplicity, I'm talking about hardware-level protection!

First, segment-level protection will be passed, then the privilege level checks will be passed while accessing this page(the page containing CS:off or DS:off or SS:off) as all segments belong to same privilege here, but what about W bit for this page. This should be set to 1 to allow writes, otherwise say a data segment will not be able to do writes at his page. So this means that this page is writable too.

This means that we can read/write/execute the data at this virtual(linear) address: CS:off = DS:off = SS:off.?

I don't understand that how x86 hardware can provide protection on this issue in case segments overlap.

jacks

Posted 2012-07-28T15:13:08.670

Reputation: 171

Answers

Okay so admittedly there were alot of terms flying around and confusing wording, but I will do my best to answer. As far as I could tell you are correct in most of your understading, but there are some points to go over.

It is important to understand how paging and virtual memory work from a hardware context. Paging would prove impractical without hardware support because processes must be agnostic as to how the memory is laid out, and the operating system should not have to use software to babysit every process on the system. That's where the Memory Management Unit (MMU) comes in. This unit basically is programmed by the operating system to arrange pages in a virtual address space and can be controled at will by the operating system. The operating system can tell the unit which pages are actually in physical RAM, and which pages are not loaded yet or are swapped out.

So, how do we keep programs from messing with this memory management stuff? Something we call protection. We can keep processes sandboxed so that they do not interefere with the operating system and other processes. The confusion as to why all of these terms are thrown around together stems from the fact that they are indeed interconnected. The privelages that code have are specified by the page table. The page table tells the MMU how a virtual space is laid out and also tells the MMU whether a page is A) present B) is read/write C) is allowed to execute code and D) what privilege level (ring) the code on said page can execute.

When the scheduler schedules a process, the page table is not recreated, no new memory needs to be arranged, the Operating System simply tells the MMU to use a different page table, which is a O(1) process, or in other words, not dependent on the size of the process or how much memory it uses. Entire processes are rarely swapped in and out of memory at once, usually it is only pages at a time, so the term "swapping" is often clarified as "page swapping."

Okay so with that background, I will attempt to answer each of your questions:

Linear Address Space simply means that you can access things from 0 to 2^32. No need for fancy segmentation as was necessary in the days of 16-bit processors. Virtual memory simply means that the linear address space of a process is defined not by main memory but by the Operating System, this means that the operating system can arrange pages arbitrarily in this address space, placing itself at a high level and the process at a lower level for example. Additionally the processor can specify which parts of this virtual address space are accessable by what privileges. The operating system (kernel) is loaded in every virtual address space so that the processes can do system calls and so that there is someplace to go when they are preempted. They cannot, however, read or write to this area because that is marked by the OS as "privileged code only." They can only accesses it via system call mechanisms in the processor (ie software interrupts). Demand paging simply means that process expects certain parts of this virtual address space to have specific content (perhaps a file, or even parts of itself), but it isnt really there, the OS has marked the area "not present" in the page table. When the process finally does access this area, because it is not present, the CPU throws a fault which is trapped by the OS. The OS then is smart enough to load that page and restart the process where it left off. The result is that the process is not even aware of the hiccup and things are only loaded as they are demanded, saving memory.
Virtual Memory is the name of this entire mechanism of specifying page tables and thier protection, as well as the pages possibly being on another medium like a disk, therefore paging. Virtual Memory is probably the catchall term for your title, excepting perhaps segmentation. When referring to a specific process, I would personally use something like "Virtual address space of a process," since that unambiguosly refers to the virtual memory layout of a specific process.
No. As I mentioned earlier, the OS can arbitrarily map real memory to any location in the virtual address space of a process. That means that it could, for example, have a situation where the process code is at address 0x0, but the heap (growing down) starts at 0xFFFFFFF, clear on the otherside of the address space. There may actually be constraints on where things are mapped due to device drivers needing specific address areas for hardware, but for the purpose of understanding virtual memory, there is no restriction.
Segmentation is simply an addressing scheme. In the 286, it was also used as the mechanism for implementing protection, but that proved to be too inflexible and so in 32-bit processors protection is always done with paging (though as I understand, the 286 protection schemes are retained for when paging is disabled). Since protection is defined by the paging mechanism, segmentation doesnt cause any more or less risk to overwriting data than in a flat memory mode. With most executable file formats, the code segment is clearly separated from the data segment. As we can expect the code to never change, the operating system generally marks the pages of the code segment as read only, thus any attempt to write to code causes a fault and the program exits. This will never occur if all variables and arrays are allocated via the stack or heap in a modern operating system. If, however, the program starts poking around outside of this, it will crash before it is able to overwrite any code. A greater risk (and one which use to be a big problem) is having your stack overwritten in a buffer overrun. Some could take advantage of this to put code on the stack and then cause it to be executed unauthorizedly. As a fix, a new bit was placed in the page table "No eXecute" (NX) bit. This prevents a page from ever being executed.
This is not at all true. The segments simply act as pointers to an area (segment) of the original 2^32 bytes of address space. The idea behind this originally was that it would keep pointers smalle, since you could have a segment pointer and a pointer inside that segment that was smaller than the entire address space. For example, in the 286 (a 16-bit processor) it made sense to keep pointers at 16-bits, yet this presented a problem because the 286 could address 2^24 bytes of memory. The solution? Use segmentation. A segment could be 2^16 bytes large and they could point anywhere in the address space. Then when code had to operate, it would use 16-bit pointers for within that segment only. This was faster and more eifficient. When 32-bit processors came out this mechanism was not longer neccessary, but since it was used so much by code prior and programmers were used to them, they kept segmentation. Newer 64-bit processors do not use segmentation at all.
Confusion here is the fact that virtual memory is the term for many of these different mechanisms. Virtual Memory is required for multitasking for protection of one process from another processes address space. Paging, and by extension preemptive multitasking, is only possible with virtual memory features. Many of these features, however, can be effectively disabled. Perhaps you dont want address translation? Then map every page to itself. Perhaps you dont want memory protection but want address translation? Then give all privileges to every page. In DOS and other single processor systems, a confusion arises when one refers to "protected mode." Usually this refers to 32-bit mode as opposed to 16-bit real mode, so despite the name it does not necessarily mean that protection is used, only that in that mode it can be enabled. There are probably many single process systems that run in this "protected mode" but do not use virtual memory nor protection. The original Xbox is a good example of that. There can be a slight increase in performance when all these features are disabled. However, in DOS it still may be advantageous to use many of these features. The most notable is page swapping, since in the early days when DOS was ubiquitious, RAM was hard to come by and therefore any mechanism that saved on RAM was welcomed and well used. Protection had its advantages in single process systems as well, since it could prevent the program from crashing in an ugly manner, allow for better debugging, and prevented data corruption due to bad hardware access.

I hope this answers your question.

Dougvj

Posted 2012-07-28T15:13:08.670

Reputation: 942

Okay, it is getting clearer. :) But I didn't get your answer to my Q#4 and Q#6 . Regarding Q#4 see my update on 2012-07-29. As far as Q#6 is concerned, I asked it becoz some DPMI environments (those that run applications at ring0, and themselves run on pure DOS) don't support virtual memory but support paging. I didn't get how paging is useful in such a case especially if a single-tasking environment e.g. DOS is present on the system?

– jacks – 2012-07-29T05:07:23.183

@jacks I modified the response to try and answer those questions better. The short version is this: Code overwriting is prevented by making the code pages read only; DPMI does not always mean protection or paging is enabled, but sometimes it is to save memory and to prevent terrible crashing. – Dougvj – 2012-07-29T16:20:50.060

That means, w/o implementing virtual memory, paging is useless. Isn't it? Regarding "x86 hardware can provide protection on this issue in case segments overlap", it seems that w/o OS support, ie say we execute our program in ring0 with no OS, ie our application is a kernel like app., then there is nothing in x86 hardware, that can stop us to execute data and to write to code segment. It is the OS that alongwith x86 hardware support, prevents happining this. Am I right? – jacks – 2012-08-01T07:28:32.277

That's right, although our ring0 app could act like the OS and set up the correct protection mechanisms to prevent that, it would really only be useful in debugging and not actual security. Page swapping would be useful for saving memory but it would be non-trivial to implement for a single app. – Dougvj – 2012-08-01T15:37:16.050

Got it. :) I'll research more on this subject and see if there is something awaiting to confuse me. – jacks – 2012-08-01T15:43:40.223