Linux: The difference between "paging on major page fault" and "swapping enabled manually"

Question

On a Linux machine, we can enable swap with commands like the following:

sudo fallocate -l 500M /data/swapfile
sudo chmod 600 /data/swapfile
sudo mkswap /data/swapfile
sudo swapon /data/swapfile

But even when this is not enabled, the kernel still does paging when a page is not in memory.

We can verify this by running the sar -B 1 30 command on a machine without setting any swap file.

03:08:40 AM  pgpgin/s pgpgout/s   fault/s  majflt/s  pgfree/s pgscank/s pgscand/s pgsteal/s    %vmeff
03:08:41 AM      0.00      0.00      3.00      0.00     44.00      0.00      0.00      0.00      0.00
03:08:42 AM      0.00      0.00     19.00      0.00     30.00      0.00      0.00      0.00      0.00
03:08:43 AM      0.00      0.00      0.00      0.00      3.00      0.00      0.00      0.00      0.00
03:08:44 AM     24.00      0.00      2.00      1.00      7.00      0.00      0.00      0.00      0.00
03:08:45 AM    364.00     60.00     18.00      3.00      4.00      0.00      0.00      0.00      0.00
03:08:46 AM    140.00      0.00    392.00      2.00    243.00      0.00      0.00      0.00      0.00

There is still majflt which will trigger paging out data to the disk.

My questions are:

Can we say there are two types of swapping on the OS?
How do the two mechanisms work differently?
If there is always a paging mechanism working, why is there still a need to enable swap manually?

I know some people said:

Swapping refers to copying the entire process address space, or at any rate, the non-shareable-text data segment, out to the swap device, or back, in one go (typically disk).

Whereas paging refers to copying in/out one or more pages of the address space. In particular, this is at a much finer grain. For example, there are ~250,000 4 KB pages in a 1 GB RAM address space.

However, according to the book Understanding the Linux Virtual Memory Manager, it doesn't seem to work this way in Linux.

Strictly speaking, Linux does not swap as “swapping” refers to copying an entire process address space to disk and “paging” to copying out individual pages. Linux actually implements paging as modern hardware supports it, but traditionally has called it swapping in discussions and documentation. To be consistent with the Linux usage of the word, we too will refer to it as swapping.

Could someone shed some light on this? Thanks!

score 2 · Accepted Answer · answered Mar 27 '20 at 14:57

Modern operating systems typically implement virtual memory in terms of small fixed-size chunks called pages, and that includes whatever gets swapped out to disk. This is an improvement over having to swap out entire programs, as early UNIX System V did.

Several sources have emphasized that paging is distinct from the old whole-process swapping, including Understanding the Linux Virtual Memory Manager. But notice that the term "swapping" survives.

Confusingly, swapping is only a subset of paging. Executables and memory-mapped files are examples where page faults can occur independently of swap space: these file mappings are already backed by permanent storage, so their pages can be read in from the file itself. In contrast, swap space holds anonymous pages, which have no file behind them.
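To see file-backed page faults happen with no swap involved, here is a minimal Python sketch. It maps a scratch file and reads one byte per page, then reports how many faults the process took. The helper names (`fault_counts`, `touch_file_mapping`) are made up for illustration. Because the scratch file was just written, it is most likely still in the page cache, so expect minor faults rather than major ones; the kernel may also map several pages per fault, so the count can be lower than the number of pages touched.

```python
import mmap
import resource
import tempfile

PAGE = mmap.PAGESIZE


def fault_counts():
    """Return (minor, major) page-fault counts for this process so far."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return usage.ru_minflt, usage.ru_majflt


def touch_file_mapping(num_pages=256):
    """Map a scratch file, read one byte per page, return (checksum, minor, major)."""
    with tempfile.NamedTemporaryFile() as f:
        f.write(b"x" * (PAGE * num_pages))
        f.flush()
        before = fault_counts()
        # A read-only, file-backed mapping: pages are filled from the file on demand.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            checksum = sum(m[i] for i in range(0, len(m), PAGE))
        after = fault_counts()
    return checksum, after[0] - before[0], after[1] - before[1]


if __name__ == "__main__":
    _, minor, major = touch_file_mapping()
    print(f"minor faults: {minor}, major faults: {major}")
```

None of this touches swap space: the pages come from the file (or the page cache), which is why a swapless machine still shows fault activity.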

So the sar paging statistics are a different metric from vmstat's swap in / swap out.
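As a rough illustration of that split, the kernel exposes separate counters in /proc/vmstat: `pgpgin`, `pgpgout`, `pgfault` and `pgmajfault` for paging in general, and `pswpin` / `pswpout` for swap specifically. As far as I know these are the counters that sar -B and vmstat's si/so columns are derived from, but treat that as an assumption. The function names below are hypothetical.

```python
import os

PAGING_KEYS = ("pgpgin", "pgpgout", "pgfault", "pgmajfault")
SWAP_KEYS = ("pswpin", "pswpout")


def parse_vmstat(text):
    """Parse '/proc/vmstat'-style text ('name value' per line) into a dict."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def split_paging_and_swap(counters):
    """Separate general paging counters from swap-only counters."""
    paging = {key: counters.get(key, 0) for key in PAGING_KEYS}
    swap = {key: counters.get(key, 0) for key in SWAP_KEYS}
    return paging, swap


if __name__ == "__main__":
    path = "/proc/vmstat"
    if os.path.exists(path):  # Linux only
        with open(path) as f:
            paging, swap = split_paging_and_swap(parse_vmstat(f.read()))
        print("paging:", paging)
        print("swap:  ", swap)
```

On a machine with no swap configured, the swap counters should stay at zero while the paging counters keep climbing.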

Without swap space, there is no means to reclaim anonymous pages. Workloads still need their memory, so the pressure is more intense on file caches. Adding some swap space lets the kernel move anonymous pages out as well. It isn't helpful as "emergency memory", though: aggressive reclaim to slow swap space is terrible for performance.
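To get a feel for how much memory falls into each category on a given machine, here is a small sketch that sums the anonymous and file-backed LRU fields from /proc/meminfo and checks whether any swap is configured. The function names and the `anon_has_backing_store` flag are my own simplification for illustration, not something the kernel reports.

```python
import os


def parse_meminfo(text):
    """Parse '/proc/meminfo'-style text into {field: number} (kB for most fields)."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            info[name.strip()] = int(fields[0])
    return info


def reclaim_summary(info):
    """Summarize anonymous vs file-backed memory and whether swap exists."""
    anon = info.get("Active(anon)", 0) + info.get("Inactive(anon)", 0)
    file_backed = info.get("Active(file)", 0) + info.get("Inactive(file)", 0)
    swap_total = info.get("SwapTotal", 0)
    return {
        "anon_kb": anon,
        "file_kb": file_backed,
        "swap_total_kb": swap_total,
        # Simplification: anonymous pages can only be paged out if swap exists.
        "anon_has_backing_store": swap_total > 0,
    }


if __name__ == "__main__":
    path = "/proc/meminfo"
    if os.path.exists(path):  # Linux only
        with open(path) as f:
            print(reclaim_summary(parse_meminfo(f.read())))
```

With `SwapTotal` at zero, only the file-backed portion is available for reclaim, which is the extra pressure on file caches described above.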

An analogy: consider physically moving house. The "schedule" of memory allocations may force a time critical rush for the kernel "movers" to pick up and move everything. Add an external storage locker, and things not immediately needed can be stored there ahead of time in less disruptive smaller loads. That is swap space used effectively.

The classic example of why you need swap space is all those services that start up when your machine starts up that initialize themselves and dirty lots of pages of memory. Many of those services will sit idle for days. You don't want all those dirty pages locked in RAM serving no purpose. But the OS can't discard them because the services may run in a second, minute, or day. — David Schwartz, Mar 27 '20 at 15:18

c4f4t0r · Answer 2 · 2020-03-28T21:43:44.590

Even if you don't have swap enabled, the kernel still uses virtual memory.

In Linux, the process definition structures live in kernel memory (logical addresses). This means that when memory runs out, the Linux kernel doesn't swap out the process structures.

Linux doesn't swap itself out the way Solaris does; Solaris swaps the kernel structures of processes too.

Almost every operating system uses paging to translate virtual memory to physical memory.

Linux uses a page reclamation system to free up memory when it runs low, or to evict unused pages holding user data. This is what swap is used for.

The process pages that belong to user space are called virtual memory; these user pages can end up in swap space.

Much of this terminology is general to Unix, with some small differences between Linux and Unix.

The process segments (text, data, stack, bss, heap) reside in user space.
