- What is the quickest way to regain control of a Linux system that has become nonresponsive or extremely sluggish due to excessive swapping?
That part is already answered above with Alt-SysRq-F.
- Is there an effective way to prevent such swapping from occurring in the first place, for instance by limiting the amount of memory a process is allowed to try to allocate?
I'm answering this second part: yes, ulimit still works well enough to limit a single process. You can:
- set a soft limit for a process you know will likely go out of control
- set a hard limit for all processes if you want extra insurance
Also, as briefly mentioned:
You can use CGroups to limit resource usage and prevent such problems
Indeed, cgroups offer more advanced control, but are currently more complicated to configure in my opinion.
Old school ulimit
Once off
Here's a simple example:
$ bash
$ ulimit -S -v $((1*2**20))
$ r2(){ r2 $@$@;};r2 r2
bash: xmalloc: .././subst.c:3550: cannot allocate 134217729 bytes (946343936 bytes allocated)
It:
- Sets a soft limit of 1GB on overall (virtual) memory use (ulimit takes its limit in kB)
- Runs the recursive bash function call r2(){ r2 $@$@;};r2 r2, which exponentially chews up CPU and RAM by infinitely doubling itself while requesting stack memory.
As you can see, it got stopped when trying to request more than 1GB.
Note: -v operates on virtual memory allocation (total, i.e. physical + swap).
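To see that the cap is on address space rather than on touched pages, reserving a large buffer fails at allocation time, even before any of it is used (a sketch; assumes python3 is available):

```shell
bash -c '
  ulimit -S -v $((1*2**20))    # 1 GB of virtual address space
  # Asking for a ~2 GB buffer trips the limit immediately,
  # even though none of its pages have been written to yet.
  python3 -c "import ctypes; ctypes.create_string_buffer(2**31)" 2>/dev/null \
    || echo "allocation refused"
'
```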
Permanent protection
To limit virtual memory allocation permanently, the as item in limits.conf is the equivalent of -v.
I do the following to protect against any single misbehaving process:
- Set a hard address space limit for all processes.
address space limit = <physical memory> - 256MB
- Therefore, no single process with greedy memory use or an active loop and memory leak can consume ALL the physical memory.
- 256MB headroom is there for essential processing with ssh or a console.
One liner:
$ sudo bash -c "echo -e \"*\thard\tas\t$(($(grep -E 'MemTotal' /proc/meminfo | grep -oP '(?<=\s)\d+(?=\skB$)') - 256*2**10))\" > /etc/security/limits.d/mem.conf"
To validate, this results in the following (e.g. on 16GB system):
$ cat /etc/security/limits.d/mem.conf
* hard as 16135196
$ ulimit -H -v
16135196
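The one-liner can be unpacked into an equivalent, more readable sketch (same arithmetic; the final write into limits.d still needs root, e.g. via sudo tee):

```shell
# Total physical memory in kB, straight from /proc/meminfo.
mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
# Leave 256 MB (in kB) of headroom for ssh/console essentials.
limit_kb=$((mem_kb - 256*2**10))
# limits.conf format: <domain> <type> <item> <value>
printf '*\thard\tas\t%d\n' "$limit_kb"
# Redirect that line into /etc/security/limits.d/mem.conf as root, e.g.:
#   printf ... | sudo tee /etc/security/limits.d/mem.conf
```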
Notes:
- Only mitigates against a single process going overboard with memory use.
- Won't prevent a multi-process workload with heavy memory pressure causing thrashing (cgroups is then the answer).
- Don't use the rss option in limits.conf. It's not respected by newer kernels.
- It's conservative.
- In theory, a process can speculatively request lots of memory but only actively use a subset (smaller working set/resident memory use).
- The above hard limit will cause such processes to abort (even if they might have otherwise run fine given Linux allows the virtual memory address space to be overcommitted).
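The overcommit behaviour referred to above can be inspected (and tuned) via sysctl; a read-only sketch:

```shell
# vm.overcommit_memory: 0 = heuristic (default), 1 = always allow,
# 2 = strict commit accounting (no overcommit beyond the commit limit)
cat /proc/sys/vm/overcommit_memory
# In mode 2 the commit limit is swap + overcommit_ratio % of RAM:
cat /proc/sys/vm/overcommit_ratio
# The kernel's current view of the limit and of committed memory:
grep -E '^Commit' /proc/meminfo
```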
Newer CGroups
Offers more control, but currently more complex to use:
- Improves on the ulimit offering: memory.limit_in_bytes can account for and limit physical memory separately (memory.max_usage_in_bytes tracks its peak use).
- Whereas ulimit -m and/or the rss item in limits.conf was meant to offer similar functionality, it hasn't worked since Linux 2.4.30!
- Need to enable some kernel cgroup flags in the bootloader: cgroup_enable=memory swapaccount=1
- This didn't happen by default with Ubuntu 16.04.
- Probably due to some performance implications of extra accounting overhead.
- cgroup/systemd stuff is relatively new and changing a fair bit, so the flux upstream implies Linux distro vendors haven't yet made it easy to use. Between 14.04LTS and 16.04LTS, the user space tooling to use cgroups has changed.
cgm now seems to be the officially supported userspace tool.
- systemd unit files don't yet seem to have any pre-defined "vendor/distro" defaults to prioritise important services like ssh.
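Before relying on the memory controller, it's worth checking it is actually enabled (a sketch for the cgroup-v1 layout used in this answer; paths differ under cgroup v2):

```shell
# The memory controller should appear with its "enabled" column = 1.
grep -E '^memory' /proc/cgroups
# Swap accounting only exists if booted with swapaccount=1:
if [ -e /sys/fs/cgroup/memory/memory.memsw.limit_in_bytes ]; then
  echo "swap accounting on"
else
  echo "swap accounting off (or cgroup v2): check the kernel cmdline"
fi
```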
E.g. to check current settings:
$ echo $(($(cat /sys/fs/cgroup/memory/memory.max_usage_in_bytes) / 2**20)) MB
11389 MB
$ cat /sys/fs/cgroup/memory/memory.stat
...
E.g. to limit the memory of a single process:
$ cgm create memory mem_1G
$ cgm setvalue memory mem_1G memory.limit_in_bytes $((1*2**30))
$ cgm setvalue memory mem_1G memory.memsw.limit_in_bytes $((1*2**30))
$ bash
$ cgm movepid memory mem_1G $$
$ r2(){ r2 $@$@;};r2 r2
Killed
To see it in action chewing up RAM as a background process and then getting killed:
$ bash -c 'cgm movepid memory mem_1G $$; r2(){ r2 $@$@;};r2 r2' & while [ -e /proc/$! ]; do ps -p $! -o pcpu,pmem,rss h; sleep 1; done
[1] 3201
0.0 0.0 2876
102 0.2 44056
103 0.5 85024
103 1.0 166944
...
98.9 5.6 920552
99.1 4.3 718196
[1]+ Killed bash -c 'cgm movepid memory mem_1G $$; r2(){ r2 $@$@;};r2 r2'
Note the exponential (power of 2) growth in memory requests.
In the future, let's hope to see "distro/vendors" pre-configure cgroup priorities and limits (via systemd units) for important things like SSH and the graphical stack, such that they never get starved of memory.
About disabling the swap, according to http://unix.stackexchange.com/a/24646/9108 it might not be the best option. – sashoalm – 2016-09-05T19:01:48.183
Indeed, someone commented the same to me, so I've modified the thrash-protect doc at that point. – tobixen – 2016-09-09T06:43:02.090