Section 9.6 "Overcommit and OOM" in the doc that @dunxd mentions is particularly graphic on the dangers of allowing overcommit. However, the overcommit_ratio of 80 looked interesting to me as well, so I conducted a few tests.
What I found is that the overcommit_ratio affects the total RAM available to ALL processes. Root processes don't seem to be treated differently from normal user processes.
Setting the ratio to 100 or less should provide the classic semantics, where return values from malloc/sbrk are reliable. Setting ratios lower than 100 might be a way to reserve more RAM for non-process activities like caching and so forth.
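For what it's worth (and if I'm reading the kernel's overcommit-accounting doc right), with vm.overcommit_memory = 2 the commit limit works out to roughly

CommitLimit = swap + RAM * overcommit_ratio / 100

(ignoring hugetlb pages), and you can watch CommitLimit versus Committed_AS in /proc/meminfo. With no swap, a ratio of 100 caps total commitments at roughly all of RAM and 50 at about half of it; what a test program can grab is whatever is left of that cap after everything already running is accounted for.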
So, on my computer with 24 GiB of RAM, with swap disabled, 9 GiB in use, and top showing
Mem: 24683652k total, 9207532k used, 15476120k free, 19668k buffers
Swap: 0k total, 0k used, 0k free, 241804k cached
Here are some overcommit_ratio settings and how much RAM my ram-consumer program could grab (touching each page) - in each case the program exited cleanly once malloc failed.
overcommit_ratio    RAM grabbed
50                  ~680 MiB
60                  ~2900 MiB
70                  ~5200 MiB
100                 ~12000 MiB
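For reference, a minimal ram-consumer along these lines might look like the sketch below. It's not the exact program I used, and the 64 MiB chunk size and 4 KiB page size are just assumptions, but it shows the allocate-touch-repeat loop and the clean exit once malloc fails:

#include <stdio.h>
#include <stdlib.h>

#define CHUNK (64UL * 1024 * 1024)   /* 64 MiB per allocation - arbitrary */
#define PAGE  4096UL                 /* assume 4 KiB pages */

int main(void)
{
    size_t total = 0;
    for (;;) {
        char *p = malloc(CHUNK);
        if (p == NULL)               /* reliable with overcommit_memory = 2 */
            break;
        for (size_t off = 0; off < CHUNK; off += PAGE)
            p[off] = 1;              /* touch every page so it's really backed by RAM */
        total += CHUNK;
    }
    printf("malloc failed after %zu MiB\n", total >> 20);
    return 0;
}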
Running several at once, even with some as the root user, didn't change the total amount they consumed together. It's interesting that together they were unable to consume the last 3+ GiB or so; the free figure didn't drop much below what's shown here:
Mem: 24683652k total, 20968212k used, 3715440k free, 20828k buffers
The experiments were messy - anything that happens to call malloc while all RAM is in use tends to crash, since many programmers are terrible about checking for malloc failures in C, some popular collection libraries ignore them entirely, and C++ and various other languages are even worse.
Most of the early implementations of imaginary RAM (i.e. overcommit) I saw were there to handle a very specific case: a single large process - say 51%+ of available memory - needed to fork() in order to exec() some support program, usually a much, much smaller one. OSes with copy-on-write semantics would allow the fork(), but with the proviso that if the forked process actually tried to modify too many memory pages (each of which would then have to be instantiated as a new page independent from the initial huge process) it would end up getting killed. The parent process was only in danger if it allocated more memory, and it could handle running out, in some cases just by waiting a bit for some other process to die and then continuing. The child process usually just replaced itself with a (typically smaller) program via exec() and was then free of the proviso.
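To make that scenario concrete, here's a bare-bones sketch of the pattern (the ls helper is just a stand-in for the small support program):

#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    /* Pretend this process already holds most of RAM. */
    pid_t pid = fork();              /* with strict accounting, this is where it can fail */
    if (pid < 0) {
        perror("fork");
        return 1;
    }
    if (pid == 0) {
        /* Child: touches almost nothing before replacing itself, so very
         * few copy-on-write pages ever need to be instantiated. */
        execlp("ls", "ls", "-l", (char *)NULL);
        _exit(127);                  /* only reached if exec failed */
    }
    int status;
    waitpid(pid, &status, 0);        /* parent carries on with its memory intact */
    return 0;
}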
Linux's overcommit concept is an extreme approach to allowing both the fork() to occur and allowing single processes to massively overallocate. OOM-killer-caused deaths happen asynchronously, even to programs that do handle memory allocation responsibly. I personally hate system-wide overcommit in general and the OOM killer in particular - it fosters a devil-may-care approach to memory management that infects libraries and, through them, every app that uses them.
I'd suggest setting the ratio to 100 and having a swap partition as well, one that would generally only end up getting used by huge processes - which often use only a tiny fraction of the part of themselves that gets stuffed into swap - and thus protects the vast majority of processes from the OOM-killer misfeature. This should keep your webserver safe from random death, and if it was written to handle malloc failures responsibly, even safe from killing itself (but don't bet on the latter).
That means I'm using this in /etc/sysctl.d/10-no-overcommit.conf:
vm.overcommit_memory = 2
vm.overcommit_ratio = 100
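After dropping that file in place, sysctl --system (on most modern distros) or a couple of sysctl -w commands will apply it without a reboot, and grep Commit /proc/meminfo is a quick way to confirm the new CommitLimit and see how close Committed_AS is getting to it.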