44

My VPS web server running on CentOS 5.4 (Linux kernel 2.6.16.33-xenU) irregularly (like once a month give or take a few weeks) becomes unresponsive due to oom-killer kicking in. Monitoring of the server shows that it doesn't normally run out of memory, just every so often.

I've read a couple of blogs that point to this page which discusses configuring the kernel to better manage overcommit using the following sysctl settings:

vm.overcommit_memory = 2
vm.overcommit_ratio = 80

My understanding of this (which may be wrong, but I can't find a canonical definition to clarify) is that this prevents the kernel from allocating memory beyond swap + 80% of physical memory.

However, I have also read some other sources suggesting that these settings are not a good idea - although the critics of this approach seem to be saying "don't do things to break your system, rather than attempting this kludge", on the assumption that causation is always known.

So my question is: what are the pros and cons of this approach, in the context of an Apache2 web server hosting about 10 low-traffic sites? In my specific case, the web server has 512 MB of RAM, with 1024 MB of swap space. This seems to be adequate for the vast majority of the time.
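
If that reading is right, then with those settings the commit limit on this box would work out to roughly:

CommitLimit = swap + 80% of RAM = 1024 MB + 0.8 × 512 MB ≈ 1434 MB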

dunxd
  • 9,482
  • 21
  • 80
  • 117

2 Answers

37

Setting overcommit_ratio to 80 is likely not the right action. Setting the value to anything less than 100 is almost always incorrect.

The reason for this is that Linux applications allocate more than they really need. Say they allocate 8 KB to store a two-character string of text. That's several KB unused right there. Applications do this a lot, and this is what overcommit is designed for.

So basically with overcommit at 100, the kernel will not allow applications to allocate any more memory than you have (swap + RAM). Setting it at less than 100 means that you will never use all your memory. If you are going to change this setting, you should set it higher than 100 because of the aforementioned scenario, which is quite common.
However, while setting it greater than 100 is almost always the correct answer, there are some use cases where setting it less than 100 is correct. As mentioned, by doing so you won't be able to use all your memory. However, the kernel still can. So you can effectively use this to reserve some memory for the kernel (e.g. the page cache).
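
As a rough illustration of that allocation pattern (a toy C sketch of my own, purely illustrative):

/* Toy sketch: reserve far more address space than is ever touched.
 * With overcommit, the untouched pages cost nothing until written;
 * with strict accounting they all count against CommitLimit anyway. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    size_t reserved = 64 * 1024 * 1024;   /* 64 MiB of address space */
    char *buf = malloc(reserved);
    if (buf == NULL) {
        perror("malloc");
        return 1;
    }
    strcpy(buf, "hi");                     /* only the first page is dirtied */
    printf("reserved %zu MiB, touched one page\n", reserved / (1024 * 1024));
    free(buf);
    return 0;
}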

Now, as for your issue with the OOM killer triggering, manually setting overcommit is unlikely to fix this. The default setting (heuristic overcommit handling) is fairly intelligent.

If you wish to see whether this is really the cause of the issue, look at /proc/meminfo when the OOM killer runs. If you see that Committed_AS is close to CommitLimit, but free is still showing free memory available, then yes, you can manually adjust the overcommit for your scenario. Setting this value too low will cause the OOM killer to start killing applications when you still have plenty of memory free. Setting it too high can cause random applications to die when they try to use memory they were allocated but which isn't actually available (when all the memory really does get used up).
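
If it helps, here is a minimal C sketch (my own; a plain `grep Commit /proc/meminfo` from a shell does the same job) that prints just those two fields so they can be logged over time:

/* Minimal sketch: print the CommitLimit and Committed_AS lines
 * from /proc/meminfo so they can be recorded periodically. */
#include <stdio.h>
#include <string.h>

int main(void)
{
    char line[256];
    FILE *f = fopen("/proc/meminfo", "r");
    if (f == NULL) {
        perror("/proc/meminfo");
        return 1;
    }
    while (fgets(line, sizeof line, f) != NULL) {
        if (strncmp(line, "CommitLimit:", 12) == 0 ||
            strncmp(line, "Committed_AS:", 13) == 0)
            fputs(line, stdout);
    }
    fclose(f);
    return 0;
}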

phemmer
  • 5,789
  • 2
  • 26
  • 35
  • 1
    Thanks - I'm trying things with overcommit_ratio set to 100 to see what happens. The main problem I have is that when oom-killer starts it invariably kills sshd preventing me from accessing the server and seeing what is going on. I guess what I really need is to stop oom-killer from running and some means of recording what happens when it would have run so I can find the cause of the problem. – dunxd Feb 23 '12 at 09:41
  • 6
    @dunxd you can use `/proc/<pid>/oom_score_adj` for this purpose. For example, if you set oom_score_adj to -1000 for sshd, the oom killer will never target sshd when it wants to kill something. Stopping the oom killer entirely isn't a good idea, as then your programs won't be able to malloc memory, and they'll die anyway. – phemmer Feb 24 '12 at 14:44
  • 1
    That would be a great function if the PID for an app was static... – dunxd Feb 24 '12 at 22:40
  • 4
    @dunxd it's inherited. Have your init script set it on itself, and anything started by the init script inherits it (a minimal wrapper sketch follows these comments). – phemmer Feb 25 '12 at 00:49
  • 7
    Your 4 KB example is wrong. Virtual memory is used with pages and the (smallest) size of a page under Linux is 4 KB. That means storing a couple of characters requires 4 KB to be mapped somewhere regardless of overcommitment settings. A proper example of memory overcommitment would be, say, allocating 10 KB and only using the first 4100 bytes. That means two 4 KB pages are needed to store the data and one extra page is unused. Non-overcommitting systems will always have that third page ready to store data should the demand arrive; overcommitting systems won't enforce that. – jlliagre Sep 27 '13 at 07:36
  • 2
    /proc/self points to the current process, so /proc/self/oom_score_adj could be used to change oom_score_adj of the current process. – r_2 Aug 28 '14 at 20:50
  • 1
    How does `overcommit_ratio` interact with a setting of 1 (always overcommit) for `overcommit_memory`? If it always allows overcommit, does that mean the ratio is irrelevant? – CMCDragonkai Jan 13 '16 at 09:03
  • 3
    Setting overcommit_ratio to 80 or 90 makes sense if you are running an application (e.g., PostgreSQL) which counts on the OS maintaining a RAM cache of disk pages. Generally, you would rather have the OS start to swap than to throw away all cache. – kgrittn Aug 24 '17 at 19:36
  • 1
    "Setting the value to anything less than 100 is almost always incorrect." => Any backing for that? Debian, Armbian, Ubuntu and Mint have set it to 50 by default. Either something changed since you wrote your answer, or they are all incorrect ;) – Izzy Oct 06 '19 at 17:51
  • Setting `overcommit_ratio` to less than 100 makes sense only if all the processes that you run have been carefully written not to acquire unused memory pages. If you run a regular Linux distribution with lots of services, that does not apply to your system. Running with `overcommit_ratio` below 100 makes sure that the OOM Killer never kills anybody, but any random process may fail any random `malloc()`. And chances are very high that some of those failures are not handled correctly. This leads to random processes crashing left and right or, in the worst case, memory corruption in random processes. – Mikko Rantalainen Apr 01 '20 at 09:49
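
A minimal wrapper sketch of the inheritance trick described in the comments above (hypothetical: the wrapper is illustrative only, and on kernels as old as the 2.6.16 one in the question the file is /proc/<pid>/oom_adj with a different value range):

/* Hypothetical "oomwrap" sketch: write -1000 to our own oom_score_adj
 * (meaning "never pick this process"), then exec the real daemon.
 * Anything the daemon forks inherits the adjustment.
 * Usage (illustrative): ./oomwrap /usr/sbin/sshd */
#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    FILE *f = fopen("/proc/self/oom_score_adj", "w");
    if (f != NULL) {
        fputs("-1000", f);
        fclose(f);
    }
    if (argc > 1)
        execvp(argv[1], argv + 1);
    return 1;
}
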
27

Section 9.6 "Overcommit and OOM" in the doc that @dunxd mentions is particularly graphic on the dangers of allowing overcommit. However, the 80 looked interesting to me as well, so I conducted a few tests.

What I found is that the overcommit_ratio affects the total RAM available to ALL processes. Root processes don't seem to be treated differently from normal user processes.

Setting the ratio to 100 or less should provide the classic semantics where return values from malloc/sbrk are reliable. Setting ratios lower than 100 might be a way to reserve more RAM for non-process activities like caching and so forth.
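
For reference, the limit being enforced is the CommitLimit value reported in /proc/meminfo which, if I'm reading the kernel's overcommit-accounting documentation correctly, works out to:

CommitLimit = swap + (RAM × overcommit_ratio / 100)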

So, on my computer with 24 GiB of RAM, swap disabled, and 9 GiB in use, with top showing:

Mem:  24683652k total,  9207532k used, 15476120k free,    19668k buffers
Swap:        0k total,        0k used,        0k free,   241804k cached

Here are some overcommit_ratio settings and how much RAM my ram-consumer program could grab (touching each page) - in each case the program exited cleanly once malloc failed.

 50    ~680 MiB
 60   ~2900 MiB
 70   ~5200 MiB
100  ~12000 MiB
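
(For reference, a sketch of the sort of consumer this describes - illustrative only, not the exact program: allocate in 1 MiB chunks, touch every page, and exit cleanly once malloc fails.)

/* Sketch of a ram-consumer: grab memory in 1 MiB chunks, touch every page,
 * and exit cleanly as soon as malloc() refuses to hand out more. */
#include <stdio.h>
#include <stdlib.h>

#define CHUNK (1024 * 1024)
#define PAGE  4096

int main(void)
{
    size_t total = 0;
    size_t i;
    for (;;) {
        char *p = malloc(CHUNK);
        if (p == NULL)
            break;                  /* hit the commit limit */
        for (i = 0; i < CHUNK; i += PAGE)
            p[i] = 1;               /* touch every page so it is really used */
        total += CHUNK;
    }
    printf("allocated and touched %zu MiB before malloc failed\n",
           total / CHUNK);
    return 0;
}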

Running several at once, even with some as the root user, didn't change the total amount they consumed together. It's interesting that it was unable to consume the last 3+ GiB or so; free didn't drop much below what's shown here:

Mem:  24683652k total, 20968212k used,  3715440k free,    20828k buffers

The experiments were messy - anything that uses malloc at the moment all RAM is in use tends to crash, since many programmers are terrible about checking for malloc failures in C, some popular collection libraries ignore it entirely, and C++ and various other languages are even worse.

Most of the early implementations of imaginary RAM I saw were to handle a very specific case, where a single large process - say 51%+ of available memory - needed to fork() in order to exec() some support program, usually a much, much smaller one. OSes with copy-on-write semantics would allow the fork(), but with the proviso that if the forked process actually tried to modify too many memory pages (each of which would then have to be instantiated as a new page independent from the initial huge process) it would end up getting killed. The parent process was only in danger if allocating more memory, and could handle running out, in some cases just by waiting a bit for some other process to die and then continuing. The child process usually just replaced itself with a (typically smaller) program via exec() and was then free of the proviso.

Linux's overcommit concept is an extreme approach to allowing both the fork() to occur as well as allowing single processes to massively overallocate. OOM-killer-caused deaths happen asynchronously, even to programs that do handle memory allocation responsibly. I personally hate system-wide overcommit in general and the oom-killer in particular - it fosters a devil-may-care approach to memory management that infects libraries and, through them, every app that uses them.

I'd suggest setting the ratio to 100, and having a swap partition as well that would generally only end up getting used by huge processes - which often use only a tiny fraction of the part of themselves that gets stuffed into swap - and thus protect the vast majority of processes from the OOM killer misfeature. This should keep your webserver safe from random death, and if it was written to handle malloc responsibly, even safe from killing itself (but don't bet on the latter).

That means I'm using this in /etc/sysctl.d/10-no-overcommit.conf:

vm.overcommit_memory = 2
vm.overcommit_ratio = 100
Alex North-Keys
  • 531
  • 4
  • 6