
From time to time "my" server stalls because it runs out of both memory and swap space (it keeps responding to ping, but nothing more than that, not even ssh).

I'm told Linux does memory overcommitment, which as far as I understand is the same thing banks do with money: it grants processes more memory than is actually available, assuming that most processes won't actually use all the memory they ask for, at least not all at the same time.

Please assume this is actually the cause why my system occasionally hangs, let's not discuss here whether or not this is the case (see What can cause ALL services on a server to go down, yet still responding to ping? and how to figure out).

So,

  1. how do I disable, or drastically reduce, memory overcommitment in CentOS? I've read there are two settings called vm.overcommit_memory (values 0, 1, or 2) and vm.overcommit_ratio, but I have no idea where to find and change them (some configuration file, hopefully), what values I should try, and whether I need to reboot the server to make the changes effective.

  2. and is it safe? What side effects could I expect? When googling for overcommit_memory I find scary things, like people saying their server can't boot anymore...

Since the sudden increases in memory usage are caused by MySQL queries issued by PHP scripts while serving HTTP requests, I would expect just some PHP scripts to fail to complete, and hence some 500 responses from time to time when the server is too busy. That's a risk I can take; it's certainly better than having the whole server become inaccessible and having to hard-reboot it.

Or can it really cause my server to be unable to reboot if I choose the wrong settings?

matteo
  • Disabling overcommit isn't going to help you when you're _really_ running out of memory. Adding RAM to the server might help, though. – Michael Hampton Mar 08 '13 at 02:03
  • Disabling overcommit is not going to be the final solution, but it IS going to help a lot: if the server ever runs out of memory (which happens only once in a long while, for a few seconds), I would only have a few HTTP requests rejected (or badly served), instead of having my server die completely and forever (until I restart it). – matteo Mar 10 '13 at 15:26

3 Answers


Memory overcommit can be disabled with vm.overcommit_memory=2.

0 is the default mode, where the kernel heuristically decides whether to grant an allocation by weighing the request against the memory currently available. Setting it to 1 enables the wizardry mode, where the kernel always advertises that it has enough free memory for any allocation. Setting it to 2 means that processes can only commit memory up to a configurable limit (swap plus a percentage of RAM determined by vm.overcommit_ratio) and will start getting allocation failures or OOM messages when they go beyond that amount.

Is it safe to do so? No. I haven't seen any proper use case where disabling memory overcommit actually helped, unless you are 100% certain of the workload and hardware capacity. If you are interested, install the kernel-doc package and read Documentation/sysctl/vm.txt, or read it online.

If you set vm.overcommit_memory=2, the kernel will refuse commits beyond your swap plus the percentage of physical RAM configured in vm.overcommit_ratio (default is 50%).
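As a quick sanity check before changing anything, you can inspect the current settings and the kernel's accounting (no root needed to read these; this is just a sketch of where to look):

```shell
# Current overcommit mode (0, 1, or 2) and ratio
cat /proc/sys/vm/overcommit_memory
cat /proc/sys/vm/overcommit_ratio

# With overcommit_memory=2 the kernel enforces, roughly:
#   CommitLimit = swap + (overcommit_ratio / 100) * physical RAM
# Both the limit and the currently committed amount are in /proc/meminfo:
grep -i commit /proc/meminfo
```

Comparing Committed_AS against CommitLimit tells you how close your workload already is to the limit that mode 2 would enforce.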

echo 0/1/2 > /proc/sys/vm/overcommit_memory 

This will not survive a reboot. For persistence, put this in /etc/sysctl.conf file:

vm.overcommit_memory=X

and run sysctl -p. No need to reboot.
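Putting those steps together, a minimal sketch for CentOS (run as root; the ratio value 80 below is purely illustrative, not a recommendation, so tune it to your workload):

```shell
# Apply immediately; this does not survive a reboot
sysctl -w vm.overcommit_memory=2
sysctl -w vm.overcommit_ratio=80   # illustrative value, not a recommendation

# Persist the settings across reboots
cat >> /etc/sysctl.conf <<'EOF'
vm.overcommit_memory=2
vm.overcommit_ratio=80
EOF

# Reload /etc/sysctl.conf without rebooting
sysctl -p
```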

Dario Seidl
Soham Chakraborty
  • the part you didn't answer is in which file I change that vm.overcommit_memory setting, and especially whether I need to reboot (or what else) to have it take effect – matteo Mar 24 '13 at 00:42
  • echo 0/1/2 > /proc/sys/vm/overcommit_memory. This will not survive a reboot. For persistence, put vm.overcommit_memory=X in the /etc/sysctl.conf file and run sysctl -p. No need to reboot. – Soham Chakraborty Mar 24 '13 at 04:36
  • Thanks a lot. Could you please add this to the answer body, so that I can formally "accept" it? – matteo Apr 02 '13 at 14:43
  • Added the new part. – Soham Chakraborty Apr 03 '13 at 13:53
  • "overcommit_ratio" has an important effect when using overcommit_memory=2: it determines the percentage of the physical RAM which is allowed to be allocated! So if the ratio is < 100, you will leave some RAM unallocated, perhaps for disk cache or similar. The default ratio is 50%, so you'll only use 50% of the physical RAM if you don't change this! – David Gardner Oct 14 '14 at 15:24
  • @DavidGardner that's right. I edited the post to clarify this. – Dario Seidl Dec 31 '18 at 12:18

Totally unqualified statement: Disabling memory overcommit is definitely "safer" than enabling it.

$customer has it set on a few hundred web servers, and it helped a lot with stability issues. There's even a Nagios check that alerts loudly if it's ever NOT disabled.

On the other hand, people might not consider it "safe" to have their processes run out of memory when they'd just like to overcommit a little RAM that they'd never really use (SAP would be a very good example).

So, you're back to testing whether it improves things for you. Since you're already looking into it to get rid of related issues, I think it might help in your case.

(I know I'll risk a downvote by some grumpy person)

Florian Heigl

I agree that disabling overcommit is safer than enabling it in some circumstances. If the server runs only a few large memory jobs (like circuit simulations, in my case), it is much safer to deny the application the memory request upfront rather than waiting for an OOM event (which is sure to follow shortly). Quite often we see servers having issues after the OOM killer has done its work.

user185690