Is there any way to get a core dump of, or otherwise be able to debug, a process that has been killed by oom-killer?
Or even set oom-killer to try to kill a process using ABRT instead?
Another approach is to disable overcommitting of memory.
To restore some semblance of sanity to your memory management:
- Disable the OOM Killer (Put vm.oom-kill = 0 in /etc/sysctl.conf)
- Disable memory overcommit (Put vm.overcommit_memory = 2 in /etc/sysctl.conf)
These settings will make Linux behave in the traditional way (if a process requests more memory than is available, malloc() will fail and the process requesting the memory is expected to cope with that failure).
Note that vm.overcommit_memory is a ternary value:
- 0 = "estimate if we have enough RAM"
- 1 = "Always say yes"
- 2 = "say no if we don't have the memory"
This will force the application to handle running out of memory itself, and possibly its logs / coredump / etc. could give you something useful.
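As a rough illustration of what "handling it itself" could look like, here is a minimal Python sketch. It assumes CPython, where a failed allocation is normally reported as MemoryError; the function name `allocate_or_report`, the logger name, and the injectable `allocator` parameter are all made up for this example and are not part of any tool mentioned here.

```python
import logging
import sys

logger = logging.getLogger("memory_demo")  # hypothetical logger name


def allocate_or_report(nbytes, allocator=bytearray):
    """Try to allocate nbytes; return the buffer, or None if allocation fails."""
    try:
        return allocator(nbytes)
    except MemoryError:
        # With strict overcommit the failure surfaces here, inside the process,
        # so it can be logged before the process decides how to shut down.
        logger.error("allocation of %d bytes failed", nbytes)
        return None


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    buf = allocate_or_report(1024 * 1024)
    if buf is None:
        sys.exit(1)
    logger.info("allocated %d bytes", len(buf))
```

The point is only that the failure becomes an ordinary error path the application can log, rather than a SIGKILL it never sees. Whether a real application recovers cleanly depends on how it was written.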
NOTE: When your system runs out of memory, you will not be able to spawn new processes! You may be locked out of the system.
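Before relying on this, it may be worth confirming which overcommit mode is actually in effect. The sketch below decodes the three values listed above; the helper names `describe_overcommit` and `read_overcommit` are invented for this example, and it assumes the setting is exposed at /proc/sys/vm/overcommit_memory as on a typical Linux system.

```python
from pathlib import Path

# Descriptions taken from the list of values above.
OVERCOMMIT_MODES = {
    0: "estimate if we have enough RAM",
    1: "always say yes",
    2: "say no if we don't have the memory",
}


def describe_overcommit(raw):
    """Map the raw text of vm.overcommit_memory to (mode, description)."""
    mode = int(raw.strip())
    return mode, OVERCOMMIT_MODES.get(mode, "unknown mode")


def read_overcommit(path="/proc/sys/vm/overcommit_memory"):
    """Read the current mode from procfs (assumed path, Linux only)."""
    return describe_overcommit(Path(path).read_text())
```

On a Linux host, calling read_overcommit() should return a pair such as the mode number and its description.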
You can also run
echo 1 > /proc/sys/vm/oom_dump_tasks
which seems to be about the most that you can get the kernel to display on out-of-memory errors.
https://www.kernel.org/doc/Documentation/sysctl/vm.txt
Enables a system-wide task dump (excluding kernel threads) to be produced when the kernel performs an OOM-killing and includes such information as pid, uid, tgid, vm size, rss, nr_ptes, swapents, oom_score_adj score, and name. This is helpful to determine why the OOM killer was invoked, to identify the rogue task that caused it, and to determine why the OOM killer chose the task it did to kill.
If this is set to zero, this information is suppressed. On very large systems with thousands of tasks it may not be feasible to dump the memory state information for each one. Such systems should not be forced to incur a performance penalty in OOM conditions when the information may not be desired.
If this is set to non-zero, this information is shown whenever the OOM killer actually kills a memory-hogging task.
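To check whether the dump is currently enabled without touching the setting, something like the following Python sketch could be used. It only applies the zero / non-zero rule quoted above; the function names are hypothetical, and the procfs path is the same one used in the echo command.

```python
from pathlib import Path


def oom_dump_enabled(raw):
    """Return True when the raw sysctl text is a non-zero integer."""
    return int(raw.strip()) != 0


def read_oom_dump_tasks(path="/proc/sys/vm/oom_dump_tasks"):
    """Read the current setting from procfs (Linux only)."""
    return oom_dump_enabled(Path(path).read_text())
```

If read_oom_dump_tasks() returns False, the kernel log will not contain the per-task table when the OOM killer fires.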