
I was testing how an Ubuntu 16.04 machine handles low-memory conditions and had several suspended jobs that used up almost all the RAM on the server. I was still able to run normal bash commands when I left work last night. In the morning (without anyone doing anything to the server) all commands (ps, free, ls, etc.) gave the following error:

-bash: fork: Cannot allocate memory

I was eventually able to run `jobs` and `kill %1` to recover the machine. It was lucky that I had kept an SSH session open, because new SSH connections failed (the client simply didn't show a command prompt after authentication).
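`jobs` and `kill` are bash builtins, so they run inside the existing shell process and need no `fork`, whereas `ps`, `free`, and `ls` are external programs. As a minimal sketch under that observation, memory figures can still be read from `/proc/meminfo` with builtins only. The helper name `meminfo_kb` and the result variable `MEMINFO_KB` are made up for illustration:

```shell
# Hypothetical helper: look up one field of /proc/meminfo without forking.
# Usage: meminfo_kb FIELD [FILE]; the value (in kB) is left in MEMINFO_KB.
meminfo_kb() {
    MEMINFO_KB=
    # A plain redirection is used instead of a pipe or $(...), since both fork.
    while read -r mi_key mi_value mi_unit; do
        if [ "$mi_key" = "$1:" ]; then
            MEMINFO_KB=$mi_value
            return 0
        fi
    done < "${2:-/proc/meminfo}"
    # Field not found.
    return 1
}
```

For example, `meminfo_kb MemAvailable && echo "$MEMINFO_KB kB"` reports the available memory. The result goes into a variable rather than being captured with `$(...)`, because command substitution would itself need a `fork`.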

It is highly concerning that regular userland programs can effectively bring down the entire server: had I not kept an existing SSH session open, it would have needed a hard reboot. My understanding was that Linux (or any OS, really) should kill processes to free up RAM before the whole system becomes this unusable. How do I make it do that?
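One knob that does exist for steering the kernel's OOM killer is `/proc/<pid>/oom_score_adj`, which accepts values from -1000 to 1000; higher values make a process a more likely victim. The following is only a sketch, assuming the memory-hungry jobs are started from a script. The function name `mark_oom_victim` and the `PROC_ROOT` override are hypothetical, and this does not by itself explain or prevent the `fork` failure described above:

```shell
# Hypothetical helper: make a process a preferred OOM-killer victim.
# Usage: mark_oom_victim PID [SCORE]   (SCORE defaults to 1000, the maximum)
mark_oom_victim() {
    # PROC_ROOT is an assumption to allow testing; normally it is just /proc.
    oom_file="${PROC_ROOT:-/proc}/$1/oom_score_adj"
    # Fail if the process does not exist or the file is not writable.
    [ -w "$oom_file" ] || return 1
    printf '%s\n' "${2:-1000}" > "$oom_file"
}

# Example (placeholder command, not from the question):
# some_memory_hungry_job &
# mark_oom_victim "$!"
```

The current value can be checked afterwards by reading the same file, and `/proc/<pid>/oom_score` shows the score the kernel actually uses when choosing a victim.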

EM0
  • Can you post the output of `free`? – Khaled Feb 21 '18 at 09:35
  • did you reconfigure the sysctl parameters vm.overcommit_memory and/or vm.overcommit_ratio? Seeing "cannot allocate memory" is somewhat unusual on a linux system in standard configuration. – Andreas Rogge Feb 21 '18 at 09:45
  • @Khaled `free` failed with `-bash: fork: Cannot allocate memory` – EM0 Feb 21 '18 at 13:07
  • @AndreasRogge No, I didn't change them. This VM is based on AWS standard image. I just checked sysctl and I have `vm.overcommit_memory = 0, vm.overcommit_ratio = 50`. – EM0 Feb 21 '18 at 13:08
  • I have now got the server into the same bad state again by using almost all available RAM. `jobs` shows nothing now and `kill -9 %1` returns `-bash: kill: %1: no such job` (though there *should* be jobs running). I also get `-bash: wait_for: No record of process 10728` in response to some `jobs` and `kill` commands. – EM0 Feb 21 '18 at 14:08
  • Happened again with another AWS instance, this time for real (I wasn't trying to test memory pressure). I can't even connect to the server via SSH - it freezes after "Using username Ubuntu" and eventually times out. In the instance screenshot from the AWS console I can see "Killed process ..." messages. – EM0 Feb 26 '18 at 14:01
  • I ran into exactly the same situation. No commands work, such as `ls`, `top`, and `kill`. My error messages are the same: `-bash: fork: Cannot allocate memory` and sometimes `-bash: wait_for: No record of process 19156`. I ran `kill -9 19156`, but it gives me `-bash: kill: (19156) - No such process`. Luckily, I had an SSH connection open so I could still run some commands, but I cannot open new SSH sessions from SSH clients. Previously, I just had to wait until the system killed the process automatically (after several hours). Does anyone know the solution? – SUNDONG Oct 09 '18 at 13:33
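When `jobs` no longer lists the offending process, as in the comments above, its PID can still be looked up by walking `/proc` with builtins only. This is a rough sketch, assuming that the process with the largest resident set size (`VmRSS` in `/proc/<pid>/status`) is the one to kill, which is only a heuristic. The function name `biggest_rss`, the result variables, and the `PROC_ROOT` override are hypothetical:

```shell
# Hypothetical helper: find the process with the largest resident set size.
# Uses only builtins, redirections, and globbing, so it never forks.
# Results are left in BIGGEST_PID and BIGGEST_RSS_KB.
biggest_rss() {
    BIGGEST_PID=
    BIGGEST_RSS_KB=0
    # PROC_ROOT is an assumption to allow testing; normally it is just /proc.
    for br_status in "${PROC_ROOT:-/proc}"/[0-9]*/status; do
        # Skip processes that vanished or cannot be read.
        [ -r "$br_status" ] || continue
        while read -r br_key br_value br_unit; do
            if [ "$br_key" = "VmRSS:" ]; then
                if [ "$br_value" -gt "$BIGGEST_RSS_KB" ]; then
                    BIGGEST_RSS_KB=$br_value
                    br_dir=${br_status%/status}
                    BIGGEST_PID=${br_dir##*/}
                fi
                break
            fi
        done < "$br_status"
    done
}
```

After calling `biggest_rss`, `echo "$BIGGEST_PID $BIGGEST_RSS_KB"` shows the candidate, and the builtin `kill -9 "$BIGGEST_PID"` can then be used without forking. Kernel threads have no `VmRSS` line and are skipped naturally.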

0 Answers