
Under load, our production servers sometimes run into the following issue (Linux 4.10):

  1. The application handling the load (a webserver) is running at high load.
  2. A new job starts (e.g. from cron). This requires a clone() system call, which fails to allocate memory.
  3. The kernel OOM killer starts up and kills one of the webserver processes so that the new job can start.
  4. "free -m" shows that free memory is critically low, around 1-3 GB out of the 64 GB on the server. However, most of the memory is in the page cache.
  5. The system does not have any swap partition or file set up, and vm.swappiness is set to its default value of 60.
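
To put numbers on step 4, here is a minimal sketch that reads /proc/meminfo and reports how much memory is free, how much sits in the page cache, and what the kernel estimates as available. The function names and the SAMPLE values are hypothetical illustrations, not measurements from the servers described above.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict (values are kB for most fields)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def summarize(info):
    """Summarize free memory, page cache, and kernel-estimated available memory."""
    total = info["MemTotal"]
    # "Cached" also counts tmpfs/shared memory, so this is an upper bound
    # on the page cache that could actually be dropped.
    cache = info.get("Cached", 0) + info.get("Buffers", 0)
    return {
        "free_mb": info["MemFree"] // 1024,
        "cache_mb": cache // 1024,
        "available_mb": info.get("MemAvailable", 0) // 1024,
        "cache_pct": round(100.0 * cache / total, 1),
    }


# Hypothetical sample values, used only when /proc/meminfo is not readable.
SAMPLE = """\
MemTotal:       65536000 kB
MemFree:         2048000 kB
MemAvailable:   50000000 kB
Buffers:          512000 kB
Cached:         51200000 kB
"""

if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            text = f.read()
    except OSError:
        text = SAMPLE
    print(summarize(parse_meminfo(text)))
```

Running this periodically under load would show whether MemAvailable stays high while MemFree drops, which is the situation described in step 4.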

If we instead run "echo 3 > /proc/sys/vm/drop_caches" before the load starts, the OOM killer does not end up killing our webserver application and everything works fine.
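
For reference, the workaround can be sketched as a small helper like the one below. The function name and the path parameter are hypothetical (the parameter exists only so the helper can be tried against a scratch file). It assumes root privileges when pointed at the real file. The sync beforehand is there because drop_caches only discards clean pages.

```python
import os

DROP_CACHES_PATH = "/proc/sys/vm/drop_caches"


def drop_caches(mode=3, path=DROP_CACHES_PATH):
    """Ask the kernel to drop clean caches.

    mode 1: page cache only
    mode 2: reclaimable slab objects (dentries and inodes)
    mode 3: both
    """
    if mode not in (1, 2, 3):
        raise ValueError("mode must be 1, 2, or 3")
    # Flush dirty pages first; dirty pages cannot be dropped.
    os.sync()
    with open(path, "w") as f:
        f.write(f"{mode}\n")
```

Calling drop_caches(3) as root is equivalent to the echo command above, preceded by a sync.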

Under the above conditions, does the kernel not free up the page cache before trying to kill a process to release some memory?

Confused