I'm running Docker 0.9.0

uname -a
Linux 3.11.0-18-generic #32-Ubuntu SMP Tue Feb 18 21:11:14 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

I have 40 containers running at once. Each container is fairly simple - it runs a snippet of code in a Node process. An infinite loop listens for more code snippets to run in Node.

Occasionally I mark these containers to be killed, and I start another container to take its place. I've been experiencing memory errors. Sometimes everything crashes and Docker reports "cannot allocate memory for new container", and sometimes there's simply a timeout on the socket.

I continuously log the output of "cat /proc/meminfo" and "free". These reports suggest I have plenty of unused memory.
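
The periodic memory reporting described here could be reduced to one number per sample. A minimal sketch, assuming the standard /proc/meminfo layout; the helper name free_kb is made up for illustration and is not part of my actual setup:

```shell
# Hypothetical helper: print the MemFree value (in kB) from
# /proc/meminfo-style text read on stdin.
free_kb() {
    awk '$1 == "MemFree:" { print $2 }'
}

# Only sample the real file where it exists (Linux).
if [ -r /proc/meminfo ]; then
    free_kb < /proc/meminfo
fi
```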

The command

ps --sort -rss -eo rss,pid,command | grep docker

run under different scenarios, tells me that as new containers replace old ones, the Resident Set Size keeps growing. If I stop the service as-is and wait an hour, it decreases somewhat but never returns to its earlier level. For instance, it never drops back to where it was when the original 40 containers were created.
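
To watch that growth over time rather than by hand, the ps output can be summed and logged. This is only a sketch: the function names sum_docker_rss_kb and log_docker_rss are hypothetical, and matching on the first word of the command line is an assumption about how the daemon shows up in ps.

```shell
# Hypothetical helper: sum the RSS column (KiB) of every process whose
# command starts with a word containing "docker". Expects
# `ps -eo rss,pid,command` style lines on stdin.
sum_docker_rss_kb() {
    awk '$3 ~ /docker/ { total += $1 } END { print total + 0 }'
}

# Hypothetical logger: one "epoch-seconds rss-kib" line per minute.
# Defined but not started here; call it to begin sampling.
log_docker_rss() {
    while true; do
        printf '%s %s\n' "$(date +%s)" "$(ps -eo rss,pid,command | sum_docker_rss_kb)"
        sleep 60
    done
}
```

Running log_docker_rss >> rss.log in the background would give a series that can be compared against the times containers are replaced.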

pmap `pidof docker`

shows that all entries are [anon]; as I understand it, this is memory reserved by malloc.

The crash occurs when the Docker daemon's RSS reaches ~2GB, up from ~40M when fresh.

I am not certain if this is a[nother] Docker bug / memory leak.


How might this lead to an out of memory error, given that free reports 4.5G unused?


There is no swap on my system.

IMPORTANT DETAIL: Docker fails to remove killed containers through the Remote API with an AUFS driver error. For this reason I rely on an external cron to remove stopped containers via the CLI.
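
For reference, the external cleanup looks roughly like the sketch below. The function names are hypothetical, and it assumes that stopped containers appear in "docker ps -a" with a status containing "Exit"; a command column that happens to contain that word would also match.

```shell
# Hypothetical helper: print the container ID (first column) of every row
# of `docker ps -a` output (read on stdin) that mentions "Exit".
# The header row is skipped.
exited_ids() {
    awk 'NR > 1 && /Exit/ { print $1 }'
}

# Hypothetical cleanup routine for a cron job; defined but not run here.
remove_exited() {
    docker ps -a | exited_ids | while read -r id; do
        docker rm "$id"
    done
}
```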

Ryan Hewitt

1 Answer

You can use valgrind to find memory-related issues.

Usage (program_name is /sbin/docker, or whatever the path to your docker binary is):

valgrind --tool=memcheck program_name
valgrind --leak-check=yes program_name

Example:

valgrind --leak-check=yes /sbin/httpd

Check the output for lines containing "definitely lost" or "probably lost" to confirm that there is a memory leak.
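
That check can be scripted. A minimal sketch, assuming valgrind's usual leak summary lines of the form "definitely lost: N bytes in M blocks"; the helper name has_leak is made up for illustration:

```shell
# Hypothetical helper: succeed (exit 0) if valgrind output on stdin reports
# a non-zero "definitely lost" or "probably lost" byte count.
has_leak() {
    grep -Eq '(definitely|probably) lost: [1-9]'
}

# Example use (valgrind writes its report to stderr by default):
#   valgrind --leak-check=yes program_name 2>&1 | has_leak && echo "leak reported"
```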

Gabriel Talavera