
When I run Celery under Upstart, after a while the child processes or the main process die without any trace.

The Upstart script I'm using (/etc/init/celery):

description "celery"

start on runlevel [2345]
stop on runlevel [!2345]

kill timeout 20
# console log
setuid ***
setgid ***

script
chdir /opt/***/project/
exec /opt/***/virtualenvs/***/bin/python manage.py celery worker --settings=settings.staging -B -c 4 -l DEBUG
end script

respawn

When running exactly the same command without Upstart (manually running the exec part), everything works fine.

With the respawn stanza, the master process will die and get respawned while the lost child processes still exist, causing memory overflows. Without it the processes will just disappear until there are no workers left.

Celery spawns a master process and worker processes (4 of them in this case). I also tried running it with eventlet instead of multiprocessing (1 master, 1 child process) but the results are similar.

Has anyone encountered this behaviour before?


Update:

  • Celery, when run with -c N, starts N + 2 processes, N of which are workers (what are the other 2?). See the ps sketch below for how to count them.
  • I'm beginning to think this is related to the expect stanza, but I'm not sure what the value should be. With eventlet, expect fork makes sense. But what about multiprocessing?
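For reference, here is one way to inspect the process tree; this is just a sketch, and the '[c]elery' grep pattern is an assumption based on the worker command line above:

# List all celery processes with their parent PIDs,
# so the master/worker relationship is visible
# (the [c] trick keeps grep from matching itself).
ps axo pid,ppid,cmd | grep '[c]elery'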

Update 2:

Using expect fork seemed to stop the processes from dying, but when trying to stop or restart the job it just hangs.

– yprez

2 Answers


The solution was to not run celery beat together with the worker, i.e. to remove the -B part from the exec command.

Apparently this was one of the "extra" processes, and it was somehow messing things up.

Here's the final script I ended up with:

description "celery"

start on started postgresql
stop on runlevel [!2345]

kill timeout 20
setuid ***
setgid ***
respawn

chdir /opt/***/project/
exec /opt/***/virtualenvs/***/bin/python manage.py celery worker --settings=settings.staging -c 4 -l DEBUG

And running celery beat separately:

description "celerybeat"

start on started celery
stop on stopped celery

setuid ***
setgid ***
respawn

chdir /opt/***/project/
exec /opt/***/virtualenvs/***/bin/python manage.py celery beat --settings=settings.staging -l DEBUG
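With both job files in place (assuming they live under /etc/init/ as in the question), the jobs can be managed with the standard Upstart commands:

# Starting celery also starts celerybeat, via "start on started celery"
sudo start celery

# Check what Upstart thinks is running
sudo initctl status celery
sudo initctl status celerybeat

# Stopping celery stops celerybeat too, via "stop on stopped celery"
sudo stop celery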
– yprez

Using chdir inside the script clause is plain wrong, and it means you've missed a very basic idea in Upstart (no offence meant). (As a side note, the exec keyword there is just useless, but does no harm.)

There is one idea central to understanding how Upstart works: Upstart tries to determine which of the processes spawned by the script stanza is the actual daemon of the service. It then uses that process to decide whether the job is running, stopped, failed, and so on. For this reason, it is of paramount importance that Upstart tracks the right process.

The algorithm for determining that process is very simple and depends on the expect stanza: expect fork means "take the second fork in the script stanza"; expect daemon means the same, but the third one.
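As an illustrative sketch (the daemon path here is hypothetical, not from the question):

# Job for a daemon that forks once into the background:
# Upstart follows the fork and tracks the child process.
expect fork
exec /usr/sbin/mydaemon

# For a traditional double-forking daemon, use "expect daemon" instead.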

Now, using chdir inside script means that it calls the actual /bin/chdir binary, and this counts as a separate fork. What you need to do is move it outside the script stanza and then play with the expect stanza until you get it right. You can check whether you got it right by comparing the output of initctl status celery against ps; the PIDs should match.
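For example, a quick check along those lines (the grep pattern is an assumption based on the worker command in the question):

# The PID Upstart is tracking:
initctl status celery

# The actual celery master process, for comparison:
ps axo pid,cmd | grep '[c]elery worker'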

– Leonid99
  • Hmm... didn't see the part where `chdir` is executed like a command and not an upstart stanza... The `script` stanza looks completely redundant in this case. Seems the solution here is to remove the `script`, `end script` lines and just use `exec`. – yprez Feb 06 '14 at 22:51
  • Exactly, `exec` stanza is just a one-line `script`. – Leonid99 Feb 09 '14 at 08:17