We run Celery workers on an 8-node setup, which creates 8 queues in RabbitMQ.

When we deploy new changes, the last step in the Ansible playbook is a Celery restart.

The restart has to take each node down and start it again. But messages keep arriving on the queues and being consumed by the worker nodes, so restarting the Celery workers takes a long time.

My thought was: if we remove the consumer from the Celery worker's queue when the deployment starts, the node will stop consuming new messages and will process only the ones it has already consumed. That way the Celery restart might be faster.
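This idea corresponds to Celery's `cancel_consumer` remote-control command, which asks running workers to stop consuming from a queue. A minimal sketch of a pre-restart step follows, assuming a Celery `app.control` object is available. `drain_queues` is a hypothetical helper, and the queue and node names in the usage comment are placeholders, not values from this setup.

```python
from typing import Iterable, List, Optional


def drain_queues(control, queues: Iterable[str],
                 destination: Optional[List[str]] = None) -> List[str]:
    """Ask workers to stop consuming from each queue before a restart.

    `control` is assumed to be a Celery `app.control` object; anything with a
    compatible `cancel_consumer(queue, destination=...)` method works.
    Returns the queue names for which the command was sent.
    """
    sent = []
    for queue in queues:
        # Only new deliveries from this queue are stopped; messages the
        # worker has already taken are left for it to finish.
        control.cancel_consumer(queue, destination=destination)
        sent.append(queue)
    return sent


# Hypothetical usage from a pre-restart step (names are placeholders):
#   from myproject.celery import app
#   drain_queues(app.control, ["queue1"], destination=["celery@node1"])
```

Passing `destination` limits the command to the node about to be restarted, so the other nodes can keep consuming while the playbook works through them one at a time.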

I am not sure I am thinking in the right direction, but I need the Celery worker restart to be quicker than it is now.

Right now that step takes 2-3 hours to complete. Sometimes Ansible loses its connection and the Jenkins job is marked as failed.

If there is a better way to do this, let me know.

Nilesh

1 Answer

It sounds like you just need to change the logic in your worker code to either stop accepting new messages or shut down outright. As long as you are using the acknowledgement feature of RabbitMQ queues properly, the worst case is that a message is never confirmed as processed and is put back onto the queue to be reprocessed.
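In Celery terms, "using acknowledgements properly" usually comes down to two settings. The sketch below is a suggestion and is not taken from the setup described in the question. `ACK_SETTINGS` and `apply_ack_settings` are hypothetical names, while `task_acks_late` and `worker_prefetch_multiplier` are Celery configuration keys.

```python
# Suggested Celery settings for redelivery on shutdown (assumed values,
# to be checked against the actual workload).
ACK_SETTINGS = {
    # Acknowledge a message after the task finishes instead of before it starts,
    # so an interrupted task is redelivered by RabbitMQ.
    "task_acks_late": True,
    # Prefetch one message per worker process, so fewer messages are held
    # by a node at the moment it is restarted.
    "worker_prefetch_multiplier": 1,
}


def apply_ack_settings(conf, settings=ACK_SETTINGS):
    """Copy the settings onto a Celery-style config object, e.g. `app.conf`."""
    conf.update(settings)
    return conf


# Hypothetical usage:
#   from myproject.celery import app
#   apply_ack_settings(app.conf)
```

With late acknowledgement a task may run more than once, so this only makes sense when the tasks are safe to repeat.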

This wouldn't work, though, if processing a message has side effects that could be duplicated on reprocessing, for example inserting a SQL record into a database.
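One common way around the duplicate-record problem is to make the handler idempotent by keying the write on a unique message id. Here is a minimal sketch using an in-memory SQLite database. The table, the column names, and the `handle_message` helper are all hypothetical, and a real setup would use its own database and whatever id the message carries.

```python
import sqlite3


def make_store() -> sqlite3.Connection:
    """Create an in-memory table with one row per processed message."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE results (message_id TEXT PRIMARY KEY, payload TEXT)"
    )
    return conn


def handle_message(conn: sqlite3.Connection, message_id: str, payload: str) -> bool:
    """Insert at most one row per message_id.

    Returns True if the row was written, False if this id was already stored
    (i.e. the message was redelivered after an earlier successful run).
    """
    cur = conn.execute(
        "INSERT OR IGNORE INTO results (message_id, payload) VALUES (?, ?)",
        (message_id, payload),
    )
    conn.commit()
    return cur.rowcount == 1


conn = make_store()
first = handle_message(conn, "msg-1", "hello")   # first delivery
second = handle_message(conn, "msg-1", "hello")  # redelivery of the same message
```

The primary key on `message_id` lets the database reject the second insert, so a redelivered message leaves the table unchanged.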