
We have one deployment that consists of only one pod (with a service and an ingress). It uses a Docker container that executes a custom run script as its command.
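For context, a minimal sketch of the relevant part of the manifest (the names, image, and run-script path are placeholders, not our real config):

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: my-service                  # placeholder name
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: my-service
      template:
        metadata:
          labels:
            app: my-service
        spec:
          containers:
            - name: my-service
              image: registry.example.com/my-service:latest   # placeholder image
              command: ["/app/run.sh"]  # custom run script (migrations etc.), can take minutes
              ports:
                - containerPort: 8080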

When we roll out a new version, the image is pulled, a new pod is created, and the script starts. At that point the new pod is "Running" and the old pod is "Terminated", because the number of desired pods is still 1.

However, and this is the meat of our problem, this run script can sometimes take a few minutes to finish. It includes DB migrations and other tasks that cannot be done at build time (i.e. put in the Dockerfile). As a result, the new pod is "Running" for a few minutes but isn't actually ready to serve requests, which causes some downtime for our service.

My question is: is there a way to "delay" the termination of the older pod to prevent this? Or to delay flagging the new pod as "Running"?

I know the ideal solution is to have more than one pod, but that is (currently) not possible because the service in question is not entirely stateless. And even if it were, with, for example, 3 pods, they would all enter the "Running" state before actually finishing their tasks, again causing some (albeit smaller) downtime.

How should I deal with this kind of problem?

Kortemy

1 Answer


I'm not 100% sure whether you can delay a pod from terminating until its jobs are done, but you can set up a check to make sure a pod is fully ready for work. There are two types of checks: liveness and readiness.

https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/

A liveness probe is for an app that stops accepting traffic for some reason, where a restart might fix the issue.

A readiness probe is what you are looking for: an app might take a while to fully load before it can accept traffic.

The ideal solution would be to set up a readiness check and have the app expose some endpoint, like / or /ready, that returns a 200 response once it is ready, so Kubernetes knows it's OK to send traffic to that pod.
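For example, a sketch of what this could look like in your container spec (the port, paths, and timing values are assumptions about your app, so adjust them to your actual setup):

    containers:
      - name: my-service
        image: registry.example.com/my-service:latest
        command: ["/app/run.sh"]
        ports:
          - containerPort: 8080
        readinessProbe:              # pod only receives traffic once this passes
          httpGet:
            path: /ready             # assumed endpoint returning 200 when ready
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 5
          failureThreshold: 60       # allow several minutes for migrations to finish
        livenessProbe:               # optional: restart the container if it hangs later
          httpGet:
            path: /healthz           # assumed health endpoint
            port: 8080
          initialDelaySeconds: 300
          periodSeconds: 10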

Mike
  • I did exactly that: I set up a readiness probe for the newly created container, and now I can see exactly the moment it is ready to serve requests. But it didn't solve my issue, as the old pod is terminated before the newly created container is ready. Any way to solve that? The problem is that I have only 1 replica (and for the time being I cannot increase that number). – Kortemy Jan 03 '18 at 11:03
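Regarding that comment: if briefly running two pods side by side during the rollout is acceptable despite the statefulness, one option is to combine the readiness probe with a RollingUpdate strategy that never removes the old pod before the new one reports Ready. A sketch of the deployment strategy, assuming the readiness probe above is in place:

    spec:
      replicas: 1
      strategy:
        type: RollingUpdate
        rollingUpdate:
          maxUnavailable: 0   # never take the old pod down before a replacement is Ready
          maxSurge: 1         # allow one extra pod to exist during the rollout

With this, the old pod keeps serving traffic until the new pod passes its readiness check; whether the temporary overlap of two pods is safe depends on the state the service holds.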