I have an application that forwards TCP connections to another app. I am trying to make it support zero-downtime deployment, so that I can deploy a new version at any time, but there is one problem I have not found a solution for.

I can't kill the TCP sessions; some of them can last anywhere from 5 minutes to 2 hours. I would like to know the generic way to solve this problem: when I deploy a new version of my software, new connections should go to it without killing the existing ones.

I know that with Docker you can change the signals the container receives and handle them, but I still see that at some point during the deployment a "docker rm" command is sent and the container is deleted (I am currently testing with Docker Swarm, and I assume Kubernetes does something similar).

Is the way to go a very long timeout for the deployment, or some kind of blue/green deployment?

Thanks,

1 Answer

Rolling upgrades. Deploy a new version of the thing. Drain, then stop the old ones.

Implementation may involve graceful stop scripts, or setting timeouts longer than your longest session. On Kubernetes, try terminationGracePeriodSeconds, and be sure to handle SIGTERM.
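
As a rough illustration of what "handle SIGTERM" and "drain" could look like inside the application, here is a minimal Python sketch. It is not the asker's forwarder: DrainingServer, the echo handler standing in for the real forwarding logic, the 0.2 second accept poll, and port 9000 are all assumptions made for the example. On SIGTERM the process stops accepting new connections but leaves open sessions alone, and only exits once the last one has closed.

```python
import signal
import socket
import threading
import time


class DrainingServer:
    """Hypothetical TCP server: stops accepting on drain, lets open sessions finish."""

    def __init__(self, host="127.0.0.1", port=0):
        self.listener = socket.create_server((host, port))
        # Short accept timeout so serve() notices a drain request promptly.
        self.listener.settimeout(0.2)
        self.address = self.listener.getsockname()
        self.draining = threading.Event()
        self.workers = []

    def serve(self):
        """Accept connections until drain() is called, then close the listener."""
        while not self.draining.is_set():
            try:
                conn, _ = self.listener.accept()
            except socket.timeout:
                continue  # no new client yet; re-check the drain flag
            conn.settimeout(None)  # sessions themselves may idle for hours
            worker = threading.Thread(target=self._handle, args=(conn,))
            worker.start()
            self.workers.append(worker)
        self.listener.close()  # from here on, new connections are refused

    def _handle(self, conn):
        # Stand-in for the real forwarding logic: echo until the peer closes.
        with conn:
            while data := conn.recv(4096):
                conn.sendall(data)

    def drain(self, signum=None, frame=None):
        """Signal-handler compatible: request that no new connections be accepted."""
        self.draining.set()

    def wait_for_sessions(self, timeout=None):
        """Wait for open sessions to end; return True if none are left."""
        deadline = None if timeout is None else time.monotonic() + timeout
        for worker in self.workers:
            remaining = None
            if deadline is not None:
                remaining = max(0.0, deadline - time.monotonic())
            worker.join(remaining)
        return not any(worker.is_alive() for worker in self.workers)


def main(port=9000):
    """Example wiring (port 9000 is an arbitrary choice for this sketch)."""
    server = DrainingServer(host="0.0.0.0", port=port)
    signal.signal(signal.SIGTERM, server.drain)
    server.serve()              # returns shortly after SIGTERM arrives
    server.wait_for_sessions()  # returns once the last session has closed
```

This only helps if the orchestrator waits long enough: once the grace period runs out the container is killed regardless, so terminationGracePeriodSeconds on Kubernetes (or the stop grace period of a Docker Swarm service) would have to be longer than the longest session you expect.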

John Mahowald
  • Hi @John, thank you for your comment. That is currently my approach; however, I am a bit worried about the grace period, which in my case can be up to hours. Is this still the way to go, or is there something better? I guess a rolling update won't be considered successful until all the pods are running the updated image. – Penguin_HuHu May 18 '19 at 21:26
  • You need to design this. Check if you have a maintenance window to update and disconnect sessions. Containers tend to assume short connections, so explore what happens when you go very long. Some HA paired network appliances mirror TCP connections, but this tends to be exotic and to require operating system support. – John Mahowald May 19 '19 at 14:22
  • Thanks again, John. Could you explain what you mean by "Some HA paired network appliances mirror TCP connections"? – Penguin_HuHu May 20 '19 at 07:36
  • Some cluster software is capable of transferring active TCP sessions between nodes without them dropping. This requires deep integration with the TCP stack, and isn't something you can do with a typical Linux container. – John Mahowald May 20 '19 at 13:07
  • +1 to TerminationGracePeriod. It's perfectly fine to have a grace period of multiple hours. – Dirbaio Jun 14 '19 at 11:06