Kubernetes update results in pod stuck on terminating

I am facing a problem with a Kubernetes deployment. This is probably going to be a low-quality question, and I'm sorry about that: I'm new to server management and I cannot reach the person who originally set up the servers, so I'm having a hard time.
My configuration (in the testing environment) consists of one master node and two worker nodes, each hosting one replica of a pod (two pods in total) running a Docker image with a WildFly server.
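For reference, the Deployment is shaped roughly like this (the names and the image are placeholders, since I can't paste the real manifest):

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: wildfly-app                 # placeholder name
    spec:
      replicas: 2                       # one pod per worker node
      selector:
        matchLabels:
          app: wildfly-app
      template:
        metadata:
          labels:
            app: wildfly-app
        spec:
          containers:
          - name: wildfly
            image: registry.example.com/wildfly-app:latest   # placeholder image
            ports:
            - containerPort: 8080       # WildFly HTTP port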
I was experimenting in the testing environment because we used to experience a problem: sometimes after a deployment (randomly a few minutes or a few hours later) the pods would fail (the liveness probe would time out) and go into CrashLoopBackOff. I added a line in the code to log an Info message every time the liveness probe was called, to see whether it was being called at all, and I re-deployed with the deployment configuration unchanged. Since the problem shows up randomly, I spent the afternoon re-deploying every hour or so (without changing anything) and monitoring the logs. No luck.
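For context, the liveness probe is configured roughly like this (the /health path and the exact timings are approximations, I don't have the real values at hand):

    livenessProbe:
      httpGet:
        path: /health                # approximate endpoint path
        port: 8080
      initialDelaySeconds: 60        # give WildFly time to boot
      periodSeconds: 10
      timeoutSeconds: 5              # probe counts as failed if no answer within 5s
      failureThreshold: 3            # container is restarted after 3 consecutive failures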

So, here's the part where something went wrong:
After deploying for the n-th time, I started seeing FailedScheduling events. Looking at the pod status, I can see that one of the two pods from the old ReplicaSet is stuck on Terminating, and the pod that is supposed to take its place is stuck on Pending. I can work around this by running kubectl delete pod --force --grace-period=0 [pod name], but it happens again on every deploy, so of course that's not ideal. I haven't yet tried deploying to the production environment.
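For completeness, this is the sequence I currently run to recover after each deploy (pod names are placeholders):

    # See which pods are stuck and on which node they are scheduled
    kubectl get pods -o wide

    # Inspect the stuck Terminating pod and its Pending replacement
    kubectl describe pod <old-pod-name>
    kubectl describe pod <new-pod-name>

    # Workaround: remove the stuck pod from the API server immediately,
    # without waiting for the kubelet to confirm the container stopped
    kubectl delete pod <old-pod-name> --force --grace-period=0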
Here are the logs:
Pod status: https://pastebin.com/MHuWV2dM
Events: https://pastebin.com/8hvpg9n5
Describe pods: https://pastebin.com/QFmkUQB3
Thank you in advance for any help you can provide.

Riccardo Vailati