My problem occurred when using Redis on Kubernetes, but it does not seem to be a problem with Redis itself; it looks like a network/infrastructure issue. My scenario:
- I have a Redis Service with single Redis Pod serving it.
- I connect Redis Client to the Service.
- I delete the Redis Pod.
- The client connection is closed.
- Redis Client tries to reconnect.
- In the meantime, the Redis ReplicaSet brings up a new Redis Pod, and the Redis Service starts responding to requests and accepting new connections.
- However, my existing Redis Client hangs on its first reconnect attempt and stays that way until it times out (after approximately 130 seconds).
- On the second reconnect attempt it connects immediately.
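One possible explanation for the roughly 130 seconds, which is an assumption and not something I have confirmed: if the first SYN packet is silently dropped while the Service has no ready Pod, a Linux client retransmits it with exponential backoff before `connect()` fails. The sketch below only does the arithmetic for that case, assuming the Linux default `net.ipv4.tcp_syn_retries = 6` and an initial retransmission timeout of 1 second. The function name `syn_connect_timeout` is made up for illustration.

```python
def syn_connect_timeout(syn_retries: int = 6, initial_rto: float = 1.0) -> float:
    """Seconds until connect() gives up when no SYN is ever answered.

    Assumes the wait doubles after every unanswered SYN: one wait after the
    initial SYN plus one after each of the `syn_retries` retransmissions.
    """
    return sum(initial_rto * 2 ** i for i in range(syn_retries + 1))


# 1 + 2 + 4 + 8 + 16 + 32 + 64 seconds with the assumed defaults
print(syn_connect_timeout())  # 127.0
```

If this is what happens, the 127 seconds would be close to the timeout I observe, and the wait would be governed by the client's kernel settings and not by the Service.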
The problem does not seem to exist in my dev environment (local Docker containers), where the timeout occurs after a second or two.
Also, the client that I am using has no option to configure a socket timeout.
- Is this the proper behavior of a Service: hanging a connection until a timeout occurs when there are no Pods to handle requests? If it responded with an error immediately, there would be no such problem.
- Is there a way to configure this timeout to an acceptable value somewhere (at the Service level, in some network configuration, etc.)? Let's say 5 seconds would be OK.
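To illustrate the behavior I am after, here is a minimal sketch of a bounded TCP probe that could run before the Redis Client is allowed to reconnect. It uses only Python's standard `socket` module and not my actual client; the helper names `can_connect` and `wait_until_reachable` and the 5 second default are hypothetical.

```python
import socket
import time


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection opens within `timeout` seconds."""
    try:
        # create_connection applies `timeout` to the connect attempt itself,
        # so a silently dropped SYN fails after `timeout` seconds.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and name resolution errors.
        return False


def wait_until_reachable(host: str, port: int, timeout: float = 5.0,
                         attempts: int = 10, delay: float = 1.0) -> bool:
    """Probe repeatedly; each attempt is bounded by `timeout` seconds."""
    for _ in range(attempts):
        if can_connect(host, port, timeout):
            return True
        time.sleep(delay)
    return False
```

Calling `wait_until_reachable("my-redis-service", 6379)` (the Service name is a placeholder) would give up on each attempt after 5 seconds and retry. This is only a workaround on the client side; I would still prefer a setting on the Service or network level.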