Debugging Kubernetes "Connection reset by peer" to external Oracle DB

Question

This question is related to this issue. We have a Java app which, once it has started and a user logs in, creates a long-lived connection to an Oracle DB that stays active for the lifetime of the app (or of the Kubernetes pod in this case). The problem is that after some time (it can be 30 minutes, it can be as long as 2 days) the following error appears in the logs:

[pool-16-thread-1] WARN  c.zaxxer.hikari.pool.ProxyConnection - HikariPool-2 - Connection oracle.jdbc.driver.T4CConnection@51f5db67 marked as broken because of SQLSTATE(08006), ErrorCode(17002)
java.sql.SQLRecoverableException: IO Error: Connection reset by peer

This results in a lot of SQL errors afterwards, because the app keeps trying to use the broken connection. As it turns out, it looks like a Hibernate session gets created but is never cleaned up, whereas I believe the app should create a new session every time it interacts with the DB. I have pointed this out to our developers, but I have no idea when it will get fixed.
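To illustrate the kind of cleanup described above, here is a minimal sketch of how code could recognize a connection-level failure and know that the session has to be discarded instead of reused. It relies only on the fact that SQLSTATE class "08" is the SQL-standard class for connection exceptions, which covers the 08006 in the log. The class and method names are hypothetical and are not taken from the app, and the depth limit of 20 is an arbitrary assumption.

```java
import java.sql.SQLException;
import java.sql.SQLRecoverableException;

// Hypothetical helper, not part of the app: decides whether an exception
// means the underlying DB connection is gone.
class ConnectionFailureCheck {

    // SQLSTATE class "08" is the SQL-standard class for connection exceptions.
    static boolean isConnectionFailure(Throwable t) {
        // Walk the cause chain, because an ORM layer may wrap the driver's
        // exception. The depth limit (an assumption) guards against cycles.
        int depth = 0;
        for (Throwable cur = t; cur != null && depth < 20; cur = cur.getCause(), depth++) {
            if (cur instanceof SQLException) {
                String state = ((SQLException) cur).getSQLState();
                if (state != null && state.startsWith("08")) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Same SQLSTATE and error code as in the log line above.
        SQLException reset = new SQLRecoverableException(
                "IO Error: Connection reset by peer", "08006", 17002);

        System.out.println(isConnectionFailure(reset));                                // true
        System.out.println(isConnectionFailure(new RuntimeException("wrapped", reset))); // true
        // SQLSTATE class 23 (integrity constraint violation) is not a connection problem.
        System.out.println(isConnectionFailure(new SQLException("constraint", "23000", 1))); // false
    }
}
```

When such a check returns true, the caller would close the current session and open a new one for the next operation, so that the pool can hand out a fresh connection. This only limits the follow-up errors; it does not explain why the reset happens in the first place.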

We are currently trying to migrate the app to Kubernetes, and the main question for me is why this connection reset happens on Kubernetes. It does not happen on a plain virtual machine, for example, although the Java app is the same. The app is scaled to 2 pods. Sometimes the issue happens on only one pod, and sometimes it also shows up on the second pod 5-10 minutes later. That seems to rule out a network glitch, since a glitch should hit both pods at the same time, no? I have no idea where else to look, as none of the Kubernetes or node logs show anything. Does anyone have an idea where or how to properly debug this? On the Oracle DB side they say they also see only a "connection lost" error or something similar, and that's all.

