
My Spring Boot application stopped working a couple of days ago and I'm trying to figure out why so I can prevent it in the future. This is the first time this has happened, so I don't really know where to start. Restarting the server solved the problem.

I will write down everything I consider relevant and hopefully someone will help me with how I should go about this.

  • Hosted on a Digital Ocean droplet.
  • Ubuntu 16.04, 1GB RAM, 25GB SSD, 1 core.
  • The HTTP requests hit a separate server (same setup) running Nginx and are passed to the upstream server running the Spring Boot application. During the failure, all HTTP requests returned a 502 and were logged by Nginx in error.log as:

2019/04/20 20:06:56 [error] 14576#14576: *1161160 connect() failed (111: Connection refused) while connecting to upstream, client: xx.xxx.x.xxx, server: api.example.com, request: "OPTIONS /oauth/token HTTP/1.1", upstream: "http://xx.xxx.xx.xxx:8080/oauth/token", host: "api.example.com", referrer: "https://example.com/login"

2019/04/20 20:06:56 [error] 14576#14576: *1161160 no live upstreams while connecting to upstream, client: xx.xxx.x.xxx, server: api.example.com, request: "OPTIONS /oauth/token HTTP/1.1", upstream: "http://server_upstream/oauth/token", host: "api.example.com", referrer: "https://example.com/login"

  • I was able to SSH onto the server without issue.
  • I use log4j2 for logging in the Spring Boot application, but nothing was logged during the failure.
  • A separate cron on the same server, periodically fetching data over HTTP, worked fine during the failure.
  • When the failure happened there was a huge drop in used memory of the server (85% -> 18%).
  • I cannot find any relevant information in the syslog.
  • The Spring Boot application is run under systemd, and (I think) the Spring Boot application was still running during the failure.

Where should I start looking for the reason for the failure? Is there anything I can do to make it easier to debug this if it happens again?

darksmurf

1 Answer


It looks like your Java process was down: "Connection refused" means nothing was accepting connections on port 8080 of the upstream. Could you please provide more information:

  • the launch command (if it runs under systemd, the service file)
  • the last log lines written before the failure
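
If it happens again, it may help to confirm right away whether anything is listening on the application port before you restart. Below is a minimal sketch of such a check in Python. The function name `port_is_open` is my own, and the host and port you pass in are assumptions you would replace with your upstream's real address.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection performs the same connect() that Nginx attempts.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused" (errno 111) as well as timeouts
        # and unreachable hosts.
        return False
```

Calling `port_is_open("127.0.0.1", 8080)` on the application server itself shows whether the Java process is accepting connections locally. Running the same check from the Nginx server against the upstream IP would also catch a firewall or network problem between the two machines.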

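One possible cause is running out of memory, because 1 GB is quite a small amount of memory for a Java web application. The drop in used memory you saw (85% -> 18%) would be consistent with the process having been killed. It does depend on the code and the launch parameters, though.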
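
To check the out-of-memory theory, you could look for kernel OOM killer messages in the kernel log. Here is a rough sketch that scans log text for them. The helper name `find_oom_kills` is hypothetical, and the pattern assumes the usual kernel wording ("Out of memory: Kill process ..." on older kernels, "Killed process ..." on newer ones), which may differ on your system.

```python
import re
from typing import List, Tuple

# Assumed wording of the kernel OOM killer message; it varies by kernel version.
OOM_PATTERN = re.compile(r"Out of memory: Kill(?:ed)? process (\d+) \(([^)]+)\)")


def find_oom_kills(log_text: str) -> List[Tuple[int, str]]:
    """Return (pid, process name) for every OOM kill found in the log text."""
    return [(int(pid), name) for pid, name in OOM_PATTERN.findall(log_text)]
```

You would feed it the contents of the kernel log, for example `find_oom_kills(open("/var/log/kern.log").read())` if that file exists on your server. If the result contains an entry for `java` around the time of the failure, the kernel most likely killed the application to free memory.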