We have a Debian server running Tomcat with a single WAR deployed. The application listens for incoming MQTT messages, processes them, and forwards the results to different third-party web services (depending on the received message). Mostly everything works fine, but once in a while (almost daily right now) we experience what I think are network communication issues, with errors like:

  1. Connection reset
  2. Connection timed out
  3. Host unreachable

Is there any way I can diagnose such issues and get metrics or the like that would reveal a network resource outage or a similar problem?

Operating System

Distributor ID: Debian
Description:    Debian GNU/Linux 8.6 (jessie)
Release:    8.6
Codename:   jessie

Kernel

Linux tomcat-ws 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u2 (2016-10-19) x86_64 GNU/Linux

Java

java version "1.7.0_111"
OpenJDK Runtime Environment (IcedTea 2.6.7) (7u111-2.6.7-2~deb8u1)
OpenJDK 64-Bit Server VM (build 24.111-b01, mixed mode)

Tomcat

Apache Tomcat/8.0.14 (Debian)
gvasquez
  • The errors above come from the Tomcat logs. As for system logs, which ones would you recommend checking? I'm thinking it might be a `ulimit`-related issue. – gvasquez Mar 01 '17 at 18:35
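
One way to test the `ulimit` suspicion is to compare the number of file descriptors the Tomcat process has open against its "Max open files" limit. The following is only a minimal sketch, assuming a Linux `/proc` filesystem; the function names `parse_open_files_limit` and `fd_usage` are made up for illustration, and the PID has to be supplied by the caller (reading another user's `/proc/<pid>/fd` normally requires root or the same user).

```python
import os


def parse_open_files_limit(limits_text):
    """Return (soft, hard) for 'Max open files' from /proc/<pid>/limits text.

    A value of 'unlimited' is returned as None.
    """
    prefix = "Max open files"
    for line in limits_text.splitlines():
        if line.startswith(prefix):
            soft, hard = line[len(prefix):].split()[:2]

            def to_int(value):
                return None if value == "unlimited" else int(value)

            return to_int(soft), to_int(hard)
    raise ValueError("no 'Max open files' line found")


def fd_usage(pid):
    """Return (open_fd_count, soft_limit) for a process (hypothetical helper)."""
    with open("/proc/{}/limits".format(pid)) as handle:
        soft, _hard = parse_open_files_limit(handle.read())
    # Each entry in /proc/<pid>/fd is one open descriptor (files and sockets).
    used = len(os.listdir("/proc/{}/fd".format(pid)))
    return used, soft
```

Calling `fd_usage(pid)` periodically and logging the result would show whether the descriptor count creeps toward the soft limit before the errors start. Sockets count as file descriptors, so leaked outbound connections would show up here too.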

1 Answer

Install some monitoring and have it gather data about the system, its resources, and their usage. Then use the scientific method to figure out the solution.
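
As a rough illustration of "gathering data" for the network errors in the question, the kernel's TCP counters in `/proc/net/snmp` can be sampled and compared over time. This is a sketch under assumptions, not a replacement for a real monitoring system: the helper names `parse_snmp` and `tcp_delta` are invented here, and the choice of counters (`AttemptFails`, `EstabResets`, `RetransSegs`, `OutRsts`) is just one plausible starting set.

```python
def parse_snmp(text):
    """Parse /proc/net/snmp-style text into {protocol: {counter: int}}."""
    stats = {}
    lines = [ln for ln in text.splitlines() if ln.strip()]
    # The file comes in pairs: a line of counter names, then a line of values.
    for header, values in zip(lines[0::2], lines[1::2]):
        proto, names = header.split(":", 1)
        value_proto, numbers = values.split(":", 1)
        if proto != value_proto:
            continue  # unexpected layout, skip this pair
        stats[proto] = dict(zip(names.split(), (int(n) for n in numbers.split())))
    return stats


def tcp_delta(before, after,
              keys=("AttemptFails", "EstabResets", "RetransSegs", "OutRsts")):
    """Return how much each selected TCP counter grew between two samples."""
    old = before.get("Tcp", {})
    new = after.get("Tcp", {})
    return {k: new[k] - old[k] for k in keys if k in old and k in new}


def read_snmp(path="/proc/net/snmp"):
    """Read and parse the live counters (Linux only)."""
    with open(path) as handle:
        return parse_snmp(handle.read())
```

Taking one `read_snmp()` sample per minute and logging `tcp_delta(previous, current)` with a timestamp gives a baseline to compare against the moments when "Connection reset" or "Connection timed out" appears in the Tomcat logs. A jump in retransmissions or resets at those times would point toward the network rather than the application.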

user9517
  • The scientific method is the way I do it, but as I don't know which resources to monitor, I'm pretty troubled with that part. I'll give Nagios/Cacti and the like a try. – gvasquez Mar 02 '17 at 13:06
  • Monitor everything and get a good view of what's going on when the system is performing correctly and when it's not. Knowing how your systems perform is key. – user9517 Mar 02 '17 at 13:10
  • Any personal recommendations and/or comments regarding Nagios and Cacti? (Please consider that there's no X server on the running server.) – gvasquez Mar 02 '17 at 13:15
  • I use Zabbix and the usual Linux tools (top, vmstat, iostat, sar, etc.) as per the second link in my answer. – user9517 Mar 02 '17 at 13:41