I have a Nagios/Icinga monitoring system, which I use to monitor mainly Windows-based machines running a version of NSClient++ that I found works without being too annoying (NSCP-0.4.1.105-x64). This has been working fine.

Recently, though, I've started getting a lot of random "Connection refused by host" messages on random services. Usually it's just one service per machine, and anywhere from 2 to 10 machines will throw this error.

This started maybe a week ago.

Normally "Connection refused by host" would indicate some sort of firewall issue or maybe even a timeout. But the fact that only 1 of the 10-15 services on a machine reports this, and that within maybe 2-3 minutes it checks as OK again, makes this very annoying.

I've tried updating the NSClient++ install, and I've also tried to lessen the load on the Icinga machine by increasing round timers and timeouts, not that the load is particularly high at about 0.15.

Any idea where I can start with this?

At the moment I have about 40 servers and 200 services, and 6 of them report one service with "Connection refused by host". About half of them are physical machines, the other half VMs.

Snowflow

1 Answer

It ended up being a packet-loss issue in a site-to-site VPN between the monitoring server and the host objects.
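For anyone chasing something similar, here is a rough sketch of how intermittent failures like this could be measured from the monitoring server. It is not what I actually ran. The function name `probe` and its parameters are made up for illustration. It simply opens repeated TCP connections to the agent port and counts the outcomes.

```python
import socket
import time


def probe(host, port, attempts=20, timeout=5.0, delay=0.0):
    """Open `attempts` TCP connections to host:port and count each outcome.

    Hypothetical helper for illustration; it only tests the TCP handshake,
    not the NSClient++ protocol itself.
    """
    results = {"ok": 0, "refused": 0, "timeout": 0, "other": 0}
    for _ in range(attempts):
        try:
            # create_connection performs the full TCP handshake, then we close.
            with socket.create_connection((host, port), timeout=timeout):
                results["ok"] += 1
        except ConnectionRefusedError:
            results["refused"] += 1
        except socket.timeout:
            results["timeout"] += 1
        except OSError:
            # Unreachable network, name resolution failure, reset, etc.
            results["other"] += 1
        if delay:
            time.sleep(delay)
    return results
```

Something like `probe("winhost.example", 12489, attempts=50, delay=1.0)` would then show how many connections fail over about a minute. The host name is a placeholder, and 12489 is assumed here because it is the usual NSClient++ check_nt port (NRPE normally listens on 5666). Running the same probe from a machine on the far side of the VPN and comparing the counts would point at the tunnel rather than the agent.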

Snowflow