Use Nagios (or alternative free product) to monitor uptime of 50+ remote machines with dynamic IP addresses

Question

I'm trying to monitor the uptime of 53 remote Windows machines, all at separate locations. They are all behind cheap consumer-level routers of various makes, and they all have dynamic IP addresses.

I just want a list of the machines showing whether each one is currently connected to the internet and, if it is not, when it last checked in.

Would be nice: A simple log for each machine indicating times they went offline.

We're a non-profit. I'm looking for an open-source/free solution.

My original idea was to have a hidden IRC bot launch on each machine and autoconnect to a channel. I could join that channel and see at a glance which machines were connected, and the channel log (with enough sifting) would tell me which machines were frequently disconnecting.

A friend told me "nah, use nagios."

After a bit of googling I've arrived at NSClient++, which I've installed on a remote box and am trying to get to check in with my brand-new Nagios box, without much luck.

Am I on the right track? Can anyone point me in the right direction? I've been googling around for a more comprehensive guide on how to do this, and I've not had much luck.

score 3 · Answer 1 · answered Jun 16 '11 at 17:24

Some possibilities: 1) What about using dynamic DNS for each host? The last update time reported by the dynamic DNS service can serve as a rough gauge of when the host was last online.

2) Something equivalent to LogMeIn Hamachi can create a virtual private network of all the hosts. Each host's IP address inside the VPN can be static, so normal Nagios monitoring can be done, or simply regularly scheduled host checks via ping.

3) I am not sure NSClient++ is appropriate. For one, NSClient++ requires a specific port to be open on both hosts, on both networks, in both traffic directions. I cannot remember the port, but it is not a commonly opened one. You can change the port, but the network traffic path must be clear. Also, if you do use NSClient++, use the check_nrpe plugin rather than the check_nt plugin.

4) You may want to consider NSCA instead of NSClient++. It uses passive checks, i.e. the monitored host sends the results of the check to the Nagios server. This way, only the network path from each host to the Nagios server needs to be open. Maybe some port-forwarding magic in the router? http://exchange.nagios.org/directory/Addons/Passive-Checks/Windows-Passive-checks-for-NSCA/details
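The push model in option 4 can be illustrated with a small sketch. This is not NSCA or Nagios code: the function name `status_report`, the 10-minute threshold, and the host names are all assumptions made up for illustration. The idea is that each machine reports in over an outbound connection and the server only records the time of the last check-in, so nothing needs to be opened on the remote routers.

```python
from datetime import datetime, timedelta

def status_report(last_seen, now, timeout=timedelta(minutes=10)):
    """Return {host: (state, last_check_in)} for every known host.

    last_seen maps a host name to the datetime of its most recent
    check-in. A host counts as "online" if it checked in within
    `timeout` of `now`, otherwise as "offline".
    """
    report = {}
    for host, seen in sorted(last_seen.items()):
        state = "online" if now - seen <= timeout else "offline"
        report[host] = (state, seen)
    return report

# Hypothetical data: two sites, one of which has gone quiet.
now = datetime(2011, 6, 16, 17, 30)
last_seen = {
    "site-01": datetime(2011, 6, 16, 17, 27),
    "site-02": datetime(2011, 6, 16, 14, 5),
}
for host, (state, seen) in status_report(last_seen, now).items():
    print(f"{host}: {state}, last check-in {seen:%Y-%m-%d %H:%M}")
```

Nagios has a comparable mechanism for passive checks in its freshness-checking options, which flag a service when no result has arrived within a configured interval.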

Gabriel Sosa · Answer 2 · 2011-06-16T18:48:17.820

1

We use Nagios, and you could use Nagios too. The main difference from a hosted service is the availability of your "checker box".

A service like pingdom.com, or some alternative, checks from many source locations, so many checks are done before an alert is raised. On the other hand, if your Nagios box alone loses communication with the other servers, you will get a bunch of false positives.
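The reasoning about multiple check sources can be sketched as a simple quorum rule. This is only an illustration of the idea, not how Pingdom or Nagios is implemented; the function name `should_alert`, the `quorum` parameter, and the location names are hypothetical.

```python
def should_alert(results, quorum=2):
    """Decide whether to raise an alert for one monitored target.

    results maps a checker location to True (target reachable) or
    False (target unreachable). An alert is raised only when at least
    `quorum` locations report a failure, so one checker losing its
    own connectivity does not trigger a false positive.
    """
    failures = sum(1 for reachable in results.values() if not reachable)
    return failures >= quorum

# One checker has lost its own uplink: no alert.
print(should_alert({"office": False, "datacenter": True, "cloud": True}))

# Two independent checkers agree the target is down: alert.
print(should_alert({"office": False, "datacenter": False, "cloud": True}))
```

With a single Nagios box there is only one entry in `results`, which is why a connectivity problem on the checker's side looks the same as every monitored host going down.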

edited Jun 16 '11 at 18:48

answered Jun 16 '11 at 16:54


score 1 · Answer 3 · answered Jul 23 '11 at 04:37

If I were forced into doing something like this, I'd use Jmarki's option 4, but with NSClient++. Your Nagios host will need a static IP address and a firewall rule allowing access to the NSCA service, which usually runs on port 5667. Then you configure NSClient++ to send NSCA messages to that host.

If you cannot set up Nagios with a static IP address that the remote hosts can reach, consider using a service like DynDNS to provide a hostname for the checks to be sent to, so that the DNS entry can be updated as and when needed.

The NSClient++ docs have details on how to set up the agent to send information to a Nagios host. Once you have that side set up, you then configure Nagios to accept passive check results. This is also well documented on the Nagios site.
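To give a feel for what a passive check result looks like once it reaches Nagios, here is a minimal sketch that formats one as a Nagios external command (`PROCESS_SERVICE_CHECK_RESULT`). The helper name `passive_service_result`, the host name `site-01`, and the service name `heartbeat` are assumptions for illustration. In a real setup the NSCA daemon submits these for you, and the host and service must already be defined in Nagios with passive checks enabled.

```python
import time

def passive_service_result(host, service, return_code, output, timestamp=None):
    """Format a passive service check result as a Nagios external command.

    return_code follows the plugin convention:
    0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN.
    """
    if return_code not in (0, 1, 2, 3):
        raise ValueError("return_code must be 0, 1, 2 or 3")
    if timestamp is None:
        timestamp = int(time.time())
    return (
        f"[{timestamp}] PROCESS_SERVICE_CHECK_RESULT;"
        f"{host};{service};{return_code};{output}"
    )

# Hypothetical host and service names.
line = passive_service_result(
    "site-01", "heartbeat", 0, "OK - checked in", timestamp=1308245040
)
print(line)
# prints: [1308245040] PROCESS_SERVICE_CHECK_RESULT;site-01;heartbeat;0;OK - checked in
```

Seeing the format makes it clearer why the host and service names configured in the agent have to match the definitions on the Nagios side exactly.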
