5

My blog is a custom ruby/rack application, and has been crashing randomly every couple of weeks. I sometimes don't notice for days, and I'd like to be notified immediately if it happens.

What's the best way to do it? I'm running Centos 5.3, Nginx, Passenger, Rack, etc.

I've considered figuring out some way to email myself the tail of my error log, as that would help me catch EVERYTHING, not just that one app (it would tell me of missing links, etc). Is there an easy way to do that?

Thanks!

Sean Clark Hess
  • 263
  • 3
  • 13

8 Answers

6

If you need an alert when your site goes down, you should consider an online notification service, because it checks your site from the outside.

If you monitor from "inside your own box", you will never get an email if the box crashes completely or loses its network connectivity, because your script will no longer be able to run or alert you.

Bello or Pingdom both offer free accounts that are great to get you started.

More services are listed in "Can anyone recommend a website monitoring service?"

Dirk Paessler
  • 914
  • 1
  • 7
  • 15
4

I'm surprised nobody's mentioned Nagios. It's incredibly powerful, does uptime percentages, notification via email/IM, can run scripts on downtime, etc. It's probably the best out there.

Josh
  • 9,001
  • 27
  • 78
  • 124
  • Nagios is almost always the answer for monitoring. But for something like this, it might be overkill. Some scripting would probably be simpler. – baumgart Oct 12 '09 at 18:14
  • I don't agree. With recent Virtual Appliances and work by GroundWork, setting up Nagios is pretty easy, and the sooner you set it up, the sooner you start collecting data on trends. Also, what happens as he grows or wants to monitor more things? Set up Nagios now, and be ready to monitor new services at any time in the future. – Josh Oct 12 '09 at 18:49
  • 1
    Here's the link to GroundWork: http://www.groundworkopensource.com/community/ – Josh Oct 12 '09 at 19:28
  • Actually it is a terrible answer. The cost, just to check whether a server/website is up, is crazy. This is like recommending that someone open a taxi service when he asks where he can get a car for a ride. Nagios would need a second virtual machine (it cannot check if the first one hangs) and a lot of configuration. It is a power tool, but for "is my site up" I would always recommend an external service. – TomTom Apr 05 '14 at 15:44
2

Check out AreMySitesUp (http://aremysitesup.com) and Pingdom. Both have free options, and will send an email and SMS when your site is down. AreMySitesUp has an iPhone app as well.

Chris Brentano
  • 306
  • 1
  • 4
1
  • you can use God : god (dot) rubyforge [dot] org

  • do you have a server in another location where you could run scripts?

  • these guys will monitor your page (max 2 urls) for free (every 30 minutes) http host-tracker.com order-page

Cristian
  • 51
  • 1
  • 1
  • 3
1

You can get basic connectivity tests by just writing a shell script that uses wget and then determines if the page responded or not based on the response code.

#!/bin/bash
WGET='/usr/bin/wget'
URL='http://url.to.check'

# wget exits non-zero on connection failures and HTTP error responses;
# --timeout keeps a hung site from stalling the check
if "${WGET}" -q -O /dev/null --tries=1 --timeout=10 "${URL}"; then
    echo "Success!"
    # You could write a log file or something here
else
    echo "Fail! :("
    # run something to mail you that your site isn't responding
fi

This is a very basic example that could be expanded, but if you are just looking for something quick, this will work. You can run it from cron every minute so you know within a minute if the site has crashed.
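As one possible expansion of the script above, here is a rough sketch that mails the tail of the error log when the check fails, which is roughly what the question asked for. The log path, the recipient address and the function names `site_is_up` and `build_alert` are placeholders I made up, and it assumes a working `mail` command on the box.

```shell
#!/bin/bash
# Sketch only: the log path and recipient are assumptions, not values
# taken from the question. Adjust them for your own setup.
ERROR_LOG='/var/log/nginx/error.log'   # assumed location
MAILTO='you@example.com'               # hypothetical address

# Succeeds (exit 0) only if wget could fetch the page.
site_is_up() {
    wget -q -O /dev/null --tries=1 --timeout=10 "$1"
}

# Print the alert text: a header followed by the last lines of the log.
build_alert() {
    local url=$1 log=$2 lines=${3:-50}
    echo "Check failed for ${url} at $(date)"
    echo "Last ${lines} lines of ${log}:"
    tail -n "$lines" "$log" 2>/dev/null || echo "(could not read ${log})"
}

# Only act when a URL is passed as the first argument, so the file can
# also be sourced just for its functions.
if [ $# -gt 0 ]; then
    if ! site_is_up "$1"; then
        build_alert "$1" "$ERROR_LOG" | mail -s "Site down: $1" "$MAILTO"
    fi
fi
```

A crontab line such as `* * * * * /path/to/check_site.sh http://url.to.check` (the path is hypothetical) would run it every minute. Keep in mind that it still runs on the same box, so it cannot alert you if the whole server goes down.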

Alex
  • 6,477
  • 1
  • 23
  • 32
  • Yeah, cool... Am I correct in assuming that most people would use something more robust for "real" applications? – Sean Clark Hess Oct 12 '09 at 17:24
  • Yeah, this is just a very basic check. If you wanted to do more detailed monitoring and trending of more than just one URL, you would need something like Nagios, OpenNMS or other similar monitoring systems. – Alex Oct 12 '09 at 20:16
1

Nagios is great if you have a large number of servers. I suggest starting with Munin: it is simple to set up, and plugins are literally a 5 minute time investment. It is great for collecting statistics and alerting on a smaller scale than Nagios. The best part is that, should you grow large enough to warrant the investment Nagios requires, it integrates well with Nagios.

Munin: http://munin.projects.linpro.no/

Development started picking up again also!!

ScottZ
  • 467
  • 2
  • 7
0

You can use something like puppet or cfengine for process monitoring.

Monitoring whether a certain process still runs and if not, restart the process and report the event, is quite easy with these tools. You can even extend it so that it runs a check like opening a port and expecting some reply on a request.
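To make the idea concrete without tying it to puppet or cfengine syntax, here is a plain shell sketch of the same check-restart-report loop. It is not how either tool implements it, only an illustration; the function name `pid_alive` is made up, and the pid file and restart command are passed in as arguments because they depend on your setup.

```shell
#!/bin/bash
# Sketch only: usage is  watch_pid.sh PIDFILE RESTART_COMMAND...
# Run it as root or as the process owner, otherwise kill -0 can fail
# with a permission error even though the process is alive.

# Succeeds if the pid file is readable and names a live process.
pid_alive() {
    local pidfile=$1 pid
    [ -r "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    # Signal 0 sends nothing; it only tests that the process exists.
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

# Only act when both a pid file and a restart command are given.
if [ $# -ge 2 ]; then
    pidfile=$1
    shift
    if ! pid_alive "$pidfile"; then
        # When run from cron, this output is mailed to the crontab
        # owner (or MAILTO), which covers "report the event".
        echo "Process for ${pidfile} not running at $(date), restarting"
        "$@"
    fi
fi
```

Called from cron as, say, `watch_pid.sh /var/run/nginx.pid /etc/init.d/nginx restart` (both paths are assumptions about your install), it restarts the daemon and reports only when the process is actually gone.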

However, this does not work if your entire server is dying, but that doesn't seem to be the case here.

I'm not familiar with the ruby/rack set of options, but I know Django can also mail you on server errors (a page that causes an error while rendering) and 404's from your own site. Maybe you can find a similar option or hook in what you're building.

Combining the two of these means I'm notified in case a page fails to render and if the entire daemon dies.

  • I'm running a ruby app without rails. So unless something is built into passenger (the layer that creates ruby processes for nginx), I don't have a framework to do anything cool like that for me – Sean Clark Hess Oct 12 '09 at 17:15
0

You really should focus on debugging and fixing the problem instead :)

That said, there are two ways to do what you want. If your server is always up (and you trust it to be up), you can easily monitor any running service via a cron job. Any monitoring software would simply be overkill. But if you have problems with your web application and it fails in some way without actually bringing down any services running on your server, and there is no simple way to test that it failed (the process itself still runs, check results are inconsistent, etc.), then you probably want to use one of the services recommended here that check your site from the outside.

monomyth
  • 971
  • 1
  • 5
  • 9
  • By the time I notice the site is down, it's a pain to find the corresponding entry in my error log. If I get it right away, it will be easier. – Sean Clark Hess Oct 13 '09 at 20:28