0

I am currently using AWS CloudWatch to monitor basic metrics of my EC2 servers.

But it lacks detailed monitoring, such as partition space used, free memory, etc.

Should I install and use Nagios, or are there better alternatives?

(I want to automate as much as possible, so I would prefer not to use Nagios...)

Howard

4 Answers

3

I use a simple script that checks whether CPU load or memory usage has crossed a threshold that I consider high load. If it has, the script calls another script that gathers information to help me analyze what caused the high load, and sends it to my e-mail address as an attachment. Below is a sample of the script, which you may want to enhance and adapt to your needs.

#!/bin/bash

GATHER_INFO=<SCRIPT_NAME_HERE>
# 1-minute load average x 100 (e.g. 1.25 -> 125), so it can be compared as an integer.
# Read from /proc/loadavg because the comma-separated fields of uptime shift with its format.
CPU_LOAD=$(awk '{ print int($1 * 100 + 0.5) }' /proc/loadavg)
CPU_THRESHOLD=<VALUE_HERE> # Same scale as CPU_LOAD: load average x 100.
# Used memory in MB excluding buffers/cache; newer versions of free no longer print that row.
MEMORY_USAGE=$(awk '/^(MemTotal|MemFree|Buffers|Cached):/ { v[$1] = $2 }
  END { print int((v["MemTotal:"] - v["MemFree:"] - v["Buffers:"] - v["Cached:"]) / 1024) }' /proc/meminfo)
MEMORY_THRESHOLD=<VALUE_HERE> # In MB.

if [ "$CPU_LOAD" -gt "$CPU_THRESHOLD" ] || [ "$MEMORY_USAGE" -gt "$MEMORY_THRESHOLD" ] ; then
  "$GATHER_INFO" # I call another script here.
  <SEND_INFORMATION_GATHERED_BY_EMAIL_HERE> # I use nail/mailx here.
fi

exit 0

Please note that the external script $GATHER_INFO depends on tools that must already be installed on your system (e.g. sysstat).

I answered a similar question here, for your reference.

I have also used Munin, which is very simple to use. The problem is that disk I/O on the Munin server is very high, which makes it impractical to host on an EC2 instance unless you are monitoring only a small number of instances.
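As a rough illustration only, a gather script could look something like the sketch below. This is not the script I actually use; the function name `gather_report`, the `REPORT_FILE` variable and its default path are all hypothetical, and it assumes a Linux host with the usual procps tools (`free`, `ps`) available.

```shell
#!/bin/bash
# Hypothetical gather script: writes a plain-text snapshot of system state to a file.
# All names here (gather_report, REPORT_FILE) are illustrative assumptions.

gather_report() {
  echo "== Snapshot taken $(date) =="
  echo "== Load average =="
  cat /proc/loadavg
  echo "== Memory (MB) =="
  free -m
  echo "== Disk usage =="
  df -P -h
  echo "== Top 10 processes by CPU =="
  # --sort is a procps (Linux) option; head keeps the header line plus 10 rows.
  ps -eo pid,pcpu,pmem,comm --sort=-pcpu | head -n 11
}

# Where the report is written; override by exporting REPORT_FILE.
REPORT_FILE=${REPORT_FILE:-/tmp/highload-report.txt}
gather_report > "$REPORT_FILE" 2>&1
```

The resulting file can then be attached to the e-mail with whatever mail tool you use. Tools from sysstat (such as iostat) could be added as extra sections if they are installed.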

bintut
1

CloudWatch provides basic metrics by default. You can add custom and detailed metrics as you wish (although you are limited to 10 additional free metrics).

For the two examples you provided (disk space and used memory), it is very easy to set up CloudWatch. In essence, you need a script, run via cron, that gathers the data and logs the custom metric(s) to CloudWatch (e.g. aws-missing-tools or this forum post).

Beyond that, it really comes down to what you want to do. If the above meets your needs, there is no need to look into more complex solutions. Moreover, depending on what you mean by 'automation', CloudWatch is more integrated into the rest of AWS, which gives you easier control in many cases (e.g. launching new instances).
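A minimal sketch of such a script, assuming the AWS CLI is installed and has credentials that allow `cloudwatch put-metric-data`. The namespace `Custom/System`, the metric names, and the `INSTANCE_ID` placeholder are assumptions made for illustration, not values from the tools linked above. It defaults to a dry run that only prints the commands; set `DRY_RUN=0` to actually publish.

```shell
#!/bin/bash
# Sketch: publish disk and memory usage as CloudWatch custom metrics.
# NAMESPACE, INSTANCE_ID and the metric names are illustrative assumptions.
NAMESPACE=${NAMESPACE:-Custom/System}
INSTANCE_ID=${INSTANCE_ID:-i-EXAMPLE}   # placeholder, replace with your instance ID
DRY_RUN=${DRY_RUN:-1}                   # 1 = print the aws command instead of running it

# Percentage of space used on the filesystem containing path $1.
disk_used_percent() {
  df -P "$1" | awk 'NR == 2 { gsub("%", "", $5); print $5 }'
}

# Percentage of memory used, not counting buffers and page cache.
mem_used_percent() {
  awk '/^MemTotal:/ { t = $2 } /^MemFree:/ { f = $2 }
       /^Buffers:/ { b = $2 } /^Cached:/ { c = $2 }
       END { printf "%d\n", (t - f - b - c) * 100 / t }' /proc/meminfo
}

# publish <metric-name> <value>
publish() {
  set -- aws cloudwatch put-metric-data \
    --namespace "$NAMESPACE" \
    --metric-name "$1" \
    --value "$2" \
    --unit Percent \
    --dimensions "InstanceId=$INSTANCE_ID"
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

publish DiskUsedPercent "$(disk_used_percent /)"
publish MemoryUsedPercent "$(mem_used_percent)"
```

Saved as, say, `/usr/local/bin/push-metrics.sh` (a hypothetical path), it could be scheduled with a crontab entry such as `*/5 * * * * DRY_RUN=0 /usr/local/bin/push-metrics.sh`.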
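To illustrate that integration, here is a hedged sketch of creating a CloudWatch alarm on a custom metric with the AWS CLI. The alarm name, namespace, metric name, threshold and `ALARM_ACTION_ARN` are all placeholders I made up for the example; the action ARN would be whatever SNS topic or Auto Scaling policy you want the alarm to trigger. As written it only prints the command unless `DRY_RUN=0` is set.

```shell
#!/bin/bash
# Sketch: alarm when a custom metric stays above a threshold. All values are placeholders.
NAMESPACE=${NAMESPACE:-Custom/System}
INSTANCE_ID=${INSTANCE_ID:-i-EXAMPLE}
ALARM_ACTION_ARN=${ALARM_ACTION_ARN:-arn:aws:EXAMPLE}   # e.g. an SNS topic or scaling policy
DRY_RUN=${DRY_RUN:-1}                                   # 1 = print the command only

# create_alarm <metric-name> <threshold-percent>
create_alarm() {
  set -- aws cloudwatch put-metric-alarm \
    --alarm-name "$1-high-$INSTANCE_ID" \
    --namespace "$NAMESPACE" \
    --metric-name "$1" \
    --dimensions "Name=InstanceId,Value=$INSTANCE_ID" \
    --statistic Average \
    --period 300 \
    --evaluation-periods 2 \
    --threshold "$2" \
    --comparison-operator GreaterThanThreshold \
    --alarm-actions "$ALARM_ACTION_ARN"
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Alarm if average disk usage exceeds 80% for two consecutive 5-minute periods.
create_alarm DiskUsedPercent 80
```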

cyberx86
0

CloudWatch is not intended to be highly flexible or to meet ALL your monitoring needs. It covers basic to intermediate monitoring and lacks many features found in enterprise monitoring systems (it is not intended to be one of them; its focus is different).

I would personally recommend ZenOSS (ease of use) or Nagios (complex manual setup).

Farhan
0

CopperEgg provides a more detailed view into the performance and operation of your servers. More detailed in terms of:

- higher-resolution data: server metrics are gathered, analyzed and displayed up to 10 times per minute, and
- a richer set of metrics: for example, seeing the top processes running on each instance, in real time and historically.

With regard to automation, CopperEgg provides webhooks as well as integration with Puppet and Chef.

Full disclosure: my name is Scott Johnson, and I work at CopperEgg. CopperEgg's services are hosted on Amazon EC2, and we use all of our tools to monitor our own service.

Best, Scott