
So, I have been playing around with my Auto Scaling config and my CloudWatch alarms to try to keep all my instances purring but not roaring.

I can't seem to get rid of a constant yo-yo: CPU usage goes up, an instance is introduced, CPU usage goes down, an instance is killed. Rinse and repeat.

[image: CPU usage graph]

I'm currently basing my alarm on 3 x 1-minute intervals of average CPU >= 40%. Maybe I can base it on something else? CPU is a tricky one: when this graph is spiking (high) I can see some instances with idle CPU, so the average is being raised by a single instance.

I'm finding some people are getting 502s when I'm getting 200s. Obviously I would like this to be consistent, and I'd like to stop this constant spiking.
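To illustrate how one busy instance can drive the group average, here is a small sketch with made-up CPU readings (the numbers are hypothetical, not taken from the graph). With three instances, a single one at 100% is enough to hit a 40% average; adding a fourth idle instance drops the average back under the threshold, which is the kind of yo-yo described above.

```shell
# Average a list of per-instance CPU percentages (hypothetical readings).
average() {
    printf '%s\n' "$@" | awk '{ sum += $1 } END { printf "%.1f\n", sum / NR }'
}

average 100 10 10      # prints 40.0 (meets the >= 40% alarm threshold)
average 100 10 10 10   # prints 32.5 (below the threshold after scaling out)
```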

Thanks in advance.

EDIT 1: I have adjusted the CloudWatch alarm to 20% CPU over 2 minutes and also found an nginx error that may have contributed some additional load. The current graph looks like the below.

[image: current CPU usage graph]

EDIT 2: Monitoring on load is so much better. See below for the load alarm. I'm getting alerts far less frequently and everything is running much more smoothly.

This is what I'm running in cron every minute:

/usr/local/bin/aws cloudwatch put-metric-data --namespace "NS" --metric-name "GroupLoad" --value "$(awk '{print $1}' /proc/loadavg)" --dimensions AutoScalingWebGroup=NS-WebGroup
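For reference, an alarm on a custom metric like this can be created with `aws cloudwatch put-metric-alarm`. The following is only a sketch under assumptions: the alarm name, the threshold, and the scaling policy ARN are hypothetical placeholders, and the 3 x 60-second evaluation window is borrowed from the earlier CPU alarm rather than from the actual load alarm.

```shell
# Sketch: create a scale-out alarm on the custom GroupLoad metric.
# Alarm name, threshold and policy ARN are hypothetical placeholders.
create_load_alarm() {
    policy_arn="$1"   # ARN of an existing scaling policy (assumed to exist)
    threshold="$2"    # average 1-minute load that should trigger scale-out

    aws cloudwatch put-metric-alarm \
        --alarm-name "NS-WebGroup-HighLoad" \
        --namespace "NS" \
        --metric-name "GroupLoad" \
        --dimensions Name=AutoScalingWebGroup,Value=NS-WebGroup \
        --statistic Average \
        --period 60 \
        --evaluation-periods 3 \
        --threshold "$threshold" \
        --comparison-operator GreaterThanOrEqualToThreshold \
        --alarm-actions "$policy_arn"
}
```

It would be called as `create_load_alarm <scaling-policy-arn> <threshold>`. Because every instance publishes under the same dimension, the Average statistic reflects the group's load rather than one instance's.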

[image: load alarm]

Christian

1 Answer

Instead of auto scaling based on CPU, try scaling on server load.

AWS AutoScaling can operate on any CloudWatch metric, and you can write your own custom CloudWatch metrics.

More Information on how AutoScaling works: http://docs.aws.amazon.com/AutoScaling/latest/DeveloperGuide/as-scale-based-on-demand.html

Creating a custom metric

http://aws.amazon.com/blogs/aws/amazon-cloudwatch-user-defined-metrics/

CloudWatch metrics are scoped within namespaces, and can be further qualified by up to 10 dimensions. For example, latency could be tracked for a pair of applications ("App1" and "App2") while keeping the values isolated from each other:

$ mon-put-data --namespace App1 --metric-name Latency --value 104
$ mon-put-data --namespace App2 --metric-name Latency --value 120
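The `mon-put-data` command comes from the older CloudWatch command line tools. With the unified AWS CLI, the same two data points could be published roughly as follows. This is a sketch: the `put_latency` helper is a hypothetical wrapper, and only the `aws cloudwatch put-metric-data` call inside it is the real command.

```shell
# Sketch: publish a Latency sample into a per-application namespace.
# put_latency is a hypothetical helper name used for illustration.
put_latency() {
    app="$1"     # namespace, e.g. App1 or App2
    value="$2"   # latency sample to record

    aws cloudwatch put-metric-data \
        --namespace "$app" \
        --metric-name Latency \
        --value "$value"
}
```

Calling `put_latency App1 104` and `put_latency App2 120` mirrors the two commands above and keeps the values isolated by namespace.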
Drew Khoury