10

I am running a Python script on an EC2 instance that inserts rows into a database on another instance. In EC2's monitoring I saw 100% CPU utilization, whereas top only shows 20% for the Python process. What is missing from top? Network overhead?

RickyA

2 Answers

19

The data exposed by top is often insufficient or misleading in virtualized environments like Amazon EC2. The reported percentage depends, amongst other things, on your instance type and on the underlying processor core utilization (which usually doesn't match the virtualized hardware presented to you by the hypervisor). What you are seeing is most likely caused by CPU steal time, which most Unix/Linux monitoring tools expose nowadays; see e.g. the %steal column in sar or the st field in top:

st -- Steal Time
The amount of CPU 'stolen' from this virtual machine by the hypervisor for other tasks (such as running another virtual machine).
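To check for steal time directly, one option on Linux is to read the aggregate `cpu` line of `/proc/stat` twice and compare the counters. The following is only a minimal sketch of that idea, not a replacement for sar or top; the function names (`parse_cpu_line`, `cpu_percentages`, `measure`) are made up for illustration, and it assumes a kernel that reports the steal counter as the eighth value on that line.

```python
import time

# Order of the first eight counters on the aggregate "cpu" line of /proc/stat.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(stat_text):
    """Return the aggregate 'cpu' counters from /proc/stat content as ints."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def cpu_percentages(before, after):
    """Share of each counter in the time elapsed between two samples."""
    deltas = [b - a for a, b in zip(before[:8], after[:8])]
    total = sum(deltas)
    if total == 0:
        return {name: 0.0 for name in FIELDS}
    return {name: 100.0 * delta / total for name, delta in zip(FIELDS, deltas)}


def measure(interval=1.0, path="/proc/stat"):
    """Sample the counters twice, `interval` seconds apart (Linux only)."""
    with open(path) as handle:
        before = parse_cpu_line(handle.read())
    time.sleep(interval)
    with open(path) as handle:
        after = parse_cpu_line(handle.read())
    return cpu_percentages(before, after)
```

Calling `measure()["steal"]` on the instance gives the steal percentage over the sampling interval. A high value alongside a modest `user` share would be consistent with the gap between top's per-process figure and the CloudWatch figure described in the question.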

The blog post EC2 monitoring: the case of stolen CPU provides a nice exploration and illustration of this topic:

When the top command displays 40% CPU busy but CloudWatch says the server is maxed out at 100% - which side do you take? The answer is simple (CloudWatch is correct, top is not) [...]

Please note that this hypervisor metric seems to be (easily) accessible on Unix/Linux systems only and doesn't seem to be observable on Windows (yet); see my question Is there a Windows equivalent of Unix 'CPU steal time'? for more on this problem.

jellycsc
Steffen Opel
  • 2
    Thanks for the blogpost. That really makes it clear. It is really good to know this since I am about to roll out Ganglia, and it would be a shame to measure the wrong metrics. Measure %idletime! – RickyA Jun 20 '12 at 15:01
  • In my case, CloudWatch is also reporting 3mb/s network usage, but when I look on my server (with iftop, iptraf, netstat, etc.) I see that the only thing with a network connection is my SSH session into the server, which I really doubt is using 3mb/s... – Benubird Feb 27 '15 at 08:37
-2

Amazon probably checks the load rather than the percentage usage shown by top. If you have two processes on the CPU, utilization can be at 20% while the load is 2.