We have a server with unusually high load and CPU utilization, but we can't figure out why. When we run top, every process seems to use very little CPU.
http://cl.ly/2d1g0K3q261r0R0K3e35
Is there a better way to look for what is causing this?
Load average is a measure of the workload a system has had over the last 1, 5 and 15 minutes.
The most common misconception is that load average is purely connected to the CPU usage of a system.
On Linux, however, load also counts processes that are waiting on I/O (uninterruptible sleep), which I think is your issue.
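To put the load number in context, it helps to compare it with the number of CPUs. Here is a minimal Python sketch, assuming a Linux box where /proc/loadavg is available; the function names are my own and not part of any tool.

```python
import os


def parse_loadavg(text):
    """Parse the contents of /proc/loadavg.

    The file holds the 1, 5 and 15 minute averages, then
    "runnable/total" scheduling entities, then the last PID used.
    """
    fields = text.split()
    one, five, fifteen = (float(value) for value in fields[:3])
    runnable, total = (int(value) for value in fields[3].split("/"))
    return {
        "1min": one,
        "5min": five,
        "15min": fifteen,
        "runnable": runnable,
        "total": total,
    }


def load_per_cpu(load, cpus):
    """Load normalised by CPU count; values well above 1.0 mean queuing."""
    return load / cpus


def read_loadavg(path="/proc/loadavg"):
    """Read and parse the live file (Linux only)."""
    with open(path) as handle:
        return parse_loadavg(handle.read())


def current_load_per_cpu():
    """1 minute load divided by the CPU count reported by Python."""
    return load_per_cpu(read_loadavg()["1min"], os.cpu_count() or 1)
```

Calling current_load_per_cpu() on the affected server gives a rough idea of how many tasks are queued per core. A high value while top shows idle CPUs points toward tasks blocked on I/O rather than tasks burning CPU.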
Based on the image I'm guessing you ran out of memory and started swapping data to disk.
A simple free -m will tell you how much RAM and swap are in use.
The interesting value is the free column on the -/+ buffers/cache row.
If it's close to zero you've run out of RAM and should act accordingly.
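If you want the same number in a script, something like the following Python sketch should work. It assumes Linux and reads /proc/meminfo; the helper names are made up for illustration. It adds MemFree, Buffers and Cached, which is how the older free output derives the free value on the -/+ buffers/cache row.

```python
def parse_meminfo(text):
    """Turn /proc/meminfo text into a dict of integer values.

    Most entries are in kB; a few (such as HugePages_Total) are plain counts.
    """
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not "Key: value"
        parts = rest.split()
        if parts:
            values[key.strip()] = int(parts[0])
    return values


def free_excluding_cache_mb(values):
    """Free memory in MB, counting buffers and page cache as reclaimable."""
    kilobytes = (
        values["MemFree"]
        + values.get("Buffers", 0)
        + values.get("Cached", 0)
    )
    return kilobytes // 1024


def read_free_mb(path="/proc/meminfo"):
    """Read the live file and return the reclaimable-inclusive free MB."""
    with open(path) as handle:
        return free_excluding_cache_mb(parse_meminfo(handle.read()))
```

read_free_mb() returning something close to zero, together with a non-trivial amount of swap in use, would support the swapping theory.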
I noticed that the load average is quite high (68, wow). Is it possible that a lot of processes each take up a little bit of CPU and together consume all the CPU time? Maybe those processes start and finish so quickly that top cannot capture them. You could check whether atop sees them.
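If atop is not installed, a crude way to test the short-lived process theory is to snapshot the PID list a few times and see which PIDs come and go. This is only a rough Python sketch, assuming Linux /proc; the function names are hypothetical, and polling can still miss processes that live for less than the sampling interval.

```python
import os
import time


def list_pids(proc_root="/proc"):
    """Return the set of PIDs currently visible under /proc."""
    return {int(name) for name in os.listdir(proc_root) if name.isdigit()}


def churn(snapshots):
    """PIDs seen in some snapshot but absent from both the first and the last.

    These are processes that started and exited inside the sampling window.
    """
    if len(snapshots) < 2:
        return set()
    seen = set().union(*snapshots)
    return seen - snapshots[0] - snapshots[-1]


def sample(count=50, interval=0.1, proc_root="/proc"):
    """Collect `count` PID snapshots, sleeping `interval` seconds between them."""
    snapshots = []
    for _ in range(count):
        snapshots.append(list_pids(proc_root))
        time.sleep(interval)
    return snapshots
```

len(churn(sample())) being large over a five second window would suggest that something is forking many short-lived processes.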
I think this bug is your case. From what I see in the output, you have enough memory (note the 14 GB or so cached) and no I/O issues, but you have Xen-related processes running. This makes me think it is a bug.
Try using:
top -o cpu
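The -o flag forces top to order the processes by CPU usage in descending order. The field name depends on the top implementation: cpu works with the BSD/macOS top, while the procps-ng top on Linux expects %CPU (top -o %CPU).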
It could be locked files on NFS, or anything else that locks a file another process needs access to.
It could also be a misconfigured service with too many active threads.
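Processes stuck on a lock or on NFS typically sit in state D (uninterruptible sleep), which raises the load without using CPU. A quick way to check is to count processes by state. Below is a minimal Python sketch, assuming Linux /proc; the helper names are my own.

```python
import os
from collections import Counter


def state_of(stat_text):
    """Extract the state letter from the contents of /proc/<pid>/stat.

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so split on the last ")" before reading fields.
    """
    return stat_text.rsplit(")", 1)[1].split()[0]


def count_states(stat_texts):
    """Count how many stat entries are in each state (R, S, D, Z, ...)."""
    return Counter(state_of(text) for text in stat_texts)


def read_all_stats(proc_root="/proc"):
    """Read the stat file of every process that is still alive."""
    texts = []
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, name, "stat")) as handle:
                texts.append(handle.read())
        except (FileNotFoundError, ProcessLookupError):
            continue  # the process exited while we were scanning
    return texts
```

count_states(read_all_stats()) showing dozens of entries under "D" would fit a load of 68 with idle CPUs.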
It looks like the CPU usage is coming from a thread, and top doesn't seem to take this into account. I recently saw this on a MySQL server: INSERT statements were running, but I was unable to get the new rows with SELECT because some thread of mysqld was updating the table index. top showed 100% user load on one core, but every process, including mysqld, was at 0.0% CPU.
Hours later the same SELECT returned the expected result set.
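On Linux, top -H switches to a per-thread view. If you would rather script it, the per-thread CPU counters are under /proc/<pid>/task/. The following is a rough Python sketch with hypothetical helper names, assuming the Linux stat layout where utime and stime are fields 14 and 15.

```python
import os


def cpu_ticks(stat_text):
    """Return utime + stime (in clock ticks) from a stat file's contents."""
    # Fields after the closing ")" start at field 3 (state), so field 14
    # (utime) is index 11 and field 15 (stime) is index 12.
    rest = stat_text.rsplit(")", 1)[1].split()
    return int(rest[11]) + int(rest[12])


def thread_ticks(pid, proc_root="/proc"):
    """Map each thread ID of `pid` to its accumulated CPU ticks."""
    ticks = {}
    task_dir = os.path.join(proc_root, str(pid), "task")
    for tid in os.listdir(task_dir):
        try:
            with open(os.path.join(task_dir, tid, "stat")) as handle:
                ticks[int(tid)] = cpu_ticks(handle.read())
        except (FileNotFoundError, ProcessLookupError):
            continue  # the thread exited while we were scanning
    return ticks


def busiest(before, after):
    """Threads present in both samples, sorted by ticks used in between."""
    deltas = {tid: after[tid] - before[tid] for tid in after if tid in before}
    return sorted(deltas.items(), key=lambda item: item[1], reverse=True)
```

Take two thread_ticks(pid) samples of the mysqld PID a few seconds apart and pass them to busiest(); the first entry is the thread that used the most CPU in that interval.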