
The Wikipedia page for CPU time says:

The CPU time is measured in clock ticks or seconds. Often, it is useful to measure CPU time as a percentage of the CPU's capacity, which is called the CPU usage.

I don't understand how a time duration can be replaced by a percentage. When I look at top, doesn't %CPU tell me that MATLAB is using 2.17 of my cores?

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
18118 jasl      20   0 9248400 261528  78676 S 217.2  0.1   8:14.75 MATLAB      

Question

In order to better understand what CPU usage is, how do I calculate the CPU usage myself?

Braiam
Jasmine Lognnes
  • Press '1' whilst you have 'top' open to gather more granularity on per-core basis. – Peter Dec 03 '14 at 00:39
  • That's the number one (`1`). – Michael Hampton Dec 03 '14 at 02:00
  • Let Linux show you how busy each processor is with this command: `mpstat -P ALL 5 3` (multiprocessor status, 5-second intervals, 3 reports). Divide the %CPU reported by your number of cores to get the average CPU busy %. `iostat -xm 5 3` will tell you how many cores/CPUs you have available. – Wilson Hauck Dec 26 '18 at 14:11

2 Answers


CPU time is allocated in discrete time slices (ticks). During some slices the CPU is busy; during others it is not (which is represented by the idle process). In the picture below, the CPU is busy for 6 of the 10 slices: 6/10 = 0.60 = 60% busy time (and therefore 40% idle time).

[Image: a timeline of 10 CPU time slices, 6 busy and 4 idle]

A percentage is defined as "a number or rate that is expressed as a certain number of parts of something divided into 100 parts". In this case, the parts are discrete slices of time, and the percentage is the share of busy slices among all slices (busy plus idle).
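The 6-out-of-10 arithmetic can be sketched in a few lines of Python. This is only an illustration: the function name `busy_percent` and the list of booleans standing in for time slices are made up for the example, not something the operating system exposes in this form.

```python
def busy_percent(slices):
    """Return the percentage of time slices in which the CPU was busy.

    `slices` is a hypothetical list of booleans: True = busy, False = idle.
    """
    if not slices:
        raise ValueError("need at least one time slice")
    busy = sum(1 for s in slices if s)
    return 100.0 * busy / len(slices)


# Ten slices, six busy and four idle, matching the 6/10 example.
slices = [True, True, False, True, False, True, True, False, True, False]
print(busy_percent(slices))  # 60.0
```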

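CPUs operate in GHz (billions of cycles a second), so the operating system divides that time into larger units called ticks. They are not really 1/10 of a second. In Windows, CPU times are reported in units of 10 million ticks per second (100 ns each), and in Linux in units of sysconf(_SC_CLK_TCK) (usually 100 ticks per second). These are the units used to report CPU time to user space; they are not necessarily the same as the kernel's timer interrupt rate.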
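On Linux, the sysconf(_SC_CLK_TCK) value mentioned above can be read from Python with `os.sysconf`, which is handy when converting tick counts (for example from /proc) into seconds. A minimal sketch, assuming a Unix-like system; the helper name `ticks_to_seconds` is made up for the example.

```python
import os

# Ticks per second used when CPU time is reported to user space.
# Often 100 on Linux, but read it rather than assume it.
ticks_per_second = os.sysconf("SC_CLK_TCK")


def ticks_to_seconds(ticks, hz=ticks_per_second):
    """Convert a tick count into seconds, given the tick rate `hz`."""
    return ticks / hz


print(ticks_per_second)
print(ticks_to_seconds(250, hz=100))  # 2.5
```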

In something like top, the busy CPU time is further broken down into percentages for categories such as user time and system time. In top on Linux and perfmon on Windows, you will often see a figure above 100%; that is because the total is 100% * the_number_of_cpu_cores.
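To calculate this kind of breakdown yourself on Linux, one option is to take two samples of the aggregate `cpu` line in /proc/stat and compare them. The sketch below assumes the usual field order on that line (user, nice, system, idle, iowait, irq, softirq, steal); the two sample lines are invented numbers, and the function names are hypothetical.

```python
FIELDS = ["user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal"]


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into tick counts."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    # Only the first eight counters are used; the guest counters that may
    # follow are already included in user/nice time.
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    return dict(zip(FIELDS, values))


def usage_between(before, after):
    """Return each category's share (in %) of the ticks elapsed between samples."""
    delta = {name: after[name] - before[name] for name in before}
    total = sum(delta.values())
    if total <= 0:
        raise ValueError("samples must be taken at different times")
    return {name: 100.0 * ticks / total for name, ticks in delta.items()}


# Two invented samples, taken some time apart.
before = parse_cpu_line("cpu  1000 0 500 8000 100 0 0 0 0 0")
after = parse_cpu_line("cpu  1400 0 600 8500 100 0 0 0 0 0")

usage = usage_between(before, after)
print(usage["user"], usage["system"], usage["idle"])  # 40.0 10.0 50.0
```

Because the aggregate line sums the ticks of all cores, these figures are on a 0 to 100% scale for the whole machine. Multiplying by the number of cores gives the "100% per core" scale on which a single process can show more than 100%.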

In an operating system, it is the scheduler's job to allocate these precious slices to processes, so the scheduler is what reports this.

Kyle Brandt
  • Time slices are not measured in billionths of a second. They are not that short. They are more likely somewhere between 0.1 ms and 10 ms. Resolution of time values in APIs is not the same as the rate of timer interrupts. Some API calls in Linux have times specified in nanoseconds, but you wouldn't want timer interrupts that frequently. If you had a million interrupts per second, you would spend all the CPU time on context switches. – kasperd Dec 03 '14 at 01:34
  • Kasperd, understand that. The CPU operates at that frequency though.... rewrote it – Kyle Brandt Dec 03 '14 at 01:35
  • Do you mean 1000 ticks? All my Linux systems are either 1000 ticks (EL5 and EL6), or [1000 ticks + tickless](http://lwn.net/Articles/549580/) (EL7). Or do you mean something else? – Michael Hampton Dec 03 '14 at 02:06
  • @MichaelHampton on CentOS 6.5 `sysconf(_SC_CLK_TCK)` returns 100, am I missing something? – Kyle Brandt Dec 03 '14 at 02:59
  • The man page says: "The corresponding variable is obsolete." I don't think that can be relied upon. I checked the kernel configuration in /boot/config-2.6.32-whatever it is this month... – Michael Hampton Dec 03 '14 at 03:00
  • @MichaelHampton: I think maybe this starts to explain it: http://man7.org/linux/man-pages/man7/time.7.html . On my CentOS system CONFIG_HZ=1000 , and /proc/stat does return things in 1/100 of a second. So it seems like maybe the tick interval of the kernel and user space reporting may not be the same. – Kyle Brandt Dec 03 '14 at 03:13
  • @MichaelHampton: http://lists.kernelnewbies.org/pipermail/kernelnewbies/2011-March/001191.html seems to indicate that as well – Kyle Brandt Dec 03 '14 at 03:17
  • Aha, no, they're not reported the same. CLK_TCK is a scaled value explicitly for userspace, and is apparently always 100 regardless of how many ticks the kernel actually uses. Found some good explanations on SO ([1](http://stackoverflow.com/q/17410841/1068283), [2](http://stackoverflow.com/q/4189123/1068283)) – Michael Hampton Dec 03 '14 at 03:23
  • I also think the Windows "ticks" that that API call refers to aren't the same as the [Windows timer interrupt frequency](https://randomascii.wordpress.com/2013/07/08/windows-timer-resolution-megawatts-wasted/), and thus aren't really comparable. – Michael Hampton Dec 03 '14 at 04:14
  • @MichaelHampton: Will read and correct the answer in the morning. Won't be offended by any edits either :-P – Kyle Brandt Dec 03 '14 at 04:20

The CPU time is the time that the process spends using the CPU. Converting it to a percentage is done by dividing it by the amount of real time that has passed.

So, if I have a process that uses 1 second of CPU time over a period of 2 seconds, it's using 50% of a CPU.

In the case of your MATLAB process, 217% indicates that it's used 2.17 seconds of CPU time per second over the last sample interval - effectively, monopolizing 2 CPU cores and taking some of a third.
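The division described above is short enough to write out. In this sketch, `cpu_percent` and `measure` are made-up helper names, and the 3-second sample interval in the second call is only an assumed figure chosen to reproduce the 217.2% from the question. `time.process_time()` and `time.perf_counter()` are standard-library clocks for the current process's CPU time and for elapsed real time.

```python
import time


def cpu_percent(cpu_seconds, wall_seconds):
    """CPU usage in percent: CPU time consumed divided by real time elapsed."""
    if wall_seconds <= 0:
        raise ValueError("elapsed real time must be positive")
    return 100.0 * cpu_seconds / wall_seconds


print(cpu_percent(1.0, 2.0))              # 50.0
print(round(cpu_percent(6.516, 3.0), 1))  # 217.2 (assumed 3 s interval)


def measure(func):
    """Run func() and return the CPU usage of this process while it ran."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()  # user + system CPU time of this process
    func()
    cpu_used = time.process_time() - cpu_start
    wall_used = time.perf_counter() - wall_start
    return cpu_percent(cpu_used, wall_used)


# A busy loop should come out near 100% of one core; a sleep, near 0%.
print(measure(lambda: sum(range(1_000_000))))
print(measure(lambda: time.sleep(0.2)))
```

A process only exceeds 100% by this measure if it runs on several cores at once, which is what the multithreaded MATLAB process is doing.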

Shane Madden