
My requirements are very simple: I need to graph the CPU usage on a remote Linux server over a week. I'm not looking for anything complicated.

I started down the Cacti route - it's not simple, it's not straightforward, and it definitely feels like overkill.

Is there a simpler, quicker and more straightforward option?

slm
Bart B

9 Answers


Munin is very nice, and easy to install and setup.

wazoox

For a one-off sort of thing, I would get the data using sar (from the sysstat package) and then graph it with rrdtool. Here is a script that aids in creating graphs from sar output.
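As a rough sketch of that approach (file names here are placeholders): collect a week of 5-minute CPU samples into a binary sar log, then export it with sadf in a form a graphing tool can read. The long-running commands are only echoed below, since the collection itself takes a week:

```shell
# One CPU sample every 5 minutes for a week, written to a binary sar log.
interval=300                               # seconds between samples
samples=$(( 7 * 24 * 3600 / interval ))    # 2016 samples in a week
# Collection step (run on the server; takes a week, hence only echoed here):
echo "sar -u $interval $samples -o cpu.week"
# Export step: sadf -d turns the binary log into semicolon-separated records:
echo "sadf -d cpu.week -- -u > cpu.csv"
```

From there the semicolon-separated output is easy to feed to rrdtool or whatever plotting tool you prefer.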

Kyle Brandt
  • +1 for sar + graphing-tool-of-your-choice Also check out ksar, which is a java app that eats raw sar data and poops out pretty graphs. I didn't find it super intuitive to use but the end result was good. – DictatorBob Sep 30 '09 at 19:44

You could try sar grapher at http://www.fsprings.com/neat/sargrapher.html: you upload your sar -p -A output and it provides a page with graphs. If you want, you can select only the sar options you care about and it will graph just those.

user190941

A couple of questions:

- do you want to generate plots in real-time?
- how often do you want to sample?

A previous comment mentioned 5-minute samples, and I have to say that if you really want to know what your CPU is doing with any confidence, you should really be down in the 10-second range. Averaging things out at 5 minutes will just cause you to miss spikes that could be minutes long! Admittedly 10-second samples could miss 5-second spikes, but you have to figure out what you're trying to see in the data.
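To make the averaging point concrete, here is a toy illustration (made-up numbers): ten 30-second samples spanning one 5-minute window, with a single 100% spike that the window average all but erases.

```shell
# Ten 30-second CPU samples covering one 5-minute window; one sample hits
# 100%, but averaging over the window reports only 14.5%.
printf '%s\n' 5 5 5 5 100 5 5 5 5 5 |
  awk '{ sum += $1; if ($1 > max) max = $1 }
       END { printf "avg=%.1f%% max=%d%%\n", sum/NR, max }'
```

This prints `avg=14.5% max=100%`: a spike that dominated half a minute is nearly invisible in the 5-minute figure.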

Personally I use collectl, probably because I wrote it! ;-)

But it runs very efficiently at low sampling rates (even sub-second) and can even report its output in a format suitable for plotting. In fact if you install collectl-utils and write to a shared directory, you can use colplot to see the data in real time.

One last comment about RRDTool. It's a great package and draws very nice plots, but you do need to know that if you log a lot of samples for a single day, the resultant plots are not accurate. This is because RRDTool normalizes multiple samples into single data points to make the graphs simpler, something colplot never does: it uses gnuplot to make sure every data point that is captured is faithfully plotted.

-mark


As mentioned by @user190941 and @kyle-brandt, a quick and not so dirty solution is:

1. Install and run sysstat

On debian based distros:

$ sudo apt install sysstat
$ sudo systemctl start sysstat

2. Run sar

2.1. SSH to remote host

2.2. Create a screen

To keep sar running even when you lose ssh connection:

$ sudo apt install screen
$ screen -S sar

2.3 Execute sar in screen

$ sar -pu 1 604800 > cpu.sar

This will record CPU utilization (-u) once per second for 604800 seconds = 7 days * 24 hours/day * 60 minutes/hour * 60 seconds/minute.
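As a quick sanity check of that figure:

```shell
# 7 days * 24 h/day * 60 min/h * 60 s/min, one sample per second
echo $(( 7 * 24 * 60 * 60 ))    # 604800
```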

3. Visualize Online

scp cpu.sar to your local computer and use an online sar charting tool to plot the numbers.

kaptan

I prefer ORCA (www.orcaware.com) for graphing server statistics.
The setup these days isn't too difficult (use snapshot r535), and it can display hourly, daily, weekly, monthly, quarterly and yearly ranges.

It's based on a data collector (procallator) that polls in 5-minute intervals. The graphing engine is an old version of RRDTool, but is quick for this application.

For the remote server, you can have it graph its own stats, or you can pull the procallator files via ssh/rsync/scp on a regular interval to graph on a local server. It works well either way.
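A sketch of the pull variant (the paths, user and host below are placeholders, not ORCA defaults): a cron entry on the local graphing server that mirrors the procallator output every 5 minutes.

```shell
# /etc/cron.d/orca-pull -- hypothetical paths and host; adjust to your layout.
# Every 5 minutes, mirror the remote procallator files to the local grapher.
*/5 * * * * orca rsync -az remote.example.com:/var/lib/procallator/ /var/lib/procallator/remote/
```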

ewwhite

To monitor remote system usage, you could use a tool that outputs graphs to the terminal, like atopsar-plot. It is capable of plotting historic days with the --day parameter. Installation is easy: you just need atop, Python and pip. No configuration necessary.

Full disclosure: I'm the author of atopsar-plot.

$ atopsar-plot --day 0 --disk sda --iface 0s31f6

                       %CPU_idle                  
      ┌──────────────────────────────────────────┐
1586.0┤▄▄▄█████▄▄▄▖       ▖   ▐█▄▄▄▄▄▄█▟█▙▄▄▄▄▄▄▄│
1532.0┤████████████▙▄▄▄▄▄▟█▖  ▟██████████████████│
1478.0┤█████████████████████▖▗███████████████████│
1424.0┤█████████████████████▙████████████████████│
      └┬────────┬────────┬──────────┬───────────┬┘
       09:57  10:37    11:07      12:07     13:07 

                      %DSK_busy                   
    ┌────────────────────────────────────────────┐
36.0┤                      ▟▄                    │
24.0┤                     ▟██▌                   │
12.0┤▄▄                  ▗████                   │
 0.0┤█████▄▄▄▄▄▄▄▄▄▄▄▄▟█▄█████████▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄│
    └┬────────┬────────┬──────┬──────┬──────────┬┘
     09:57  10:27    11:07  11:57  12:17    13:07 

                      MB_SWP_free                 
      ┌──────────────────────────────────────────┐
8192.0┤████████████████████▙▄▄                   │
8186.7┤███████████████████████▙▖                 │
8181.3┤██████████████████████████▄▄              │
8176.0┤█████████████████████████████▙▄▄▄▄▄▄▄▄▄▄▄▄│
      └┬────────┬────────┬──────────┬───────────┬┘
       09:57  10:37    11:07      12:07     13:07 

                      NET_iMbps                   
    ┌────────────────────────────────────────────┐
73.0┤                     ▄█                     │
48.7┤                    ▗██▖                    │
24.3┤                    ▐██▌                    │
 0.0┤▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄█████████▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄│
    └┬────────┬────────┬──────┬──────┬──────────┬┘
     09:57  10:27    11:07  11:57  12:17    13:07 

                      NET_oMbps                   
     ┌───────────────────────────────────────────┐
151.0┤                             ▗▟            │
100.7┤                           ▗▟██▖           │
 50.3┤                          ▄████▌           │
  0.0┤▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄███████▄▄▄▄▄▄▄▄▄▄▄│
     └┬─────────┬───────┬───────────┬───────────┬┘
      09:57   10:37   11:07       12:07     13:07 

When I was working with some Linux boxes I was using Splunk and found it very useful.

I liked Splunk because it allowed me not only to monitor performance but also to set up alerts, for my Windows boxes as well as my Linux ones.

jgardner04

If you really have just one server, ignore this; but if you have a bunch, or are going to grow, then Ganglia might be worth a look.

5 second sampling, and a bunch of metrics beyond CPU, nicely managed at multiple levels, per server/cluster/farm, etc.

Alex
  • I thought Ganglia looked good and tried it out on a small number of Linux servers running different distros; the setup was relatively easy but I found the graph display to be very unreliable. The collected stats for certain servers displayed just fine, but for others nearly all the data was missing. There didn't seem to be any rhyme or reason as to which servers worked and which didn't. – gareth_bowles Dec 22 '09 at 22:08