5

I have a CentOS-5 server that has very little load. A typical output of "sar" command shows like this:

02:20:01 PM       CPU     %user     %nice   %system   %iowait    %steal     %idle
02:30:01 PM       all      0.63      0.00      1.23      0.31      0.00     97.84
02:40:01 PM       all      0.92      0.32      1.34      0.45      0.00     96.97
02:50:01 PM       all      0.76      0.00      0.93      0.77      0.00     97.54
03:00:01 PM       all      0.41      0.00      0.72      0.15      0.00     98.72
03:10:01 PM       all      0.78      0.32      1.49      1.83      0.00     95.58
03:20:01 PM       all      0.27      0.00      0.50      0.62      0.00     98.61

However, I am observing that every night at 4:10AM, the CPU usage is very high. This happens almost every night except few days in a month. The "sar" command shows something like this:

03:00:01 AM       all      0.10      0.00      0.10      0.11      0.00     99.68
03:10:01 AM       all      0.08      0.32      0.18      0.21      0.00     99.21
03:20:01 AM       all      0.03      0.00      0.03      0.19      0.00     99.76
03:30:01 AM       all      0.01      0.00      0.02      0.06      0.00     99.91
03:40:02 AM       all      0.75      0.32      3.21     15.43      0.00     80.28
03:50:01 AM       all      0.95      0.00      3.43      9.25      0.00     86.36
04:00:01 AM       all      0.69      0.00      3.79      4.86      0.00     90.66
04:10:01 AM       all     22.38      0.54      4.47     30.92      0.00     41.69
04:20:01 AM       all      2.99      0.02      4.58      4.95      0.00     87.46
04:30:01 AM       all      0.34      0.00      0.61     13.22      0.00     85.83
04:40:01 AM       all      0.05      0.32      0.17      0.52      0.00     98.95
04:50:01 AM       all      0.22      0.00      0.36      0.11      0.00     99.32
05:00:01 AM       all      0.23      0.00      0.36      0.12      0.00     99.28

How can I determine what is causing this peak in CPU usage and iowait etc. I examined /var/log/messages but could not determine. There is no other cron job or other things set to run at that time. Since this is showing under %user, I can only make out that it is some user process. How can I exactly determine the cause?

4 Answers4

7

This is when the cron.daily scripts run... By default, it runs daily just after 4:00am.

From /etc/crontabs

SHELL=/bin/bash
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=root
HOME=/

# run-parts
01 * * * * root run-parts /etc/cron.hourly
02 4 * * * root run-parts /etc/cron.daily <-- THIS ONE!!
22 4 * * 0 root run-parts /etc/cron.weekly
42 4 1 * * root run-parts /etc/cron.monthly

And in /etc/cron.daily.

The culprit is probably your mlocate.cron, which runs updatedb and can be quite intensive on some systems:

#!/bin/sh
nodevs=$(< /proc/filesystems awk '$1 == "nodev" { print $2 }')
renice +19 -p $$ >/dev/null 2>&1
/usr/bin/updatedb -f "$nodevs"
ewwhite
  • 194,921
  • 91
  • 434
  • 799
  • Thank you very much! I checked according to your points and yes indeed the cron.daily seems to be the culprit. In fact I have logrotate, logwatch, sarg, webalizer, mlocate etc. in /etc/cron.daily. So correctly, these are eating up the CPU time and also responsible for the high I/O. Thank you again for the directions. – WebTenet Solutions Jan 04 '12 at 13:16
  • Please mark an answer. – ewwhite Jan 08 '12 at 14:13
4

As stated in /etc/crontab, there is an automatic cron run scheduled at 04:02 AM, each day:

[...]
02 4 * * * root run-parts /etc/cron.daily

Running a ls /etc/cron.daily/ will show you the running commands scheduled for 04:02AM (they are executed sequentially). You could easily modify them to output load 'before' and after they are executed.

However, if this is a standard CentOS install, there is a very high probability that the script responsible for the high load is either mlocate.cron, 0logwatch (if it is installed) or logrotate.

mlocate will execute a full filesystem scan (which would explain the increased i/o wait on your system), 0logwatch will parse your log file (which can be quite big, resulting in i/o operations) and logrotate can also create significant i/o if it moves file across file systems.

CloudWeavers
  • 2,511
  • 1
  • 14
  • 17
3

Since you are saying that this happens at 4:10 almost everyday a crude way would be to run ps aux in cron for every minute between 4:05 to 4:15 and get the output written in a file. You can then check the %CPU usage of the command that is eating up your cpu.

Aditya Patawari
  • 1,065
  • 8
  • 23
2

Likely to be a cron job - possibly "locatedb" being updated?

Adding your own cron'd output of "ps" for a night might help spot the culprit.

Tom Newton
  • 4,021
  • 2
  • 23
  • 28