When my server is downloading a file or performing other I/O operations, everything else on the server becomes extremely slow (almost locked up), even functions that do not use the network. While a download is in progress, if I start typing in gnome-terminal, the first key I press takes several seconds to appear on the screen, and the letter is often repeated several times even though I pressed it only once.

A few seconds before starting a download, I ran sar:

[root@hostname ~]# sar 6 36
Linux 2.6.18-348.6.1.el5 (hostname)         07-07-2013

13:16:42          CPU     %user     %nice   %system   %iowait    %steal     %idle
13:16:48          all      3,96      0,00      6,50      4,94      0,00     84,59
13:16:54          all      8,33      0,00      0,88      1,35      0,00     89,44
13:17:00          all      4,81      0,00      1,17      0,17      0,00     93,85
13:17:06          all      2,49      0,00      2,44      0,80      0,00     94,27
13:17:12          all      6,42      0,00     10,02     24,08      0,00     59,48
13:17:18          all      1,61      0,00     17,16     28,87      0,00     52,36
13:17:24          all      6,46      0,00     14,03     10,66      0,00     68,86
13:17:30          all     10,66      0,00     16,76      4,93      0,00     67,65
13:17:36          all      8,41      0,00     21,28     19,07      0,00     51,25
13:17:42          all      4,98      0,00     18,10     47,51      0,00     29,41
13:17:48          all      0,48      0,00     13,87     28,22      0,00     57,44
13:17:54          all      0,53      0,00     13,80     42,60      0,00     43,08
13:18:00          all      1,08      0,00     12,62     57,36      0,00     28,94
13:18:06          all      1,90      0,00     15,37     63,71      0,00     19,02
13:18:12          all      1,10      0,00     16,26     71,44      0,00     11,21
13:18:18          all      1,65      0,00     21,12     72,99      0,00      4,25
13:18:24          all      1,38      0,00     22,54     67,81      0,00      8,27
13:18:30          all      1,25      0,00     17,00     67,94      0,00     13,81
13:18:36          all      1,33      0,00     16,87     51,04      0,00     30,76
13:18:42          all      1,27      0,00     17,91     58,54      0,00     22,28
13:18:48          all      1,39      0,00     14,60     39,19      0,00     44,82
13:18:54          all      1,78      0,00     13,68     35,70      0,00     48,85
13:19:00          all      0,43      0,00     10,63     54,44      0,00     34,50
13:19:06          all      6,58      0,00      8,81     13,92      0,00     70,69
13:19:12          all      0,89      0,00     27,40     19,09      0,00     52,63
13:19:18          all      1,63      0,00     20,10     39,95      0,00     38,32
13:19:24          all     18,95      0,00     16,89     34,02      0,00     30,15
13:19:30          all      4,21      0,00      9,03     17,51      0,00     69,24
13:19:36          all      0,87      0,00      3,40      2,13      0,00     93,60
13:19:42          all      2,17      0,00      0,46      0,13      0,00     97,25
13:19:48          all      2,90      0,00      1,53      2,17      0,00     93,40
13:19:54          all      2,17      0,00     11,44     17,07      0,00     69,31
13:20:00          all      1,10      0,00      1,86      0,04      0,00     97,00
13:20:06          all      1,71      0,00      0,63      0,38      0,00     97,28
13:20:12          all      1,93      0,00      1,30      0,42      0,00     96,36
13:20:18          all      0,12      0,00      0,38      0,04      0,00     99,46
Average:         all      3,32      0,00     10,74     24,91      0,00     61,03

No errors appear in /var/log/messages.

My disks are two Kingston SH103S3240G SSDs in RAID-1, partitioned as follows:

[root@cluster ~]# fdisk -l

Disk /dev/hda: 240.0 GB, 240057409536 bytes
255 heads, 63 sectors/track, 29185 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/hda1               1          14      112423+  fd  Linux raid autodetect
/dev/hda2              15         537     4200997+  fd  Linux raid autodetect
/dev/hda3             538       29185   230115060   fd  Linux raid autodetect

Disk /dev/hdc: 240.0 GB, 240057409536 bytes
255 heads, 63 sectors/track, 29185 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/hdc1               1          14      112423+  fd  Linux raid autodetect
/dev/hdc2              15         537     4200997+  fd  Linux raid autodetect
/dev/hdc3             538       29185   230115060   fd  Linux raid autodetect

Disk /dev/md125: 235.6 GB, 235637702656 bytes
2 heads, 4 sectors/track, 57528736 cylinders
Units = cylinders of 8 * 512 = 4096 bytes

Disk /dev/md125 doesn't contain a valid partition table

Disk /dev/md1: 4301 MB, 4301717504 bytes
2 heads, 4 sectors/track, 1050224 cylinders
Units = cylinders of 8 * 512 = 4096 bytes

Disk /dev/md1 doesn't contain a valid partition table

Disk /dev/md127: 115 MB, 115015680 bytes
2 heads, 4 sectors/track, 28080 cylinders
Units = cylinders of 8 * 512 = 4096 bytes

Disk /dev/md127 doesn't contain a valid partition table

My question is: how can I diagnose this problem when it makes the machine too slow to type commands and read their output?

The server is so slow that I cannot analyze the output of iotop and top.

2 Answers

  • Do a disk check to make sure the disks have no errors.
  • Run top or iostat in a terminal window, then start the process.
  • Check the log files to see whether any errors are being logged.
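
Since the terminal becomes nearly unusable once the download starts, one way to follow the second suggestion is to start the monitors in batch mode beforehand and read their logs afterwards. This is only a minimal sketch, assuming top, iostat (from sysstat) and vmstat are installed; the function name start_monitors, the log directory, and the interval and sample count are illustrative choices, not part of the answer above.

```shell
# Start non-interactive monitors that write to log files, so the data can be
# read after the slowdown instead of during it.
# Hypothetical helper: the name, defaults and log layout are assumptions.
start_monitors() {
    logdir=${1:-/tmp/iodiag}   # directory for the log files
    interval=${2:-5}           # seconds between samples
    count=${3:-60}             # number of samples to take
    mkdir -p "$logdir" || return 1

    # top in batch mode (-b) prints plain text suitable for a file
    top -b -d "$interval" -n "$count" > "$logdir/top.log" 2>&1 &
    # extended per-device statistics
    iostat -x "$interval" "$count" > "$logdir/iostat.log" 2>&1 &
    # run queue, blocked processes, swap and block I/O
    vmstat "$interval" "$count" > "$logdir/vmstat.log" 2>&1 &

    # print the log directory so the caller knows where to look
    echo "$logdir"
}
```

Calling start_monitors /tmp/iodiag 5 60 just before starting the download would collect about five minutes of samples in the background; the three log files can then be inspected once the machine is responsive again.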
Tiffany Walker

That is true: sometimes server performance is so bad that it is difficult to use performance-debugging commands. My suggestion is to install sysstat (on CentOS: yum install sysstat) and use sar to review the historical data for the period when the performance degradation happened, to determine the cause.
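
As a rough illustration of reviewing that history: on CentOS the sysstat cron job normally stores one binary data file per day of the month under /var/log/sa, and sar -f reads such a file back, with -s and -e limiting the time window. The helpers below are hypothetical wrappers (sa_file and sar_review are made-up names), and they assume the default /var/log/sa location.

```shell
# Path of the sysstat daily data file for a two-digit day of month (01-31).
# Assumes the default CentOS location /var/log/sa.
sa_file() {
    printf '/var/log/sa/sa%s\n' "$1"
}

# Print CPU, I/O and queue statistics recorded between two times on one day.
# Arguments: day of month (two digits), start time, end time (hh:mm:ss).
sar_review() {
    day=$1; start=$2; end=$3
    file=$(sa_file "$day")
    sar -u -f "$file" -s "$start" -e "$end"   # CPU usage, including %iowait
    sar -b -f "$file" -s "$start" -e "$end"   # I/O and transfer rates
    sar -q -f "$file" -s "$start" -e "$end"   # run queue length and load average
}
```

For the incident in the question, sar_review 07 13:16:00 13:21:00 would cover the slow period. The stock cron job usually samples only every 10 minutes, so a shorter collection interval may be needed to see a slowdown that lasts just a few minutes.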