Server Stats:

"cat /proc/version" Output

Linux version 2.6.18-308.24.1.el5 (mockbuild@builder17.centos.org) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-52)) #1 SMP Tue Dec 4 17:43:34 EST 2012

ethtool eth0 Output:

Settings for eth0:
        Supported ports: [ TP ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Supports auto-negotiation: Yes
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Advertised auto-negotiation: Yes
        Speed: 1000Mb/s
        Duplex: Full
        Port: Twisted Pair
        PHYAD: 1
        Transceiver: internal
        Auto-negotiation: on
        Supports Wake-on: pumbg
        Wake-on: g
        Current message level: 0x00000001 (1)
        Link detected: yes

cat /proc/cpuinfo | grep MHz Output:

cpu MHz         : 3201.000
cpu MHz         : 3201.000
cpu MHz         : 3201.000
cpu MHz         : 3201.000
cpu MHz         : 3201.000
cpu MHz         : 3201.000
cpu MHz         : 3201.000
cpu MHz         : 3201.000

I'm not very experienced with Linux and have been trying to track down why this server has such a high load. I believe either the HDD is not fast enough or there are too many interrupts, as the "ksoftirqd" process will sometimes take a large amount of CPU and appears to run for lengthy periods.

I've been researching online how to properly diagnose this, and I believe I have found how to bring up useful information, but unfortunately the results still leave me confused.

Top Output

top - 08:40:31 up 132 days,  2:06,  2 users,  load average: 84.25, 63.29, 63.02
Tasks: 3214 total,   8 running, 3206 sleeping,   0 stopped,   0 zombie
Cpu(s): 18.6%us,  3.2%sy,  0.0%ni, 41.1%id, 26.8%wa,  0.3%hi,  9.9%si,  0.0%st
Mem:  32934596k total, 25811556k used,  7123040k free,   329988k buffers
Swap:  4194296k total,      128k used,  4194168k free, 10888060k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 1846 nobody    16   0  125m  12m 4944 S  3.9  0.0   0:00.47 httpd
13490 nobody    15   0  126m  13m 5064 S  2.9  0.0   0:02.37 httpd
20137 everprox  16   0  127m  13m 4908 D  2.6  0.0   0:00.76 httpd
 1827 everprox  15   0  127m  13m 4924 S  2.0  0.0   0:00.50 httpd
16574 root      15   0 15120 3480  812 R  2.0  0.0   0:00.15 top
16894 nobody    15   0  126m  84m  944 S  2.0  0.3 946:10.13 nginx
 6347 root      16   0 15112 3552  816 S  1.6  0.0   3:16.46 top
 7115 named     25   0  422m  52m 2084 S  1.6  0.2   4089:16 named
16575 everprox  16   0  126m  11m 3992 D  1.6  0.0   0:00.05 httpd
16891 nobody    15   0  149m  89m  944 S  1.6  0.3 939:49.39 nginx
16892 nobody    15   0  126m  84m  944 S  1.6  0.3 940:41.47 nginx
26041 everprox  15   0  126m  13m 5076 S  1.6  0.0   0:01.55 httpd
26113 nobody    15   0  126m  13m 5024 S  1.6  0.0   0:02.46 httpd
 4345 everprox  15   0  126m  13m 5040 S  1.3  0.0   0:01.82 httpd
13131 everprox  15   0  125m  12m 5072 S  1.3  0.0   0:01.82 httpd
14058 everprox  15   0  127m  13m 5132 D  1.3  0.0   0:01.57 httpd
14554 nobody    15   0  126m  13m 4896 S  1.3  0.0   0:00.74 httpd
26209 everprox  15   0  126m  13m 5044 S  1.3  0.0   0:03.08 httpd
26283 everprox  16   0  125m  12m 5108 D  1.3  0.0   0:02.06 httpd
 4360 everprox  15   0  126m  13m 5088 S  1.0  0.0   0:01.93 httpd
12997 everprox  15   0  126m  13m 5052 S  1.0  0.0   0:03.33 httpd
13351 nobody    15   0  127m  13m 5168 S  1.0  0.0   0:02.43 httpd
13705 everprox  15   0  126m  13m 5076 D  1.0  0.0   0:01.55 httpd
13870 nobody    16   0  126m  13m 5088 S  1.0  0.0   0:02.73 httpd
13931 nobody    15   0  126m  13m 5064 S  1.0  0.0   0:02.57 httpd
14008 everprox  15   0  127m  13m 5156 D  1.0  0.0   0:03.39 httpd
14009 everprox  15   0  126m  13m 5064 D  1.0  0.0   0:01.94 httpd
14215 everprox  15   0  126m  13m 5044 S  1.0  0.0   0:01.68 httpd
14550 everprox  16   0  126m  12m 5088 D  1.0  0.0   0:02.73 httpd
14556 nobody    15   0  126m  13m 5096 S  1.0  0.0   0:03.57 httpd
14587 everprox  15   0  126m  12m 5072 S  1.0  0.0   0:03.74 httpd
14625 nobody    15   0  126m  13m 5108 S  1.0  0.0   0:02.93 httpd
14671 everprox  15   0  126m  13m 5048 S  1.0  0.0   0:02.92 httpd
16893 nobody    15   0  125m  81m  944 R  1.0  0.3 936:15.00 nginx
16896 nobody    15   0  127m  87m  944 S  1.0  0.3 939:30.33 nginx
16897 nobody    15   0  122m  84m  944 R  1.0  0.3 939:11.18 nginx
20121 nobody    16   0  125m  11m 4752 S  1.0  0.0   0:00.63 httpd
20122 everprox  16   0  126m  13m 5036 D  1.0  0.0   0:00.60 httpd
25391 everprox  16   0  126m  13m 5108 D  1.0  0.0   0:02.74 httpd
25463 everprox  15   0  126m  13m 5036 D  1.0  0.0   0:02.45 httpd
25514 everprox  16   0  126m  13m 5096 D  1.0  0.0   0:01.03 httpd
26130 everprox  15   0  126m  13m 5048 D  1.0  0.0   0:01.42 httpd
26220 nobody    15   0  126m  13m 5068 S  1.0  0.0   0:03.15 httpd
 1833 nobody    16   0  126m  12m 4976 S  0.7  0.0   0:00.40 httpd
 4364 everprox  15   0  125m  12m 5020 S  0.7  0.0   0:02.01 httpd
 4370 nobody    16   0  126m  13m 5076 S  0.7  0.0   0:02.02 httpd
 5499 everprox  15   0  126m  12m 4972 S  0.7  0.0   0:00.54 httpd
 5507 everprox  16   0  126m  13m 5004 D  0.7  0.0   0:00.50 httpd
12984 everprox  16   0  127m  13m 5064 D  0.7  0.0   0:01.84 httpd
13004 everprox  15   0  126m  13m 5056 S  0.7  0.0   0:02.81 httpd
13029 everprox  16   0  126m  13m 5048 D  0.7  0.0   0:02.65 httpd

free -mt Output

root@echo [~]# free -mt
             total       used       free     shared    buffers     cached
Mem:         32162      25219       6943          0        322      10690
-/+ buffers/cache:      14206      17956
Swap:         4095          0       4095
Total:       36258      25219      11039

iostat :

root@echo [~]# iostat
Linux 2.6.18-308.24.1.el5 (echo.uk7.org)        10/17/2013

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          26.95    0.08   12.17    3.42    0.00   57.38

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             111.64        19.88      2038.06  226836250 23259888204
sda1              0.00         0.00         0.00       2688       1076
sda2            111.64        19.88      2038.06  226833282 23259887128
dm-0            255.26        19.88      2038.06  226831554 23259887880
dm-1              0.00         0.00         0.00       1160        344

sar -I SUM output:

Linux 2.6.18-308.24.1.el5 (echo.uk7.org)        10/17/2013

12:00:01 AM      INTR    intr/s
12:10:01 AM       sum  17315.21
12:20:01 AM       sum  23640.63
12:30:05 AM       sum  26005.42
12:40:05 AM       sum  27051.29
12:50:01 AM       sum  25887.09
01:00:01 AM       sum  25915.91
01:10:02 AM       sum  25643.99
01:20:01 AM       sum  25590.73
01:30:01 AM       sum  25843.38
01:40:01 AM       sum  25817.66
01:50:01 AM       sum  25937.93
02:00:03 AM       sum  25836.42
02:10:01 AM       sum  25850.17
02:20:01 AM       sum  25788.77
02:30:01 AM       sum  25680.55
02:40:01 AM       sum  25871.60
02:50:01 AM       sum  27089.20
03:00:01 AM       sum  26069.86
03:10:01 AM       sum  26368.91
03:20:01 AM       sum  25977.64
03:30:04 AM       sum  26038.12
03:40:05 AM       sum  26278.10
03:50:02 AM       sum  25988.70
04:00:04 AM       sum  26723.36
04:10:05 AM       sum  26150.12
04:20:03 AM       sum  25904.27
04:30:01 AM       sum  26030.90
04:40:09 AM       sum  25714.96
04:50:10 AM       sum  25732.73
05:00:01 AM       sum  24374.81
05:10:01 AM       sum  21990.37
05:20:01 AM       sum  22917.79
05:30:03 AM       sum  22847.98
05:40:03 AM       sum  24926.45
05:50:01 AM       sum  24986.11
06:00:01 AM       sum  24935.01
06:10:04 AM       sum  25438.65
06:20:01 AM       sum  25430.91
06:30:03 AM       sum  26959.88
06:40:01 AM       sum  26723.60
06:50:01 AM       sum  26422.57
07:00:01 AM       sum  26052.94
07:10:07 AM       sum  27915.00
07:20:01 AM       sum  25868.20
07:30:06 AM       sum  25811.18
07:40:05 AM       sum  25843.82
07:50:01 AM       sum  25814.03
08:00:01 AM       sum  25554.51
08:10:01 AM       sum  24948.75
08:20:01 AM       sum  25413.89
08:30:06 AM       sum  25860.78
08:40:01 AM       sum  25819.49
Average:          sum  25512.26

sar -w output:

Linux 2.6.18-308.24.1.el5 (echo.uk7.org)        10/17/2013

12:00:01 AM   cswch/s
12:10:01 AM 150959.09
12:20:01 AM 108496.38
12:30:05 AM  32508.30
12:40:05 AM  17555.99
12:50:01 AM  21667.90
01:00:01 AM  89007.13
01:10:02 AM  95902.66
01:20:01 AM  83193.93
01:30:01 AM  76984.23
01:40:01 AM  82111.94
01:50:01 AM  77520.72
02:00:03 AM  39197.94
02:10:01 AM  22047.28
02:20:01 AM  21469.65
02:30:01 AM  26522.87
02:40:01 AM  63104.71
02:50:01 AM  85472.19
03:00:01 AM  40869.59
03:10:01 AM  34278.48
03:20:01 AM  15844.37
03:30:04 AM  16504.44
03:40:05 AM  25177.02
03:50:02 AM  18018.24
04:00:04 AM  27187.20
04:10:05 AM  29010.02
04:20:03 AM  40022.62
04:30:01 AM  69535.67
04:40:09 AM  96043.34
04:50:10 AM  82239.90
05:00:01 AM 128834.10
05:10:01 AM 167916.98
05:20:01 AM 130773.27
05:30:03 AM 125977.75
05:40:03 AM 112561.88
05:50:01 AM  94872.38
06:00:01 AM  98417.10
06:10:04 AM  91611.66
06:20:01 AM  94804.15
06:30:03 AM  75834.69
06:40:01 AM  54488.51
06:50:01 AM  24460.81
07:00:01 AM  16950.60
07:10:07 AM  24471.96
07:20:01 AM  16379.81
07:30:06 AM  15711.76
07:40:05 AM  15708.03
07:50:01 AM  16305.04
08:00:01 AM  18454.64
08:10:01 AM  73621.10
08:20:01 AM  57868.75
08:30:06 AM  15440.36
08:40:01 AM  14954.61
08:50:01 AM  14906.57
Average:     58290.70

sar -d 5 0 output:

root@echo [~]# sar -d 5 0
Linux 2.6.18-308.24.1.el5 (echo.uk7.org)        10/17/2013

08:52:50 AM       DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
08:52:55 AM    dev8-0    104.40      0.00   1760.00     16.86     19.86    190.26      1.64     17.12
08:52:55 AM    dev8-1      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
08:52:55 AM    dev8-2    104.40      0.00   1760.00     16.86     19.86    190.26      1.64     17.12
08:52:55 AM  dev253-0    220.00      0.00   1760.00      8.00     40.12    182.36      0.78     17.12
08:52:55 AM  dev253-1      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00

08:52:55 AM       DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
08:53:00 AM    dev8-0     98.40      0.00   1771.20     18.00     17.44    177.22      1.62     15.92
08:53:00 AM    dev8-1      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
08:53:00 AM    dev8-2     98.40      0.00   1771.20     18.00     17.44    177.22      1.62     15.92
08:53:00 AM  dev253-0    221.40      0.00   1771.20      8.00     36.61    165.36      0.72     15.92
08:53:00 AM  dev253-1      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00

08:53:00 AM       DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
08:53:05 AM    dev8-0    109.20      0.00   1916.80     17.55     18.26    167.25      1.75     19.14
08:53:05 AM    dev8-1      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
08:53:05 AM    dev8-2    109.20      0.00   1916.80     17.55     18.26    167.25      1.75     19.14
08:53:05 AM  dev253-0    239.60      0.00   1916.80      8.00     26.30    109.78      0.80     19.14
08:53:05 AM  dev253-1      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00

08:53:05 AM       DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
08:53:10 AM    dev8-0    104.79      0.00   2000.80     19.09     18.60    177.46      1.68     17.62
08:53:10 AM    dev8-1      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
08:53:10 AM    dev8-2    104.79      0.00   2000.80     19.09     18.60    177.46      1.68     17.62
08:53:10 AM  dev253-0    250.10      0.00   2000.80      8.00     38.19    152.70      0.70     17.62
08:53:10 AM  dev253-1      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00

08:53:10 AM       DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
08:53:15 AM    dev8-0    174.35      0.00   3148.70     18.06     21.08    120.73      1.63     28.36
08:53:15 AM    dev8-1      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
08:53:15 AM    dev8-2    174.35      0.00   3148.70     18.06     21.08    120.73      1.63     28.36
08:53:15 AM  dev253-0    393.59      0.00   3148.70      8.00     39.29     99.81      0.72     28.36
08:53:15 AM  dev253-1      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00

sar -W output:

12:00:01 AM  pswpin/s pswpout/s
12:10:01 AM      0.00      0.00
12:20:01 AM      0.00      0.00
12:30:05 AM      0.00      0.00
12:40:05 AM      0.00      0.00
12:50:01 AM      0.00      0.00
01:00:01 AM      0.00      0.00
01:10:02 AM      0.00      0.00
01:20:01 AM      0.00      0.00
01:30:01 AM      0.00      0.00
01:40:01 AM      0.00      0.00
01:50:01 AM      0.00      0.00
02:00:03 AM      0.00      0.00
02:10:01 AM      0.00      0.00
02:20:01 AM      0.00      0.00
02:30:01 AM      0.00      0.00
02:40:01 AM      0.00      0.00
02:50:01 AM      0.00      0.00
03:00:01 AM      0.00      0.00
03:10:01 AM      0.00      0.00
03:20:01 AM      0.00      0.00
03:30:04 AM      0.00      0.00
03:40:05 AM      0.00      0.00
03:50:02 AM      0.00      0.00
04:00:04 AM      0.00      0.00
04:10:05 AM      0.00      0.00
04:20:03 AM      0.00      0.00
04:30:01 AM      0.00      0.00
04:40:09 AM      0.00      0.00
04:50:10 AM      0.00      0.00
05:00:01 AM      0.00      0.00
05:10:01 AM      0.00      0.00
05:20:01 AM      0.00      0.00
05:30:03 AM      0.00      0.00
05:40:03 AM      0.00      0.00
05:50:01 AM      0.00      0.00
06:00:01 AM      0.00      0.00
06:10:04 AM      0.00      0.00
06:20:01 AM      0.00      0.00
06:30:03 AM      0.00      0.00
06:40:01 AM      0.00      0.00
06:50:01 AM      0.00      0.00
07:00:01 AM      0.00      0.00
07:10:07 AM      0.00      0.00
07:20:01 AM      0.00      0.00
07:30:06 AM      0.01      0.00
07:40:05 AM      0.00      0.00
07:50:01 AM      0.00      0.00
08:00:01 AM      0.00      0.00
08:10:01 AM      0.00      0.00
08:20:01 AM      0.00      0.00
08:30:06 AM      0.00      0.00
08:40:01 AM      0.00      0.00
08:50:01 AM      0.00      0.00
Average:         0.00      0.00

Just wondering if anything really stands out to someone with more knowledge than me. As I said above, I'm thinking it's either a slow HDD, where an SSD might do better, or too many interrupts.

The server is primarily a web hosting server hosting web based proxies. It's running Apache 2.2.23 with mod_ruid2 and nginxcp (cpanel addon).

Thanks.

Analog
  • From top: `Tasks: 3214 total`. Do you think your server supports running that many processes/threads? – Matthew Ife Oct 17 '13 at 16:16
  • I have 2 other servers for web proxies. One is identical to the one I'm describing above, and the highest its load ever gets is 30, which doesn't show much performance loss. The other is the same except for having somewhat less traffic (-20%) and an SSD as the primary drive, and it has much lower loads (<10). I've done testing and managed to run stable with an Apache max clients of 3000 on all 3 servers. – Analog Oct 17 '13 at 19:15
  • Seeing as about 75 of those 3000 threads are waiting on I/O, I imagine that server is not in a position to serve that much. Plus your number of context switches per second is unreal; you're spending about 10% of your CPU time merely switching tasks. Just because you can do such a thing does not mean you should. – Matthew Ife Oct 17 '13 at 19:25
  • Thanks for the info, duly noted. Where does the number 75 come from? How do you know that 75 threads are waiting on I/O? Is it the "D"s? – Analog Oct 17 '13 at 19:43
  • In top, you have 8 tasks in the run queue. That counts for a load just less than 8. Deduct that from 84, your reported load. The remainder of the load count is accumulated from tasks waiting on I/O. – Matthew Ife Oct 17 '13 at 20:51
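
The arithmetic in the comment above can be sketched in Python. This is only a rough estimate under stated assumptions: the function name `estimate_blocked_tasks` is made up for illustration, the input is the two summary lines of the `top` output from the question, and the load average is a damped average while the running count is an instantaneous snapshot, so the result is approximate.

```python
import re

def estimate_blocked_tasks(top_header: str) -> float:
    """Rough estimate of tasks waiting on I/O: 1-minute load minus running tasks."""
    # First number after "load average:" is the 1-minute figure.
    load_1min = float(re.search(r"load average:\s*([\d.]+)", top_header).group(1))
    # "N running" on the Tasks line is the current run-queue count.
    running = int(re.search(r"(\d+)\s+running", top_header).group(1))
    return load_1min - running

# Summary lines copied from the top output in the question.
header = (
    "top - 08:40:31 up 132 days,  2:06,  2 users,  load average: 84.25, 63.29, 63.02\n"
    "Tasks: 3214 total,   8 running, 3206 sleeping,   0 stopped,   0 zombie"
)

print(round(estimate_blocked_tasks(header)))  # prints 76
```

That is where the "about 75" figure comes from: 84.25 minus 8 running tasks leaves roughly 76 tasks that are counted in the load but are not on a CPU.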

1 Answer

It looks to me like you're I/O bound. From top, you can see many tasks with the D flag. This means they're in uninterruptible sleep, typically blocked waiting for a response from the disk. The "load average" is roughly the average number of tasks that were either runnable or in that uninterruptible state over the last 1, 5, and 15 minutes.
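
As a rough way to quantify this, the state column of a saved `top` snapshot can be tallied with a few lines of Python. This is a minimal sketch, not a standard tool: `count_states` is a hypothetical helper, and it assumes the 12-column layout shown in the question, where the state flag is the eighth field.

```python
from collections import Counter

def count_states(top_rows: str) -> Counter:
    """Tally the S column (R, S, D, ...) from top's process rows."""
    counts = Counter()
    for row in top_rows.splitlines():
        fields = row.split()
        # Process rows start with a numeric PID; this skips the header row.
        if len(fields) >= 12 and fields[0].isdigit():
            counts[fields[7]] += 1
    return counts

# A few rows copied from the top output in the question.
sample = (
    "  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND\n"
    " 1846 nobody    16   0  125m  12m 4944 S  3.9  0.0   0:00.47 httpd\n"
    "20137 everprox  16   0  127m  13m 4908 D  2.6  0.0   0:00.76 httpd\n"
    "16574 root      15   0 15120 3480  812 R  2.0  0.0   0:00.15 top\n"
    "16575 everprox  16   0  126m  11m 3992 D  1.6  0.0   0:00.05 httpd\n"
)

states = count_states(sample)
print(states["D"], "of", sum(states.values()), "sampled tasks are in D state")
```

Running it against a full snapshot taken in batch mode would give a better picture than the first screen of the interactive display, which only shows the busiest tasks.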

You also have tons (maybe too many) of worker threads if they're all Apache. Look into tuning your server a bit or acquiring faster hardware.

Nathan C
  • I was unaware of what the little "D"s meant in top, but I see them more often now that I know what they are. Sounds like I will be giving the DC a call and arranging to have an SSD installed. Thanks for the input. – Analog Oct 17 '13 at 19:18
  • @Analog, that probably won't work. Your disk utilization is still only 20%. You are spending too much time switching between tasks. Reduce that first; if your disk device utilization is 100% then consider an SSD. – Matthew Ife Oct 17 '13 at 20:56
  • @Analog Your best bet would be some RAID configuration to speed up your IOPS capability. The other way is to use a second server for your needs. – Nathan C Oct 17 '13 at 23:31
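
The disagreement in these comments can be checked against the `sar -d` figures in the question. The sketch below is a worked calculation under stated assumptions, not a diagnosis: `queue_check` is a hypothetical helper, the queue length uses Little's law (requests in flight = arrival rate × time in system), and, as far as I know, sysstat derives `svctm` from `%util` and `tps`, so the utilization identity holds by construction.

```python
def queue_check(tps: float, await_ms: float, svctm_ms: float):
    """Relate sar -d columns: returns (queue length, % utilization, ms spent queued)."""
    queue_len = tps * await_ms / 1000.0     # Little's law, compare with avgqu-sz
    util_pct = tps * svctm_ms / 10.0        # busy ms per second, as a percentage
    queued_ms = await_ms - svctm_ms         # time waiting before being serviced
    return queue_len, util_pct, queued_ms

# dev8-0 at 08:52:55 AM from the sar -d output in the question.
q, u, w = queue_check(tps=104.40, await_ms=190.26, svctm_ms=1.64)
print(round(q, 2), round(u, 2), round(w, 2))  # prints 19.86 17.12 188.62
```

The first two numbers match the reported `avgqu-sz` (19.86) and `%util` (17.12). The third shows that a request spends about 188 ms queued for about 1.6 ms of service, which fits both observations: the device is idle most of the time, yet tasks still wait a long time on I/O, possibly because the writes arrive in bursts.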