
I have a 64-bit Linux machine (CentOS 5.5) with a 2.83 GHz Q9550, 6 GB of RAM, and a single 500 GB SATA drive.

From this machine I only serve thumbnails, most around 10 KB in size, and at this point there are about 7 million thumbnails on the server. I have them set up in a /25/25/25/25 directory layout, which was recommended to me.
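To give an idea of the layout, a file name is mapped to a nested path something like this (a simplified sketch; the hashing scheme, base path and file name below are made up, not my actual code):

    #!/bin/bash
    # Hypothetical mapping of a thumbnail name into a /25/25/25/25 tree:
    # hash the name, take four bytes, and use each modulo 25 as one level.
    name="thumb_123456.jpg"                        # made-up file name
    h=$(printf '%s' "$name" | md5sum | cut -c1-8)  # first 8 hex chars of the hash
    d1=$(( 0x${h:0:2} % 25 ))
    d2=$(( 0x${h:2:2} % 25 ))
    d3=$(( 0x${h:4:2} % 25 ))
    d4=$(( 0x${h:6:2} % 25 ))
    echo "/var/www/thumbs/$d1/$d2/$d3/$d4/$name"   # prints the full nested path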

On average, the nginx status report shows that I'm serving about 300 to 400 active connections.

EXAMPLE:

Active connections: 297 
server accepts handled requests
 1975808 1975808 3457352 
Reading: 39 Writing: 8 Waiting: 250 

Now the problem is that this machine is having a very hard time and is getting slower as my site gets busier. The load average is always around 8 to 9.

I noticed iostat showing the drive pinned at essentially 100% utilization:

Device:         rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.20     1.40 99.80 31.14  1221.56   255.49    11.28   114.14  831.81   7.62  99.84

Device:         rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.20     0.60 100.80 24.00  1192.00   203.20    11.18   113.77  775.42   8.02 100.04

Device:         rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.20   314.80 44.80 130.00   598.40  3547.20    23.72   113.76  937.18   5.72 100.02

Device:         rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00     5.40 56.20 110.80   660.80   937.60     9.57   112.37  518.01   5.99 100.04

Device:         rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.60    12.40 82.80 41.60  1008.00   432.00    11.58   113.66  852.51   8.04 100.04

Below you can see some of my nginx config settings:

worker_processes  6;
worker_connections  4096;

http {
        include                 mime.types;
        default_type            application/octet-stream;
        #access_log             logs/access.log  main;
        sendfile                on;
        #tcp_nopush             on;
        keepalive_timeout       4;
        gzip                    on;
        gzip_http_version       1.1;
        gzip_vary               on;
        gzip_comp_level         2;
        gzip_proxied            any;
        gzip_types              text/plain text/html text/css application/json application/x-javascript text/xml application/xml application/xml+rss text/javascript;
        gzip_buffers            16 8k;
}

My question is: apart from moving to a RAID setup, and possibly SSDs, is there anything I can tweak/tune to get more out of this machine? I have a feeling a server like mine should be able to handle much more than 300 to 400 concurrent nginx connections.

Mr.Boon

1 Answer

  • Disable the access_log (see the nginx sketch after this list)
  • Use open_file_cache
  • Mount with the async,noatime options
  • Increase vm.dirty_writeback_centisecs (e.g. to 15000; see the sysctl sketch after this list)
  • Use expires
  • Upgrade hardware (more memory, up to 24 GB; RAID or SSD)
  • Use gzip_static
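A sketch of what the nginx side of these suggestions could look like (the /thumbs/ location and every value here are illustrative starting points, not taken from the question):

    http {
            # Skip the per-request log write entirely
            access_log              off;

            # Cache open file descriptors, sizes and mtimes for hot thumbnails
            open_file_cache         max=10000 inactive=60s;
            open_file_cache_valid   120s;
            open_file_cache_min_uses 2;
            open_file_cache_errors  on;

            # Serve pre-compressed .gz files instead of compressing per request
            # (requires nginx built with --with-http_gzip_static_module)
            gzip_static             on;

            server {
                    location /thumbs/ {
                            # Let browsers and proxies cache thumbnails
                            expires 30d;
                    }
            }
    }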
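And the OS-side pieces, again as a sketch (the device, filesystem and mount point are placeholders for whatever the server actually uses):

    # /etc/fstab -- add noatime; async is already the default
    /dev/sda1   /   ext3   defaults,async,noatime   1 1

    # /etc/sysctl.conf -- flush dirty pages every 150s instead of the
    # default 5s, then apply with: sysctl -p
    vm.dirty_writeback_centisecs = 15000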

I use btrfs software RAID on SSDs (`mkfs.btrfs -m single /dev/sde -d raid0 /dev/sdd /dev/sdc`).

alvosu
  • Thanks for the reply. Changing those mount options, can that be done on an actively used drive? Also, where can I change the value of vm.dirty_writeback_centisecs? – Mr.Boon Jan 29 '11 at 19:59
  • "Changing those mount options, can that be done on an actively used drive?" Yes. "Also where can I change the value vm.dirty_writeback_centisecs?" Edit /etc/sysctl.conf and run sysctl -p. – alvosu Jan 29 '11 at 20:04
  • Also, try to use gzip_static. – alvosu Jan 29 '11 at 20:05
  • Ok, thank you. I followed this tutorial, http://www.howtoforge.com/reducing-disk-io-by-mounting-partitions-with-noatime but the mount -o remount / command is taking a very long time. Is it supposed to be like that? It's been going for about 4 minutes now. – Mr.Boon Jan 29 '11 at 20:11
  • Never mind, it's done :) Just the noatime change made the load drop from 9 to 2. Thanks a million alvosu! – Mr.Boon Jan 29 '11 at 20:19
  • `atime` is an unfortunate default for journalling filesystems, everybody should always set `noatime` or `relatime`. If not, the filesystem does a metadata (and thus synchronous) write for __every__ file access, _even if the file is cached_! – Javier Jan 29 '11 at 21:55
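For reference, the remount discussed in the comments can be done in place, without a reboot, once noatime has been added to the filesystem's /etc/fstab entry (assuming the root filesystem, as in the tutorial above):

    # Re-apply the mount options to the live filesystem; this can block
    # for a while if there is a large dirty-page backlog to flush.
    mount -o remount,noatime /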