
I have a 64-bit Linux machine (CentOS 5.5) with a 2.83 GHz Q9550, 6 GB of RAM, and a single 500 GB SATA drive.

From this machine I only serve thumbnails, most around 10 KB in size, and at this point there are about 7 million thumbnails on the server. I have them set up in a /25/25/25/25 directory layout, which was recommended to me.
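To give an idea of the layout, a file name is mapped to a nested path something like this (a simplified sketch; the hashing scheme, base path and file name below are made up, not my actual code):

    #!/bin/bash
    # Hypothetical mapping of a thumbnail name into a /25/25/25/25 tree:
    # hash the name, take four bytes, and use each modulo 25 as one level.
    name="thumb_123456.jpg"                        # made-up file name
    h=$(printf '%s' "$name" | md5sum | cut -c1-8)  # first 8 hex chars of the hash
    d1=$(( 0x${h:0:2} % 25 ))
    d2=$(( 0x${h:2:2} % 25 ))
    d3=$(( 0x${h:4:2} % 25 ))
    d4=$(( 0x${h:6:2} % 25 ))
    echo "/var/www/thumbs/$d1/$d2/$d3/$d4/$name"   # prints the full nested path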

On average, the nginx status report shows that I'm serving about 300 to 400 active connections.

EXAMPLE:

Active connections: 297 
server accepts handled requests
 1975808 1975808 3457352 
Reading: 39 Writing: 8 Waiting: 250 

Now the problem is that this machine is having a very hard time and is getting slower as my site gets busier. The load average is always around 8 to 9.

I noticed iostat showing the drive pinned at essentially 100% utilization:

Device:         rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.20     1.40 99.80 31.14  1221.56   255.49    11.28   114.14  831.81   7.62  99.84

Device:         rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.20     0.60 100.80 24.00  1192.00   203.20    11.18   113.77  775.42   8.02 100.04

Device:         rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.20   314.80 44.80 130.00   598.40  3547.20    23.72   113.76  937.18   5.72 100.02

Device:         rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00     5.40 56.20 110.80   660.80   937.60     9.57   112.37  518.01   5.99 100.04

Device:         rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.60    12.40 82.80 41.60  1008.00   432.00    11.58   113.66  852.51   8.04 100.04

Below you can see some of my nginx config settings:

worker_processes  6;
worker_connections  4096;

http {
        include                 mime.types;
        default_type            application/octet-stream;
        #access_log             logs/access.log  main;
        sendfile                on;
        #tcp_nopush             on;
        keepalive_timeout       4;
        gzip                    on;
        gzip_http_version       1.1;
        gzip_vary               on;
        gzip_comp_level         2;
        gzip_proxied            any;
        gzip_types              text/plain text/html text/css application/json application/x-javascript text/xml application/xml application/xml+rss text/javascript;
        gzip_buffers            16 8k;
}

My question is: apart from moving to a RAID setup, and possibly SSDs, is there anything I can tweak/tune to get more out of this machine? I have a feeling a server like mine should be able to handle much more than 300 to 400 concurrent nginx connections.

Mr.Boon

1 Answer

  • Disable the access_log (see the nginx sketch after this list)
  • Use open_file_cache
  • Mount with the async,noatime options
  • Increase vm.dirty_writeback_centisecs (e.g. to 15000; see the sysctl sketch after this list)
  • Use expires
  • Upgrade hardware (more memory, up to 24 GB; RAID or SSD)
  • Use gzip_static
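A sketch of what the nginx side of these suggestions could look like (the /thumbs/ location and every value here are illustrative starting points, not taken from the question):

    http {
            # Skip the per-request log write entirely
            access_log              off;

            # Cache open file descriptors, sizes and mtimes for hot thumbnails
            open_file_cache         max=10000 inactive=60s;
            open_file_cache_valid   120s;
            open_file_cache_min_uses 2;
            open_file_cache_errors  on;

            # Serve pre-compressed .gz files instead of compressing per request
            # (requires nginx built with --with-http_gzip_static_module)
            gzip_static             on;

            server {
                    location /thumbs/ {
                            # Let browsers and proxies cache thumbnails
                            expires 30d;
                    }
            }
    }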
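And the OS-side pieces, again as a sketch (the device, filesystem and mount point are placeholders for whatever the server actually uses):

    # /etc/fstab -- add noatime; async is already the default
    /dev/sda1   /   ext3   defaults,async,noatime   1 1

    # /etc/sysctl.conf -- flush dirty pages every 150s instead of the
    # default 5s, then apply with: sysctl -p
    vm.dirty_writeback_centisecs = 15000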

I use btrfs software RAID on SSDs (`mkfs.btrfs -m single /dev/sde -d raid0 /dev/sdd /dev/sdc`).

alvosu
  • Thanks for the reply. Changing those mount options, can that be done on an actively used drive? Also, where can I change the value of vm.dirty_writeback_centisecs? – Mr.Boon Jan 29 '11 at 19:59
  • "Changing those mount options, can that be done on an actively used drive?" Yes. "Also where can I change the value vm.dirty_writeback_centisecs?" Edit /etc/sysctl.conf and run sysctl -p. – alvosu Jan 29 '11 at 20:04
  • Also, try to use gzip_static. – alvosu Jan 29 '11 at 20:05
  • Ok, thank you. I followed this tutorial, http://www.howtoforge.com/reducing-disk-io-by-mounting-partitions-with-noatime but the mount -o remount / command is taking a very long time. Is it supposed to be like that? It's been going for about 4 minutes now. – Mr.Boon Jan 29 '11 at 20:11
  • Never mind, it's done :) Just the noatime change made the load drop from 9 to 2. Thanks a million alvosu! – Mr.Boon Jan 29 '11 at 20:19
  • `atime` is an unfortunate default for journalling filesystems, everybody should always set `noatime` or `relatime`. If not, the filesystem does a metadata (and thus synchronous) write for __every__ file access, _even if the file is cached_! – Javier Jan 29 '11 at 21:55
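For reference, the remount discussed in the comments can be done in place, without a reboot, once noatime has been added to the filesystem's /etc/fstab entry (assuming the root filesystem, as in the tutorial above):

    # Re-apply the mount options to the live filesystem; this can block
    # for a while if there is a large dirty-page backlog to flush.
    mount -o remount,noatime /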