
I've tried both lighttpd and nginx as web servers, with exactly the same result: in the morning, when the load is low, files download lightning fast. Later, when the load is only a bit higher, the server starts serving files extremely slowly, if at all.

We're running Riak on the server, some Mono applications that constantly send UDP packets to users' applications, and a web server that executes PHP scripts and hosts files: lots of images, which users constantly access and modify, and some application-specific files averaging 20-30 MB, which are also accessed constantly. The load is no more than 5-10k per day for now.

  • The CPU is always more than 90% idle (top, dstat), so I guess it is doing fine.
  • There is more than enough memory (htop, free); not even half of it is used.
  • Network speed = 1 Gbit/s, duplex = full, autonegotiation = off
  • HDD:
    Timing cached reads: 28842 MB in 2.00 seconds = 14436.45 MB/sec
    Timing buffered disk reads: 766 MB in 3.01 seconds = 254.78 MB/sec

  • Ubuntu 14+ unicorn, ulimit -n 65536, somaxconn = 40000

I'm getting desperate now. I thought it was something in my lighttpd configuration, but after moving to nginx the situation didn't change at all. I've also tried aio with nginx, but unfortunately with no improvement. Where should I look?

Lighttpd conf: http://www.pastebin.ca/2962652
Nginx conf: http://pastebin.ca/2962656

upd 01:

netstat -i
Kernel Interface table
Iface   MTU Met   RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
eth0       1500 0  459293618      0    860 0      795794415      0      0      0 BMRU
lo        65536 0  38105807      0      0 0      38105807      0      0      0 LRU
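To put the counters above in proportion, here is a minimal Python sketch that parses `netstat -i` style output and reports the receive drop ratio per interface. The function names `parse_netstat_i` and `rx_drop_ratio` are made up for illustration, and the ratio assumes that RX-OK does not already include dropped packets.

```python
# Sample copied from the `netstat -i` output in the post.
NETSTAT_OUTPUT = """\
Kernel Interface table
Iface   MTU Met   RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
eth0       1500 0  459293618      0    860 0      795794415      0      0      0 BMRU
lo        65536 0  38105807      0      0 0      38105807      0      0      0 LRU
"""


def parse_netstat_i(text):
    """Return {iface: {column: int}} for the numeric columns of `netstat -i`."""
    lines = text.splitlines()
    # Skip any banner lines until the column header is found.
    start = next(i for i, ln in enumerate(lines) if ln.startswith("Iface"))
    header = lines[start].split()
    stats = {}
    for line in lines[start + 1:]:
        fields = line.split()
        if len(fields) != len(header):
            continue  # ignore blank or malformed rows
        row = dict(zip(header, fields))
        stats[row["Iface"]] = {
            key: int(value)
            for key, value in row.items()
            if key not in ("Iface", "Flg")
        }
    return stats


def rx_drop_ratio(counters):
    """Fraction of received packets dropped (assumes RX-OK excludes drops)."""
    total = counters["RX-OK"] + counters["RX-DRP"]
    return counters["RX-DRP"] / total if total else 0.0


stats = parse_netstat_i(NETSTAT_OUTPUT)
for iface, counters in stats.items():
    print(
        f"{iface}: RX-ERR={counters['RX-ERR']} TX-ERR={counters['TX-ERR']} "
        f"RX drop ratio={rx_drop_ratio(counters):.6%}"
    )
```

With the numbers posted here, the 860 dropped packets on eth0 are a tiny fraction of the roughly 459 million received, so the interface counters alone do not point at a problem.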

upd 02:

-iostat
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           4.80    0.00    1.09    0.01    0.00   94.11

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda               7.50         6.91       371.47    2938271  157867024
sdb               7.35         2.88       371.47    1224325  157867024
sdc               7.32         2.56       371.47    1089356  157867024
md2               1.15         0.13       227.20      56169   96553984
md1              11.91        10.36       140.27    4402931   59610720

-atop
PRC | sys    0.79s  | user   4.37s  | #proc    182  | #tslpu     0  | #zombie    0  | #exit      ?  |
CPU | sys       7%  | user     41%  | irq       3%  | idle    749%  | wait      0%  | curscal   ?%  |
CPL | avg1    0.39  | avg5    0.34  | avg15   0.39  | csw   245907  | intr   63997  | numcpu     8  |
MEM | tot    31.3G  | free   22.6G  | cache   4.0G  | dirty   1.4M  | buff  295.8M  | slab  184.0M  |
SWP | tot     1.5G  | free    1.5G  |               |               | vmcom   3.9G  | vmlim  17.2G  |
MDD |          md2  | busy      0%  | read       0  | write     14  | MBw/s   0.01  | avio 0.00 ms  |
MDD |          md1  | busy      0%  | read       0  | write     93  | MBw/s   0.07  | avio 0.00 ms  |
DSK |          sda  | busy      0%  | read       0  | write     41  | MBw/s   0.09  | avio 0.10 ms  |
DSK |          sdb  | busy      0%  | read       0  | write     41  | MBw/s   0.09  | avio 0.10 ms  |
DSK |          sdc  | busy      0%  | read       0  | write     41  | MBw/s   0.09  | avio 0.10 ms  |
NET | transport     | tcpi   17140  | tcpo   35894  | udpi    5175  | udpo    4868  | tcpao      2  |
NET | network       | ipi    22311  | ipo    24687  | ipfrw      0  | deliv  22310  | icmpo      0  |
NET | eth0      4%  | pcki   20325  | pcko   39045  | si 2061 Kbps  | so   41 Mbps  | erro       0  |
NET | lo      ----  | pcki    1987  | pcko    1987  | si  298 Kbps  | so  298 Kbps  | erro       0  |
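As a rough sanity check of the atop figures, the sketch below relates the eth0 rates to the 1 Gbit/s link and estimates the best-case transfer time for one application file. It is only arithmetic on the posted numbers; `ASSUMED_FILE_MB` is an assumption (the midpoint of the 20-30 range, read as megabytes), and the helper names are hypothetical.

```python
LINK_CAPACITY_MBPS = 1000.0    # 1 Gbit/s, from the post
ATOP_OUT_MBPS = 41.0           # "so 41 Mbps" on eth0
ATOP_IN_MBPS = 2061 / 1000.0   # "si 2061 Kbps" on eth0, converted to Mbit/s
ASSUMED_FILE_MB = 25.0         # assumption: typical application file size in MB


def utilization(rate_mbps, capacity_mbps):
    """Share of the link capacity used by the given rate (0.0 to 1.0)."""
    return rate_mbps / capacity_mbps


def ideal_transfer_seconds(size_mb, rate_mbps):
    """Seconds to move size_mb megabytes at rate_mbps megabits per second."""
    return size_mb * 8.0 / rate_mbps


headroom_mbps = LINK_CAPACITY_MBPS - ATOP_OUT_MBPS
print(f"outbound utilization: {utilization(ATOP_OUT_MBPS, LINK_CAPACITY_MBPS):.1%}")
print(f"inbound utilization:  {utilization(ATOP_IN_MBPS, LINK_CAPACITY_MBPS):.2%}")
print(f"unused outbound capacity: {headroom_mbps:.0f} Mbit/s")
print(
    "best-case time for one file at the unused capacity: "
    f"{ideal_transfer_seconds(ASSUMED_FILE_MB, headroom_mbps):.2f} s"
)
```

The outbound figure agrees with the 4% that atop shows for eth0, so at the moment of this sample the local interface was far from saturated. This says nothing about the path beyond the server's own port.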
user275407
  • Are the files served directly from the file system or via some script? – Tero Kilkanen Mar 23 '15 at 06:50
  • @tero They are served directly, not using any script. Thank you – user275407 Mar 23 '15 at 06:57
  • Are you sure it's not an issue with the local network? – sebix Mar 23 '15 at 07:03
  • @sebix We are renting the server from OVH's subsidiary, soyoustart.com (datacenter RBX2). I'm not sure how to check whether it's a LAN issue. – user275407 Mar 23 '15 at 07:25
  • As sebix says, this feels network related. Does `netstat -i` report any errors? Ask your hosting company to double check the switch configuration and check for errors on the switch port. – Paul Haldane Mar 23 '15 at 07:42
  • @PaulHaldane, added the netstat info to the post. As far as I can see, there are no RX or TX errors. Writing to tech support right now. Thank you! – user275407 Mar 23 '15 at 07:53
  • Looks more like an IO issue. –  Mar 24 '15 at 15:28
  • @AndréDaniel, it easily could be. Support answered with a link for checking the speed to the datacenter, and the results were quite impressive. Still digging into the switch config with support. How can I check whether it is an IO issue? I have the aio lib installed and tried it with nginx, with pretty much the same results. Added `iostat` info to the post. Thank you! – user275407 Mar 25 '15 at 05:47
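One hedged way to approach the "is it IO?" question from the comments is to time a plain sequential read of a served file on the server itself and compare that rate with what clients get over HTTP during the slow period. The Python sketch below is only an illustration: `read_throughput` is a made-up helper, and the demo reads a temporary file rather than a real served file. A file that was read recently may come from the page cache, so the number can overstate what the disks deliver.

```python
import os
import tempfile
import time


def read_throughput(path, chunk_size=1024 * 1024):
    """Read `path` sequentially; return (bytes_read, seconds, MB_per_second)."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    mb_per_s = (total / 1e6) / elapsed if elapsed > 0 else float("inf")
    return total, elapsed, mb_per_s


# Demo on a temporary 8 MB file; on the server, pass the path of a served file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(8 * 1024 * 1024))
    demo_path = tmp.name

try:
    size, seconds, rate = read_throughput(demo_path)
    print(f"read {size} bytes in {seconds:.4f} s ({rate:.1f} MB/s)")
finally:
    os.remove(demo_path)
```

If the local read stays fast while downloads crawl, the disks are less likely to be the bottleneck, which would be consistent with the near-zero iowait and 0% disk busy in the iostat and atop output.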
