
I have an nginx server running on AWS, and it was working fine until recently, when a couple of users started complaining that the website would not open until they made some 10 attempts to access it.

I was never able to reproduce the issue from my side. I am using Google's DNS (8.8.8.8), and when I switched one of the affected users over to it, the site worked fine for them. That could be the reason, or it could just be a coincidence.
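A quick way to check whether DNS is actually the difference is to compare what the two resolvers return (somedomain.com below stands in for the real domain, as in the config further down):

    # Compare what Google's resolver and the default resolver return
    dig +short somedomain.com @8.8.8.8
    dig +short somedomain.com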

I found this in the error log -

2014/05/29 13:46:15 [info] 6940#0: *150649 client timed out (110: Connection timed out) while waiting for request, client: xx.xxx.xxx.xx, server: 0.0.0.0:80
2014/05/29 13:46:20 [info] 6940#0: *150670 client closed connection while waiting for request, client: xx.xxx.xxx.xx, server: 0.0.0.0:80
2014/05/29 13:46:20 [info] 6940#0: *150653 client closed connection while waiting for request, client: xx.xxx.xxx.xx, server: 0.0.0.0:80
2014/05/29 13:46:20 [info] 6940#0: *150652 client closed connection while waiting for request, client: xx.xxx.xxx.xx, server: 0.0.0.0:80

And in some places even this -

2014/05/29 13:46:53 [info] 6940#0: *150665 client closed connection while waiting for request, client: xx.xxx.xxx.xx, server: 0.0.0.0:80
2014/05/29 13:46:53 [info] 6940#0: *150660 client xx.xxx.xxx.xx closed keepalive connection

Note - I have replaced the client's IP with xx.xxx.xxx.xx

Here is the nginx config -

server {
    listen       80;
    server_name  somedomain.com  www.somedomain.com;

    #charset koi8-r;
    #access_log  /var/log/nginx/log/host.access.log  main;

    root        /var/www/somedomain/current/app/webroot;
    index       index.php index.html index.htm;

    ... couple of location rules ...
}

I would really appreciate any help.

Thanks

Nitish Dhar
  • This could be a problem with the developers' connection to the server, not the server. Since you cannot recreate the problem and the server itself is registering a client connection timeout, we need to suspect the developer may be behind a firewall and they have internal networking issues that cause this. – Andrew S Aug 20 '14 at 14:15
  • You can try disabling Keep-Alive just as a test for this issue. I'm not sure what traffic is hitting your webserver, but Keep-Alive could be causing you to hit the concurrency limit in your nginx config (a minimal config sketch for this test follows after these comments). Here is more info: http://nginx.com/blog/http-keepalives-and-web-performance/ – Alfonso Feb 18 '15 at 17:19
  • @NitishDhar Did you get to solve this problem? I am also facing the same issue and am just clueless. I will be glad if you can share the solution. – Ethan Collins Jun 11 '15 at 16:34
  • Questions: is the server behind a load balancer or a firewall? Is NAT involved? Is there a tunnel of any sort between the server and the Internet? The reason I ask is that this sounds like the sort of thing that happens when there is a tunnel someplace in the path and someone has blocked all ICMP, which breaks Path MTU discovery. – GeorgeB Jun 19 '15 at 18:03
  • Also, what is the output of cat /proc/sys/net/ipv4/tcp_mtu_probing? – GeorgeB Jun 19 '15 at 18:05
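
Following up on the keep-alive suggestion in the comments above, a minimal sketch of that test, reusing the server block from the question (revert it once the test is done):

    server {
        listen       80;
        server_name  somedomain.com  www.somedomain.com;

        # Temporarily disable HTTP keep-alive so every request gets a
        # fresh connection; 0 turns keep-alive off entirely.
        keepalive_timeout  0;

        ... rest of the config unchanged ...
    }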

3 Answers


Based on the nginx log you provided, it seems that the connections between your server and these users are unstable or slow. Try a traceroute from your server to the affected client's IP address or their gateway. Also, ping the client's IP address over a long period to see the packet loss rate and response times. MTU may be another source of this problem: test whether you can reach your client with MTU=1500 (on a Mac: ping -D -s 1472 xx.xx.xx.xx). A rough command sketch follows below.
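
Here is a sketch of those checks as shell commands; xx.xx.xx.xx is the placeholder client IP, the Linux ping flags are an addition, and the Mac variant is the one mentioned above:

    # Path from the server to the client (or their gateway)
    traceroute xx.xx.xx.xx

    # Long-running ping to measure packet loss and response time
    ping -c 600 xx.xx.xx.xx

    # MTU test: 1472 bytes of payload + 28 bytes of ICMP/IP headers = 1500,
    # with fragmentation disallowed
    ping -M do -s 1472 xx.xx.xx.xx    # Linux
    ping -D -s 1472 xx.xx.xx.xx       # Mac

    # On the server: is TCP MTU probing enabled? (asked in the comments)
    cat /proc/sys/net/ipv4/tcp_mtu_probing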

BTW: if your server or the client resides in China, this problem is usually not your fault. The GFW is known to randomly discard packets at the border, intentionally making international connection quality worse.

Lingfeng Xiong

As speculated in the comments, it's likely an issue on the user's end and their client is closing the connection (whether intentionally or not). Try to reproduce the problem reliably. Rule out it happening elsewhere; if it happens only at that location, they'll need to troubleshoot on their end. Try from different browsers and computers, and then test network reliability (a quick sketch of such a test follows below).
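
One way to run that reliability test from the affected user's machine, as a sketch (mtr is a suggested tool here, and somedomain.com stands in for the real domain):

    # Combined traceroute/ping: 100 probes per hop, wide report output,
    # showing loss and latency for every hop on the path
    mtr -r -w -c 100 somedomain.com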

Peter

These log entries look similar to the entries that show up when I use tools like OpenVAS to scan a server. These tools make bad connections, go slow, or otherwise operate poorly; nginx is just reporting that some connection was not playing nice. If all of the traffic is from the same source, arrives rapidly, and has no legitimate requests to match in the access log, it's likely just a bot/scanner kind of thing (a sketch of that access-log cross-check follows below).

These scanners could also be putting your application under load, which could make it slow for other legitimate traffic.
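
A quick sketch of that cross-check, assuming the default access-log path and the placeholder IP from the question:

    # What did this client actually request, and with what status codes?
    # Compare the volume and timing against the error-log entries.
    grep 'xx.xxx.xxx.xx' /var/log/nginx/access.log \
        | awk '{print $7, $9}' | sort | uniq -c | sort -rn | head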

edoceo