13

Description of issue

We have a web server that serves static assets. After browsing around the site for a while, some of the HTTP requests get stuck in the "pending" state. In the Chrome inspector the response headers do come back, but the requests never time out and appear to be still downloading. In the timeline view, "Waiting (TTFB)" is the last item filled in (400ms, for example), followed by the note "CAUTION: request is not finished yet!"

The issue seems limited to Chrome, and only when the site is served over HTTPS. We can't reproduce it in Safari, Firefox, or IE, and we can't reproduce it with HTTPS turned off.

Repro. steps taken

  1. open chrome incognito
  2. open inspector tools > net tab
  3. navigate to site
  4. usually the first page and all of its requests finish
  5. browse to another page
  6. unexpected behavior: parts of the page don't load, typically XHR requests for .html files and jpg images. When inspected in Chrome's net tab, they show as "pending"

Odd note:

  1. after following the steps above, if you open the "pending" request in a new tab, the tab "spins"
  2. if you close the first tab, the second tab with the "pending" URL resolves. This led us to look into keep-alives and timeouts, but to no avail.
  3. the full issue can sometimes appear on the very first request as well (the document itself)

Environment notes:

  • frontend is AngularJS, accessed via Chrome; other browsers don't seem to have this issue
  • server runs over HTTPS with a wildcard cert ( *.domain.com )
  • nginx version 1.9.3

    # some variables we've tweaked
    worker_processes 4;
    worker_connections 4000;
    keepalive_timeout 15;
    client_body_timeout 12;
    gzip on;
    
  • nginx logs don't complain about anything

  • the CPU / RAM never get anywhere close to maxing out when there is load on the server
  • the response headers include: etag, gzip, content-type, date, last-modified, server, status (200), strict-transport-security:max-age=604800 ...
  • toggling Chrome's "disable cache" checkbox doesn't seem to affect things
  • we've experienced this in many Chrome browsers on different computers. I'm running 44.0 64-bit on Mac

Based on these observations, the bug feels like some type of server configuration issue. We don't think it is cert related, but the fact that it only affects Chrome is really odd.

AKnox
  • Share full Nginx configuration, Linux kernel version, iostat, free -m – Anatoly Jul 24 '15 at 20:45
  • when you build nginx without spdy it works! there appears to be a way to get it to work otherwise, but we haven't come back around to working on it. – AKnox Sep 09 '15 at 12:46

2 Answers

6

There is tons of info on this or similar issues, and none of the solutions worked for me. After some digging, here is what did: add a `Connection: close` header to the server's response, i.e. headers[ 'Connection' ] = 'close'
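If the server is nginx, as in the question, one way to get the same effect is to disable keep-alive so that nginx itself answers with `Connection: close` and closes the socket after each response. This is only a minimal sketch: the server name is a placeholder, the certificate directives are left out, and the header only has meaning on HTTP/1.x connections.

```nginx
server {
    listen 443 ssl;
    server_name example.domain.com;   # placeholder, not from the question
    # ssl_certificate / ssl_certificate_key omitted in this sketch

    # A value of 0 disables keep-alive client connections: nginx sends
    # "Connection: close" and closes the connection after each response.
    keepalive_timeout 0;
}
```

Every request then pays for a new TCP and TLS handshake, which is the trade-off discussed in the comments below.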

vladbph
  • Warning: This will drastically reduce the performance of your web site and is only a workaround for whatever the actual underlying problem is. – Michael Hampton Jul 03 '20 at 19:27
  • 2
    Define 'drastically'? This statement is not true in general. The correct comment would be: it depends on the pattern of utilizing the services provided by your web server. In my case (again, in my case) web page load time and servicing of web requests are now much faster (>10x), plus there are no pending requests queued by the browser, which from the customer's perspective would otherwise be considered a failure to provide the service (call it a workaround). So in some cases it may result in decreased performance, and in some cases it solves the problem of stalled ajax queries and increases performance. – vladbph Jul 04 '20 at 23:26
  • Compared to the performance you would get if you solved the actual underlying problem rather than using this workaround. – Michael Hampton Jul 05 '20 at 00:07
  • For me, adding `Connection: close` to the request header works like a charm, thanks! But I still have no clue why Chrome sometimes leaves the XHR request pending, while Firefox works fine most of the time. – L_K Jul 22 '20 at 06:48
  • 2
    I would speculate, that the issue is still on the server side. In my case chrome and firefox behaved the same way, until I added 'connection close' to the server response header. – vladbph Jul 22 '20 at 23:07
3

Consider using Wireshark and the Chrome developer tools to analyse the network traffic.

Open the network debugger in Chrome and try to reproduce a stuck request. It will show you an exact timeline: when the request headers were fully sent, when the request content was sent, the wait for the reply, when the response headers were fully received, when the first byte of content was received, and when the last byte of content was received (if it ever completes).

That's important for determining at what stage the issue occurs.

  • If it's at the beginning, the web server never received the request; it may not have been possible to establish a connection in the first place.

  • If it's stuck waiting for the end of the content, the web server fully received and processed the request. You should be able to see the request in the web server log with a status code, possibly an error.
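To check the second case from the server side with nginx (the server in the question), a more detailed access log can help. The following is a sketch only: the format name `timing` and the log path are placeholders, and the variables are standard nginx log variables.

```nginx
http {
    # $request_time: elapsed seconds since the first bytes were read from the client
    # $connection / $connection_requests: connection serial number and the
    # number of requests served over that connection so far
    log_format timing '$remote_addr [$time_local] "$request" $status '
                      '$body_bytes_sent rt=$request_time '
                      'conn=$connection reqs=$connection_requests';

    access_log /var/log/nginx/access_timing.log timing;   # placeholder path
}
```

nginx writes the access log entry when request processing finishes, so a request that is still stuck may not show up until its connection is closed; a long `rt=` value on a small static file is the thing to look for.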

I've encountered many real world things that break the response or slow down the transfer to a crawl.

  • Aggressive traffic shaping. I had to work in an office in London communicating with Asian servers. The first MB of the response came through fine, then the transfer was capped at 20 KB/s. It's not broken, it's just slow beyond belief.

  • Proxy issues, especially with MITM proxies. They can break connections or hold them for no particular reason. We had a full MITM proxy when I was working at a bank, and it would sometimes take 5 minutes to open Google.com for no reason. So irritating, though it worked fine the other 99% of the time.

  • HTTP/2 (previously SPDY) and TLS 1.3 issues. Believe it or not, these were buggy as hell in the first years after they came out (there are still some edge cases as of 2020). Any developer should appreciate that no code is perfect on the first try; it takes a while to iron out bugs and edge cases. Unfortunately this stuff really needs to work perfectly across systems (browsers, load balancers, web servers, etc.) or ouch. Trust me, you don't want to be an early adopter of new protocols when you've got tens or hundreds of applications to manage (on a couple of different stacks) in an organization.

  • Chrome bugs. Chrome has bugs like every complex piece of software, more so because it moves quickly and often deliberately shoves things down your throat. HTTP/2 is again an example: Chrome was the first to implement it and of course the first try wasn't perfect, so here come some bugs. Chrome was also the first to enable it by default, and of course this causes any incompatibility to pop up, and you can't escape it. Consider trying the previous/next version of Chrome or another browser when you encounter weird issues.

  • Antivirus software. Antivirus software filters all connections and all traffic coming in and out of the computer. In case you're not aware of how a Windows antivirus works, it intercepts all network syscalls. These products are not flawless and can break connections pretty badly. I spent a lot of time debugging an issue affecting a handful of employees: one piece of software downloads a configuration over HTTP, which should take 1 second, but for them it took between 10 minutes and 2 hours. We tracked the issue down to the Symantec antivirus grinding the connection to a halt (1 KB/s), presumably a bug in analysing the traffic or in determining whether the connection should be allowed. Then came the back and forth with support to try to fix it or find settings where it doesn't happen.
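For the protocol case, the question's author reported in a comment that nginx built without SPDY did not show the problem. A lighter way to test the same theory, sketched here under the assumption that SPDY was enabled through the `listen` directive (the question does not show that part of the configuration), is to drop the `spdy` parameter and retest in Chrome:

```nginx
server {
    # before (assumed): listen 443 ssl spdy;
    listen 443 ssl;                   # SPDY is no longer offered on this listener
    server_name example.domain.com;   # placeholder, not from the question
    # ssl_certificate / ssl_certificate_key omitted in this sketch
}
```

Parameters on `listen` apply to the address:port pair, so any other server block sharing the same listener should be checked as well.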

user5994461