
I am using Google Compute Engine. I have set up Deployment Manager so that it creates a firewall rule that allows the network LB to connect to the web servers, creates the web servers themselves, and adds them to an Instance Group Manager. It sets up an Autoscaler that targets the Instance Group Manager, and an HTTP Health Check that runs against the web server instances. It adds the HTTP Health Check and the Instance Group Manager to a backend service, and sets up a URL Map that has the backend service as its default service. The URL Map is in turn added to the HTTP Proxy, which is pointed at by a Forwarding Rule that has a global IP.

This setup is very similar to the one described here: https://cloud.google.com/solutions/scalable-and-resilient-apps

So now to the problem that I can't seem to solve for this setup. I have an Nginx server running on the web servers. It responds to requests and I am able to create EventSource connections to it, but after exactly 1 minute the connection is closed with the error INCOMPLETE_CHUNKED_ENCODING. This does not happen if I connect directly to one of the web servers. I have changed the sysctl settings for TCP keepalive to:

net.ipv4.tcp_keepalive_time=600 
net.ipv4.tcp_keepalive_intvl=15 
net.ipv4.tcp_keepalive_probes=5

I did this after reading https://cloud.google.com/compute/docs/troubleshooting#networktraffic
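As a rough sketch of what the three sysctl values above imply, assuming the usual Linux TCP keepalive semantics (first probe after tcp_keepalive_time seconds of idleness, then one probe every tcp_keepalive_intvl seconds until tcp_keepalive_probes of them go unanswered), the arithmetic can be written out in Python. The function name is made up for illustration.

```python
def seconds_until_dead_peer_detected(keepalive_time: int,
                                     keepalive_intvl: int,
                                     keepalive_probes: int) -> int:
    """Idle seconds before the kernel gives up on an unresponsive peer.

    Assumes standard Linux behaviour: wait keepalive_time seconds, then send
    keepalive_probes probes spaced keepalive_intvl seconds apart.
    """
    return keepalive_time + keepalive_intvl * keepalive_probes


# Values from the sysctl settings above.
total = seconds_until_dead_peer_detected(600, 15, 5)
print(total)
```

With these values the first probe is only sent after 600 seconds of idleness, so TCP keepalive works on a much longer timescale than the 1-minute disconnect.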

I have tried countless things in the nginx config and cannot seem to find a solution.

Does anyone have any idea, or has anyone had similar problems?

Pit
Calle
  • That error implies that the final block (chunk) of the xfer had fewer actual bytes than the header said it should. Each chunk consists of a length (in hex) followed by CRLF followed by Length-bytes of data followed by a final CRLF. I suspect the direct connection is either forgiving of these missing bytes or gae is mishandling the final chunk. – caskey May 17 '15 at 23:54
  • Yes, you are right, the problem indicates that there are fewer actual bytes than there should be. But I think that is an effect of the connection being dropped by something in the Google Cloud Platform, and I can't find the config to make this longer than 1 min – Calle May 18 '15 at 14:51
  • I have found out that connecting to the server directly gives me a Connection: keep-alive header, but going through the load balancer does not. So that could be the problem. I need to make the load balancer send back the Connection: keep-alive header. That should solve the problem, I think. – Calle May 19 '15 at 17:02
  • This question seems to be answered on [this discussion group](https://groups.google.com/forum/#!topic/gce-discussion/0Vtd7p3mTVQ). If your issue is resolved, can you post the answer here for other community members who may be seeing this same issue? Thanks – Faizan Nov 23 '16 at 20:55
  • Were you able to solve this issue? If yes, can you provide an answer so the community can benefit? As per this [link](https://blog.percy.io/tuning-nginx-behind-google-cloud-platform-http-s-load-balancer-305982ddb340), it appears that increasing the 'keepalive_timeout' in Nginx to a higher value than the timeout configured on your GCE backend will help to solve this issue. – Marilu Apr 26 '17 at 20:09
  • Has anyone found a solution to this? Is this still an issue? – Wojtek_B Nov 09 '20 at 15:57
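The chunk framing described in the first comment (a hex length, CRLF, that many bytes of data, then a closing CRLF) can be illustrated with a minimal Python decoder. This is only a sketch of the framing rules, not how any browser or the load balancer actually implements them; the function name is hypothetical, and trailers after the final chunk are ignored.

```python
def decode_chunked(raw: bytes) -> bytes:
    """Decode an HTTP/1.1 chunked body; raise ValueError if it is cut short."""
    body = bytearray()
    pos = 0
    while True:
        # Each chunk starts with its size in hex, terminated by CRLF.
        eol = raw.find(b"\r\n", pos)
        if eol == -1:
            raise ValueError("incomplete chunked encoding: missing size line")
        # Drop any chunk extension after ';' before parsing the hex size.
        size = int(raw[pos:eol].split(b";", 1)[0].strip(), 16)
        if size == 0:
            # A zero-size chunk marks the normal end of the body.
            return bytes(body)
        start = eol + 2
        end = start + size
        # The chunk data must be followed by a closing CRLF.
        if raw[end:end + 2] != b"\r\n":
            raise ValueError("incomplete chunked encoding: chunk cut short")
        body += raw[start:end]
        pos = end + 2
```

A stream that is closed before the zero-size final chunk arrives fails in this decoder even when every chunk received so far was intact, which is consistent with the connection being dropped mid-stream rather than with a malformed chunk.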

2 Answers


Alex was right to share that post link, because it leads to the main issue, but it needs a bit of explanation.

You will need to change the keepalive_timeout value (typically 65 in the packaged nginx.conf) in your Nginx configuration file (/etc/nginx/nginx.conf) to increase the HTTP connection timeout, so that it is longer than the 600-second timeout in the load balancer. This makes the load balancer the side that closes idle connections, rather than nginx.

To tune the nginx keepalives to work with the Google Cloud Platform HTTP(S) Load Balancer, set the following in /etc/nginx/nginx.conf:

keepalive_timeout 650;
keepalive_requests 10000;
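As a quick sanity check, the rule above (nginx's keepalive timeout should exceed the load balancer's 600 seconds) can be sketched in Python. The helper names are hypothetical, and the parsing is deliberately simplistic: it assumes the value is written as plain seconds (for example 650 or 650s) and only looks at the first uncommented keepalive_timeout directive.

```python
import re

# Load balancer idle timeout quoted in this answer (seconds).
LB_KEEPALIVE_SECONDS = 600


def keepalive_timeout_seconds(conf_text: str):
    """Return the first keepalive_timeout value in seconds, or None.

    Assumption: the value is a plain number of seconds, optionally with an
    's' suffix. Other nginx time units (such as 'm') are not handled.
    """
    match = re.search(r"^\s*keepalive_timeout\s+(\d+)s?(?=[\s;])",
                      conf_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def outlasts_load_balancer(conf_text: str) -> bool:
    """True if nginx keeps idle connections open longer than the LB does."""
    timeout = keepalive_timeout_seconds(conf_text)
    return timeout is not None and timeout > LB_KEEPALIVE_SECONDS
```

For example, passing the text of /etc/nginx/nginx.conf to outlasts_load_balancer should return True once the two lines above are in place.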

More in depth information about http persistence.

Pit

Your nginx probably requires some tuning.

Alex