1

I have a system hitting a number of backend services through four loadbalanced varnish caches. Approximately 0.1% of requests are being truncated because varnish is returning an incorrect content-length header. This error happens across all the caches and various backends.

Varnish is configured to not cache anything without valid cache-control headers and these specific requests are not cached. The client does not send a Accept-Encoding header so Varnish requests a gzipped response from the back end and sends an uncompressed version to the client.

From varnishlog all the backend requests look the same:

BerespStatus   200
BerespReason   OK
BerespHeader   Content-Encoding: gzip
BerespHeader   Content-Length: 1001
BerespHeader   Content-Type: application/json
BerespHeader   Date: Fri, 08 Feb 2019 14:59:06 GMT
BerespHeader   Vary: Accept-Encoding
TTL            RFC 0 10 0 1549637947 1549637947 1549637946 0 0 cacheable
VCL_call       BACKEND_RESPONSE
TTL            VCL 120 10 0 1549637947 cacheable
TTL            VCL 120 10 0 1549637947 uncacheable
VCL_return     deliver
Filters        testgunzip
Storage        malloc Transient
Fetch_Body     3 length stream
Gzip           u F - 1001 3792 80 7906 7944
BackendReuse   54 default
Length         1001

I believe that the line Gzip u F - 1001 3792 80 7906 7944 is stating that the 1001 byte zipped response is uncompressed to be 3792 bytes.

There are three different types of response. About 50% set the content-length to 3792:

VCL_call       RECV
ReqHeader      Surrogate-Capability: key=ESI/1.0
VCL_return     hash
VCL_call       HASH
VCL_return     lookup
HitMiss        1052359 119.885211
VCL_call       MISS
VCL_return     fetch
Link           bereq 1052362 fetch
Timestamp      Fetch: 1549633459.720553 0.004765 0.004765
RespProtocol   HTTP/1.1
RespStatus     200
RespReason     OK
RespHeader     Content-Encoding: gzip
RespHeader     Content-Length: 1000
RespHeader     Content-Type: application/json
RespHeader     Date: Fri, 08 Feb 2019 13:44:19 GMT
RespHeader     Vary: Accept-Encoding
RespHeader     Age: 0
RespHeader     Via: 1.1 varnish (Varnish/6.1)
VCL_call       DELIVER
RespHeader     X-Cache: MISS
RespHeader     X-Cache-Hits: 0
VCL_return     deliver
Timestamp      Process: 1549633459.720567 0.004779 0.000014
RespUnset      Content-Encoding: gzip
RespHeader     Accept-Ranges: bytes
RespUnset      Content-Length: 1000
RespHeader     Content-Length: 3792
Gzip           U D - 1001 3792 80 7906 7944

About 50% without a content-length header but with transfer-encoding chunked instead:

Same as above except
...
RespUnset      Content-Length: 1000
RespHeader     Transfer-Encoding: chunked
Gzip           U D - 1001 3792 80 7906 7944

And the 0.1% that breaks the client because of an incorrect content-length:

...
RespUnset      Content-Length: 1001
RespHeader     Content-Length: 208
Gzip           U D - 1001 3792 80 7906 7944

Does anybody have any thoughts as to the cause of this? Any suggestions for next step in debugging?

My VCL doesn't do anything with content-length headers. I'm running varnish 6.1 (I'm considering dropping down to the stable version 6.0.2 but haven't tried that yet)


EDIT:

Downgraded to 6.0.2 and am still seeing the same issue.


EDIT 2:

I have disabled varnish from upgrading a request to gzip with the command switch -p "http_gzip_support=off" and that has fixed the issue at the expense of non-zipped requests still being non-zipped when they hit the backend, but that's good enough for now.

  • The part about chunk encoding was covered [here](https://serverfault.com/questions/183843/content-length-not-sent-when-gzip-compression-enabled-in-apache) - which is about serving response *while* gzipping (certainly no `Content-Length` is applied to that, so it's being unset by either Varnish or backend. As for when it's present and "incorrect" - do you have `do_gzip` or similar directive in your VCL? – Danila Vershinin Feb 08 '19 at 17:37
  • @DanielV. No, no `do_gzip` in my VCL. After a bit more experimentation I've found that if I set the client to accept gzip then everything appears fine. So it seems to be something to do with varnish requesting a zipped response from the backend then uncompressing it for the client. I'll going to see if I can disable that feature for the time being so that if the client requests uncompressed then varnish also requests uncompressed. – Martin Harris Feb 11 '19 at 14:52
  • 2 years leter, we just stumbled on the same issue when migrating from varnish 5.2.1 to 6.0.7 I filed [an issue](https://github.com/varnishcache/varnish-cache/issues/3535) on the github repo (and referenced this report here). – yched Feb 24 '21 at 11:31

0 Answers0