2

While trying to figure out why our Varnish 4.1 install (on CentOS7 via the varnish-cache.org repo) was not following vcl rules set out to log the client IP address in an X-Forwarded-For header (see: Varnish 4 logging proxy/load balancer instead of client IP addresses) I happened to notice something odd while looking through varnishlog files:

- Begin        req 9353447 rxreq
- Timestamp    Start: 1488771709.337974 0.000000 0.000000
- Timestamp    Req: 1488771709.337974 0.000000 0.000000
- ReqStart     172.25.20.65 19903
- ReqMethod    GET
- ReqURL       /about-us/
- ReqProtocol  HTTP/1.1
- ReqHeader    host: www.<notreallythishost>.com
- ReqHeader    Accept: */*
- ReqHeader    Accept-Encoding: gzip, deflate
- ReqHeader    Cache-Control: no-cache
- ReqHeader    From: bingbot(at)microsoft.com
- ReqHeader    Pragma: no-cache
- ReqHeader    User-Agent: Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)
- ReqHeader    X-Forwarded-For: 40.77.167.41
- ReqHeader    X-Forwarded-Port: 80
- ReqHeader    X-Forwarded-Proto: http
- ReqHeader    Connection: keep-alive
- ReqUnset     X-Forwarded-For: 40.77.167.41
- ReqHeader    X-Forwarded-For: 40.77.167.41, 172.25.20.65
- VCL_call     RECV
- ReqUnset     X-Forwarded-For: 40.77.167.41, 172.25.20.65
- ReqHeader    X-Forwarded-For: 172.25.20.65
- ReqUnset     Accept-Encoding: gzip, deflate
- ReqHeader    Accept-Encoding: gzip
- ReqUnset     User-Agent: Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)
- VCL_return   hash
- VCL_call     HASH
- VCL_return   lookup
- VCL_call     MISS
- VCL_return   fetch
- Link         bereq 9353449 fetch
- Timestamp    Fetch: 1488771709.338395 0.000421 0.000421
- RespProtocol HTTP/1.1

This fully explains why we were never able to get anything other than the load balancer IP address logged through varnishncsa at any point regardless of the logging technique we used.

It looks as though in the request processing it builds up the X-Forwarded-For header by adding the AWS Load Balancer IP address to the header, but as vc_call is called summarily decontructs it again and removes the original client IP address. So how do I go about keeping the X-Forwarded-For intact and why is Varnish shifting IPs off the left side rather than than just adding them onto the X-Forwarded-For header like it should? Bug?

dcmbrown
  • 305
  • 3
  • 12

1 Answers1

3

So the work-around for this (possible bug in 4.1.3-1.el7.x86_64) was a clue I found while browsing other varnish logging questions, specifically one on disabling the x-forwarded-for header altogether.

While this isn't what I wanted to do, it provided the hint about preventing varnish from appending its own vcl_recv content to the buttom of my vcl_recv function definition. Specifically provide your own return (lookup) (though that's varnish <= 3) or return (hash) (varnish 4.x).

So now I have this at the top of the vcl_recv():

# ensure proper logging of x-forwarded-for IP addresses
std.collect(req.http.x-forwarded-for);
set req.http.x-forwarded-for = regsub ( req.http.x-forwarded-for, "^(([0-9]{1,3}\.){3}[0-9]{1,3})(.*)", "\1" );
if (req.http.x-forwarded-for) {
    std.log("ip:" + req.http.x-forwarded-for);
} else {
    std.log("ip:" + client.ip);
}

Then at the very end of the function I have of course as I mentioned a:

return (hash);

So it is now successfully logging just the client IP address as it should be with the addition of a varnishncsa option of:

-F "%%{VCL_Log:ip}x %%l %%u %%t \"%%r\" %%s %%b \"%%{Referer}i\" \"%%{User-agent}i\""

Hopefully someone else finds this information useful.

UPDATE: Just as a note I found this post about spoofing in this nginx answer and it would be nice to have a sort of real_ip_from or a trusted_ip_from headers in varnish as well but it doesn't appear to natively at the moment. My initial solution will pick out the spoofed address in this case. So it would be better to have a regex with known IPs removed and you pick up the first unhandled client IP instead. Something like this would work:

set req.http.x-forwarded-for = regsub ( req.http.x-forwarded-for, "(([0-9]{1,3}\.){3}[0-9]{1,3})(, (172.25.20.65|172.25.10.228),?)+$", "\1" );

where 172.25.20.65 and 172.25.10.228 are my trusted IPs (proxies or load balancers picked up and added to X-Forwarded-For, etc). Depending on whether you expect to see a proxy/LB in front of it your regex can either be this if you expect there to always be at least one load balancer/proxy in your header:

(([0-9]{1,3}\.){3}[0-9]{1,3})(, (<trustedip1>|<trustedip2>|...),?)+$

or this if it is allowable to not have anything before your varnish server:

(([0-9]{1,3}\.){3}[0-9]{1,3})(, (<trustedip1>|<trustedip2>|...),?)*$

Although if the latter is the case why would you be looking at the x-forwarded-for header...

dcmbrown
  • 305
  • 3
  • 12