i've got an issue with logging from my webservers, which have an elb and then a varnish layer in front of the nginx layer.

varnish is set up properly for X-Forwarded-For and its logs come through normally with the correct 'client.ip' being logged.

however, nginx logs are coming through with a whole list of IPs in the request. the default grok behaviour seems to set the client IP to the last in the list, i.e. the elb and varnish servers, which messes up my client.ip field for nginx logs. the correct client IP should be the first (or at least one of the first few) in the list.

here's an example:

172.31.7.219 - - [28/Sep/2015:12:39:56 +1000] "GET /api/filter/14928/content?api_key=apikey&site=website HTTP/1.1" 403 101 "-" "-" "my.website.com" "1.144.97.102, 1.144.97.102, 1.144.97.102, 127.0.0.1, 172.31.26.59"

the problem is i haven't been able to tweak the grok to handle a line like this. the heroku grok debugger doesn't seem to work for this log line and my pattern, but they do work in logstash, i.e. events are not tagged with a grok failure.

i've attempted to debug the specific parts but i haven't found a way to do what i need with IP/IPORHOST where there is a comma-separated list of IP addresses. i need to be able to specify which IP it should use, i.e. the first in the list should be the client.ip, not the last.

my nginx grok is:

NGINXACCESS %{IP:clientip} %{NGUSER:ident} %{NGUSER:auth} \[%{HTTPDATE:timestamp}\] "%{WORD:verb} %{DATA:request} HTTP/%{NUMBER:httpversion}" %{NUMBER:response} (?:%{NUMBER:bytes}|-) (?:"(?:%{URI:referrer}|-)"|%{QS:referrer})(?:;|) %{QS:agent}

any ideas on grok to cover that log?

geniestacks
  • Can you post the `access_log` and `log_format` directives from your Nginx configuration? Also, is the grok failing, and if so, where? – GregL Sep 28 '15 at 11:53
  • this is the custom log format for all nginx logs: ``` log_format custom '$remote_addr - $remote_user [$time_local] ' '"$request" $status $body_bytes_sent ' '"$http_referer" "$http_user_agent" ' '"$host" ' '"$http_x_forwarded_for" '; ``` – geniestacks Sep 29 '15 at 01:36

2 Answers

Not sure if you're still having this issue, but if so, here's what will work for you.

Given this log format:

log_format custom '$remote_addr - $remote_user [$time_local] "$request" $status $body_bytes_sent "$http_referer" "$http_user_agent" "$host" "$http_x_forwarded_for"';

the grok pattern you've specified doesn't take into account the addition of the "$host" "$http_x_forwarded_for" portion.

Not sure why your grok isn't failing, but it should.

In any event, this pattern will work with the log format above:

%{IP:clientip} %{NOTSPACE:ident} %{NOTSPACE:auth} \[%{HTTPDATE:timestamp}\] "%{WORD:verb} %{DATA:request} HTTP/%{NUMBER:httpversion}" %{NUMBER:response} (?:%{NUMBER:bytes}|-) (?:"(?:%{URI:referrer}|-)"|%{QS:referrer})(?:;|) %{QS:agent} "%{NOTSPACE:host}" "(?<x_forwarded_for>%{IP:xff_clientip}, .*)"

And results in the following fields

httpversion      1.1
request          /api/filter/14928/content?api_key=apikey&site=website
timestamp        28/Sep/2015:12:39:56 +1000
auth             -
host             my.website.com
agent            "-"
x_forwarded_for  1.144.97.102, 1.144.97.102, 1.144.97.102, 127.0.0.1, 172.31.26.59
clientip         172.31.7.219
bytes            101
response         403
xff_clientip     1.144.97.102
ident            -
port
verb             GET
referrer

Note that you've got a couple more fields than you would have had before.

The first ("x_forwarded_for" => 1.144.97.102, 1.144.97.102, 1.144.97.102, 127.0.0.1, 172.31.26.59) is the contents of the last set of quotes, i.e. $http_x_forwarded_for from the log format.
The second ("xff_clientip" => 1.144.97.102) is just the first IP in that list, which should be the actual source IP of the request.
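
If the goal is for clientip itself to hold the real client address rather than the ELB/Varnish one, one possible follow-up is to swap the fields once the grok has run. This is only a sketch and hasn't been tested against this setup: the field name proxy_ip is made up for illustration, while rename is a standard option of the mutate filter. It needs to sit after the grok filter in the pipeline.

```
filter {
  # Only touch events where the grok actually captured a forwarded address.
  if [xff_clientip] {
    # Keep the address nginx saw ($remote_addr) under a separate,
    # hypothetical field name.
    mutate {
      rename => { "clientip" => "proxy_ip" }
    }
    # Promote the first X-Forwarded-For entry to clientip.
    mutate {
      rename => { "xff_clientip" => "clientip" }
    }
  }
}
```

The two renames are kept in separate mutate blocks so that the order in which they run is explicit.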

If it were me, I'd also run the x_forwarded_for field through a mutate filter to break it into an array:

mutate {
  split  => { "x_forwarded_for" => ", " }
}
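
Put together, the filter section might look roughly like the following. It's a sketch that assumes the raw nginx line arrives in the message field (the usual default). The pattern is the one given above, wrapped in single quotes so the double quotes inside it don't need escaping.

```
filter {
  grok {
    # Pattern from above; assumes the raw log line is in "message".
    match => {
      "message" => '%{IP:clientip} %{NOTSPACE:ident} %{NOTSPACE:auth} \[%{HTTPDATE:timestamp}\] "%{WORD:verb} %{DATA:request} HTTP/%{NUMBER:httpversion}" %{NUMBER:response} (?:%{NUMBER:bytes}|-) (?:"(?:%{URI:referrer}|-)"|%{QS:referrer})(?:;|) %{QS:agent} "%{NOTSPACE:host}" "(?<x_forwarded_for>%{IP:xff_clientip}, .*)"'
    }
  }
  # Turn the comma-separated string into an array of addresses.
  mutate {
    split => { "x_forwarded_for" => ", " }
  }
}
```

After the split, x_forwarded_for is an array with one element per hop instead of a single string. Note that this pattern expects at least one comma in the list, so a line with a single forwarded address or a bare "-" would not match it.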
GregL

For the last part, the solution by Anton Roslov would only match "ip1, ip2" and "single-ip" log lines, but not "ip1, ip2, ip3".
IMHO something like

(?<x_forwarded_for>%{IP:clientip}(?:, [^,]+)*)

should do the trick. Just checking...

... \"(?:%{DATA:user_agent}|-)\" \"(?<x_forwarded_for>%{IP:clientip}(?:, [^,]+)*)?|-\"

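or, with the alternation grouped so that the "-" alternative applies only to the contents of the quotes: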

... \"(?:%{DATA:user_agent}|-)\" \"(-|(?<x_forwarded_for>%{IP:clientip}(?:, [^,]+)*)?)\"

should be your pattern of choice. Tested in grokdebug.herokuapp.com.
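
As a rough illustration of how that could be wired into the log format from the question, here is a sketch. It assumes a Logstash version whose grok filter supports the pattern_definitions option and that the raw line is in the message field. XFF_LIST is a made-up pattern name, and xff_clientip is used for the first address so that it doesn't collide with the clientip captured from $remote_addr.

```
filter {
  grok {
    # Hypothetical helper pattern: first IP, then zero or more ", <entry>" items.
    pattern_definitions => {
      "XFF_LIST" => '%{IP:xff_clientip}(?:, [^,]+)*'
    }
    # The last quoted field is either "-" or a list of addresses.
    match => {
      "message" => '%{IP:clientip} %{NOTSPACE:ident} %{NOTSPACE:auth} \[%{HTTPDATE:timestamp}\] "%{WORD:verb} %{DATA:request} HTTP/%{NUMBER:httpversion}" %{NUMBER:response} (?:%{NUMBER:bytes}|-) (?:"(?:%{URI:referrer}|-)"|%{QS:referrer})(?:;|) %{QS:agent} "%{NOTSPACE:host}" "(?:-|%{XFF_LIST:x_forwarded_for})"'
    }
  }
}
```

With this shape, lines carrying one address, several addresses, or a bare "-" should all match. In the "-" case the x_forwarded_for and xff_clientip fields are simply absent.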

kenlukas
mdb