
I am having trouble getting the following nginx error log message to parse in the grok debugger. I have a feeling there is a stupid trick that I should use but can't figure out what it may be.

2015/03/20 23:35:52 [error] 8#0: *10241823 testing "/www" existence failed (2: No such file or directory) while logging request, client: 201.45.203.78, server: $domain, request: "GET /ritikapuri_"

Here is my Grok pattern so far:

(?<timestamp>%{YEAR}[./]%{MONTHNUM}[./]%{MONTHDAY} %{TIME}) \[%{LOGLEVEL:severity}\] %{POSINT:pid}#%{NUMBER}: %{GREEDYDATA:errormessage} client: %{IP:client}

This pattern gets me to the "server" section but I can't seem to get the rest to parse and it isn't clear to me why.

If I use another %{GREEDYDATA} pattern to grab the end of the log, it sometimes won't parse logs that don't match the above and gives me a _grokparsefailure.

Would the best route be to use if statements to trap the different variations of log messages in nginx?

I have followed methods including this one but can't get them working.

realdubb
jmreicha

4 Answers

I used @dr01's answer to improve the recipe for error logs in nginx 1.15 using the notice format. This version separates out the HTTP method, the request, and the HTTP version.

(?<timestamp>%{YEAR}[./]%{MONTHNUM}[./]%{MONTHDAY} %{TIME}) \[%{LOGLEVEL:severity}\] %{POSINT:pid}#%{NUMBER:threadid}\: \*%{NUMBER:connectionid} %{GREEDYDATA:message}, client: %{IP:client}, server: %{GREEDYDATA:server}, request: "(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion}))", host: %{GREEDYDATA:host}

sample string

2015/03/20 23:35:52 [error] 8#0: *10241823 testing "/www" existence failed (2: No such file or directory) while logging request, client: 201.45.203.78, server: $domain, request: "GET /dsfadsfe HTTP/1.1", host: "localhost:8080"

Output from grok debugger

{
  "timestamp": [
    [
      "2015/03/20 23:35:52"
    ]
  ],
  "severity": [
    [
      "error"
    ]
  ],
  "pid": [
    [
      "8"
    ]
  ],
  "threadid": [
    [
      "0"
    ]
  ],
  "connectionid": [
    [
      "10241823"
    ]
  ],
  "message": [
    [
      "testing \"/www\" existence failed (2: No such file or directory) while logging request"
    ]
  ],
  "client": [
    [
      "201.45.203.78"
    ]
  ],
  "server": [
    [
      "$domain"
    ]
  ],
  "verb": [
    [
      "GET"
    ]
  ],
  "request": [
    [
      "/dsfadsfe"
    ]
  ],
  "httpversion": [
    [
      "1.1"
    ]
  ],
  "host": [
    [
      "\"localhost:8080\""
    ]
  ]
}
realdubb

Without seeing the patterns you attempted that didn't work, I cannot comment on why they failed. Since you stated that the pattern you provided matches up to the server field, I have modified it slightly and added a bit to the end to capture the rest:

(?<timestamp>%{YEAR}[./]%{MONTHNUM}[./]%{MONTHDAY} %{TIME}) \[%{LOGLEVEL:severity}\] %{POSINT:pid}#%{NUMBER}: %{GREEDYDATA:errormessage},\ client: %{IP:client}, server: \$domain, request: \"%{WORD:method} %{URIPATH:path}\"

Notice that I have added a comma after your GREEDYDATA, as you probably don't want it in your captured data, and I assume it will always appear before the client part of the message. I suspect you had an issue matching $domain, as you need a \ in front of the $ to escape it.
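The effect of the unescaped $ can be reproduced outside of grok. The following is a minimal sketch using Python's re module rather than grok itself (grok is built on the Oniguruma regex engine, which also treats a bare $ as an end-of-line anchor); the \w+ and \S+ groups are rough stand-ins for %{WORD} and %{URIPATH}, not their exact definitions.

```python
import re

# Tail of the log line from the question.
line = 'client: 201.45.203.78, server: $domain, request: "GET /ritikapuri_"'

# Unescaped, "$" is an end-of-line anchor, so it cannot match in mid-line.
unescaped = re.compile(r'server: $domain, request: "(?P<method>\w+) (?P<path>\S+)"')

# Escaped, "\$" matches a literal dollar sign.
escaped = re.compile(r'server: \$domain, request: "(?P<method>\w+) (?P<path>\S+)"')

print(unescaped.search(line))  # no match object is returned
m = escaped.search(line)
print(m.group("method"), m.group("path"))
```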

Please note that while this works in the grok debugger, I suspect it won't in Logstash: you will need to escape all of your spaces as well for Logstash to play nicely with the pattern (that is, change every instance of " " to "\ ").

Re: "Would the best route be to use if statements to trap the different variations of log messages in nginx?"

I'm not exactly clear on what you're asking, but you can put if statements around your filter, or parts of your filter, like in this answer. You can do the same thing using tags, if you can figure out a way to tag the events. These two options are probably "best" in terms of processing power used per line, as I believe they involve less work than something like this answer, where each event would need to be checked against each pattern. You could also write one very complex pattern to match every situation, but I don't think that is ideal: the pattern would expand to so many potential matches that checking it would take a lot of power each time.
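To make the trade-off concrete, here is a rough sketch in Python rather than Logstash configuration. The two regexes are simplified, hypothetical stand-ins for grok patterns, and the [notice] line is an illustrative example that does not come from the question. One function does a cheap substring check and then runs a single pattern (the "if statement" route); the other tries every pattern in order until one matches.

```python
import re

# Hypothetical, simplified stand-ins for two grok patterns.
ERROR_WITH_REQUEST = re.compile(
    r'\[(?P<severity>\w+)\] .*, client: (?P<client>\S+), '
    r'server: (?P<server>[^,]+), request: "(?P<request>[^"]*)"'
)
ERROR_GENERIC = re.compile(r'\[(?P<severity>\w+)\] (?P<errormessage>.*)')


def parse_with_condition(line):
    """Cheap substring test first, then exactly one regex per line."""
    pattern = ERROR_WITH_REQUEST if ", request: " in line else ERROR_GENERIC
    m = pattern.search(line)
    return m.groupdict() if m else None


def parse_try_all(line):
    """Try every pattern in order until one matches."""
    for pattern in (ERROR_WITH_REQUEST, ERROR_GENERIC):
        m = pattern.search(line)
        if m:
            return m.groupdict()
    return None


with_request = (
    '2015/03/20 23:35:52 [error] 8#0: *10241823 testing "/www" existence failed '
    '(2: No such file or directory) while logging request, '
    'client: 201.45.203.78, server: $domain, request: "GET /ritikapuri_"'
)
# Illustrative line without client/server/request fields (an assumption).
without_request = '2015/03/20 23:35:52 [notice] 1#0: signal process started'

for sample in (with_request, without_request):
    print(parse_with_condition(sample))
```

Both functions return the same fields here; the difference is only in how many patterns are attempted per line, which matters more as the number of variations grows.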

I hope that helps!

Rumbles

This grok recipe also works, no matter the value of the server field:

(?<timestamp>%{YEAR}[./]%{MONTHNUM}[./]%{MONTHDAY} %{TIME}) \[%{LOGLEVEL:severity}\] %{POSINT:pid}#%{NUMBER:threadid}\: \*%{NUMBER:connectionid} %{GREEDYDATA:errormessage}, client: %{IP:client}, server: %{GREEDYDATA:server}, request: %{GREEDYDATA:request}
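As a rough check of why this works for any server value, the tail of the recipe can be approximated in Python's re module, where .* plays the role of %{GREEDYDATA}. This is a sketch under that assumption, not grok itself; the example.com line is a hypothetical variation on the log from the question.

```python
import re

# ".*" approximates %{GREEDYDATA}; "\S+?" is a loose stand-in for %{IP}.
TAIL = re.compile(
    r', client: (?P<client>\S+?), server: (?P<server>.*), request: (?P<request>.*)'
)

from_question = (
    'while logging request, client: 201.45.203.78, '
    'server: $domain, request: "GET /ritikapuri_"'
)
# Hypothetical line with an ordinary server name.
other_server = (
    'while logging request, client: 201.45.203.78, '
    'server: example.com, request: "GET /ritikapuri_"'
)

for sample in (from_question, other_server):
    m = TAIL.search(sample)
    print(m.group("server"), m.group("request"))
```

Because the server value is captured rather than written literally into the pattern, no escaping of $ is needed. Note that the request capture keeps its surrounding double quotes.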
dr_

Here is a grok error pattern with the addition of optional upstream and referrer fields. Tested with nginx:1.17.3.

(?<timestamp>%{YEAR}[./]%{MONTHNUM}[./]%{MONTHDAY} %{TIME}) \[%{LOGLEVEL:severity}\] %{POSINT:pid}#%{NUMBER:threadid}\: \*%{NUMBER:connectionid} %{GREEDYDATA:message}, client: %{IP:client}, server: %{GREEDYDATA:server}, request: "(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion}))"(, upstream: "%{GREEDYDATA:upstream}")?, host: "%{DATA:host}"(, referrer: "%{GREEDYDATA:referrer}")?
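The optional groups can be tried out with an approximate Python equivalent of the pattern's tail. This is a sketch only: [^"]* is deliberately swapped in for %{GREEDYDATA} and %{DATA}, and the upstream and referrer values in the second sample are hypothetical.

```python
import re

# Approximation of the tail of the grok pattern above; each "(?:...)?"
# group mirrors one of the optional "(...)?" groups in the grok version.
TAIL = re.compile(
    r'request: "(?P<verb>\w+) (?P<request>\S+) HTTP/(?P<httpversion>[\d.]+)"'
    r'(?:, upstream: "(?P<upstream>[^"]*)")?'
    r', host: "(?P<host>[^"]*)"'
    r'(?:, referrer: "(?P<referrer>[^"]*)")?'
)

# Tail of the sample string used in an earlier answer.
plain = 'request: "GET /dsfadsfe HTTP/1.1", host: "localhost:8080"'
# Hypothetical line that carries both optional fields.
full = (
    'request: "GET /dsfadsfe HTTP/1.1", '
    'upstream: "http://127.0.0.1:8081/dsfadsfe", '
    'host: "localhost:8080", referrer: "http://localhost:8080/"'
)

for sample in (plain, full):
    print(TAIL.search(sample).groupdict())
```

When an optional field is absent, its group is simply left unset (None in Python), so one pattern covers both forms of the line.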
cdalxndr