I am parsing multiline log4j logs with logstash and I've got the most curious problem.
Here's an example of some related log entries:
2016-01-05 13:55:26,275 DEBUG [class] [thread] - bla 2016-01-05 13:55:26,275 DEBUG [class] [thread] - blaa 2016-01-05 13:55:26,275 DEBUG [class] [thread] - blaaa 2016-01-05 13:55:26,275 DEBUG [class] [thread] - blaaaa 2016-01-05 13:55:26,282 DEBUG [class] [thread] - blaaaaa 2016-01-05 13:55:26,282 DEBUG [class] [thread] - blaaaaaa 2016-01-05 13:55:27,030 DEBUG [class] [thread] - blaaaaaaa
Here's my logstash configuration for log4j:
if [type] == "log4j" { multiline { pattern => "(^%d{yyyy-MM-dd HH:mm:ss,SSS})" what => previous negate => true } grok { match => ["message", "%{TIMESTAMP_ISO8601:logdate} %{LOGLEVEL:loglevel} (?.*)"] } date { match => ["logdate", "yyyy-MM-dd HH:mm:ss,SSS", "ISO8601"] timezone => "Europe/Paris" } }
The result is fine; no grok parse errors, dates are fine, log level is fine, BUT, the message contents is duplicated:
2016-01-05 13:55:26,275 DEBUG [class] [thread] - bla 2016-01-05 13:55:26,275 DEBUG [class] [thread] - blaa 2016-01-05 13:55:26,275 DEBUG [class] [thread] - blaaa 2016-01-05 13:55:26,275 DEBUG [class] [thread] - blaaaa 2016-01-05 13:55:26,282 DEBUG [class] [thread] - blaaaaa 2016-01-05 13:55:26,282 DEBUG [class] [thread] - blaaaaaa 2016-01-05 13:55:27,030 DEBUG [class] [thread] - blaaaaaaa [class] [thread] - bla 2016-01-05 13:55:26,275 DEBUG [class] [thread] - blaa 2016-01-05 13:55:26,275 DEBUG [class] [thread] - blaaa 2016-01-05 13:55:26,275 DEBUG [class] [thread] - blaaaa 2016-01-05 13:55:26,282 DEBUG [class] [thread] - blaaaaa 2016-01-05 13:55:26,282 DEBUG [class] [thread] - blaaaaaa 2016-01-05 13:55:27,030 DEBUG [class] [thread] - blaaaaaaa
AND the timestamp and log level are missing from the first line of the duplicated content!
Anyone knows why this is happening?
Any help is welcome.
EDIT after 473183469's remarks
The date pattern in the multiline filter doest not work as expected.
I've modified logstash's configuration to combine lines comming only from exceptions (just like un the docs)
if [type] == "log4j" { multiline { pattern => "^\s" what => "previous" } multiline { pattern => "^Caused" what => "previous" } grok { match => ["message", "%{TIMESTAMP_ISO8601:logdate} %{LOGLEVEL:loglevel} (?.*)"] } date { match => ["logdate", "yyyy-MM-dd HH:mm:ss,SSS", "ISO8601"] timezone => "Europe/Paris" } }
I've juste added another multiline filter in case a "Caused by" line is present in the stack trace.
This works alright, except the contents of the message is duplicated in the same way! Again the timestamp and the log level are missing:
016-01-06 09:46:00,842 ERROR [coyntry.org.web.LoaderListener] [Thread-6] - java.lang.RuntimeException: java.io.FileNotFoundException: file (No such file or directory) at coyntry.org.web.WebProperties.getDatabaseProperties(WebProperties.java:174) at coyntry.org.web.WebProperties.getServiceLocator(WebProperties.java:156) at coyntry.org.web.LoaderListener$LoaderThread.run(LoaderListener.java:74) at java.lang.Thread.run(Thread.java:745) Caused by: java.io.FileNotFoundException: file (No such file or directory) at java.io.FileInputStream.open(Native Method) at java.io.FileInputStream.(FileInputStream.java:146) at coyntry.org.web.WebProperties.getDatabaseProperties(WebProperties.java:168) [coyntry.org.web.LoaderListener] [Thread-6] - java.lang.RuntimeException: java.io.FileNotFoundException: file (No such file or directory) ... 3 more, at coyntry.org.web.WebProperties.getDatabaseProperties(WebProperties.java:174) at coyntry.org.web.WebProperties.getServiceLocator(WebProperties.java:156) at coyntry.org.web.LoaderListener$LoaderThread.run(LoaderListener.java:74) at java.lang.Thread.run(Thread.java:745) Caused by: java.io.FileNotFoundException: file (No such file or directory) at java.io.FileInputStream.open(Native Method) at java.io.FileInputStream.(FileInputStream.java:146) at coyntry.org.web.WebProperties.getDatabaseProperties(WebProperties.java:168) ... 3 more
I really can't understand what is going on.