3

I am parsing multiline log4j logs with logstash and I've got the most curious problem.

Here's an example of some related log entries:

2016-01-05 13:55:26,275 DEBUG [class] [thread] - bla
2016-01-05 13:55:26,275 DEBUG [class] [thread] - blaa
2016-01-05 13:55:26,275 DEBUG [class] [thread] - blaaa
2016-01-05 13:55:26,275 DEBUG [class] [thread] - blaaaa
2016-01-05 13:55:26,282 DEBUG [class] [thread] - blaaaaa
2016-01-05 13:55:26,282 DEBUG [class] [thread] - blaaaaaa
2016-01-05 13:55:27,030 DEBUG [class] [thread] - blaaaaaaa

Here's my logstash configuration for log4j:

if [type] == "log4j" {
    multiline {
        pattern => "(^%d{yyyy-MM-dd HH:mm:ss,SSS})"
        what => previous
        negate => true
    }
    grok {
        match => ["message", "%{TIMESTAMP_ISO8601:logdate} %{LOGLEVEL:loglevel} (?.*)"]
    }
    date {
        match => ["logdate", "yyyy-MM-dd HH:mm:ss,SSS", "ISO8601"]
        timezone => "Europe/Paris"
    }
}

The result is fine; no grok parse errors, dates are fine, log level is fine, BUT, the message contents is duplicated:

2016-01-05 13:55:26,275 DEBUG [class] [thread] - bla
2016-01-05 13:55:26,275 DEBUG [class] [thread] - blaa
2016-01-05 13:55:26,275 DEBUG [class] [thread] - blaaa
2016-01-05 13:55:26,275 DEBUG [class] [thread] - blaaaa
2016-01-05 13:55:26,282 DEBUG [class] [thread] - blaaaaa
2016-01-05 13:55:26,282 DEBUG [class] [thread] - blaaaaaa
2016-01-05 13:55:27,030 DEBUG [class] [thread] - blaaaaaaa
[class] [thread] - bla
2016-01-05 13:55:26,275 DEBUG [class] [thread] - blaa
2016-01-05 13:55:26,275 DEBUG [class] [thread] - blaaa
2016-01-05 13:55:26,275 DEBUG [class] [thread] - blaaaa
2016-01-05 13:55:26,282 DEBUG [class] [thread] - blaaaaa
2016-01-05 13:55:26,282 DEBUG [class] [thread] - blaaaaaa
2016-01-05 13:55:27,030 DEBUG [class] [thread] - blaaaaaaa

AND the timestamp and log level are missing from the first line of the duplicated content!

Anyone knows why this is happening?

Any help is welcome.

EDIT after 473183469's remarks

The date pattern in the multiline filter doest not work as expected.

I've modified logstash's configuration to combine lines comming only from exceptions (just like un the docs)

if [type] == "log4j" {
    multiline {
        pattern => "^\s"
        what => "previous"
    }
    multiline {
        pattern => "^Caused"
        what => "previous"
    }
    grok {
        match => ["message", "%{TIMESTAMP_ISO8601:logdate} %{LOGLEVEL:loglevel} (?.*)"]
    }
    date {
        match => ["logdate", "yyyy-MM-dd HH:mm:ss,SSS", "ISO8601"]
        timezone => "Europe/Paris"
    }
}

I've juste added another multiline filter in case a "Caused by" line is present in the stack trace.

This works alright, except the contents of the message is duplicated in the same way! Again the timestamp and the log level are missing:

016-01-06 09:46:00,842 ERROR [coyntry.org.web.LoaderListener] [Thread-6] - java.lang.RuntimeException: java.io.FileNotFoundException: file (No such file or directory)
    at coyntry.org.web.WebProperties.getDatabaseProperties(WebProperties.java:174)
    at coyntry.org.web.WebProperties.getServiceLocator(WebProperties.java:156)
    at coyntry.org.web.LoaderListener$LoaderThread.run(LoaderListener.java:74)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.FileNotFoundException: file (No such file or directory)
    at java.io.FileInputStream.open(Native Method)
    at java.io.FileInputStream.(FileInputStream.java:146)
    at coyntry.org.web.WebProperties.getDatabaseProperties(WebProperties.java:168)
[coyntry.org.web.LoaderListener] [Thread-6] - java.lang.RuntimeException: java.io.FileNotFoundException: file (No such file or directory)
    ... 3 more,
    at coyntry.org.web.WebProperties.getDatabaseProperties(WebProperties.java:174)
    at coyntry.org.web.WebProperties.getServiceLocator(WebProperties.java:156)
    at coyntry.org.web.LoaderListener$LoaderThread.run(LoaderListener.java:74)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.FileNotFoundException: file (No such file or directory)
    at java.io.FileInputStream.open(Native Method)
    at java.io.FileInputStream.(FileInputStream.java:146)
    at coyntry.org.web.WebProperties.getDatabaseProperties(WebProperties.java:168)
    ... 3 more

I really can't understand what is going on.

pour toi
  • 302
  • 1
  • 4
  • 12

1 Answers1

1

Do you really need the multiline?

Isn't multiline used to join together again a single log which has been splitted by rsyslog?

The typical example is the java stack error trace message.

If rsyslog split at the "\n" the stack trace error, we are not going to understand anything anymore, so we need to re-arrange the message.

Your example is different, because each log line looks indipendent from other log lines.

In short: duplication as well as the header suppression come from multiline. Either fix multiline according your use case or (as I suspect) remove it tout-court.

473183469
  • 1,350
  • 1
  • 12
  • 23