The logstash documentation indicates that you can collapse the multiple indented lines in a Java stacktrace log entry into a single event using the multiline codec:

https://www.elastic.co/guide/en/logstash/current/plugins-codecs-multiline.html

input {
  syslog {
    type  => "syslog"
    port  => 8514
    codec => multiline {
      pattern => "^\s"
      what    => "previous"
    }
  }
}

This works by logstash detecting whitespace at the start of a line and appending that line to the previous event.
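For illustration, a typical Java stacktrace looks like this (hypothetical example); every line after the first begins with whitespace, so "^\s" matches each continuation line and folds it into the event that began with the exception:

java.lang.NullPointerException
    at com.example.MyService.handleRequest(MyService.java:42)
    at com.example.Dispatcher.dispatch(Dispatcher.java:17)
    at java.lang.Thread.run(Thread.java:748)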

However, the logstash documentation is the only place where I can find a reference to this. The general user community seems to be using elaborate grok filters to achieve the same effect.

I've tried the basic indentation pattern provided by logstash, but it doesn't work. Has anyone else managed to get this working by matching the indentation pattern?

Garreth McDaid

1 Answer

Yes, though not with the syslog {} input; I've done it with the file {} input and Tomcat logs. If the stacktraces arrive in syslog as a new event per line, each still carrying the usual syslog prefix of datestamp and hostname, reassembling them into a unitary stackdump becomes much harder. It can still be done, but it requires much more extensive filtering:

  1. Don't use the multiline codec on the input; with one event arriving per line, the codec has nothing to join on.
  2. Use a grok filter to split the syslog message into its parts, capturing the SYSLOGMESSAGE portion into its own field.
  3. Use the multiline {} filter on that field to reassemble your stackdump (a sketch follows this list).
  4. Run one and only one filter worker (the -w flag); it's the only way to be sure the entire stacktrace ends up in a single event.
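A minimal sketch of steps 2 and 3, assuming a standard syslog line layout; the grok pattern and the syslog_message field name are illustrative rather than a drop-in config, and depending on your Logstash version the multiline filter plugin may need to be installed separately:

filter {
  grok {
    # Split the syslog line; the message body lands in syslog_message
    match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_host} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
  }
  multiline {
    # Fold indented continuation lines into the previous event
    source  => "syslog_message"
    pattern => "^\s"
    what    => "previous"
  }
}

With -w 1, events pass through this filter in order, which is what lets the multiline filter see the continuation lines next to the line they belong to.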

If at all possible, it's best to use the file {} input on the file the stacktraces are emitted into, with the multiline codec and the indentation method you've already found.
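For example, a minimal sketch assuming a hypothetical Tomcat log path:

input {
  file {
    path  => "/var/log/tomcat/catalina.out"
    codec => multiline {
      pattern => "^\s"
      what    => "previous"
    }
  }
}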

sysadmin1138
  • I seem to have got it working, although I am not sure how stable it is. Instead of using "^\s" as the pattern, I just used an actual keyboard-generated tab (i.e. " "). That is producing multiline entries in Elasticsearch, with the "multiline" tag. I will update the answer if it holds up. – Garreth McDaid Mar 21 '17 at 10:33
  • @GarrethMcDaid Yeah... the `csv` filter needs an actual tab character for tab-delimited fields too. Because the plugins come from different open-source authors, it's a bit inconsistent which one needs `\s` and which needs the literal character. – sysadmin1138 Mar 21 '17 at 14:25
  • To be honest, it isn't stable or consistent. I'm trying to work on your suggestion, using a combination of grok and the multiline filter. It's horribly complex when using syslog. – Garreth McDaid Mar 21 '17 at 14:33
  • Do you have any thoughts on the best way to forward application logs from Docker containers to ELK? I know there are options when your applications write to STDOUT/STDERR, but our applications write to the file system. Not sure it is a good idea to run a logstash daemon in every Docker container; I would prefer to rely on syslog and have a central logstash system. – Garreth McDaid Mar 21 '17 at 14:35
  • @GarrethMcDaid The Docker problem is one I haven't handled myself, but I do have an idea. Run logstash on the host box, with a `tcp` listener on a port on the machine network. The apps then write over a socket to that tcp listener. That logstash then either sends everything to the local syslog for relay, or to the central syslog for relay. Or... direct to Elasticsearch, if you'd rather. – sysadmin1138 Mar 21 '17 at 14:40
  • Not sure I follow. Does that suggestion include running logstash in each container? If I was going to do that, I'd just configure logstash in each container as required and have it forward directly from the container to ElasticSearch. Thanks anyway. – Garreth McDaid Mar 21 '17 at 14:51
  • @GarrethMcDaid Not quite. In that arrangement, the app running in each container would be updated to log to a socket, not a file. Not always possible, I know; but would be very low footprint. – sysadmin1138 Mar 21 '17 at 16:53