I've configured Logstash to filter httpd_access_log messages and grok the fields associated with COMBINEDAPACHELOG. However, I'm receiving errors like the following:

[2017-02-10T15:37:39,361][WARN ][logstash.outputs.elasticsearch] Failed action. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"filebeats", :_type=>"logs", :_routing=>nil}, 2017-02-10T23:37:34.187Z perf-wuivcx02.hq.mycompany.com cdn.mycompany.com 192.168.222.60 - - [10/Feb/2017:15:37:30 -0800] "GET /client/asd-client-main.js HTTP/1.1" 200 221430 "http://perf.companysite.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36"], :response=>{"index"=>{"_index"=>"filebeats", "_type"=>"logs", "_id"=>"AVoqY6qkpAiTDgWeyMHJ", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse [timestamp]", "caused_by"=>{"type"=>"number_format_exception", "reason"=>"For input string: \"10/Feb/2017:15:37:30 -0800\""}}}}}

Here is my Logstash filter configuration:

filter {
  if [type] == "json" {
    json {
      source => "message"
    }
  }
  if [type] == "syslog" {
    grok {
      match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
      add_field => [ "received_at", "%{@timestamp}" ]
      add_field => [ "received_from", "%{host}" ]
    }
    date {
      match => [ "syslog_timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
    }
  }
  if [type] == "httpd_access_log" {
    grok {
      match => { "message" => "%{URIHOST} %{COMBINEDAPACHELOG}" }
    }
    date {
      match => [ "timestamp", "MMM dd yyyy HH:mm:ss", "MMM  d yyyy HH:mm:ss", "ISO8601" ]
    }
  }
}

The date filter works fine for processing syslog-type messages, but is not working for httpd_access_log messages. Does anyone know why the timestamps are causing lines from httpd_access_log files to fail indexing in Elasticsearch?

Thanks in advance for any help or advice you can provide!

Justin

1 Answer

This isn't 100% a filter problem; the output error is merely the symptom. Here are the key parts of the error message that show you this.

[2017-02-10T15:37:39,361][WARN ][logstash.outputs.elasticsearch]

That's telling you that the plugin that failed was the elasticsearch output.

Failed action. {:status=>400, :action=>["index",

(Clipped for clarity.) That's Logstash attempting an index action against Elasticsearch.

"error"=>
  {"type"=>"mapper_parsing_exception",
   "reason"=>"failed to parse [timestamp]",
   "caused_by"=>
     {"type"=>"number_format_exception",
      "reason"=>"For input string: \"10/Feb/2017:15:37:30 -0800\""}
     }
   }

What's happening here is that the timestamp field in the index mapping is not accepting the string you're attempting to put into it. The number_format_exception tells you that Elasticsearch is expecting a number in that field, not a string.
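
You can confirm what Elasticsearch has decided the field is by pulling the index mapping. A quick check, assuming Elasticsearch is listening on the default localhost:9200 (the filebeats index name comes straight from your error output):

# Show how the filebeats index has mapped its fields, including timestamp
curl -XGET 'http://localhost:9200/filebeats/_mapping?pretty'

If timestamp comes back as long or some other numeric type, an earlier document was indexed with a numeric value and dynamic mapping locked the field to that type.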

Logstash is attempting to write a string to the timestamp field. This is a sign that the timestamp field hasn't actually been run through the date {} filter: either the if [type] == "httpd_access_log" { conditional isn't catching every event that carries a timestamp field, or the match patterns in your date filter don't match what's in it. The error string was cleaned up, but I'm not sure whether your source really is issuing a timestamp like:

10/Feb/2017:15:37:30 -0800

If it is entering the pipeline like that, figure out why.
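
For what it's worth, that string matches the stock Apache access-log timestamp format (dd/MMM/yyyy:HH:mm:ss Z), which is exactly what the COMBINEDAPACHELOG grok pattern captures into the timestamp field. If that's what you have, the likely fix is to teach your date filter that format. A minimal sketch, not tested against your pipeline:

date {
  # Parse the Apache %t timestamp, e.g. 10/Feb/2017:15:37:30 -0800
  match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
  # Drop the raw string once it's been parsed into @timestamp
  remove_field => [ "timestamp" ]
}

The remove_field only runs when the parse succeeds, and dropping the raw string also keeps it away from the index mapping that is currently rejecting it.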

sysadmin1138