
Background

I have two types of log files: output from an ETL process, and output from a downstream processor. We call these the "ETL" and "Processor" logs.

The ETL logs live in per-request subfolders under our logging directory, while the processor logs sit directly in that same directory.

So, I have a folder structure that goes something like this:

/Archive
    /DataLoader_Supplemental
        /DataLoader_ETLForRequestID_1
            /(...40 log files)
        /DataLoader_ETLForRequestID_2
            /(...40 log files)
        DataLoader_Processor_123.log
        DataLoader_Processor_456.log

The log styles for each are the same (as in, I can use the same grok for both).
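For illustration only, a hypothetical line that the shared grok pattern in the config below would match might look like this (the timestamp, offset, and message text are invented; actual lines may differ):

    05-08-15 09:10:13,282 -04:00 [INFO] Loaded 40 records for request 1

Here `05-08-15 09:10:13,282` is captured by %{DATESTAMP}, `-04:00` by %{ISO8601_TIMEZONE}, and `[INFO]` by %{SYSLOG5424SD}.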

Goal

I would like both of these log types to go into the same Elasticsearch index as different types, so that I can query them.

Problem

I was able to make this work when pointing it at only one type of log (*.log in a specific ETL request folder).

However, I cannot seem to make it work with two different input types, or get it to scan all of the ETL folders and extract all of their logs.

What am I doing wrong?

My current config file

input { 
    file {
      path => '//MyFileServer/DATALOADER-TST/Archive/DataLoader_Supplemental/DataLoader_ETLForRequestID**/*.log'
      type => "etl"
      sincedb_path => "C:/Users/skilleen/Desktop/temp/logstash/target/.sincedb.etl.log"
      start_position => "beginning"
    }

    file {
      path => '//MyFileServer/DATALOADER-TST/Archive/DataLoader_Supplemental/*.log'
      type => "processor"
      sincedb_path => "C:/Users/skilleen/Desktop/temp/logstash/target/.sincedb.processor.log"
      start_position => "beginning"
    }
}
filter {
    grok {
        match => { "message" => "%{DATESTAMP:datestamp} %{ISO8601_TIMEZONE:tzoffset} %{SYSLOG5424SD:loglevel}" }
    }
}

output { 
    elasticsearch {
        protocol => "http"
        host => "localhost:9200"
        index => "dataloaderlogstst"
    } 
}
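One way to isolate the processor input would be a minimal config that drops the ETL stanza and writes events to stdout instead of Elasticsearch (a sketch, reusing the paths from the config above; the sincedb filename is changed so it does not collide with the existing one):

    input {
        file {
          path => '//MyFileServer/DATALOADER-TST/Archive/DataLoader_Supplemental/*.log'
          type => "processor"
          sincedb_path => "C:/Users/skilleen/Desktop/temp/logstash/target/.sincedb.processor.debug.log"
          start_position => "beginning"
        }
    }

    output {
        stdout { codec => rubydebug }
    }

If events print to the console here, the input and glob are fine and the problem lies elsewhere; if nothing prints, the path or sincedb state is the likely culprit.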

Results when I use this config

Logstash appears to be processing something, and I see the sincedb files created; however, the index is never created in Elasticsearch.

UPDATE: After some patience, it appears that the ETL logs were imported into Elasticsearch while the processor logs were not.
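One way to confirm which types actually landed in the index is to count documents per type (a sketch, assuming Elasticsearch is reachable on localhost:9200 as in the config above):

    curl "http://localhost:9200/dataloaderlogstst/_count?q=_type:etl"
    curl "http://localhost:9200/dataloaderlogstst/_count?q=_type:processor"

A zero count for processor would match what the Elasticsearch log shows, since it only mentions mappings for etl.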

The Elasticsearch console output

[2015-08-05 08:43:38,282][INFO ][node                     ] [Isis] version[1.7.1], pid[22120], build[b88f43f/2015-07-29T09:54:16Z]
[2015-08-05 08:43:38,283][INFO ][node                     ] [Isis] initializing ...
[2015-08-05 08:43:38,356][INFO ][plugins                  ] [Isis] loaded [], sites [HQ]
[2015-08-05 08:43:38,428][INFO ][env                      ] [Isis] using [1] data paths, mounts [[OS (C:)]], net usable_space [40.2gb], net total_space [223.2gb], types [NTFS]
[2015-08-05 08:43:41,605][INFO ][node                     ] [Isis] initialized
[2015-08-05 08:43:41,606][INFO ][node                     ] [Isis] starting ...
[2015-08-05 08:43:42,292][INFO ][transport                ] [Isis] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/172.16.85.21:9300]}
[2015-08-05 08:43:42,568][INFO ][discovery                ] [Isis] elasticsearch/74dbAjLJQj62k6z83LkLog
[2015-08-05 08:43:46,339][INFO ][cluster.service          ] [Isis] new_master [Isis][74dbAjLJQj62k6z83LkLog][DCSKILLEEN][inet[/172.16.85.21:9300]], reason: zen-disco-join (elected_as_master)
[2015-08-05 08:43:46,383][INFO ][gateway                  ] [Isis] recovered [1] indices into cluster_state
[2015-08-05 08:43:46,764][INFO ][http                     ] [Isis] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/172.16.85.21:9200]}
[2015-08-05 08:43:46,766][INFO ][node                     ] [Isis] started
[2015-08-05 09:10:13,149][INFO ][cluster.metadata         ] [Isis] [dataloaderlogstst] creating index, cause [auto(bulk api)], templates [], shards [5]/[1], mappings [etl]
[2015-08-05 09:10:13,294][INFO ][cluster.metadata         ] [Isis] [dataloaderlogstst] update_mapping [etl] (dynamic)
[2015-08-05 09:10:14,097][INFO ][cluster.metadata         ] [Isis] [dataloaderlogstst] update_mapping [etl] (dynamic)

The Logstash console output

C:\Users\skilleen\Downloads\logstash-1.5.3\logstash-1.5.3\bin>logstash agent -f logstash.conf
io/console not supported; tty will not be manipulated
[DEPRECATED] use `require 'concurrent'` instead of `require 'concurrent_ruby'`
Logstash startup completed
Comments

  • What do your Logstash and ES logs say? – GregL Aug 05 '15 at 13:25
  • @GregL the Logstash console output just says "Logstash startup completed." An update: the first set of files (the ETL logs) completed, but the second set (the processor logs) do not appear to have been processed. I don't see any additional logs in the logstash directory. The ES log mentions updating mappings for type ETL, but nothing for processor. – SeanKilleen Aug 05 '15 at 15:42
  • In theory you should have seen documents increasing in real-time while LS was parsing your files. – GregL Aug 05 '15 at 15:48
  • Actually, I just noticed that you're running LS on one machine, but the logs are on another. While this may work, I can't imagine it's ideal. Maybe try copying the logs locally, or running LS on `MyFileServer`. – GregL Aug 05 '15 at 15:49
  • @GregL and I did -- for the first type (ETL). Nothing came through for the second type, despite them going to the same index. – SeanKilleen Aug 05 '15 at 15:50
  • Maybe try starting LS with the `--verbose` flag so that it says what it's up to. Alternatively, try commenting out the ETL files stanza and outputting to stdout as well. See if that nets anything. – GregL Aug 05 '15 at 15:54
  • @GregL these are great tips, thank you! I'll try them ASAP. Unfortunately moving or copying the log files isn't really possible here -- this didn't strike me as non-ideal, so I'll dig into Logstash more to find out why it's not a good practice. – SeanKilleen Aug 05 '15 at 15:56
  • Here's another test to try: comment out the `sincedb_path` and `start_position` parameters and restart Logstash. Ensure that the *processor* log is actually getting new logs ('`tail -f` works) – KJH Aug 18 '15 at 14:19
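A related variation on the sincedb test above (a sketch, not from the original question): on Windows, pointing sincedb_path at NUL discards the position file, so the input re-reads the whole file on every restart. That rules out a stale sincedb as the reason the processor logs are skipped:

    file {
      path => '//MyFileServer/DATALOADER-TST/Archive/DataLoader_Supplemental/*.log'
      type => "processor"
      sincedb_path => "NUL"
      start_position => "beginning"
    }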

0 Answers