
I'm setting up a generic Elasticsearch-Logstash-Kibana stack to deploy to a few of my clients. I'm trying to template some of the pipelines, so that we only need to deploy configs/pipelines as needed for each client.

Logstash refers to input {...}, filter {...}, and output {...} as sections, to their contents as plugins, to their aggregation as a processing pipeline, and to the file each pipeline is contained in as a config file.
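For reference, a minimal annotated skeleton of that terminology (the plugins shown are just placeholders):

input {           # section
    stdin { }     # plugin: read events from standard input
}
filter {          # section
    mutate { }    # plugin: e.g. rename or convert fields
}
output {          # section
    stdout { }    # plugin: print events to the console
}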

With that in mind, is there a scope for sections and pipelines? That is, are the sections defined in a particular config file used only by the pipeline in that config file?

If I had 2 config files and 2 pipelines:

# my_apache_pipeline.config
input {
    tcp {
        port => 5000
    }
}
filter {
    if [application] == "httpd" {
       ...
    }
}
output {
    elasticsearch {
       ...
    }
}

and

# my_nginx_pipeline.config
input {
    tcp {
        codec => "json"
        port => 6000
    }
}
filter {
    if [application] == "nginx" {
        ...
    }
}
output {
    elasticsearch {
        ...
    }
}

Do the two config files above create the same two pipelines as the single config file below?

# my_merged_pipeline.config
input {
    tcp {
        codec => "json"
        port => 6000
    }
    tcp {
        port => 5000
    }
}
filter {
    if [application] == "nginx" {
        ...
    }
    if [application] == "httpd" {
        ...
    }
}
output {
    elasticsearch {
        ...
    }
}

That is, does it matter which config file a set of plugins/sections is in when the pipelines are built? Or does an input {...} defined in a particular config file apply only to the filter {...} and output {...} in that same file?

Drew

1 Answer


As far as I understand the way it works, the config scope is global: Logstash concatenates every file it loads, so the contents of a given section type (input, filter, output) are all effectively joined together in the "final" configuration.

So in your example, yes, the two separate config files would equate to your merged config, with the one difference that you'd end up with two elasticsearch outputs rather than just one. I don't think LS is smart enough to recognize that they're the same and deduplicate them.
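To illustrate, a sketch of what the output section of the effective configuration would look like (the ... stands in for your settings, as in your examples):

output {
    elasticsearch {    # from my_apache_pipeline.config
        ...
    }
    elasticsearch {    # from my_nginx_pipeline.config
        ...
    }
}

Every event would be sent to both outputs, since nothing scopes an output to the input it originally shared a file with.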

My suggestion is to create nicely named files based on section, function and log type:

#inputs
00-input-lumberjack.conf
01-input-syslog.conf
02-input-syslog_vmware.conf

#filters
11-filter-haproxy.conf
12-filter-lighttpd.conf
13-filter-syslog.conf
14-filter-proftpd.conf
15-filter-httpd.conf
17-filter-cron.conf
18-filter-yum.conf
88-filter-checksum.conf
89-filter-cleanup.conf

#outputs
97-output-kafka.conf
98-output-redis.conf
99-output-elasticsearch.conf

You'd then run LS with the -f switch pointing at the folder that contains all the files above (e.g. -f /etc/logstash/conf.d/, or wherever you keep them).

This will ensure they get loaded in a predictable order (hence the numeric prefixes) and that you don't have duplication.

In my case, filters and outputs are wrapped in if conditionals that check the type/host/file/etc. to make sure a given section only applies to certain events, as sketched below.
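A minimal sketch of that guard pattern, assuming your events carry a type field and with the plugin settings elided as in your examples:

filter {
    if [type] == "syslog" {
        grok {
            ...    # syslog-specific parsing
        }
    }
}
output {
    if [type] == "syslog" {
        elasticsearch {
            ...
        }
    }
}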

This also allows a couple extra things.

  1. Quickly activate/deactivate sections by moving files to an inactive subfolder and restarting LS; since it doesn't load config files recursively, anything in the subfolder is ignored
  2. Work on new log types out of band, then copy the new config file to the directory and restart LS
GregL