Questions tagged [apache-flume]

A distributed, available service for efficiently collecting, aggregating, and moving large amounts of log data.

Apache Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data. It has a simple and flexible architecture based on streaming data flows. It is robust and fault tolerant with tunable reliability mechanisms and many failover and recovery mechanisms. It uses a simple extensible data model that allows for online analytic application.

Homepage:

https://flume.apache.org/

4 questions
1
vote
0 answers

Most efficient way to pipe web beacon logs to Apache Flume?

The Setup: I've set up a simple Nginx server that logs (in JSON format), and the logs are then piped to an S3 bucket with Apache Flume. All the Nginx server does is respond with a web beacon tracking pixel and write to the log file. Everything's cool so…
landons
  • 111
  • 1
1
vote
1 answer

Processing pre-existing log files with Flume

I have a large set of log files that I need to extract data from. Is it possible to use Flume to read these files and dump them into HDFS (or Cassandra, or another data source) that I can then query? The documentation seems to suggest it's all…
duckus
  • 11
  • 2
1
vote
1 answer

Consistent Reliable Messaging

I'm working on a new project, and I'm currently deciding between Flume and Scribe as the messaging system (messages will most probably be sent to logs or Hadoop). I cannot lose a message, ever. What are your thoughts on which is better?…
Arenstar
  • 3,592
  • 2
  • 24
  • 34
0
votes
1 answer

Flume: Error Log while using FileChannel

I am using Flume flume-ng-1.5.0 (with CDH 5.4) to collect logs from many servers and sink them to HDFS. Here is my configuration: # Define Source, Sinks, Channel collector.sources = avro collector.sinks = HadoopOut collector.channels = fileChannel #…
Summer Nguyen
  • 214
  • 3
  • 10