49

Possible Duplicate:
Alternatives to Splunk?

This has been discussed, but it has been several months, so it may be time to revisit it:

Earlier discussion RE Splunk alternatives

For the record, Splunk rocks. But the pricing is simply beyond what we can consider: when I spoke with Splunk today, the cost for a system to index 5 GB/day of data was over $30,000.

That is more than we spend on SQL Server (by a large multiple), more than we spend on a rack of servers (by a multiple), etc. etc.

The Splunk sales team is correct that for $30K we get more value and functionality than if we spent the same amount building our own system, but it doesn't matter. The Splunk cost is simply too high (by a multiple).

Soooooo, we are looking around!

Is anyone out there building a Splunk-like system?

Our basic need:

  • Able to listen for syslog messages on multiple UDP ports
  • Able to index the incoming data in an async way
  • Some kind of search engine
  • Some kind of UI
  • An API to the search engine (to embed in our console)
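The first two requirements (several UDP listeners, asynchronous indexing) can be sketched with Python's standard `asyncio` module. This is only an illustration under stated assumptions, not what the poster built: `SyslogProtocol`, `start_listeners`, `index_worker`, and the list used as a `store` are hypothetical names, and a real system would write to a database instead of appending to a list.

```python
import asyncio


class SyslogProtocol(asyncio.DatagramProtocol):
    """Pushes every received datagram onto a shared queue without blocking."""

    def __init__(self, queue, port_label):
        self.queue = queue
        self.port_label = port_label

    def datagram_received(self, data, addr):
        # Keep the receive path cheap: decode, enqueue, return.
        message = data.decode("utf-8", errors="replace").rstrip("\n")
        self.queue.put_nowait((self.port_label, addr[0], message))


async def start_listeners(queue, ports, host="127.0.0.1"):
    """Bind one UDP endpoint per port and return the transports."""
    loop = asyncio.get_running_loop()
    transports = []
    for port in ports:
        transport, _ = await loop.create_datagram_endpoint(
            lambda p=port: SyslogProtocol(queue, p),
            local_addr=(host, port),
        )
        transports.append(transport)
    return transports


async def index_worker(queue, store):
    """Drain the queue and hand each message to the (hypothetical) store."""
    while True:
        item = await queue.get()
        store.append(item)  # stand-in for a real database write
        queue.task_done()
```

Because the listeners only enqueue, a slow index write delays the worker rather than the socket reads. The queue here is unbounded, so a production version would need a size limit or some form of back-pressure.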

We currently need to index 3-5 GB/day, but need to be able to scale to 10 GB/day or more. We do not need a lot of history (30 days is fine).
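For a rough sense of scale, the figures above translate into raw storage and average ingest rate as follows. This is a back-of-the-envelope sketch that assumes decimal gigabytes and counts only the raw log data, not index overhead or replicas; the function names are made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def retained_gb(gb_per_day, retention_days):
    """Raw data kept on disk once the retention window is full."""
    return gb_per_day * retention_days


def average_bytes_per_second(gb_per_day):
    """Average ingest rate, assuming 1 GB = 1,000,000,000 bytes."""
    return gb_per_day * 1_000_000_000 / SECONDS_PER_DAY


for rate in (3, 5, 10):
    kb_per_s = average_bytes_per_second(rate) / 1000
    print(f"{rate} GB/day: {retained_gb(rate, 30)} GB retained, ~{kb_per_s:.0f} kB/s average")
```

At 10 GB/day and 30 days of history, that is 300 GB of raw data. Peak rates will be higher than the daily average, so any sizing should leave headroom.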

We use Windows 2008 and 2003 servers.

Thanks for your thoughts!

UPDATE: We spent two weeks researching commercial and open source options. Our conclusion: write our own (we are a software company... we know how to write things). In about one engineering week, we built a great system on MongoDB and .NET that gives us the functions we needed from Splunk. We have now completed our implementation. We use two MongoDB servers (master and slave), and are able to log and index any amount of log data (5 GB/day, 15 GB/day, etc.), limited only by disk space.
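The update does not describe the schema, so the following is only a guess at the kind of record a document store could hold per syslog line. It splits the leading `<PRI>` value into facility and severity (PRI = facility × 8 + severity in the syslog format) and extracts lowercase tokens for searching. `to_document` and all of its field names are hypothetical.

```python
import re
from datetime import datetime, timezone

PRI_RE = re.compile(r"^<(\d{1,3})>(.*)$", re.DOTALL)


def to_document(raw, source_ip, received_at=None):
    """Turn one raw syslog line into a dict suitable for a document store."""
    received_at = received_at or datetime.now(timezone.utc)
    match = PRI_RE.match(raw)
    if match:
        # PRI encodes both values: facility * 8 + severity.
        facility, severity = divmod(int(match.group(1)), 8)
        text = match.group(2)
    else:
        # No PRI header: keep the line as-is and leave the fields empty.
        facility = severity = None
        text = raw
    return {
        "received_at": received_at,
        "source_ip": source_ip,
        "facility": facility,
        "severity": severity,
        "message": text,
        "tokens": sorted(set(re.findall(r"[a-z0-9]+", text.lower()))),
    }
```

With a shape like this, a store such as MongoDB could index the `tokens` array for keyword search and use `received_at` to expire documents older than the 30-day window. Whether the poster's system worked this way is not stated.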

UPDATE TO THE UPDATE (December, 2012): We continue to use our MongoDB solution, and it works great! If we were building it today, we would strongly consider building it on top of Elasticsearch.

OBSERVATIONS: This space needs a solid solution at a $1000-3000 flat rate. The licensing models used by the commercial firms are based on a "milk the data center ops guys" model. That is their right (of course!), but it leaves a HUGE space open for someone to come in underneath them. My guess is that in another year or two there will be a good open source solution that will be really usable.

Thank you all for your input (even if it was self promotion).

  • 4
    I'm sorry but you really generate 5gb/day off of 4-8 servers (what you can get for 30k depending on spec) ? I'd suggest going back and really looking at what you index no matter what solution you go with. – Zypher Feb 23 '11 at 22:48
  • 1
    This data is coming from an app (our code), not from an OS.... – Jonesome Reinstate Monica Feb 24 '11 at 01:35
  • this is a duplicate of http://serverfault.com/questions/62687/alternatives-to-splunk – warren Mar 03 '11 at 13:29
  • 1
    I actually think Splunk's very cheap for what it is myself, either way this is a duplicate, at least TRY to do a search first next time. Closing. – Chopper3 Mar 03 '11 at 13:49
  • 23
    Hey, this thread is honest! It links to the other thread, it discusses more alternatives than the other thread, and it is the only thread that discusses the hard costs of the alternatives. Methinks you are being hasty. – Jonesome Reinstate Monica Mar 06 '11 at 04:40
  • Regarding your observation section: There's already a solution that fits this and has been around for many years...LogZilla. It can handle hundreds of millions of events and won't cost a million bucks... – Clayton Dukes Sep 04 '12 at 15:03
  • We're working on a new tool called CloudPelican that provides you realtime aggregation, indexing, searching, event notification and more. You can stay in touch with the development at http://www.cloudpelican.com – RobinUS2 Sep 14 '12 at 14:18
  • sam, can I ask how did you implement the syslog listeners? – MatteoSp Jan 27 '14 at 10:36
  • 1
    This is all well and good assuming your needs are extremely focused and won't change. The more I learn about Splunk though, the more I see its value. Can your system import data from a file? Syslog? A TCP stream? A crash report? A REST web service? A database? Then can it correlate all that together and provide visualizations? Can it create alerts? Can it export? How much would it cost you to develop custom software for all that? Disclaimer: I have no association with Splunk – Wade Williams Sep 03 '15 at 15:22
  • @WadeWilliams But we don't need all that... (we would like it, but we don't need it). We are monitoring a single (large, multi-faceted) web app that we authored. We store 0.6 TB of log data, instantly searchable... and we did not have to pay for Splunk. We like (and want) Splunk, but the price tag is the issue. – Jonesome Reinstate Monica Sep 03 '15 at 15:52
  • @WadeWilliams, all that is trivial in the world of Apache Spark. – PKHunter Feb 18 '17 at 16:41
  • @PKHunter Apache Spark is not relevant to this convo. This is a log management convo, not a platform convo. – Jonesome Reinstate Monica Feb 19 '17 at 06:52
  • Thank you @samsmith. My mistake then, I got thrown by remarks such as "For the record, Splunk rocks". I thought it was within the remit of this convo to discuss solutions to the log management problem, especially because the crux of the question was to seek alternatives to an expensive solution such as Splunk. In my observation Spark can do log management just fine, perhaps better than many similar solutions. But OK. – PKHunter Feb 19 '17 at 16:25

2 Answers

25

logstash is a tool for managing events and logs. You can use it to collect logs, parse them, and store them for later use (like, for searching). Speaking of searching, logstash comes with a web interface for searching and drilling into all of your logs.

https://www.elastic.co/products/logstash

It's still rather early in development, but it sounds very promising and moves fast.

Not Now
Holger Just
  • 1
    logstash looks interesting, but looks like a lot of work to implement. Also, there are no current downloads in the download section? – Jonesome Reinstate Monica Feb 24 '11 at 05:42
  • 1
    Logstash is distributed as a rubygem. So you can just say `gem install logstash`. See http://rubygems.org/gems/logstash and http://code.google.com/p/logstash/wiki/GettingStartedCentralized – Holger Just Feb 24 '11 at 14:21
  • 1
    You can also use other search tools on the data collected by Logstash, such as Graylog2 which has an interface some people prefer. – Martijn Heemels Sep 04 '11 at 01:11
  • 6
    I recommend Logstash using Kibana ( http://kibana.org/ ), a highly scalable interface for Logstash. Yes, it's open source too :) – Paulo Coghi Oct 24 '12 at 13:51
  • @Paulocoghi Given that the first commit to Kibana occurred about three months after the last comment, I'm sure you can excuse me for not including it here. That said, Kibana is indeed rather awesome, esp. the new Ruby version. – Holger Just Oct 27 '12 at 09:47
9

I don't have a comparison matrix for the following in mind, especially not one that compares them with Splunk:

These are some fully operational tools:

Octopussy http://www.octopussy.pm

Logreport http://www.logreport.org/

Snare: http://www.intersectalliance.com/projects/index.html

Log surfer: http://www.crypt.gen.nz/logsurfer/

Log Analyser: http://loganalyzer.adiscon.com/

Log 2 timeline: http://log2timeline.net/#download ( this is more of a "timeline" analysis tool )

Finally, if you want to do some coding yourself but possibly end up with a more scalable solution: (the following are tools to collect log data; they don't necessarily have all the functionality out of the box to search through the data.)

Honu https://github.com/jboulon/Honu

Chukwa http://wiki.apache.org/hadoop/Chukwa

Flume http://archive.cloudera.com/cdh/3/flume/

Edit: Added this comparison link: http://csgrad.blogspot.com/2010/07/guided-tour-of-hadoop-zoo-getting-data.html

Edit: Added Graylog2. Added Logstash. Logstash is probably the best positioned today to become the "open source Splunk replacement."

Not Now