I need to get a rough idea of disk space requirements before I start forwarding logs to a Splunk instance. Each indexed line will average 320 characters, and I will be indexing around 500,000 lines a day.

My assumptions are 1 byte per character and I'm ignoring space taken by Splunk for indices, etc. That works out to 320 × 500,000 = 160,000,000 bytes, i.e. roughly 160 MB per day.
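For what it's worth, here's a minimal sketch of that back-of-the-envelope calculation (the constant names are mine; the 1-byte-per-character and zero-index-overhead assumptions are as stated above):

```python
# Naive daily storage estimate: 1 byte per character,
# ignoring Splunk's own index overhead entirely.
AVG_LINE_CHARS = 320     # average characters per indexed line
LINES_PER_DAY = 500_000  # lines indexed per day

raw_bytes_per_day = AVG_LINE_CHARS * LINES_PER_DAY
print(f"{raw_bytes_per_day / 1_000_000:.0f} MB/day")  # -> 160 MB/day
```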

Would you say that's semi-accurate or totally off the mark?

Michael
    Google is your friend. http://docs.splunk.com/Documentation/Splunk/6.1.1/Installation/Estimateyourstoragerequirements – Sven Jun 04 '14 at 13:22
  • Oh, but I don't have access to a Splunk instance to test this on. – Michael Jun 04 '14 at 13:26
  • But now you know that a) Splunk stores the incoming logs in a compressed format b) The index size is not negligible (10% - 110% of incoming raw data size). c) The actual size isn't as easy to compute as you expect, as it's highly dependent on your data. d) You really need a test instance of Splunk to get valid, reliable data. – Sven Jun 04 '14 at 13:29
  • Got it. I didn't actually know most of that. Basically it's impossible to give a ballpark estimate without actually testing it. – Michael Jun 04 '14 at 13:31
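For a rough sense of how those factors combine, here is a minimal sketch. The 15% compression ratio is an illustrative assumption (the real ratio is highly data-dependent, as Sven notes); the 10% - 110% index range is taken from his comment above:

```python
# Rough daily storage range, combining the naive 160 MB/day figure with:
#   - compressed raw data: assumed ~15% of incoming size (illustrative only;
#     the actual ratio depends heavily on the data)
#   - index files: 10% - 110% of incoming raw size (per the comment above)
RAW_MB_PER_DAY = 160

COMPRESSED_RAW = 0.15          # assumed raw-data compression ratio
INDEX_OVERHEAD = (0.10, 1.10)  # index size as a fraction of raw data

low = RAW_MB_PER_DAY * (COMPRESSED_RAW + INDEX_OVERHEAD[0])
high = RAW_MB_PER_DAY * (COMPRESSED_RAW + INDEX_OVERHEAD[1])
print(f"estimated {low:.0f}-{high:.0f} MB/day")  # -> estimated 40-200 MB/day
```

The width of that range is exactly why a test instance is the only way to get a reliable number.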

1 Answer

Unless your logs are highly random, I think you'll be extremely pleased with how much space Splunk will save you. If these are, say, syslog messages or the sort of messages you get out of Apache/WebLogic etc., then you'll see a very sharp decrease in the space needed, perhaps by as much as 80-90%.

As an example, we log about 17TB/day and keep the original logs for 90 days before discarding them (we filter them into summary indexes), and we have roughly 160TB of 'hot' stored data, which is about a 90% reduction over the incoming volume.
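As a quick sanity check on that figure, here is the arithmetic from the numbers above:

```python
# Sanity check on the ~90% reduction claim, using the figures above.
INGEST_TB_PER_DAY = 17
RETENTION_DAYS = 90
STORED_TB = 160

raw_tb = INGEST_TB_PER_DAY * RETENTION_DAYS  # 1,530 TB ingested over 90 days
reduction = 1 - STORED_TB / raw_tb
print(f"{reduction:.0%} reduction")          # -> 90% (89.5%) reduction
```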

Hope this helps.

Chopper3