I am planning on using Webalizer to analyze and graph our IIS logs, but because we have a server farm, Webalizer requires all of the log entries to be in chronological order (otherwise it starts skipping results).
Our logs are stored gzipped, so I started by unzipping everything to separate files (roughly the sketch shown after the LogParser command below) and then used LogParser 2.2 to merge those files. My LogParser command was:
LogParser.exe -i:iisw3c "select * into combinedLogFile.log from *.log order by date, time" -o:w3c
I probably don't need * but I do need most of the fields, because Webalizer will need them. This works perfectly fine on some of my logs; however, one of our server farm clusters generates a LOT of logs: we have 14 servers, and each server's log is (at least) 2.5 GB per day (each day is a separate log file). When I try to merge these logs, LogParser just crashes with a meaningless generic error.
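The unzip step mentioned above is essentially the following (a rough PowerShell sketch, not my exact script; it assumes .NET 4+ for Stream.CopyTo, and the output naming is a placeholder):

# Decompress every .gz log in the folder to a plain .log alongside it.
Get-ChildItem *.gz | ForEach-Object {
    $in  = [System.IO.File]::OpenRead($_.FullName)
    $gz  = New-Object System.IO.Compression.GzipStream($in, [System.IO.Compression.CompressionMode]::Decompress)
    $out = [System.IO.File]::Create(($_.FullName -replace '\.gz$', ''))
    $gz.CopyTo($out)   # streams the decompressed bytes straight to disk
    $gz.Close(); $in.Close(); $out.Close()
}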
I assumed the LogParser crash was a memory issue, so I tried a number of ways to minimize memory usage.
I am using PowerShell to call LogParser, so I started by trying to pipe the input in using standard PowerShell piping. (This caused an OutOfMemoryException in PowerShell itself, rather than in LogParser, and sooner than any approach that just used the files directly.)
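Simplified, it was something along these lines (an approximation, not the exact script; file names are placeholders), and anything of this shape pulls the whole input into PowerShell's memory:

# Read every log into memory first, then hand LogParser one pre-combined file.
$allLines = Get-Content *.log                     # materializes every line in memory
$allLines | Set-Content combined_unsorted.log
LogParser.exe -i:iisw3c "select * into combinedLogFile.log from combined_unsorted.log order by date, time" -o:w3c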
What I finally ended up with was using multiple named pipes, fed from a batch-file call to "cat", piped directly into LogParser... and that got me right back to where I started when I was unzipping everything up front.
We have other scripts that process these same log files, and none of them has issues (although their output is generally smaller than this one's will be).
So I just want to know if you have any ideas for a better way to merge all of these files, or a LogParser script that will work, since the one I came up with isn't sufficient.
P.S. I know I could probably write a merging program in .NET, since all of the individual logs are already sorted and I wouldn't need to read more than a few rows at a time, but I am trying to avoid having to do that if possible.
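For what it's worth, the streaming merge I'm trying to avoid writing would look roughly like this (a sketch only, in PowerShell since that's what drives everything else; it assumes every input file is already sorted, all servers share the same #Fields layout, and date and time are the first two space-separated fields as in the default IIS W3C layout — file names are placeholders):

$readers = @(Get-ChildItem *.log | ForEach-Object { New-Object System.IO.StreamReader($_.FullName) })
$writer  = New-Object System.IO.StreamWriter("combinedLogFile.log")

# Read the next data line from reader $i; echo '#' directive lines only when asked,
# so the W3C header comes through once, from the first file.
function Read-NextLine([int]$i, [bool]$keepDirectives) {
    while (-not $readers[$i].EndOfStream) {
        $line = $readers[$i].ReadLine()
        if ($line.StartsWith("#")) {
            if ($keepDirectives) { $writer.WriteLine($line) }
            continue
        }
        return $line
    }
    return $null
}

# Prime one "current line" per file.
$current = @($null) * $readers.Count
for ($i = 0; $i -lt $readers.Count; $i++) { $current[$i] = Read-NextLine $i ($i -eq 0) }

while ($true) {
    # Pick the file whose current line has the smallest "date time" prefix.
    $best = -1; $bestKey = $null
    for ($i = 0; $i -lt $readers.Count; $i++) {
        if ($null -eq $current[$i]) { continue }
        $key = ($current[$i] -split ' ')[0..1] -join ' '
        if ($best -lt 0 -or $key -lt $bestKey) { $best = $i; $bestKey = $key }
    }
    if ($best -lt 0) { break }            # all files exhausted
    $writer.WriteLine($current[$best])
    $current[$best] = Read-NextLine $best $false
}

$readers | ForEach-Object { $_.Close() }
$writer.Close()

Plain string comparison is enough for the ordering here because the W3C date (yyyy-mm-dd) and time (hh:mm:ss) formats sort lexicographically.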