I'm currently parsing Apache logs with this command:
tail -f /opt/apache/logs/access/gvh-access_log.1365638400 |
grep specific.stuff. | awk '{print $12}' | cut -d/ -f3 > ~/logs
The output is a list of domains:
www.domain1.com
www.domain1.com
www.domain2.com
www.domain3.com
www.domain1.com
In another terminal I then run this command:
watch -n 10 'cat ~/logs | sort | uniq -c | sort -n | tail -50'
The output is:
1023 www.domain2.com
2001 www.domain3.com
12393 www.domain1.com
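For reference, the counting stage of that watch command can be tried on the five sample domains listed above (the counts below come from that sample, not from a real log):

```shell
# The sort | uniq -c | sort -n stage of the watch command, fed the
# five sample domains from above instead of the accumulated log file:
printf '%s\n' www.domain1.com www.domain1.com www.domain2.com \
              www.domain3.com www.domain1.com |
sort | uniq -c | sort -n
```

On this input it prints 1 www.domain2.com, 1 www.domain3.com, then 3 www.domain1.com (uniq -c left-pads the counts with spaces).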
I use this to monitor Apache stats in quasi-real time. The trouble is that the logs get very big very fast, and I don't need the log dump for any purpose other than feeding uniq -c.

My question is: is there any way to avoid the temporary file? I don't want to hand-roll my own counter in my language of choice; I'd like to use some awk magic if possible.

Note that since I need to use sort, I have to use a temp file in the process, because sorting a stream is meaningless (whereas uniq on a stream isn't).
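One way to sidestep both the temp file and sort's need for EOF is to keep the counts in an awk associative array and print a snapshot every N lines instead of at END. A minimal sketch, with made-up log lines standing in for the tail -f | grep stream (the $12 field and third /-separated component mirror the pipeline above):

```shell
# Made-up access-log lines stand in for: tail -f ... | grep specific.stuff.
printf '%s\n' \
  'f1 f2 f3 f4 f5 f6 f7 f8 f9 f10 f11 http://www.domain1.com/a' \
  'f1 f2 f3 f4 f5 f6 f7 f8 f9 f10 f11 http://www.domain2.com/b' \
  'f1 f2 f3 f4 f5 f6 f7 f8 f9 f10 f11 http://www.domain1.com/c' |
awk '{
    split($12, parts, "/")   # parts[3] is the domain, like cut -d/ -f3
    count[parts[3]]++
    if (NR % 3 == 0)         # print a snapshot every 3 lines (use e.g. 1000)
        for (d in count) print count[d], d
}'
```

On a live tail -f this never waits for EOF: the periodic snapshot replaces the sort | uniq -c pass. The for (d in count) iteration order is unspecified in awk, so pipe a snapshot through sort -n if the ordering matters.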
It doesn't work because it is meaningless to use sort on a stream; that's why I need a temp file in the process. – cpa – 2013-04-11T09:29:11.400
Have you tried it and seen that it doesn't work, or are you just assuming it won't? Creating the temporary file is the same thing as piping the output of your first command into the second command as its input. If you haven't tried, just try it. If you have, what problem did you encounter? – MelBurslan – 2013-04-11T09:54:05.893
There are several reasons why this doesn't work (and I've tried): sort waits for EOF before writing its output (I hope it's obvious why), and tail -50 takes the last 50 lines counting back from EOF. So in the end it comes down to the fact that tail -f on an Apache log will never output EOF, since lines are constantly being appended to the file. Dumping the results in a file is a way around that. Sure, I could use plain tail instead, but that still requires parsing the whole log file every time, which is stupid. – cpa – 2013-04-11T10:05:40.757
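The sort-waits-for-EOF behaviour described in this comment is easy to see with an artificial slow producer (the sleep is made up purely to make the buffering visible):

```shell
# sort emits nothing until its input reaches EOF: both lines appear
# together after the sleep finishes, not as they are produced.
{ echo b; sleep 2; echo a; } | sort
```

Nothing prints for two seconds, then "a" and "b" appear at once. With tail -f, that EOF never arrives, so sort would never produce any output at all.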