How to grep for line numbers in a binary file?

0

I'm on CentOS 5 Linux, using GNU grep v2.5.1, and looking at a 36 GB log file. I need to find around a million lines starting from the occurrence of the string 6307459 in the log file and view them in emacs. I'm using grep to find the line number of the occurrence, then head and tail to extract the section I'm interested in reviewing. The problem is that grep finds the line but does not print it; instead it prints a message saying the file is binary:

> grep -n 6307459 /disk2/user/test/logs/2015-03-31-23-42-52-7224.log 
Binary file /disk2/user/test/logs/2015-03-31-23-42-52-7224.log matches

I imagine there are control characters somewhere in the log file that are tricking grep, but the beginning and end of the file look like normal text.
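One common reason GNU grep classifies a file as binary is a NUL byte in the data it inspects. The question does not confirm that this is the cause here, so the following is only a minimal sketch under that assumption: it counts NUL bytes with tr and wc. The temporary sample file is a hypothetical stand-in for the real log.

```shell
# Hypothetical small sample standing in for the real log file.
sample=$(mktemp)
printf 'line one\nbad\000line\nline three\000\n' > "$sample"

# Delete every byte except NUL, then count what is left.
nul_count=$(tr -cd '\000' < "$sample" | wc -c)
nul_count=$((nul_count))   # strip any padding some wc implementations add
echo "NUL bytes: $nul_count"

rm -f "$sample"
```

A non-zero count would be consistent with the "Binary file ... matches" message, though other byte sequences might trigger it too.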

I tried renaming it to /disk2/user/test/logs/2015-03-31-23-42-52-7224.log.txt, but it still says it is a binary file.

How can I get the line number of the occurrence of the pattern 6307459 in the file so that I can use head and tail to see 20 lines before the pattern and 1,000,000 after the pattern?

WilliamKF

Posted 2015-04-06T19:14:34.427

Reputation: 6 916

See: How do I grep through binary files that look like text? at serverfault SE

– kenorb – 2015-04-06T21:35:41.867

Answers

1

Per s g's linked answer from serverfault, passing -a to grep forces binary files to be treated as text files. Here is the detailed solution:

> grep -a -n 6307459 /disk2/user/test/logs/2015-03-31-23-42-52-7224.log
171560394:Rcvd client's reconnect count 6307459.
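If the line number is going to be fed into later commands, it may be convenient to capture it in a variable rather than copy it by hand. A minimal sketch, assuming only the first match is wanted: -m 1 stops grep after the first matching line, and cut keeps the part before the first colon. The temporary file and the variable names are hypothetical.

```shell
# Hypothetical small sample standing in for the real log file;
# the NUL byte on line 2 makes grep treat it as binary without -a.
log=$(mktemp)
printf "alpha\nbe\000ta\nRcvd client's reconnect count 6307459.\nomega\n" > "$log"

# -a: treat the file as text; -n: prefix the line number; -m 1: stop at the first match.
line=$(grep -a -n -m 1 '6307459' "$log" | cut -d: -f1)
echo "first match on line $line"

rm -f "$log"
```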

Using the line number found, 171560394, I then ran the following command to extract a million lines starting about 100 lines before the match, so that I can view them in emacs. The last line kept is 171560394 + 999900 = 172560294, which makes the first line kept 171560295:

> head -n 172560294 /disk2/user/test/logs/2015-03-31-23-42-52-7224.log  \
  | tail -n 1000000 > /disk2/user/test/logs/2015-03-31-23-42-52-7224.log_mid
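The same kind of extraction can also be sketched with tail -n +START piped into head -n COUNT, which states the range directly and lets head stop once the range has been printed. This is an alternative under stated assumptions, not the command used above: the 100-line sample file, the match position, and the variable names are hypothetical, and the start line must come out as 1 or greater.

```shell
# Hypothetical 100-line sample standing in for the real log file.
log=$(mktemp)
seq 1 100 | sed 's/^/entry /' > "$log"

match=50    # line number reported by grep -a -n (assumed value for this sample)
before=20   # lines of context wanted before the match
after=10    # lines wanted after the match (1000000 in the question)

start=$((match - before))        # first line to keep
count=$((before + 1 + after))    # context + matching line + trailing lines

# tail -n +N starts output at line N; head -n keeps the next $count lines.
section=$(tail -n +"$start" "$log" | head -n "$count")
printf '%s\n' "$section" | head -n 1
printf '%s\n' "$section" | tail -n 1

rm -f "$log"
```

With the values from the question this would be match=171560394, before=20, and after=1000000, with the output redirected to a file instead of a variable.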

WilliamKF

Posted 2015-04-06T19:14:34.427

Reputation: 6 916