grep - recognize carriage return as new line

Question

I want to search a web server running Unix for PHP files containing a specific string. I usually use this command to accomplish that:

find . -name "*.php" -print0 | xargs -0 grep -H -i "the string to search for"

This will find any php file containing "the string to search for", and print the file name and the line in which the match was made.

This has worked great so far, but now I've encountered a server where none of the PHP scripts have any line feeds, only carriage returns. grep doesn't seem to recognize a carriage return as a new line, so the command above prints the entire contents of a file if there is a match within it, instead of printing just the matching line.

Any help would be greatly appreciated!

There are several examples of how to convert line endings using a perl one-liner in this article: http://en.wikipedia.org/wiki/Newline — Zoredache, Sep 30 '11 at 20:50

score 2 · Answer 1 · edited Jun 11 '20 at 10:02

What about using these two flags? (This is from grep on my Ubuntu; I'm pretty sure most greps out there have them.)

  -o, --only-matching
         Print only the matched (non-empty) parts of a matching line, with each such part on a separate output line.

together with

  -b, --byte-offset
         Print the 0-based byte offset within the input file before each line of output.  If -o (--only-matching) is specified, print the offset of
         the matching part itself.

Then you have the filename and the part of it you want.
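As a rough sketch of how the two flags combine, assuming GNU grep (neither -o nor -b is in POSIX); the helper name grep_offsets is made up for illustration:

```shell
# Hypothetical helper: print "file:byte-offset:match" for every hit.
# Assumes GNU grep, since -o and -b are not required by POSIX.
grep_offsets() {
    pattern=$1
    shift
    grep -H -i -o -b -- "$pattern" "$@"
}

# Possible usage against the files from the question:
#   find . -name "*.php" -print0 | xargs -0 grep -H -i -o -b "the string to search for"
```

This only shows the matched text and where it sits in the file, not the surrounding CR-delimited line, so it is a workaround rather than a real fix.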

Also, how did you manage to mangle your files like that? I tried using VI to replace newlines with CR only. But that made grep and cat behave very strangely instead.

contents of file test

gggggggggggggggggggg^Mggggggggasdfgggggggg^Mgggggggggggggggggggg

~/test$ grep asdf test

gggggggggggggggggggg

~/test$ cat test

gggggggggggggggggggg

Looks normal in notepad
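The odd output above is most likely a display effect rather than grep or cat misbehaving: a terminal treats a carriage return as "go back to column 0", so each segment is drawn on top of the previous one. A small sketch for making the bytes visible, using a made-up sample file that mirrors the test file above:

```shell
# Build a small CR-only sample (hypothetical content).
sample=$(mktemp)
printf 'gggg\rggasdfgg\rgggg' > "$sample"

# cat -v prints each carriage return as ^M instead of sending it to the terminal.
cat -v "$sample"    # prints: gggg^Mggasdfgg^Mgggg

# od -c dumps the raw bytes; carriage returns show up as \r.
od -c "$sample"
```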

score 2 · Answer 2 · answered Sep 30 '11 at 23:18

Unfortunately, grep won't do what you want. There isn't a command line option to get it to recognize the CR character as a line separator. However, you can do what you want with a bit of awk instead! Try this:

find . -name '*.php' -print0 | \
    xargs -0 awk -v RS="\r" '/string to search for/ {print FILENAME ": " $0}'
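Note that the awk pattern above is case-sensitive, unlike the grep -i in the question. A possible variation, as a sketch only: the helper name search_cr is invented here, the search term is treated as a fixed string rather than a regex, and FNR is used as the CR-delimited "line" number.

```shell
# Hypothetical helper: case-insensitive fixed-string search in CR-delimited files.
# Prints "file:record-number:record" for each matching record.
# Caveat: awk interprets backslash escapes in values passed with -v.
search_cr() {
    pattern=$1
    shift
    awk -v RS='\r' -v pat="$pattern" '
        index(tolower($0), tolower(pat)) { print FILENAME ":" FNR ":" $0 }
    ' "$@"
}

# Possible usage:
#   search_cr "the string to search for" ./some/file.php
```

The same awk program can be dropped into the find | xargs -0 pipeline in place of the one-liner.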

Awk isn't nearly as fast as grep, so this method could take a lot longer depending on the number of files and their sizes. It may be worthwhile to simply convert all of the line endings of your PHP files if you're going to do a lot of grepping on them. If you don't have a convenient utility available to do this for you, this shell script ought to do it:

find . -name '*.php' | while IFS= read -r PHPFILE; do
    # Keep the original as a backup, then write the converted copy in its place.
    mv "$PHPFILE" "$PHPFILE".orig &&
        awk -v RS="\r" '{print $0}' < "$PHPFILE".orig > "$PHPFILE"
done
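If awk is not at hand, tr can do the same conversion. A minimal sketch, assuming the files really are CR-only (a CRLF file would end up with an extra blank line after every line); cr_to_lf is a name made up for this example:

```shell
# Hypothetical helper: replace every CR in FILE with LF,
# keeping the untouched original as FILE.orig.
cr_to_lf() {
    mv -- "$1" "$1.orig" &&
        tr '\r' '\n' < "$1.orig" > "$1"
}

# Possible usage:
#   find . -name '*.php' | while IFS= read -r f; do cr_to_lf "$f"; done
```

The converted file is created fresh, so it gets default permissions; the .orig copy keeps the original ones.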

score 1 · Answer 3 · answered Sep 30 '11 at 20:37

What if you do something like this?

find . -name "*.php" -print | while IFS= read -r i ; do if grep -q -i "the string to search for" "$i" 2>/dev/null ; then echo "$i" ; fi ; done

You should then get only the names of the files that contain what you are looking for.
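For what it's worth, grep's own -l flag (list matching files only, part of POSIX) appears to give the same result without a loop. A sketch, with list_matching_php as a made-up helper name:

```shell
# Hypothetical helper: list the PHP files under DIR that contain STRING,
# ignoring case. Prints one file name per match and nothing else.
list_matching_php() {
    find "$1" -name '*.php' -exec grep -l -i -- "$2" {} +
}

# Possible usage:
#   list_matching_php . "the string to search for"
```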

Answered by Brandon Rush.

So, I'm only getting the filename. Was that what you intended? I guess it's better than what my line does but it would be great to see the actual match (line) of each file as well. – quano Sep 30 '11 at 20:48
Yes, that was my intention; sorry for the confusion on that. – Brandon Rush Oct 04 '11 at 03:36
