grep simply fails when used on a few files

I've been trying for about the past 30 minutes to get this to work properly. grep is not exactly the most difficult thing to use, so I'm somewhat baffled as to why this won't work.

The files I'm trying to use grep on are simple XHTML log files. Their names are in the format name@domain.com.html, though I don't think that should matter, and inside is simple XHTML.

I copied one such log file to be testfile so you can see the output of some commands and why it's baffling to me:

[~/.chatlogs_windows/dec] > whoami
reid
[~/.chatlogs_windows/dec] > type grep
grep is /bin/grep
[~/.chatlogs_windows/dec] > uname -a
Linux reid-pc 2.6.35-22-generic #33-Ubuntu SMP Sun Sep 19 20:32:27 UTC 2010 x86_64 GNU/Linux
[~/.chatlogs_windows/dec] > cat /etc/issue
Linux Mint 10 Julia
[~/.chatlogs_windows/dec] > ls -lh testfile
-rw-r--r-- 1 reid reid  63K 2011-01-10 12:45 testfile
[~/.chatlogs_windows/dec] > tail -3 testfile 
</body>
</html>
[~/.chatlogs_windows/dec] > file testfile
testfile: XML document text
[~/.chatlogs_windows/dec] > grep html testfile 
[~/.chatlogs_windows/dec] > grep body testfile 
[~/.chatlogs_windows/dec] > grep "</html>" testfile 
[~/.chatlogs_windows/dec] > grep "</body>" testfile
[~/.chatlogs_windows/dec] > cat testfile | grep html
[~/.chatlogs_windows/dec] > cat testfile | wc -l
231
[~/.chatlogs_windows/dec] > cat testfile | tail -3
</body>
</html>
[~/.chatlogs_windows/dec] > chmod a+rw testfile && ls -lh | grep testfile
-rw-rw-rw- 1 reid reid  63K 2011-01-10 12:45 testfile
[~/.chatlogs_windows/dec] > grep html testfile

That's what I'm attempting to do. I want to just use grep -ri query . in ~/.chatlogs_windows, which normally works perfectly for me... but for some reason, it completely fails at going through these files.

If it matters, I copied these files off of my Windows 7 partition. But I chown'd them and gave myself all the appropriate permissions, and other programs (like cat) seem to read them just fine. I also copied testfile to testfile_unix and converted the line endings and tried that, but it didn't work either.

I'm using zsh, but I tried it in bash and that failed too. grep itself works normally elsewhere: I tried it on my documents folder and it worked flawlessly.

If you need any more information, just let me know. I tried googling around, but I found no reason for grep to simply not work. Thanks in advance.

Reid

Posted 2011-01-10T19:11:28.857

Reputation: 255

In bash, after you execute one of those failing grep calls, show the exit status: echo $? (aside: ls -Alh | grep testfile is more easily written/typed as ls -lh testfile) – Doug Harris – 2011-01-10T19:16:12.100

it doesn't make sense to me either but just to make sure: did you try: cat testfile | grep "html"? – yasouser – 2011-01-10T19:24:46.550

It gives "1" back as a result. I didn't even think about that. But according to grep's man page, unless I'm misreading it, 1 is the result when the lines aren't found (no matches). – Reid – 2011-01-10T19:26:08.597
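For later readers, here is a minimal sketch of what those exit statuses look like in practice. The scratch file and variable names are made up for illustration; the statuses themselves (0 for a match, 1 for no match, greater than 1 for an error) are the documented grep behaviour.

```shell
# Scratch file with one known line (hypothetical test data).
tmpfile=$(mktemp)
printf 'hello\n' > "$tmpfile"

# 0: at least one line matched.
found_status=0
grep -q hello "$tmpfile" || found_status=$?

# 1: grep ran fine but nothing matched.
missing_status=0
grep -q absent "$tmpfile" || missing_status=$?

# Greater than 1: a real error, here a file that does not exist.
error_status=0
grep -q hello "$tmpfile.does-not-exist" 2>/dev/null || error_status=$?

echo "match: $found_status, no match: $missing_status, error: $error_status"
```

So a status of 1 means grep read the file successfully and simply found nothing it considered a match, which points at the file contents rather than at permissions.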

@anand.arumug: Yes, I tried that and all sorts of variations. No matter what I do, nothing is ever given back. – Reid – 2011-01-10T19:26:50.467

Try creating a new file (touch ~/test.txt) and copy/paste the contents of testfile. And try grepping. See what happens. – yasouser – 2011-01-10T19:40:34.513

That works... for some reason. But why does grepping the regular files not? – Reid – 2011-01-10T19:44:55.460

Answers

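The grep tool doesn't recognise the UTF-16 file encoding. In UTF-16, each ASCII character is stored as two bytes, one of them NUL, so the plain byte sequence html never appears in the file and grep finds nothing to match.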
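A small sketch that reproduces the effect, assuming iconv and od are available. The scratch directory, file names, and sample content are invented for the demonstration; they are not the asker's files.

```shell
# Scratch directory so no real files are touched.
tmpdir=$(mktemp -d)

# The same two lines, once as UTF-8 and once as UTF-16LE.
printf '<html>\n</html>\n' > "$tmpdir/utf8.html"
iconv -f UTF-8 -t UTF-16LE "$tmpdir/utf8.html" > "$tmpdir/utf16.html"

# Count matching lines. "|| true" stops grep's "no match" status (1)
# from aborting a script that runs with "set -e".
utf8_hits=$(grep -c html "$tmpdir/utf8.html" || true)
utf16_hits=$(grep -c html "$tmpdir/utf16.html" || true)
echo "UTF-8 matches:  $utf8_hits"
echo "UTF-16 matches: $utf16_hits"

# Dump the raw bytes: every ASCII character is followed by a NUL (\0).
od -c "$tmpdir/utf16.html" | head -n 3
```

The od output makes the cause visible: the letters h, t, m, l are each separated by a \0 byte, so the four-byte pattern is never contiguous.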

RedGrittyBrick

Posted 2011-01-10T19:11:28.857

Reputation: 70 632

Grep and Windows files have stolen a tiny bit more of my soul. Thanks @Reid – Will – 2016-05-17T16:43:07.300

Genius! For anyone else who may have this issue in the future, I used the command iconv -f UTF-16 -t UTF-8 testfile > testfile_enc to convert it and tested from there. It'd be nice if grep would inform you of that, though, instead of silently failing. – Reid – 2011-01-10T19:49:10.590
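Building on that iconv command, one possible way to search without writing converted copies is to decode on the fly. This is only a sketch: grep_utf16 and grep_utf16_dir are hypothetical helper names, and it assumes every .html file under the directory really is UTF-16 and that file names contain no newlines.

```shell
# Hypothetical helper: decode one UTF-16 file to UTF-8, then grep it
# case-insensitively. Usage: grep_utf16 PATTERN FILE
grep_utf16() {
    iconv -f UTF-16 -t UTF-8 "$2" | grep -i -- "$1"
}

# Hypothetical wrapper: apply the helper to every .html file under a
# directory and prefix each hit with the file name, roughly like grep -ri.
# Usage: grep_utf16_dir PATTERN DIR
grep_utf16_dir() {
    find "$2" -type f -name '*.html' | while IFS= read -r f; do
        grep_utf16 "$1" "$f" | while IFS= read -r line; do
            printf '%s:%s\n' "$f" "$line"
        done
    done
}
```

With these defined, something like grep_utf16_dir query ~/.chatlogs_windows would stand in for grep -ri query . on the affected logs. Converting the files once with iconv, as in the comment above, is likely the simpler choice if they are searched often.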

Great! Learnt something interesting :) Thanks to @Reid and @RedGrittyBrick. – yasouser – 2011-01-10T20:16:43.623