What does this grep command actually do?

I'm trying to get the hang of grep. I've got the following command from a GeekLet script that someone made for getting the weather info off some website:

curl -s 'http://thefuckingweather.com/?zipcode=61820' | grep '"content\|"remark\|span' || sed 's/<[^>]*>//g' | sed 's/]*>//g' | sed 's/&#176;/°/'

I'm not worried about the sed command right now. I know it clears up the output to be neatly formatted, but for now I'm just trying to figure out the grep command.

I have a few questions that none of the guides/manuals out there seem to answer clearly:

  1. What does the backslash (\) do over here?
  2. What do the pipes "|" in between do?
  3. Why is "content\|" in double quotes?

Also, do you know of any other ideas or guides that cover parsing HTML content with grep?

Sid

Posted 2012-12-07T03:12:33.347

Reputation: 123

Answers

  1. What does the backslash (\) do over here?

    In grep's default (basic) regular expression syntax, an "escaped" pipe (\|) means logical OR; this is a GNU extension. In other words, grep 'foo\|bar' means print any line that contains either "foo" or "bar".

  2. What do the pipes "|" in between do?

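    See the answer to 1. These pipes sit inside the single-quoted pattern, so the shell passes them to grep untouched, and each \| separates one alternative from the next. Only the | characters outside the quotes are shell pipes.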

  3. Why is "content\|" in double quotes?

    It is not. The quote is part of the pattern being searched for. The output of the curl command you give contains these lines:

    </title><meta http-equiv="Content-Language" content="en-us" /> 
    [...]  
    <div class="content">
    

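    The quote (not quotes; the second " belongs to the next pattern, "remark) before the word "content" is there to make grep print only the second of the lines above. It is part of the actual search pattern: "content. The first line does contain the word content, but there it follows a space, and "Content-Language starts with a capital C, which grep's case-sensitive matching does not accept.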
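
    If you want to try both points yourself, here is a minimal sketch, assuming GNU grep (other implementations may not accept \| in basic regular expressions). The foo/bar/baz input and the third sample line are made up for illustration; the first two sample lines are the ones quoted above.

```shell
# Sample input: the two lines from the curl output quoted above,
# plus one hypothetical line that should never match.
sample='</title><meta http-equiv="Content-Language" content="en-us" />
<div class="content">
<p>no match here</p>'

# \| is alternation in GNU grep's basic regex syntax: lines with "foo" OR "bar".
bre=$(printf 'foo\nbar\nbaz\n' | grep 'foo\|bar')

# With -E (extended regex) the same alternation is written as a bare |.
ere=$(printf 'foo\nbar\nbaz\n' | grep -E 'foo|bar')

# The leading double quote is a literal character in the pattern,
# so only the line containing "content (quote included) is printed.
with_quote=$(printf '%s\n' "$sample" | grep '"content')

# Without the quote, the meta line matches as well; -c counts matching lines.
without_quote=$(printf '%s\n' "$sample" | grep -c 'content')

echo "BRE: $bre"
echo "ERE: $ere"
echo "with quote: $with_quote"
echo "lines matching without quote: $without_quote"
```

    In both cases the pattern is wrapped in single quotes, so the shell hands the backslash, the pipes and the double quote to grep unchanged.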

terdon

Posted 2012-12-07T03:12:33.347

Reputation: 45 216

Thank you! That cleared it up. Also, if anyone else needs to know more about this, I found a good grep resource here: http://www.cyberciti.biz/faq/howto-use-grep-command-in-linux-unix/

– Sid – 2012-12-08T09:46:10.987