How to grep only selected words from a file?

1

I need to get a list of all the email addresses from a file.

I was trying to use

grep @ filename

but that returns the entire line.

Is there any way I can get it to return only the email address and not the whole line?

null_radix

Posted 2010-01-21T21:18:53.967

Reputation: 11

Answers

0

@rbright This solution seems to work. The "-o" option is what I was missing! But I ended up just writing a PHP script before I got your response.

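<?php
// Print every space-separated token that contains an "@".
$handle = fopen("MYFILE.txt", "r");

if ($handle) {
    // fgets() returns false at end of file, which ends the loop.
    while (($buffer = fgets($handle)) !== false) {
        // trim() drops the trailing newline before splitting on spaces.
        $pieces = explode(" ", trim($buffer));
        foreach ($pieces as $piece) {
            // The closing parenthesis of strstr() belongs right after '@'.
            if (strstr($piece, '@') !== false) {
                echo $piece, "\n";
            }
        }
    }
    fclose($handle);
}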

This is quick and dirty, but it will hold up until there is an @ that is not part of an email address, which is very unlikely in my situation.

null_radix

Posted 2010-01-21T21:18:53.967

Reputation: 11

1

That's going to depend on the format of the file. For example, let's say the file has

email@example.com stuff you don't want
email2@example.com more stuff you don't want
email3@example.com and more

then awk '/@/{print $1}' would seem to be the obvious answer.

Post an example of the file format if that's not what you've got.
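If the addresses are not always in the first column, a minimal sketch along the same lines is to loop over every whitespace-separated field and print the ones containing an @. The file name mixed.txt, its contents, and the function name fields_with_at are made up for illustration. Note that this prints the whole field, so punctuation attached to an address would come along with it.

```shell
# Hypothetical sample file with addresses scattered through the lines.
cat > mixed.txt <<'EOF'
order 17 shipped to dave@example.com yesterday
erin@example.org and frank@example.net replied
nothing to see on this line
EOF

# Print each whitespace-separated field that contains an "@".
# NF is the number of fields on the current line; $i is the i-th field.
fields_with_at() {
    awk '{ for (i = 1; i <= NF; i++) if ($i ~ /@/) print $i }' "$1"
}

fields_with_at mixed.txt
# dave@example.com
# erin@example.org
# frank@example.net
```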

gorilla

Posted 2010-01-21T21:18:53.967

Reputation: 2 204

The file has email addresses occurring semi-randomly throughout the file. – null_radix – 2010-01-25T15:35:32.947

1

Note that properly matching any valid email address is deep magic, so if you really want to catch everything with no false positives or negatives, you should use a regex someone else has written. But if you're just looking for a quick grep that is Good Enough, check out the -o option, which shows only the matching text.

$ grep -Po '\S+@\S+\.\w+' yourfile.txt

That will catch some simple email addresses, along with some things that are not valid email addresses (like "@@@@.a"). Adjust your regex as appropriate. E.g., this one is more restrictive:

$ grep -Po '[\w+.]+@[\w.]+\.\w+' yourfile.txt
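As a rough end-to-end sketch of the -o approach, the following builds a small sample file and prints each distinct address once. It uses -E with POSIX character classes instead of -P, on the assumption that your grep may not support -P. The file name sample.txt, its contents, the function name extract_emails, and the pattern itself are illustrative choices, and the pattern is still only a Good Enough approximation rather than a full address validator.

```shell
# Hypothetical sample file, for illustration only.
cat > sample.txt <<'EOF'
Contact alice@example.com or bob.smith@mail.example.org today.
No address on this line.
Reply to alice@example.com; cc carol+news@example.net
EOF

# -o prints only the matched text, one match per output line.
# -E enables extended regular expressions.
# sort -u removes duplicate addresses.
extract_emails() {
    grep -oE '[[:alnum:]._%+-]+@[[:alnum:].-]+\.[[:alpha:]]{2,}' "$1" | sort -u
}

extract_emails sample.txt
# alice@example.com
# bob.smith@mail.example.org
# carol+news@example.net
```

Piping through sort -u is optional; drop it if you want the addresses in file order with repeats preserved.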

Ryan Bright

Posted 2010-01-21T21:18:53.967

Reputation: 641

Regexes really aren't the right tool for handling email addresses. I'm not aware of any which will correctly handle every potential variation on the full address. For BNF specifications, you really should be using a parser. – gorilla – 2010-01-22T01:31:57.037