Using regular expressions in Linux with grep

I can't get this simple regular expression to work for matching emails:

'\w*(?:\.\w*)*@\w*(?:\.\w*)*\w\{2,5\}'

It should work: I tested it with RegexPal and it works just fine there. I think there's a problem with the optional character class, but I'm not sure.

brgs

Posted 2011-08-03T11:32:32.003

Reputation: 123

What is it intended to do? – Mechanical snail – 2011-08-03T11:34:18.430

To match the email. – brgs – 2011-08-03T11:35:54.327

Then it's wrong. A lot more than \w (word) characters are permitted in an e-mail address. See http://en.wikipedia.org/wiki/Email_address#Syntax.

– Mechanical snail – 2011-08-03T11:38:04.933

The actual specification is http://tools.ietf.org/html/rfc5322.

– Mechanical snail – 2011-08-03T11:39:04.917

Yeah, I know that, but I'm testing it on a certain email - email.something@blah.mailas.com – brgs – 2011-08-03T11:39:25.867

Thanks, but I won't be using this expression for anything, only for testing. I just want to know WHY it doesn't work. Should I use egrep or something? – brgs – 2011-08-03T11:46:19.073

Egrep with expression '\w(.(\w))@\w(.(\w)).\w{2,5}' worked, I guess there's no optional character class when using (e)grep? – brgs – 2011-08-03T12:07:54.080

Answers

You should use grep with Perl-compatible regular expressions (the -P option), which support non-capturing groups like (?: ). Also, the curly braces should not be escaped in that syntax.

Try:

grep -P '\w*(?:\.\w*)*@\w*(?:\.\w*)*\w{2,5}'

Since Perl regular expressions are an experimental feature in GNU grep, you may want to change (?: ) to ( ) and use extended regular expressions (-E):

grep -E '\w*(\.\w*)*@\w*(\.\w*)*\w{2,5}'

Some extended regular expression implementations do not support the curly braces { and }. For portability, you can use basic regular expressions instead.

To use basic regular expressions, escape ( and ) and keep { and } escaped as well:

grep '\w*\(\.\w*\)*@\w*\(\.\w*\)*\w\{2,5\}'
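To see the difference between the syntaxes, here is a small sketch that runs the patterns above against the sample address from the question's comments. The variable names (addr, bre, ere, pcre, orig) are made up for this illustration, and it assumes GNU grep; the -P line is guarded because some builds lack PCRE support.

```shell
# Sample address taken from the question's comments.
addr='email.something@blah.mailas.com'

# Basic regular expressions (the default): groups and intervals are backslash-escaped.
bre=$(printf '%s\n' "$addr" | grep -c '\w*\(\.\w*\)*@\w*\(\.\w*\)*\w\{2,5\}' || true)

# Extended regular expressions (-E): groups and intervals are written bare.
ere=$(printf '%s\n' "$addr" | grep -c -E '\w*(\.\w*)*@\w*(\.\w*)*\w{2,5}' || true)

# Perl-compatible expressions (-P): (?: ) is understood here.
# Assumption: this grep was built with PCRE support; otherwise pcre stays empty.
pcre=$(printf '%s\n' "$addr" | grep -c -P '\w*(?:\.\w*)*@\w*(?:\.\w*)*\w{2,5}' 2>/dev/null || true)

# The question's original pattern under the default BRE syntax:
# "(", "?", ":" and ")" are ordinary characters there, so grep looks for a literal "(?:".
orig=$(printf '%s\n' "$addr" | grep -c '\w*(?:\.\w*)*@\w*(?:\.\w*)*\w\{2,5\}' || true)

echo "BRE=$bre ERE=$ere PCRE=$pcre original=$orig"
```

With GNU grep the BRE and ERE counts should be 1 and the count for the original pattern 0, which is why the expression from the question appeared not to work.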

Paweł Nadolski

Posted 2011-08-03T11:32:32.003

Reputation: 966