Grep and regular expressions (consecutive letters)

1

I'm trying to figure out how to match consecutive letters in grep. For example, if I wanted to list all words with two consecutive a's OR i's OR u's in one line, how would I do this?

From my understanding, the command would look something like: egrep [a]{2} | egrep [i]{2} | egrep [u]{2}

But what if the word contains both aa and ii, or any other combination of the three doubled letters?

에이바

Posted 2010-09-08T16:24:09.227

Reputation: 1 266

Grep is not egrep. Please be more specific in the title. – reinierpost – 2010-09-08T17:01:56.433

Well, it kind of is: they're the same program under the hood, but when invoked as egrep it just adds the -E option. That does change the regular expression syntax a little, but it doesn't make a substantial difference to the question. – David Z – 2010-09-08T17:17:27.793

Answers

4

What you've got in your question is actually three separate commands chained in a pipeline: the first egrep finds every line that contains aa and passes those results to the next egrep, which keeps only the lines that also contain ii, and passes those to the last egrep, which keeps only the lines that also contain uu. So you wind up with only those lines that contain all three of the combinations aa, ii, and uu.

You'd need to use just one egrep command, using a regular expression with alternation:

egrep 'aa|ii|uu'

This will match all lines that contain any of aa, ii, or uu.
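To see the difference between the two approaches, here is a minimal sketch run against a made-up word list (the four sample words and the variable names are assumptions for illustration, not from the question). It uses grep -E, which the comments above note is what egrep does under the hood.

```shell
# Hypothetical sample words, one per line.
words=$(printf '%s\n' bazaar skiing vacuum banana)

# Chained greps act as AND: a line must contain aa, ii, and uu at once.
# "|| true" keeps the script going when grep finds nothing (exit status 1).
chained=$(printf '%s\n' "$words" | grep -E 'a{2}' | grep -E 'i{2}' | grep -E 'u{2}' || true)

# Alternation acts as OR: a line needs only one of aa, ii, or uu.
either=$(printf '%s\n' "$words" | grep -E 'aa|ii|uu' || true)

printf 'chained: [%s]\n' "$chained"
printf 'alternation:\n%s\n' "$either"
```

With this word list the chained version prints nothing between the brackets, because no single word contains all three pairs, while the alternation lists bazaar, skiing, and vacuum.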

David Z

Posted 2010-09-08T16:24:09.227

Reputation: 5 688

1

You can't. Regex only works with regular expressions. You will notice that regular expressions do not have any sort of memory (thus you can't say (.*)*2 or something like that). You're looking for a finite automaton or a Turing machine.

Daisetsu

Posted 2010-09-08T16:24:09.227

Reputation: 5 195

Hm. Well, I know that if I write a command like egrep '(.)\1(.)\2(.)\3' /usr/share/dict/words, it shows words with three doubled letters in a row, but I'm not able to pick specific letters. I want to find a word that has aa, or a word that has ii or uu in it, but not necessarily all three at the same time. I don't think this is an impossible task. Could you elaborate a bit more? – 에이바 – 2010-09-08T16:31:53.300

Just combine the two ideas: egrep '([aiu])\1' – reinierpost – 2010-09-08T17:01:35.050

Side note: technically, when you use backreferences like \1, what you have isn't quite a regular expression. (An "irregular expression", I suppose :-P) – David Z – 2010-09-08T17:01:35.203
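The two backreference patterns from these comments can be tried on a small made-up word list. This is only a sketch: the sample words and variable names are assumptions, and it assumes a grep (such as GNU grep) that accepts backreferences with -E, which POSIX does not guarantee for extended regular expressions.

```shell
# Hypothetical sample words, one per line.
words=$(printf '%s\n' bazaar skiing vacuum bookkeeper banana)

# ([aiu])\1 : capture one of a, i, or u, then require that same letter again.
# "|| true" keeps the script going when grep finds nothing (exit status 1).
doubled=$(printf '%s\n' "$words" | grep -E '([aiu])\1' || true)

# (.)\1(.)\2(.)\3 : three doubled letters in a row, whatever the letters are.
triple=$(printf '%s\n' "$words" | grep -E '(.)\1(.)\2(.)\3' || true)

printf 'doubled a/i/u:\n%s\n' "$doubled"
printf 'three doubles in a row:\n%s\n' "$triple"
```

On this list the first pattern picks out bazaar, skiing, and vacuum, the same lines that 'aa|ii|uu' would match, and the second picks out only bookkeeper (oo, kk, ee).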