use of alternation "|" in sed's regex

80

12

I am using GNU sed version 4.2.1 and want to use the alternation symbol "|" in a subexpression. For example:

echo "blia blib bou blf" | sed 's/bl\(ia|f\)//g'

should return

" blib bou "

but it returns

"blia blib bou blf".

How can I get the expected result?

Cedric

Posted 2010-02-22T14:31:27.780

Reputation: 949

Answers

109

The "|" also needs a backslash to get its special meaning.

echo "blia blib bou blf" | sed 's/bl\(ia\|f\)//g'

will do what you want.

As you know, if all else fails, read the manual :-).

GNU sed user's manual, section 3.3 Overview of Regular Expression Syntax:

`REGEXP1\|REGEXP2'

Matches either REGEXP1 or REGEXP2.

Note the backslash...

Unfortunately, regex syntax is not really standardized... there are many variants, which differ among other things in which "special characters" need \ and which do not. In some it's even configurable or depends on switches (as in GNU grep, which you can switch between three different regex dialects).
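To make the dialect difference concrete, here is a small sketch using GNU grep. It assumes a GNU grep that accepts -G (basic) and -E (extended). The third dialect, -P (Perl-compatible), spells alternation the same way as -E but is not compiled into every build, so it is left out. The sample words are made up for illustration.

```shell
# Sample input, one word per line (hypothetical data for illustration).
words=$(printf '%s\n' blia blib bou blf)

# Basic regex (-G, the default): alternation needs a backslash (GNU extension).
basic=$(printf '%s\n' "$words" | grep -G 'bl\(ia\|f\)')

# Extended regex (-E): parentheses and | are special without backslashes.
extended=$(printf '%s\n' "$words" | grep -E 'bl(ia|f)')

printf 'basic:\n%s\nextended:\n%s\n' "$basic" "$extended"
```

Both variables should end up holding the same two lines, blia and blf; only the spelling of the pattern differs.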

This answer in particular is for GNU sed. There are other sed variants, for example the one used in the BSDs, which behave differently.
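If the script has to run on a sed whose dialect you cannot be sure of, one possible workaround is to avoid alternation entirely and give one substitution per alternative. This is only a sketch; the function name strip_portable is made up for the example.

```shell
# Portable fallback: one substitution per alternative, no alternation operator.
strip_portable() {
  printf '%s\n' "$1" | sed -e 's/blia//g' -e 's/blf//g'
}

result=$(strip_portable "blia blib bou blf")
printf '[%s]\n' "$result"   # brackets make the surrounding spaces visible
```

Note that the substitutions run one after another, so this is not strictly equivalent to a single alternation in every case: the first substitution could, in principle, create text that the second one then matches.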

sleske

Posted 2010-02-22T14:31:27.780

Reputation: 19 887

8 The standard BSD/OS X version of sed does support alternation, but only with "extended" regex syntax (-E) - which means no backslashes on either the pipes or the parentheses: echo "blia blib bou blf" | sed -E 's/bl(ia|f)//g' – Mark Reed – 2014-09-30T17:42:47.027

2 I edited my answer to note that it's for GNU sed only. – sleske – 2015-07-14T10:57:40.120

36 For anyone else confused by this answer: \| only works in GNU sed (gsed on OS X), not vanilla sed (sed on OS X). – Andrew Hancox – 2012-04-04T14:54:43.047

@AndrewHancox Thank you so much! I was about to rip all of the hair out of my head (and so far I'm doing pretty good compared to my manager on the hair-front) - I know RegEx well enough to try both | and \|, but I never thought about the fact that OS X might actually use a non-GNU sed. – phatskat – 2013-02-01T20:03:51.173

23

Since there are several comments regarding non-GNU sed implementations: at least on OS X, you can use the -E argument to sed:

Interpret regular expressions as extended (modern) regular expressions rather than basic regular expressions (BRE's). The re_format(7) manual page fully describes both formats.

Then you can use regular expression metacharacters without escaping them. Example:

$ echo "blia blib bou blf" | sed -E 's/bl(ia|f)//g'
 blib bou 

Daniel Beck

Posted 2010-02-22T14:31:27.780

Reputation: 98 421

12

GNU sed also supports the -r option (extended regular expressions). This means you don't have to escape the metacharacters:

echo foohello barhello | sed -re "s/(foo|bar)hello/hi/g"

Output:

hi hi
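Newer GNU sed releases also accept -E as a synonym for -r, which is the spelling BSD sed uses. As a sketch, assuming the local sed accepts at least one of the two flags, a script can probe for -E and fall back to -r; the variable name ere_flag is made up for the example.

```shell
# Pick an extended-regex flag that the local sed understands.
# Assumption: the sed in use accepts at least one of -E or -r.
if printf 'x\n' | sed -E 's/(x|y)/z/' >/dev/null 2>&1; then
  ere_flag=-E
else
  ere_flag=-r
fi

out=$(echo "foohello barhello" | sed "$ere_flag" 's/(foo|bar)hello/hi/g')
echo "$out"
```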

jco

Posted 2010-02-22T14:31:27.780

Reputation: 223

Yes, -r option is really really helpful for the readability of the expressions. That should be the accepted answer. – рüффп – 2016-10-05T09:01:42.420

9

The \| does not work with sed on Solaris 10 either. What I did was use

perl -p -e 's/bl(ia|f)//g'

Joe Tennies

Posted 2010-02-22T14:31:27.780

Reputation: 91

2 +1 for portability since, if a system has perl, it will always use this syntax, unlike sed. – evilsoup – 2013-05-30T01:30:35.177

4

Followup: sed -E allows it on MacOS. No backslash needed for |.

 sed -E 's/this|orthat/oooo/g' infile

some ideas

Posted 2010-02-22T14:31:27.780

Reputation: 840

1

With the GnuWin32 port of sed on Windows, the syntax is sed "s/thing1\|thing2/ /g" source > destination.

The quotes must be double quotes ("); this is required for the command to be parsed.

twobob

Posted 2010-02-22T14:31:27.780

Reputation: 361