36

I'm running this command in a bash shell on Ubuntu 12.04.1 LTS. I'm attempting to remove both the [ and ] characters in one fell swoop, i.e. without having to pipe to sed a second time.

I know square brackets have special meaning in a regex so I'm escaping them by prepending with a backslash. The result I was expecting is just the string 123 but the square brackets remain and I'd love to know why!

~$ echo '[123]' | sed 's/[\[\]]//'
[123]
Xhantar
  • What I'm trying to ultimately achieve is to assign whatever's between the square brackets to a bash variable for use elsewhere in my bash script, so if there's a better way to achieve that (by using awk, maybe?), please let me know. – Xhantar Jan 11 '13 at 10:40
  • 2
    Just adding as a comment: You can use bash's PE feature as in: `str='[123]'; str1=${str/\[/}; str2=${str1/\]}; echo $str2` – Valentin Bajrami Jan 11 '13 at 11:08
  • 1
    @val0x00ff - Pure bash substitution.. thanks! :) Learned something new. – Xhantar Jan 11 '13 at 12:13
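A variation on the parameter-expansion idea from the comment above, as a minimal sketch: it assumes the value always starts with one [ and ends with one ], and the variable names `str` and `inner` are only illustrative. The `#` and `%` forms strip a prefix and a suffix, and they are not bash-specific, so they should also work in a plain POSIX sh.

```shell
str='[123]'

inner=${str#\[}      # drop one leading "[" if present
inner=${inner%\]}    # drop one trailing "]" if present

echo "$inner"        # prints 123
```

No external process is started, so the result is already in a variable for use later in the script.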

5 Answers

51

This is easy if you follow the manual carefully: all members inside a character class lose their special meaning (with a few exceptions). That includes the backslash, so \[ and \] do not escape anything inside a class. And ] loses its special meaning if it is placed first in the list. Try:

$ echo '[123]' | sed 's/[][]//g'
123
$

This says:

  1. match any one of the characters listed inside the outer [ ] brackets, namely:
    • ] and
    • [
  2. replace each match with the empty string, hence the empty replacement between the last two slashes (//),
  3. do this for every match on the line (globally), hence the final g.

Again, ] must be first in the class whenever it is included (or immediately after the ^ in a negated class).
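To see why the pattern from the question leaves the input alone, here is a small sketch. My reading of the bracket-expression rules is that the class in `[\[\]]` ends at the first ], so it contains only \ and [, and the final ] is a literal that has to follow the matched character. The variable names are made up for illustration.

```shell
# The class in the original pattern ends at the FIRST "]": it holds only
# "\" and "[", and the last "]" is a literal that must follow the match.
unchanged=$(printf '%s\n' '[123]' | sed 's/[\[\]]//')
echo "$unchanged"    # prints [123]: no "[" or "\" is directly followed by "]"

# When "[" is immediately followed by "]", the same pattern does match.
emptied=$(printf '%s\n' '[]' | sed 's/[\[\]]//')
echo "$emptied"      # prints an empty line

# "]" first in the class, no backslashes, and g to remove both brackets.
fixed=$(printf '%s\n' '[123]' | sed 's/[][]//g')
echo "$fixed"        # prints 123
```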

Saparagus
  • Live example: https://github.com/PHPExpertsInc/DockerUpgrade/blob/master/docker-images-update#L18 – hopeseekr Sep 19 '20 at 17:33
  • Is this behavior a characteristic of regex in general, or just sed? – user1079505 Apr 20 '21 at 21:55
  • 2
    @user1079505, unfortunately there is, strictly speaking, no such thing as "regex in general". Regex syntax has subtle variations in different environments (linux emacs, linux/unix utilities, shell, python, mongo, etc...). However, where character classes are allowed in regex, this is the typical behavior. – Saparagus Apr 21 '21 at 09:37
13

I'm not sure why that doesn't work but this does:

echo '[123]' | sed 's/\(\[\|\]\)//g'

or this:

echo '[123]' | sed -r 's/(\[|\])//g'

You can also try a different approach and match the string inside the brackets (assuming the string can be matched easily and is not defined by the brackets):

echo '[123]' | egrep -o "[0-9]+"

I'm having the same troubles with your original regex using grep so I suspect this is not just a sed thing.

Weirdly, these produce different results but one of them matches what you want:

echo '[123]' | egrep -o '[^][]+'
123

echo '[123]' | egrep -o '[^[]]+'
3]

Applying this to your original sed (and adding the /g modifier so it removes both brackets):

echo '[123]' | sed 's/[][]//g'
123
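The two different egrep results above seem to come down to where the class ends. A hedged sketch, using `grep -E` (egrep is the legacy name for it) and capturing the output in illustrative variables:

```shell
# "]" directly after "^": the negated class excludes BOTH brackets,
# so the match is the run of characters between them.
both=$(printf '%s\n' '[123]' | grep -oE '[^][]+')
echo "$both"         # prints 123

# Here the class is just "[^[]" (anything except "["); the "]+" after it
# is a literal "]" repeated, so the match is "3" followed by "]".
only_open=$(printf '%s\n' '[123]' | grep -oE '[^[]]+')
echo "$only_open"    # prints 3]
```

Note that `-o` is an extension found in GNU and BSD grep, not something POSIX requires.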
Ladadadada
  • Your 3rd approach (egrep -o...) looks like the cleanest solution to my problem. I'll only ever have integers in between the square brackets (and sorry, I should have mentioned that in my question) so I shouldn't run into any oddities I think. Thanks! – Xhantar Jan 11 '13 at 12:20
  • 5
    You can also use `tr`: `echo '[123]' | tr -d '[]'` - avoids regexp confusions about escaping. – James O'Gorman Jan 11 '13 at 12:53
  • @James O'Gorman - Interesting. For some reason I thought that `tr` can only translate one character max at a time, but I was wrong. Thanks! – Xhantar Jan 11 '13 at 13:41
3

To remove everything before and after the brackets:

$ echo '[123]' | sed 's/.*\[//;s/\].*//;'
123

If your data always starts and ends with a square bracket, you can simply delete the first and last character:

$ echo '[123]' | sed 's/.//;s/.$//;'
123
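If you would rather use a single substitution, one option is to capture what sits between the brackets and keep only that. This is a sketch that assumes the whole line is one bracketed value, as above; `inner` is just an illustrative variable name.

```shell
# ^\[      a literal "[" at the start of the line
# \(.*\)   capture everything in between as group 1
# ]$       a literal "]" at the end of the line
inner=$(printf '%s\n' '[123]' | sed 's/^\[\(.*\)]$/\1/')
echo "$inner"    # prints 123
```

The command substitution also takes care of assigning the result to a shell variable.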
Guru
  • The data I'm working with will always start and end with a square bracket yes. I'd still like to know why my solution wasn't working though. Any ideas? And is there a way to do this without specifying 2x regex's? – Xhantar Jan 11 '13 at 10:49
  • 1
    @Guru this solution worked for me. As for Xhantar: this is a really late reply, but from what I can see in your code and the Bash Beginners guide at tldp.org, you were trying to do multiple search and replace, one for '[' and the other for ']', which won't work. To separate two different search-and-replace commands, use ";" or the -e option: 's///g ; s///g' OR sed -e 's///g' -e 's///g' – ArunMKumar Sep 01 '16 at 22:11
1

You can escape the opening bracket using \[. For the closing bracket, use []].

user2428118
0

If you have a more complex string like 'abcdef[123]ghijk' you can also use the standard 'cut' utility (an external command, not a bash builtin) to extract only the text between the square brackets:

$ echo 'abcdef[123]ghijk' | cut -d '[' -f 2 | cut -d ']' -f 1
123
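Since the question mentions awk, here is a comparable sketch with a single process instead of two cuts. It is a different technique from the cut pipeline above: the field separator is the class `[][]` (either bracket), so the text between the brackets becomes field 2. It assumes exactly one bracketed value per line, and `inner` is only an illustrative name.

```shell
# Split on "[" or "]"; fields are: abcdef | 123 | ghijk
inner=$(printf '%s\n' 'abcdef[123]ghijk' | awk -F'[][]' '{ print $2 }')
echo "$inner"    # prints 123
```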
valentt