Substitution in text file **without** regular expressions

74

12

I need to substitute some text inside a text file with a replacement. Usually I would do something like

sed -i 's/text/replacement/g' path/to/the/file

The problem is that both text and replacement are complex strings containing dashes, slashes, backslashes, quotes and so on. If I escape all the necessary characters inside text, the command quickly becomes unreadable. On the other hand, I do not need the power of regular expressions: I just need to substitute the text literally.

Is there a way to do text substitution without using regular expressions with some bash command?

It would be rather trivial to write a script that does this, but I figure there should exist something already.

Andrea

Posted 2012-05-09T15:00:43.013

Reputation: 945

related: https://stackoverflow.com/questions/29613304/is-it-possible-to-escape-regex-metacharacters-reliably-with-sed

– Ciro Santilli 新疆改造中心法轮功六四事件 – 2018-03-28T13:07:24.997

Related. tl;dr: use tr, or sed with y/. Take a look at sponge. – Pablo A – 2020-02-11T15:44:31.227

Necessary to do it through bash? A simplistic solution would be to open in Word and do a find and replace all – Akash – 2012-05-09T15:04:04.977

@akash Because systems that have bash always ship with Microsoft Word? ;) No.. Just kidding. The OP might want to do this on a remote machine or for a batch of files though. – slhck – 2012-05-09T15:07:03.890

@slhck :) Well, I guess gedit should have a similar option – Akash – 2012-05-09T15:09:33.487

An option would be to somehow correctly escape everything before passing it to sed, which is probably a futile effort considering all the switches and platform differences. – l0b0 – 2012-05-09T15:11:48.177

Answers

7

When you don't need the power of regular expressions, don't use it. That is fine.
But, this is not really a regular expression.

sed 's|literal_pattern|replacement_string|g'

So, if / is your problem, use | and you don't need to escape the former.

PS: About the comments, also see this Stack Overflow answer on "Escape a string for sed search pattern".


Update: If you are fine with using Perl, try it with \Q and \E like this:

 perl -pe 's|\Qliteral_pattern\E|replacement_string|g'

@RedGrittyBrick has also suggested a similar trick with stronger Perl syntax in a comment here or here

nik

Posted 2012-05-09T15:00:43.013

Reputation: 50 788

This answer should not be accepted – Steven Lu – 2016-08-04T18:01:41.327

This misses the point entirely. The text to be matched could contain any weirdness. In my case it's a random password. You know how those go. – Christian Bongiorno – 2018-06-11T18:01:30.240

Use str =~ s/\Q$replace_this\E/$with_this/; in case of variables – Jithin Pavithran – 2018-09-30T14:26:36.590

I agree with Steven Lu, because it is simply wrong. The first character after 's' in "s/aaa/bbb/" is just an arbitrary delimiter and can be anything, as long as it doesn't occur in aaa or bbb. These three give exactly the same result, and 'aaa' will always be interpreted as a regular expression by sed: 's/aaa/bbb/', 's|aaa|bbb|', and even 'scaaacbbbc'. https://www.gnu.org/software/sed/manual/sed.html#The-_0022s_0022-Command

– Luc VdV – 2020-02-04T11:59:42.510

Thank you, I did not know about the difference between / and | – Andrea – 2012-05-09T15:18:16.393

I'm not sure this answer is useful... The only difference between s||| and s/// is that the separator character is different, so that one character doesn't need escaping. You could equally do s###. The real issue here is that the OP doesn't want to have to worry about escaping the contents of literal_pattern (which is not literal at all and will be interpreted as a regex). – Benj – 2012-05-09T15:31:43.667

This will not avoid the interpretation of other special characters. If you search for 1234.*aaa with your solution, it matches much more than the intended literal text, which as a regex would have to be written 1234\.\*aaa. – Matteo – 2012-05-09T15:47:02.853
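To make the point of the comments above concrete, here is a minimal sketch in Python (not part of the original answer). The sample line and the replacement "X" are made up for illustration; the search string is the one from Matteo's comment. It compares treating the search string as a regex with treating it as literal text:

```python
import re

# Hypothetical sample line, chosen so that both interpretations find a match.
line = "id=1234.*aaa and id=1234-zzz-aaa"
needle = "1234.*aaa"

# Treated as a regex, '.*' is a greedy wildcard, so the match runs from the
# first '1234' to the last 'aaa' on the line.
as_regex = re.sub(needle, "X", line)

# Treated literally, only the exact character sequence is replaced.
as_literal = line.replace(needle, "X")

print(as_regex)    # id=X
print(as_literal)  # id=X and id=1234-zzz-aaa
```

Changing the delimiter from / to | does nothing about this: the pattern is still a regex either way.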

14

export FIND='find this'
export REPLACE='replace with this'
ruby -p -i -e "gsub(ENV['FIND'], ENV['REPLACE'])" path/to/file

This is the only 100% safe solution here, because:

  • It's a static substitution, not a regexp, no need to escape anything (thus, superior to using sed)
  • It won't break if your string contains } char (thus, superior to a submitted Perl solution)
  • It won't break with any character, because ENV['FIND'] is used, not $FIND. With $FIND or your text inlined in Ruby code, you could hit a syntax error if your string contained an unescaped '.
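For machines without Ruby, the same idea (take the strings from the environment and do a plain substring replacement) can be sketched in Python. This is only an illustration of the approach described above, not something from the original answer; the function name replace_in_file and the script name literal_replace.py are made up, and it assumes the file is UTF-8 text that fits in memory.

```python
import os
import sys


def replace_in_file(path, find, replace):
    """Replace every literal occurrence of `find` in the file at `path`."""
    if not find:
        raise ValueError("search string must not be empty")
    # newline="" keeps the file's original line endings untouched.
    with open(path, "r", encoding="utf-8", newline="") as fh:
        content = fh.read()
    # str.replace does plain substring replacement: no regex, no backreferences.
    with open(path, "w", encoding="utf-8", newline="") as fh:
        fh.write(content.replace(find, replace))


if __name__ == "__main__" and len(sys.argv) == 2:
    # FIND and REPLACE come from the environment, so they are never parsed
    # as part of the Python source code.
    replace_in_file(sys.argv[1], os.environ["FIND"], os.environ["REPLACE"])
```

Assuming the script is saved as literal_replace.py, it would be called like the Ruby version: FIND='find this' REPLACE='replace with this' python literal_replace.py path/to/file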

Nowaker

Posted 2012-05-09T15:00:43.013

Reputation: 1 348

I had to use export FIND='find this'; export REPLACE='replace with this'; in my bash script so that ENV['FIND'] and ENV['REPLACE'] had the expected values. I was replacing some really long encrypted strings in a file. This was just the ticket. – dmmfll – 2016-06-24T12:45:48.653

This is a good answer because it's reliable and ruby is ubiquitous. Based on this answer I now use this shell script.

– loevborg – 2018-07-09T08:49:45.880

Unfortunately doesn't work when FIND contains multiple lines. – adrelanos – 2019-01-26T17:09:51.127

There's nothing that would prevent it from working with multiple lines in FIND. Use double quoted \n. – Nowaker – 2019-01-27T19:20:35.327

You don't need export: you can pass variables to a child process in sh by declaring them before the command on the same line, e.g. FIND='find this' REPLACE='replace with this' ruby ...., with no newlines. Especially handy if you were never working with variables in the first place: FIND=`echo command` REPLACE=`cat replace.txt` ruby .... Use \ (a line continuation) if you still want newlines in the actual script anyway ;) – Hashbrown – 2020-01-23T00:05:25.020

8

The replace command will do this.

https://linux.die.net/man/1/replace

Change in place:

replace text replacement -- path/to/the/file

To stdout:

replace text replacement < path/to/the/file

Example:

$ replace '.*' '[^a-z ]{1,3}' <<EOF
> r1: /.*/g
> r2: /.*/gi
> EOF
r1: /[^a-z ]{1,3}/g
r2: /[^a-z ]{1,3}/gi

The replace command comes with MySQL or MariaDB.

Derek Veit

Posted 2012-05-09T15:00:43.013

Reputation: 181

Take into account that replace is deprecated and may not be available in the future – Rogelio – 2017-12-11T15:31:30.407

Why on earth does such a basic command come with a database? – masterxilo – 2018-08-08T07:03:16.270

@masterxilo A better question might be – why does such a basic command not come with modern operating systems? ;-) – Mark Thomson – 2019-01-22T04:06:35.440

3

You could also use perl's \Q mechanism to "quote (disable) pattern metacharacters"

perl -pe 'BEGIN {$text = q{your */text/?goes"here"}} s/\Q$text\E/replacement/g'

glenn jackman

Posted 2012-05-09T15:00:43.013

Reputation: 18 546

Or perl -pe 's(\Qyour */text/?goes"here")(replacement)' file – RedGrittyBrick – 2012-05-09T22:52:54.227

3

Check out my Perl script. It does exactly what you need, without implicit or explicit use of regular expressions:

https://github.com/Samer-Al-iraqi/Linux-str_replace

str_replace Search Replace File # replace in File in place

STDIN | str_replace Search Replace # to STDOUT

Very handy, right? I had to learn Perl to write it, because I really needed it.

Samer Ata

Posted 2012-05-09T15:00:43.013

Reputation: 131

2

You can do it by escaping your patterns. Like this:

keyword_raw='1/2/3'
keyword_regexp="$(printf '%s' "$keyword_raw" | sed -e 's/[]\/$*.^|[]/\\&/g')"
# keyword_regexp is now '1\/2\/3'

replacement_raw='2/3/4'
replacement_regexp="$(printf '%s' "$replacement_raw" | sed -e 's/[\/&]/\\&/g')"
# replacement_regexp is now '2\/3\/4'

echo 'a/b/c/1/2/3/d/e/f' | sed -e "s/$keyword_regexp/$replacement_regexp/"
# the last command will print 'a/b/c/2/3/4/d/e/f'

Credit for this solution goes here: https://stackoverflow.com/questions/407523/escape-a-string-for-a-sed-replace-pattern

Note1: this only works for non-empty keywords. Empty keywords are not accepted by sed (sed -e 's//replacement/').

Note2: unfortunately, I don't know a popular tool that would NOT use regexp-s to solve the problem. You can write such a tool in Rust or C, but it's not there by default.
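For comparison, the same escape-then-substitute idea can be sketched in Python, where the standard library does the escaping for you. This is an illustration only, not part of the original answer; the helper name literal_sub is made up. re.escape handles the pattern side, and passing a function as the replacement means the replacement text is inserted verbatim instead of having backslash sequences processed.

```python
import re


def literal_sub(text, keyword, replacement, count=0):
    """Replace `keyword` in `text` literally, going through the regex engine."""
    if not keyword:
        # Mirrors Note1 above: an empty keyword is rejected.
        raise ValueError("keyword must not be empty")
    pattern = re.escape(keyword)  # every regex metacharacter gets escaped
    # A callable replacement is used as-is; a plain string would have
    # sequences such as \1 or \g<0> expanded.
    return re.sub(pattern, lambda match: replacement, text, count=count)


print(literal_sub("a/b/c/1/2/3/d/e/f", "1/2/3", "2/3/4"))
# prints a/b/c/2/3/4/d/e/f
```

If you never need anything regex-like (anchors, a match limit in the middle of a larger pattern, and so on), plain str.replace gives the same result with less machinery.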

VasyaNovikov

Posted 2012-05-09T15:00:43.013

Reputation: 2 329

This completely misses the OP's point. Obviously you can escape the pattern, but for some patterns this is tedious. – icecreamsword – 2018-10-02T19:35:26.033

@icecreamsword did you read my answer below the first line? The script does escaping automatically. – VasyaNovikov – 2018-10-03T04:24:51.620

1

You can use php's str_replace:

php -R 'echo str_replace("\|!£$%&/()=?^\"'\''","replace",$argn),PHP_EOL;'<input.txt >output.txt

Note: You would still need to escape single quotes ' and double quotes ", though.

simlev

Posted 2012-05-09T15:00:43.013

Reputation: 3 184

1

I pieced together a few other answers and came up with this:

function unregex {
   # This is a function because dealing with quotes is a pain.
   # http://stackoverflow.com/a/2705678/120999
   sed -e 's/[]\/()$*.^|[]/\\&/g' <<< "$1"
}
function fsed {
   local find=$(unregex "$1")
   local replace=$(unregex "$2")
   shift 2
   # sed -i is only supported in GNU sed.
   #sed -i "s/$find/$replace/g" "$@"
   perl -p -i -e "s/$find/$replace/g" "$@"
}

Boycott SE for Monica Cellio

Posted 2012-05-09T15:00:43.013

Reputation: 678

Doesn't work with newlines. Also doesn't help to escape newlines with \n. Any solution? – adrelanos – 2019-03-29T08:44:50.747

0

Node.js equivalent of @Nowaker's answer:

export FNAME='moo.txt'
export FIND='search'
export REPLACE='rpl'
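# split/join replaces every occurrence literally; String.prototype.replace with a string pattern only replaces the first match and expands $-sequences in the replacement
node -e 'const fs=require("fs");fs.readFile(process.env.FNAME,"utf8",(err,data)=>{if(err!=null)throw err;fs.writeFile(process.env.FNAME,data.split(process.env.FIND).join(process.env.REPLACE),"utf8",e=>{if(e!=null)throw e;});});'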

A T

Posted 2012-05-09T15:00:43.013

Reputation: 641

0

Here's one more "almost" working way.

Use vi or vim.

Create a textfile with your substitution in it:

:%sno/my search string \\"-:#2;g('.j');\\">/my replacestring=\\"bac)(o:#46;\\">/
:x

then execute vi or vim from the commandline:

vi -S commandfile.txt path/to/the/file

:%sno is the vi command to do search and replace without magic.

/ is my chosen separator.

:x saves and exits vi.

You need to escape backslashes '\'. The forward slash '/' may be replaced with e.g. a question mark '?' or something else that does not occur in your search or replacement string; pipe '|' did not work for me, though.

ref: https://stackoverflow.com/questions/6254820/perform-a-non-regex-search-replace-in-vim https://vim.fandom.com/wiki/Search_without_need_to_escape_slash http://linuxcommand.org/lc3_man_pages/vim1.html

Samuel Åslund

Posted 2012-05-09T15:00:43.013

Reputation: 101

0

Using a Simple Python Script

Most systems have python ready to go these days. So here's a simple script that'll work for ya:

# replace.py
# USAGE: python replace.py bad-word good-word target-file.txt
#
import sys

search_term = sys.argv[1]
replace_term = sys.argv[2]
target_file = sys.argv[3]

with open(target_file, 'r') as file:
        content = file.read()

content = content.replace(search_term, replace_term)

with open(target_file, 'w') as file:
        file.write(content)

One Caveat: This works great if your good/bad words are already in system/environment variables. Just make sure you use double-quotes to wrap the variables when passing to the script.

For example:

python replace.py "$BAD_WORD" "$GOOD_WORD" target-file.txt

However, these will not work:

# This breaks on $ or " characters
BAD_WORD="your-arbitrary-string"

# This breaks on ' characters
BAD_WORD='your-arbitrary-string'

# This breaks on spaces plus a variety of characters
BAD_WORD=your-arbitrary-string

Handling Arbitrary Literal Characters

1. Write the Chars to Disk

If I need to provide an arbitrary literal value to a script (skipping any escaping), I generally write it to disk using this method:

head -c -1 << 'CRAZY_LONG_EOF_MARKER' | tee /path/to/file > /dev/null
arbitrary-one-line-string
CRAZY_LONG_EOF_MARKER

... where:

  • We're employing the Here Document mechanism to write literal text
  • We're using head to strip the trailing newline that Here Docs add (the negative count in head -c -1 is a GNU coreutils feature), and tee to write the result to the file
  • We're preventing evaluation of variables inside the Here Doc by quoting the EOF marker string

Here's a quick demo with tricky chars:

head -c -1 << 'CRAZY_LONG_EOF_MARKER' | tee /path/to/file > /dev/null
1"2<3>4&5'6$7 # 8
CRAZY_LONG_EOF_MARKER

2. Use Modified Python Script

Here's an updated script that reads from word files:

# replace.py
# USAGE: python replace.py bad-word.txt good-word.txt target-file.txt
#
import sys

search_term_file = sys.argv[1]
replace_term_file = sys.argv[2]
target_file = sys.argv[3]

print([search_term_file, replace_term_file, target_file])

with open(search_term_file, 'r') as file:
        search_term = file.read()
with open(replace_term_file, 'r') as file:
        replace_term = file.read()
with open(target_file, 'r') as file:
        content = file.read()

print([search_term, replace_term])
content = content.replace(search_term, replace_term)

with open(target_file, 'w') as file:
        file.write(content)

Ryan

Posted 2012-05-09T15:00:43.013

Reputation: 111

0

You can do this in sh without any script (though putting this "one-liner" into a script would be better) or non-standard external program (I reeeally liked @Nowaker's answer thanks to its safety against injection, but this old CentOS box I needed this on didn't have ruby!).

Without attempting to escape the string (and account for issues with doing it correctly syntactically, knowing all the special characters, et cetera), we can just blanket encode all the strings so that nothing has the possibility of being special.

cat path/to/the/file | xxd -p | tr -d '\n' \
| sed "s/$(printf '%s' 'text' | xxd -p | tr -d '\n')/$(printf '%s' 'replacement' | xxd -p | tr -d '\n')/g" \
| xxd -p -r

This was just to match the asker's example, other users can obviously replace 'text' with "$text" if using a variable, or cat path/to/the/file with printf '%s' "$input" if not using a file.
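One caveat with the hex approach, as far as I can tell: sed sees a continuous run of hex digits with no byte boundaries, so the encoded search string can also match starting in the middle of one byte and ending in the middle of the next. The Python sketch below (not part of the original answer; both function names are made up) mimics the pipeline on an in-memory string and compares it with a byte-aligned literal replacement:

```python
def replace_via_hex(data: bytes, find: bytes, replacement: bytes) -> bytes:
    """Mimic the xxd | sed | xxd -r pipeline: substitute on the hex digits."""
    hex_text = data.hex().replace(find.hex(), replacement.hex())
    return bytes.fromhex(hex_text)


def replace_bytes(data: bytes, find: bytes, replacement: bytes) -> bytes:
    """Byte-aligned literal replacement, with no encoding step at all."""
    return data.replace(find, replacement)


# "F1" encodes to the hex digits 4631, and "c" encodes to 63, which appears
# in 4631 straddling the two bytes even though the input contains no "c".
print(replace_via_hex(b"F1", b"c", b"d"))  # b'FA'
print(replace_bytes(b"F1", b"c", b"d"))    # b'F1'
```

Longer search strings make such accidental matches less likely, but they do not rule them out. A tool that works on whole bytes avoids the problem entirely.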

You can even replace the /g with / to make it replace-once, or otherwise edit the regex outside the $() to "escape" only portions of the matcher (say, add a ^ after s/ to make it match only the start of the file).
If in the above you need ^/$ to match ends-of-lines again, you'll need to unencode those:

cat path/to/the/file | xxd -p | tr -d '\n' | sed 's/0a/\n/g'\
| sed "s/^$(printf '%s' 'text' | xxd -p | tr -d '\n')/$(printf '%s' 'replacement' | xxd -p | tr -d '\n')/g" \
| sed 's/\n/0a/g' | xxd -p -r

Which'll change all lines in the file beginning with 'text' to instead start with 'replacement'.


Test:

Within ^/.[a]|$0\\{7}!!^/.[a]|$0\\{7}!!^/.[a]|$0\\{7}, replace literal ^/.[a]|$0\\{7} with literally $0\\

printf '%s' '^/.[a]|$0\\{7}!!^/.[a]|$0\\{7}!!^/.[a]|$0\\{7}' \
| xxd -p | tr -d '\n' \
| sed "s/$(printf '%s' '^/.[a]|$0\\{7}' | xxd -p | tr -d '\n')/$(printf '%s' '$0\\' | xxd -p | tr -d '\n')/g" \
| xxd -p -r

Output:

$0\\!!$0\\!!$0\\

Hashbrown

Posted 2012-05-09T15:00:43.013

Reputation: 1 720