sed: match a string between two different consecutive matches on all occurrences

Because sed matches are "greedy" (more precisely, leftmost-longest), this is tricky. Try:

$ sed 's/OPEN/\n/g; s/[^\n]*\n//; s/CLOSE[^\n]*\n//g; s/CLOSE.*$//' file
qwertygrapessunshine
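
To try the command without the original file, here is a minimal sketch. The sample line and the helper name extract_between are made up for illustration and are not from the question; GNU sed is assumed.

```shell
# Hypothetical helper: reads lines on stdin and keeps only the text
# that sits between each OPEN and the next CLOSE. Assumes GNU sed.
extract_between() {
    sed 's/OPEN/\n/g; s/[^\n]*\n//; s/CLOSE[^\n]*\n//g; s/CLOSE.*$//'
}

# Made-up sample line, not taken from the original question.
sample='aaaOPENqwertyCLOSEbbbOPENgrapesCLOSEcccOPENsunshineCLOSEddd'
printf '%s\n' "$sample" | extract_between   # prints qwertygrapessunshine
```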

The above was tested on GNU sed. If you are on BSD/macOS, some minor but annoying changes will likely be required.

How it works

Remember that sed, by default, reads in one line at a time into its pattern space. This means that, when we start processing a pattern space, it will never contain a newline character. Thus, we can use a newline character, \n, as a marker with no possibility of ambiguity.
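
One way to convince yourself of this is to ask sed to print only the lines whose pattern space contains a newline. The helper name count_newline_lines is hypothetical, and the sketch assumes GNU sed with ordinary line-by-line reading (no N, G or similar commands).

```shell
# Count input lines whose pattern space contains a newline on arrival.
# sed strips the trailing newline of each line before running the script,
# so with plain line-by-line reading this count should be 0.
count_newline_lines() {
    sed -n '/\n/p' | wc -l | tr -d ' '
}

printf 'one\ntwo\nthree\n' | count_newline_lines   # prints 0
```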

s/OPEN/\n/g

Replace OPEN with newlines

Since the pattern space never contains a newline to begin with, every newline present after this step marks a spot where OPEN used to be.

s/[^\n]*\n//

Remove everything before the first OPEN (which is now a newline).

Note that [^\n]* matches zero or more of anything except a newline character. Consequently, [^\n]*\n matches zero or more non-newline characters followed by a newline. This means it matches up to and including the next newline. By contrast, because sed expressions are "greedy" (leftmost-longest), .*\n matches anything up to and including the last newline in the pattern space.
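
The difference is easy to see on a small made-up input. In this sketch (GNU sed assumed; the helper names up_to_first and up_to_last are hypothetical), commas stand in for OPEN and are first turned into newline markers:

```shell
# Both helpers first turn every comma into a newline marker (GNU sed),
# then delete up to a newline; only the second substitution differs.
up_to_first() { sed 's/,/\n/g; s/[^\n]*\n//'; }   # stops at the first newline
up_to_last()  { sed 's/,/\n/g; s/.*\n//'; }       # greedy: runs to the last newline

printf '%s\n' 'a,b,c' | up_to_first   # prints b and c on separate lines
printf '%s\n' 'a,b,c' | up_to_last    # prints c
```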

s/CLOSE[^\n]*\n//g

Remove everything starting from CLOSE and going to the next newline.

s/CLOSE.*$//

Remove everything from the first remaining CLOSE, which follows the last OPEN, to the end of the line.
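
The four substitutions can be watched one at a time with a rough tracing sketch like the following. The helper trace_step and the sample line are made up for illustration; GNU sed is assumed, and the sample is assumed not to contain a '|' character, which is used only to make the newline markers visible.

```shell
# Hypothetical tracer: runs the first N substitutions of the command
# on a made-up line, then shows each newline marker as '|'.
trace_step() {
    case $1 in
        1) script='s/OPEN/\n/g' ;;
        2) script='s/OPEN/\n/g; s/[^\n]*\n//' ;;
        3) script='s/OPEN/\n/g; s/[^\n]*\n//; s/CLOSE[^\n]*\n//g' ;;
        4) script='s/OPEN/\n/g; s/[^\n]*\n//; s/CLOSE[^\n]*\n//g; s/CLOSE.*$//' ;;
    esac
    printf '%s\n' 'aaOPENqwertyCLOSEbbOPENgrapesCLOSEcc' | sed "$script; s/\n/|/g"
}

trace_step 1   # aa|qwertyCLOSEbb|grapesCLOSEcc
trace_step 2   # qwertyCLOSEbb|grapesCLOSEcc
trace_step 3   # qwertygrapesCLOSEcc
trace_step 4   # qwertygrapes
```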

John1024

Posted 2019-08-02T21:01:47.773

Reputation: 13 893

Beautifully explained. Could you also tell me what [^\n]* is, and how it differs from things like .*? – cablewelo2ma – 2019-08-02T21:41:46.263

Thank you. [^\n]* matches zero or more of anything except a newline character. [^\n]*\n matches zero or more non-newline characters followed by a newline. This means it matches up to and including the next newline. By contrast, .*\n matches up to and including the last newline in the pattern space. – John1024 – 2019-08-02T21:55:30.323

Understood. Thank you – cablewelo2ma – 2019-08-02T22:07:59.890
