Sed/awk/perl match all prefixes of a given string

2

I'd like to match all lines in a file where either my word is a prefix of the line, or the line is a prefix of my word. For example, searching for "abc" should match:

a
ab
abc
abcd
abcxyz

but not:

xabc
zzab
xaz

The "my word is a prefix of" part is easy, just match on "^abc" of course, but I haven't come up with a solution for the "line is a prefix of my word" bit. I tried something in awk but wasn't able to make the line contents a part of the regular expression.

Ossifer

Posted 2017-02-02T20:27:59.810

Reputation: 23

Perhaps you're looking for grep -e '^abc' -e 'abc$' – janos – 2017-02-02T20:33:35.697

That won't match the prefixes of "abc", but will match lines for which it is a suffix, like "xabc". – Ossifer – 2017-02-02T20:38:47.100

egrep -E '^ab?c?' ? Give us an example of what you want as a result – Alex – 2017-02-02T20:45:49.133

For that, you'd have to write grep -E '^a$|^ab$|^abc' – janos – 2017-02-02T20:52:35.470

grep -e '^a' -e '^ab' -e '^abc'? – Alex – 2017-02-02T20:56:15.060

Alex and Janos: Yes, I can obviously enumerate the prefixes on my own, but this becomes tedious when the search string has many characters. I would also need to script it. – Ossifer – 2017-02-02T21:01:49.357

@Ossifer If you're scripting it, it's easy to create the regexp dynamically from the input, using a loop. – Barmar – 2017-02-02T21:34:41.620

Answers

1

There are two cases you need to handle: the line is shorter than your search string, or it's longer.

When it's longer, you want to test if the beginning of the line is equal to the search string.

When it's shorter, you want to test if the beginning of the search string is equal to the line.

In the cases where the lengths are equal, either method works.

awk -v search=abc 'length() > length(search) ? substr($0, 1, length(search)) == search : substr(search, 1, length()) == $0' inputfile
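The same two-way test can also be written with awk's index(), which returns 1 when its second argument occurs at the very start of the first. What follows is only a minimal sketch: the function name prefix_match is made up for illustration, and it assumes empty lines should be skipped (the one-liner above prints them, since an empty line is trivially a prefix of the search string). Because it compares strings rather than building a regex, characters like "." or "*" in the search word need no escaping, though -v still interprets backslash escapes in the value.

```shell
# Hypothetical helper: read lines on stdin and print those that are a
# prefix of the search word ($1) or that have the search word as a prefix.
prefix_match() {
    awk -v search="$1" '
        # Assumption: empty lines are not wanted, so require a non-empty line.
        # index(a, b) == 1 means b occurs at the very start of a.
        length($0) > 0 && (index($0, search) == 1 || index(search, $0) == 1)
    '
}

# Sample input from the question.
printf '%s\n' a ab abc abcd abcxyz xabc zzab xaz | prefix_match abc
```

With the sample input from the question this prints a, ab, abc, abcd and abcxyz, one per line. To search a file instead, redirect it: prefix_match abc < inputfile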

Barmar

Posted 2017-02-02T20:27:59.810

Reputation: 1 913

Perfect -- thanks! I knew awk would be best, but didn't think of using substr instead of attempting regex. – Ossifer – 2017-02-02T22:20:49.540