Take a file of regexes and return the first match for each in another file

1

I have two files. a.txt has a list of regexes, separated by newlines. b.txt has lines, some of which match the regexes in a.txt.

What I want: a command (grep, probably) that takes the two files and, for each line in a.txt, prints the first whole-line match from b.txt. An ideal solution would also print the regex itself as a prefix and, if there is no match, print 'no match' or something else distinctive. A solution missing either or both of these is still good enough.

What I'm currently using to test solutions:

a.txt:

[abc]*qs
ab[cqs]*
w+x+

b.txt:

aqs
abqs
abs

Best things I've tried are grep -xf a.txt b.txt, which prints

aqs
abqs
abs

and grep -xcf a.txt b.txt which prints 3.

Ideal output would be

[abc]*qs aqs
ab[cqs]* abqs
w+x+ None

Minimally acceptable output would be

aqs
abqs

Jacob Kopczynski

Posted 2018-09-15T01:39:54.300

Reputation: 21

Answers

2

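# IFS= and -r keep each pattern exactly as written in a.txt
while IFS= read -r pattern; do
   printf '%s ' "$pattern"
   # -e keeps a pattern that starts with a dash from being parsed as an option
   grep -x -m 1 -e "$pattern" b.txt || printf '%s\n' 'None'
done <a.txt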

This reads the patterns from a.txt one at a time, runs grep once per pattern, and uses printf to print the pattern as a prefix and None when grep finds no match.

Note: grep -m 1 stops after the first match, but the -m option is not specified by POSIX. If your grep lacks it, replace the line containing grep with this one:

{ grep -x -e "$pattern" b.txt || printf '%s\n' 'None'; } | head -n 1
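For anyone who wants to try this against the question's test files, here is a minimal sketch that wraps the portable variant in a function and runs it on freshly created copies of a.txt and b.txt. The function name first_matches and the scratch directory are assumptions made for illustration, and mktemp -d is widely available but not itself POSIX.

```shell
#!/bin/sh
# first_matches is a hypothetical helper name used only for this sketch.
# Usage: first_matches PATTERN_FILE TEXT_FILE
first_matches() {
    # IFS= and -r keep each pattern exactly as written; the extra test
    # also handles a last line that lacks a trailing newline.
    while IFS= read -r pattern || [ -n "$pattern" ]; do
        printf '%s ' "$pattern"
        # -e protects patterns that begin with a dash;
        # head keeps only the first matching line (or the None fallback).
        { grep -x -e "$pattern" "$2" || printf '%s\n' 'None'; } | head -n 1
    done <"$1"
}

# Recreate the question's sample files in a scratch directory.
dir=$(mktemp -d)
printf '%s\n' '[abc]*qs' 'ab[cqs]*' 'w+x+' >"$dir/a.txt"
printf '%s\n' 'aqs' 'abqs' 'abs' >"$dir/b.txt"

first_matches "$dir/a.txt" "$dir/b.txt"
rm -r "$dir"
```

With the sample files this should print the three lines from the question's ideal output. Note that grep treats the patterns as basic regular expressions by default, so the + in w+x+ is a literal plus sign; if extended syntax is intended, adding -E to the grep call would be the usual change.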

Kamil Maciorowski

Posted 2018-09-15T01:39:54.300

Reputation: 38 429