The following file, fruit_notes.txt, has three pipe-separated columns: fruit, color, and tasting notes. I would like to print all lines that have a duplicated color field. Order is not important.
banana|YELLOW|My turtle likes these.
cherry|RED|Sweet and tasty
grapefruit|YELLOW|Very juicy
grape|PURPLE|Yummy
lemon|YELLOW|Sour!
apple|RED|Makes great pie
orange|ORANGE|Oranges make me laugh.
This works...
> grep -F "$(awk -F'|' '{print $2}' fruit_notes.txt | sort | uniq -d)" fruit_notes.txt
banana|YELLOW|My turtle likes these.
cherry|RED|Sweet and tasty
grapefruit|YELLOW|Very juicy
lemon|YELLOW|Sour!
apple|RED|Makes great pie
However, it seems like an awkward (no pun intended) solution. It reads the file twice: once to find the duplicates in the color field, and again to find the lines matching the duplicated colors. It is also error-prone, because grep -F matches the duplicated color names anywhere on a line, not just in the color field. For example, the following line would be incorrectly printed (GREEN is unique, but the notes mention RED):
jalapeños|GREEN|My face turns RED when I eat these!
Is there a better way to do this, maybe using awk alone?
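One single-pass sketch that seems plausible (untested beyond the sample data, and it buffers the whole file in awk arrays, so it trades memory for the second read): remember each line and its color, count the colors as you go, then print only the lines whose color occurred more than once in the END block.

```shell
# Recreate the sample file from the question
cat > fruit_notes.txt <<'EOF'
banana|YELLOW|My turtle likes these.
cherry|RED|Sweet and tasty
grapefruit|YELLOW|Very juicy
grape|PURPLE|Yummy
lemon|YELLOW|Sour!
apple|RED|Makes great pie
orange|ORANGE|Oranges make me laugh.
EOF

# Single pass: store each line and its color field, count colors,
# then print only the lines whose color appears more than once.
# Because only field $2 is counted, "RED" in the notes field is ignored.
awk -F'|' '
    { line[NR] = $0; color[NR] = $2; count[$2]++ }
    END { for (i = 1; i <= NR; i++) if (count[color[i]] > 1) print line[i] }
' fruit_notes.txt
```

For the sample above this prints the three YELLOW lines and the two RED lines, in file order, and would not be fooled by the jalapeños line since its color field (GREEN) is unique.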
That's brilliant. I wasn't sure if it was possible to do it in one pass. Doing some reading now to understand the magic that makes it work! – Sagebrush Gardener – 2019-06-13T01:05:08.917