What's the best way of getting only the final match of a regular expression in a file using grep?
Also, is it possible to begin grepping from the end of the file instead of the beginning and stop when it finds the first match?
You could try
grep pattern file | tail -1
or
tac file | grep pattern | head -1
or
tac file | grep -m1 pattern
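For reference, the commands above can be wrapped in a small shell function. This is only a minimal sketch: the name `last_match` is made up for illustration, and it assumes `tac` may be missing (it ships with GNU coreutils but is often absent on macOS/BSD), in which case it falls back to `grep | tail`.

```shell
#!/bin/sh
# last_match PATTERN FILE: print only the final line of FILE matching PATTERN.
# (last_match is an illustrative name, not a standard command.)
last_match() {
    if command -v tac >/dev/null 2>&1; then
        # Read the file backwards and stop at the first hit.
        tac -- "$2" | grep -m1 -e "$1"
    else
        # Fallback without tac: scan forwards and keep only the last hit.
        grep -e "$1" -- "$2" | tail -n 1
    fi
}

# Small demo on a throwaway file.
tmp=$(mktemp)
printf '%s\n' 'id:1 start' 'noise' 'id:2 middle' 'id:3 end' 'noise' > "$tmp"
last_match 'id:[0-9]' "$tmp"   # prints: id:3 end
rm -f "$tmp"
```

The `tac` branch is the one that answers the second part of the question: `grep -m1` exits after the first match it sees, which is the last match in the original file.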
I always use cat, although it makes the command a little longer: cat file | grep pattern | tail -1
I blame my Linux admin course teacher at college, who loves cats :))))
-- You don't have to cat a file before grepping it. grep pattern file | tail -1
does the same job and is more efficient, too, since it avoids an extra process.
This is for someone working with huge text files on Unix/Linux/Mac/Cygwin. If you use Windows, check this out about Linux tools on Windows: https://stackoverflow.com/questions/3519738/what-is-the-best-way-to-use-linux-utilities-under-windows.
One can follow this workflow to get good performance: build an index of the compressed file with zindex, then query it with zq from the same package. Quote from its GitHub README:
Creating an index
zindex needs to be told what part of each line constitutes the index. This can be done by a regular expression, by field, or by piping each line through an external program.
By default zindex creates an index of file.gz.zindex when asked to index file.gz.
Example: create an index on lines matching a numeric regular expression. The capture group indicates the part that's to be indexed, and the options show each line has a unique, numeric index.
$ zindex file.gz --regex 'id:([0-9]+)' --numeric --unique
Example: create an index on the second field of a CSV file:
$ zindex file.gz --delimiter , --field 2
Example: create an index on a JSON field orderId.id in any of the items in the document root's actions array (requires jq). The jq query creates an array of all the orderId.ids, then joins them with a space to ensure each individual line piped to jq creates a single line of output, with multiple matches separated by spaces (which is the default separator).
$ zindex file.gz --pipe "jq --raw-output --unbuffered '[.actions[].orderId.id] | join(\" \")'"
Querying the index
The zq program is used to query an index. It's given the name of the compressed file and a list of queries. For example:
$ zq file.gz 1023 4443 554
It's also possible to output by line number, so to print lines 1 and 1000 from a file:
$ zq file.gz --line 1 1000
The above solutions only work for a single file. To print the last occurrence for many files (say, with suffix .txt), use the following bash script:
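#!/bin/bash
# Print the last matching line of every .txt file in the current directory.
# Globbing directly (instead of parsing ls) keeps file names with spaces intact.
for fn in *.txt
do
    result=$(grep 'pattern' "$fn" | tail -n 1)
    echo "$result"
done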
where 'pattern' is what you would like to grep for.
If you have several files, use an inline for loop:
for a in *.txt; do grep "pattern" "$a" /dev/null | tail -n 1; done
The /dev/null provides a second file so grep will list the filename where the pattern is found.
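The same idea can be packaged as a reusable function. The following is a minimal sketch, not a definitive implementation: `last_match_each` is a hypothetical name, and the check that skips non-regular files is an added assumption so that an unexpanded `*.txt` glob does not produce an error.

```shell
#!/bin/sh
# last_match_each PATTERN FILE...: for every FILE, print "FILE:last matching line".
# (last_match_each is an illustrative name, not a standard command.)
last_match_each() {
    pattern=$1
    shift
    for f in "$@"; do
        # Skip anything that is not a regular file (e.g. a glob that matched nothing).
        [ -f "$f" ] || continue
        # /dev/null is a second file argument, so grep prefixes each hit with the file name.
        grep -e "$pattern" -- "$f" /dev/null | tail -n 1
    done
}

# Small demo in a throwaway directory.
dir=$(mktemp -d)
printf '%s\n' 'x 1' 'y' 'x 2' > "$dir/a.txt"
printf '%s\n' 'y' 'x 9' 'y' > "$dir/b.txt"
(cd "$dir" && last_match_each 'x' *.txt)   # prints a.txt:x 2, then b.txt:x 9
rm -rf "$dir"
```

Files without any match produce no output at all, so the result contains one line per file that actually has a match.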