Counting number of occurrences of a string in all files in a folder

3

1

How do I count the total number of occurrences of a particular string across all files in a folder?

For example,

1.txt : 'hahaha hehe'
2.txt : 'ha hee'

I would like to count the number of occurrences of 'ha' in a folder containing all these files.

What I could think of is

grep "ha" * | wc -l

But that only counts the lines that contain a match, not every occurrence.

aceminer

Posted 2016-03-12T02:54:28.563

Reputation: 143

Answers

5

You're close. To get a total count of all occurrences of "ha" within all .txt files in a folder:

grep -o "ha" *.txt | wc -l

From man grep:

-o, --only-matching
       Print only the matched (non-empty) parts of a matching line, with
       each such part on a separate output line.

This works because each match is printed on a separate line, thus allowing wc -l to count all of them.

By default, however, grep prints each matching line once, no matter how many times the pattern occurs on it. Likewise, option -c counts matching lines rather than matches: a line with one match and a line with three both add 1 to the count.
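To see the difference, here is a small sketch that recreates the question's two files in a temporary directory and compares the line count with the match count. The directory and the variable names (dir, lines, matches) are only illustrative, and -o and -h are assumed to be available, as they are in GNU and BSD grep:

```shell
# Recreate the sample files from the question in a scratch directory.
dir=$(mktemp -d)
printf 'hahaha hehe\n' > "$dir/1.txt"
printf 'ha hee\n' > "$dir/2.txt"

# Plain grep prints each matching line once; -h suppresses filename prefixes.
lines=$(grep -h "ha" "$dir"/*.txt | wc -l)

# -o prints every match on its own line, so wc -l counts occurrences.
matches=$(grep -oh "ha" "$dir"/*.txt | wc -l)

echo "matching lines: $lines"
echo "occurrences:    $matches"

# Clean up the scratch directory.
rm -r "$dir"
```

With these two files, the first count is 2 (one matching line per file) and the second is 4 (three matches in 1.txt plus one in 2.txt).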

EDIT:

Here is a way to print the total number of occurrences within each individual file (with filenames):

find *.txt -printf 'echo "$(grep -o "ha" %p | wc -l) %p";' | sh

#Example output
3 file1.txt
1 file2.txt

Explanation:

find *.txt - the shell expands *.txt to the matching file names, and find processes each of them

-printf - prints the format string between the single quotes to standard output, replacing each %p with the current file name (-printf is a GNU find extension)

$(grep -o "ha" %p | wc -l) - works as above

| sh - the output of -printf (a series of commands) is piped to a shell and executed

Note that -printf is applied once per file name, so one echo command is generated for each file.
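An alternative sketch that avoids generating shell commands: a plain loop calls grep once per file and quotes each name, so file names containing spaces or shell metacharacters are handled safely. This is a swapped-in technique rather than the find-based approach above, and the sample files and variable names (dir, report, n) are only illustrative:

```shell
# Scratch directory with two sample files, one with a space in its name.
dir=$(mktemp -d)
printf 'hahaha hehe\n' > "$dir/a.txt"
printf 'ha hee\n' > "$dir/b c.txt"

report=$(
  cd "$dir" || exit 1
  for f in *.txt; do
    [ -e "$f" ] || continue              # skip the unexpanded pattern if no file matches
    n=$(grep -o "ha" "$f" | wc -l)       # one output line per match in this file
    echo "$((n)) $f"                     # $((n)) strips any padding some wc versions add
  done
)
echo "$report"

# Clean up the scratch directory.
rm -r "$dir"
```

Because each file name is passed to grep as a quoted argument instead of being pasted into a command string, nothing in the name is ever interpreted by the shell.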

flyingfinger

Posted 2016-03-12T02:54:28.563

Reputation: 211

How about if I want to find out the number of occurrences in each individual file? – aceminer – 2016-03-12T05:55:05.223

Guys, is there some reason why you cannot use the built-in features of grep? grep -H -c ip6 * seems to me to be what's requested – Gombai Sándor – 2016-03-12T22:22:25.610

@GombaiSándor: -H prints the filenames but the -c option still only finds one match per line. OP wants all – flyingfinger – 2016-03-15T00:08:23.607

3

Instead of grep, try The Silver Searcher (ag). Running ag -c ha prints the number of matches in each file:

1.txt:3
2.txt:1

It is also faster. If you are using Ubuntu, you can install the silversearcher-ag package.

Sergey Voronezhskiy

Posted 2016-03-12T02:54:28.563

Reputation: 166