11

I've got a huge directory on my computer and I need to search every Ruby file inside it for a string.

I could do it like this: grep -R "string" *.rb, but it takes a really long time, and I'd like to use pv (pipe viewer) to show a progress bar so I can monitor grep's progress.

But I don't really know how to write this command, because there are still some things I don't understand about pv.

Does anyone have an idea?

HopelessN00b
Cydonia7

6 Answers

15

pv operates on pipes (not commands) -- It is a volume gauge showing how much data has gone past a given point in the pipeline.
Your grep command is not a pipeline (| - the pipe operator is nowhere to be found) - it's just a single command doing its thing. pv can't help you here, you just have to trust that grep is actually doing its thing on all of the input files.

You could cobble something together with find, pv, xargs & grep (find . -name "*.rb" | pv | xargs grep [regex] looks like it might be promising), but you would have to tell pv how big the find output is for it to give meaningful results.
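One way to tell pv how big the find output is: run find once to count the files, then pass that count to pv with -s, while -l makes pv count lines instead of bytes. This is only a sketch under some assumptions: grep_rb_progress is a made-up helper name, file names must not contain whitespace or quotes (plain xargs splits on them), and it falls back to cat when pv isn't installed.

```shell
# Hypothetical helper: grep_rb_progress PATTERN [DIR]
# The body runs in a subshell, so its variables stay local.
grep_rb_progress() (
    pattern=$1
    dir=${2:-.}

    # First pass: count the Ruby files so pv knows what 100% means.
    total=$(( $(find "$dir" -type f -name '*.rb' | wc -l) ))

    # pv -l counts lines (one per file name); -s gives the expected line count.
    if command -v pv >/dev/null 2>&1; then
        meter="pv -l -s $total"
    else
        meter=cat   # no pv available: same results, just no progress bar
    fi

    # Second pass: the file names flow through pv on their way to grep.
    find "$dir" -type f -name '*.rb' | $meter | xargs grep -H -- "$pattern"
)
```

Called as grep_rb_progress "string" /path/to/project, it writes matches to stdout and the bar to stderr. The bar tracks file names handed to xargs, which reads them in batches, so it runs somewhat ahead of grep rather than showing exact progress.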

Frankly it seems like more work than it's worth. Just run your grep, wait patiently, and deal with the output when it's done.

voretaq7
4

Two more methods:

for file in *.rb; do echo "$file"; grep "string" "$file" >> output.txt; done

Or, in a different shell while your original command is running, find the pid of the grep command and then:

strace -q -s 256 -e trace=open,openat -p [pid] 2>&1 | head

Both of the above will show you which file the grep command is currently working on. You can find the total number of files with:

ls -l *.rb | wc -l

Lastly, use this to figure out which number the current file is in the list:

ls -l *.rb | grep -n [the current filename]

P.S. My answers assume that all your files are in a single directory. If they are not, you will have to use find instead of ls and *.rb as thinice suggested.
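The bookkeeping above (current file, total, position in the list) can also be folded into the loop itself. A rough sketch, where grep_rb_counted is a hypothetical name and find replaces the *.rb glob so that subdirectories are covered; it assumes file names contain no newlines.

```shell
# Hypothetical helper: grep_rb_counted PATTERN [DIR]
# The body runs in a subshell, so its variables stay local.
grep_rb_counted() (
    pattern=$1
    dir=${2:-.}

    # Save the file list once so the total is known before searching starts.
    list=$(mktemp) || exit 1
    find "$dir" -type f -name '*.rb' > "$list"
    total=$(( $(wc -l < "$list") ))

    n=0
    while IFS= read -r file; do
        n=$((n + 1))
        # Progress goes to stderr so it does not mix with the matches.
        printf '[%d/%d] %s\n' "$n" "$total" "$file" >&2
        grep -H -- "$pattern" "$file"
    done < "$list"

    rm -f "$list"
)
```

For example, grep_rb_counted "string" . > output.txt keeps the matches in a file while the [n/total] lines scroll past on the terminal. It starts one grep per file, which is slower than handing grep many files at once.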

Ladadadada
1

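I'm not sure what OS you're using, but grep -R "string" *.ext may not be working correctly for you. The shell expands the glob before grep runs, so grep only recurses into directories whose own names match it, and matching files in other subdirectories are never searched.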

You might be better served using find in conjunction with grep:

find . -type f -name "*.rb" -print0 | xargs --null grep "string"
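To see the difference between the two commands, here is a small throwaway experiment. The directory layout and file names are invented for illustration: one Ruby file at the top level and one in a subdirectory, both containing the string.

```shell
# Build a tiny tree: top.rb at the top level, nested.rb one level down.
demo=$(mktemp -d)
mkdir "$demo/lib"
echo 'needle' > "$demo/top.rb"
echo 'needle' > "$demo/lib/nested.rb"

# The shell expands *.rb to top.rb only, so -R has no directory to descend into.
glob_hits=$(cd "$demo" && grep -R -l "needle" *.rb | wc -l)

# find walks the whole tree, so the nested file is searched as well.
find_hits=$(cd "$demo" && find . -type f -name '*.rb' -print0 | xargs -0 grep -l "needle" | wc -l)

echo "glob: $((glob_hits)) file(s), find: $((find_hits)) file(s)"
rm -rf "$demo"
```

On this layout the glob version reports one matching file and the find version reports two.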

thinice
1

In recent versions of pv there is a -d option to watch all the file descriptors of another process.

So in theory pv works not only in a pipe but also as a progress indicator for a whole process. (For example, try it with the PID of your Firefox.)

For the problem above, a simpler idea is the following: while grep is running, use lsof together with watch.

$ watch -n 1 "lsof | grep -n $PWD"

That way you can monitor the progress of your grep.

Jan Walzer
0

Have you already tried

grep -R "string" *.rb | pv

I don't know if it actually works, though, because pv doesn't know the total amount of data to be searched, since the search is recursive.

nhutto
    I don't think this will do what he wants -- `pv` will be operating on the output of the grep, so even if he specified the full input size, `pv` only sees the output coming out the end of the pipe. It would be *way* under-counting bytes. – voretaq7 Dec 21 '11 at 17:49
0

I usually use the proc-filesystem on Linux systems, i.e.

ls -al /proc/<pid of grep>/fd

This lists all the files that the grep invocation currently has open, and thus gives an impression of where it is in the search.
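If you'd rather see plain file names than the ls -al listing, the symlinks in that directory can be resolved in a loop. A Linux-only sketch; open_files is a hypothetical helper, and it needs permission to read the target process's /proc entry.

```shell
# Hypothetical helper: open_files PID
# The body runs in a subshell, so its variables stay local.
open_files() (
    pid=$1
    for fd in /proc/"$pid"/fd/*; do
        # Each entry is a symlink to whatever the descriptor points at.
        target=$(readlink "$fd") || continue
        case $target in
            # Keep absolute paths; this skips pipe:[...] and socket:[...] entries.
            /*) printf '%s\n' "$target" ;;
        esac
    done
)
```

For example, open_files "$(pgrep -n grep)" prints what the most recently started grep has open. Terminals such as /dev/pts/0 show up too, since they are also absolute paths.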

centic