29

I was wondering if there's any way to display some kind of progress info when searching for files in Linux using find. I often find myself searching for files on a big disk, and some kind of progress indicator would be very helpful, like a bar or at least the current directory find is searching in. Are there any scripts that do that, or does find support some hooks?

HopelessN00b
Vlad Balmos
  • Thanks for the answers, I'll check all the solutions and decide which one is better. If it were up to me I would mark all the answers as accepted. – Vlad Balmos Dec 15 '11 at 12:01
  • Depending on which search criteria you are using, locate is much faster than find – B14D3 Dec 15 '11 at 14:34

8 Answers

32

With this trick you can see the current folder, but there is no progress bar, sorry.

 watch readlink -f /proc/$(pidof find)/cwd
ThorstenS
  • That's cool. It should be noted, though, that you need superuser privileges in order to access cwd. Thanks! – Vlad Balmos Dec 15 '11 at 12:15
  • 1
    @ThorstenS works fine on Ubuntu 14.04 LTS but not on Ubuntu 16.04 LTS (even with `sudo`), maybe because the OS is now hiding the `cwd` of the process from the user – SebMa May 23 '20 at 13:18
9

A little utility called pv (pipe viewer) may help. From the fantastic summary by Peteris Krumins:

Pipe viewer is a terminal-based tool for monitoring the progress of data through a pipeline.

You can use pv in a number of ways. When playing around here, I put it immediately after a pipe to monitor progress of the output generated by find (should pass stdin to stdout untouched)

find / -mtime -1h | pv > /dev/null

which will show output a bit like this:

6.42MB 0:01:25 [31.7kB/s] [         <=>      ]

(I redirected stdout to /dev/null so I could see the progress bar in action without output flying by. This is likely not your intent with find, so tailor accordingly)

I'm honestly not sure how well this works in the wild. For "expensive" finds like the one above (traversing from root), it appeared to work fairly well. For simpler commands in a deeper node in the directory tree, pv failed miserably. These commands are returning results immediately, so a progress bar is probably moot here.

At any rate, play around and see if this works at all for what you need. Food for thought, at least.

tcdyl
  • What would that progress bar show? Neither `find` nor `pv` knows how long the search will take, so they cannot compute the percentage. All we can see in the `pv` output is the time since the search was started. – minaev Dec 15 '11 at 10:02
  • This is correct. I had thought there was some magic going on, somewhere, that allowed pv to check the progress of the directory traversal (which is incorrect). Given standard input at a constant rate, **pv** simply moves the progress bar at a constant frequency. Try `yes | pv > /dev/null` to observe – tcdyl Dec 15 '11 at 10:07
  • 1
    +1 for a nice utility – Vlad Balmos Dec 15 '11 at 12:26
  • Judging progress is nontrivial. Not even web browsers loading pages are able to do that. I guess for file contents you can divide by the file size, but for Unix streams you usually don't know the total amount of data, and this tool is meant to be flexible for any kind of stream data, not just files. – Sridhar Sarnobat Oct 16 '16 at 19:08
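
One way around the unknown total, sketched here under assumptions rather than taken from the answer above: run find twice, first to count the results and then with that count handed to pv through its -l (count lines) and -s (expected size) options, so pv can compute a percentage. The function name find_with_pv is made up for illustration. The traversal cost is paid twice, and the count can drift if the tree changes between the two passes.

```shell
# find_with_pv: hypothetical helper name, not part of find or pv.
# Usage: find_with_pv <start dir> [find tests...]
# Pass only start paths and tests: any -exec action would run in both passes.
find_with_pv() {
    # Pass 1: count how many lines find will print.
    # (File names containing newlines would be miscounted.)
    total=$(find "$@" | wc -l | tr -d ' ')
    # Pass 2: rerun the search; pv counts lines (-l) against the expected total (-s).
    find "$@" | pv -l -s "$total"
}
```

For example, `find_with_pv / -mtime -1 > results.txt` would write the matches to results.txt while pv draws its bar on stderr.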
7

I searched for this today and got here via Google. I had a long-running find running on OS X and apparently, watch doesn't exist there. So here's another solution:

lsof -Fn -a -c find -d cwd +r 10

  • lsof = list open files
  • -Fn = just show the name of the file/directory (prefixed with an 'n' character; skip this if you prefer the full lsof output)
  • -a = tell lsof to show only lines matching all criteria (by default it shows lines matching any criteria)
  • -c find = show files/directories opened by process named find (actually, process whose name starts with find, but it's case-sensitive so Finder won't show up)
  • -d cwd = show lines with FD (filedescriptor) cwd (current working directory)
  • +r 10 = show output every 10 seconds until no open files are found (find is finished)

This will show the directory find is processing every 10 seconds, so it should give an idea if find is still working and how far it has progressed.

Marie Fischer
5

There's an example of parallel searches with find in man find. Using it, you can perform multiple checks for every item, performing a different action depending on which condition matches. The first check may be, for example, a simple -print, so all names are printed to stdout. The second check will do what you want. Something like:

find /work \( -fprint /dev/stderr \) , \( -name 'core' -exec ls -l {} \; \)

If the second check should display filenames, too, you can redirect one of them to stderr using -fprint /dev/stderr.

jarno
minaev
  • I haven't tested this but I think it is the right way. – Rolf Feb 10 '19 at 14:30
  • You could use `-fprintf /dev/stderr '%h\n'` as the first action to print current directory during the operation, but it prints one directory line per each file. It would be better, if it printed only when the directory changes. – jarno Mar 22 '21 at 06:21
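
A minimal sketch of what jarno asks for, assuming GNU find (needed for -printf): print the directory part of every visited path and pipe it through uniq, which drops consecutive repeats, so a line appears only when find moves to a different directory. The function name find_showing_dirs is hypothetical, and this version only reports directories; it would run next to, or be merged into, the real search.

```shell
# find_showing_dirs: hypothetical helper name. Requires GNU find for -printf.
# Usage: find_showing_dirs <start dir> [find tests...]
find_showing_dirs() {
    # %h is the directory containing each visited path.
    # uniq collapses consecutive duplicates, so each directory is printed
    # once per visit instead of once per file.
    find "$@" -printf '%h\n' | uniq
}
```

A directory can still show up more than once, because find returns to the parent after finishing a subdirectory.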
3

This is a list of the files currently opened by find, so it's the same as what find is looking at 'right now'.

It's lightweight, as it just queries the file descriptors used by find every second and doesn't interfere with find itself. You can also do it with any program you wish.

# watch -n 1 'ls -l /proc/$(pidof find)/fd | cut -d ">" -f 2 | grep -v /dev/'

The grep -v /dev/ hides STDOUT, STDIN and STDERR, which are the files used to receive and print data on your console.

  • 2
    The above much higher-voted answers (as of the date I write this, anyways) from years ago simply don't work for me. *This* answer works. However, I suggest a modification of this answer to really see what the `find` is doing: `watch -n 1 'ls -l /proc/$(pidof find)/fd | grep -v /dev/'` (i.e, show the full line and show each file, not just the folder, and update every second). Thanks so much! – Dan Nissenbaum Feb 06 '21 at 04:48
2

AFAIK, it doesn't, and implementing it would be nontrivial.

... Hmm. Perhaps a script could run find <target dir> -type d first, store the list, and then, in a for loop, echo each dir before running find <list item> -maxdepth 1 <rest of find parameters> on it.

Note that you're trading a /significant/ loss of performance in exchange for being able to vaguely see what it's doing.
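
The idea above could be sketched roughly like this. It is only an illustration under assumptions: the function name find_with_progress is made up, -mindepth and -maxdepth need GNU or BSD find, and -mindepth 1 is added so each directory is not tested twice (once as itself and once as an entry of its parent), which also means the target directory itself is never tested.

```shell
# find_with_progress: hypothetical helper name, not part of find itself.
# Usage: find_with_progress <target dir> [rest of find parameters...]
find_with_progress() {
    target=$1
    shift
    # Pass 1: list every directory under the target.
    # (Directory names containing newlines would break this loop.)
    find "$target" -type d | while IFS= read -r dir; do
        # Report the directory about to be searched on stderr.
        printf 'searching: %s\n' "$dir" >&2
        # Pass 2: test only the direct entries of this directory.
        find "$dir" -mindepth 1 -maxdepth 1 "$@"
    done
}
```

For example, `find_with_progress /work -name core` prints the matches on stdout and one "searching:" line per directory on stderr, at the cost of starting a separate find for every directory.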

Shadur
1

If all you want is some kind of indication that something is happening, you can simply add an exec echo for each entry:

find . [other arguments] -exec echo -n '.' \;
noel
0

Not an exact answer to the question, but I imagine most use cases of find involve doing something non-trivial with the results, for example piping each result to tesseract for OCR processing.

An elegant way to obtain a progress bar is to use GNU Parallel with the --bar option (alternatively, there is also a --progress option).

Minimal example:

seq 1000 | parallel --bar sleep

Produces:

0% 3:997=16m12s 11

sappjw
tb90