Bash: Find folder containing two files

3

3

Hi, I have a large directory tree. I want to find all directories that contain both a file whose name ends with ".ext1" and a file whose name ends with ".ext2".

How can this be done? I thought about running find twice, once for ".ext1" and once for ".ext2", but then I would need the intersection of the two results. How can I compute that?

Thanks!

theomega

Posted 2010-09-15T12:29:24.090

Reputation: 934

Answers

5

Here's a relatively simple solution that runs find only once, stores its output in a temporary file and then splits out and collates the results for the two extensions.

tmp=$(mktemp)
find . -name '*.ext1' -o -name '*.ext2' | sort >"$tmp"
comm -12 <(<"$tmp" sed -n 's!/[^/]*\.ext1$!!p' | sort -u) \
         <(<"$tmp" sed -n 's!/[^/]*\.ext2$!!p' | sort -u)
rm "$tmp"

Another method is to have find iterate over the directories and use an auxiliary program to check whether each pattern has matches. Note that the plain existence check works in this case because an unmatched pattern such as *.ext1 is left as a literal string, and a file with that literal name would itself match the pattern. You need something more complicated if your search pattern doesn't match itself and is a possible file name.

find . -type d -exec sh -c '{ set "$0"/*.ext1; [ -e "$1" ]; } &&
                            { set "$0"/*.ext2; [ -e "$1" ]; }' {} \;
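For the "something more complicated" case, one possible sketch is to let bash drop unmatched patterns with nullglob instead of testing for existence. This assumes bash rather than plain sh, and has_both is a hypothetical helper name, not something from the commands above.

```shell
# has_both DIR PATTERN_A PATTERN_B
# Succeeds if DIR directly contains at least one entry matching each pattern.
# Hypothetical helper; runs in a subshell so the caller's options are untouched.
has_both() (
    shopt -s nullglob   # an unmatched pattern expands to nothing, not to itself
    IFS=                # do not word-split the unquoted pattern arguments
    a=("$1"/$2)
    b=("$1"/$3)
    (( ${#a[@]} > 0 && ${#b[@]} > 0 ))
)

# Possible usage:
#   find . -type d -print0 |
#     while IFS= read -r -d '' d; do
#       has_both "$d" '*.ext1' '*.ext2' && printf '%s\n' "$d"
#     done
```

Because the decision is based on the number of matches, a file literally named [ab].ext1 is not mistaken for a match of the pattern [ab].ext1.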

Here's a zsh solution that operates like this last command:

echo **/*(/e\''set -- $REPLY/*.ext1(N[1]) $REPLY/*.ext2(N[1]); ((#==2))'\')

Here's another zsh solution that looks for *.ext1 and selects only the directories that also have *.ext2:

echo ./**/*.ext1(e\''REPLY=${REPLY:h}; set -- $REPLY/*.ext2(N); ((#))'\')

Here's a partial Perl solution; due to the vicissitudes of Perl's globbing, it won't work if directory names contain spaces (there are ways to fix this, but I can't find one that's remotely elegant).

perl -l -MFile::Find -e \
  'find {no_chdir => 1,
         wanted => sub {<$_/*.ext1> and <$_/*.ext2> and print}}, "."'

Gilles 'SO- stop being evil'

Posted 2010-09-15T12:29:24.090

Reputation: 58 319

2

If you know the exact name of each file:

find start_dir -type d -exec test -e {}/file.ext1 -a -e {}/file.ext2 \; -print

This exceptionally ugly hack will work if all you know are the extensions:

find start_dir -type d -execdir bash -c 'shopt -s nullglob; eval '\''test -n "$(echo '{}'/*.ext1)" -a -n "$(echo '{}'/*.ext2)"'\''' \; -print

It could also be pretty slow if you have a lot of directories to search.
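Much of that cost comes from starting one shell per directory. A possible way to reduce it, sketched here under the assumption that bash is available, is to hand many directories to a single shell with the "{} +" form of -exec. The function name dirs_with_both is hypothetical.

```shell
# dirs_with_both [START]
# Print every directory under START (default: .) that directly contains
# at least one *.ext1 file and at least one *.ext2 file.
# Hypothetical helper; find passes directories to bash in batches.
dirs_with_both() {
    find "${1:-.}" -type d -exec bash -c '
        shopt -s nullglob                 # unmatched globs expand to nothing
        for d do
            a=("$d"/*.ext1) b=("$d"/*.ext2)
            if (( ${#a[@]} && ${#b[@]} )); then
                printf "%s\n" "$d"
            fi
        done
    ' bash {} +
}
```

Each directory is printed at most once, and names containing spaces are passed through intact because they arrive as separate arguments.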

Paused until further notice.

Posted 2010-09-15T12:29:24.090

Reputation: 86 075

+1 for making me laugh when your self-proclaimed "hack" started to unravel :-) . Well played, sir. – Daniel Andersson – 2012-05-12T15:03:01.510

1

If run from the top of the directory tree, this seems to work:

find $(find . -name "*.ext1" -printf %h\\n) -name "*.ext2" -printf %h\\n
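A variant of the same idea, sketched under the assumption of bash and GNU find (for -printf), runs the two finds independently and intersects their sorted output with comm. This avoids word splitting on directory names with spaces and prints each directory once. It also avoids a side effect of the nested form, where the outer find descends into subdirectories of each directory it is given. The function names are hypothetical, and directory names containing newlines are not handled.

```shell
# dirs_with_ext START EXT
# Sorted, de-duplicated list of directories under START that contain
# an entry whose name ends in .EXT (hypothetical helper).
dirs_with_ext() {
    find "$1" -name "*.$2" -printf '%h\n' | sort -u
}

# intersect_dirs [START]
# Directories that appear in both lists, i.e. that hold both kinds of file.
intersect_dirs() {
    comm -12 <(dirs_with_ext "${1:-.}" ext1) \
             <(dirs_with_ext "${1:-.}" ext2)
}
```

Calling intersect_dirs with no argument searches the current directory.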

W_Whalley

Posted 2010-09-15T12:29:24.090

Reputation: 3 212

+1 Good! But it can output the same directory more than once if it contains two or more files that match the pattern, for example: a.ext1 b.ext1 c.ext2. [...] | sort -u may help. – cYrus – 2010-09-15T14:47:39.697

It doesn't work if directory names include spaces. – Paused until further notice. – 2010-09-15T14:57:25.093

@Dennis: Right! Again... :) – cYrus – 2010-09-15T15:06:20.747

0

There must be a better solution, but:

find -type d | while IFS= read -r i ; do [ "$(ls -1 "$i" | grep -oe '\.ext1$\|\.ext2$' | sort -u | wc -l)" -eq 2 ] && echo "$i" ; done

cYrus

Posted 2010-09-15T12:29:24.090

Reputation: 18 102

You need the directory to search in there: find <directory> -type d ... – dtlussier – 2010-09-15T15:12:24.617

I prefer that find works in the current working directory. – cYrus – 2010-09-15T15:21:41.503