Generate a list of files by piping output of find command into another find command?

2

I need to generate a list of files for use in a shell script. The list should contain every file in a specified directory that has more than one hard link. I want to replace the extra hard links with symlinks.

(Obviously, I can't delete the last hardlink. And this question is related to this other question which has a fatal flaw.)

I'm open to any suggestions about how to do this. If you think this question is a duplicate, please make sure the other answer actually works. I haven't yet found a working solution that meets these requirements:

  • looks in a directory that potentially contains hardlinked files to keep
  • searches for other hardlinked files from a top level directory or file system root
  • both directories can be provided as parameters
  • can also act on files of specified types only (e.g., images)

My (new) idea is to pipe the output of this find:

find "$dir" -type f -links +1

into this one:

find "$topdir" -xdev -samefile <output from other find> -printf '%i:%p\n' | sort --field-separator=:
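To see what each of the two finds reports, here is a minimal sketch on a scratch tree. It assumes GNU find (for -samefile) and mktemp; the directory and file names (demo, keep, other, a.txt) are made up for illustration. Passing "$linked" straight to -samefile only works here because exactly one file matches.

```shell
# Scratch tree: keep/a.txt has a second link elsewhere, keep/solo.txt has none.
demo=$(mktemp -d)
mkdir -p "$demo/keep" "$demo/other"
echo data > "$demo/keep/a.txt"
ln "$demo/keep/a.txt" "$demo/other/a-link.txt"
echo solo > "$demo/keep/solo.txt"

# Step 1: regular files in the "keep" directory with more than one link.
linked=$(find "$demo/keep" -type f -links +1)

# Step 2: every path under the top directory that shares that file's inode.
same=$(find "$demo" -xdev -samefile "$linked" | sort)
same_count=$(printf '%s\n' "$same" | wc -l)

printf 'multi-link file: %s\n' "$linked"
printf 'paths sharing its inode:\n%s\n' "$same"

expected_linked="$demo/keep/a.txt"
rm -rf -- "$demo"
```

Because -samefile takes a single file name, a plain pipe between the two commands is not enough; the outer find has to hand over one path at a time.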

If that works, I will feed the resulting list to a while loop similar to this (from the original code):

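last_inode=
while IFS= read -r path_info
do
   inode=${path_info%%:*}
   # Strip only the shortest "inode:" prefix so paths containing ':' stay intact
   path=${path_info#*:}
   if [[ $last_inode != "$inode" ]]; then
       # First path seen for this inode: keep it as the symlink target
       printf '%s\n' "$inode"
       last_inode=$inode
       path_to_keep=$path
   else
       # Later path for the same inode: replace the hard link with a symlink
       rm -- "$path"
       ln -s -- "$path_to_keep" "$path"
   fi
done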
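How the inode:path record is split matters when a path contains a colon. The sketch below compares the two prefix-stripping expansions on a made-up record (the path is hypothetical): `#*:` removes the shortest prefix ending in a colon, `##*:` the longest.

```shell
# Hypothetical record in the inode:path format; the path itself contains colons.
path_info='1234:/srv/photos/2012-03-30 00:58:49.jpg'

inode=${path_info%%:*}        # drop everything from the first ':' onward
path_full=${path_info#*:}     # drop the shortest prefix ending in ':'
path_short=${path_info##*:}   # drop the longest prefix ending in ':'

printf 'inode:     %s\n' "$inode"
printf 'with  #*:  %s\n' "$path_full"
printf 'with ##*:  %s\n' "$path_short"
```

Only the `#*:` form returns the whole path here; the `##*:` form returns just the text after the last colon, which would then be passed to rm and ln.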

I can also add a parameter like -iname "*.jpg" to the (first) find command to act on JPEG files only. (I'm open to better suggestions here too.)
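To cover several image extensions at once, the name tests can be grouped with escaped parentheses and -o. This is a minimal sketch on a scratch tree; the file names and the extension list are assumptions chosen for illustration.

```shell
# Scratch tree with one hard-linked image pair and one hard-linked text pair.
demo=$(mktemp -d)
echo img > "$demo/photo.JPG"
ln "$demo/photo.JPG" "$demo/photo-copy.jpeg"
echo txt > "$demo/notes.txt"
ln "$demo/notes.txt" "$demo/notes-copy.txt"

# Multi-link regular files whose names match any listed pattern, case-insensitively.
images=$(find "$demo" -type f -links +1 \
    \( -iname '*.jpg' -o -iname '*.jpeg' -o -iname '*.png' \))
image_count=$(printf '%s\n' "$images" | wc -l)

printf '%s\n' "$images"
rm -rf -- "$demo"
```

The parentheses matter: without them, -o would split the whole expression and the -type and -links tests would apply to the first pattern only.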

MountainX

Posted 2012-03-30T00:58:49.363

Reputation: 1 735

Answers

1

Here's a solution that works. I tested it fairly extensively. However, I welcome better answers. I'd rather select someone else's answer than my own (which says something about my confidence in my bash scripting skills).

find "$dir" -type f -links +1 -exec find "$topdir" -xdev -samefile '{}' -printf '%i:%p\n' \; | sort -u --field-separator=:
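One edge case is worth checking with this nested find: when "$dir" itself holds two links to the same inode, the inner find runs once per link and reports every path twice. If those repeats reach the replacement loop, the kept path shows up a second time and gets replaced by a symlink to itself. The sketch below, which assumes GNU find and uses made-up names on a scratch tree, counts the lines with and without `sort -u`.

```shell
# Scratch tree where the "keep" directory contains both links to one inode.
demo=$(mktemp -d)
mkdir "$demo/keep"
echo data > "$demo/keep/a"
ln "$demo/keep/a" "$demo/keep/b"

# The inner find runs once for keep/a and once for keep/b.
raw=$(find "$demo/keep" -type f -links +1 \
    -exec find "$demo" -xdev -samefile '{}' -printf '%i:%p\n' \; |
    sort --field-separator=:)

# sort -u collapses the repeated inode:path lines.
unique=$(printf '%s\n' "$raw" | sort -u --field-separator=:)

raw_count=$(printf '%s\n' "$raw" | wc -l)
unique_count=$(printf '%s\n' "$unique" | wc -l)
printf 'lines without -u: %s\nlines with -u:    %s\n' "$raw_count" "$unique_count"
rm -rf -- "$demo"
```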

Here's the whole script, which extends the code from the linked question (assuming that code works):

#!/bin/bash
set -o nounset
topdir='/'
dir='/MotherBoards/Tyan S2720 Thunder i7500/IntelNetworkAdapterDrivers/Setup/'

echo "starting..."

# For each path which has multiple links
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# (except ones containing newline)
last_inode=
while IFS= read -r path_info
do
   inode=${path_info%%:*}
   # Strip only the shortest "inode:" prefix so paths containing ':' stay intact
   path=${path_info#*:}
   if [[ $last_inode != "$inode" ]]; then
       printf '%s\n' "$inode"
       last_inode=$inode
       path_to_keep=$path
   else
       # Pass the values as arguments so '%' or '\' in a path is not read as format syntax
       printf "%s\tln -s\t'%s'\t'%s'\n" "$inode" "$path_to_keep" "$path"
       rm -- "$path"
       ln -s -- "$path_to_keep" "$path"
   fi
done < <(
    # sort -u drops repeated lines, which appear when "$dir" holds several links to one inode
    find "$dir" -type f -links +1 -exec find "$topdir" -xdev -samefile '{}' -printf '%i:%p\n' \; | sort -u --field-separator=:
)

# Warn about any excluded files
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
buf=$( find "$dir" -type f -links +1 -wholename '*
*' )
if [[ $buf != '' ]]; then
    echo 'Some files not processed because their paths contained newline(s):'$'\n'"$buf"
fi

echo "finished"
exit 0
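After a run, the result can be checked with two more finds: one for files that still have extra links, and one for symlinks whose target does not exist. This is a minimal sketch on a scratch tree, not part of the script above; the names are made up, and it assumes `readlink` is available.

```shell
# Scratch tree with one hard-linked pair.
demo=$(mktemp -d)
mkdir "$demo/keep" "$demo/other"
echo data > "$demo/keep/a"
ln "$demo/keep/a" "$demo/other/a"

# Same replacement step as in the loop: remove the extra link, then symlink to the kept path.
rm -- "$demo/other/a"
ln -s -- "$demo/keep/a" "$demo/other/a"

# Check 1: no regular file should still have more than one link.
still_linked=$(find "$demo" -xdev -type f -links +1)

# Check 2: no symlink should be dangling (test -e follows the link).
broken=$(find "$demo" -xdev -type l ! -exec test -e '{}' \; -print)

# Check 3: the symlink points at the kept path and reads the same data.
target=$(readlink "$demo/other/a")
content=$(cat "$demo/other/a")
expected_target="$demo/keep/a"

printf 'still hard-linked: [%s]\nbroken symlinks: [%s]\n' "$still_linked" "$broken"
rm -rf -- "$demo"
```

The symlink target here is absolute because the kept path came from an absolute top directory; with a relative "$topdir" the links would resolve relative to their own location and could dangle.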

MountainX

Posted 2012-03-30T00:58:49.363

Reputation: 1 735