104
38
I run

ln /a/A /b/B

and I would like to see, in the folder /a, where the file A points to, using ls.
182
You can find the inode number for your file with
ls -i
and
ls -l
shows the reference count (the number of hardlinks to a particular inode).
Once you have found the inode number, you can search for all files with the same inode:
find . -inum NUM
This will show the filenames for inode NUM in the current dir (.).
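For instance, the two commands combine like this (a hypothetical sandbox; the file names are made up):

```shell
# Create a file, hard-link it, then find every name sharing its inode.
cd "$(mktemp -d)"
echo data > A
ln A B                               # B is a second name for A's inode
inum=$(ls -i A | awk '{print $1}')   # first field of `ls -i` is the inode
find . -inum "$inum"                 # lists ./A and ./B
```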
1@BeowulfNode42 This command is great, but it at least needs a search root shared by all the linked files. – Itachi – 2016-09-26T04:41:10.853
2
This answer gives a pragmatic "do this", but I feel strongly that @LaurenceGonsalves answers the "how" and/or "why" questions. – Trevor Boyd Smith – 2016-11-16T18:12:26.800

50You could just run find . -samefile filename – BeowulfNode42 – 2013-11-25T00:02:12.303
66
There isn't really a well-defined answer to your question. Unlike symlinks, hardlinks are indistinguishable from the "original file".
Directory entries consist of a filename and a pointer to an inode. The inode in turn contains the file metadata and (pointers to) the actual file contents. Creating a hard link creates another filename + reference to the same inode. These references are unidirectional (in typical filesystems, at least) -- the inode only keeps a reference count. There is no intrinsic way to find out which is the "original" filename.
By the way, this is why the system call to "delete" a file is called unlink. It just removes a hardlink. The inode and attached data are deleted only if the inode's reference count drops to 0.
The only way to find the other references to a given inode is to exhaustively search over the file system, checking which files refer to the inode in question. You can use 'test A -ef B' from the shell to perform this check for two known files.
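A minimal sketch of that shell check (sandbox and file names invented):

```shell
# A and B name the same inode; C is a symlink that resolves to A.
cd "$(mktemp -d)"
echo data > A
ln A B
ln -s A C
[ A -ef B ] && echo "A and B are the same file"   # same device + inode
[ A -ef C ] && echo "-ef follows symlinks"        # C resolves to A
```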
35That means that there is no such thing as a hard link to another file, as the original file is also a hard link; hard links point to a location on disk. – jtbandes – 2009-07-26T00:03:58.710
12@jtbandes: Hard links point to an inode which points to the actual data. – dash17291 – 2013-06-13T19:34:10.677
33
UNIX has hard links and symbolic links (made with "ln"
and "ln -s"
respectively). A symbolic link is simply a file that contains the real path to another file, and it can cross filesystems.
Hard links have been around since the earliest days of UNIX (that I can remember, anyway, and that's going back quite a while). They are two directory entries that reference the exact same underlying data. The data in a file is specified by its inode.
Each file on a file system points to an inode, but there's no requirement that each file point to a unique inode - that's where hard links come from.
Since inodes are unique only for a given filesystem, there's a limitation that hard links must be on the same filesystem (unlike symbolic links). Note that, unlike symbolic links, there is no privileged file - they are all equal. The data area will only be released when all the files using that inode are deleted (and all processes close it as well, but that's a different issue).
You can use the "ls -i"
command to get the inode of a particular file. You can then use the "find <filesystemroot> -inum <inode>"
command to find all files on the filesystem with that given inode.
Here's a script which does exactly that. You invoke it with:
findhardlinks ~/jquery.js
and it will find all files on that filesystem which are hard links for that file:
pax@daemonspawn:~# ./findhardlinks /home/pax/jquery.js
Processing '/home/pax/jquery.js'
'/home/pax/jquery.js' has inode 5211995 on mount point '/'
/home/common/jquery-1.2.6.min.js
/home/pax/jquery.js
Here's the script.
#!/bin/bash
if [[ $# -lt 1 ]] ; then
    echo "Usage: findhardlinks <fileOrDirToFindFor> ..."
    exit 1
fi

while [[ $# -ge 1 ]] ; do
    echo "Processing '$1'"
    if [[ ! -r "$1" ]] ; then
        echo "   '$1' is not accessible"
    else
        numlinks=$(ls -ld "$1" | awk '{print $2}')
        inode=$(ls -id "$1" | awk '{print $1}' | head -n 1)
        device=$(df "$1" | tail -n 1 | awk '{print $6}')
        echo "   '$1' has inode ${inode} on mount point '${device}'"
        find "${device}" -inum "${inode}" 2>/dev/null | sed 's/^/   /'
    fi
    shift
done
@pax: There seems to be a bug in the script. I start it by . ./findhardlinks.bash
while being in OS X's Zsh. My current window in Screen closes. – None – 2009-07-25T16:31:37.153
4@Masi The issue is your initial . (same as the source command). That causes the exit 1 command to exit your shell. Use chmod a+x findhardlinks.bash then execute it with ./findhardlinks.bash or use bash findhardlinks.bash – njsf – 2009-07-25T23:08:15.487
Please, see my reply to your answer at http://superuser.com/questions/12972/to-see-hardlinks-by-ls/13233#13233 – Léo Léopold Hertz 준영 – 2009-07-26T16:42:00.727

Best answer, by far. Kudos. – MariusMatutiae – 2015-06-27T08:24:50.247
Yeah, great explanation with the correct answer :) – sMyles – 2017-05-26T02:49:54.577
@Joe do you have any suggestion as to what to use for device? When attempted on my Mac I had to replace the 6th position with the 9th position. Also stat has different flags on Mac: it should be -f instead of -c. – guyarad – 2019-01-16T17:57:59.577
Found a solution to the df output issues here. Simply add -P to the df command to get POSIX-compliant output.
@guyarad Good, glad you got it figured out. Because I have no idea what this is anymore. That was from 7 years ago – Joe – 2019-01-17T19:36:08.087
Shame on you @Joe ! not remembering a comment on a SE post you wrote 7 years ago :) – guyarad – 2019-01-20T05:30:27.637
3To do this programmatically, it's probably more resilient if you use this instead: INUM=$(stat -c %i $1). Also NUM_LINKS=$(stat -c %h $1). See man stat for more format variables you can use. – Joe – 2012-01-03T20:12:21.513
24
ls -l
The first column shows the permissions. The second column is the link count: for a regular file, the number of paths (hard links, including the original name) to the same data; for a directory, the number of entries linking to it. Eg:
-rw-r--r--@ 2 [username] [group] [timestamp] HardLink
-rw-r--r--@ 2 [username] [group] [timestamp] Original
^ Number of hard links to the data
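A quick way to watch that column change (a sandbox sketch; GNU stat's -c %h is assumed — on macOS the equivalent is stat -f %l):

```shell
cd "$(mktemp -d)"
echo data > Original
stat -c %h Original                 # prints 1: only one name so far
ln Original HardLink
stat -c %h Original                 # prints 2: both names count
ls -l Original | awk '{print $2}'   # the same count, from the ls -l column
```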
3Helpful in determining IF a given file has [other] hard links, but not WHERE they are. – mklement0 – 2015-02-11T03:48:48.263
Also, there's no technical distinction between a hard link and an original file. They are both identical in that they simply point to the inode, which in turn points to disc content. – guyarad – 2019-01-16T17:59:32.627
14
How about the following, simpler approach? (It might replace the long scripts above!)
If you have a specific file <THEFILENAME> and want to know all its hardlinks spread over the directory <TARGETDIR> (which can even be the entire filesystem, denoted by /):
find <TARGETDIR> -type f -samefile <THEFILENAME>
Extending the logic, if you want to know all the files in <SOURCEDIR> having multiple hard-links spread over <TARGETDIR>:
find <SOURCEDIR> -type f -links +1 \
-printf "\n\n %n HardLinks of file : %H/%f \n" \
-exec find <TARGETDIR> -type f -samefile {} \;
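A small sandbox run of the first form (directory and file names invented):

```shell
# One data blob with two names in different directories.
cd "$(mktemp -d)"
mkdir -p src dst
echo data > src/file
ln src/file dst/copy
find . -type f -samefile src/file   # lists ./src/file and ./dst/copy
```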
3@silvio: You can only create hard links to files, not directories. – mklement0 – 2015-02-11T03:45:53.667
@mklement0: You are right! – silvio – 2015-02-11T10:38:10.860
The . and .. entries in directories are hardlinks. You can tell how many subdirs are in a directory from the link count of the . entry. This is moot anyway, since find -samefile . still won't print any subdir/.. output. find (at least the GNU version) seems to be hardcoded to ignore .., even with -noleaf. – Peter Cordes – 2015-04-21T18:53:59.940
also, that find-all-links idea is O(n^2), and runs find once for each member of a set of hardlinked files. find ... -printf '%16i %p\n' | sort -n | uniq -w 16 --all-repeated=separate would work (16 isn't wide enough for a decimal representation of 2^63-1, so when your XFS filesystem is big enough to have inode numbers that high, watch out). – Peter Cordes – 2015-04-21T19:06:37.773
turned that into an answer – Peter Cordes – 2015-04-21T19:33:28.253
This is for me the best answer! But I would not use -type f, because the file can be a directory too. – silvio – 2013-08-30T11:40:44.810
6
There are a lot of answers with scripts to find all hardlinks in a filesystem. Most of them do silly things like running find to scan the whole filesystem for -samefile
for EACH multiply-linked file. This is crazy; all you need is to sort on inode number and print duplicates.
With only one pass over the filesystem to find and group all sets of hardlinked files
find dirs -xdev \! -type d -links +1 -printf '%20D %20i %p\n' |
sort -n | uniq -w 42 --all-repeated=separate
This is much faster than the other answers for finding multiple sets of hardlinked files.
find /foo -samefile /bar
is excellent for just one file.
-xdev : limit to one filesystem. Not strictly needed, since we also print the FS-id to uniq on.
! -type d : reject directories: the . and .. entries mean they're always linked.
-links +1 : link count strictly > 1.
-printf ... : print FS-id, inode number, and path. (With padding to fixed column widths that we can tell uniq about.)
sort -n | uniq ... : numeric sort and uniquify on the first 42 columns, separating groups with a blank line.

Using ! -type d -links +1 means that sort's input is only as big as the final output of uniq, so we aren't doing a huge amount of string sorting. (Unless you run it on a subdirectory that only contains one of a set of hardlinks.) Anyway, this will use a LOT less CPU time re-traversing the filesystem than any other posted solution.
sample output:
...
2429 76732484 /home/peter/weird-filenames/test/.hiddendir/foo bar
2429 76732484 /home/peter/weird-filenames/test.orig/.hiddendir/foo bar
2430 17961006 /usr/bin/pkg-config.real
2430 17961006 /usr/bin/x86_64-pc-linux-gnu-pkg-config
2430 36646920 /usr/lib/i386-linux-gnu/dri/i915_dri.so
2430 36646920 /usr/lib/i386-linux-gnu/dri/i965_dri.so
2430 36646920 /usr/lib/i386-linux-gnu/dri/nouveau_vieux_dri.so
2430 36646920 /usr/lib/i386-linux-gnu/dri/r200_dri.so
2430 36646920 /usr/lib/i386-linux-gnu/dri/radeon_dri.so
...
TODO?: un-pad the output with awk or cut. uniq has very limited field-selection support, so I pad the find output and use fixed-width columns. 20 chars is wide enough for the maximum possible inode or device number (2^64-1 = 18446744073709551615). XFS chooses inode numbers based on where on disk they're allocated, not contiguously from 0, so large XFS filesystems can have >32-bit inode numbers even if they don't have billions of files. Other filesystems might have 20-digit inode numbers even if they aren't gigantic.
TODO: sort groups of duplicates by path. Having them sorted by mount point then inode number mixes things together, if you have a couple different subdirs that have lots of hardlinks. (i.e. groups of dup-groups go together, but the output mixes them up).
A final sort -k 3 would sort lines separately, not groups of lines as a single record. Preprocessing with something to transform a pair of newlines into a NUL byte, and using GNU sort --zero-terminated -k 3, might do the trick. tr only operates on single characters, though, not 2->1 or 1->2 patterns. perl would do it (or just parse and sort within perl or awk). sed might also work.
1%D is the filesystem identifier (it is unique for the current boot while no filesystems are umounted), so the following is even more generic: find directories.. -xdev ! -type d -links +1 -printf '%20i %20D %p\n' | sort -n | uniq -w 42 --all-repeated=separate. This works as long as no given directory contains another directory on the filesystem level; also it looks at everything which can be hardlinked (like devices or softlinks - yes, softlinks can have a link count greater than 1). Note that dev_t and ino_t are 64 bits long today. This likely will hold as long as we have 64 bit systems. – Tino – 2015-11-09T14:34:28.280
@Tino: great point about using ! -type d instead of -type f. I even have some hardlinked symlinks on my filesystem from organizing some collections of files. Updated my answer with your improved version (but I put the fs-id first, so the sort order at least groups by filesystem). – Peter Cordes – 2015-11-09T18:45:37.093
3
This is somewhat of a comment to Torocoro-Macho's own answer and script, but it obviously won't fit in the comment box.
I rewrote your script with more straightforward ways to find the info, and thus far fewer process invocations.
#!/bin/sh
xPATH=$(readlink -f -- "${1}")
for xFILE in "${xPATH}"/*; do
    [ -d "${xFILE}" ] && continue
    [ ! -r "${xFILE}" ] && printf '"%s" is not readable.\n' "${xFILE}" 1>&2 && continue
    nLINKS=$(stat -c%h "${xFILE}")
    if [ "${nLINKS}" -gt 1 ]; then
        iNODE=$(stat -c%i "${xFILE}")
        xDEVICE=$(stat -c%m "${xFILE}")
        printf '\nItem: %s[%d] = %s\n' "${xDEVICE}" "${iNODE}" "${xFILE}"
        find "${xDEVICE}" -inum "${iNODE}" -not -path "${xFILE}" -printf '    -> %p\n' 2>/dev/null
    fi
done
I tried to keep it as similar to yours as possible for easy comparison.
One should always avoid the $IFS magic if a glob suffices, since it is unnecessarily convoluted, and file names can actually contain newlines (but in practice, mostly for the first reason).
You should avoid manually parsing ls
and such output as much as possible, since it will sooner or later bite you. For example: in your first awk
line, you fail on all file names containing spaces.
printf
will often save troubles in the end since it is so robust with the %s
syntax. It also gives you full control over the output, and is consistent across all systems, unlike echo
.
stat
can save you a lot of logic in this case.
GNU find
is powerful.
Your head
and tail
invocations could have been handled directly in awk
with e.g. the exit
command and/or selecting on the NR
variable. This would save process invocations, which almost always improves performance considerably in hard-working scripts.
Your egrep
s could just as well be just grep
.
If you just want groups of hardlinks, rather than repeated with each member as the "master", use find ... -xdev -type f -links +1 -printf '%16i %p\n' | sort -n | uniq -w 16 --all-repeated=separate
. This is MUCH faster, as it only traverses the fs once. For multiple FSes at once, you'd need to prefix the inode numbers with a FS id. Maybe with find -exec stat... -printf ...
– Peter Cordes – 2015-04-21T19:14:06.927
turned that idea into an answer – Peter Cordes – 2015-04-21T19:33:40.440
xDEVICE=$(stat -c%m "${xFILE}") does not work on all systems (for example: stat (GNU coreutils) 6.12). If the script outputs "Item: ?" at the front of each line, then replace this offending line with a line more like the original script, but with xITEM renamed to xFILE: xDEVICE=$(df "${xFILE}" | tail -1l | awk '{print $6}') – kbulgrien – 2014-03-28T20:23:44.833
2
Based on the findhardlinks script (I renamed it to hard-links), this is what I refactored and made work.
Output:
# ./hard-links /root
Item: /[10145] = /root/.profile
-> /proc/907/sched
-> /<some-where>/.profile
Item: /[10144] = /root/.tested
-> /proc/907/limits
-> /<some-where else>/.bashrc
-> /root/.testlnk
Item: /[10144] = /root/.testlnk
-> /proc/907/limits
-> /<another-place else>/.bashrc
-> /root/.tested
# cat ./hard-links
#!/bin/bash
oIFS="${IFS}"; IFS=$'\n';
xPATH="${1}";
xFILES="`ls -al ${xPATH}|egrep "^-"|awk '{print $9}'`";
for xFILE in ${xFILES[@]}; do
xITEM="${xPATH}/${xFILE}";
if [[ ! -r "${xITEM}" ]] ; then
echo "Path: '${xITEM}' is not accessible! ";
else
nLINKS=$(ls -ld "${xITEM}" | awk '{print $2}')
if [ ${nLINKS} -gt 1 ]; then
iNODE=$(ls -id "${xITEM}" | awk '{print $1}' | head -1l)
xDEVICE=$(df "${xITEM}" | tail -1l | awk '{print $6}')
echo -e "\nItem: ${xDEVICE}[$iNODE] = ${xITEM}";
find ${xDEVICE} -inum ${iNODE} 2>/dev/null|egrep -v "${xITEM}"|sed 's/^/ -> /';
fi
fi
done
IFS="${oIFS}"; echo "";
I posted comments on this script as a separate answer. – Daniel Andersson – 2012-06-13T07:40:57.033
1
You can configure ls to highlight hardlinks using an alias, but as stated before there is no way to show the 'source' of a hardlink, which is why I append .hardlink to help with that.
Add the following somewhere in your .bashrc
alias ll='LC_COLLATE=C LS_COLORS="$LS_COLORS:mh=1;37" ls -lA --si --group-directories-first'
1
A GUI solution gets really close to your question:
You cannot list the actual hardlinked files from ls because, as previous commenters have pointed out, the file "names" are mere aliases to the same data. However, there is a GUI tool that gets really close to what you want: it displays a path listing of file names that point to the same data (as hardlinks) under Linux. It is called FSLint. The option you want is under "Name clashes" -> deselect "checkbox $PATH" in Search (XX) -> and select "Aliases" from the drop-down box after "for..." towards the top-middle.
FSLint is very poorly documented, but I found that with a limited directory tree under "Search path", the "Recurse?" checkbox selected, and the aforementioned options, the program produces a listing of hardlinked data with the paths and names that "point" to the same data.
1Hard links aren't pointers, symlinks are. They're multiple names for the same file (inode). After a link(2) system call, there's no sense in which one is the original and one is the link. This is why, as the answers point out, the only way to find all the links is find / -samefile /a/A. Because one directory entry for an inode doesn't "know about" other directory entries for the same inode. All they do is refcount the inode so it can be deleted when the last name for it is unlink(2)ed. (This is the "link count" in ls output.) – Peter Cordes – 2015-04-21T18:47:45.683

@PeterCordes: Is the refcount actually stored IN the hardlink entry? That's what your wording implies ("All they do is refcount the inode..."). But that wouldn't make sense if the links don't know anything about each other, since when one was updated, all the others would somehow have to be updated. Or is the refcount stored in the inode itself? (Forgive me if it's a dumb question; I consider myself a newbie and I'm still learning.) – loneboat – 2015-07-06T20:51:42.820
1The refcount is stored in the inode, as you eventually figured out must be the case, from the other facts. :) Directory entries are named pointers to inodes. We call it "hard linking" when you have multiple names pointing to the same inode. – Peter Cordes – 2015-07-06T20:58:11.033