71

Is unlink any faster than rm?

Marcin
  • 11
    "Premature optimization is the root of all evil (or at least most of it) in programming." -- Donald Knuth http://en.wikiquote.org/wiki/Donald_Knuth – chris Jul 10 '09 at 13:25

3 Answers

77

Both are wrappers around the same fundamental function, the unlink() system call.

To weigh up the differences between the userland utilities:

rm(1):

  • More options.
  • More feedback.
  • Sanity checking.
  • A bit slower for single calls as a result of the above.
  • Can be called with multiple arguments at the same time.

unlink(1):

  • Less sanity checking.
  • Unable to delete directories.
  • Unable to recurse.
  • Can only take one argument at a time.
  • Marginally leaner for single calls due to its simplicity.
  • Slower when compared with giving rm(1) multiple arguments.

You could demonstrate the difference with:

$ touch $(seq 1 100)
$ unlink $(seq 1 100)
unlink: extra operand `2'

$ touch $(seq 1 100)
$ time rm $(seq 1 100)

real    0m0.048s
user    0m0.004s
sys     0m0.008s

$ touch $(seq 1 100)
$ time for i in $(seq 1 100); do rm $i; done

real    0m0.207s
user    0m0.044s
sys     0m0.112s

$ touch $(seq 1 100)
$ time for i in $(seq 1 100); do unlink $i; done

real    0m0.167s
user    0m0.048s
sys     0m0.120s
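
Because unlink(1) accepts only one operand, removing many files with it means starting one process per file, which is where the loop timings above come from. A minimal sketch of the two patterns using xargs, assuming a POSIX shell with mktemp and seq available and file names without whitespace or quotes; the helper names and the scratch directory are made up for illustration:

```shell
# Hypothetical helpers, not part of any tool: both read file names on stdin.

# One unlink process per file, since unlink(1) takes a single operand.
remove_one_by_one() {
    xargs -n 1 unlink
}

# As few rm processes as possible; xargs packs many names into each call.
remove_batched() {
    xargs rm
}

# Usage in a throwaway directory so nothing else is touched.
scratch=$(mktemp -d)
cd "$scratch" || exit 1

touch $(seq 1 100)
seq 1 100 | remove_one_by_one
left_after_unlink=$(ls | wc -l)

touch $(seq 1 100)
seq 1 100 | remove_batched
left_after_rm=$(ls | wc -l)

cd - >/dev/null && rmdir "$scratch"
echo "left after unlink: $left_after_unlink, left after rm: $left_after_rm"
```

Both counts should come out as zero; wrapping either pipeline in `time` shows the per-process overhead on your own machine.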

Things differ, however, if we're talking about an unadulterated call to the system unlink(2) function, which I now realise is probably not what you're asking about.

On some systems you can perform a system unlink() on directories and files alike. But if the directory is a parent to other directories and files, the link to that parent is removed while the children are left dangling, which is less than ideal.
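
Whether unlink will actually remove a directory depends on the platform, so it is worth checking on your own system. A minimal sketch, assuming a POSIX shell with mktemp and the unlink(1) utility available; the directory and file names are arbitrary. On Linux and FreeBSD the call is expected to be refused, as the comments on this answer report. Avoid running it as root on a filesystem that permits the operation, since that would orphan the children:

```shell
scratch=$(mktemp -d)
mkdir -p "$scratch/1/2/3"
touch "$scratch/1/one" "$scratch/1/2/two" "$scratch/1/2/3/three"

# Try the single-purpose tool on the non-empty parent directory.
if unlink "$scratch/1" 2>/dev/null; then
    unlink_result="removed"     # reported in the comments for root on Solaris UFS
else
    unlink_result="refused"     # e.g. "Is a directory" on Linux
fi
echo "unlink on a directory: $unlink_result"

# rm -r walks the tree and removes children before their parents.
rm -r "$scratch"
if [ ! -e "$scratch" ]; then
    echo "rm -r removed the whole tree"
fi
```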

Edit:

Sorry, clarified the difference between unlink(1) and unlink(2). Semantics are still going to differ between platforms.

Dan Carley
  • Does that mean that in unix filesystems removing a directory and recursively all files under it will always be an operation that is proportional to the number of files/dirs it contains? What happens when I unlink a directory that is parent to other dirs/files? It never gets wiped out and I lose this space forever? – Marcin Jul 10 '09 at 09:46
  • 6
    It is technically possible to leave orphaned directories/files on most if not all file systems. Fixing this generally means running a file system repair tool. On Unix/Linux these tools are known as 'fsck', with some specific variations for different file systems. If they do recover something they will normally leave it in a directory called 'lost+found' – ConcernedOfTunbridgeWells Jul 10 '09 at 10:01
  • 1
    Correct. rm will recurse from the bottom of the tree up. You can demonstrate how with: `mkdir -p 1/2/3; touch 1/one 1/2/two 1/2/3/three; rm -ri 1`. If you unlinked the parent directory then space consumed by the children should be lost until such time that fsck finds the discrepancy. – Dan Carley Jul 10 '09 at 10:03
  • 1
    What are you talking about?
    $ mkdir -p 1/2/3
    $ unlink 1
    unlink: cannot unlink `1': Is a directory
    Users causing "memory" leak requiring fsck? Unlikely! – Thomas Jul 10 '09 at 11:36
  • 1
    Both Linux and FreeBSD manpages explicitly state that it will fail when trying to run unlink() on a directory. – Thomas Jul 10 '09 at 11:44
  • I guess it depends on your implementation of `unlink(1)`. `unlink(2)` will certainly not care. – Dan Carley Jul 10 '09 at 11:45
  • Dan C: what are you talking about? This is complete gibberish! Have you even tested what you're saying?
    $ strace -e unlink unlink 1
    unlink("1") = -1 EISDIR (Is a directory)
    unlink: cannot unlink `1': Is a directory
    – Thomas Jul 10 '09 at 12:35
  • I've tested it on FreeBSD and Linux (ext2 and XFS) – Thomas Jul 10 '09 at 12:37
  • Also, I was referring to the manpages of unlink(2). Of course they care! – Thomas Jul 10 '09 at 12:40
  • 1
    Turns out Solaris with UFS does allow root to unlink non-empty directories. Not normal users though. Is there anything else that does? Not ZFS on Solaris, not anything on Linux or FreeBSD. – Thomas Jul 10 '09 at 13:17
  • Is that with userland or system unlink()? That was the difference I was attempting to make. – Dan Carley Jul 10 '09 at 13:22
11

At the POSIX spec level, what rm does is specified much more tightly than what unlink does.

If your script has to run across operating systems, rm is likely to give the more portable result.
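
One way to lean on the tighter rm specification in such a script is to stick to options that POSIX defines. A minimal sketch, assuming a POSIX sh and that mktemp is available for the demonstration; `remove_file` is a hypothetical helper name, not a standard command:

```shell
# Hypothetical helper: remove one file using only POSIX-specified rm options.
#   -f  : no prompt, and no error if the file does not exist
#   --  : end of options, so a name starting with "-" is treated as a file
remove_file() {
    rm -f -- "$1"
}

# Usage in a throwaway directory.
scratch=$(mktemp -d)
touch "$scratch/-n" "$scratch/plain"

remove_file "$scratch/plain"
remove_file "$scratch/missing"        # absent file: exit status is still 0
missing_status=$?

( cd "$scratch" && remove_file -n )   # a name that looks like an option
rmdir "$scratch"                      # succeeds only if the directory is empty
```

With unlink the same script would need its own handling for missing files, since the utility reports an error for them.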

Nick
Mike G.
6

The slow part of removing a file is the filesystem code and the disk I/O, not the userspace preparation of the unlink() system call.

I.e., if the speed difference between the two matters, then you shouldn't be storing the data on the file system.

unlink is just an rm "light": rm has more features, but they do the same thing.

Thomas