du which counts number of files/directories rather than size

13

2

I am trying to clean up a hard drive which has all kinds of crap on it accumulated over the years. du has helped reduce disk usage, but the whole thing is still unwieldy, not because of the total size but because of the sheer number of files and directories.

Is there a way to do something like du that counts the number of files and directories instead of their size? For example: a file counts as 1, and a directory counts as the recursive number of files/directories inside it + 1.

Edit: I should have been clearer. I'd like to know not only the total number of files/directories in /, but also in /home, /usr etc., and in their subdirectories, recursively, like du does for size.

Jesse

Posted 2013-04-20T07:47:21.893

Reputation: 565

2

Think you might be looking for something like a slightly modified version of the answers here http://superuser.com/questions/198817/recursively-count-all-the-files-in-a-directory

– James – 2013-04-20T08:30:48.627

Answers

4

The following PHP script does the trick.

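#!/usr/bin/php
<?php

// Count $dir itself plus everything beneath it, printing one
// "count<TAB>path" line per directory (like du, but counting entries).
// $dev is the device ID of the starting point; directories on other
// devices are counted as 1 but not descended into.
function do_scan($dir, $dev) {
  $total = 1; // the entry itself

  // filetype() reports symlinks as 'link', so symlinked directories are not followed.
  if (\filetype($dir) === 'dir' && \lstat($dir)['dev'] == $dev) {
    // scandir() returns false for unreadable directories; treat those as empty.
    foreach (\scandir($dir) ?: [] as $file) {
      if ($file !== '.' && $file !== '..') {
        $total += do_scan($dir . \DIRECTORY_SEPARATOR . $file, $dev);
      }
    }

    print "$total\t$dir\n";
  }

  return $total;
}

// Scan every path given on the command line.
foreach (\array_slice($argv, 1) as $arg) {
  do_scan($arg, \lstat($arg)['dev']);
}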

Put that in a file (say, "treesize"), chmod +x it and run it with ./treesize . | sort -rn | less.

Jesse

Posted 2013-04-20T07:47:21.893

Reputation: 565

Why is this the accepted answer?! You are assuming PHP is on the machine, which is not always the case. The script is not documented and too specific. While it is OK to answer your own question on SE, this answer does not even answer your own question; or you did not ask the question you had in mind when the problem occurred... Unfortunately I cannot downvote it, I have too few points... still, bad answer! – user1810087 – 2018-08-09T09:35:34.110

I can't write the script in any language without assuming an interpreter for that language is installed. The script prints the total number of files and directories beneath each directory recursively. So a du that simply counts instead of summing size, which is exactly what the original question asked. – Jesse – 2018-08-16T06:42:10.880

11

I have found du --inodes useful, but I'm not sure which version of du it requires. On Ubuntu 17.10, the following works:

du --inodes      # all files and subdirectories
du --inodes -s   # summary
du --inodes -d 2 # depth 2 at most

Combine with | sort -nr to sort descending by the number of inodes each directory contains.
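As a rough sketch of how those pieces fit together, the function below wraps du --inodes and the sort step. The name top_inode_dirs and its two parameters are made up for illustration, and it assumes a GNU du that accepts --inodes (check du --help on your system). Note that du counts inodes, so a file with several hard links is counted once.

```shell
# Hypothetical helper: list the directories that hold the most inodes.
# Assumes GNU du with --inodes support.
top_inode_dirs() {
    # $1 = directory to scan (default: .)
    # $2 = number of lines to show (default: 20)
    # -x keeps du on the filesystem it started on.
    du --inodes -x -- "${1:-.}" 2>/dev/null | sort -nr | head -n "${2:-20}"
}
```

For instance, top_inode_dirs /home 10 should show the ten directories under /home with the highest counts.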

krlmlr

Posted 2013-04-20T07:47:21.893

Reputation: 572

1 This looks a lot more like what I want than the accepted answer. – Sridhar Sarnobat – 2019-03-16T01:54:50.653

8

The easiest way seems to be find /path/to/search -ls | wc -l

find walks through all files and directories below the given path.
-ls lists each entry in a long, ls -l-like format, one line per entry. If you leave it out, find falls back to its default action, -print, which prints only the names; the line count is the same either way. It is a good habit to state the action explicitly, though.

If you just use the find /path/to/search -ls part it will print all the files and directories to your screen.


wc is short for word count. The -l option tells it to count the number of lines.

You can use it in several ways, e.g.

  • wc testfile
  • cat testfile | wc

The first form lets wc open the file itself and count its lines, words and bytes. The second form does the same, but with no filename given wc reads from stdin.


You can combine commands with a pipe, |. The output of the first command is piped to the input of the second. Thus find /path/to/search -ls | wc -l uses find to list all files and directories and feeds that output to wc, which then counts the lines.

(Another alternative would have been ls | wc, but find is much more flexible and a good tool to learn.)
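One caveat with counting lines: a file name that contains a newline is counted twice. The sketch below counts NUL terminators instead. The function name count_entries is hypothetical, and it assumes a find that supports -print0 (GNU and BSD find do).

```shell
# Hypothetical helper: count every file and directory under $1,
# including $1 itself.
count_entries() {
    # -print0 ends each name with a NUL byte instead of a newline,
    # so names containing newlines are still counted exactly once.
    # tr -cd '\0' deletes everything except the NUL bytes.
    ce_n=$(find "${1:-.}" -print0 | tr -cd '\0' | wc -c)
    # Arithmetic expansion strips any padding wc may add.
    echo "$((ce_n))"
}
```

Usage would be count_entries /path/to/search, which prints a single number.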


[Edit after comment]

It might be useful to combine the find and the exec.

E.g. find / -maxdepth 1 -type d ! \( -path /proc -o -path /dev -o -path /.snap \) -exec echo starting a find to count the files in {} \; will list / and all directories directly in it, bar some which you do not want to search. We can trigger the previous command on each of them, yielding a count of files per folder in /.

However:

  1. This uses the GNU-specific extension -maxdepth.
    It will work on Linux, but not on just any Unix-alike.
  2. I suspect you might actually want a number of files for each and every subdir.
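For the one-level case, a possible way to run the find | wc step on each directory is to let -exec start a small shell. This is only a sketch: the name count_per_subdir is invented, -mindepth and -maxdepth are extensions rather than standard find, and no directories are excluded here (the ! \( -path ... \) test would go before -exec).

```shell
# Hypothetical helper: print one "count<TAB>directory" line for each
# directory directly inside $1.
count_per_subdir() {
    # -mindepth 1 skips $1 itself; -maxdepth 1 stops at its children.
    # The inner find counts each child directory plus everything below it.
    find "${1:-.}" -mindepth 1 -maxdepth 1 -type d -exec sh -c '
        for dir do
            n=$(find "$dir" | wc -l)
            printf "%d\t%s\n" "$((n))" "$dir"
        done' sh {} +
}
```

Running count_per_subdir / | sort -n would then give a per-directory overview of /, smallest first.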

Hennes

Posted 2013-04-20T07:47:21.893

Reputation: 60 739

Sorry, not just one level deep though, but for all levels (that's what I meant by "recursively" in my edit). – Jesse – 2013-04-22T12:23:27.067

Instead of the exec echo you trigger a find | wc for each dir. I know it is possible, but I can't seem to discover how today. I guess I keep making the same mistake somehow. * Goes to brew coffee * . – Hennes – 2013-04-22T12:46:59.010

2

ncdu is great for this!

From the man page, you can show counts per directory and order by counts as well:

[...]
KEYS
       C   Order by number of items (press again for descending order)
[...]
       c   Toggle display of child item counts.

For example:

[Screenshot: ncdu output]

jobevers

Posted 2013-04-20T07:47:21.893

Reputation: 121

1

Here's a solution that uses bash, inspired by a post from Unix & Linux.

find . -type d | while IFS= read -r dir; do
    printf "%s:\t" "$dir"; find "$dir" -type f | wc -l; done

If there are some folders that you don't want to see the details of, like .git, you can exclude them from the list with grep.

find . -type d | grep -v '/\.git/' | while IFS= read -r dir; do
    printf "%s:\t" "$dir"; find "$dir" -type f | wc -l; done
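An alternative to filtering with grep is to have find stop descending at .git in the first place, using -prune. The sketch below does that; the function name count_files_per_dir is hypothetical. Like the commands above, it counts only regular files, and .git itself still gets a line with its total.

```shell
# Hypothetical helper: for each directory under $1, print the number
# of regular files beneath it, without listing the insides of .git.
count_files_per_dir() {
    # Every directory is printed first; if its name is .git, -prune
    # then stops find from descending into it.
    find "${1:-.}" -type d -print -name .git -prune |
    while IFS= read -r dir; do
        cf_n=$(find "$dir" -type f | wc -l)
        printf '%s:\t%d\n' "$dir" "$((cf_n))"
    done
}
```

As with the original loop, directory names containing newlines will confuse read.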

Don Kirkby

Posted 2013-04-20T07:47:21.893

Reputation: 883

1

Exploit the fact that dirs and files are separated by /. This script does not meet your criteria, but may serve to inspire a full solution. You should also consider indexing your files with locate.

geee: /R/tb/tmp
$ find  2>/dev/null | awk -F/ -f filez  | sort -n
files:  57
3       imagemagick
7       portage
10      colemak-1.0
25      minpro.com
42      monolith
80      QuadTree
117     themh
139     skyrim.stings
185     security-howto
292     ~t
329     skyrim
545     HISTORY
705     minpro.com-original
1499    transmission-2.77
23539   ugent-settings


$ cat filez
{
a[$2]++;     # $1 = text before the first "/" ("." for find, empty for locate), $2 = top-level entry below it
}

END {
        for (i in a) {
                if (a[i]==1) {files++;}
                else { printf "%d\t%s\n", a[i], i; }
        }
        print "files:\t" files
}


 $ time locate /  | awk -F/ -f /R/tb/tmp/filez  | sort -n
 files:  13
 2
 2       .fluxbox
 10      M
 11      BIN
 120     bin
 216     sbin
 234     boot
 374     R
 854     dev
 1351    lib
 2018    etc
 9274    media
 30321   opt
 56516   home
 93625   var
 222821  usr
 351367  mnt
 time: Real 0m17.4s  User 0m4.1s  System 0m3.1s
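Taking the same split-on-/ idea further, here is one possible sketch of a fuller solution: a single find pass where awk credits each entry to itself and to every ancestor directory, so each directory ends up with the count the question asks for (everything inside it, recursively, + 1). The name tree_counts is made up, and the sketch assumes GNU find (for -printf) and paths without tabs or newlines.

```shell
# Hypothetical helper: du-style entry counts for every directory
# under $1, computed in one pass.
# Assumes GNU find (-printf) and paths without tabs or newlines.
tree_counts() {
    tc_root=${1:-.}
    # Drop one trailing slash so printed paths split cleanly on "/".
    if [ "$tc_root" != / ]; then
        tc_root=${tc_root%/}
    fi
    # %y is the type letter (d = directory), %p the path.
    find "$tc_root" -printf '%y\t%p\n' | awk -F'\t' '
    {
        kind[$2] = $1
        path = $2
        # Credit this entry to itself and to each ancestor directory.
        while (1) {
            count[path]++
            if (path == "/" || !match(path, /\/[^\/]*$/)) break
            path = (RSTART == 1) ? "/" : substr(path, 1, RSTART - 1)
        }
    }
    END {
        # Print only directories that find actually visited.
        for (p in kind)
            if (kind[p] == "d") printf "%d\t%s\n", count[p], p
    }'
}
```

The output order is unspecified, so pipe it through sort, e.g. tree_counts /home | sort -rn | less. Unlike du --inodes, this counts directory entries, so hard-linked files are counted once per name.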

Ярослав Рахматуллин

Posted 2013-04-20T07:47:21.893

Reputation: 9 076

2 Why do I have .fluxbox in / ? :D – Ярослав Рахматуллин – 2013-04-22T11:05:26.760