10

I am on a large instance on Amazon's EC2 servers. I run the df command and get:

root@db:~# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             9.9G  9.1G  284M  98% /
tmpfs                 3.8G     0  3.8G   0% /lib/init/rw
varrun                3.8G  116K  3.8G   1% /var/run
varlock               3.8G     0  3.8G   0% /var/lock
udev                  3.8G   80K  3.8G   1% /dev
tmpfs                 3.8G     0  3.8G   0% /dev/shm
/dev/sdb              414G  957M  392G   1% /mnt
/dev/sdf               50G   12G   35G  26% /byp
/dev/sdk               99G   31G   63G  33% /backups

I then run the du command and get:

root@db:/# du -s -h /*
31G     /backups
5.5M    /bin
136K    /boot
12G     /byp
80K     /dev
5.8M    /etc
12K     /home
70M     /lib
11M     /lib32
0       /lib64
16K     /lost+found
759M    /mnt
4.0K    /opt
du: cannot access `/proc/6917/task/6917/fd/4': No such file or directory
du: cannot access `/proc/6917/fd/4': No such file or directory
0       /proc
31M     /root
7.7M    /sbin
4.0K    /selinux
4.0K    /srv
0       /sys
11M     /tmp
1.1G    /usr
114M    /var

If you add up the du sizes of the directories that are not separate mounts (everything except /backups, /byp, and /mnt), the total is roughly 1.4G, nowhere near the 9.1G that df reports as used on /.

Does this mean I have a bad disk? If so, how can I fix it?

sheats

3 Answers

21

It's entirely possible that you have a very large deleted file (or lots of little ones) that a process still has an open file handle on. The way to find them is to run

# lsof | grep "deleted"

If you see lines that end with "(deleted)", find the ID of the process that has the file open and restart that process. Once it releases the file handle, the disk space is returned.

If this does not fix it, then I'd recommend running fsck.
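To put a number on how much space deleted-but-open files are holding, the lsof output can be summed. The following is a minimal sketch, not a definitive tool: `sum_open_deleted` is a hypothetical helper name, and it assumes the default column layout of `lsof +L1` (COMMAND, PID, USER, FD, TYPE, DEVICE, SIZE/OFF, NLINK, NODE, NAME) with no spaces in the command name. Each file is counted once per device and inode, even if several processes hold it open.

```shell
# Hypothetical helper: reads `lsof +L1` output on stdin and prints the total
# size in bytes of the unlinked files that are still held open.
# Assumes the default lsof column layout; field 7 is SIZE/OFF, field 6 is
# DEVICE and field 9 is NODE (the inode number).
sum_open_deleted() {
    awk 'NR > 1 && $7 ~ /^[0-9]+$/ {
        key = $6 ":" $9              # device + inode identifies one file
        if (!(key in seen)) {        # count each file only once
            seen[key] = 1
            total += $7
        }
    }
    END { printf "%.0f\n", total }'
}
```

Typical use would be `lsof +L1 | sum_open_deleted`, run as root so that files opened by other users' processes are visible. Rows whose SIZE/OFF column holds an offset rather than a plain size are skipped.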

David Pashley
  • Awesome! That was it. I had a PostgreSQL log that was deleted but still being written to. – sheats Sep 03 '09 at 18:39
  • If you want to free up some space without restarting a daemon, truncate the file in place (for example with "echo > file") rather than deleting it. The process keeps its open handle, and if it carries on writing at its old offset the file becomes sparse: it still reports the same size but takes up much less disk space. – David Pashley Sep 03 '09 at 19:57
  • `lsof +L1` can sometimes work better than that grep... – derobert Sep 05 '09 at 06:23
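The truncation approach from the comments can be tried safely on a scratch file. This is only an illustrative sketch: `demo_truncate_open_file` is a hypothetical name, file descriptor 3 stands in for a daemon's open log handle, and the writer is assumed to use append mode (as many loggers do), so new data lands at the start of the emptied file rather than at the old offset.

```shell
# Hypothetical demo: truncate a file that is still held open, then show that
# the writer's handle keeps working. Prints the file's size in bytes after
# one more line has been written through the open handle.
demo_truncate_open_file() {
    f=$(mktemp)
    exec 3>>"$f"               # hold the file open in append mode
    echo "old log data" >&3    # data written before the cleanup
    : > "$f"                   # truncate in place; the path and inode survive
    echo "new" >&3             # the open handle is still valid
    wc -c < "$f"               # size now reflects only the new line
    exec 3>&-                  # close the handle
    rm -f "$f"
}
```

Note that `: > file` leaves the file empty, while `echo > file` leaves a single newline in it; either way the blocks held by the old contents are released.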
5

There are a bunch of reasons du does not equal df. See the answers to this question.

Common ones are overlay mounts, many small files combined with a large block size, and deleted files that are still in use. An overlay mount is a filesystem mounted on a mount point whose directory already contained files: those files still take up space on the underlying filesystem, but du cannot see them.

The main difference between the two is that df just reads the superblock and trusts it, whereas du scans all the files that it is able to see and adds them up. See this IBM link for information on the superblock.
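The difference between the two views can be measured directly. The sketch below is an assumption-laden illustration rather than a standard tool: `df_du_gap` is a hypothetical helper that subtracts what du can see on one filesystem from what df reports as used, both in 1 KiB units. It assumes the filesystem's device name contains no spaces, so that "Used" is the third field of `df -P` output.

```shell
# Hypothetical helper: prints (df used) - (du visible total) in KiB for the
# filesystem containing the given directory.
df_du_gap() {
    mount_point=$1
    # -P gives the single-line POSIX format; field 3 of line 2 is "Used".
    df_used=$(df -P -k "$mount_point" | awk 'NR == 2 { print $3 }')
    # -s: one total, -x: do not cross into other filesystems, -k: KiB units.
    du_used=$(du -s -x -k "$mount_point" 2>/dev/null | awk '{ print $1 }')
    echo $((df_used - du_used))
}
```

Running `df_du_gap /` as root on the machine in the question would be expected to show a gap of several gigabytes. A large positive gap points at space du cannot see, such as deleted files that are still open or files hidden under a mount point; a small gap is normal because of filesystem metadata.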

Kyle Brandt
4

Always use the -x option with du when you are chasing problems like this. It keeps du from crossing filesystems.
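As a rough sketch of that advice, the helper below lists the largest directories one level under a starting point while staying on a single filesystem. `top_dirs_same_fs` is a hypothetical name, and the `-d 1` depth option is assumed to be available (it is in current GNU and BSD du; older GNU versions spell it `--max-depth=1`). Giving du one starting directory, rather than a glob such as /*, is what lets -x skip the other mounts.

```shell
# Hypothetical helper: largest directories (in KiB) directly under a starting
# directory, without descending into other mounted filesystems.
# $1 = starting directory, $2 = number of lines to show (default 10).
top_dirs_same_fs() {
    du -x -k -d 1 "$1" 2>/dev/null | sort -rn | head -n "${2:-10}"
}
```

For example, `top_dirs_same_fs / 15` run as root would show the biggest top-level directories on the root filesystem only; the first line is the total for the starting directory itself.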

  • This wouldn't have made any difference here: the OP explicitly asked for /*, so every mount point was passed to du as its own starting point. It would have helped if something were mounted further down, say at "/mnt/backups". – David Pashley Sep 03 '09 at 19:56