
I know it's a common occurrence for du -sh to report a smaller used size than the file system reports with df -h (because some deleted files are still held open by processes, etc.), but in my case I have the opposite.

I'm running Ubuntu 12.04 and trying to get the used size of an NFS mount.

df -h returns 270G used while du -sh of the mounted folder returns 320G used.

Update: I'm mounting the partition with the following options:

nas-server:/path/to/mount /mnt/mount/point  nfs proto=tcp,rsize=8192,wsize=8192,hard,intr,exec
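
For reference, these are roughly the commands behind those numbers, plus a check of the options the kernel actually negotiated (nfsstat -m ships with nfs-common; the mount point is the one from the fstab line above):

nfsstat -m                  # or: grep nfs /proc/mounts, to see the options actually in effect
df -h /mnt/mount/point      # reports 270G used
du -sh /mnt/mount/point     # reports 320G used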

Does anyone know where this could come from? The correct amount on the disk should be ~270G.

Thanks for your help. I'll provide any extra information necessary.

D.Mill
  • Does "correct amount" refer to the amount that nas-server reports locally (i.e. not over NFS, but local df or Web UI or similar)? – ptman Sep 16 '13 at 09:32
  • Yes. I've also copied all data onto an external drive to double check and both `df` and `du` returned the same value of about ~270G (locally) – D.Mill Sep 16 '13 at 09:56
  • I would check inode usage using `df -i` and run fsck on the filesystem. – ptman Sep 16 '13 at 10:21
  • 1
    Hmm... how does NFS handle hard links? Is it possible that `du` is couning hard links as separate files due to NFS? – Martin von Wittich Sep 16 '13 at 23:21
  • Also I'd try to run `du -ms *` both on the share from the client perspective and locally on the server. By comparing the output, you might be able to narrow it down. – Martin von Wittich Sep 16 '13 at 23:22
  • Another possibility might be sparse files. If NFS doesn't support them properly, `du` might believe that they use more space than they actually do. – Martin von Wittich Sep 16 '13 at 23:26
  • 2
    This difference can be triggered by block size on the NAS server. du shows number of byes, df - number of blocks * block size. Additionally, df asks NAS server for usage information, du counts it locally, – kofemann Sep 17 '13 at 07:20

2 Answers


du counts the blocks used by hardlinked files once, not once per hardlink. However, there are some ways this de-duplication can fail:

  1. The table du uses to de-duplicate hardlinked files is a fixed size. If you have more hardlinked files than the table can store, de-duplication may not be successful. (Some versions of du have a dynamically sized table and don't have this problem.)

  2. The de-duplication is based on inode numbers. If the NAS server presents different inode numbers for files that are hardlinked, then de-duplication is not possible. Some NAS servers do a great job of presenting inodes because they use a filesystem that has inodes; others have to "fake it" and don't do a good job (see the check sketched below).
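
One way to check point 2 from the client side is to look at the inode numbers and link counts the server hands out over NFS. A rough sketch, using the mount point from the question: if paths that should be hardlinks of the same file show different inode numbers, du cannot de-duplicate them and will over-count.

# List regular files with a link count > 1, printing inode number, link count and path:
find /mnt/mount/point -xdev -type f -links +1 -printf '%i %n %p\n' | sort -n > /tmp/links.txt
# Count how many paths share each inode number; seeing only counts of 1 despite
# link counts > 1 means the server is handing out "faked" per-path inodes:
awk '{print $1}' /tmp/links.txt | uniq -c | sort -rn | head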

By the way...

du counts just the file data.

df counts the blocks used for the file data plus all the metadata: directories, superblock, inode table, direct/indirect/double-indirect blocks, and so on.

Therefore, df should report a larger "used" size than du. Since the opposite is happening here, I would presume the de-duplication is broken or the NAS server has done something that makes df display invalid information.
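
If you can get a shell on the NAS itself, comparing the two in the same units (and checking inode usage, as suggested in the comments) makes the gap easier to reason about. A minimal sketch, assuming the exported path from the question and GNU coreutils on the server:

df -B1 /path/to/mount      # filesystem view: data blocks plus allocated metadata, in bytes
du -sxB1 /path/to/mount    # per-file view: file data only, hardlinks counted once
df -i /path/to/mount       # inode usage

If the server-side du agrees with the client-side du but not with df, the server's df figure is the suspect; if the two du runs disagree, the problem is in what the server presents over NFS.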

TomOnTime

Thanks to everyone for the great answers and suggestions. I'm posting to answer my own specific problem, just on the off chance it helps someone (which it very well might not, given the circumstances).

The NAS I was getting my filesystem information from (an HP X9000) has, if I'm not mistaken, a certain level of virtualisation as far as the partitions go. Therefore df should return an accurate "estimate" if all goes well.

However, due to a bug with the NAS that has since been fixed, the size of the virtual partition was not being updated, and therefore df would return an invalid (outdated) value: 270G instead of the actual correct value of 320G (I made a mistake in my comments).

All the above issues stemmed from this. Since then, this has also occurred on occasion when the NAS has been working in degraded mode (for whatever reason).

Thanks again guys.

D.Mill