Discrepancy between df and du

I have a question pertaining to the difference between 'df -h' and 'du -bs'. I have seen several questions here about it, but in those the issue was always that 'df -h' reported more used space than 'du -bs'. I have the opposite problem:

[root@CDPPRIM01 oracle-dbf]# du -bs
336178610176    .

[root@CDPPRIM01 datafiles]# df -h | grep dbf
/dev/sda2             360G  272G   85G  77% /opt/oracle-dbf

[root@CDPPRIM01 datafiles]# ls -la
total 284550000
drwxrwxr-x 2 oracle oinstall        4096 Apr  9 14:13 .
drwxr-xr-x 4 root   root            4096 Apr  3 10:33 ..
-rw-r----- 1 oracle dba          9748480 Apr 11 17:01 control01.ctl
-rw-r----- 1 oracle dba          9748480 Apr 11 17:01 control02.ctl
-rw-r----- 1 oracle dba        968892416 Apr 11 16:20 pn310_admin_DATA22.dbf
-rw-r----- 1 oracle dba        104865792 Apr 11 16:20 pn310_admin_DATA.dbf
-rw-r----- 1 oracle dba      32212262912 Apr 11 16:20 pn310_DATA11.dbf
-rw-r----- 1 oracle dba      32212262912 Apr 11 17:00 pn310_DATA12.dbf
-rw-r----- 1 oracle dba      32212262912 Apr 11 17:00 pn310_DATA13.dbf
-rw-r----- 1 oracle dba      32212262912 Apr 11 16:20 pn310_DATA14.dbf
-rw-r----- 1 oracle dba       5242888192 Apr 11 17:01 pn310_DATA15.dbf
-rw-r----- 1 oracle dba       1073750016 Apr 11 17:00 pn310_DATA.dbf
-rw-r----- 1 oracle dba       5798633472 Apr 11 16:20 pn310_dwe_DATA20.dbf
-rw-r----- 1 oracle dba       1073750016 Apr 11 16:20 pn310_dwe_DATA.dbf
-rw-r----- 1 oracle dba        104865792 Apr  3 10:42 pn310_dwe_TEMP.dbf
-rw-r----- 1 oracle dba       5263859712 Apr  3 11:28 pn310_dwe_temp_TEMP9.dbf
-rw-r----- 1 oracle dba      32212262912 Apr 11 16:20 pn310_ep_DATA16.dbf
-rw-r----- 1 oracle dba      32212262912 Apr 11 16:20 pn310_ep_DATA17.dbf
-rw-r----- 1 oracle dba      32212262912 Apr 11 16:20 pn310_ep_DATA18.dbf
-rw-r----- 1 oracle dba       9437192192 Apr 11 16:20 pn310_ep_DATA19.dbf
-rw-r----- 1 oracle dba       1073750016 Apr 11 16:50 pn310_ep_DATA.dbf
-rw-r----- 1 oracle dba        104865792 Apr  3 10:42 pn310_ep_TEMP.dbf
-rw-r----- 1 oracle dba      16001277952 Apr  3 11:27 pn310_ep_temp_TEMP8.dbf
-rw-r----- 1 oracle dba        104865792 Apr 11 06:00 pn310_TEMP.dbf
-rw-r----- 1 oracle dba      16001277952 Apr 11 17:01 pn310_temp_TEMP7.dbf
-rw-r----- 1 oracle dba      11811168256 Apr 11 16:51 pn310_xmp_DATA21.dbf
-rw-r----- 1 oracle dba       1073750016 Apr 11 17:00 pn310_xmp_DATA.dbf
-rw-r----- 1 oracle dba        104865792 Apr  3 10:42 pn310_xmp_TEMP.dbf
-rw-r----- 1 oracle dba       2042634240 Apr  3 11:29 pn310_xmp_temp_TEMP10.dbf
-rw-r----- 1 oracle dba        566239232 Apr 11 17:00 sysaux01.dbf
-rw-r----- 1 oracle dba       4802486272 Apr 11 16:57 sysaux_DATA24.dbf
-rw-r----- 1 oracle dba        754982912 Apr 11 16:58 system01.dbf
-rw-r----- 1 oracle dba       4613742592 Apr 11 17:00 system_DATA23.dbf
-rw-r----- 1 oracle dba       1073750016 Apr 10 23:07 temp01.dbf
-rw-r----- 1 oracle dba       1073750016 Apr  3 10:38 temp02.dbf
-rw-r----- 1 oracle dba       3221233664 Apr  3 11:31 temp_TEMP11.dbf
-rw-r----- 1 oracle dba       1073750016 Apr 11 17:01 undotbs01.dbf
-rw-r----- 1 oracle dba       1073750016 Apr 11 17:00 undotbs02.dbf
-rw-r----- 1 oracle dba      13958651904 Apr 11 17:01 undotbs1_DATA26.dbf
-rw-r----- 1 oracle dba          5251072 Apr 11 16:20 users01.dbf
-rw-r----- 1 oracle dba       1068507136 Apr 11 16:20 users_DATA25.dbf

Adding up the sizes of all the files gives 336178593792 bytes, which is 336178593792/1024/1024/1024 = 313 GB (in powers of 1024). That is more than the 272G of used space reported by 'df -h'.

I have already done a umount and fsck to check the partition and it is clean. Does anyone know what might be the reason behind this behaviour?

Jose Miguel Dores

Posted 2013-04-11T16:12:33.090

Reputation: 21

What does df | grep dbf (without the -h) return? This is probably an issue with one of the two tools reporting in powers of 1024 and the other in powers of 1000. – terdon – 2013-04-11T16:17:50.630

[root@CDPPRIM01 pn310-point-patches]# df | grep dbf /dev/sda2 376931920 284749228 88291512 77% /opt/oracle-dbf – Jose Miguel Dores – 2013-04-11T17:02:43.843

Answers

One possibility would be sparse files. If big parts of a given file contain no data, no physical blocks have to be allocated to these parts of the file, assuming that the OS and file system support it.

To check for sparse files, invoke ls with the switches -s (show the space each file actually occupies on disk) and -k (show sizes in 1 KiB blocks).

Example output:

$ df -H /dev/sda1
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1       118G   38G   75G  34% /
$
$ dd if=/dev/zero of=1GB-normal bs=1GB count=1 # normal file (1 GB)
$
$ df -H /dev/sda1
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1       119G   38G   75G  34% /
$
$ dd if=/dev/zero of=1GB-sparse bs=1GB count=0 seek=1 # sparse file (1 GB): seek is counted in blocks of bs
$
$ df -H /dev/sda1
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1       119G   38G   75G  34% /
$
$ ls -lks
total 976568
976568 -rw-rw-r-- 1 dennis dennis 976563 Apr 11 13:05 1GB-normal
     0 -rw-rw-r-- 1 dennis dennis 976563 Apr 11 13:06 1GB-sparse

Since the sparse file occupies (almost) no physical space on the disk, the output of df doesn't change after creating it.
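The same comparison can be scripted for a whole directory instead of reading it off ls by eye. This is only a rough sketch, assuming GNU stat (%s is the apparent size in bytes, %b the number of allocated blocks, %B the size of each such block); the function name report_sparse is made up for illustration.

```shell
#!/bin/sh
# report_sparse (hypothetical helper): for each file given, print
#   apparent bytes <TAB> allocated bytes <TAB> sparse|dense <TAB> name
# Assumes GNU stat. A file is flagged "sparse" when fewer bytes are
# allocated on disk than its apparent size. Note that filesystems with
# transparent compression can trigger the same flag.
report_sparse() {
    for f in "$@"; do
        apparent=$(stat -c %s -- "$f")   # size as shown by ls -l
        blocks=$(stat -c %b -- "$f")     # blocks actually allocated
        blksize=$(stat -c %B -- "$f")    # bytes per block counted by %b
        allocated=$((blocks * blksize))
        if [ "$allocated" -lt "$apparent" ]; then
            flag=sparse
        else
            flag=dense
        fi
        printf '%s\t%s\t%s\t%s\n' "$apparent" "$allocated" "$flag" "$f"
    done
}

# Example, run from the directory holding the data files:
#   report_sparse ./*.dbf
```

If some of the .dbf files come back flagged as sparse, the sum of their allocated bytes should land much closer to what df reports than the sum of the sizes from ls -la does.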

Dennis

Posted 2013-04-11T16:12:33.090

Reputation: 42 934

I am not really sure what is going on here but I can give you some pointers.

  1. du -b implies the --apparent-size option. From the du man page:

    --apparent-size
          print  apparent  sizes,  rather than disk usage; although the apparent
          size is usually smaller, it may be larger due to holes in (`sparse')
          files,  internal  fragmentation, indirect blocks, and the like
    -b, --bytes
          equivalent to `--apparent-size --block-size=1'
    
  2. I'm not sure how Oracle deals with its database files, but it would not surprise me if some were indeed sparse. I think this is the likeliest reason; see here for a discussion of how this can affect du.

  3. This page has a nice explanation of some of the differences between the ways the two programs calculate disk usage.

  4. Some more relevant info from the df man page:

    -h, --human-readable
          print sizes in human readable format (e.g., 1K 234M 2G)
    Display values are in units of the first available  SIZE  from
    --block-size,  and the DF_BLOCK_SIZE, BLOCK_SIZE and BLOCKSIZE
    environment variables.  Otherwise, units default to 1024 bytes
    (or 512 if POSIXLY_CORRECT is set).
    
    SIZE  may be (or may be an integer optionally followed by) one
    of following: KB 1000, K 1024, MB 1000*1000, M 1024*1024,  and
    so on for G, T, P, E, Z, Y.
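
As a rough check of whether units alone could explain the gap, the figures posted in the question and its comments can be converted to bytes and compared in both conventions. This is just arithmetic on the numbers the OP gave (du -bs, and the "Used" column of plain df in 1K blocks); the variable names are my own, and it assumes a shell with 64-bit arithmetic.

```shell
#!/bin/sh
# Figures copied from the question and its comments.
du_apparent_bytes=336178610176   # output of du -bs
df_used_kib=284749228            # "Used" column of plain df (1K blocks)

# Integer arithmetic; assumes the shell uses 64-bit integers.
df_used_bytes=$((df_used_kib * 1024))
gap_bytes=$((du_apparent_bytes - df_used_bytes))

echo "df used, bytes:     $df_used_bytes"
echo "du apparent, bytes: $du_apparent_bytes"
echo "gap, bytes:         $gap_bytes"

# The same two values in binary (GiB) and decimal (GB) units.
awk -v du="$du_apparent_bytes" -v df="$df_used_bytes" 'BEGIN {
    printf "GiB: du=%.2f df=%.2f\n", du / 2^30, df / 2^30
    printf "GB:  du=%.2f df=%.2f\n", du / 1e9,  df / 1e9
}'
```

Whichever unit is used, du's apparent size stays roughly 44.6 billion bytes above what df counts as used, so a 1000 versus 1024 mix-up does not account for the difference on its own. That would be consistent with sparse files, though it does not prove it.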
    

terdon

Posted 2013-04-11T16:12:33.090

Reputation: 45 216