
Why would df and du report disk usage so differently? This started happening intermittently last year on several Ubuntu 18.04 VMs.

df reports 100% used while du -smh shows only 2.3G of the 4.0G (total) in use.

This is often followed by rsyslog or syslog-ng filling up /var with error messages when the syslog server is down for maintenance.

# df -mh /var
Filesystem              Size  Used Avail Use% Mounted on
/dev/mapper/new_vg-var  4.0G  3.9G     0 100% /var

# du -smh /var
2.3G    /var

# fuser /var
Nstevens

2 Answers


df asks the file system for the total number of blocks in use; see man 3 statvfs. This is fast and gives an accurate accounting of the volume, but it says nothing about which files use the space.

du walks the directory tree and adds up the sizes of the files it finds. This is slower, but it can print per-file sizes.
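To put a number on the difference between the two views, a small helper along these lines may be useful. It is only a sketch: df_du_gap is a made-up name, and it assumes a df that supports the POSIX -P and -k options and a du that supports -x (GNU and BSD versions do).

```shell
#!/bin/sh
# df_du_gap is a hypothetical helper, not a standard tool.
# Pass a mount point such as /var. If you pass a plain directory instead,
# df still reports the whole containing filesystem, so the gap is meaningless.
df_du_gap() {
    mnt="$1"
    # -P forces one output line per filesystem; column 3 is "Used" in 1K blocks.
    df_used=$(df -Pk "$mnt" | awk 'NR == 2 { print $3 }')
    # -s prints one total, -x stays on this filesystem, -k reports 1K units.
    du_used=$(du -sxk "$mnt" 2>/dev/null | awk '{ print $1 }')
    echo "df=${df_used}K du=${du_used}K gap=$((df_used - du_used))K"
}
```

Run it as root, for example df_du_gap /var, so that du can read every directory. A small positive gap is normal (filesystem metadata, journal); a gap of gigabytes points at one of the causes discussed in the answers.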

A discrepancy may mean that deleted files are still held open by some program. Most commonly such large files are logs or databases, but they could be anything.
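On Linux, deleted-but-open files can be spotted without extra tools, because the kernel marks the symlinks under /proc/PID/fd with a " (deleted)" suffix. The following is a minimal sketch under those assumptions; list_deleted_open is a hypothetical name, stat -Lc is the GNU coreutils syntax, and you need root to see other users' processes.

```shell
#!/bin/sh
# list_deleted_open is a hypothetical helper: it prints "PID SIZE PATH (deleted)"
# for every open file descriptor whose target was unlinked.
# Optional argument: only report paths starting with this prefix (default /).
list_deleted_open() {
    prefix="${1:-/}"
    for fd in /proc/[0-9]*/fd/*; do
        # Descriptors can vanish while we iterate; skip the ones we cannot read.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case "$target" in
            "$prefix"*" (deleted)")
                pid=${fd#/proc/}
                pid=${pid%%/*}
                # -L follows the /proc link to the still-open file; %s is its size in bytes.
                size=$(stat -Lc %s "$fd" 2>/dev/null || echo '?')
                echo "$pid $size $target"
                ;;
        esac
    done
}
```

For the case in the question, list_deleted_open /var would show which process is holding the space. Restarting that process, or making it reopen its files, releases the blocks.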


"This is often followed by rsyslog or syslog-ng filling up /var with error messages when the syslog server is down for maintenance."

The root cause would be your remote logging configuration.

In the short term, rotate the log files properly, for example by running logrotate ad hoc. Note that the typical configuration in /etc/logrotate.d/*syslog sends rsyslog a HUP signal so that it closes the old log files and opens new ones.
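An ad hoc rotation could look roughly like the sketch below. rotate_now is a hypothetical wrapper, and the default path /etc/logrotate.d/rsyslog is an assumption about a stock Ubuntu install; adjust it to whatever file matches *syslog on your system. Pointing logrotate at a single file means global defaults from /etc/logrotate.conf are not applied.

```shell
#!/bin/sh
# rotate_now is a hypothetical wrapper around logrotate.
# Argument: the logrotate config to run (default path is an assumption).
rotate_now() {
    conf="${1:-/etc/logrotate.d/rsyslog}"
    [ -r "$conf" ] || { echo "cannot read $conf" >&2; return 1; }
    # --debug is a dry run: it prints what would be rotated and changes nothing.
    logrotate --debug "$conf" || return 1
    # --force rotates even if logrotate thinks it is not yet due.
    logrotate --force "$conf"
}
```

Run it as root. The postrotate script in the config is what signals rsyslog, so the space held by the old log file is only freed once that step has run.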

Consider increasing the size of /var to handle the actual size of the log files.

Revise the logging configuration to do something appropriate when the remote server is down and when disk space is low. rsyslog can be configured with queues that use a finite amount of disk space and discard messages when the queue is full. The rsyslog config example below comes from the SLES knowledge base and will need to be customized for your logging setup:

# cat /etc/rsyslog.d/ora_audit.conf
if ( $syslogfacility-text == 'local1' ) and ( $syslogseverity == 4 /* warning */ )  then {
        $WorkDirectory /var/spool/rsyslog       # where to place spool files
        $ActionQueueFileName RemoteQueue        # unique name prefix for spool files
        $ActionQueueMaxDiskSpace 1G             # 1gb space limit (use as much as possible)
        $ActionQueueSaveOnShutdown on           # save messages to disk on shutdown
        $ActionQueueType LinkedList             # run asynchronously
        $ActionResumeRetryCount -1              # infinite retries if host is down
        $ActionQueueTimeoutEnqueue 0            # discard the message immediately if the queue is full
        *.* @@aaa.bbb.ccc.ddd:514               # IP of remote syslog server:port 514
        stop
}

# With the setup above, messages are discarded immediately once the queue size limit (1 GB here) is reached.
John Mahowald

Most common reasons for df to be larger than du:

  • You're not running du on the entire filesystem.
    • You don't have access privileges on all the directories.
    • You did something like du -s /filesystem/*, and you're missing the dotfiles at the top level. Rerun as du -sx /filesystem.
    • (Rare) You have shadowed part of your filesystem with another mount. Because the other filesystem is mounted on top, du cannot reach the files underneath to determine their size. You can size them by unmounting the shadowing filesystem, or often by bind-mounting the original filesystem somewhere else.
  • You have files that are deleted but still open. du can no longer reach them through the directory tree, but they still take up space. Try lsof +aL1 /filesystem to find the orphaned files and the processes holding them open. When the process closes the file, the space will be released.
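The dotfile pitfall from the list above is easy to reproduce. The sketch below compares the two invocations on one directory; du_glob_vs_whole is a hypothetical helper name, and -c (print a grand total) is assumed to be available as it is in GNU and BSD du.

```shell
#!/bin/sh
# du_glob_vs_whole is a hypothetical helper that prints both totals in KiB.
du_glob_vs_whole() {
    dir="$1"
    # The shell expands DIR/* before du runs, and * does not match names
    # starting with ".", so top-level dotfiles are never passed to du.
    # -c appends a "total" line; awk keeps the number from the last line.
    glob_kb=$(du -skc "$dir"/* 2>/dev/null | awk '{ t = $1 } END { print t }')
    # Giving du the directory itself counts everything below it on this filesystem.
    whole_kb=$(du -sxk "$dir" | awk '{ print $1 }')
    echo "glob=${glob_kb}K whole=${whole_kb}K"
}
```

If whole is noticeably larger than glob, the missing space is in hidden files or directories at the top level, not in deleted files.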

Most common reason for df to be smaller than du:

  • There's an additional filesystem mounted inside, and your du is descending into it and counting that space as well. Rerun as du -sx /filesystem.
BowlOfRed