19

The drive is constantly filling up. You've hunted down all the spare and random files you can, grep'd for core dump files, and even removed some of the unneeded backups...

What would be your next move?

The actual server in question has 10 GB of website files, and the OS shouldn't take any more than 10 GB, so how do you track down what's filling a 50 GB (virtual) drive?

HopelessN00b
Gareth

10 Answers

25

Surely there are more elaborate ways, but the one I remember is

du --max-depth=1 -h /

Now take the directory that uses up the most space (du --max-depth=1 -h /yourdir) and go deeper until you find your culprit.
If you want your output sorted by size and don't care for the human-readable format, you could also do du --max-depth=1 /your_dir | sort -n
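
For example, a drill-down session might look like this (the /var and /var/log paths are purely hypothetical; substitute whatever turns out to be biggest on your box):

du --max-depth=1 -h /                   # suppose /var is the largest entry
du --max-depth=1 -h /var                # ...and /var/log inside it
du --max-depth=1 /var/log | sort -n     # numeric sort puts the biggest directory last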

Marie Fischer
  • 1,943
  • 1
  • 13
  • 13
  • Yeah. I do pretty much the same thing "du -S | sort -n -r |less". I really would love to see a programme that looked like htop and cron'd like mlocate but when run gave you accurate and contemporary info about the files on your system. – Gareth Jul 21 '09 at 01:17
  • 1
    Instead of starting from / on a webserver, try starting from http_root. If there is no success there, then one can go for '/'. Running du on '/' will take a lot of time. – Saurabh Barjatiya Jul 21 '09 at 05:43
12

I find ncdu (http://dev.yorhel.nl/ncdu) to be quite helpful for this.
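
A minimal invocation would be something like the following; the -x flag keeps ncdu from crossing into other mounted filesystems:

ncdu -x /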

Scott
  • Perfect. Brilliant interface and the ability to manipulate files from within the programme. Cheers! – Gareth Aug 03 '09 at 23:24
5

I use the Gnome program baobab. You can run this on your desktop and it can connect via SSH to the server. It shows an easy-to-read graphical map of disk space usage. It's installed under Gnome as "Disk Usage Analyzer".
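
I believe it can also be pointed at a remote location directly, along these lines (the hostname is just a placeholder, and whether a remote URI works from the command line may depend on your GNOME/GVFS version):

baobab sftp://root@your-server/var/www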

Josh
3

Give gt5 a try.
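
A minimal sketch of how I'd use it, assuming a plain invocation with a directory argument (as I recall, gt5 also keeps the previous run around so it can highlight what changed):

gt5 /var/www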

Dennis Williamson
2

df -k shows which filesystem is the problem. Then cd to the top-level directory for it and run du -xk | sort -n | tail -25; this will show the top 25 directories, sorted. For Sun (Solaris) 9 or earlier, replace the x with a d.
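
A sketch of that sequence, assuming the full filesystem turns out to be mounted on /var (the path is just a placeholder):

df -k                        # spot the filesystem that is nearly full
cd /var                      # cd to its top-level directory
du -xk | sort -n | tail -25  # 25 largest directories, biggest last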

Ronald Pottol
  • Yeah, similar to what I just mentioned in @Marie Fischer's answer. Why use the -k (block size) though rather than -h for human? – Gareth Jul 21 '09 at 01:24
  • -k is used so that all sizes are reported in kb. This is useful for sort; otherwise sort would put 10kb before 20mb while sorting. – Saurabh Barjatiya Jul 21 '09 at 05:41
1

Note that files can be deleted while still being written to, so they keep using disk space while the process that created them is running, but they no longer have a filename.

This makes them unfindable with the usual tools; you can use lsof to investigate which processes have open files.
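
For example, something like this should list files that are still open but already unlinked (+L1 restricts lsof to files with a link count below one); restarting the process that holds them open releases the space:

lsof +L1
# or, on Linux, a cruder filter on the "(deleted)" marker in lsof's output:
lsof -nP | grep '(deleted)'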

0

If you can run software on the system, then xdiskusage will graphically show you which directories/files are eating your space. Extremely useful.
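
A minimal sketch of how I'd run it, assuming it takes a directory to scan (as I recall it can also display the output of a du run you saved earlier):

xdiskusage /var
# or, from a saved du listing:
du -k / > /tmp/du.out && xdiskusage /tmp/du.out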

I believe KDE contains something similar.

If it's text-only and you cannot install extra software, then creative use of du will probably get you there as well.

sleske
0

Here's something I cobbled together to track down some rogue processes on our database servers: rabbitfinder

#!/bin/sh
tree -s -f > /tmp/out1 && sleep 5 && tree -s -f > /tmp/out2; diff /tmp/out1 /tmp/out2 | egrep "\|--" | awk -F[ '{print $2}' | awk -F] '{print $2 }' | sort | uniq | xargs fuser -f | xargs ps -lFp

It's kinda kludgy and not very robust, but it works like this:

  1. generate a recursive tree list of the current directory
  2. wait 5 seconds
  3. generate another list
  4. compare the two outputs
  5. fuser the files that have changed size, and
  6. ps -lFp shows which process owns each of them

    user@poseidon:~$ tree -s -f > /tmp/out1 && sleep 5 && tree -s -f > /tmp/out2; diff /tmp/out1 /tmp/out2 | egrep "\|--" | awk -F[ '{print $2}' | awk -F] '{print $2 }' | sort | uniq | xargs fuser -f | xargs ps -lFp
    ./tmp/output:       
    F S UID        PID  PPID  C PRI  NI ADDR SZ WCHAN    RSS PSR STIME TTY          TIME CMD
    0 R 1000     14310 14275 23  80   0 -  1072 -        748   1 22:19 pts/2    00:00:06 dd if /dev/zero of ./output bs 1024 count 10000000
    
Greeblesnort
0
  1. cd to the web server's home directory (Apache's home directory)
  2. run the command "du -a | head -30 | sort -nr"
  3. it will give you the 30 largest disk-consuming files/directories
  4. you can find and delete them (if not useful)
  • This is not going to work unless you change the order of `head` and `sort`. Also you should make use of the formatting features. – kasperd Feb 22 '16 at 11:22
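
As the comment points out, sort has to run before head for this to return the largest entries; a corrected one-liner would be:

du -a | sort -nr | head -30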
0

You can use the following commands to find which files or folders are taking up too much space.

E.g. to display the top 20 biggest directories and files in the current folder, use the following one-liner:

du -ah . | sort -rh | head -20

or:

du -a . | sort -rn | head -20

For the top 20 biggest files in the current directory (recursively):

ls -1Rs | sed -e "s/^ *//" | grep "^[0-9]" | sort -nr | head -n20

or with human readable sizes:

ls -1Rhs | sed -e "s/^ *//" | grep "^[0-9]" | sort -hr | head -n20

For the second command to work properly on OSX/BSD (as their sort doesn't support -h), you need to install sort from coreutils. Then add its bin folder to your PATH.
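
For example, on macOS with Homebrew (this assumes Homebrew as the package manager; the GNU tools it installs are prefixed with a g unless you put the gnubin directory on your PATH):

brew install coreutils
ls -1Rhs | sed -e "s/^ *//" | grep "^[0-9]" | gsort -hr | head -n20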

You can define these commands as aliases (e.g. add to your rc files such as .bash_profile):

alias big='du -ah . | sort -rh | head -20'
alias big-files='ls -1Rhs | sed -e "s/^ *//" | grep "^[0-9]" | sort -hr | head -n20'

Then run big or big-files inside the folders which you think take up the most space (e.g. in /home).

kenorb