
We have an Ubuntu 12.04 server running on a VMware ESXi host, with multiple partitions, running Zimbra 8 (mta/ldap only). One of the partitions is a 30 GB partition mounted as /opt/zimbra/data. I had a rude wake-up call this morning from my manager claiming the mail server was down.

I logged in and took a look, and sure enough, all commands were reporting that there wasn't enough free space on the /opt/zimbra/data partition. I tried to figure out which file was using up all the space, but both df and du failed me. Here are the outputs from the various commands, run after I had tracked down the file by doing an ls one directory at a time:

zimbra@mail:/opt/zimbra/data/ldap/mdb/db$ du -sh .
20M .

zimbra@mail:/opt/zimbra/data/ldap/mdb/db$ du --apparent-size data.mdb
31409532    data.mdb


zimbra@mail:/opt/zimbra/data/ldap/mdb/db$ df -h .
Filesystem      Size  Used Avail Use% Mounted on
/dev/sde1        30G  1.3G   28G   5% /opt/zimbra/data


zimbra@mail:/opt/zimbra/data/ldap/mdb/db$ ls -allh
total 20M
drwxr-xr-x 2 zimbra zimbra 4.0K Nov 26 13:50 .
drwxr-xr-x 3 zimbra zimbra 4.0K Jul 16 05:47 ..
-rw------- 1 zimbra zimbra  30G Nov 26 14:04 data.mdb
-rw------- 1 zimbra zimbra 8.0K Nov 26 14:07 lock.mdb

Note that ls shows data.mdb as 30 GB, yet that size is not reflected in the totals reported by du or df.

We have since created a new partition, copied the files over, and have things up and running, but I am still curious what would have caused the incorrect reporting of used space on the partition. We still have the old partition lying around, so if there are other commands I don't know of that would yield more accurate results, I would like to try them out.

Update:

Output from ls -alsh

root@mail:/opt/zimbra/data/ldap/mdb/db# ls -alsh
total 20M
4.0K drwxr-xr-x 2 zimbra zimbra 4.0K Nov 26 13:50 .
4.0K drwxr-xr-x 3 zimbra zimbra 4.0K Jul 16 05:47 ..
 20M -rw------- 1 zimbra zimbra  30G Nov 26 14:04 data.mdb
4.0K -rw------- 1 zimbra zimbra 8.0K Nov 26 14:07 lock.mdb

So it looks like the file is indeed a sparse file, with an apparent size very close to the partition size. I still have no idea why commands like touch or cp started returning "no space left on device", as this file only seems to have actually been using around 20 MB.
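For anyone who wants to reproduce the mismatch between the size ls reports and the blocks actually allocated, here is a minimal sketch. It assumes GNU coreutils (truncate, stat -c, du --apparent-size) and a filesystem that supports sparse files; the file name sparse-demo.img is made up for the demo.

```shell
# Create a 1 GiB sparse file in a scratch directory and compare its
# apparent size with the space actually allocated on disk.
demo_dir=$(mktemp -d)
truncate -s 1G "$demo_dir/sparse-demo.img"

# First column of ls -s is allocated space; the size column is the apparent size.
ls -lsh "$demo_dir/sparse-demo.img"

# du reports allocated space by default, apparent size only when asked.
du -h "$demo_dir/sparse-demo.img"
du -h --apparent-size "$demo_dir/sparse-demo.img"

# The same two numbers in bytes: %s is the apparent size,
# %b blocks of %B bytes each is what is really allocated.
apparent_bytes=$(stat -c '%s' "$demo_dir/sparse-demo.img")
allocated_bytes=$(( $(stat -c '%b' "$demo_dir/sparse-demo.img") * $(stat -c '%B' "$demo_dir/sparse-demo.img") ))
echo "apparent=$apparent_bytes allocated=$allocated_bytes"

rm -rf "$demo_dir"
```

This mirrors what happened with data.mdb: ls -l shows the apparent size, while du and df count allocated blocks.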

Wrapping up, here is what I have come up with to list all files, including sparse files, recursively, ordered by apparent file size in descending order:

ls -aldSh $(find .) | grep -v '^d'
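The one-liner above relies on shell word splitting, so it breaks on paths containing spaces. A possible alternative, as a rough sketch only: it assumes GNU find (for -printf), and list_by_apparent_size is a helper name I made up. It prints the apparent size next to the allocated size so sparse files stand out.

```shell
# List regular files under a directory, largest apparent size first.
# Columns (tab separated): apparent size in bytes, allocated size in KiB, path.
# Requires GNU find; list_by_apparent_size is a hypothetical helper name.
list_by_apparent_size() {
    find "${1:-.}" -type f -printf '%s\t%k\t%p\n' | sort -k1,1nr
}
```

Usage: list_by_apparent_size /opt/zimbra/data | head. A file whose first column is far larger than its second is sparse. Paths containing newlines would still confuse the output.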

For any Zimbra users who might end up here with a similar issue, the following links will help:

MDB: Maximum database size

ldap database went from 97meg to 86gig

Update 2:

Another thing to check is whether the partition has run out of inodes. Of particular interest are the logger and zmstat subdirectories. These contain a large number of small files, so if they are mounted on their own partitions, those partitions can run out of inodes well before they run out of space. Most commands will still return a "no space left on device" error, which can be misleading.

df -i can be used to show information on the number of free inodes. For example, I have a partition that has about 80% free space according to df -h, but still returns a "no space left on device" error because it is out of inodes:

root@mail:/opt/zimbra/logger# df -h
Filesystem       Size  Used Avail  Use%  Mounted on
/dev/sdh1        20G   3.9G   16G  21%   /opt/zimbra/logger

root@mail:/opt/zimbra/logger# df -i
Filesystem       Inodes   IUsed     IFree IUse% Mounted on
/dev/sdh1        5120     5120         0  100%  /opt/zimbra/logger
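Once df -i shows a partition is out of inodes, the next question is which directory holds all the files. Here is a minimal sketch for that, assuming GNU find; count_entries_by_dir is a made-up helper name, and the count is only approximate (hard links are counted once per name, and file names containing newlines are over-counted).

```shell
# Count filesystem entries under each immediate subdirectory of a mount point,
# busiest first. Output is tab separated: entry count, directory.
# count_entries_by_dir is a hypothetical helper name.
count_entries_by_dir() {
    for dir in "${1:-.}"/*/; do
        [ -d "$dir" ] || continue
        # -xdev keeps find from crossing into other mounted filesystems.
        # The count includes the subdirectory itself.
        printf '%s\t%s\n' "$(find "$dir" -xdev | wc -l | tr -d ' ')" "$dir"
    done | sort -k1,1nr
}
```

Usage: count_entries_by_dir /opt/zimbra/logger | head. Hidden top-level directories are skipped because the glob does not match dotfiles.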
jeshurun
