
On my AIX 6.1 server, I have a problem on a VIO LPAR.

A filesystem appears to be full according to the 'df' command, but not according to 'du' or 'ls', for example. I have searched, but I don't understand where the problem comes from.

The 'df' command shows:

[root@VIO2] /var/vio/storagepools/VIO2_storfs_rvg #df -IMvm | grep var
/dev/hd9var   /var                 1024.00    497.91    526.09   49%     9226   122167     8%
/dev/livedump /var/adm/ras/livedump    256.00      0.36    255.64    1%        4    58200     1%
/dev/VIO2_storfs_rvg /var/vio/storagepools/VIO2_storfs_rvg 409600.00 409600.00      0.00  100%       39       57    41%

The 'du' command:

[root@VIO2] /var/vio/storagepools/VIO2_storfs_rvg #du -sx *
0       lost+found
41943040        rootvg_ge
41943040        rootvg_lp
41943040        rootvg_pr_en
41943040        rootvg_pr_gf
41943040        rootvg_pr_io
41943040        rootvg_pr_ot
41943040        rootvg_pr_si
41943040        rootvg_te_gf
3016960 rootvg_te_iodas
0       rootvg_te_ot
0       rootvg_te_si
37748736        te_hd

The 'ls' command:

[root@VIO2] /var/vio/storagepools/VIO2_storfs_rvg #ls -alR
total 376310120
drwxr-xr-x    3 root     system         4096 Apr 22 22:27 .
drwxr-xr-x    3 root     system          256 Jan 28 2016  ..
-rw-r--r--    1 root     system          219 Apr 21 09:54 .rootvg_ge
-rw-r--r--    1 root     system          221 Apr 21 09:55 .rootvg_lp
-rw-r--r--    1 root     system          224 Oct 28 10:58 .rootvg_pr_en
-rw-r--r--    1 root     system          219 Oct 28 10:59 .rootvg_pr_gf
-rw-r--r--    1 root     system          221 Oct 28 10:59 .rootvg_pr_io
-rw-r--r--    1 root     system          221 Oct 28 11:26 .rootvg_pr_ot
-rw-r--r--    1 root     system          219 Apr 21 09:56 .rootvg_pr_si
-rw-r--r--    1 root     system          219 Oct 28 11:01 .rootvg_te_gf
-rw-r--r--    1 root     system          221 Oct 28 11:01 .rootvg_te_io
-rw-r--r--    1 root     system          221 Oct 28 11:02 .rootvg_te_ot
-rw-r--r--    1 root     system          219 Apr 21 09:57 .rootvg_te_si
-rw-r--r--    1 root     system          211 Apr 21 10:07 .te_hd
drwxr-xr-x    2 root     system          256 Jan 28 2016  lost+found
-rw-r--r--    1 root     system   21474836480 Apr 22 21:09 rootvg_ge
-rw-r--r--    1 root     system   21474836480 Apr 22 21:17 rootvg_lp
-rw-r--r--    1 root     system   21474836480 Apr 22 21:26 rootvg_pr_en
-rw-r--r--    1 root     system   21474836480 Apr 22 21:35 rootvg_pr_gf
-rw-r--r--    1 root     system   21474836480 Apr 22 21:44 rootvg_pr_io
-rw-r--r--    1 root     system   21474836480 Apr 22 21:53 rootvg_pr_od
-rw-r--r--    1 root     system   21474836480 Apr 22 22:02 rootvg_pr_si
-rw-r--r--    1 root     system   21474836480 Apr 22 22:11 rootvg_te_gf
-rw-r--r--    1 root     system   1544679424 Apr 22 22:11 rootvg_te_io
-rw-r--r--    1 root     system            0 Apr 22 22:19 rootvg_te_ot
-rw-r--r--    1 root     system            0 Apr 22 22:27 rootvg_te_si
-rw-r--r--    1 root     system   19327352832 Apr 24 08:08 te_hd
./lost+found:
total 8
drwxr-xr-x    2 root     system          256 Jan 28 2016  .
drwxr-xr-x    3 root     system         4096 Apr 22 22:27 ..

And some 'fuser' commands:

[root@VIO2] /var/vio/storagepools/VIO2_storfs_rvg #fuser -dV /var/vio/storagepools/VIO2_storfs_rvg
/var/vio/storagepools/VIO2_storfs_rvg:

[root@VIO2] /var/vio/storagepools/VIO2_storfs_rvg #fuser -dV /var
/var:

Thanks in advance if anyone can explain!

cd25_flo
    Classic symptom of a deleted-but-still-open file. Deleting a file isn't sufficient to free the space it uses. Files held open by a process aren't removed from disk until they're closed. Use the [`fuser` command](https://www.ibm.com/support/knowledgecenter/en/ssw_aix_61/com.ibm.aix.cmds2/fuser.htm) on a file: `fuser -c te_hd`, for example. – Andrew Henle Apr 24 '17 at 10:17
  • Which OS are you using? What does df -h show? – inaki Apr 24 '17 at 11:23
  • Another possibility is that you have a filesystem mounted over top of a directory (mount point) that contains files. The `du` command will not show these, because it shows disk usage for the files it sees. The `df` command would show these because it looks at the amount of space left for each block device. – Tim S. Apr 24 '17 at 20:21
  • Possible duplicate of [Disk space usage doesn't add up with df & du](https://serverfault.com/questions/379831/disk-space-usage-doesnt-add-up-with-df-du) – Jenny D May 12 '17 at 10:32
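The deleted-but-still-open scenario from the first comment is easy to reproduce. Here is a minimal sketch (shown on Linux for illustration; the file names are made up):

```shell
# Reproduce the deleted-but-open symptom. All paths are illustrative.
cd /tmp
dd if=/dev/zero of=big.file bs=1M count=100 2>/dev/null
tail -f big.file >/dev/null &   # a process holds the file open
holder=$!
rm big.file                     # the directory entry is gone...
df -k /tmp                      # ...but df still counts the 100 MB
du -sk /tmp                     # du no longer sees it
kill "$holder"                  # closing the file finally frees the space
```

On AIX, `fuser -dV <mountpoint>` reports processes holding unlinked files open; on Linux, `lsof +L1` does the same job.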

4 Answers


The df program reports the amount of space available to a non-root user, no matter who runs it. This has historically been true, and I imagine it remains so. The rationale is that if a regular program fills a partition, root has a little extra workspace to correct the problem. This was especially useful when the offending process was still trying to consume all available space.

I don't have access to an AIX machine, but you might look in sys/mount.h, if it's around.

iceberg /usr/include 521> grep f_bavail sys/mount.h
        int64_t         f_bavail;       /* free blocks avail to non-superuser */
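You can see the two counts side by side without reading headers. A minimal sketch using GNU coreutils on Linux, where `stat -f` exposes the statvfs fields (`%b` = total blocks, `%f` = `f_bfree`, `%a` = `f_bavail`):

```shell
# Compare total free blocks with blocks available to non-root users.
# %b = total blocks, %f = free blocks (f_bfree), %a = free blocks
# available to a non-superuser (f_bavail).
stat -f -c 'total=%b free=%f avail_nonroot=%a' /
# On an ext filesystem with the default 5% root reservation,
# "avail_nonroot" is noticeably smaller than "free".
```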
Erik Bennett

I solved my problem as "Tim S." suggested in a comment above. The problem was a partition mounted on top of the folder that contained the bulk of the data.

In my case, a backup script copied 100 GB to "/media/backup-data" while the partition usually mounted there was not actually mounted, so the files ended up on the root partition itself. Later, when the usual partition was mounted at "/media/backup-data", "du" saw the mounted partition but not the files underneath it on the root partition, whereas "df" still counted them. Hence the difference between "df -h" and "du -shx /".

To check for this, unmount all partitions and see whether "du" finds anything unusual on the root partition.
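If you would rather not unmount anything, one way to inspect what is hiding under a mount point is a bind mount of the parent filesystem (Linux, requires root; the paths below are the hypothetical ones from this answer):

```shell
# Bind-mount the root filesystem to a second location; files hidden
# under /media/backup-data become visible through the second view.
mkdir -p /mnt/rootview
mount --bind / /mnt/rootview
du -sh /mnt/rootview/media/backup-data   # counts the hidden files
umount /mnt/rootview
```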

toshko

I don't have access to an AIX machine either, but on Linux you can check the percentage of space reserved for root and system services with:

sudo tune2fs -l /dev/sda1 | grep 'Reserved'

and change it with:

sudo tune2fs -m 1 /dev/sdXY

(here, 1% is reserved for root)

See more info here: https://unix.stackexchange.com/questions/7950/reserved-space-for-root-on-a-filesystem-why
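You can observe the effect without touching a real disk by running `tune2fs` against a small ext4 image file (a sketch; assumes e2fsprogs is installed, no root needed):

```shell
# Create an 8 MB ext4 image and inspect/change its reserved-block count.
img=$(mktemp)
dd if=/dev/zero of="$img" bs=1M count=8 2>/dev/null
mkfs.ext4 -q -F "$img"
tune2fs -l "$img" | grep 'Reserved block count'   # default: 5% of blocks
tune2fs -m 1 "$img" >/dev/null                    # shrink reservation to 1%
tune2fs -l "$img" | grep 'Reserved block count'
rm -f "$img"
```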

DevOps

Finally, I solved the problem by unmounting the partition (with the 'force' option, because it seemed to be in use) and verifying its consistency with 'fsck' before remounting.

There were several errors: bad superblock, allocation map dirty, inode map dirty, and so on.

'fsck' corrected these errors, and after remounting everything is fine!

Thank you for your answers.

cd25_flo
  • What most likely happened is that you had an open-but-deleted file that filled the file system (deleted by someone trying to "fix" the full file system), and that open file is what kept the file system in use. So you used a forced unmount on the busy file system. But that forced unmount of the in-use filesystem is what then corrupted it. – Andrew Henle Apr 25 '17 at 10:22