Short version: rm -rf mydir, with mydir (recursively) containing 2.5 million files, takes about 12 hours on a mostly idle machine.
More information: Most of the files being deleted are hard links to files in other directories (the directory being deleted is actually the oldest backup made by rsnapshot; the rm command is actually issued by rsnapshot). So it's mostly directory entries being deleted; the file content itself isn't much, on the order of some tens of GB.
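One way to check how much of the tree really is just extra directory entries is to look at each file's link count. This is a minimal sketch, assuming Python 3 is available on the machine; count_hard_linked is a hypothetical helper written for illustration, not something rsnapshot provides.

```python
import os
import stat


def count_hard_linked(root):
    """Return (regular_files, multiply_linked) for the tree under root.

    A file with st_nlink > 1 has at least one other directory entry
    somewhere, so unlinking it here frees no file data.
    """
    regular_files = 0
    multiply_linked = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # lstat so that symlinks are examined, not followed
            st = os.lstat(os.path.join(dirpath, name))
            if stat.S_ISREG(st.st_mode):
                regular_files += 1
                if st.st_nlink > 1:
                    multiply_linked += 1
    return regular_files, multiply_linked
```

If the second number is close to the first, deleting the directory is almost entirely metadata work, which matches the description above.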
I'm far from certain that btrfs is the culprit. I recall backup was also very slow before I started to use btrfs, but I'm not certain that the slowness was in the deletion.
The machine is an Intel Core i5 2.67 GHz with 4 GB RAM. It has two SATA disks: one has the OS and some other stuff, and the backup disk is a 1 TB WDC WD1002FAEX-00Z3A0. The motherboard is an Asus P7P55D.
Edit: The machine is a Debian wheezy with Linux 3.16.3-2~bpo70+1. This is how the filesystem is mounted:
root@thames:~# mount|grep rsnapshot
/dev/sdb1 on /var/backups/rsnapshot type btrfs (rw,relatime,compress=zlib,space_cache)
Edit: Using rsync -a --delete /some/empty/dir mydir takes about 6 hours. A significant improvement over rm -rf, but still too much I think. (Explanation of why rsync is faster than rm: "[M]ost filesystems store their directory structures in a btree format, the order [in] which you delete files is ... important. One needs to avoid rebalancing the btree when you perform the unlink.... rsync -a --delete ... does deletions in-order")
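The idea in that quote, unlinking entries in a fixed order rather than in whatever order the directory returns them, can be sketched as follows. This is only an illustration under stated assumptions: delete_tree_sorted is a hypothetical helper, and sorting by name within each directory is my guess at what "in-order" means, since the quote does not say which order rsync uses.

```python
import os


def delete_tree_sorted(root, key=None):
    """Remove root and everything under it, unlinking entries in sorted order.

    Assumption: "in-order" is approximated by sorting the names within
    each directory; pass key= to try a different ordering.
    Returns the number of entries removed, including root itself.
    """
    removed = 0
    # topdown=False yields the deepest directories first, so every
    # subdirectory is already empty by the time rmdir reaches it.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in sorted(filenames, key=key):
            os.unlink(os.path.join(dirpath, name))
            removed += 1
        for name in sorted(dirnames, key=key):
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                os.unlink(path)  # symlink to a directory: remove the link only
            else:
                os.rmdir(path)
            removed += 1
    os.rmdir(root)
    return removed + 1
```

Whether this ordering actually helps on btrfs is not something measured here; the sketch only makes the deletion order explicit so that it can be timed against plain rm -rf.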
Edit: I attached another disk which had 2.2 million files (recursively) in a directory, but on XFS. Here are some comparative results:
                      On the XFS disk   On the BTRFS disk
Cached reads[1]       10 GB/s           10 GB/s
Buffered reads[1]     80 MB/s           115 MB/s
Walk tree[2]          11 minutes        43 minutes
rm -rf mydir[3]       7 minutes         12 hours
[1] With hdparm -T /dev/sdX and hdparm -t /dev/sdX.
[2] Time taken to run find mydir -print|wc -l immediately after boot.
[3] On the XFS disk, this was soon after walking the tree with find. On the BTRFS disk it is the old measurement (and I don't think it was with the tree cached).
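For anyone who wants to repeat the tree-walk measurement in a scriptable form, here is a rough equivalent of the find|wc -l test. It is a sketch, assuming Python 3; walk_tree is a hypothetical name, and the count can differ from wc -l if any file name contains a newline.

```python
import os
import time


def walk_tree(root):
    """Count root plus every entry below it, roughly like
    `find root -print | wc -l`, and return (count, elapsed_seconds)."""
    start = time.monotonic()
    count = 1  # find prints the starting directory itself
    for _dirpath, dirnames, filenames in os.walk(root):
        count += len(dirnames) + len(filenames)
    return count, time.monotonic() - start
```

As with the numbers in the table, the elapsed time only means something if both filesystems are measured in the same cache state, for example right after boot.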
It appears to be a problem with btrfs.