4

Currently I have a btrfs mount point with a structure like this:

/mountpoint/month/day/hourAs24/

and each leaf directory contains between 5,000 and 20,000 small files. I keep the files there for two months. Each day I remove the directory that is older than 60 days with the command

rm -R /mountpoint/month/day/

This command takes ages to run and the load on the server is extremely high while the command runs.

Would it be better to create btrfs subvolumes under /mountpoint/month/ for every day and then purge the subvolumes in one single command?

Are there any other fast and lightweight solutions to get rid of the files under one btrfs directory in a single command?

Edit: To clarify the situation: in the structure, the folders month, day and hourAs24 are variables that are substituted by the corresponding values of the current date and time.

Edit after solution: It works smoothly on my test machine, and all of the following works live with the mountpoint mounted. First I create normal directories for each month with

mkdir /mountpoint/month

Then I create btrfs subvolumes for each day of month with

btrfs subvolume create /mountpoint/month/day

then I create the normal directories for each hour of day with

mkdir /mountpoint/month/day/hourAs24

After 60 days I can easily purge the whole day with

btrfs subvolume delete /mountpoint/month/day

(Now I have to wait 60 days to see the performance on the production server)
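The steps above could be wrapped in a small script for a daily cron job. This is only a sketch under assumptions: the names `run`, `day_path` and `rotate`, the `DRY_RUN` switch, and the zero-padded `month/day` directory names are hypothetical and not part of the original setup, and GNU `date` is assumed for the `-d` option. With `DRY_RUN=1` it only prints the commands it would run.

```shell
#!/bin/sh
# Hypothetical rotation sketch. MOUNTPOINT, RETENTION_DAYS and DRY_RUN
# are assumed names; adjust them to the real layout.
MOUNTPOINT="${MOUNTPOINT:-/mountpoint}"
RETENTION_DAYS="${RETENTION_DAYS:-60}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands instead of running them

# Run a command, or just print it when DRY_RUN is 1.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Print the day directory for a date given as YYYY-MM-DD.
# Assumes zero-padded month and day names, e.g. /mountpoint/07/31.
day_path() {
    printf '%s/%s\n' "$MOUNTPOINT" "$(date -u -d "$1" +%m/%d)"
}

# Create the subvolume for the given day and delete the expired one.
rotate() {
    today="$1"
    expired=$(date -u -d "$today $RETENTION_DAYS days ago" +%Y-%m-%d)
    new=$(day_path "$today")
    old=$(day_path "$expired")
    run mkdir -p "$(dirname "$new")"       # month is a normal directory
    run btrfs subvolume create "$new"      # day is a subvolume
    run btrfs subvolume delete "$old"      # purge the day that fell out of retention
}
```

It would be called as `rotate "$(date -u +%Y-%m-%d)"`, with `DRY_RUN=0` once the printed commands look right. The hourly directories are left to `mkdir` as before.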

mailq

3 Answers

8

I would go the subvolumes route, myself. You just can't beat it for speed, and if you tilt your head and squint a bit, you can even say that it looks like the "right" way to store the files anyway...

womble
2

Unmounting and then deleting a filesystem, followed by creating a new filesystem and mounting it, will be a lot faster than removing thousands of objects in a single filesystem. The metadata operations required for the delete are orders of magnitude more numerous than the I/O operations required to bulk-remove a filesystem and then recreate a new one.

LVM is flexible enough to handle that sort of thing.

Alternatively, if that isn't possible, you could go the route of creating large loopback files in the BTRFS filesystem that you then format as BTRFS and mount to your directories. Not quite as fast as the LVM method, but still markedly faster (or should be) than unlinking all those files. IIRC, BTRFS also supports sparse files, which may be a good choice for this route if you go there.

sysadmin1138
  • 1
    Uhm... why would you faff around with LVM or loopback files when btrfs effectively has that ability built-in? It's the "subvolumes" the OP was talking about. – womble Jul 31 '11 at 18:37
  • 2
    I haven't worked with btrfs before, so that's where I'm basing my recommendation. If BTRFS can do sub-filesystems like that, then use that. – sysadmin1138 Jul 31 '11 at 18:39
-3

find may be faster:

find /mountpoint -mtime +60 -delete
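To see what this would remove before pointing it at real data, it can be tried on a throwaway tree first. The sketch below is an illustration only: the function name `demo_find_delete` and the file names are made up, and GNU `find` and `touch` are assumed. It adds `-type f` so that only files are removed and the directory skeleton stays in place.

```shell
# Build a scratch tree with one old and one fresh file, purge files
# older than 60 days, then list what survived.
demo_find_delete() {
    dir=$(mktemp -d)
    mkdir -p "$dir/06/01/00" "$dir/07/31/00"
    touch -d "90 days ago" "$dir/06/01/00/old.dat"   # older than the cutoff
    touch "$dir/07/31/00/new.dat"                    # fresh file
    # -delete removes matches from within find itself (GNU find).
    find "$dir" -type f -mtime +60 -delete
    find "$dir" -type f                              # print the survivors
    rm -r "$dir"
}
```

Only `new.dat` should be listed afterwards. Note that `find` still has to walk every directory under the starting point, so on the real tree it may be worth starting it at the expired day directory rather than at `/mountpoint`.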
quanta
  • 4
    No, this is even slower. As it first has to scan the WHOLE mountpoint where we even know that the old files are in one single directory. And to top the slowness it will execute one `rm` process for each found file. – mailq Jul 31 '11 at 15:53
  • If the structure is like the original poster mentioned, how does he put all the files older than 60 days into `/mountpoint/month/day/`? I will test with the `time` command and append the results to the above answer. – quanta Jul 31 '11 at 16:27
  • The files are in the correct folder "automagically", because the program creates the files in the correct folder by design. – mailq Jul 31 '11 at 16:35