4

The reason I ask is that I use rsnapshot for backups, writing to a separate RAID1 ext3 backup disk mounted at /backups. Unfortunately, removing old backups (which happens every 4 hours) takes up to a full hour: `rm -rf /backups/server/hourly.5` takes very long, even though all it does is remove hardlinks, since most of the data consists of hardlinks.

ZFS is lovely, but I am thinking of Btrfs, XFS, or maybe just ext4 for the new backup server. ZFS is not suitable at all for production in Linux environments, so it's not an option, although it seems by far the best filesystem. This time I will be using BackupPC on CentOS or Debian instead of rsnapshot. I also considered Bacula, but it seems to have no advantages whatsoever over BackupPC, while being harder to configure and requiring an agent to be installed.

I would like a filesystem that deletes hardlinks fast. I don't see why this has to take an hour, since nothing actually happens to the data itself.

General advice on backups is welcome, but I think that with BackupPC, RAID1 for the backup disks, and a filesystem that is both fast and production-ready, I will have a good backup environment.

ujjain

3 Answers

2

XFS deletes files, at least, in O(1), while the ext(mumble) family is in O(log n) (n being the size of the file). I don't know how that translates into deleting tons of links, but it's a start.
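If you want to see how that translates in practice, a rough benchmark along these lines can measure hardlink removal on each candidate filesystem (run it on a temp directory located on the filesystem under test; the link count of 1000 is arbitrary):

```shell
# Rough sketch: create one file, hardlink it many times,
# then time how long removing the whole tree takes.
dir=$(mktemp -d)
echo data > "$dir/orig"
for i in $(seq 1 1000); do
    ln "$dir/orig" "$dir/link$i"   # hardlink, not a copy
done
time rm -rf "$dir"
```

Repeating this with larger link counts on XFS versus ext3/ext4 should make the scaling difference visible.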

Bill Weiss
2

I use ext4 for my filesystem. Try the `relatime` (relative atime) mount option to minimize inode updates to the directory as you delete files.
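As an example, an `/etc/fstab` entry for the backup filesystem might look like this (the device name and mount point are illustrative):

```
# /etc/fstab — device and mount point are examples
/dev/md1  /backups  ext4  relatime  0  2
```

You can also apply the option to an already-mounted filesystem with `mount -o remount,relatime /backups`.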

RAID writes tend to be slower than reads, since multiple disks must be written. This is compounded by journal writes. You might try using an external journal on a separate set of disks.
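Setting that up might look something like the following sketch (device names are examples; this destroys data on the devices involved, so treat it as an outline rather than a recipe):

```shell
# Format a partition on a separate disk as an external journal device,
# then create the backup filesystem pointing at it.
mke2fs -O journal_dev /dev/sdc1
mkfs.ext4 -J device=/dev/sdc1 /dev/md1
```

For an existing filesystem, `tune2fs` can migrate the journal to an external device instead of recreating the filesystem.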

Deleting files from directories with a large number of files tends to be significantly slower than deleting files from directories with fewer files. I believe this is due to writing more directory blocks. However, fixing this requires fixing the allocation in the directories you are backing up. Directories with large numbers of files have caused me problems with many applications.
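One common workaround on the source side is to hash file names into subdirectories so that no single directory accumulates too many entries. A minimal sketch (the layout here is purely illustrative; BackupPC's own pool already spreads its files across hashed subdirectories):

```shell
# Place each file under a subdirectory named after the first hex
# digit of an md5 of its name, giving 16 buckets instead of one
# huge directory.
pool=$(mktemp -d)
name="example.dat"
sub=$(printf '%s' "$name" | md5sum | cut -c1)
mkdir -p "$pool/$sub"
touch "$pool/$sub/$name"
```

More hash digits give more buckets if a single level is not enough.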

BillThor
  • +1 for ext4. Personally, I would prefer `noatime` to `relatime` in this particular use-case, since BackupPC doesn't really have much use for file access times anyway. – Steven Monday Jun 19 '11 at 17:07
1

A while back, I installed BackupPC on Debian Lenny with RAID1 and LVM, and I chose to go with Reiserfs (mounted with the noatime option). I would have preferred to go with ext4, but, at the time, Lenny's ext4 support was still a bit ... new. The BackupPC FAQ recommended using Reiserfs over ext3, so that was it.

Anyway, so far so good, no problems. Reiserfs has been very solid. The way BackupPC works, you probably don't really need to worry much about file/hardlink deletion performance. That's because BackupPC runs its cleanup jobs separately from (but concurrently with, if necessary) backups and restores. I've never had an issue with a cleanup job running too long or otherwise interfering with normal operations.

In my case, a more important consideration is how efficiently the filesystem handles many small files. Apparently, Reiserfs is very good for this, as it minimizes internal fragmentation by so-called "block suballocation" (or "tail packing"). Still, if I were to have to choose again today, I would probably go with ext4. The benefits of running the same FS that most Linux kernel devs are running probably outweigh any minor technical advantages that a niche FS (like Reiserfs) may confer.

Steven Monday