I am running a BackupPC server with a hardware RAID 5 as the main storage for the backups. Since the machine was built on a tiny budget, the controller is a 3Ware 9500S-4LP in a PCI slot and the drives are slow 200 GB SATA models.
However, even with this hardware, I see far worse performance than expected. The clients and the backup server use rsync as a transport over a Gigabit network, which is never even close to saturation. Backing up a normal Linux installation of about 5 GB takes over three hours.
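To put a number on "far worse than expected", here is a back-of-the-envelope sketch of the effective throughput. It assumes that 5 GB means 5 * 1024 MiB and that the run takes exactly three hours; since it actually takes longer, the result is an upper bound. The variable names are only for illustration.

```python
# Rough effective throughput of one backup, using the figures quoted above.
# Assumptions: 5 GB = 5 * 1024 MiB, duration = exactly 3 hours (upper bound).
backup_size_mib = 5 * 1024
duration_s = 3 * 3600

throughput_mib_s = backup_size_mib / duration_s

# Theoretical Gigabit Ethernet line rate, ignoring all protocol overhead.
gigabit_mib_s = 1_000_000_000 / 8 / (1024 * 1024)

link_utilisation = throughput_mib_s / gigabit_mib_s

print(f"effective throughput: {throughput_mib_s:.2f} MiB/s")
print(f"share of Gigabit line rate: {link_utilisation:.2%}")
```

That works out to less than half a MiB per second, well under one percent of what the network could carry.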
So I monitored the server using the atop process monitor. It showed that neither processor nor memory use is critical, but that read accesses to the RAID are the bottleneck.
When I built the server, I chose RAID 5 because, according to this tabular overview of RAID characteristics, it seemed the best compromise between read performance and space efficiency on a 4-port controller.
By the way, although this is a backup server, using rsync means there are far more reads than writes here -- around 1000 times more, currently. I suppose that moving and linking older files in BackupPC's hierarchy of old backups also contributes a lot to this.
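For anyone who wants to check a ratio like this on their own machine, here is a minimal sketch that derives it from the cumulative counters in `/proc/diskstats` (field 4 is reads completed, field 8 is writes completed). The function name, the device name `sda` and the sample line are made up for illustration; they are not taken from my server.

```python
def read_write_ratio(diskstats_text, device):
    """Return (reads, writes, ratio) of completed I/Os for one block device."""
    for line in diskstats_text.splitlines():
        fields = line.split()
        # Layout: major, minor, name, reads completed, reads merged,
        # sectors read, ms reading, writes completed, ...
        if len(fields) >= 11 and fields[2] == device:
            reads = int(fields[3])
            writes = int(fields[7])
            ratio = reads / writes if writes else float("inf")
            return reads, writes, ratio
    raise ValueError(f"device {device!r} not found")


# Made-up sample line in /proc/diskstats format (hypothetical device "sda").
sample = "   8       0 sda 1000000 0 8000000 0 1000 0 8000 0 0 0 0"
print(read_write_ratio(sample, "sda"))
```

On a live system the text would come from `open("/proc/diskstats").read()`. The counters are totals since boot, so sampling twice and subtracting gives the ratio for a specific backup run.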
So, how would you optimize performance on this machine? I have the following tunables:
- Using a different transport with BackupPC (tar is an alternative)
- Changing the array's filesystem from ext4 (mounted with noatime) to something else
- Changing the RAID level (preferably not, since that means losing the existing data)
- Recreating the array with a different stripe size (preferably not, since that means losing the existing data)
- Adding more memory to use as a buffer cache
- Adding a second controller and more drives (yes, I have those around)
- Changing the controller (preferably not, due to financial constraints)
- Changing all drives (preferably not, due to financial constraints)