I want to replicate in the region of 10 TB of data (lots of smallish files, low level of churn) across a WAN with minimal impact on the available infrastructure.
I could simply use rsync, but that means scanning for changes and comparing the local and remote data, which costs disk I/O, network bandwidth and CPU. Although rsync does this efficiently, I wonder if there is a more efficient solution that can track changes and propagate them (preferably bidirectionally).
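To make "track changes" concrete, here is a rough sketch of the kind of thing I have in mind. It is only an illustration, not something I have deployed: the function names `snapshot` and `diff_snapshots` are made up, and it assumes that a changed size or modification time is a good enough signal that a file changed. It still walks the local tree, but it only reads metadata and never touches the remote side.

```python
import os

def snapshot(root):
    """Map each file's path (relative to root) to (size, mtime_ns).

    Only metadata is read; file contents are never opened.
    """
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                st = os.stat(full)
            except OSError:
                continue  # file vanished between listing and stat
            state[os.path.relpath(full, root)] = (st.st_size, st.st_mtime_ns)
    return state

def diff_snapshots(old, new):
    """Return (changed_or_new, deleted) as sorted lists of relative paths."""
    changed = sorted(p for p, meta in new.items() if old.get(p) != meta)
    deleted = sorted(p for p in old if p not in new)
    return changed, deleted
```

The `changed` list could then be written to a file and handed to rsync's `--files-from` option so that only those paths are considered, with deletions handled separately. This is one-directional only, so it does not address the bidirectional case.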
The storage itself is iSCSI on HP NAS devices. We previously looked at using their built-in replication capabilities but found them to be slow and unreliable.
DRBD mirrors would require additional hardware at both ends, which would be rather expensive. I've also been bitten by DRBD replication failures in the past.
Would GlusterFS be more efficient? Would it be really dumb to go with a two-node setup? Is there a better solution?