Several years ago I had the same requirements you do now. The solution I chose was ZFS, via the ZFS-FUSE driver, on my storage server. My thinking was that my personal photos, scanned documents, and similar files are things I access only occasionally, so a very long time, say a year or more, could pass before I noticed that a file had been corrupted by a drive error or the like.
By then, every backup copy I have might contain the bit-rotted version of the file(s).
ZFS has a benefit over RAID-5 in that it can detect and repair errors in the data stored on the individual disks even when the drives report no read error. It detects, via checksums, that one of the disks returned corrupted data and uses the redundancy data to repair that disk.
Because of the way the checksumming in ZFS is designed, I felt I could rely on it to store infrequently used data for long periods of time. Every week I run a `zpool scrub`, which re-reads all of the data and verifies the checksums.
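If it helps, the weekly scrub can be automated with a cron entry along these lines; `tank` is just a placeholder for your pool's name, and the binary path may differ on your distribution:

```shell
# /etc/cron.d/zfs-scrub -- run a full scrub every Sunday at 02:00.
# "tank" is a placeholder; substitute your actual pool name
# (list your pools with: zpool list).
0 2 * * 0  root  /usr/sbin/zpool scrub tank
```

You can check the result afterwards with `zpool status tank`, which reports any checksum errors found and repaired during the scrub.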
ZFS-FUSE has performed quite well for me over the last few years.
In the distant past, for a client, I implemented a database system that stored checksum information for every file under a particular directory. Another script ran periodically and checked each file against the checksum stored in the database. With that we could quickly detect a corrupted file and restore it from backups. We were basically implementing the same sort of checks that ZFS does internally.
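If you don't want to switch filesystems, the same idea can be sketched with a plain manifest file instead of a database. This is only an illustration of the approach, not the client's actual system; the paths and manifest name are made up:

```shell
#!/bin/sh
# Sketch of a checksum-audit script: record a SHA-256 for every file
# under DATA_DIR, then later re-verify the files against that record.
# Both paths below are illustrative placeholders.
MANIFEST="/var/lib/checksums/photos.sha256"
DATA_DIR="/srv/photos"

# Build (or rebuild) the manifest: one "checksum  path" line per file.
build_manifest() {
    mkdir -p "$(dirname "$MANIFEST")"
    find "$DATA_DIR" -type f -print0 | xargs -0 sha256sum > "$MANIFEST"
}

# Verify every file against the stored manifest. sha256sum exits
# non-zero and prints a "FAILED" line for any file whose contents
# no longer match, which is your cue to restore from backup.
verify_manifest() {
    sha256sum --check --quiet "$MANIFEST"
}
```

Run `build_manifest` once after adding files, and `verify_manifest` from cron; any output means a file changed or rotted since the manifest was written.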
How about unison? It is meant for two-way synchronisation, but it does compute file checksums as part of its comparison.
– taper – 2019-04-14T10:32:41.977