Detecting and preventing HFS Plus/NTFS bit rot

Neither HFS Plus nor NTFS does any data integrity checking; that is, neither checks for “bit rot” in the data files it stores. This is concerning because Time Machine and similar tools cannot detect whether they are backing up corrupt data.

Are there tools that can detect corruption and warn me of it?

What is the best consumer strategy for keeping my data intact? Do I have to go all the way and build a ZFS/Btrfs NAS to store my information?

Update:

After some research I found that there are 2 ports of ZFS to Mac OS X:

This seems to be the best approach to gaining data integrity on Mac OS.

hekevintran

Posted 2014-12-04T23:02:54.933

Reputation: 2 211

Answers

1

I’ve researched this issue for fairly large (20TB+) enterprise-sized storage systems, and the consumer reality is this: ZFS-based systems are really the only way to deal with it. If data rot is a real concern, I would recommend having at least one other hard drive that you back up to. Not RAID or anything magical, but simply another external drive that is synced using a tool like rsync, if you are comfortable using the command line, or Carbon Copy Cloner, which is basically an app that performs the same function as rsync but has a nice user interface.

I did some searching just now and found ZFS on Linux, which sounds interesting: an open source implementation of ZFS for Linux systems. If you are comfortable rolling up your sleeves and setting up Linux/Unix stuff, this could be a potential solution for a do-it-yourself NAS. But I do not have direct experience with it, so I can’t speak for its long-term usefulness in a “production” environment.

JakeGould

Posted 2014-12-04T23:02:54.933

Reputation: 38 217

I've been using ZFS for years now on my home server, with FreeBSD. It's really great but eats as much RAM as you throw at it. ZFS for Linux should be sufficiently stable by now; I'm using it on one of my dedicated servers. No problems there either. – Daniel B – 2014-12-04T23:41:50.440

@DanielB Very good to know! Now I have a new project to spend time on! – JakeGould – 2014-12-04T23:46:17.833

I'd add that setting it up right is a bit tricky. There's a SF regular who's a wizard at this, and if I recall correctly he suggests having a fast SSD for ZIL and L2ARC. You're also going to want a ton of RAM on your storage box, as Daniel B mentioned. – Journeyman Geek – 2014-12-05T00:16:34.307

It'd probably be easier to use FreeNAS to set up ZFS on FreeBSD than doing it from scratch, since you're looking to build a NAS anyway (unless, of course, you prefer to set it up manually so you understand it). – Suchipi – 2014-12-05T06:25:40.367

@Suchipi “Unless, of course, you prefer to set it up manually so you understand it.” For the original poster, maybe that is the best option. But I do Linux/Unix development, systems administration and security work. Even if I set something up the easy way, I assure you that at some point I will have to “dig deeper” to really get it to work the way a client desires. So I would rather build from scratch; it works better for my process. – JakeGould – 2014-12-05T06:29:18.167

There’s really no need for any trickery in SOHO environments. My system has a Sandy Bridge Pentium and runs RAID-Z2 without problems at ~10k IOPS and >400 MB/s with 6 disks. ZIL and L2ARC are unnecessary for most mass storage applications. After all, you only have Gigabit Ethernet, right? ;) – Daniel B – 2014-12-05T08:33:35.920

@DanielB Could you explain your last point some more? You lost me with the ZFS terms. What are ZIL and L2ARC, why are they unnecessary, and what's the relation to Ethernet? – hekevintran – 2014-12-05T18:41:25.507

@DanielB Instead of elaborating in comments, it might be better if you posted a new answer outlining how you have successfully set up an NFS with ZFS on a small scale. – JakeGould – 2014-12-05T20:10:29.907

4

It's worth noting that Microsoft now has ReFS (Resilient File System) on Windows Server 2012 and later and on Windows 8.1, which does check integrity. Furthermore, if you run ReFS on a mirrored storage space, it can automatically correct those errors by using bits from the other side of the mirror.
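The mirror repair idea can be illustrated with a toy sketch. This is not how ReFS or Storage Spaces is implemented; it only assumes that a trusted checksum is stored separately from the data and that two or more copies of a block exist. The names `checksum` and `read_with_repair` are made up for the illustration.

```python
import hashlib
from typing import List, Optional


def checksum(block: bytes) -> str:
    """Return the SHA-256 hex digest used here as the block's stored checksum."""
    return hashlib.sha256(block).hexdigest()


def read_with_repair(copies: List[bytearray], expected: str) -> Optional[bytes]:
    """Return a copy that matches the expected checksum and heal the others.

    Returns None when every copy fails verification, which is the case
    where detection is still possible but repair is not.
    """
    good = next((bytes(c) for c in copies if checksum(bytes(c)) == expected), None)
    if good is None:
        return None
    for copy in copies:
        if checksum(bytes(copy)) != expected:
            copy[:] = good  # overwrite the rotted copy in place with the good data
    return good
```

With a single copy the function can only report the mismatch, which is why redundancy and not checksumming alone is what makes repair possible.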

ReFS doesn't support all the features of NTFS, so you'll have to decide if any of the things it's missing are important for the files or workloads you need it for.

briantist

Posted 2014-12-04T23:02:54.933

Reputation: 780

Well you don't specify your desired operating system and you listed NTFS so I figured it was fair game that Windows might be acceptable! I may be biased as a primarily Windows sysadmin, but Microsoft's stuff is pretty great lately, especially on the server side. Do you have a specific concern? – briantist – 2014-12-05T04:50:58.113

“Well you don't specify your desired operating system…” I am not the original poster. I am simply a site user who gave you a “+1” for this, said “Good tip!” and explained my only personal issue with it: Microsoft. As for whether this is an issue for the original poster, it's not my position to say. – JakeGould – 2014-12-05T04:52:50.440

Whoops my mistake! I'm on mobile so it's easier to miss. I'm still curious if you have a specific concern, as it would add to the discussion and would probably be useful to the OP. – briantist – 2014-12-05T04:55:28.507

My concern is that the original poster mentions only Time Machine (an Apple-specific Mac OS X tool) as their backup tool and might only be approaching this issue from the Apple side. There, thanks to FUSE, Mac users sometimes format drives in NTFS for cross-platform compatibility and will not be mounting or managing them in a Microsoft OS environment that could really handle ReFS. If somehow ReFS is cross-platform in usability and resiliency without OS concerns, great! But I have a feeling that might be a stumbling block to its being adopted in the scenario as presented. – JakeGould – 2014-12-05T04:59:22.217

That's a good point, and it's unlikely that ReFS will be a viable option on direct hardware attached to a Mac. If it were a Windows file server sharing a ReFS volume over CIFS or NFS it might work, but I'll add that ReFS doesn't support alternate data streams like NTFS does, so it couldn't support resource forks either. – briantist – 2014-12-05T05:02:54.547

@briantist Great to hear that Microsoft already has a product that addresses this issue properly. Yes as JakeGould guessed I am primarily an Apple user so this doesn't help me directly, but maybe it'll motivate Apple to release something similar. – hekevintran – 2014-12-05T19:01:59.100

4

chkbit is a lightweight bitrot detection tool (OS X/Linux/Windows).

chkbit cannot repair bitrot; its job is simply to detect it.

You should:

  • back up regularly.
  • run chkbit before each backup.
  • check for bitrot on the backup media.
  • in case of bitrot, restore from a checked backup.
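The detection step in that routine can be sketched in a few lines. This is not chkbit's code; it is a minimal illustration that assumes one heuristic: content that changed while the modification time stayed the same is treated as rot, whereas a changed modification time is treated as an ordinary edit. The names `snapshot` and `classify` are hypothetical.

```python
import hashlib
import os
from pathlib import Path
from typing import Optional


def snapshot(path: Path) -> dict:
    """Record a file's modification time and content hash.

    The whole file is read into memory, which is fine for a sketch
    but not for very large files.
    """
    return {
        "mtime": os.stat(path).st_mtime_ns,
        "sha256": hashlib.sha256(path.read_bytes()).hexdigest(),
    }


def classify(previous: Optional[dict], current: dict) -> str:
    """Compare two snapshots of the same file and name what happened."""
    if previous is None:
        return "new"
    if previous["sha256"] == current["sha256"]:
        return "ok"
    if previous["mtime"] != current["mtime"]:
        return "updated"  # content and mtime both changed: assumed intentional
    return "bitrot"  # content changed but mtime did not
```

Running such a check before each backup means a file flagged as "bitrot" can be restored from the previous backup instead of being copied over it.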

laktak

Posted 2014-12-04T23:02:54.933

Reputation: 2 223

0

For Windows I've discovered this little program:

"DiskFresh is a simple yet powerful tool that can refresh your hard disk signal without changing its data by reading and writing each sector and hence making your disk more reliable for storage"

It does a full read/write cycle of all the sectors on a disk, so that you can prevent bitrot.

I have some SATA hard disks which I use for archiving purposes, so they are not always connected to the computer. I keep them in a plastic enclosure in a drawer, together with some moisture-absorbing bags. If and when needed, I just slide them into one of the SATA disk slots I've installed on the tower. Because they sit offline for extended periods of time, I have some concerns about bitrot on them. I found this utility and tried it out on these disks. Just be prepared that it will take a long time, as it performs a full read-write on the whole disk. I usually run it once a year, overnight.

noctrex

Posted 2014-12-04T23:02:54.933

Reputation: 1

Please read how to recommend software in answers, particularly the bits in bold; then edit your answer to follow the guidelines there. Thanks! – bertieb – 2018-11-22T18:34:27.083

0

You can always manually compute checksums with md5sum and check them periodically, or you can use btrfs, which has an online checksumming feature. On the other hand, it really is kind of redundant and unnecessary since disk drives already have their own error detecting and correcting codes.
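For the manual approach, a small script can stand in for running md5sum by hand. The sketch below assumes the checksums are kept as a simple path-to-digest mapping; the function names `file_md5`, `build_manifest` and `verify_manifest` are invented for the example, and where the manifest is stored is left open.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def file_md5(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> Dict[str, str]:
    """Map each file under root (as a relative POSIX-style path) to its digest."""
    return {
        p.relative_to(root).as_posix(): file_md5(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(root: Path, manifest: Dict[str, str]) -> List[str]:
    """Return the recorded paths that are now missing or whose digest differs."""
    failed = []
    for relative, expected in manifest.items():
        target = root / relative
        if not target.is_file() or file_md5(target) != expected:
            failed.append(relative)
    return failed
```

A mismatch only says that a file differs from what was recorded; it cannot tell an intentional edit from corruption, and it offers no way to repair the file.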

psusi

Posted 2014-12-04T23:02:54.933

Reputation: 7 195

And if one day the MD5 checksum doesn't match, then what happens? ZFS allows for snapshots of data so you can recover a valid version of a file. On a normal file system, if the data doesn't check out, the data is gone unless there is a backup. – JakeGould – 2014-12-05T00:24:55.857

@JakeGould, snapshots won't help with that either since all of the snapshots share the same data blocks on disk if they were not modified intentionally. – psusi – 2014-12-05T04:07:53.027

Snapshots would help. You need to investigate how ZFS systems work. They are designed specifically to deal with data rot issues like this and provide a way to alert and restore data. – JakeGould – 2014-12-05T04:47:50.653

@JakeGould, no they would not. The whole point of snapshots is that they do NOT duplicate the data and thus do not take up twice the space. ZFS can tell you that the data has become corrupted, but unless you configure it with a redundant RAID configuration, it can't recover the data. If you are using RAID, then that is what provides the backup copy that hopefully is still good, not snapshots. – psusi – 2014-12-05T14:42:44.550

@psusi Running md5sum is not really a good solution, since on its own it does nothing for recovery, which is the whole point of wanting to know where the corruption is. I don't think that checking integrity is redundant just because drives have error correction. The world is complicated; hardware breaks all the time. Recovery is the goal. – hekevintran – 2014-12-05T18:39:38.190

@hekevintran, the way you stated the question it was purely about detection, not recovery. If you want recovery too, then you want par2, which can generate extra parity that can be used to correct some errors. – psusi – 2014-12-06T00:48:52.587
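How extra parity makes repair possible can be shown with a deliberately simplified sketch. par2 itself uses Reed-Solomon codes and can rebuild several damaged blocks; the example below swaps in plain XOR parity, which can rebuild only one missing block, and it assumes the damaged block has already been identified (for instance by a checksum). The names `xor_blocks` and `recover` are hypothetical.

```python
from functools import reduce
from typing import List, Optional


def xor_blocks(blocks: List[bytes]) -> bytes:
    """XOR equal-length blocks together, byte position by byte position."""
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("need at least one block, all of the same length")
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def recover(blocks: List[Optional[bytes]], parity: bytes) -> List[bytes]:
    """Rebuild at most one missing block (marked None) from the parity block."""
    missing = [i for i, block in enumerate(blocks) if block is None]
    if len(missing) > 1:
        raise ValueError("single XOR parity can rebuild only one block")
    repaired = list(blocks)
    if missing:
        survivors = [block for block in blocks if block is not None]
        # XOR of the parity with all surviving blocks yields the lost block.
        repaired[missing[0]] = xor_blocks(survivors + [parity])
    return repaired
```

The parity block is created once with `xor_blocks(data_blocks)` and stored alongside the data, at the cost of one extra block of space.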