
Bit-rot is very real: having caught one example, I now face an HFS+ filesystem investigation - suggestions welcome.

Dear all, I'm looking for advice, please, about next steps, having found a single-bit-flip difference between two ~1.5 TB HFS+ disk images.

Years ago I decided I really ought to look after my piles of old (mainly Mac-based) legacy data properly, and I started planning a ZFS-hosted archive. As my ageing drive collection grew, I kept files replicated in a messy, ad-hoc way on multiple drives (to mitigate against physical drive failure). Recently I finally got around to setting up a ZFS array and managed to copy the first of the old archive data over.

Perhaps unwisely, and impatient to recover some spare space, I started to delete a few obvious duplicate files after comparing size, modification date, etc. One of my duplicated files was an old disk image of an earlier 1.5 TB drive and, on a whim, I thought I would test the speed of the ZFS array and double-check there had been no corruption in the original files: I md5-summed both array-hosted copies and compared them against each other before deleting what I assumed would be one of two identical copies...
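For concreteness, the check amounted to something like the following sketch (the file names are stand-ins, and the tiny demo files just reproduce a one-bit difference; on the real array the paths would be the two 1.5 TB images):

```shell
# Two stand-in "images" differing by a single bit: octal 022 (00010010)
# vs octal 222 (10010010) at byte 5.
printf 'AAAA\022AAAA' > copy_a.img
printf 'AAAA\222AAAA' > copy_b.img

# Hash both copies; a single flipped bit changes the digest completely.
md5sum copy_a.img copy_b.img

if [ "$(md5sum < copy_a.img)" = "$(md5sum < copy_b.img)" ]; then
    echo "hashes match"
else
    echo "hashes differ"
fi
```

(On macOS the command is `md5` rather than GNU `md5sum`, but the idea is the same.)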

Well, I was surprised when the two 1.5 TB files, of the same byte length, modification dates, etc., had different hashes! Uh-oh..

I then did a byte-wise comparison of the two files (using cmp -l) and found a single mismatching byte, reported with the two octal values 022 and 222 - meaning a single-bit difference between the two 1.5 TB HFS+ filesystem images (octal 022 is 00010010 in binary; octal 222 is 10010010).
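A minimal reproduction of that step, assuming GNU cmp (it prints the 1-based byte offset of each mismatch followed by the two differing bytes in octal):

```shell
# Recreate a one-bit difference at byte 5 and locate it with cmp -l.
printf 'AAAA\022AAAA' > a.img
printf 'AAAA\222AAAA' > b.img

# cmp exits non-zero when the files differ, hence the `|| true`.
cmp -l a.img b.img || true
# prints something like:  5  22 222
```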

I have checked whether the disk images have an associated checksum attached (which Disk Utility under macOS can generate - that would have told me which image was OK, and I could just have deleted the faulty file), but neither had a checksum created at the time of imaging.

Now what? I'd like to identify the file within the filesystem images which contains the disputed bit - after all, it might be an unimportant file, or old 'free space', but it might also be non-file metadata. Merely attempting to read that file, with the images mounted read-only, might tell me which image is OK and which is not. But how do I map a byte offset in a disk image to a file on the filesystem?
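For what it's worth, the arithmetic half of such a mapping is easy; this sketch assumes a 4096-byte HFS+ allocation block size and no partition-table offset (both would need checking against the real images), and mentions The Sleuth Kit, which can read HFS+, for the block-to-file half:

```shell
# Hypothetical 1-based byte offset reported by cmp -l.
offset=1234567890
# Assumption: 4096-byte allocation blocks; the real value is in the
# volume header (Sleuth Kit's fsstat will report it). If the image
# contains a partition table, subtract the partition's start first.
block_size=4096

block=$(( (offset - 1) / block_size ))
echo "allocation block: $block"    # -> allocation block: 301408

# With The Sleuth Kit installed, the block could then be mapped to a
# file (illustration only, not run here):
#   ifind -f hfs -d "$block" image.img   # block -> catalog/inode number
#   ffind -f hfs image.img <inum>        # inode -> file path
```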

Any suggestions as to the best tools/way to find which file on that filesystem contains the particular byte offset, or any other approach I've missed? Perhaps there is an automated way of systematically comparing every file within a pair of mounted HFS+ images? I suppose it might be possible to script something to recurse over both, but if that doesn't give a clean hit then I'm still in the dark; whereas if I could map the specific byte location to a file, that ought to be quicker (and it would be interesting to know how).
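The brute-force fallback I have in mind would be something like this sketch (the directory names and demo tree are hypothetical; on the real system the two roots would be the read-only mounted images):

```shell
# Demo tree standing in for two mounted HFS+ images, with one file
# differing between them.
mkdir -p mnt_a/docs mnt_b/docs
printf 'same'     > mnt_a/docs/good.txt
printf 'same'     > mnt_b/docs/good.txt
printf 'original' > mnt_a/docs/flipped.txt
printf 'originaX' > mnt_b/docs/flipped.txt

# Hash every regular file under a root, with paths relative to it.
hash_tree() {
    ( cd "$1" && find . -type f -exec md5sum {} + | sort -k 2 )
}

hash_tree mnt_a > a.md5
hash_tree mnt_b > b.md5

# Any line that diff reports names a file whose content differs.
diff a.md5 b.md5 || true
```

(`diff -r -q` between the two mount points reports much the same thing, but the checksum lists can be kept afterwards as a record.)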

Many thanks!

M.

user73225
  • It may not be a file: it could be metadata or unused space, so comparing just the files may not help to find out what is affected by the bit-rot – gapsf Sep 23 '22 at 20:23
  • yes, I agree - I mention those possibilities in the post. I suppose that a brute-force file comparison might give me a chance, though: the filesystem was more than 50% full, so more than a 50% chance :-). But it would be nicer to have a mapping process from byte offset to file - presumably that requires some software which can logically interpret the filesystem layout? – user73225 Sep 24 '22 at 06:13
  • To compare the files on two mounted filesystems you may use diff -r – gapsf Sep 24 '22 at 06:29

0 Answers