Analyse allocated space in Ext4 partition to improve data recovery efficiency

I'm trying to use Lubuntu to recover as much data as possible from a failing 4TB hard disk drive.

According to GParted the main partition, formatted in Ext4, contains only 553GB of data. I attempted to make a full clone with HDDSuperClone, a GUI tool for Linux similar in purpose and functionality to the command-line tool ddrescue. At first it was working well, with few errors / skipped sectors and a good average copy rate (~60MB/s). But about midway through it started to have more severe issues, with some areas not being read at all, forming a pattern of alternating stripes of good reads and bad reads, which typically indicates that one head is defective. At that point I stopped the recovery.

I had recovered about 1.7TB, and it had been copying only 00s for quite a while, so I thought that all the relevant data would already be secured on the recovery drive. But it turns out that the main partition cannot be mounted (while it can still be mounted on the source drive, albeit with difficulty), and reputable data recovery software (R-Studio, DMDE) cannot reconstruct the original directory structure or retrieve original file names. And when opening the recovery drive in WinHex I can see that it is totally empty beyond 438GB, which would mean that about 115GB are missing. I don't understand how that is possible, as filesystems are supposed to write data to the outermost areas available, where the reading / writing speed is better, to optimize the performance of HDDs.

Now, to get the most out of what's left, considering that the drive's condition might deteriorate quickly at the next serious recovery attempt, I'm looking for any method that could analyse the metadata structures and report the allocated / unallocated space, so that I could target the recovery at those relevant areas instead of wasting precious time reading gigabytes of zeroes. A little command-line program developed some years ago by the author of HDDSuperClone, ddru_ntfsbitmap (part of ddrutility), can do this automatically for NTFS partitions: it analyses the $Bitmap file and generates a “mapfile” for ddrescue which effectively restricts the copy to the sectors marked as allocated (provided that this system file can be read in its entirety). It can also generate a “mapfile” to recover the $MFT first, which is tremendously useful: the MFT contains all the files' metadata and directory structure information, and if it's corrupted or lost, only “raw file carving” recovery is possible. But even this highly competent individual doesn't know how to do the same with Linux partitions, as he replied on this HDDGuru thread.
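
To make the request more concrete, here is the kind of thing I'm imagining for Ext4. This is a rough, untested sketch, not a known solution: it assumes that dumpe2fs can still read the source partition's metadata, that each per-group “Free blocks:” list fits on a single line in the dumpe2fs output, and that the clone is made partition-to-partition, so that the offsets are relative to the start of /dev/sdb4.

    # Rough sketch: turn dumpe2fs's per-group "Free blocks:" lists into a
    # ddrescue "domain mapfile", so the copy is restricted to allocated areas.
    # gawk on purpose: mawk may truncate the large hexadecimal offsets.
    sudo dumpe2fs /dev/sdb4 | gawk '
      /^Block size:/  { bs = $3 }     # bytes per filesystem block
      /^Block count:/ { bc = $3 }     # total number of blocks
      /^  Free blocks: / {            # per-group list (adjust indent if needed)
          if (!started) { print "0x00000000     +"; started = 1 }  # status line
          sub(/^  Free blocks: /, "")
          n = split($0, list, ", ")
          for (i = 1; i <= n; i++) {
              if (list[i] == "") continue
              m = split(list[i], ab, "-")
              a = ab[1] + 0
              b = (m == 2) ? ab[2] + 0 : a
              # everything between the previous free range and this one is allocated
              if (a > pos) printf "0x%x  0x%x  +\n", pos * bs, (a - pos) * bs
              printf "0x%x  0x%x  ?\n", a * bs, (b - a + 1) * bs   # free: to skip
              pos = b + 1
          }
      }
      END {
          if (!started) print "0x00000000     +"
          if (pos < bc) printf "0x%x  0x%x  +\n", pos * bs, (bc - pos) * bs
      }' > allocated.map

    # Then, in principle, something like this (destination name is a placeholder):
    #   ddrescue --domain-mapfile=allocated.map /dev/sdb4 /dev/sdX4 rescue.map

(If the clone were made of the whole device rather than of the partition, the partition's starting byte offset would have to be added to every position.)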

So, even if it's not fully automated, I would need a procedure that could analyse an Ext4 partition, quickly and efficiently so as not to wear the drive further in the process, and report that information either as a text log or as a graphical representation. Perhaps a defragmentation program would do the trick?
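
From what I've gathered so far, dumpe2fs might be a starting point (assuming, and I'd welcome confirmation, that it only reads the superblock, group descriptors and allocation bitmaps, and never touches the data area):

    # Superblock summary only: block size, block count, free block count...
    sudo dumpe2fs -h /dev/sdb4 > superblock.txt

    # Full group-by-group report, including each group's free block ranges:
    sudo dumpe2fs /dev/sdb4 > groups.txt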

And generally speaking, where are the important metadata structures (“inodes” if I'm not mistaken) located on a Linux partition? Is there a single file equivalent to NTFS $Bitmap, or is the information about file / sector allocation determined through a more complex analysis? (If that's relevant, the drive was in a WDMyCloud network enclosure, factory configured and running with a Linux operating system.)
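
In case it helps, here is how I would try to locate those structures with dumpe2fs, if my understanding of the on-disk layout is correct (one block bitmap, one inode bitmap and one inode table per block group, rather than a single $Bitmap-like file):

    # List where each block group keeps its block bitmap, inode bitmap and inode table:
    sudo dumpe2fs /dev/sdb4 | grep -E '^Group |Block bitmap at|Inode bitmap at|Inode table at'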

GParted analysis of the source drive.

At about 47% of the recovery, serious issues start to appear: the alternating stripes of good reads (green) and bad reads (grey) indicate that one head has failed.

GabrielB

Posted 2019-05-26T16:50:25.230

Reputation: 598

Does at least tune2fs -l /dev/sdb4 or even debugfs -R "show_super_stats" /dev/sdb4 recognize the filesystem? – user1686 – 2019-05-26T16:59:35.763

Use ddrescue, or gddrescue if you really need a GUI. It looks like this tool has just managed to waste a lot of your time and effort, and possibly damaged the drive beyond repair. (Or refer to a recovery specialist if the data is that valuable.) – djsmiley2k TMW – 2019-05-26T17:29:44.667

@djsmiley2k What makes you say that? The same thing would most likely have happened if I had done the cloning with ddrescue. It happens. And it's not wasted, since I recovered 438GB (which I can scan with a file signature search; it seems to contain the most valuable files already). The drive is not mine; I told the owner about the risks of proceeding further without replacing the defective head(s), and he replied that he would not pay the hefty price of a full-blown recovery service, and that I could go on and try whatever I can to improve the current result, which is already satisfying. – GabrielB – 2019-05-26T18:28:54.077

The way I read your question, I thought you had almost no data that was recoverable. If you're now asking about recovering any more data, you might try putting the disk in a sealed plastic bag in the freezer overnight. In this case I hope the tool you used supports resuming? – djsmiley2k TMW – 2019-05-26T18:30:43.797

@djsmiley2k Yes it does; it works with a log file just like ddrescue (and can import ddrescue logs, or export the logs it produces in ddrescue format, if one wishes to continue the recovery with that tool instead, or use ddrescueview to visualize the status; this can be done directly with HDDSCViewer, which I used to make the screenshot above). As for the “freezer trick”, from what I read it used to make some sense in some situations with early low-capacity HDDs, but it is dubious at best, or even foolish, with current designs, which reach an extremely high degree of data density and sensitivity. – GabrielB – 2019-05-26T22:08:12.153

@grawity The sdb4 partition from the source drive can still be mounted; although it hangs a lot, I can explore the directories (hence the related question about trying to at least copy the empty file tree). It's the partition on the recovery drive that can't be mounted / explored / analysed. Apparently important metadata files are missing in the recovery, yet still accessible in the source; I just don't know where. Is it possible that they're located beyond 1.7TB, with that much totally empty space before? Likewise, is it possible that ~115GB of allocated data was written near the end? – GabrielB – 2019-05-26T22:25:58.243

No answers