
I have a sw raid10 setup like: /dev/sdX > partition /dev/sdX1 > raid md0 > lvm > fs. I'm an idiot, and wrote 4gb of 0s to /dev/md0 with dd

dd if=/dev/zero of=/dev/md0 bs=4k count=1000000 conv=fdatasync
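For scale, the amount that command overwrote works out to roughly 4 GB:

```shell
# bs=4k × count=1000000: total bytes written by the dd command above
echo $((4096 * 1000000))    # 4096000000 bytes ≈ 4.1 GB (≈ 3.8 GiB)
```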

Anything I can do to recover data?

I tried restoring backup superblocks as per this link, but when I pick a backup block I get

e2fsck 1.42.5 (29-Jul-2012)
e2fsck: Invalid argument while trying to open /dev/md0

The superblock could not be read or does not describe a correct ext2
filesystem.  If the device is valid and it really contains an ext2
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
  e2fsck -b 8193 <device>

I do have most of the data backed up, and anything that isn't is not important. I just don't want to have to copy over a few hundred gb of data when I think there's a way to restore the partition table.

  • Restoring the partition table isn't going to help. You've lost the first 4GB of your data, partition table, OS etc... bye bye data. Time to do a full restore. – hookenz Jun 30 '14 at 04:01

1 Answer


If your description of the scenario is accurate, then your partition table is unharmed: dd wrote to /dev/md0, which lives inside the partition, not to /dev/sdX itself. You may, however, have lost everything else on the disk.
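Because the write went to md0, the partition tables on the member disks sit below the damaged layer and can be confirmed intact directly (a sketch; `sdX` is a placeholder for your actual RAID member disks):

```shell
# The partition table lives on the raw disk, below the RAID device that
# dd wrote to. Check each member disk (replace sdX with sda, sdb, ...).
fdisk -l /dev/sdX
```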

Your attempt to use fsck is failing because you are not taking the LVM layer into account. When you wrote to md0, you didn't overwrite the start of the ext2 file system; you overwrote the start of the LVM physical volume.
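Since md0 contains an LVM physical volume rather than a filesystem, any fsck has to be pointed at the logical volume device, not at md0 itself. A hypothetical sketch (the volume group name `vg0` and LV name `data` are placeholders; yours will differ):

```shell
pvs             # does md0 still show up as a physical volume?
vgscan          # rescan for volume groups
vgchange -ay    # activate whatever volume groups were found

# Run a read-only check against the logical volume, NOT /dev/md0.
fsck.ext4 -n /dev/vg0/data
```

With the PV header overwritten, pvs will most likely show nothing here, which is itself confirmation of where the damage sits.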

Recovering data in this case requires intimate knowledge of both LVM and ext2 (whether it is really ext2, ext3, or ext4 probably doesn't matter; they are similar enough).

I don't know exactly where LVM keeps its metadata, but much of it is likely at the start of the media, and it is almost certainly far smaller than the 4GB you overwrote. Some of those 4GB therefore probably covered the start of your ext2 file system as well, so you may be missing metadata from both layers.
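If you do eventually attempt recovery, note that LVM keeps plain-text backups of its metadata under /etc/lvm/backup and /etc/lvm/archive on the host, which can be used to recreate the PV and VG structure. A sketch, assuming those files survived (the VG name, archive filename, and UUID below are all made-up placeholders):

```shell
# List archived metadata versions for the volume group.
vgcfgrestore --list vg0

# Recreate the PV label that dd destroyed, reusing the UUID recorded in
# the archived metadata (the UUID here is a placeholder).
pvcreate --restorefile /etc/lvm/archive/vg0_00001-1234567890.vg \
         --uuid "aaaaaa-bbbb-cccc-dddd-eeee-ffff-gggggg" /dev/md0

# Put the VG metadata back on top of the recreated PV and activate it.
vgcfgrestore -f /etc/lvm/archive/vg0_00001-1234567890.vg vg0
vgchange -ay vg0
```

This only restores the LVM layer; the first chunk of the ext2 file system inside it is still gone.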

My recommendation is not to touch the corrupted disk yet. Instead, start restoring onto a new disk. Once you have restored, figure out whether you are still missing anything. If the backup turns out to have a problem, you can then start looking into recovering the corrupted disk.

The restore may take a while, but it is probably going to be a lot faster than any recovery attempt from the corrupted disk. And you can do something productive while waiting for the restore.

kasperd
  • Note that with 4 GB of data overwritten, too much is likely to be lost. Even in the case of a more or less successful recovery in terms of being able to mount the filesystem and list the directories (which probably would have lost their names), there is not much to tell corrupt files (i.e. ones which have blocks within the first 4 GB of the disk) apart from intact ones. The only reasonable way to get *everything* back is to restore from a backup. – the-wabbit Jun 29 '14 at 22:20
  • @the-wabbit Figuring out which files are corrupted is not the hardest part. First create a RAID-6 array from brand new disks for the recovery process. Having four times the size of the file system you are trying to recover gives about the right amount of room to work with. Start by creating a raw image of everything that can be physically read and make the file immutable. Then create a copy where missing sectors are overwritten with all zeros, and another copy where missing sectors are overwritten with all ones. Identical files recovered from the two are unlikely to be corrupt. – kasperd Jun 29 '14 at 22:43
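The imaging steps in the comment above could be sketched with GNU ddrescue (assumptions: ddrescue is available, the unreadable areas are tracked in its map file, and all file names here are placeholders):

```shell
# 1. Image everything physically readable; the map file records the gaps.
ddrescue -d /dev/md0 rescue.img rescue.map
chattr +i rescue.img                 # make the raw image immutable

# 2. Two working copies: gaps filled with zeros in one, with 0xFF in the other.
cp rescue.img zeros.img; cp rescue.map zeros.map
ddrescue --fill-mode=- /dev/zero zeros.img zeros.map

tr '\0' '\377' < /dev/zero | head -c 1M > ones.bin   # small block of 0xFF bytes
cp rescue.img ones.img; cp rescue.map ones.map
ddrescue --fill-mode=- ones.bin ones.img ones.map

# 3. Recover files from both images separately; a file that comes out
#    identical from both copies did not overlap any missing sector.
```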