A server of mine had a drive failure of some sort which caused the OS (CentOS 5) to crash and stop working (it refuses to boot).

So we installed another drive with a working OS, and from there we are trying to mount the partitions on the old drive.

Most partitions mount fine except for one: the /var partition, where my MySQL tables reside.
When I try to mount that one, I see these errors with dmesg:

sd 0:0:1:0: Unhandled sense code
sd 0:0:1:0: SCSI error: return code = 0x08100002
Result: hostbyte=invalid driverbyte=DRIVER_SENSE,SUGGEST_OK
sdb: Current: sense key: Medium Error
Add. Sense: Unrecovered read error

Info fld=0x4a47e
JBD: Failed to read block at offset 9863
JBD: recovery failed
EXT3-fs: error loading journal.

Is there a way I can recover the data in that partition?


EDIT:
As requested, the output of tune2fs -l /dev/sdb2 is:

tune2fs 1.39 (29-May-2006)
Filesystem volume name:   /var1
Last mounted on:          <not available>
Filesystem UUID:          d84f5181-24f3-40ce-9eaa-601ae5ae33bd
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal ext_attr resize_inode dir_index filetype needs_recovery sparse_super large_file
Default mount options:    user_xattr acl
Filesystem state:         clean
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              26214400
Block count:              26214063
Reserved block count:     1310703
Free blocks:              25127226
Free inodes:              26213665
First block:              0
Block size:               4096
Fragment size:            4096
Reserved GDT blocks:      1017
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         32768
Inode blocks per group:   1024
Filesystem created:       Thu May 13 18:14:28 2010
Last mount time:          Thu Nov 29 12:52:00 2012
Last write time:          Wed Mar 27 20:29:28 2013
Mount count:              15
Maximum mount count:      -1
Last checked:             Thu May 13 18:14:28 2010
Check interval:           0 (<none>)
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:               128
Journal inode:            8
Default directory hash:   tea
Directory Hash Seed:      35f38c48-3933-4c99-bde2-63b0eccf200d
Journal backup:           inode blocks

EDIT 2:
As suggested by @Hartmut, I ran fsck.ext3 /dev/sdb2 with the following result:

e2fsck 1.39 (29-May-2006)
/var1: recovering journal
/var1: Attempt to read block from filesystem resulted in short read while reading block 11931

JBD: Failed to read block at offset 9863
fsck.ext3: No such device or address while trying to re-open /var1
e2fsck: io manager magic bad!
GetFree

3 Answers

It appears that your hard drive has had a physical failure, and that it has affected a block containing the ext3 journal.

You will need a second blank hard drive, at least as large as the failed drive partition, to perform any sort of recovery of this disk. You will also need a destination to copy recovered files to, so let's call it a third blank hard drive, network file share, etc.

The general recovery process is going to be:

  1. Copy the failed partition to the new drive using dd conv=noerror,sync (sync pads unreadable blocks so that later data stays at the right offset) or, better, dd_rescue. This may take some time.

  2. Perform all further operations on the copy. Here I assume that you have copied /dev/sdb2 to /dev/sdc2 and that you will recover files to /dev/sdd2.

  3. Since the journal is corrupt, we will remove it:

    tune2fs -O ^has_journal /dev/sdc2
    
  4. Now complete an fsck of the device. This may take some time.

    e2fsck /dev/sdc2
    
  5. Mount the filesystem read-only and attempt to recover files.

    mount -o ro /dev/sdc2 /mnt/baddrive
    mount /dev/sdd2 /mnt/recoveredfiles
    cp -av /mnt/baddrive/* /mnt/recoveredfiles
    
  6. In no case should you ever use the original disk again. Replace it (under warranty, if it is still under warranty).
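
The steps above can be sketched as a small script. This is only a sketch under the same assumptions as the list (`/dev/sdb2` is the failing partition, `/dev/sdc2` the copy, `/dev/sdd2` the destination); the function name `recovery_plan` and the `bs=4096` block size are my own choices. It deliberately prints the commands instead of running them, so each one can be reviewed and executed by hand.

```shell
#!/bin/sh
# Print-only sketch of the recovery sequence. Nothing here touches a disk.
# Device names and mount points are assumptions; adjust before use.
SRC=/dev/sdb2     # failing partition, only ever read from
COPY=/dev/sdc2    # blank partition that receives the image
DEST=/dev/sdd2    # filesystem that receives the recovered files

show() {
    # Print one command line of the plan.
    echo "$*"
}

recovery_plan() {
    # Step 1: image the partition; noerror keeps going after read errors,
    # sync pads unreadable blocks so offsets in the copy stay aligned.
    show dd if="$SRC" of="$COPY" bs=4096 conv=noerror,sync
    # Step 3: drop the damaged journal, on the copy only.
    show tune2fs -O '^has_journal' "$COPY"
    # Step 4: check the copy.
    show e2fsck "$COPY"
    # Step 5: mount the copy read-only and copy the files off.
    show mkdir -p /mnt/baddrive /mnt/recoveredfiles
    show mount -o ro "$COPY" /mnt/baddrive
    show mount "$DEST" /mnt/recoveredfiles
    show cp -av /mnt/baddrive/. /mnt/recoveredfiles/
}

recovery_plan
```

The `cp` line copies `/mnt/baddrive/.` rather than `/mnt/baddrive/*` so that hidden files at the top level are included as well.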

Michael Hampton
  • I would make this: you need a second and third drive. And add step 1.5: copy this to the third drive so you can mess up the first copy and restore it from the second copy. You do not want to rely on being able to copy it off the bad disk again. – Dennis Kaarsemaker Mar 28 '13 at 16:24
  • Good advice, but I also have the feeling that the OP doesn't have that many disks laying around. As always, caution is advised. – Michael Hampton Mar 28 '13 at 16:28
  • @GetFree, another way to mount a filesystem without touching its journal is to pass `noload` to `mount`, like this: `mount -o ro,noload ` – kostix Sep 30 '13 at 15:06
  • It's important to note that before modifying the `has_journal` flag you might need to clear the `needs_recovery` flag. – Nick Chapman Mar 03 '16 at 01:58

Did you try mounting it as ext2 filesystem with mount -t ext2 ... ? ext3 is backward compatible with ext2, so it should just ignore the journal that seems to be broken. It's not an ideal solution, but it may let you access some data if you're lucky!

fab4am
  • I've just tried that but I get the error `EXT2-fs: sdb2: couldn't mount because of unsupported optional features (4).` – GetFree Mar 28 '13 at 09:11
  • Please post the output of `tune2fs -l /dev/sdb2` – etagenklo Mar 28 '13 at 14:00
  • @etagenklo, there, I updated the question. – GetFree Mar 28 '13 at 14:51
  • Please post the command you used to mount the FS as ext2. Also, did you try `fsck.ext3` on this partition? – Hartmut Mar 28 '13 at 14:56
  • @Hartmut, I used the command `mount -t ext2 /dev/sdb2 /mnt/sdb2`. Also, I posted the results of `fsck.ext3` in the question. – GetFree Mar 28 '13 at 15:48
  • You can disable the filesystem journal using `tune2fs -O ^has_journal /dev/sdb2`. Afterwards, you can try to mount it again using `mount -t ext2 /dev/sdb2 /mnt/sdb2` or alternatively try mounting it as ext3 again: `mount /dev/sdb2 /mnt/sdb2`. – etagenklo Mar 28 '13 at 17:43
  • The drive disappeared from the `fdisk -l` list. I believe that happened after running fsck.ext3. I rebooted the machine, but the drive still does not appear and I don't know how to make it appear again. – GetFree Mar 28 '13 at 23:38
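
A note on the `unsupported optional features (4)` error in this thread: as far as I know, the number is a hexadecimal mask of "incompat" feature bits that the ext2 driver refuses to handle, and bit 0x4 is the one `tune2fs` reports as `needs_recovery`. A minimal sketch of decoding such a mask follows; the function name `decode_incompat` is hypothetical, and only the first five incompat bits of the ext2/ext3 on-disk format are listed.

```shell
#!/bin/sh
# Decode the hex mask printed in "unsupported optional features (N)"
# into ext2/ext3 incompat feature names (first five bits only).
decode_incompat() {
    mask=$(( 0x$1 ))
    names=""
    for pair in 1:compression 2:filetype 4:needs_recovery 8:journal_dev 16:meta_bg; do
        bit=${pair%%:*}      # numeric bit value
        name=${pair#*:}      # feature name as shown by tune2fs
        if [ $(( mask & bit )) -ne 0 ]; then
            names="$names $name"
        fi
    done
    # Strip the leading space before printing.
    echo "${names# }"
}

decode_incompat 4
```

For the mask in the error above this prints `needs_recovery`, which fits the feature list in the question: the journal was never replayed, so ext2 declines to mount the filesystem.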

It is possible that the filesystem's superblock has been corrupted. You can follow the steps below to recover using a backup superblock.

# dumpe2fs /dev/sdb2 | grep -i superblock

Sample output :

Primary superblock at 0, Group descriptors at 1-6
Backup superblock at 32768, Group descriptors at 32769-32774
Backup superblock at 98304, Group descriptors at 98305-98310
Backup superblock at 163840, Group descriptors at 163841-163846
Backup superblock at 229376, Group descriptors at 229377-229382
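
If `dumpe2fs` itself cannot read the device, the backup locations can be computed by hand. The sketch below assumes the `sparse_super` feature (listed in the question's `tune2fs` output), under which backups live in block group 1 and in groups whose number is a power of 3, 5 or 7, and a first block of 0. The function name `backup_superblocks` is my own; the two arguments are the "Blocks per group" and "Block count" values from the question.

```shell
#!/bin/sh
# List backup superblock locations for an ext2/ext3 filesystem with
# sparse_super: group 1 plus groups that are powers of 3, 5 or 7.
# Assumes first block 0 (the case for block sizes above 1 KiB).
backup_superblocks() {
    blocks_per_group=$1
    block_count=$2
    # Number of block groups, rounded up.
    groups=$(( (block_count + blocks_per_group - 1) / blocks_per_group ))
    {
        echo 1
        for base in 3 5 7; do
            g=$base
            while [ "$g" -lt "$groups" ]; do
                echo "$g"
                g=$(( g * base ))
            done
        done
    } | sort -n | while read -r g; do
        if [ "$g" -lt "$groups" ]; then
            # Group number times group size gives the block number.
            echo $(( g * blocks_per_group ))
        fi
    done
}

backup_superblocks 32768 26214063
```

The first four values printed are 32768, 98304, 163840 and 229376, matching the sample output above.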

You can either fsck the partition using an alternative superblock, or mount the partition with an alternative superblock without running fsck on the filesystem.

To check the filesystem using a backup superblock:

# fsck.ext3 -b 32768 /dev/sdb2

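To mount the filesystem with an alternative superblock (the `sb=` mount option is counted in 1 KiB units, so block 32768 on a filesystem with 4 KiB blocks becomes `sb=131072`):

# mount -o sb={alternative-superblock} /dev/device /mnt
# mount -o sb=131072 /dev/sdb2 /mnt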

Then try to browse the files.
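
One detail that is easy to get wrong: `dumpe2fs` reports superblock locations in filesystem blocks, while mount's `sb=` option expects 1 KiB units. A minimal sketch of the conversion, assuming the 4096-byte block size shown in the question's `tune2fs` output (the helper name `sb_option` is hypothetical):

```shell
#!/bin/sh
# Convert a superblock location in filesystem blocks (as printed by
# dumpe2fs) into the 1 KiB units that mount's sb= option expects.
sb_option() {
    block=$1          # superblock location in filesystem blocks
    block_size=$2     # filesystem block size in bytes
    echo $(( block * block_size / 1024 ))
}

sb=$(sb_option 32768 4096)
# Only print the resulting command; run it by hand after review.
echo "mount -o ro,sb=$sb /dev/sdb2 /mnt"
```

This prints `mount -o ro,sb=131072 /dev/sdb2 /mnt`; the `ro` option is my addition so that a damaged filesystem is not written to.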

Amit Biyani
  • What would be the most recent superblock backup? the first one in the list or the last one? – GetFree Mar 29 '13 at 01:50
  • AFAIK, these are all the same unless the whole HDD is corrupt. Also, recovery sometimes depends on which filesystem you are using: ext2, ext3 or ext4. [SEE HERE](http://sandeepbhalla.com/2012/09/06/superblock-read-error-diagnosing-correct-ext-file-system-surviving-fsck-ext3-ext4-messages-in-precise-pangolin-12-04/) – Amit Biyani Mar 30 '13 at 05:26
  • Also, you should try running testdisk on that filesystem to see further details. – Amit Biyani Mar 30 '13 at 05:27
  • The disk had ext3 partitions, but it's gone now, it suffered some kind of physical failure that damaged several sectors. – GetFree Apr 01 '13 at 08:31