I have a Debian Lenny server with a SAN LUN configured as the only PV of a VG named 'datavg'.

Yesterday I applied the latest Debian patches to the box and rebooted it.

After the reboot, the server failed to boot, reporting that it could not find /dev/mapper/datavg-datalv.

This is what I did:
- booted into rescue mode and commented out the mount in /etc/fstab
- rebooted into multi-user mode (the mountpoint is /data, so only postgresql could not start)
- ran vgdisplay, lvdisplay and pvdisplay to find out what had happened to the volume group (datavg was missing entirely)

After that, I noticed that the LUN is visible from Linux and that the LVM partition is also visible:

# ls -la /dev/mapper/mpath0*
brw-rw---- 1 root disk 254, 6 2009-11-23 15:48 /dev/mapper/mpath0
brw-rw---- 1 root disk 254, 7 2009-11-23 15:48 /dev/mapper/mpath0-part1


- Then I ran pvscan to see whether it could find the PV. Unfortunately, it did not detect the partition as a PV.
- I ran pvck on the partition, but it did not find any label:

# pvck /dev/mapper/mpath0-part1 
  Could not find LVM label on /dev/mapper/mpath0-part1


- Then I wondered whether the LUN was perhaps empty, so I used dd to dump the first few MB. In the dump I could see the LVM metadata:

datavg {
id = "removed-hwEK-Pt9k-Kw4F7e"
seqno = 2
status = ["RESIZEABLE", "READ", "WRITE"]
extent_size = 8192
max_lv = 0
max_pv = 0

physical_volumes {

pv0 {
id = "removed-AfF1-2hHn-TslAdx"
device = "/dev/dm-7"

status = ["ALLOCATABLE"]
dev_size = 209712382
pe_start = 384
pe_count = 25599
}
}

logical_volumes {

datalv {
id = "removed-yUMd-RIHG-KWMP63"
status = ["READ", "WRITE", "VISIBLE"]
segment_count = 1

segment1 {
start_extent = 0
extent_count = 5120

type = "striped"
stripe_count = 1        # linear

stripes = [
"pv0", 0
]
}
}
}
}

Note that this came from the partition where pvck could not find an LVM label!
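
For reference, the numbers in this metadata are counted in 512-byte sectors, so they can be converted with a little shell arithmetic. A small sketch using only the values from the dump above (the variable names are just for illustration):

```shell
# Values copied from the metadata dump above.
# LVM2 text metadata expresses extent_size and pe_start in 512-byte sectors.
sector_bytes=512
extent_size=8192      # sectors per extent
pe_start=384          # offset of the first extent from the start of the PV, in sectors
pe_count=25599        # number of extents in pv0
extent_count=5120     # number of extents in datalv

extent_mib=$(( extent_size * sector_bytes / 1024 / 1024 ))
pv_data_mib=$(( pe_count * extent_mib ))
lv_mib=$(( extent_count * extent_mib ))
lv_start_bytes=$(( pe_start * sector_bytes ))

echo "extent size:   ${extent_mib} MiB"
echo "PV data area:  ${pv_data_mib} MiB"
echo "datalv size:   ${lv_mib} MiB"
echo "datalv begins: ${lv_start_bytes} bytes into the PV"
```

This gives 4 MiB extents, a PV data area of 102396 MiB and a 20480 MiB (20 GiB) datalv. Because segment1 starts at extent 0 of pv0, the LV data should begin 196608 bytes (192 KiB) into the PV.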


- I decided to write a new LVM label to the partition and restore the parameters from the backup file.

pvcreate --uuid removed-AfF1-2hHn-TslAdx --restorefile /etc/lvm/backup/datavg  /dev/mapper/mpath0-part1


- Then I ran vgcfgrestore -f /etc/lvm/backup/datavg datavg
- After that, the PV appeared in the pvscan output.
- With vgchange -ay datavg, I activated the VG and the LV became available.
- When I tried to mount the LV, mount did not find any filesystem. I tried recovery in several ways, but did not succeed.
- After making a dd copy of the affected LV, I tried to recreate the superblocks on the copy with

mkfs.ext3 -S /dev/datavg/backupdatalv


- but the result of this cannot be mounted:

# mount /dev/datavg/backupdatalv /mnt/
mount: Stale NFS file handle
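
One read-only way to check whether an ext2/ext3 superblock is present at a given place is to look for the ext magic number. A rough sketch, assuming the usual ext2/ext3 layout (primary superblock 1024 bytes into the filesystem, with the 16-bit magic 0xEF53 stored little-endian at offset 56 of the superblock); the function name has_ext_magic is made up for this example:

```shell
# has_ext_magic FILE START_SECTOR
# Succeeds if the two bytes at START_SECTOR*512 + 1024 + 56 in FILE are
# 0x53 0xEF, i.e. the ext2/ext3 superblock magic in little-endian order.
# Only reads from FILE; nothing is written.
has_ext_magic() {
    img=$1
    start_sector=$2
    offset=$(( start_sector * 512 + 1080 ))
    magic=$(od -An -tx1 -j "$offset" -N 2 "$img" 2>/dev/null | tr -d ' \n')
    [ "$magic" = "53ef" ]
}
```

For example, has_ext_magic /dev/datavg/datalv 0 would test the start of the activated LV, and has_ext_magic /dev/mapper/mpath0-part1 384 would test the place where the restored metadata (pe_start = 384) says datalv begins inside the partition. If neither matches, the filesystem probably does not start where the restored PV layout expects it.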

The fact that this can happen in the first place is not very nice to say the least, so I want to find out everything I can about this malfunction.

My questions:
- How can it be that the LVM label disappears after patches and a reboot?
- Why is the filesystem not there after salvaging the PV? (Did the pvcreate command trash the data?)
- Is the ext3 filesystem in the LV still salvageable?
- Is there anything I could have done to prevent this issue?

Thanks in advance, Ger.

Ger Apeldoorn

1 Answer

I once ran into a similar problem. In our case, someone created a partition to hold the PV, but when they ran the pvcreate command, they forgot to specify the partition and instead used the whole device. The system ran fine until a reboot, when LVM could no longer find the PV.

So in your case, is it possible that someone ran "pvcreate /dev/mapper/mpath0" at the time of creation rather than "pvcreate /dev/mapper/mpath0-part1"? If so, you'll need to remove the partition table from the disk containing the PV.

The pvcreate(8) man page gives this command for deleting a partition table:

dd if=/dev/zero of=PhysicalVolume bs=512 count=1

The LVM tools will not recognize a whole-device PV if there is a partition table on the device. Once we removed the partition table, the PV was recognized and we could access our data again.
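
To see whether the label sits on the whole device or on the partition, one can look for the LVM2 label signature directly. A minimal read-only sketch, assuming the standard LVM2 on-disk format, where the label sector begins with the ASCII string LABELONE and lives in one of the first four 512-byte sectors of the PV (normally sector 1); find_lvm_label is a hypothetical helper name, not an LVM command:

```shell
# find_lvm_label FILE_OR_DEVICE
# Prints the number (0-3) of the first sector that starts with "LABELONE"
# and returns 0, or prints nothing and returns 1 if no label is found.
# Only reads from the device; nothing is written.
find_lvm_label() {
    dev=$1
    for sector in 0 1 2 3; do
        # Take the first 8 bytes of this sector; strip NUL bytes so the
        # shell does not complain about them in the command substitution.
        sig=$(dd if="$dev" bs=512 skip="$sector" count=1 2>/dev/null \
              | head -c 8 | tr -d '\0')
        if [ "$sig" = "LABELONE" ]; then
            echo "$sector"
            return 0
        fi
    done
    return 1
}
```

Running find_lvm_label /dev/mapper/mpath0 and find_lvm_label /dev/mapper/mpath0-part1 as root and comparing the results would show where the label lives. A hit on the whole device but not on the partition would match the situation described above. Before zeroing the first sector, it is worth saving a copy of it, for example with dd if=/dev/mapper/mpath0 of=mpath0-sector0.bin bs=512 count=1.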

Shane Meyers
  • Thanks Shane, that could be what happened. Unfortunately, I had to reinitialize the LUN so we will never find out for sure. But I am happy that I do have an explanation. Thanks again, Ger. – Ger Apeldoorn Jan 23 '10 at 16:09