
I am using rsnapshot on a Debian Wheezy server that was recently upgraded from Squeeze. Since the upgrade, I have been getting the following error from the hourly cron job:

remote rm -rf /share/HDA_DATA/backup/rsnapshot/hourly.3 p1=-rf p2=/backup/rsnapshot/hourly.3/
remote cp -al /share/HDA_DATA/backup/rsnapshot/hourly.0 /share/HDA_DATA/backup/rsnapshot/hourly.1 p1=-al p2=/backup/rsnapshot/hourly.0
  Logical volume "rsnapshot" successfully removed
  Logical volume "rsnapshot" successfully removed
  Unable to deactivate open raid5-dl-real (254:4)
  Failed to resume dl.
----------------------------------------------------------------------------
rsnapshot encountered an error! The program was invoked with these options:
/usr/bin/rsnapshot hourly 
----------------------------------------------------------------------------
ERROR: Removal of LVM snapshot failed: 1280

Two LVM volumes are backed up correctly and their "rsnapshot" snapshots are removed successfully, but when it gets to the dl volume in VG raid5 I get the "Unable to deactivate open raid5-dl-real" error.

My LVM snapshot is named raid5/rsnapshot. raid5-dl-real does not correspond to any volume name - the real device is /dev/mapper/raid5-dl.

So if this is the dl volume itself, why would lvm be trying to deactivate it?
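
For reference, here is how the device-mapper stack can be inspected while the snapshot exists (just a sketch using my VG/LV names, not output from the failing run). As far as I understand it, while a snapshot of dl exists, device-mapper keeps the origin's data under the hidden raid5-dl-real device and maps raid5-dl as a snapshot-origin on top of it:

# Show how raid5-dl and the snapshot are stacked on the hidden raid5-dl-real device
dmsetup ls --tree

# Table types: raid5-dl should be a "snapshot-origin", raid5-dl-real a plain "linear" mapping
dmsetup table raid5-dl
dmsetup table raid5-dl-real

# Per-device state, open count, and whether the device is suspended
dmsetup info raid5-dl-real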

Note that this was originally happening to an entirely different volume, so I removed that volume from the backup; now the problem has shifted to this one.

The rsnapshot log isn't very enlightening either:

[16/Jul/2013:17:26:26] /sbin/lvcreate --snapshot --size 512M --name rsnapshot /dev/raid5/dl
[16/Jul/2013:17:26:29] /bin/mount /dev/raid5/rsnapshot /mnt/lvm-snapshot
[16/Jul/2013:17:26:32] chdir(/mnt/lvm-snapshot)
[16/Jul/2013:17:26:32] /usr/bin/rsync -a --delete --numeric-ids --relative --delete-excluded . /backup/rsnapshot/hourly.0/dl/
[16/Jul/2013:17:27:57] rsync succeeded
[16/Jul/2013:17:27:57] chdir(/root)
[16/Jul/2013:17:27:57] /bin/umount /mnt/lvm-snapshot
[16/Jul/2013:17:27:58] /home/share/scripts/rsnapshot_lvremove --force /dev/raid5/rsnapshot
[16/Jul/2013:17:29:02] /usr/bin/rsnapshot hourly: ERROR: Removal of LVM snapshot failed: 1280
[16/Jul/2013:17:29:02] rm -f /var/run/rsnapshot.pid
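
For context, these commands come from rsnapshot's built-in LVM support. The relevant lines of my rsnapshot.conf look roughly like this (reconstructed from the log above, so treat the exact values as illustrative; fields in rsnapshot.conf must be tab-separated):

linux_lvm_cmd_lvcreate	/sbin/lvcreate
linux_lvm_cmd_mount	/bin/mount
linux_lvm_cmd_umount	/bin/umount
linux_lvm_snapshotsize	512M
linux_lvm_snapshotname	rsnapshot
linux_lvm_vgpath	/dev
linux_lvm_mountpath	/mnt/lvm-snapshot

backup	lvm://raid5/dl/	dl/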

Any ideas?

Update - this has just started happening on an entirely different server. Same LVM issue.

One thing I have tried is to replace the lvremove command with a wrapper script:

#!/bin/bash
# Flush dirty buffers and give the system time to settle before touching LVM
sync
sleep 600
# Log, then try to remove, any leftover -real device-mapper nodes
ls /dev/mapper/raid5-*-real
for i in /dev/mapper/raid5-*-real; do
    [ -e "$i" ] && /sbin/dmsetup remove "$i"
done
/sbin/lvremove --debug "$@"

So this syncs, sleeps for a bit, then removes any -real device maps before attempting the lvremove.
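
rsnapshot is pointed at the wrapper via the lvremove command option in rsnapshot.conf (option name from rsnapshot's LVM support; the path is the one shown in the log above):

linux_lvm_cmd_lvremove	/home/share/scripts/rsnapshot_lvremove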

Even after all of this, the removal often fails. Here is the output from rsnapshot. Please ignore the error partway through: while there is an issue on one of the volumes, it is not until later that the lvremove fails:

remote cp -al /share/HDA_DATA/backup/rsnapshot/hourly.0 /share/HDA_DATA/backup/rsnapshot/hourly.1 p1=-al p2=/backup/rsnapshot/hourly.0
  One or more specified logical volume(s) not found.
/dev/mapper/raid5-crypt-real
/dev/mapper/raid5-db-real
device-mapper: remove ioctl on raid5-crypt-real failed: No such device or address
Command failed
device-mapper: remove ioctl on raid5-db-real failed: Device or resource busy
Command failed
  Logical volume "rsnapshot" successfully removed
  One or more specified logical volume(s) not found.
/dev/mapper/raid5-crypt-real
/dev/mapper/raid5-db-real
/dev/mapper/raid5-db--var-real
device-mapper: remove ioctl on raid5-crypt-real failed: No such device or address
Command failed
device-mapper: remove ioctl on raid5-db-real failed: No such device or address
Command failed
device-mapper: remove ioctl on raid5-db--var-real failed: Device or resource busy
Command failed
  Logical volume "rsnapshot" successfully removed
  One or more specified logical volume(s) not found.
/dev/mapper/raid5-crypt-real
/dev/mapper/raid5-db-real
/dev/mapper/raid5-db--var-real
device-mapper: remove ioctl on raid5-crypt-real failed: Device or resource busy
Command failed
device-mapper: remove ioctl on raid5-db-real failed: No such device or address
Command failed
device-mapper: remove ioctl on raid5-db--var-real failed: No such device or address
Command failed
  /dev/raid5/rsnapshot: read failed after 0 of 4096 at 42949607424: Input/output error
  /dev/raid5/rsnapshot: read failed after 0 of 4096 at 42949664768: Input/output error
  /dev/raid5/rsnapshot: read failed after 0 of 4096 at 0: Input/output error
  /dev/raid5/rsnapshot: read failed after 0 of 4096 at 4096: Input/output error
  Logical volume "rsnapshot" successfully removed
  One or more specified logical volume(s) not found.
/dev/mapper/raid5-crypt-real
/dev/mapper/raid5-db-real
/dev/mapper/raid5-db--var-real
/dev/mapper/raid5-dl-real
device-mapper: remove ioctl on raid5-crypt-real failed: No such device or address
Command failed
device-mapper: remove ioctl on raid5-db-real failed: No such device or address
Command failed
device-mapper: remove ioctl on raid5-db--var-real failed: No such device or address
Command failed
device-mapper: remove ioctl on raid5-dl-real failed: Device or resource busy
Command failed
  Unable to deactivate open raid5-dl-real (254:25)
  Failed to resume dl.
----------------------------------------------------------------------------
rsnapshot encountered an error! The program was invoked with these options:
/usr/bin/rsnapshot hourly 
----------------------------------------------------------------------------
ERROR: Removal of LVM snapshot failed: 1280
  • Same issue here, Debian Wheezy servers upgraded from Squeeze. LVM snapshots will fail to remove occasionally, often requiring another lvremove attempt to remove the snapshot and resume the parent volume. Also using rsnapshot but this issue isn't related to it. The "-real" comes from Device Mapper I believe, presumably part of how DM handles snapshots for LVM. – gsreynolds Sep 25 '13 at 08:40
  • Could you paste the log when you remove the LVM snapshot by hand, e.g. `lvremove --debug` or `lvremove --debug --force`? – spinus Sep 28 '13 at 22:12
  • @spinus I changed the lvremove command in the script to include the --debug switch, but it doesn't produce any more output than normal. – Paul Sep 30 '13 at 05:46
  • @Paul Have you tried lsof to check whether the volume is still in use by some process? – spinus Oct 01 '13 at 08:34
  • @spinus No, I can add this to the script, but I don't see how it could be possible. The snapshot is created by the script, the rsync is run, and once complete, the lvremove arrives. The error message suggests something more fundamental than a file access: `Unable to deactivate open raid5-dl-real (254:25)` – Paul Oct 01 '13 at 11:29

1 Answer

In case this can help anyone, I had the problem described in Debian bug #659762.

I identified a volume in a suspended state with dmsetup info and reactivated it with dmsetup resume. This unlocked the LVM system.
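
For anyone hitting the same thing, a rough sketch of the commands involved (the device name here is just an example; use whatever dmsetup reports as suspended on your system):

# List every device-mapper device with its state; a stuck one shows "State: SUSPENDED"
dmsetup info | grep -E '^(Name|State)'

# Resume the suspended device reported above, e.g. the snapshot origin
dmsetup resume raid5-dl

# The leftover snapshot should then be removable again
lvremove raid5/rsnapshot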
