Here is a simplified version of my backup script, which runs on the host:
# shut down the guest to ensure its filesystem is in a stable state
virsh shutdown web --mode=acpi
sleep 20s # the real script uses a smarter method to wait for the guest shutdown to complete
# make a snapshot copy of the offline guest
lvcreate -n web-bsnap -L50GB -s /dev/vg0/web
# start the guest to minimize the offline time
virsh start web
# create the backup volume
lvcreate -n web-0 -L 193273528320B /dev/vg0
# make the backup by copying the offline snapshot
nice -n 19 dd if=/dev/vg0/web-bsnap of=/dev/vg0/web-0 bs=4K
# remove the snapshot
lvremove -f /dev/vg0/web-bsnap
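The fixed sleep in the script above stands in for a proper wait. As a minimal sketch of what such a wait could look like (not the method the real script uses), the loop below polls virsh domstate until the guest reports "shut off". The function name wait_for_shutdown and the 120-second default timeout are hypothetical, and the check assumes virsh prints the untranslated state string.

```shell
#!/bin/sh
# Hypothetical helper: wait until a libvirt guest is shut off, or time out.
# Usage: wait_for_shutdown GUEST [TIMEOUT_SECONDS]
wait_for_shutdown() {
    guest=$1
    timeout=${2:-120}   # assumed default, adjust to the guest's shutdown time
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        # virsh domstate prints the current state, e.g. "running" or "shut off"
        # (assumes untranslated output)
        state=$(virsh domstate "$guest" 2>/dev/null)
        if [ "$state" = "shut off" ]; then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    # the guest did not reach "shut off" within the timeout
    return 1
}
```

It could replace the sleep line with something like: wait_for_shutdown web 120 || exit 1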
The backup takes more than 1 hour, and the problem is that during that time the guest becomes very slow, at times even unreachable.
I don't need the backup to finish in an hour or two; it can take 10 hours if needed, but I want it to run at the lowest priority so that it doesn't disturb normal guest operation. The nice command is there for that reason, but it doesn't seem to make any difference.
The host system is a Debian GNU/Linux 8 amd64 with the Linux kernel from sid (4.7). The same goes for the guest. The problem was just the same with the jessie kernel (3.16) on both host and guest.
The host hardware is way oversized for the usual guest workload: 256GB of RAM, an Intel(R) Xeon(R) CPU E5-1650 v3 @ 3.50GHz with 6 cores, and 2TB of RAID1 storage on enterprise SATA disks, all for a single guest running a website that serves 1 webpage/second on average. The usual server load is below 1.
What can I do to make the backup less intrusive?