15

I have a KVM host machine with several VMs on it. Each VM uses a Logical Volume on the host. I need to copy the LVs to another host machine.

Normally, I would use something like:

dd if=/the/logical-volume of=/some/path/machine.dd

to turn the LV into an image file, then scp to move it, then dd to copy the file back into a new LV on the new host.

The problem with this method is that you need twice as much disk space as the VM takes, on both machines: i.e. a 5GB LV uses 5GB of space for the LV, and the dd copy uses an additional 5GB for the image. That's fine for small LVs, but what if (as in my case) you have a 500GB LV for a big VM? The new host machine has a 1TB hard drive, so it can't hold a 500GB dd image file, plus a 500GB logical volume to copy to, plus room for the host OS and other smaller guests.

What I would like to do is something like:

dd if=/dev/mygroup-mylv of=192.168.1.103/dev/newvgroup-newlv

In other words, copy the data directly from one logical volume to the other over the network and skip the intermediate image file.

Is this possible?

Nick
  • 4,433
  • 29
  • 67
  • 95

6 Answers

30

Sure, of course it's possible.

dd if=/dev/mygroup-mylv | ssh 192.168.1.103 dd of=/dev/newvgroup-newlv

Boom.

Do yourself a favor, though, and use something larger than the default blocksize. Maybe add bs=4M (read/write in chunks of 4 MB). You can see there's some nitpicking about blocksizes in the comments; if this is something you find yourself doing fairly often, take a little time to try it a few different times with different blocksizes and see for yourself what gets you the best transfer rates.
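If you want to experiment before the real transfer, a rough local read benchmark might look like this. The scratch file and the block sizes tried are made up for illustration; point dd at your actual LV for realistic numbers, and note that after the first pass the data is likely in the page cache, so treat the results as relative at best:

```shell
# Quick-and-dirty read benchmark over a few block sizes.
# Uses a throwaway scratch file; substitute your LV for real numbers.
SRC=$(mktemp)
dd if=/dev/zero of="$SRC" bs=1M count=64 status=none   # 64 MiB scratch file

for bs in 64K 512K 4M 16M; do
    printf 'bs=%s: ' "$bs"
    dd if="$SRC" of=/dev/null bs="$bs" 2>&1 | tail -n 1
done
rm -f "$SRC"
```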

Answering one of the questions from the comments:

You can pipe the transfer through pv to get statistics about the transfer. It's a lot nicer than the output you get from sending signals to dd.

I will also say that while of course using netcat -- or anything else that does not impose the overhead of encryption -- is going to be more efficient, I usually find that the additional speed comes at some loss of convenience. Unless I'm moving around really large datasets, I usually stick with ssh despite the overhead because in most cases everything is already set up to Just Work.
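For completeness, a few ways to get that progress readout (assuming GNU coreutils dd; the device names are the ones from the question):

```shell
# Option 1: newer GNU dd can report progress itself:
dd if=/dev/mygroup-mylv bs=4M status=progress | ssh 192.168.1.103 'dd of=/dev/newvgroup-newlv bs=4M'

# Option 2: pipe through pv, which shows rate, totals and ETA:
dd if=/dev/mygroup-mylv bs=4M | pv | ssh 192.168.1.103 'dd of=/dev/newvgroup-newlv bs=4M'

# Option 3: from another terminal, ask a running dd for a status line:
kill -USR1 "$(pgrep -x dd)"
```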

larsks
  • 41,276
  • 13
  • 117
  • 170
  • 1
    Does the bs only affect the copy speed, or does it have an effect on how the data is stored? – Nick Feb 09 '12 at 01:29
  • 3
It has no effect on how the data is stored, but it is vastly more efficient than using the default blocksize (of 512 bytes) for reading and writing. – larsks Feb 09 '12 at 01:32
  • Is it possible to pipe it through something that indicates progress? – Nick Feb 09 '12 at 01:48
  • 3
    @Nick: On Linux, you can send the `dd` process the `USR1` signal to make it display a status line with the amount transferred. Get the process number of your `dd` process with something like `ps aux | grep dd` and then use this PID with the command `kill -USR1 $PID`. The message will be displayed on the original terminal where you started `dd`. – Sven Feb 09 '12 at 02:30
  • 4
    You probably don't want to use a bs that large since it will just block writing to the pipe to ssh until it can transfer most of it to the network socket, during which time the disk will go idle. Since the default readahead size is 128k, you probably want to stick with that. Or increase the disk readahead size. – psusi Feb 09 '12 at 03:26
  • 2
    @psusi: The link Zoredache put as comment below the question demonstrated the opposite, they got the fastest result with 16M block sizes, but used netcat instead of ssh as transfer method, which is always a better option when encryption is not required. – Sven Feb 09 '12 at 03:52
  • 1
    @Nick: I've updated the answer re: your question above. – larsks Feb 09 '12 at 04:20
Thanks! At first I thought it took off without asking for the password on the other machine, but it was still waiting for a password even though the clock was running. Kind of counterintuitive. But still very nice not to have to wonder how much longer it's going to be. – Nick Feb 10 '12 at 01:53
20

Here's an optimized version, which shows progress using pv, uses a bigger block size (bs) for larger chunks, and also uses gzip to reduce the network traffic.

That's perfect when moving data over slow connections, e.g. between internet servers. I recommend running the command inside a screen or tmux session, so that the SSH connection to the host you execute the command from can be disconnected without trouble.

$ dd if=/dev/volumegroupname/logicalvolume bs=4096 | pv | gzip | \
    ssh root@78.46.36.22 'gzip -d | dd of=/dev/volumegroupname/logicalvolume  bs=4096'
chicks
  • 3,639
  • 10
  • 26
  • 36
  • 2
    You could use `ssh -C` instead of `gzip`. I'm not sure if there's a performance impact, but it is a lot less typing. – Samuel Edwin Ward Mar 02 '14 at 16:47
  • 2
    I also suggest using either pigz or pxz -1 instead of gzip, the multithreading _really_ helps on any modern server. – sCiphre Aug 23 '17 at 09:30
  • `pv` can cause problems (in my experience moving 500+ VPSes to other servers with this setup) with the number of bytes, and after this problem the LVM volumes were inconsistent. The benefit of seeing the progress of the work is nil and dangerous. If you like seeing progress, open a console with iftop, for example. – abkrim Jul 21 '18 at 06:18
  • Any way to do this the other way around, i.e. save the image into the current machine from a remote machine over SSH? – GeekTantra Aug 17 '22 at 11:05
  • I tried: `ssh 192.168.1.3 'dd if=/dev/vgubuntu/root bs=4096 | pv | pigz' > /root/test.img.gz` But this would not show any progress indicators while the image size increased. – GeekTantra Aug 17 '22 at 11:06
5

How about using an old friend to do this: netcat.

On the system that is giving up the logical volume, type:

  • $ dd if=/dev/[directory]/[volume-name] | nc -l [any high number port]

Then on the receiving system, type:

  • $ nc -w 10 [ip or name] [port] | dd of=/dev/[directory]/[volume name]

Translating: the origin box dd's the volume and pipes it to nc (netcat), which listens on the given port. On the receiving system, nc connects to [ip or name] on [port], waits up to 10 seconds for data before closing, and pipes what it receives to dd, which writes it out.
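One caveat with plain netcat: unlike ssh, there is no built-in integrity check, so it can be worth comparing checksums afterwards. A sketch, using the same placeholder device paths as above:

```shell
# On the origin, after the transfer completes:
dd if=/dev/[directory]/[volume name] bs=4M | sha256sum

# On the receiver:
dd if=/dev/[directory]/[volume name] bs=4M | sha256sum

# The two digests must match; if they don't, re-run the transfer.
```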

Yurij Goncharuk
  • 207
  • 1
  • 2
  • 13
linuxrebel
  • 51
  • 1
  • 1
3

First I would take a snapshot of the lv:

lvcreate --snapshot --name my_shot --size <thesize> /dev/<name of vg>/<name of lv>

After that you have to create a new lv on the new host (e.g. using lvcreate) with the same size. Then you can directly copy the data to the new host. Here is my example of the copy command:

dd if=/dev/vg0/my_shot bs=4096 | pv | ssh root@some_host -C 'dd of=/dev/vg1/<created lv> bs=4096'

I used the procedure to copy a proxmox pve maintained VM to another host. The logical volume contained several additional LVs that were maintained by the VM itself.
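To get the target size exactly right, one option is to read the source volume's size in bytes and pass that to lvcreate on the destination. A sketch with the same placeholder names as above (untested against your setup; lvcreate rounds up to the extent size by itself):

```shell
# Read the exact byte size of the source (snapshot) volume:
size_b=$(blockdev --getsize64 /dev/vg0/my_shot)

# Create a same-sized LV on the destination host over SSH:
ssh root@some_host "lvcreate -n newlv vg1 -L ${size_b}b"
```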

Woolf
  • 31
  • 1
3

First make sure that the logical volume is not mounted. If it is and you want to make a "hot copy", create a snapshot first and use that instead: lvcreate --snapshot --name transfer_snap --size 1G /dev/vgname/lvname

I had to transfer a lot of data (7TB) between two 1Gbit-connected servers, so I needed the fastest possible way to do so.

Should you use SSH?

Using ssh is out of the question, not because of its encryption (if you have a CPU with AES-NI support, it does not hurt so much) but because of its network buffers. Those do not scale well. There is a patched SSH version that addresses this problem, but since there are no precompiled packages, it's not very convenient.

Using Compression

When transferring raw disk images, it is always advisable to use compression. But you do not want the compression to become a bottleneck. Most Unix compression tools like gzip are single-threaded, so if the compression saturates one CPU, it will be a bottleneck. For that reason, I always use pigz, a gzip variant that uses all CPU cores for compression. This is necessary if you want to go up to and above GBit speeds.

Using Encryption

As said before, SSH is slow, but not because of encryption: if you have an AES-NI CPU, that part should not be a bottleneck. So instead of tunnelling through ssh, we can use openssl directly in the pipe.

Speeds

To give you an idea of the speed impact of the components, here are my results. These are transfer speeds between two production systems, reading from and writing to memory. Your actual results depend on network speed, HDD speed and source CPU speed! I'm showing them to demonstrate that there is at least no huge performance drop.

Simple nc + dd:
5033164800 bytes (5.0 GB, 4.7 GiB) copied, 47.3576 s, 106 MB/s

+ pigz compression level 1 (speed gain depends on the actual data):
network traffic: 2.52GiB
5033164800 bytes (5.0 GB, 4.7 GiB) copied, 38.8045 s, 130 MB/s

+ pigz compression level 5:
network traffic: 2.43GiB
5033164800 bytes (5.0 GB, 4.7 GiB) copied, 44.4623 s, 113 MB/s

+ compression level 1 + openssl encryption:
network traffic: 2.52GiB
5033164800 bytes (5.0 GB, 4.7 GiB) copied, 43.1163 s, 117 MB/s

Conclusion: using compression gives a noticeable speedup, as it reduces the data size a lot. This is even more important if you have slower network speeds. When using compression, watch your CPU usage; if it gets maxed out, you can try without it. Using encryption has only a small impact on AES-NI systems, imho only because it steals some 30-40% CPU from the compression.

Using Screen

If you are transferring a lot of data like me, you do not want it interrupted by a network disconnect of your ssh client, so you had better start it with screen on both sides. This is just a note; I will not write a screen tutorial here.

Let's Copy

Install some dependencies (on source and destination): apt install pigz pv netcat-openbsd

then create a volume on the destination with the same size as the source. If unsure, use lvdisplay on the source to get the size and create the target, e.g.: lvcreate -n lvname vgname -L 50G

next, prepare the destination for receiving the data:

nc -l -p 444 | openssl aes-256-cbc -d -salt -pass pass:asdkjn2hb | pigz -d | dd bs=16M of=/dev/vgname/lvname

and when ready, start the transfer on the Source:

pv -r -t -b -p -e /dev/vgname/lvname | pigz -1 | openssl aes-256-cbc -salt -pass pass:asdkjn2hb | nc <destip/host> 444 -q 1

Note: If you are transferring the data locally or do not care about encryption, just remove the OpenSSL part from both sides. If you do care: asdkjn2hb is the encryption key; you should change it.
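For reference, the same pipeline with the OpenSSL stage dropped (unencrypted, e.g. on a trusted LAN) would look like this, keeping the placeholder names from above:

```shell
# Destination (receives, decompresses, writes):
nc -l -p 444 | pigz -d | dd bs=16M of=/dev/vgname/lvname

# Source (reads with progress, compresses, sends):
pv -r -t -b -p -e /dev/vgname/lvname | pigz -1 | nc <destip/host> 444 -q 1
```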

bhelm
  • 141
  • 1
  • 4
  • DO NOT EVER DO THIS ON A PROXMOX SERVER: apt install netcat-openbsd Installing netcat-openbsd completely wiped ProxMox from the server and caused 5+ hours of downtime and work!!! – Zoltan Feb 24 '19 at 18:43
-1

The rest of the answers do not work well and don't fulfill the question's requirements, because they do not create the logical volume on the target server; instead they create a file under /dev/mygroup/myvol on the root disk, which also means the copied volume does not appear in LVM tools like lvdisplay.

I created a bash script that automates the whole process: https://github.com/daniol/lvm-ssh-transfer

daniol
  • 54
  • 1
  • 4