8

Here is my situation:

  • Two dedicated servers in the same datacenter with gigabit ethernet between them.
  • Both dedicated servers booted into a rescue environment based on Debian Squeeze with extra tools and utilities added. Also plenty of tmp space (32GB of RAM on both boxes) for downloading software, installing packages, and/or compiling as needed.
  • Both dedicated servers have approximately 3TB of usable space.
  • The "source" server has 4 x 1.5TB disks in Hardware RAID-10 with an Adaptec 4 port controller.
  • The "destination" server has 2 x 3TB disks in Hardware RAID-1 with an Adaptec 2 port controller -- same generation as the other, but different number of ports.
  • The number of usable blocks on /dev/sda differs by less than 10 MB, but the destination server's array is for some reason a few megs smaller.
  • Both RAID arrays are configured to use the entire disk surface of all constituent disks to create one, single RAID volume.
  • The operating system boots in MBR mode; no UEFI booting is used.

What I want to do:

  • Copy, at the block layer, the entire OS image (this only consists of GRUB2 bootloader in the GPT partition table, /boot partition, and / partition) from the "source" server to the "destination" server.
  • If possible, the copy should take place "live": this means I don't have enough space to store a proper file of the disk image on the destination side, unless I'm unpacking the disk image onto the hard disk as the copy is taking place. The gigabit ethernet connection between the servers is reliable enough that I'm comfortable with this, and I will of course run fsck on both ends (source and destination) to verify the filesystem is OK before and after the transfer.
  • If possible, do not transfer blocks over the network that are not used by the filesystems in each partition (all partitions are formatted as ext4). More than 50% of the "source" disk is free space in the / partition.
  • Adjust the size of the / partition so that when it is copied, it is resized to fit within the just barely smaller size of the destination disk.
  • Once the copy is successful, mount each volume and fix up references to static IPs to reflect the IPs of the new server. (Can do this just fine without any further help)

My questions:

  • Should I first calculate the difference (in bytes) between the size of /dev/sda on each server, and then use resize2fs to non-destructively reduce the size of the / filesystem on the source side so that it will fit into the space of the destination side?
  • Should I run dd on the raw block device, /dev/sda from the source to the destination (over ssh), or should I create an equivalent partition layout on the destination and run dd on each partition? Note that handling a partition at a time leaves me the problem of the bootloader, but if I don't do it a partition at a time, then dd needs to know to stop transferring data once it has written as many bytes as the destination can hold (which hopefully will "close out" the very end of the / partition on the last block, which is logically "to the right of" all other partitions in the partition layout of the source).

A few misc. specifics:

  • The host OS on the source box is Ubuntu Server 12.04 running several OpenVZ guests
  • Since both boxes are booted into rescue, direct disk access is possible without expecting any change to the underlying data by the running operating system.
allquixotic
  • Do you exactly need to copy the used blocks from the devices, or just the OS filesystem(s)? – Andrew Feb 22 '13 at 00:27

3 Answers

6

This is messy, but doable.

I presume here that / is on /dev/sda3 and that /boot is on /dev/sda1.

  1. Shrink the filesystem on the old server to its minimum possible size.

    oldserver # e2fsck -f /dev/sda3
    oldserver # resize2fs -M /dev/sda3
    
  2. Partition the new server's disk with an identically sized /boot, swapspace, and new / partition (and anything else you need).

    newserver # parted /dev/sda
    
  3. Copy the / and /boot filesystems.

    oldserver # dd if=/dev/sda1 | ssh root@newserver "dd of=/dev/sda1"
    oldserver # dd if=/dev/sda3 | ssh root@newserver "dd of=/dev/sda3"
    

    Because the partition on the new server will be slightly smaller than the one on the old server, you'll receive a spurious No space left on device message at the end of this. However, since you shrank the filesystem at step 1, this doesn't matter.

  4. Resize the filesystem on the new server to the size of the partition.

    newserver # resize2fs /dev/sda3
    
  5. Install GRUB on the new disk.

    newserver # mount /dev/sda3 /mnt
    newserver # mount /dev/sda1 /mnt/boot
    newserver # mount -o bind /dev /mnt/dev
    newserver # mount -t proc proc /mnt/proc
    newserver # chroot /mnt /bin/bash
    
    newserver(chroot) # grub-install /dev/sda
    newserver(chroot) # exit
    
  6. Finish the rest of your fixups (IP address, etc.).

You can probably find a way to avoid copying the partition's free space, but it'll probably take you longer to research than to just copy it all...
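One possible way to skip most of the free space, sketched here under the assumption that step 1 (`resize2fs -M`) has already run: read the shrunken filesystem's block count and block size from `dumpe2fs -h` and tell `dd` to stop after that many bytes. The helper name `fs_bytes` is made up for this illustration, and the usage lines are untested against real hardware, so treat them as a starting point rather than a recipe.

```shell
# Print the size in bytes of an ext2/3/4 filesystem, reading the output of
# `dumpe2fs -h <device>` on stdin. (fs_bytes is a hypothetical helper name.)
fs_bytes() {
    awk -F: '
        $1 == "Block count" { count = $2 }
        $1 == "Block size"  { size  = $2 }
        END { printf "%.0f\n", count * size }
    '
}

# Hypothetical usage on oldserver, after `resize2fs -M /dev/sda3`:
#   bytes=$(dumpe2fs -h /dev/sda3 2>/dev/null | fs_bytes)
#   dd if=/dev/sda3 bs=1M count=$(( (bytes + 1048575) / 1048576 )) \
#       | ssh root@newserver "dd of=/dev/sda3 bs=1M"
```

The count is rounded up to a whole MiB, so a little data past the end of the filesystem is copied too; that is harmless as long as the destination partition is at least that large.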

Michael Hampton
  • Awesome! I'm OK with copying the partition's free space because these instructions meet *all* of my other criteria. Although, wouldn't just resizing the filesystem **and the partition itself** on `oldserver` eliminate the need to copy all the free space? I don't care about `/boot` because it's so small, but for `/` at least, I could do that, right? Just set the partition's end sector to equal what sector `resize2fs` sets the end of the FS sector to. Well, sector, block... probably *block*. But thanks for this! This is great! – allquixotic Feb 21 '13 at 21:23
  • Yes, if you also reduced the size of the partition, then you'd avoid a bunch of copying. That might save you a couple of hours... I'd leave some slack in, just in case my math was slightly off though. – Michael Hampton Feb 21 '13 at 21:24
  • That would also eliminate the spurious/scary "No space left on device", since it's going to resize `/dev/sda3` down to about 1.3 TB and will be copying it into a partition on the destination that's expecting to hold about 2.9 TB. – allquixotic Feb 21 '13 at 21:27
  • [It's gonna take a while](http://www.wolframalpha.com/input/?i=%282992+Gigabytes+at+11.1+Megabytes+per+second%29+in+days). Realized I have a gigabit *port* with a 100 Mbit/s allocation. Crap. – allquixotic Feb 22 '13 at 02:34
5

I'd mkfs fresh filesystems on the new server, then rsync them from the old server. That is restartable, consistent, and each file is easily individually verifiable. Where you're discarding unused sections of the filesystem (not a forensic copy), I don't see any reason to not use this method. You would have to re-run GRUB, but that shouldn't be a challenge.

Explaining a raw copy that is file-system aware would take me a while, so unless you comment as to why my rsync solution doesn't work I'll spare myself the typing.
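For reference, a minimal sketch of what the rsync invocation might look like. It assumes the old filesystem is mounted at `/mnt` on the source and the freshly made one at `/mnt` on the destination; both paths and the helper name `rsync_cmd` are placeholders. `-a` already covers permissions, ownership, timestamps, symlinks and device files; `-H`, `-A` and `-X` add hard links, ACLs and extended attributes, and `--numeric-ids` avoids remapping UIDs/GIDs through the rescue environment's passwd file. The helper only prints the command so it can be reviewed before running.

```shell
# Build (but do not run) an rsync command for one mounted filesystem.
# $1 = source mount point, $2 = destination in user@host:/path form.
# rsync_cmd is a hypothetical helper name used only for this sketch.
rsync_cmd() {
    printf 'rsync -aHAX --numeric-ids --one-file-system %s/ %s/\n' "$1" "$2"
}

# Hypothetical usage on oldserver:
#   rsync_cmd /mnt root@newserver:/mnt        # print and inspect
#   eval "$(rsync_cmd /mnt root@newserver:/mnt)"   # then run it
```

Because rsync is restartable, rerunning the same command after an interruption only transfers what is missing or changed.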

Jeff Ferland
  • I think `partimage` can do raw copies that are filesystem aware, but it doesn't support `ext4`. So there goes that as an option... `rsync` is looking nicer as an option, as long as it preserves all discretionary access controls (a la `chmod`) and can cleanly copy over symlinks and device files... – allquixotic Feb 21 '13 at 18:29
  • I second Jeff's answer. You can transfer the partition layout with `sfdisk -d /dev/sda | ssh destination "sfdisk /dev/sdb"`. Make your filesystems and transfer with `rsync -a -e "ssh -c arcfour" /mnt/ root@destination:/mnt/`. Afterwards, follow step 5 of Michael Hampton's answer to make the destination bootable. – Tim Haegele Feb 21 '13 at 21:31
1

If you REALLY want to transfer data at the block device level, I can think of one pretty useful trick that I have used to migrate servers with minimum downtime.

The idea is to create a degraded mirror on the source server, with your data partition as the only active half of the mirror, then export the destination partition from the second server via AOE (this assumes both servers are in the same broadcast domain). On the source server you then add the network block device to the degraded mirror so that it starts rebuilding. Wait until the rebuild is complete, stop the mirror, remove the AOE-exported device, and you're done.

A bit more details follow (I'll try to keep it brief).

Components:

  • mdadm with its build mode (ad-hoc mirror without metadata);
  • vblade for exporting a block device as an AOE network device;
  • aoetools for importing an AOE network block device.

You have to create a partition table on your destination server, then shrink the source partition so that it fits the destination. You can easily install GRUB to the new MBR afterwards; syncing just the partitions over a newly created partition table is a bit less error-prone.
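Before the source partition can be shrunk, the filesystem inside it has to be shrunk first. A rough sketch of the size arithmetic, assuming the destination partition's size comes from `blockdev --getsize64` and that some slack is left in case the math is slightly off; the helper name `target_fs_size` and the 64 MiB slack in the usage lines are arbitrary choices for illustration. `resize2fs` accepts a size with a `K` suffix, which is what the helper prints.

```shell
# Print a size argument for resize2fs (KiB, with the K suffix) that fits a
# destination partition of $1 bytes while leaving $2 MiB of slack.
# target_fs_size is a hypothetical helper name.
target_fs_size() {
    dst_bytes=$1
    slack_mib=$2
    echo "$(( dst_bytes / 1024 - slack_mib * 1024 ))K"
}

# Hypothetical usage on the source server (partition names are assumptions):
#   dst=$(ssh root@newserver blockdev --getsize64 /dev/sda3)
#   e2fsck -f /dev/sda3 && resize2fs /dev/sda3 "$(target_fs_size "$dst" 64)"
```

Shrinking the partition itself to match (for example with parted) is a separate step that this sketch does not cover.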

On the receiving side, export your partition with the vblade tool. On the source server, you can see the exported devices after installing aoetools (run aoe-discover, then look in /dev/etherd/ for devices).

Then build a RAID1 device on the source server from your source drive:

mdadm --build /dev/md0 -n2 -l1 --force /dev/sda

After this you can examine newly built mirror:

mdadm --detail /dev/md0
cat /proc/mdstat

At this point you can safely attach the exported destination partition to this mirror:

mdadm /dev/md0 --add /dev/etherd/eX.Y

Then just watch over synchronization progress:

watch -n5 cat /proc/mdstat
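If you would rather script the wait than watch it, here is a small sketch that pulls just the percentage out of `/proc/mdstat`. It assumes the usual progress line of the form `recovery = 12.6% (...)` or `resync = ...`; the function name `md_progress` is invented for this example. It prints nothing when no rebuild is in progress.

```shell
# Print the recovery/resync percentage found in /proc/mdstat-style text
# read from stdin. (md_progress is a hypothetical helper name.)
md_progress() {
    sed -n \
        -e 's/.*recovery *= *\([0-9.]*%\).*/\1/p' \
        -e 's/.*resync *= *\([0-9.]*%\).*/\1/p'
}

# Hypothetical usage on the source server:
#   md_progress < /proc/mdstat
```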

After the sync is done, stop the mirror with mdadm --stop /dev/md0 on the source server, terminate the vblade process on the destination server, install GRUB on the destination server, change your IP addresses, and so on.

Actually, with this trick it is possible to move server between boxes almost live, with downtime just to reboot synced boxes.


For performance reasons, I also suggest you increase your link's MTU (or set up a separate VLAN with jumbo frames enabled, if possible).

Note that you can also use something like nbd-server/nbd-client (or even iSCSI, if you want it rough) as an alternative to AOE, but AOE (vblade + aoetools) has a very simple interface and great performance (no TCP/IP overhead).

artyom
  • I'd also add that syncing at the block device level can sometimes be REALLY faster than going file by file with rsync, especially when you have millions (literally) of relatively small files on the filesystem. – artyom Feb 21 '13 at 19:33
  • `mdadm`? I'm using hardware RAID. And I have no idea what AOE is, and have never used iSCSI. I don't think my servers are in the same broadcast domain, just in the same datacenter. There's at least one or two switches between the servers. – allquixotic Feb 21 '13 at 21:20
  • I think this is an excellent idea! But how does it deal with the different size of disks? – Tim Haegele Feb 21 '13 at 21:36
  • @allquixotic, nevertheless, you can try the following scheme, replacing AOE with nbd (network block device, provided by the `nbd-server` and `nbd-client` packages). `mdadm` is used just to sync two block devices; no metadata is written in `build` mode, so you can use it on top of any block device (it has to be unmounted first). The thing is, I usually set up a fresh system on a degraded mdadm raid1 even if I have hardware RAID underneath; this way I can apply the technique described without having to unmount partitions, reducing migration downtime to a single reboot. – artyom Feb 22 '13 at 03:29