13

I need to do host to host migrations from old hardware to new hardware. Specifically, from HP BL460G7 to HP BL460G8. Both the old and new servers have 2 x 600GB 2.5" drives and are configured for RAID1. I can afford 30 minutes downtime per server.

There are four servers to migrate: the smallest has a total of 120GB allocated in logical volumes and the largest has 510GB allocated. Three servers are running RHEL5 and one is running RHEL6.

I've been racking my brain on how to do this within the given time frame and without destroying the OS and critical data.

My only thought is this:

  • remove one drive from the old server (server is turned on)
  • remove both drives from the new server (server is turned off)
  • remove G7 drive from caddy and set aside
  • remove G8 drive from caddy and install into G7 caddy
  • install G8 drive in G7 caddy into old server
  • wait for RAID controller to rebuild RAID1 array
  • when finished shutdown old server
  • remove G8 drive in G7 caddy
  • install G8 drive in G8 caddy and insert into G8 (single drive installed)
  • boot G8 server
  • wait for OS to boot
  • when OS has booted insert remaining drive
  • wait for RAID array to rebuild

Does this sound sane?

EDIT: The RHEL5 are RHEL5.10 and the RHEL6 is RHEL6.6

I should have also noted that two of the systems are part of a hot four-node cluster that does near-constant replication of application "events" (it's part of a critical infrastructure system). We have backups, but we only use them in the event of total system failure.

Previous testing has shown a maximum 'dd' throughput between systems of around 50MBps, which is far too slow.
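For scale, here is a rough back-of-the-envelope estimate of what 50MBps means for the 30-minute window. It is only a sketch: it assumes decimal units (1 GB = 1000 MB), a sustained rate, and that only the allocated space is copied. The function name `copy_minutes` is made up for illustration.

```shell
#!/bin/sh
# Estimate how long a copy takes at a sustained throughput.
# Usage: copy_minutes SIZE_GB RATE_MB_PER_S   (assumes 1 GB = 1000 MB)
copy_minutes() {
    size_gb=$1
    rate=$2
    # seconds = size in MB / rate; then round up to whole minutes
    seconds=$(( size_gb * 1000 / rate ))
    echo $(( (seconds + 59) / 60 ))
}

copy_minutes 120 50   # smallest server (120GB allocated)
copy_minutes 510 50   # largest server (510GB allocated)
```

That works out to about 40 minutes for the smallest server and about 170 minutes for the largest, so even the best case overruns the window.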

EDIT: I was going to rely on kudzu to pick up and deal with the hardware changes.

user1174838

4 Answers

18

Note that other steps may be needed depending on the distribution, most notably around drivers (thanks for pointing that out, @ewwhite).

  1. Boot the new server from livecd/usb.
  2. Prepare partitions and bootblock on the new drives.
    • Depending on setup, this could be done by copying MBR/bootblock.
  3. Make the filesystems.
  4. Do an rsync from old server to new.
    • You might want to run it a second time to see how long a follow-up rsync takes. If it's under 30 minutes, continue.
    • This is also the point at which you can test whether the new system boots. Just be careful not to cause any IP (or other) conflicts.
  5. Shutdown all services that would write to the filesystem
    • Preferably reboot to livecd/usb
  6. Rsync data from old server to new again
  7. Reboot the new server and use it

Doing it this way, you still have the original server intact, so if anything goes wrong, there is an easy way back. But it requires some knowledge (grub/rsync/partitions), so I suggest doing some prep-work and testing in advance, before doing it live.

Fox
  • There are actually driver differences between the two platforms, so it's important to know which minor releases of RHEL he's using. – ewwhite May 18 '15 at 10:34
  • Ah yeah, I shouldn't answer anything related to enterprise distros ... sorry 'bout that ... – Fox May 18 '15 at 10:37
  • @Fox: Undeleted by popular demand. Your answer is good. – Sven May 18 '15 at 11:05
  • @Fox I was referring to the two server models. They use different RAID controller drivers that only appeared in the latest RHEL5 release and maybe the second or third RHEL 6.x release. – ewwhite May 18 '15 at 11:07
  • @ewwhite I know what you mean. But the answer as it was, could cause trouble to someone following it blindly. I updated the answer to warn anyone trying. Though I have no good solution for the driver part of problem, as I kind of try to stay away from enterprise distros. – Fox May 18 '15 at 11:25
  • As I can only get about 50MBps from/to the disks, I can't take this approach – user1174838 May 18 '15 at 13:41
  • @user1174838 That should not be an obstacle. The only problem I'd see is a very large number of small files. – Fox May 18 '15 at 13:45
  • And don't forget the main benefit of this solution: the double rsync minimizes server downtime. The majority of the data is transferred while the server is still running, so the second rsync (on the now out-of-service server) copies only the latest differences. – peterh Jun 09 '15 at 00:47
6

Two things:

  • I would build anew and rsync data.
  • Your downtime allotment/window seems to be too short. 30 minutes can work in specific situations, but shouldn't YOU be dictating the realistic downtime requirement based on what it takes to actually accomplish the work?

Depending on the data contained within each server, the amount of data churn, and your provisioning scheme, it may make sense to install the necessary OS onto the new Gen8 ProLiant and synchronize the settings and other data portions at a point where you can quiesce the data.

Perhaps make a seed copy and derive your downtime requirement from the amount of time it takes to pick up the file changes on subsequent rsyncs. If you need to accelerate the transfer process or have lots of small files, there are techniques that can help with that.

I make these types of transitions often. With similar Linux installations, you rarely need more than an accurate package list (easily obtainable via Yum or RPM), the configuration directories (e.g. /etc) and your data partitions. If you don't already have a kickstart provisioning system, you can take advantage of the /root/anaconda-ks.cfg file to get an idea of how the G7 system was built.
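As a sketch of the package-list step: dump the package names on each host and compare the two lists. The function name `missing_packages` and the file names are hypothetical; the `rpm` query in the comment is one common way to produce the lists.

```shell
#!/bin/sh
# Print packages that appear in the first list but not in the second.
# Each file holds one package name per line, e.g. produced on each host with:
#   rpm -qa --qf '%{NAME}\n' | sort -u > packages.txt
missing_packages() {
    old_sorted=$(mktemp)
    new_sorted=$(mktemp)
    # comm needs sorted input, so sort both lists first
    sort -u "$1" > "$old_sorted"
    sort -u "$2" > "$new_sorted"
    # -23 suppresses lines unique to the second file and lines common to both
    comm -23 "$old_sorted" "$new_sorted"
    rm -f "$old_sorted" "$new_sorted"
}

# Run only when two list files are given on the command line.
if [ $# -eq 2 ]; then
    missing_packages "$1" "$2"
fi
```

Invoked with the G7 list first and the Gen8 list second, the output is the set of packages still to be installed on the new server.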

To answer your question about simply moving the disks, based on the specific RHEL versions you mentioned, this is absolutely possible. You can move the disks/caddies and the HP Smart Array metadata is compatible between the P410 and P420 controllers that may be in your systems. However, I would not do this without fully updating the firmware of the drives and components in the new system first.

ewwhite
  • Some really good comments in this thread, thanks all. I'm going to go back to the PM and request a larger change window. – user1174838 May 18 '15 at 22:23
1

If your existing OS version is able to handle the new hardware (mostly the RAID controller), you can give CloneZilla a try.

To check whether it is possible to move from one hardware platform to the other, you can copy all the data from the old server to the new one with a dd trick.

Boot the new server with a live distro such as SystemRescueCD, configure an IP address, and start a listener that writes to the disk:

nc -l 8000 | dd of=/dev/sda

On the current server, run:

dd if=/dev/sda | nc ${newserverip} 8000

This will make a raw copy of your server's /dev/sda onto the new server's /dev/sda. This way you can perform a test without downtime on your original server and with near-zero risk to it.
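One way to check such a copy is to compare checksums of source and destination. The sketch below uses ordinary files in place of block devices and a local pipe in place of nc so it can be tried safely; the function name `copy_and_verify` is made up, and a matching checksum is only meaningful if the source is not being written to during the copy.

```shell
#!/bin/sh
# Copy SRC to DST through a dd pipe and report whether the checksums match.
# On real hardware SRC and DST would be block devices on different hosts and
# the pipe would go through nc; plain files are used here for a safe trial.
copy_and_verify() {
    src=$1
    dst=$2
    dd if="$src" bs=1M 2>/dev/null | dd of="$dst" bs=1M 2>/dev/null
    src_sum=$(md5sum < "$src" | cut -d' ' -f1)
    dst_sum=$(md5sum < "$dst" | cut -d' ' -f1)
    if [ "$src_sum" = "$dst_sum" ]; then
        echo "match"
    else
        echo "MISMATCH"
    fi
}

# Small demonstration with a 1 MiB file of random data.
tmp=$(mktemp -d)
head -c 1048576 /dev/urandom > "$tmp/src.img"
copy_and_verify "$tmp/src.img" "$tmp/dst.img"
rm -rf "$tmp"
```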

alphamikevictor
  • If you leave processes running on the old server that write to files on the old disk, especially database servers and similar, chances are very high that this will leave you with a corrupt (copied) filesystem and corrupt data in your (copied) files. Never dd a raw disk unless it's unmounted or mounted read-only. – Guntram Blohm May 18 '15 at 12:16
  • @GuntramBlohm I know, it is just to check whether you are able to clone the old server to the new one without downtime. Once you have tested, you can clone the server for real, of course after shutting it down or stopping key services. – alphamikevictor May 18 '15 at 12:21
  • CloneZilla and related techniques will take longer than 30 minutes to copy the data between systems. – user1174838 May 18 '15 at 13:46
0

The project manager has denied my request for a larger outage window.

The proposed procedure outlined in the question worked well in testing. The downtime was under 20 minutes. I used the hpacucli utility to monitor the rebuild progress on the G7 and then on the Gen8, and it was very useful for this.
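A polling loop along these lines can do the monitoring. This is a sketch rather than the exact commands I ran: the slot number is a placeholder, the function names are made up, and the assumption that a rebuilding logical drive reports "Recovering" (or "Rebuilding") should be checked against what your controller actually prints.

```shell
#!/bin/sh
# Read "ld all show status" output on stdin and succeed only if no logical
# drive is still rebuilding. The status wording matched here is an assumption.
rebuild_done() {
    ! grep -Eqi 'recovering|rebuilding'
}

# Poll the controller once a minute until the rebuild has finished.
# The slot number is a placeholder; "hpacucli ctrl all show" lists the real one.
wait_for_rebuild() {
    slot=${1:-0}
    while :; do
        # Stop with an error if hpacucli itself fails, instead of guessing.
        status=$(hpacucli ctrl slot="$slot" ld all show status) || return 1
        if printf '%s\n' "$status" | rebuild_done; then
            break
        fi
        sleep 60
    done
    echo "rebuild complete"
}
```

Calling `wait_for_rebuild 0` on the server blocks until the array reports no rebuild in progress.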

I've yet to do this in anger but as stated this has worked well in testing for RHEL 5.10 on BL460G7 to BL460 Gen8.

I did not update the firmware.

The initial RAID1 re-sync in the G7 took a bit over an hour. The re-sync in the Gen8 took under 50 minutes. This concerned me but I've not been able to find any problems.

Thanks again for all the helpful comments and suggestions.

user1174838