
I've configured a new MySQL server on Amazon EC2 and decided to store my data on an EBS RAID0 array. So far so good, and I've tested taking snapshots of those devices with ec2-consistent-snapshot, which works great.

Now, how do you quickly rebuild the array on a new instance from these snapshots?

When you use ec2-consistent-snapshot to create a snapshot of multiple volumes, you have no way to tell which volume was used for each device in the RAID. I may be completely wrong, but since you're striping data across the volumes, it would stand to reason that you have to put each NEW volume in the same location on the RAID as the volume from which the snapshot was created.

An example:

  • 3x200GB volumes in a RAID0 configuration.
  • vol-1 is /dev/sdh device 0 in the RAID
  • vol-2 is /dev/sdh1 device 1 in the RAID
  • vol-3 is /dev/sdh2 device 2 in the RAID

You create the snapshots with: ec2-consistent-snapshot <options> vol-1 vol-2 vol-3.

You now have 3 snapshots, and the only way to trace which RAID device each belongs to is to look at its source volume ID, check which device that source volume is attached as on the instance, and then check the details of the RAID configuration on the source volume's instance.
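
For concreteness, the manual trace with the EC2 API tools would look something like this (all IDs below are placeholders):

ec2-describe-snapshots snap-11111111   # output includes the source volume ID for the snapshot
ec2-describe-volumes vol-aaaaaaaa      # the ATTACHMENT line shows the instance and device (e.g. /dev/sdh)
sudo mdadm --detail /dev/md0           # on the source instance: maps each device to its RAID position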

This is obviously incredibly manual and not fast, which makes it hard to bring up a new MySQL instance quickly if the existing one fails. Not to mention, you'd have to record the device positions in the RAID at the time of the snapshot, because if the source volume's instance crashes, you have no way to get at the RAID configuration.

So, in conclusion:

  • Am I missing something about how ec2-consistent-snapshot and a software RAID0 array work?
  • If not, are there any known solutions or best practices around the problem of not knowing which device/position in the RAID array a snapshot belongs to?

I hope this was clear, and thanks for your help!

Jim Rubenstein

3 Answers


since you're striping data across the volumes, it would stand to reason that you have to put each NEW volume in the same location on the RAID as the volume from which the snapshot was created.

I tested your premise, and logical as it may seem, my observations say otherwise.

Let me detail this: I have the exact same requirement as you do; however, the RAID0 that I am using has only 2 volumes.

I'm using Ubuntu 10 and have 2 EBS devices forming a RAID0 device formatted with XFS.

The RAID0 device was created using the following command:
sudo mdadm --create /dev/md0 --level 0 --metadata=1.1 --raid-devices 2 /dev/sdg /dev/sdh

I've installed MySQL and a bunch of other software that is configured to use /dev/md0 to store its data files.

Using the same volumes: Once done, I unmount everything, stop the RAID, and reassemble it like so: sudo mdadm --assemble /dev/md0 /dev/sdh /dev/sdg. The thing is that irrespective of the order of /dev/sdg and /dev/sdh, the RAID reconstitutes itself correctly.
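
Since mdadm identifies array members by the UUID and role recorded in their superblocks rather than by argument order, you can also let it discover the members on its own. A minimal sketch (the device names are placeholders, and --scan behavior depends on your mdadm.conf):

sudo mdadm --stop /dev/md0        # stop the array before reassembling
sudo mdadm --assemble --scan      # assemble arrays found by scanning superblocks
# or pin the array explicitly by UUID, listing members in any order:
sudo mdadm --assemble /dev/md0 --uuid=<array-uuid> /dev/sdg /dev/sdh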

Using snapshots: After this, I use ec2-consistent-snapshot to create snapshots of the 2 EBS disks together. I then create volumes from those snapshots, attach them to a new instance (one already configured for the software), reassemble the RAID (I've tried interchanging the order of the EBS volumes too), mount it, and I'm ready to go.
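
As a rough sketch with the EC2 API tools (all IDs, the zone, and the device names below are placeholders), the restore side looks something like:

ec2-create-volume --snapshot snap-11111111 -z us-east-1a
ec2-create-volume --snapshot snap-22222222 -z us-east-1a
ec2-attach-volume vol-aaaaaaaa -i i-12345678 -d /dev/sdg
ec2-attach-volume vol-bbbbbbbb -i i-12345678 -d /dev/sdh
# then, on the instance, in either order:
sudo mdadm --assemble /dev/md0 /dev/sdg /dev/sdh
sudo mount /dev/md0 /data        # assumed mount point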

Sounds strange, but it works.

Ryan Fernandes
  • So, basically, when you re-build the array, it doesn't matter in which order you build it at all. I guess this is because of the superblock data written to the disks, so the RAID controller knows how to put it back together. This is fantastic! Thanks for your answer, it's pretty much exactly what I needed! – Jim Rubenstein Feb 25 '11 at 14:29

I run a similar configuration (RAID0 over 4 EBS volumes), and consequently had the same concern about reconstituting the RAID array from snapshots created with ec2-consistent-snapshot.

Fortunately, each device in a RAID array contains metadata (in a superblock) that records its position in the array, the UUID of the array, and the RAID level of the array (e.g. RAID0). To query this superblock on any device, run the following command (the line matching '^this' describes the queried device):

$ sudo mdadm --examine /dev/sdb1
/dev/sdb1:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 2ca96b4a:9a1f1fbd:2f3c176d:b2b9da7c
  Creation Time : Mon Mar 28 23:31:41 2011
     Raid Level : raid0
  Used Dev Size : 0
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 0

    Update Time : Mon Mar 28 23:31:41 2011
          State : active
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0
       Checksum : ed10058a - correct
         Events : 1

     Chunk Size : 256K

      Number   Major   Minor   RaidDevice State
this     0     202       17        0      active sync   /dev/sdb1

   0     0     202       17        0      active sync   /dev/sdb1
   1     1     202       18        1      active sync   /dev/sdb2
   2     2     202       19        2      active sync   /dev/sdb3
   3     3     202       20        3      active sync   /dev/sdb4

If you do the same query on a device which is not part of an array, you obtain:

$ sudo mdadm --examine /dev/sda1
mdadm: No md superblock detected on /dev/sda1.

This proves that the command really relies on information stored on the device itself and not on some configuration file.

One can also examine the devices of a RAID array starting from the RAID device, retrieving similar information:

$ sudo mdadm --detail /dev/md0
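
For example, a small sketch to pull out just the member device names (the field position is assumed from the device table shown above):

$ sudo mdadm --detail /dev/md0 | awk '/ \/dev\//{print $7}'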

I use the latter along with ec2-describe-volumes to build the list of volumes for ec2-consistent-snapshot (the -n and --debug options let you test the command without actually creating snapshots). The following command assumes that the directory /mysql is the mount point of the volume and that the AWS region is us-west-1:

$ sudo -E ec2-consistent-snapshot \
    --region us-west-1 \
    --mysql \
    --freeze-filesystem /mysql \
    --mysql-master-status-file /mysql/master-info \
    --description "$(date +'%Y/%m/%d %H:%M:%S') - ASR2 RAID0 (4 volumes) Snapshot" \
    --debug -n \
    $(ec2-describe-volumes --region us-west-1 \
        | grep $(wget http://169.254.169.254/latest/meta-data/instance-id -O - -q) \
        | egrep $(sudo mdadm --detail $(awk '{if($2=="/mysql") print $1}' /etc/fstab) \
            | awk '/ \/dev\//{printf "%s ", $7}' | sed -e 's# /#|/#g') \
        | awk '{printf "%s ", $2}')
Jean Vincent

I know this doesn't answer your question, but I'm doing something similar with Amazon's base ec2-create-snapshot tool and a cron script. It's not as fast as ec2-consistent-snapshot, but I get the extra control I need: I can fsync, lock writes, and, most importantly, name the snapshots appropriately so they can be reconstituted in the correct order.
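
A minimal sketch of that kind of cron script (the volume IDs, mount point, and description format are all assumptions, XFS is assumed for the freeze, and MySQL locking is omitted for brevity):

#!/bin/sh
# Hypothetical snapshot job: freeze the filesystem, snapshot each RAID
# member with its position encoded in the description, then unfreeze.
MOUNT=/data                                      # assumed mount point
set -- vol-11111111 vol-22222222 vol-33333333    # members, in RAID device order

xfs_freeze -f "$MOUNT"                           # quiesce writes
i=0
for vol in "$@"; do
  ec2-create-snapshot "$vol" -d "$(date +%F) raid0 member $i"
  i=$((i + 1))
done
xfs_freeze -u "$MOUNT"                           # resume writes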

scottburton11
  • I'm actually using XFS, so I freeze the filesystem while I snapshot. Combined with FLUSH and LOCK in MySQL (ec2-consistent-snapshot does all this), I should have a consistent snapshot each time. The problem is the naming, for which I just developed a temporary solution by modifying the ec2-consistent-snapshot Perl script, for now. – Jim Rubenstein Feb 20 '11 at 20:28