4

I have a long-term goal of setting up a DR site in a colo somewhere and part of that plan includes replicating some volumes of my EqualLogic SAN. I'm having a bit of a difficult time doing this because I don't know if my method is sound.

This post may be a bit lengthy for the sake of completeness.

Generally relevant information:

  1. I have one EqualLogic PS4000X (~4TB).
  2. The SAN acts as shared storage for 2 ESXi hosts in a vSphere 5 environment.
  3. I have 4 volumes of 500GB each. Volumes 1 and 2 contain my "tier 1" VMs. These will be the only volumes I plan to replicate.
  4. We currently have a 3 Mb/s connection, with actual data bandwidth of ~2.8 Mb/s because of our PRI (voice).

My method of measuring change in a volume:

I was told by a Dell rep that a way (perhaps not the best?) to estimate deltas in a volume is to measure the snapshot reserve space used over a period of time of a regular snapshot schedule.

My first experiment with this was to create a schedule of 15 minutes between snapshots with a snapshot reserve of 500 GB. I let this run overnight and until COB the following day. I don't recall the number of snapshots that could be held in 500 GB, but I ended up with an average of ~15 GB per snapshot.

$average_snapshot_delta = $snapshot_reserve_used / $number_of_snapshots

I then changed the snapshot interval to 60 minutes. After a full 24 hours had passed, the 500 GB reserve held a total of 13 snapshots, which works out to ~37 GB per hour (or ~9 GB per 15 minutes).
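That averaging method can be sketched in a few lines, using the figures from the question (the exact reserve consumed per run isn't stated, so the full 500 GB reserve is used here as an upper bound):

```python
# Back-of-the-envelope method from the question: average delta per
# snapshot = snapshot reserve used / number of snapshots held.

def average_snapshot_delta(reserve_used_gb, number_of_snapshots):
    """Average change captured per snapshot, in GB."""
    return reserve_used_gb / number_of_snapshots

# 60-minute schedule: 13 snapshots fit in the 500 GB reserve, which
# lands near the ~37 GB/hour figure quoted in the question.
delta_per_hour = average_snapshot_delta(500, 13)  # roughly 38.5 GB/hour
```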

The problem:

These numbers are astronomical to me. With my bandwidth I can do a little over 1GB/hour with 100% utilization. Is block-level replication this expensive or am I doing something completely wrong?

Chris76786777
  • 969
  • 4
  • 20
  • 35
  • a rate of change of 37 GB/hr doesn't sound right for such a small setup. What type of applications are you dealing with? – tony roth Sep 27 '11 at 02:54
  • I'm a bit new around here... should I answer your question in my comment or in an edit to my original post? – Chris76786777 Sep 27 '11 at 03:16
  • I'll just add my response here. I have Exchange 2010 with ~40 mailboxes, SQL Server 2008 R2, a file server, and a Remote Desktop Server servicing about 20 users. – Chris76786777 Sep 27 '11 at 04:51
  • @cparker4486 You can add it as an edit to the question or as a comment as you did, either way. Exchange is very I/O heavy, the online defrag can be a biiiiig I/O event. SQL can also be heavy as you get two writes for every UPDATE (log then datafile) and probably more (update indexes). – sysadmin1138 Sep 27 '11 at 11:30
  • @sysadmin1138 Understood. What I'm really hoping to find out though is if my method for calculating the rate of change is accurate or not. – Chris76786777 Sep 27 '11 at 15:59

6 Answers

4

Your numbers boil down to roughly 10.5 MB/s of change, which does seem a bit on the high side for pure write. But then, I don't know your workloads.

However, you have a bigger problem. The initial replication will be pushing 1 TB of data over a 3 Mb/s straw. Note that your link is measured in mega*bits*:

1 TB = 1024 GB = 1,048,576 MB
2.8 Mb/s usable ≈ 0.35 MB/s replication speed
1,048,576 MB / 0.35 MB/s ≈ 3,000,000 s ≈ 34.7 days

During that time it'll be queueing up your net-change for when the initial sync finishes. And if you ever need to pull data from the remote array, it'll be about 35 days until you're fully up and running (unless you have an out-of-band method of data transfer, like FedEx Overnight Shipping or a truck).
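A sanity check in code, assuming the ~2.8 Mb/s (megabits, not megabytes) of usable bandwidth quoted in the question and ignoring protocol overhead:

```python
# Initial-sync estimate: ~1 TB over a ~2.8 Mb/s usable link.
# Note the link speed is in megabits; divide by 8 for MB/s.

volume_mb = 1024 * 1024          # 1 TB expressed in MB (binary units)
link_mbit_per_s = 2.8            # usable bandwidth, megabits per second
mb_per_s = link_mbit_per_s / 8   # about 0.35 MB/s on the wire

seconds = volume_mb / mb_per_s
days = seconds / 86_400          # about 34.7 days for the first full sync
```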

As for the difference in net-change between your 15-minute snapshots and the 60-minute snapshots, I believe the 60-minute snapshot is getting the benefit of a lot of write-combining. Put another way, all of those writes to the filesystem journals are being coalesced in the longer snapshot window in a way they aren't in the 15-minute snaps.
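To make the write-combining point concrete, here's a toy model (purely illustrative, not EqualLogic's actual snapshot mechanics): a hot block rewritten every interval gets charged to each short-interval snapshot, but only once per long-interval snapshot.

```python
# Toy model: a "hot" block (e.g. a filesystem journal) is rewritten
# every 15 minutes. A 16 MB snapshot page is assumed for illustration.

rewrites_per_hour = 4            # the journal block changes 4x/hour
block_mb = 16                    # assumed snapshot page size

# Four 15-minute snapshots each capture the rewritten block once:
charged_15min = rewrites_per_hour * block_mb   # 64 MB of reserve per hour

# One 60-minute snapshot captures only the block's final state:
charged_60min = 1 * block_mb                   # 16 MB of reserve per hour
```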

This is where the replication mode really matters. A 3 Mb/s pipe is woefully underprovisioned for synchronous replication. A batched asynchronous replication will gain some of the benefits of write-combining, and therefore lower total transfer, at the cost of losing some data in a disaster. Unfortunately, I'm not well versed enough in EqualLogic to know what it's capable of.

sysadmin1138
  • 131,083
  • 18
  • 173
  • 296
3

This is the biggest con against EqualLogic in my opinion. Replication is based on snapshots, and their snapshot technology is incredibly inefficient.

We run about 25 arrays in our environment, and my 2-3 year goal is to replace them all with NetApp. Based on what we see on our NetApp CIFS filers and our testing of NFS, the replication bandwidth and snapshot space will be reduced by 80%. Add to that the dedupe features of NetApp, and it is much more efficient.

Make sure to put your windows page files and your vmware swap files on a non replicated volume.

Also, if you can afford it, look at adding some Riverbed WAN optimizers. They will reduce the amount of data on your WAN for replication by 60% or so. It has saved us, and our WAN connections range from a minimum of DS3 up to OC-3.

You also did not mention what your latency is. It is a critical component in replication calculations.

Dave
  • 31
  • 1
  • Be careful about Netapp with LUNs. It looks good on paper, but there's a considerable cost in terms of overhead- make sure to calculate the disk space used after the snapshot reserve, which is considerably higher on a LUN than on a file system. – Basil Oct 03 '11 at 00:07
  • Thanks for the idea on page files and vmware swap files. That's a great idea. Also, don't throw your EQL SANs away just yet. Dell recently bought a company (the name of which is escaping me) that does dedupe, and they have said it will be baked into EqualLogic in the future. If your window is 3 years there will probably be no reason to switch. On your WAN optimization point, I'm looking into a product from Certeon called aSYNC which was recommended by Dell. – Chris76786777 Oct 04 '11 at 22:27
2

If your VMs do not have their page files on a separate datastore, you should try moving them to one and then re-measuring your rate of data change (data churn). This will definitely help. Don't replicate more than you need to.

Does EQL support continuous async replication, or is it driven by a snapshot schedule? Can you use the whole 3 Mb/s 24/7?

I also second the suggestion that you synchronize the arrays before putting one at the remote site.

Jeremy
  • 938
  • 2
  • 7
  • 18
  • Thanks for the idea on the page files. I don't know if it does async or not but I wouldn't be able to saturate the line 24/7 anyway. I may have to try once or twice daily updates. They will definitely be synched before the secondary array is moved. – Chris76786777 Oct 04 '11 at 22:24
1

For the sake of focusing on the most relevant information, I'd suggest defining an objective for your recovery point and recovery time. These are unimaginatively referred to as the "RPO" and "RTO". Disk replication is supposed to reduce them both by keeping a crash-consistent copy of the data, never more than a few minutes old, on another site. Once you have these numbers, you can define things like how often you have to have a crash-consistent replica.
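As a hypothetical example of turning an RPO into a bandwidth requirement (taking the question's ~37 GB/hour churn estimate at face value):

```python
# Minimum link speed needed to ship one interval's worth of churn
# before the next interval starts. Churn figure is from the question.

churn_gb_per_hour = 37
rpo_hours = 1                    # target: replica never older than 1 hour

gb_per_interval = churn_gb_per_hour * rpo_hours
megabits = gb_per_interval * 1024 * 8            # GB -> megabits (binary)
required_mbps = megabits / (rpo_hours * 3600)    # roughly 84 Mb/s needed
```

At that churn rate, a 3 Mb/s link can't meet even a multi-hour RPO without reducing the churn itself or deduplicating the replication stream.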

3Mb/s is probably not going to cut the mustard unless you use WAN acceleration (such as Riverbed, mentioned by one of the other answerers). WAN acceleration works by keeping a cache on disk at both sides of the link where they store all the most recent data you've sent, and if you ever send a duplicate block, it sends a reference instead of the data.

That said, assuming your storage uses the same engine to take snapshots as it uses to replicate them, then the most accurate measure of change is indeed the snapshot reserve. You'd need to keep one snapshot and its reserve isolated for the duration of the measurement period, though. Assuming EqualLogic uses copy-on-write snapshots, comparing data from the reserves of several snapshots taken throughout the day might actually make it seem like your data is changing more than it actually is.

As for the data itself, I agree with the replies that suggest not replicating the swap files. Swap files can take a lot of disk and are always changing, which would trigger a lot of replication traffic. I don't know whether VMware supports replication of an environment without them, though... I assume that the VMs in a VM datastore replicated without swap files would be crash-consistent, but I can't confirm that myself.

Basil
  • 8,811
  • 3
  • 37
  • 73
0

I am currently in the process of something similar, but with Solaris 11 and ZFS as our SAN backend. Because of bandwidth constraints I decided to separate out most of the components. We migrated to Exchange 2010 so that we could set up our DR site with an identical copy. What I found was that doing SAN-level snapshots would be ridiculous for this data because of bandwidth issues like the ones you are seeing. We decided it would be cheaper and more efficient to set up a DAG and replicate within Exchange itself. We did the same thing with our MySQL servers. What we clone now are systems with smaller deltas between snapshots. I was able to do the initial synchronization at the office and transport the array to its final destination.

gdurham
  • 879
  • 6
  • 10
  • Thanks for the idea. I may have to move to file/database level replication like you mentioned but the maintenance of those things just seem like too much for the small environment that I have (1 Exchange Server 2010; 1 SQL Server 2008 R2). I still need to do more investigating but it's coming down to cost overhead vs administration overhead. – Chris76786777 Oct 04 '11 at 22:33
  • Cost is def a factor. However I will say from a mgmt perspective it couldn't be easier. It made more sense to go this route because of both cost from bandwidth as well as management. I could have easily created replication on a storage level however it would have just been a mgmt nightmare. – gdurham Oct 05 '11 at 06:00
0

The block size for snapshots and replication on EqualLogic SANs is 16 MB. This is why you got those astronomical numbers, and there is no way to change it. The solution for us to meet our RTO/RPO SLA was to install a Riverbed WAN optimization appliance between the two sites.

spicy
  • 1