
What would be the best approach for CouchDB replication in the following setup:

A1 and A2 are two CouchDB servers in one DC. They both pull data from each other, although only one is actively used; the other is just a standby in case the first one fails.

B1 and B2 are similarly set up in terms of replication and are located in a different DC.

What's the best way of achieving A <-> B replication?

I see two options here:

Option 1:

  • A1 pulls from B1 and B2
  • A2 pulls from B1 and B2
  • B1 pulls from A1 and A2
  • B2 pulls from A1 and A2

  • A1 pulls from A2
  • A2 pulls from A1
  • B1 pulls from B2
  • B2 pulls from B1

Option 2:

  • A1 pulls from B1
  • A2 pulls from B2
  • B1 pulls from A1
  • B2 pulls from A2

  • A1 pulls from A2
  • A2 pulls from A1
  • B1 pulls from B2
  • B2 pulls from B1
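
For illustration, here is a rough sketch of how the Option 2 pulls might be written down as continuous replication documents, one set per pulling server. It assumes a CouchDB version that provides the `_replicator` database, where each document carries `source`, `target` and `continuous` fields. The hostnames, the port and the database name `mydb` are made up for the example; nothing in the setup above specifies them.

```python
import json

# Hypothetical base URLs; the real hostnames and ports are not given above.
HOSTS = {
    "A1": "http://a1.example.com:5984",
    "A2": "http://a2.example.com:5984",
    "B1": "http://b1.example.com:5984",
    "B2": "http://b2.example.com:5984",
}
DB = "mydb"  # hypothetical database name

# Option 2: each server mapped to the servers it pulls from.
PULLS = {
    "A1": ["B1", "A2"],
    "A2": ["B2", "A1"],
    "B1": ["A1", "B2"],
    "B2": ["A2", "B1"],
}


def replication_docs(target):
    """Build one continuous pull-replication document per source of `target`."""
    return [
        {
            "_id": f"pull-{source}-to-{target}",
            "source": f"{HOSTS[source]}/{DB}",
            "target": f"{HOSTS[target]}/{DB}",
            "continuous": True,
        }
        for source in PULLS[target]
    ]


# Documents that would be stored in the _replicator database on A1.
for doc in replication_docs("A1"):
    print(json.dumps(doc))
```

Each server would hold only the documents in which it is the target, so that it is the one doing the pulling.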

IMHO Option 2 is sufficient and covers all bases for the HA setup, i.e. one way or another, no single failure would prevent data from being replicated to all the remaining DB instances.
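
The single-failure claim can be checked with a small sketch. It treats the Option 2 pulls as a directed graph and, for each server taken down in turn, verifies that a write made on any surviving server still reaches every other survivor. This only models whole-server failures and assumes each pull runs continuously; the function names are illustrative.

```python
from collections import deque

# "X pulls from Y" is stored as Y being in OPTION_2[X]; data flows Y -> X.
OPTION_2 = {
    "A1": {"B1", "A2"},
    "A2": {"B2", "A1"},
    "B1": {"A1", "B2"},
    "B2": {"A2", "B1"},
}


def reachable(pulls, origin, down):
    """Return the servers that eventually receive a write made on `origin`."""
    seen = {origin}
    queue = deque([origin])
    while queue:
        current = queue.popleft()
        for server, sources in pulls.items():
            if server in down or server in seen:
                continue
            # `server` receives the data if it pulls from `current`.
            if current in sources:
                seen.add(server)
                queue.append(server)
    return seen


def survives_any_single_failure(pulls):
    """True if, with any one server down, every survivor still gets every write."""
    servers = set(pulls)
    for failed in servers:
        alive = servers - {failed}
        for origin in alive:
            if reachable(pulls, origin, {failed}) != alive:
                return False
    return True


print(survives_any_single_failure(OPTION_2))  # prints True
```

The pulls in Option 2 form a ring (A1, A2, B2, B1), and removing one server from a ring leaves a chain that is still connected, which is why the check passes.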

There's not much data in there, we're talking about 50-100MB of data max.

Comments welcome. Thanks!

rytis

1 Answer


What I'd look for is minimizing cross-DC operations, because:

  1. they are more expensive
  2. they are less reliable

Your replication scheme should then take the "safest" path into consideration: basically, make sure that you replicate data across data centers without impacting your main servers.
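
As a rough way to compare the two options from the question on this criterion, the sketch below counts how many pull replications cross the DC boundary in each. It assumes the first letter of a server name identifies its DC, as in the question; the function name is illustrative.

```python
# Each server mapped to the servers it pulls from, as listed in the question.
OPTION_1 = {
    "A1": ["B1", "B2", "A2"],
    "A2": ["B1", "B2", "A1"],
    "B1": ["A1", "A2", "B2"],
    "B2": ["A1", "A2", "B1"],
}
OPTION_2 = {
    "A1": ["B1", "A2"],
    "A2": ["B2", "A1"],
    "B1": ["A1", "B2"],
    "B2": ["A2", "B1"],
}


def cross_dc_pulls(pulls):
    """Count pull replications whose source and target are in different DCs."""
    # The first character of each name ("A" or "B") identifies the data center.
    return sum(
        1
        for target, sources in pulls.items()
        for source in sources
        if source[0] != target[0]
    )


print(cross_dc_pulls(OPTION_1), cross_dc_pulls(OPTION_2))  # prints "8 4"
```

By this count alone, Option 2 involves half as many cross-DC replications as Option 1.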

Alex Popescu
  • Which means option 2, then? I need to cater for either one of the A's failing or one of the B's failing. The network is assumed never to fail (well, yeah, as if... :) but that's the working assumption). – rytis Oct 09 '10 at 19:19