Node mirrors - or is it load balancing or reverse proxying?

1

If you've got a solution, preferably open source, of course(!), I'd be interested to know, but what I actually want to know is the correct terminology.

The device I'm thinking of is an address translator (like NAT) that takes the request for a URL and either answers it from its cache, or from one of the possible server machines.

If there's a write (a POST request), this device would send it to all servers.

So all servers can be in the same state, without any need for synchronisation. If one node fails, the remaining node(s) will be in the same state, so they can reply to requests.

So, as a diagram:

Client -> read -> device -> Server A/B (load balancing) or from cache


                                     ->  Server A
Client -> write (POST url) -> device ->  Server B
                                     ->  Server N

All the servers are given the same writes so Server A, B,...N are all in exactly the same state (if they're running).

Any server, or the cache, can reply to a read.
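To make the behaviour concrete, here's a very rough sketch of the sort of device I mean, in Python (standard library only). The backend names, port numbers and the naive cache are invented placeholders for illustration, not a proposal for the real implementation:

# Toy "mirroring device": reads come from the cache or one backend,
# writes (POSTs) are fanned out to every backend so they stay in step.
# Request headers are not forwarded in this sketch.
import itertools
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import Request, urlopen

BACKENDS = ["http://server-a:8080", "http://server-b:8080"]  # hypothetical
_cache = {}                  # naive read cache: path -> response body
_rr = itertools.count()      # round-robin counter for read requests


class MirrorProxy(BaseHTTPRequestHandler):
    def do_GET(self):
        # Answer from the cache if possible, otherwise from one backend.
        body = _cache.get(self.path)
        if body is None:
            backend = BACKENDS[next(_rr) % len(BACKENDS)]
            with urlopen(backend + self.path) as resp:
                body = resp.read()
            _cache[self.path] = body
        self.send_response(200)
        self.end_headers()
        self.wfile.write(body)

    def do_POST(self):
        # Send the same write to *all* backends so they stay in the same state.
        length = int(self.headers.get("Content-Length", 0))
        payload = self.rfile.read(length)
        reply = b""
        for backend in BACKENDS:
            req = Request(backend + self.path, data=payload, method="POST")
            with urlopen(req) as resp:
                reply = resp.read()
        _cache.clear()       # a write invalidates the read cache
        self.send_response(200)
        self.end_headers()
        self.wfile.write(reply)


if __name__ == "__main__":
    HTTPServer(("", 8000), MirrorProxy).serve_forever()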

The questions:

  • What is this device called?
  • Is this easy to set up with Apache/Squid? -- if so, where's an idiot's guide?
  • This makes the device itself the SPOF (single point of failure). How do you set it up with two independent devices doing this, so that if one fails the other takes over seamlessly?

Peter Brooks

Posted 2014-10-26T04:39:33.463

Reputation: 111

Thank you, I'll look into those. The problem is simple: there's a requirement to have a fairly highly available MediaWiki site. Mirroring the nodes is a much better solution than mirroring the databases or the wikis. – Peter Brooks – 2014-10-26T06:38:05.430

That looks very close - it seems that the term is 'content switching' or 'application switching'. The Cisco product brief doesn't directly describe what I was asking about, but it looks as if it's the right line of country. – Peter Brooks – 2014-10-26T06:46:55.823

I collected the info from my earlier comments and put it in an answer. After that, I deleted my comments to avoid duplicating information. – agtoever – 2014-10-27T10:05:14.637

Answers

0

The commercial, proprietary and a bit expensive solution: Cisco content switching
Cisco has a product line called Content Services Switches (their CSS 11500 series). Based on your question, this comes close to what you're looking for. The terms they use (and which you can use to Google for similar products) are "content switching" or "application switching".

(I haven't used such devices in real life, nor am I affiliated in any way with Cisco.)

High Availability design principles: Keep it simple
In my opinion, there is a weak point in your diagram, and that's the write (POST) to all servers. You rely on the "device" to keep all instances of the server (Apache, NFS folders, database, etc.) in sync. This introduces a new SPoF (the device itself, as you already noticed) and adds the complexity of synchronization. If one of the servers is (temporarily) unavailable, who is responsible for the re-sync? The server itself or the "device"?

A better pattern is to put the responsibility of failover / high availability within the infrastructure itself:

  • on a hardware level (RAID, redundant hardware, multiple network paths, etc.),
  • on the OS level (under Linux: Heartbeat, the cluster messaging layer; Pacemaker, the cluster resource manager; and Cluster Glue, the cluster management tools - see the rough sketch of the heartbeat idea after this list) and
  • on the application level. For example, Apache has several options, such as Camel; all database platforms come with clustering options; and NFS can be set up using the OS's HA features.
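To give a feel for what the cluster messaging layer does, here is a very rough Python sketch of the heartbeat idea: the standby node listens for periodic "alive" datagrams from the active node and takes over when they stop. The address, port, timeout and the take_over() action are made up for this illustration; a real stack such as Heartbeat/Pacemaker also handles fencing, quorum and resource agents.

# Standby side of a toy heartbeat: if the active node stops sending
# "alive" packets for TIMEOUT seconds, this node promotes itself.
import socket

LISTEN = ("0.0.0.0", 9999)   # hypothetical heartbeat port
TIMEOUT = 3.0                # seconds of silence before taking over


def take_over():
    # Placeholder: a real cluster manager would claim the floating IP
    # and start the Apache/MySQL/NFS resources here.
    print("active node is silent - promoting this node")


def standby_loop():
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(LISTEN)
    sock.settimeout(TIMEOUT)
    while True:
        try:
            sock.recvfrom(64)    # heartbeat received; stay on standby
        except socket.timeout:
            take_over()
            break


if __name__ == "__main__":
    standby_loop()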

Highly available failover LAMP server
I think for a MediaWiki the general solution is to build a highly available failover LAMP stack. If you Google those terms, lots of solutions pop up, such as the one described in this article and a great schematic of a highly available LAMP setup here on Server Fault.

agtoever

Posted 2014-10-26T04:39:33.463

Reputation: 5 490

That makes sense - I'm not keen on syncing the OS or the database (at least not all the time; if one node goes down, I'm happy to sync it up with the other). The application level would be ideal - I'll look at Camel. The problem with the high-availability example with LAMP is the SQL syncing, which makes the SQL system a SPOF - if you upgrade, you have to upgrade them all, and an error will propagate. With separate nodes syncing at the application level, one node can use MySQL and the other SQLite, so you're protected against database-software-induced errors, since the sync is at the XML level. – Peter Brooks – 2014-10-27T10:37:47.287

Having two brands of DBMS is very unusual; I've never seen that in real life. Software or configuration errors should be filtered out in the DTAP stages before a configuration goes to production.

– agtoever – 2014-10-27T22:13:14.637

Yes, I know it's unusual. I've known of cases (very few, admittedly) where an upgrade/patch to a database has destroyed it - with mirrored DBMSs, both have died and the entire system has gone. Pre-live testing of all patches should prevent this, certainly, but it doesn't always. So I think two is a good idea, particularly if the primary is SQLite for speed and the secondary MySQL for reliability. Human error can also destroy both; that's more difficult with different technologies and no mirroring. – Peter Brooks – 2014-11-01T07:01:42.840