
I have a cloud-based (Amazon AWS, Rackspace, whatever) multi-tenant SaaS application, and I need to support HTTPS communications for multiple unrelated tenant domains.

As an illustrative example, let's say our SaaS is available at:

https://foo.com

Tenants can access their tenant-specific UI and service endpoints via:

https://tenantA.foo.com
https://tenantB.foo.com
...

This is easy to support today with one wildcard SSL certificate.

However, with our SaaS, our tenants may wish to expose our UI (but branded for them) directly to their own users.

This causes a problem: let's say John Smith is an existing customer of tenantA (and has no knowledge of foo.com). If John Smith is directed to https://tenantA.foo.com, they could easily become confused (e.g. 'who the heck is foo.com? Why am I here? Am I being hacked? Aahhh!').

To avoid this problem, our tenants would set up a subdomain like:

https://foo.tenantA.com

This avoids a lot of end-user confusion: tenantA's users can see a URL they recognize as owned by tenantA and will more readily use the app. But tenantA wants us to host everything about the app, which means foo.com's infrastructure needs to serve the SSL connection.

To that end, we want to support the following:

  1. A tenant uploads to us an SSL cert+key for foo.tenantA.com.
  2. We take that SSL cert and dynamically install it into a highly available Load Balancing cluster (2 or more LB nodes) that load balances requests to our SaaS application web endpoints.
  3. The tenant updates their DNS so that foo.tenantA.com is a CNAME (alias) pointing to tenantA.foo.com.

This way our Load Balancer pool will serve/terminate all HTTPS communications to foo.tenantA.com and all requests are load balanced to our SaaS web server cluster.

This means SSL certs must be able to be added to and removed from the LB pool at runtime, and such changes cannot interrupt the ability to service existing or new HTTPS requests.
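
To make the runtime requirement concrete, here's a rough sketch using Python's standard ssl module (purely illustrative, assuming SNI is acceptable for our clients - the file names and the serve() loop are made up). It shows the kind of per-hostname certificate selection, with certs added and removed on the fly, that we'd expect the LB layer to perform with whatever hot-reload mechanism it provides:

    # Rough sketch only: a hostname -> SSLContext map that can be updated at
    # runtime and is consulted per connection via SNI. File names, hostnames,
    # and the serve() loop are hypothetical.
    import socket
    import ssl
    import threading

    _tenant_contexts = {}          # hostname -> ssl.SSLContext
    _lock = threading.Lock()

    def add_tenant_cert(hostname, cert_file, key_file):
        """Register (or replace) a tenant's cert+key without a restart."""
        ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
        with _lock:
            _tenant_contexts[hostname] = ctx

    def remove_tenant_cert(hostname):
        with _lock:
            _tenant_contexts.pop(hostname, None)

    def _sni_callback(tls_socket, server_name, default_context):
        # Called during the handshake with the hostname the client asked for;
        # swap in that tenant's context if we have one, else keep the default.
        with _lock:
            ctx = _tenant_contexts.get(server_name)
        if ctx is not None:
            tls_socket.context = ctx

    # Default context serves the *.foo.com wildcard cert.
    default_context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    default_context.load_cert_chain("wildcard.foo.com.pem", "wildcard.foo.com.key")
    default_context.sni_callback = _sni_callback

    def serve(host="0.0.0.0", port=443):
        with socket.create_server((host, port)) as listener:
            while True:
                conn, _addr = listener.accept()
                try:
                    tls_conn = default_context.wrap_socket(conn, server_side=True)
                    # ... hand tls_conn off to the balancing/proxy logic ...
                    tls_conn.close()
                except ssl.SSLError:
                    conn.close()

The point is only that add_tenant_cert()/remove_tenant_cert() can run while the listener keeps accepting connections; whichever LB we end up with needs an equivalent of this.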

Also, as we'll deploy on virtualized hardware (e.g. EC2) with Linux, we don't have access to the hardware/data center, so this must be a software-based solution that runs on Linux. It must also be highly available (2 or more LB 'nodes').

Does anyone know of a concrete solution? For example, can Nginx, HAProxy or Squid (or anything else) be set up to support this? Is there a 'recipe' or existing solution that is documented and suitable?

P.S. Amazon's Elastic Load Balancer (at the time of writing) cannot practically satisfy this need - it would require a separate ELB for each tenant domain. Since every ELB needs to 'ping' the web servers, if you had 500 tenants you'd have 500 ELBs pinging the SaaS web service endpoints - a non-negligible performance hit.

Les Hazlewood
  • Are you able to use Server Name Indication (i.e., no XP clients), or do you have enough IP addresses available to dedicate one per SSL cert? – Shane Madden Feb 03 '12 at 22:42
  • @ShaneMadden SNI might be an option we could take - thanks for pointing it out! (we probably couldn't acquire enough IPs from Amazon since they have a 5 IP limit per AWS account). – Les Hazlewood Feb 04 '12 at 01:23

2 Answers


Update 2017-09-13: SNI has now become prevalent enough in the mainstream browsers that it can probably be used to address the request, and this answer should be considered out of date.

 


Without SNI, the only way to support this is to have an IP for each of your clients' certificates. When you connect via https, the server has to present a certificate before the browser sends any HTTP data, so there's no chance for the browser to say "I'm here for foo.tenantA.com". The only way for the server to know which SSL cert should be used to encrypt the connection is based on the IP that the connection came in on.
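
To make that concrete, here's a rough Python sketch (made-up addresses and file names; our actual setup does this at the load balancer, not in application code) of what "pick the cert by the IP the connection arrived on" means: one listening socket per dedicated IP, each with its own certificate.

    # Illustration only: pre-SNI, the certificate has to be chosen by the local
    # IP the client connected to, so every tenant domain gets its own dedicated
    # IP, listening socket, and SSLContext. Addresses and file names are made up.
    import selectors
    import socket
    import ssl

    TENANT_IPS = {
        "203.0.113.10": ("foo.tenantA.com.pem", "foo.tenantA.com.key"),
        "203.0.113.11": ("foo.tenantB.com.pem", "foo.tenantB.com.key"),
    }

    sel = selectors.DefaultSelector()

    for ip, (cert_file, key_file) in TENANT_IPS.items():
        ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
        ctx.load_cert_chain(cert_file, key_file)
        listener = socket.create_server((ip, 443))
        listener.setblocking(False)
        # Remember which certificate belongs to this listening socket.
        sel.register(listener, selectors.EVENT_READ, data=ctx)

    while True:
        for sel_key, _events in sel.select():
            listener, ctx = sel_key.fileobj, sel_key.data
            conn, _addr = listener.accept()
            conn.setblocking(True)
            try:
                tls_conn = ctx.wrap_socket(conn, server_side=True)
                # ... forward the decrypted traffic to a backend web server ...
                tls_conn.close()
            except ssl.SSLError:
                conn.close()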

Now this is still possible, but it means that you're going to need a lot of IPs. We actually run this exact setup at my work. We have 2 active/active load balancers, with half the IPs on one balancer and the other half on the other (a total of around 500 IPs). Then we have several web servers on the back end that take all the connections. Any web server can fail and the load balancer will stop sending it connections. Or a load balancer itself can fail and the other will take over all of its IPs.
The load balancing software that does this is Pacemaker and ldirectord (both are mainstream projects, and whatever distro you run should have them in its repository). The Linux kernel itself (via LVS/IPVS) is what actually does the load balancing; the userspace software is just responsible for health checks and failover.

Note: For the load-balancing piece, there are lots of alternatives to ldirectord, such as keepalived and surealived. For the actual load-balancer failover software, though, Pacemaker is what you should be using.
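
To be clear about the division of labor, here's a toy Python loop (hypothetical backend addresses, and very much not ldirectord itself) showing the health-check job ldirectord does; the kernel's IPVS table does the actual balancing, and Pacemaker moves the virtual IPs to the surviving director when one fails:

    # Toy illustration (not ldirectord itself): the health-check job is just
    # "probe each real server and pull it out of rotation when the probe fails".
    # Backend addresses are hypothetical.
    import socket
    import time

    BACKENDS = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]   # SaaS web servers
    healthy = set(BACKENDS)

    def probe(host, port=443, timeout=2.0):
        """Return True if a TCP connection to host:port succeeds."""
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            return False

    while True:
        for host in BACKENDS:
            if probe(host):
                healthy.add(host)        # ldirectord: re-add to the IPVS table
            else:
                healthy.discard(host)    # ldirectord: drop from the IPVS table
        time.sleep(5)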

Basic guides:

  • This will provide basic instructions for getting Pacemaker configured. You can skip past all the previous stuff, as CMAN is its replacement. The only thing you need to do to get up to that point in the guide is to install Pacemaker and its dependencies. Stop at section 8.2.4. You do not need to go on to section 8.3, as that's not relevant to what you're doing.

  • Once you have pacemaker working, this will provide a very basic configuration to load balance a http server.

  • You might also look at this and this. They're more of a high-level overview of Pacemaker, what it does, and how to use it.

phemmer
  • You could also use SSL with SNI, but you can't support IE on Windows XP. http://en.wikipedia.org/wiki/Server_Name_Indication#Support – Kyle Feb 03 '12 at 23:01
  • @kyle yes, I can't wait until the day comes when this can be implemented without cutting off a large portion of our customer base :-( – phemmer Feb 03 '12 at 23:14
  • @Patrick, I forgot to cover the multi-IP vs SNI issues - thanks for clarifying. That being said, thanks so much for the pointers to Pacemaker and ldirectord, I'll check them out asap. Any pointers/links for how to set this up easily? – Les Hazlewood Feb 04 '12 at 00:18
  • @Patrick - hmmm - "the other will take all of its IPs". Is this even possible on Amazon EC2, even with Elastic IPs? Or do you need access to the iron to be able to support this feature? I need to investigate... – Les Hazlewood Feb 04 '12 at 00:35
  • @LesHazlewood I don't know about the Amazon EC2 question. As long as both load balancers are on the same subnet, it should work though. For a guide, I'll amend my answer up above. – phemmer Feb 04 '12 at 00:55
  • @Patrick this is super helpful - thanks!!! I'm not sure if this will work on ec2, but it is enough to get us thinking - answer awarded. – Les Hazlewood Feb 04 '12 at 01:03
  • @LesHazlewood, did you get this setup working on EC2? I needed a similar setup and I was wondering whether you ended up with hundreds of IPs on AWS. Thanks for your attention. – thanikkal Apr 07 '12 at 02:32
  • @thanikkal Sorry, we didn't get this set up yet. It is a deferred priority for us, to be addressed at a later date. – Les Hazlewood Apr 09 '12 at 17:53

What about just recommending that your clients put a thin wrapper (a simple reverse proxy) in front of it themselves? Something like this:

  1. End user sends a request to https://api.tenantA.com
  2. api.tenantA.com just forwards the request to https://tenanta.foo.com
  3. The response is then relayed back the same way.

I'm guessing that as long as this is more of an edge case it should work fine.
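
Purely as a sketch of the idea (not something I'd run in production), here's roughly what that thin wrapper amounts to in Python - the hostname comes from the question's example, it only handles GET, and it ignores streaming, error responses, and X-Forwarded-For-type headers that a stock reverse proxy such as nginx or Apache mod_proxy would handle for you:

    # Toy sketch of the thin wrapper: replay the incoming request against the
    # SaaS endpoint and return the upstream response. GET only, no streaming,
    # no error handling, no hop-by-hop header care.
    import urllib.request
    from http.server import BaseHTTPRequestHandler, HTTPServer

    UPSTREAM = "https://tenanta.foo.com"   # the SaaS-hosted endpoint

    class ThinWrapper(BaseHTTPRequestHandler):
        def do_GET(self):
            upstream_req = urllib.request.Request(UPSTREAM + self.path, method="GET")
            # Pass the original headers along, minus the Host header.
            for name, value in self.headers.items():
                if name.lower() != "host":
                    upstream_req.add_header(name, value)
            with urllib.request.urlopen(upstream_req) as resp:
                body = resp.read()
                self.send_response(resp.status)
                for name, value in resp.getheaders():
                    if name.lower() not in ("transfer-encoding", "connection"):
                        self.send_header(name, value)
                self.end_headers()
                self.wfile.write(body)

    if __name__ == "__main__":
        HTTPServer(("0.0.0.0", 8080), ThinWrapper).serve_forever()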

Kyle