I've always used Round-Robin DNS, with a long TTL, as a load balancer. It works really well for HTTP/HTTPS services accessed from browsers.
I stress browsers because most of them implement some sort of «retry on another IP», but I don't know how other libraries or software would handle multiple IPs.
When the browser doesn't get a reply from one server, it automatically tries the next IP, and then sticks with it (until it goes down... and then it tries another one).
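For clients that are not browsers, this «retry on another IP» behaviour has to be written by hand. Here is a minimal Python sketch of the idea. It is not what any browser actually runs: the function names are made up for the illustration, and it only covers the initial TCP connect, not the "stick with it" part.

```python
import socket


def resolve_all(hostname, port):
    """Return every distinct (ip, port) the resolver gives for hostname."""
    endpoints = []
    for *_, sockaddr in socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM):
        endpoint = (sockaddr[0], sockaddr[1])
        if endpoint not in endpoints:
            endpoints.append(endpoint)
    return endpoints


def connect_first_available(endpoints, timeout=3.0):
    """Try each endpoint in order; return (socket, endpoint) for the first that accepts."""
    last_error = None
    for endpoint in endpoints:
        try:
            return socket.create_connection(endpoint, timeout=timeout), endpoint
        except OSError as exc:  # refused, unreachable or timed out: try the next IP
            last_error = exc
    raise ConnectionError(f"no endpoint reachable, last error: {last_error}")
```

A caller would pass the output of resolve_all() for the Round-Robin name straight into connect_first_available(), and keep using the returned endpoint until it fails.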
Back in 2007, I ran the following test:
- add an iframe on my website, pointing to one Round-Robin entry, such as http://roundrobin.test:10080/ping.php
- the page was served by 3 PHP sockets, listening on 3 different IPs, all on port 10080 (I couldn't afford to test on port 80, as my website was running on it)
- one socket (say A) was there to check that the browser could connect on port 10080 (as many companies only allow standard ports)
- the other two sockets (say B and C) could be enabled or disabled on the fly.
I let it run for one hour and collected a lot of data. The result: for 99.5% of the hits on socket A, I also had a hit on either socket B or C (I never disabled both of these at the same time, of course).
Browsers were: iPhone, Chrome, Opera, MSIE 6/7/8, BlackBerry, Firefox 3/3.5... So even the not-that-compliant browsers handled it right!
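The 99.5% figure is a simple ratio. As a sketch of how such a number can be computed, assuming each hit is logged with some client identifier (a hypothetical log format for illustration, not the code from the 2007 test):

```python
def fallback_success_rate(hits_a, hits_b_or_c):
    """Share of clients seen on socket A that were also seen on socket B or C."""
    clients_a = set(hits_a)
    if not clients_a:
        return 0.0
    reached = clients_a & set(hits_b_or_c)
    return len(reached) / len(clients_a)


# Made-up identifiers: 4 clients reached A, 3 of them also reached B or C.
rate = fallback_success_rate(["c1", "c2", "c3", "c4"], ["c1", "c2", "c3"])
print(f"{rate:.1%}")  # prints 75.0%
```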
I haven't tested it again since, but perhaps I'll set up a new test one day, or release the code on GitHub so that others can test it.
Important note: even if it works most of the time, some requests will still fail. I do use it for POST requests too: my application returns an error message when a request fails, so the user can send the data again, and the browser will most probably use another IP that time and the save will work. For static content, it works really well.
So if you're working with browsers, do use Round-Robin DNS, for either static or dynamic content; you'll be mostly fine. Servers can also go down in the middle of a transaction, and even the best load balancer can't handle such a case.
For dynamic content, you have to keep your sessions/database/files synchronized across servers, otherwise you won't be able to handle this (but that's also true with a real load balancer).
Additional note: you can test the behaviour on your own IP using iptables. For example, before your firewall rule for HTTP traffic, add:
iptables -A INPUT -p tcp --dport 80 --source 12.34.56.78 -j REJECT
(where 12.34.56.78 is obviously your IP)
Don't use DROP, as it leaves the port filtered, and your browser will wait until it times out. Now you can enable or disable one server or the other. The most obvious test is to disable server A, load the page, then enable server A and disable server B. When you load the page again, you'll see a short wait from the browser, then it will load from server A again. In Chrome, you can confirm the server's IP by looking at the request in the network panel. In the General section of the Headers tab, you'll see a fake header named Remote Address:. This is the IP you got the answer from.
So, if you need to put one server in maintenance mode, just block its HTTP/HTTPS traffic with an iptables REJECT rule; all requests will go to the other servers (after one short wait, almost unnoticeable for users).
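To see from the client side why REJECT is preferable to DROP, here is a small Python sketch that classifies a server by how the TCP connect ends. The function name and return labels are made up for the illustration, and I'm assuming a rejected connection shows up as "connection refused" on the client, which is what normally happens but can vary by OS and by reject type.

```python
import socket


def probe(ip, port=80, timeout=3.0):
    """Classify one server: 'open', 'refused' (REJECT-like) or 'timeout' (DROP-like)."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"  # fails right away, so the client can move to the next IP
    except socket.timeout:
        return "timeout"  # client waited the whole timeout before giving up
    except OSError:
        return "error"    # anything else, e.g. no route to host
```

Running probe() against each IP of the Round-Robin entry before and after adding the rule should show the blocked server switch from "open" to "refused".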