34

I was researching different load balancing algorithms for HTTP and found only three: Random, Round Robin, and Weighted Round Robin. Are there any other options?

Thanks Paul

Paul Sheldrake

3 Answers

43

The most common load balancing algorithms for HTTP load balancers are IMHO:

  • Round Robin (sometimes called "Next in Loop").

  • Weighted Round Robin -- as Round Robin, but some servers get a larger share of the overall traffic.

  • Random.

  • Source IP hash. Connections are distributed to backend servers based on the source IP address. As long as all servers are running, a given client IP address will always go to the same web server. If a web node fails and is taken out of service, the distribution changes.

  • URL hash. Much like source IP hash, except hashing is done on the URL of the request. Useful when load balancing in front of proxy caches, as requests for a given object will always go to just one backend cache. This avoids cache duplication (the same object stored in several or all caches) and increases the effective capacity of the backend caches.

  • Least connections, weighted least connections. The load balancer monitors the number of open connections for each server, and sends each new request to the server with the fewest.

  • Least traffic, weighted least traffic. The load balancer monitors the bitrate from each server, and sends each new request to the server that has the least outgoing traffic.

  • Least latency. Perlbal, for example, makes a quick HTTP OPTIONS request to the backend servers, and sends the request to the first server to answer.
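
A few of the approaches above can be sketched in Python. This is only an illustration under my own assumptions, not how any particular load balancer implements them: the function names are made up, the weights are assumed to be small positive integers, and SHA-256 is just one convenient stable hash.

```python
import hashlib
import itertools


def round_robin(servers):
    """Return an iterator that hands out servers in a repeating loop."""
    return itertools.cycle(servers)


def weighted_round_robin(weights):
    """Return an iterator where each server appears `weight` times per loop.

    `weights` maps server name -> positive integer weight (an assumption).
    """
    expanded = [name for name, weight in weights.items() for _ in range(weight)]
    return itertools.cycle(expanded)


def hash_pick(key, servers):
    """Map a key (a source IP or a URL) to one server via a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


def least_connections(open_connections):
    """Return the server with the fewest open connections.

    `open_connections` maps server name -> current connection count.
    """
    return min(open_connections, key=open_connections.get)
```

For example, `least_connections({"web1": 3, "web2": 1, "web3": 2})` picks `"web2"`, and `hash_pick("10.0.0.1", servers)` keeps returning the same backend for as long as the `servers` list is unchanged. Remove a server from the list and the modulo changes, which is why the distribution shifts when a node is taken out of service.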

Arguably the above aren't algorithms in a strict computer science sense; they're more general descriptions of common approaches. Here is one little paper from Cisco which describes some of the algorithms they use in more detail. Implementations from other vendors will be slightly different.

There are edge cases where the more exotic algorithms are useful -- for example video streaming may lend itself well to "least traffic". But generally speaking, for most web applications and web sites, the optimal solution is:

  • A shared / distributed session system, so that any webnode can answer any user request (i.e. user session data such as session cookies is equally available to all servers).

  • Load balancing using Round Robin (optionally Weighted Round Robin) or Random distribution. Round Robin and Random are simple and resilient algorithms without any 'hot spot' problems, i.e. the load distribution to backends remains fair in all situations.
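
To illustrate the 'hot spot' point, here is a rough Python sketch. Everything in it is an assumption made for the example: three hypothetical backends, and an invented traffic mix where a single source IP (think of a large NAT gateway) sends 90 out of 100 requests.

```python
import hashlib
from collections import Counter
from itertools import cycle

SERVERS = ["web1", "web2", "web3"]  # hypothetical backends


def by_source_ip(ip):
    """Pick a backend from a stable hash of the client IP."""
    digest = hashlib.sha256(ip.encode("utf-8")).digest()
    return SERVERS[int.from_bytes(digest[:8], "big") % len(SERVERS)]


def simulate(client_ips):
    """Count requests per backend under round robin and source IP hash."""
    rr = cycle(SERVERS)
    rr_counts = Counter(next(rr) for _ in client_ips)
    hash_counts = Counter(by_source_ip(ip) for ip in client_ips)
    return rr_counts, hash_counts


# Invented traffic mix: one busy source IP plus ten ordinary clients.
traffic = ["10.0.0.1"] * 90 + ["10.0.1.%d" % i for i in range(10)]
rr_counts, hash_counts = simulate(traffic)
print("round robin:   ", dict(rr_counts))
print("source IP hash:", dict(hash_counts))
```

Round robin splits the 100 requests 34/33/33 regardless of who sent them, while the hash sends all 90 requests from the busy IP to a single backend.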

5

The question is incomplete:

Load Balance WHAT?

CPUs may take saturation; the usual perspective is backwards - pushing at a resource instead of pulling to it.

Disks have many different kinds of loads to balance, such as space, read speeds, write speeds, throughput, etc.

Networks can be load balanced based upon latency or total throughput...

People can be load balanced based on individual capacity; some multi-task well, others don't and then there's quality vs quantity. You might optimize your human resources based on many factors and with different weights given to different attributes.

The above is far from exhaustive; the point is that different resources take completely different kinds of load balancing. Of their available attributes and capacities you have to state WHICH are of interest in balancing.

What you are trying to balance is the first criterion in making a good balancing algorithm. And the suggestion that there are only three is ... unenlightened. It would be worthy of a PhD to do a proper job trying to delineate all the ways "loads are balanced."


Richard T
  • you are missing the question Richard, algorithms are basis of any method or implementation. – monomyth Feb 12 '10 at 16:20
  • @monomyth, @Richard is right -- The algorithm choice depends on what you're load balancing. You can develop an algorithm to load balance disk space usage and that may not apply at all to something else, like HTTP requests. – Josh Feb 12 '10 at 19:13
  • @Josh, @ Richard, the concepts of load balancing are the same though. You might still use Round Robin for balancing disk usage, iSCSI, HTTP, CPU, anything. – Mark Henderson Feb 13 '10 at 22:52
  • @Farseeker I agree, Round Robin is pretty universal. But aren't there some load balancing algorithms that are specific to the task? – Josh Feb 14 '10 at 01:47
0

Not a direct answer to your question, but an actual solution we've found useful. Using LVS and the pulse daemon, our HTTP load balancing is configured to call a custom bash script that determines load on the "real servers" via a simple SSH connection and a call to uptime.

Then, based on the load average of the servers, a weighting is set per server. Not the most scientific approach, as load average is not necessarily indicative of HTTP connections or CPU load caused by those connections. Nonetheless, we've had surprisingly effective results.
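
The load-to-weight step could look roughly like the Python sketch below. To be clear, this is not the bash script described above: the function names, the `max_weight` of 100 and the `1 / (1 + load)` formula are all assumptions picked for illustration, and fetching the `uptime` output over SSH is left out.

```python
import re


def parse_load(uptime_output):
    """Extract the 1-minute load average from a line of `uptime` output."""
    match = re.search(r"load averages?:\s*([\d.]+)", uptime_output)
    if match is None:
        raise ValueError("no load average found in: %r" % uptime_output)
    return float(match.group(1))


def weight_for_load(load, max_weight=100):
    """Map a load average to a weight: idle -> max_weight, busier -> lower.

    The formula is hypothetical; the weight never drops below 1 so the
    server stays in rotation.
    """
    return max(1, round(max_weight / (1.0 + load)))


sample = " 16:20:01 up 12 days,  3:04,  2 users,  load average: 1.00, 0.35, 0.30"
print(weight_for_load(parse_load(sample)))  # prints 50
```

With this formula an idle server gets weight 100, a load average of 1.0 gives 50, and 3.0 gives 25. The resulting weight would then be fed to whatever weighted scheduler the load balancer uses.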

My 2c. YMMV.

PS: take a look at the LVS project - you'll definitely find info on load balance scheduling implementations.

Zayne S Halsall