
Typically, load balancers like Amazon's Elastic Load Balancers use a DNS record set with multiple A records to provide multiple load balancer instances which can handle traffic from requesting clients:

$ dig +short my-fancy-elb.us-east-1.elb.amazonaws.com
10.0.1.1
10.0.1.2

If I attempt to curl this URL in verbose mode, I notice that curl seems to round-robin attempts to the two IP addresses:

$ curl -ivs http://my-fancy-elb.us-east-1.elb.amazonaws.com | grep -i 'connected'
* Connected to my-fancy-elb.us-east-1.elb.amazonaws.com (10.0.1.1)
$ curl -ivs http://my-fancy-elb.us-east-1.elb.amazonaws.com | grep -i 'connected'
* Connected to my-fancy-elb.us-east-1.elb.amazonaws.com (10.0.1.2)

Is the fact that curl does round-robin on the A records described in the record set done by the curl binary itself or is it something that the Linux kernel does for it?

TCP exists at layer 4 and DNS exists at layer 7, so I'd imagine that individual binaries and libraries would have to implement their own load-balancing and failover: fetching the DNS record set for the given domain name and choosing a TCP address to connect to from that set.

Can I reasonably expect that programming languages, browsers, and libraries like curl will do load-balancing and failover on A records for me?

Naftuli Kay
  • The kernel has no say in this. It's all decided in user mode by either application or libraries. – kasperd May 02 '16 at 21:48

4 Answers


The short answer is that it varies.

When multiple address records are present in the answer set, a queried DNS server normally returns them in a randomized order. The operating system will typically present the returned record set to the application in the order it was received. That said, there are options on both sides of the transaction (the nameserver and the OS) which can result in different behaviors; usually these are not employed. As an example, a little-known file called /etc/gai.conf controls this on glibc-based systems.
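
To see what a given system actually hands back, a minimal C sketch along these lines (the hostname is just a placeholder) prints the addresses in the order getaddrinfo() returns them; running it repeatedly against a round-robin name shows whether the answers arrive rotated or re-sorted:

/*
 * Sketch: print the addresses for a name in the order that
 * getaddrinfo() hands them to the application.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <netdb.h>
#include <sys/socket.h>

int main(int argc, char **argv)
{
    const char *name = (argc > 1) ? argv[1] : "example.com";  /* placeholder name */
    struct addrinfo hints, *res, *rp;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;       /* both A and AAAA records */
    hints.ai_socktype = SOCK_STREAM;

    int err = getaddrinfo(name, "80", &hints, &res);
    if (err != 0) {
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(err));
        return EXIT_FAILURE;
    }

    for (rp = res; rp != NULL; rp = rp->ai_next) {
        char host[256];
        if (getnameinfo(rp->ai_addr, rp->ai_addrlen, host, sizeof(host),
                        NULL, 0, NI_NUMERICHOST) == 0)
            printf("%s\n", host);        /* numeric address, in returned order */
    }

    freeaddrinfo(res);
    return EXIT_SUCCESS;
}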

The Zytrax book (DNS for Rocket Scientists) has a good summary on the history of this topic, and concludes that RFC 6724 is the current standard that applications and resolver implementations should adhere to.

From here it's worth noting a choice quote from RFC 6724:

   Well-behaved applications SHOULD NOT simply use the first address
   returned from an API such as getaddrinfo() and then give up if it
   fails.  For many applications, it is appropriate to iterate through
   the list of addresses returned from getaddrinfo() until a working
   address is found.  For other applications, it might be appropriate to
   try multiple addresses in parallel (e.g., with some small delay in
   between) and use the first one to succeed.

The standard encourages applications not to stop at the first address on failure, but that is neither a requirement nor the behavior that many casually written applications will actually implement. You should never rely solely on multiple address records for high availability unless you are certain that most (or at least the most important) of your consuming applications will play nicely. Modern browsers tend to be good about this, but remember that they are not the only consumers you are dealing with.
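
For the common case, the iterate-until-success behavior the RFC describes is essentially the canonical getaddrinfo() client loop. A rough C sketch (the function name is arbitrary, and this is not any particular application's code) looks like this:

/*
 * Sketch of "iterate through the list of addresses returned from
 * getaddrinfo() until a working address is found": try each address
 * in order and keep the first socket that connects.
 */
#include <string.h>
#include <unistd.h>
#include <netdb.h>
#include <sys/socket.h>

/* Returns a connected socket, or -1 if every address failed. */
int connect_to_host(const char *host, const char *port)
{
    struct addrinfo hints, *res, *rp;
    int fd = -1;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;

    if (getaddrinfo(host, port, &hints, &res) != 0)
        return -1;

    for (rp = res; rp != NULL; rp = rp->ai_next) {
        fd = socket(rp->ai_family, rp->ai_socktype, rp->ai_protocol);
        if (fd == -1)
            continue;
        if (connect(fd, rp->ai_addr, rp->ai_addrlen) == 0)
            break;          /* success: keep the first address that works */
        close(fd);          /* this address failed: fall through to the next */
        fd = -1;
    }

    freeaddrinfo(res);
    return fd;
}

An application that only ever looks at the first result and gives up on the first connect() failure is exactly the kind of "casually written" consumer described above.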

(also, as @kasperd notes below, it's important to distinguish between what this buys you in HA vs. load balancing)

Andrew B
  • It's true that one shouldn't rely on multiple records in DNS responses as the only load balancing mechanism. DNS can be useful to do some very coarse load balancing. But no matter what you do at the DNS layer it will never produce an even spread of the load, so you'll need load balancing at another layer as well (or lots of excess capacity). Load balancing can actually be achieved even if you have only a single static A record, but I wouldn't recommend that approach. – kasperd May 02 '16 at 22:01
  • However, for failover you have to rely on the client implementing failover between different addresses. Any solution which doesn't rely on this on the client side will have a single point of failure somewhere. But as you point out, not all applications are well-behaved. So one needs to strive to make every one of the IPs handed out by DNS reliable, even if there are limits to the reliability achievable on individual IPs due to SPoFs. – kasperd May 02 '16 at 22:05
  • @kasperd All true. I also changed my verbiage from load balancing to HA, as it was incorrect in that context. – Andrew B May 02 '16 at 22:13

My guess is that the DNS TTL for the record is set really low, so curl just has to resolve the name again every time and gets another IP from the DNS server.

Neither curl nor the kernel is at all aware that this DNS-level load balancing happens, and you can't reasonably expect anything like that.

Sven
  • That said, the example looks like a flavor of UNIX. TTL will only be a factor here if a caching nscd (or full recursive server) is present on the system. The normal behavior will be for curl to look it up every time, and the recursor will usually rotate the order of the answers in the rrset for every query. – Andrew B May 02 '16 at 22:40

The basic point is that DNS servers usually cycle the records in a pseudorandom fashion.

fedor@piecka:~$ dig +short @ns1.yahoo.com yahoo.com
206.190.36.45
98.138.253.109
98.139.183.24
fedor@piecka:~$ dig +short @ns1.yahoo.com yahoo.com
98.139.183.24
206.190.36.45
98.138.253.109
fedor@piecka:~$ dig +short @ns1.yahoo.com yahoo.com
98.139.183.24
98.138.253.109
206.190.36.45

In the case of curl, it has its own DNS resolving library, which respects the order presented by the server.

There is a write-up on this topic at https://daniel.haxx.se/blog/2012/01/03/getaddrinfo-with-round-robin-dns-and-happy-eyeballs/; curl's implementation is mentioned there too.
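
For the "try multiple addresses in parallel" approach mentioned there (and in the RFC quoted in the other answer), a rough C sketch of staggered non-blocking connects might look like the following. This is simplified and is not curl's actual implementation; the function name, the 250 ms delay, and the attempt limit are just illustrative choices:

/*
 * Simplified sketch of staggered ("happy eyeballs" style) connection
 * attempts: start a non-blocking connect to the first address, and if
 * it has not completed within a short delay, start another attempt to
 * the next address while the first is still pending.  The first
 * socket whose connect completes successfully wins; the rest are closed.
 */
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <string.h>
#include <unistd.h>
#include <netdb.h>
#include <sys/socket.h>

#define ATTEMPT_DELAY_MS 250    /* wait this long before the next attempt */
#define MAX_ATTEMPTS     8

int staggered_connect(const char *host, const char *port)
{
    struct addrinfo hints, *res, *rp;
    struct pollfd pfds[MAX_ATTEMPTS];
    int nfds = 0, winner = -1;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    if (getaddrinfo(host, port, &hints, &res) != 0)
        return -1;

    rp = res;
    while (winner == -1 && (rp != NULL || nfds > 0)) {
        /* Start one more non-blocking attempt, if addresses remain. */
        if (rp != NULL && nfds < MAX_ATTEMPTS) {
            int fd = socket(rp->ai_family, rp->ai_socktype, rp->ai_protocol);
            if (fd != -1) {
                fcntl(fd, F_SETFL, fcntl(fd, F_GETFL, 0) | O_NONBLOCK);
                if (connect(fd, rp->ai_addr, rp->ai_addrlen) == 0) {
                    winner = fd;                 /* connected immediately */
                } else if (errno == EINPROGRESS) {
                    pfds[nfds].fd = fd;          /* pending: watch for writability */
                    pfds[nfds].events = POLLOUT;
                    nfds++;
                } else {
                    close(fd);                   /* failed outright: try the next */
                }
            }
            rp = rp->ai_next;
            if (winner != -1)
                break;
        }
        if (nfds == 0)
            continue;                            /* nothing pending: start the next */

        /* Wait briefly; on timeout, loop around and add another attempt. */
        int ready = poll(pfds, nfds, rp != NULL ? ATTEMPT_DELAY_MS : -1);
        if (ready < 0)
            break;

        for (int i = 0; i < nfds && winner == -1; i++) {
            if (pfds[i].revents & (POLLOUT | POLLERR | POLLHUP)) {
                int soerr = 0;
                socklen_t len = sizeof(soerr);
                getsockopt(pfds[i].fd, SOL_SOCKET, SO_ERROR, &soerr, &len);
                if (soerr == 0) {
                    winner = pfds[i].fd;         /* this attempt connected */
                } else {
                    close(pfds[i].fd);           /* failed: drop it from the set */
                    pfds[i] = pfds[nfds - 1];    /* compact the pending array */
                    nfds--;
                    i--;                         /* re-check the swapped-in entry */
                }
            }
        }
    }

    for (int i = 0; i < nfds; i++)               /* close the attempts that lost */
        if (pfds[i].fd != winner)
            close(pfds[i].fd);

    freeaddrinfo(res);
    return winner;                               /* connected socket, or -1 */
}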

Fedor Piecka

Is the fact that curl does round-robin on the A records described in the record set done by the curl binary itself or is it something that the Linux kernel does for it?

Neither. It's usually the DNS server that changes the order of the IP addresses. The curl library needs to resolve the hostname to get an IP address for each request. It sends a query to the DNS server, which sends back a list of IP addresses. The DNS server can also be local to the same machine, for caching. Most DNS servers rotate the IP list round-robin on every query, so you get a different IP for each request because the top of the list has changed. If you ping www.google.com from a Linux machine you will likely see a different address each time.

Do clients typically implement failover/load-balancing on multiple A records?

I performed a test with curl, fetching a file over HTTP. curl is able to retry with another IP when the first IP is not accessible (failover), so failover does work with curl for HTTP requests.