I currently use DNS round robin for load balancing, which works great. The records look like this (with a TTL of 120 seconds):
;; ANSWER SECTION:
orion.2x.to. 116 IN A 80.237.201.41
orion.2x.to. 116 IN A 87.230.54.12
orion.2x.to. 116 IN A 87.230.100.10
orion.2x.to. 116 IN A 87.230.51.65
I learned that not every ISP / device treats such a response the same way. For example, some DNS servers rotate the addresses randomly or always cycle through them. Some just propagate the first entry, while others try to determine which is best (regionally near) by looking at the IP address.
However, if the user base is big enough (spread over multiple ISPs, etc.), it balances pretty well. The discrepancy between the highest and lowest loaded server hardly ever exceeds 15%.
However, I am now adding more servers to the system, and they do not all have the same capacity.
I currently only have 1 Gbps servers, but I also want to work with 100 Mbps and 10 Gbps servers.
So I want to introduce a 10 Gbps server with a weight of 100, a 1 Gbps server with a weight of 10, and a 100 Mbps server with a weight of 1.
I previously added servers twice to bring more traffic to them (which worked nicely: the bandwidth almost doubled). But adding a 10 Gbps server 100 times to DNS is a bit ridiculous.
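To make the arithmetic behind these weights concrete, here is a minimal Python sketch. The server labels are placeholders I made up for illustration, and it assumes traffic splits exactly in proportion to the weights, which real resolvers will only approximate.

```python
from fractions import Fraction

def expected_shares(weights):
    """Return the ideal traffic share of each server for the given weights."""
    total = sum(weights.values())
    return {name: Fraction(weight, total) for name, weight in weights.items()}

# Hypothetical labels; the weights are the ones from the question.
weights = {"10 Gbps": 100, "1 Gbps": 10, "100 Mbps": 1}
shares = expected_shares(weights)

for name, share in shares.items():
    print(f"{name}: {float(share):.2%}")

# Expressing the same weights by repeating records would need this many entries.
print("records needed when duplicating:", sum(weights.values()))
```

With these weights the 10 Gbps server should ideally carry 100/111 of the traffic, and the duplication approach would need 111 records in total.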
So I thought about using the TTL.
Suppose I give server A a TTL of 240 seconds and server B only 120 seconds (which is about the minimum to use for round robin, as a lot of DNS servers reportedly raise a lower TTL to 120). I think something like this should occur in an ideal scenario:
First 120 seconds:
50% of requests get server A -> keep it for 240 seconds
50% of requests get server B -> keep it for 120 seconds
Second 120 seconds:
50% of requests still have server A cached -> keep it for another 120 seconds
25% of requests get server A -> keep it for 240 seconds
25% of requests get server B -> keep it for 120 seconds
Third 120 seconds:
25% will get server A (from the 50% of server A that has now expired) -> cache for 240 seconds
25% will get server B (from the 50% of server A that has now expired) -> cache for 120 seconds
25% will have server A cached for another 120 seconds
12.5% will get server B (from the 25% of server B that has now expired) -> cache for 120 seconds
12.5% will get server A (from the 25% of server B that has now expired) -> cache for 240 seconds
Fourth 120 seconds:
25% will have server A cached -> cache for another 120 seconds
12.5% will get server A (from the 25% of server B that has now expired) -> cache for 240 seconds
12.5% will get server B (from the 25% of server B that has now expired) -> cache for 120 seconds
12.5% will get server A (from the 25% of server A that has now expired) -> cache for 240 seconds
12.5% will get server B (from the 25% of server A that has now expired) -> cache for 120 seconds
6.25% will get server A (from the 12.5% of server B that has now expired) -> cache for 240 seconds
6.25% will get server B (from the 12.5% of server B that has now expired) -> cache for 120 seconds
12.5% will have server A cached -> cache for another 120 seconds
... I think I lost something at this point, but I think you get the idea...
As you can see this gets pretty complicated to predict and it will for sure not work out like this in practice. But it should definitely have an effect on the distribution!
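The bookkeeping in the walk-through above can be automated. The following is a minimal Python sketch of that idealized model only. It assumes every client caches exactly one address for that address's full TTL and that each re-resolving client gets server A or server B with equal probability. The function name and the use of whole 120-second periods are my own choices for illustration, and real resolvers will not behave this cleanly.

```python
from fractions import Fraction

def simulate_ttl_shares(ttl_periods, periods):
    """Return, per period, the share of clients pointing at each server.

    ttl_periods maps a server name to its TTL in whole 120 s periods.
    """
    n = len(ttl_periods)
    cached = {}            # (server, periods left) -> share of all clients
    expired = Fraction(1)  # at the start nobody has anything cached
    history = []
    for _ in range(periods):
        # Clients without a valid cache entry re-resolve, split evenly.
        for server, ttl in ttl_periods.items():
            key = (server, ttl)
            cached[key] = cached.get(key, Fraction(0)) + expired / n
        # Record who is using which server during this period.
        shares = {server: Fraction(0) for server in ttl_periods}
        for (server, _left), share in cached.items():
            shares[server] += share
        history.append(shares)
        # Age all cache entries by one period; entries at 1 expire.
        aged = {}
        expired = Fraction(0)
        for (server, left), share in cached.items():
            if left == 1:
                expired += share
            else:
                key = (server, left - 1)
                aged[key] = aged.get(key, Fraction(0)) + share
        cached = aged
    return history

# Server A: 240 s (2 periods), server B: 120 s (1 period).
history = simulate_ttl_shares({"A": 2, "B": 1}, 40)
for number, shares in enumerate(history[:4], start=1):
    print(number, float(shares["A"]), float(shares["B"]))
# 1 0.5 0.5
# 2 0.75 0.25
# 3 0.625 0.375
# 4 0.6875 0.3125
```

Under these assumptions the first four periods match the percentages listed above once summed per server, and the split oscillates toward about 2/3 for server A and 1/3 for server B, i.e. roughly in proportion to the TTLs.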
I know that weighted round robin exists and is controlled by the authoritative name server. When responding, it returns DNS records with a set probability that corresponds to the weighting. My DNS server does not support this, and my requirements are not that precise. If it doesn't weight perfectly it's okay, but it should go in the right direction.
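For reference, the weighted selection I mean could look roughly like the sketch below. This is not how any particular DNS server implements it. The addresses are documentation-range placeholders and `pick_weighted` is a hypothetical helper, shown only to illustrate returning a record with a probability proportional to its weight.

```python
import random
from collections import Counter

def pick_weighted(records, rng):
    """Pick one address with probability proportional to its weight."""
    addresses = list(records)
    weights = [records[address] for address in addresses]
    return rng.choices(addresses, weights=weights, k=1)[0]

# Placeholder addresses (192.0.2.0/24 is reserved for documentation).
records = {
    "192.0.2.10": 100,  # 10 Gbps server
    "192.0.2.20": 10,   # 1 Gbps server
    "192.0.2.30": 1,    # 100 Mbps server
}

rng = random.Random(42)  # fixed seed so repeated runs are comparable
counts = Counter(pick_weighted(records, rng) for _ in range(20000))
for address, count in counts.most_common():
    print(address, f"{count / 20000:.2%}")
```

Over many responses the heaviest record should be returned about 100/111 of the time, which is the behaviour I would like to approximate without a server that supports it.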
I think using the TTL field could be a more elegant and easier solution. It doesn't require a DNS server that controls this dynamically, which saves resources, and that is in my opinion the whole point of DNS load balancing vs. hardware load balancers.
My question now is: Are there any best practices / methods / rules of thumb to weight round robin distribution using the TTL attribute of DNS records?
Edit:
The system is a forward proxy server system. The amount of bandwidth (not requests) exceeds what one single server with Ethernet can handle, so I need a balancing solution that distributes the bandwidth across several servers. Are there any alternatives to using DNS? Of course I can use a load balancer with fibre channel etc., but the costs are ridiculous, and it only widens the bottleneck rather than eliminating it. The only thing I can think of is anycast (is it anycast or multicast?) IP addresses, but I don't have the means to set up such a system.