11

I want to have anycast for my web service, but I cannot find any information on how to achieve this or any company that can help.

I've found loads of companies offering anycast DNS, but that's not what I need.

I have a stateless web service that I want to geographically distribute, using anycast to load balance and increase uptime. Are there any technical reasons a company cannot just advertise an IP address at multiple datacenters for me?

What technical aspects about anycasting do I need to know about to evaluate existing offerings and help me find companies that could help me? What are the pitfalls I need to watch out for?

Mike Pennington
Filip Haglund
  • 2
    Please see https://www.youtube.com/watch?v=Ym96Z-sThZU (approx. 7:40) to see why anycast won't work (for you). – r_3 Dec 01 '14 at 16:33
  • 1
    To summarize the video: it won't work unless you own all the equipment from start to end as the routers have to be configured properly. – Nathan C Dec 01 '14 at 17:30
  • 2
    @r_3 That video explains the problem you are likely to run into with a naive implementation. But if you know what you are doing, that problem can be solved. And no, one does not need to control all the networking equipment in order to make it work. Some solutions rely on doing "magic" with the destination MAC address, just avoid that and use a proper IP tunnel instead, and the need for full control over the networking equipment goes away. – kasperd Dec 01 '14 at 19:48
  • Do you already have your own IP prefix of a size which is suitable for anycast? Or are you expecting the provider to give you a single IP out of their prefix? – kasperd Dec 01 '14 at 19:52
  • The part of the question saying `I do know what I'm doing` came across as a bit too arrogant, in particular given that certain details were omitted which anybody with sufficient knowledge about anycasting would understand were critical. I have tried my best to improve the question. If the question gets reopened now, I'll be happy to provide an answer as well. – kasperd Dec 01 '14 at 20:13
  • @NathanC, What do you mean by "own all the equipment"? How does Google DNS 8.8.4.4 do it then? – Pacerier Nov 03 '17 at 20:48
  • Hooking https://serverfault.com/q/616412/87017 – Pacerier Nov 03 '17 at 23:23
  • @kasperd, Which anycast providers are there in the market? – Pacerier Nov 03 '17 at 23:23
  • @Pacerier This overview of Google's network may give you some idea of how Google does it: https://cloud.google.com/about/locations/#network-tab – kasperd Nov 04 '17 at 01:36
  • 1
    @Pacerier I don't know of any provider selling anycast as a service. I know of CDNs which may be using anycast as part of the implementation of a higher level service. – kasperd Nov 04 '17 at 01:38
  • @kasperd, but would it be feasible to offer a single anycast IP that load balances between two different unicast public IPs in different geographical regions? – Jaime Hablutzel Aug 04 '19 at 01:29
  • @Pacerier, if you're looking for an anycast provider, perhaps you should consider [Netactuate](https://netactuate.com/anycast-delivery-platform/) – Mike Pennington Oct 27 '21 at 21:08

3 Answers

17

There are two separate aspects about anycast that need to be understood in order to address your particular request. The first part is how anycast addresses are advertised and routed. The second is what the challenges are in TCP to an anycast address, and how they might be addressed.

Announcing and routing

In order to keep the BGP table at an acceptable size, most ASes filter incoming announcements whose prefixes are too long. For IPv4 the threshold tends to be a /24 prefix, which covers 256 addresses. This means that in order to do anycast on the public internet, you need at least 256 addresses.
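
The arithmetic behind that threshold can be checked with a short sketch. It assumes the commonly cited /24 cutoff (individual networks may filter differently), and the prefixes are documentation addresses chosen purely for illustration:

```python
import ipaddress

# Assumption: the commonly cited IPv4 cutoff. Real networks set their own filters.
MAX_ACCEPTED_PREFIXLEN = 24

def is_likely_accepted(prefix: str) -> bool:
    """Return True if the prefix is no longer than the assumed filter threshold."""
    network = ipaddress.ip_network(prefix)
    return network.prefixlen <= MAX_ACCEPTED_PREFIXLEN

# Illustrative prefixes from the documentation range 198.51.100.0/24.
for prefix in ("198.51.100.0/24", "198.51.100.8/29"):
    network = ipaddress.ip_network(prefix)
    print(prefix, network.num_addresses, is_likely_accepted(prefix))
```

A /24 covers 256 addresses and passes the assumed filter, while a /29 covers only 8 and would be dropped by any network applying that cutoff.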

If you already have a /24 prefix of your own, then there is not much stopping a hosting provider from announcing it on your behalf. In that case anycasting could be as simple as finding a bunch of different hosting providers willing to provide this service at the right price and having all of them announce the prefix for you.

You can look at publicly available information about advertised routes to find providers that already announce prefixes on behalf of their customers; those providers are the most likely to offer this kind of service. One tool for looking this up in routing tables is bgp.he.net.

If you do not have your own prefix and want one from a provider, it is important to understand what the limitations mentioned above mean for that provider.

The provider has enough IP addresses that they could configure an anycast prefix. However once they do that, they are committed to using all 256 addresses as anycast. And all 256 addresses must be hosted in the exact same set of locations.

For this reason you sometimes see 256 addresses allocated in order to use just one of them for an anycasted service. This might be the first opportunity for you. A provider already anycasting a prefix might in fact have 250 unused anycast addresses. If your service is "interesting" enough for a provider, they may be willing to rent you hosting on one of those remaining addresses. One important caveat is that you would have to be hosted in the exact same locations as their primary anycast service. And an arrangement would likely be needed in which they move your service as they see fit, because it is their primary anycasted service that decides where hosting is needed.

Most of the above is assuming roughly a 1:1 correspondence between where the provider is hosting a service and where they are announcing the prefixes.

If the hosting provider has its own redundant backbone and its own data centers, then it could announce a prefix in a different set of locations from where it is hosting it. Moreover, internally it can route longer prefixes as unicast or anycast.

For example if the provider announces a /22 in four different POPs, and they have a redundant network between those (for example a ring of four links), they could internally route a /24 or /25 to each POP and maybe set aside a /28 to be anycasted to all POPs (which effectively means get serviced by the POP where the packets first enter their network).
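
A minimal sketch of such an internal plan, assuming a placeholder /22 from the benchmarking range 198.18.0.0/15, hypothetical POP names, and a /25 per POP with the /28 taken from an otherwise unused /25:

```python
import ipaddress

# Assumptions: placeholder block and POP names, not any real provider's plan.
BLOCK = ipaddress.ip_network("198.18.0.0/22")
POPS = ["pop-a", "pop-b", "pop-c", "pop-d"]

# Split the announced /22 into eight /25s and give the first four to the POPs.
slices = list(BLOCK.subnets(new_prefix=25))
internal_routes = {slices[i]: pop for i, pop in enumerate(POPS)}

# Set aside the first /28 of the last /25 to be anycasted to every POP.
anycast_prefix = next(slices[-1].subnets(new_prefix=28))
internal_routes[anycast_prefix] = "anycast (nearest POP)"

def internal_next_hop(address: str) -> str:
    """Longest-prefix match against the internal routes."""
    ip = ipaddress.ip_address(address)
    matches = [net for net in internal_routes if ip in net]
    if not matches:
        return "unassigned"
    best = max(matches, key=lambda net: net.prefixlen)
    return internal_routes[best]

print(internal_next_hop("198.18.0.10"))
print(internal_next_hop("198.18.3.130"))
```

Externally only the /22 is visible; the longer prefixes exist solely inside the provider's network, which is why they are not subject to the /24 filtering described earlier.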

If you can find a provider which has both the redundant backbone and the data centers, then it is much easier for that provider to anycast one of its own IP addresses for your service. However, keep in mind that in doing so your service consumes one CAM table entry in every one of their backbone routers, and you'd have to pay for that.

TCP and anycast

As some of the comments have pointed out, TCP is a stateful protocol. So even if you consider your web service to be stateless, it still has state at the TCP layer. The consequence is that with a naively anycasted TCP-based service, users will experience very frequent connection resets: whenever routing shifts a client's packets to a different node mid-connection, that node has no record of the connection and resets it.

That issue can be addressed by putting another layer in front of the actual web servers. What is needed is a layer of nodes that can forward received TCP packets to the proper web server and do so consistently across a connection. So far this pretty much describes a standard DSR (direct server return) based load balancer.

However since there are multiple instances of this load balancer, they need to share state. A distributed hash table is a data structure which could be used for this layer.
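
As an illustration of what "consistently across a connection" demands, here is a sketch of a different technique from the distributed hash table: rendezvous (highest-random-weight) hashing. Every load balancer hashes the connection 4-tuple the same way and therefore picks the same backend without exchanging state. The backend addresses and the function name are hypothetical, and the sketch assumes all load balancers agree on the backend list:

```python
import hashlib

# Hypothetical unicast backend addresses (IPv6 documentation range).
BACKENDS = ["2001:db8::10", "2001:db8::11", "2001:db8::12"]

def pick_backend(src_ip, src_port, dst_ip, dst_port, backends=BACKENDS):
    """Pick a backend for a TCP flow using rendezvous hashing on the 4-tuple."""
    flow = f"{src_ip}|{src_port}|{dst_ip}|{dst_port}"

    def weight(backend):
        digest = hashlib.sha256(f"{flow}|{backend}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    # The backend with the highest weight wins, independent of list order.
    return max(backends, key=weight)

# Any load balancer receiving a packet of this flow computes the same answer.
print(pick_backend("192.0.2.7", 51514, "198.51.100.1", 443))
```

The trade-off is that a purely hash-based choice moves some flows when the backend list changes, whereas shared per-connection state can keep established connections pinned; which matters more depends on how often backends come and go.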

Moreover packets from the load balancing layer need to be forwarded unmodified to the backend. IP routing based on the destination IP of the original packet won't solve that problem, because that destination address is still the anycasted address, so the packet would never make it to the backend but simply bounce back to the load balancer and loop until the TTL expired.

Typical load balancers address this by modifying the destination MAC address and forwarding the packet, thereby bypassing IP routing. This only works if your load balancer and backends are all located in a single location and the network between them is entirely switched, without any routers between load balancer and backends.

However, there is a different approach to solving that problem. Packets from the load balancer to the backend can be sent through an IP tunnel. The outer IP header carries a destination address which is a unicast address pointing to a backend. The inner IP header is unmodified and carries the client IP as source and the anycast IP as destination.
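
The header layout can be sketched as follows. This is only an illustration of the encapsulation, assuming IPv4-in-IPv4 (protocol number 4, RFC 2003) for brevity, documentation addresses, and hypothetical helper names; a real deployment would use the kernel's tunnel support rather than hand-built headers:

```python
import socket
import struct

IPPROTO_IPIP = 4  # IPv4-in-IPv4 encapsulation (RFC 2003)
IPPROTO_TCP = 6

def ipv4_checksum(header: bytes) -> int:
    """Ones'-complement sum over the 16-bit words of an IPv4 header."""
    total = sum(struct.unpack(f"!{len(header) // 2}H", header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_ipv4_header(src: str, dst: str, protocol: int, payload_len: int, ttl: int = 64) -> bytes:
    """Build a 20-byte IPv4 header without options (hypothetical helper)."""
    fields = [0x45, 0, 20 + payload_len, 0, 0, ttl, protocol, 0,
              socket.inet_aton(src), socket.inet_aton(dst)]
    header = struct.pack("!BBHHHBBH4s4s", *fields)
    fields[7] = ipv4_checksum(header)  # fill in the checksum field
    return struct.pack("!BBHHHBBH4s4s", *fields)

def encapsulate(inner_packet: bytes, lb_unicast: str, backend_unicast: str) -> bytes:
    """Prepend an outer header; the inner packet is carried unmodified."""
    outer = build_ipv4_header(lb_unicast, backend_unicast, IPPROTO_IPIP, len(inner_packet))
    return outer + inner_packet

# Inner packet: client -> anycast address (header only, for illustration).
inner = build_ipv4_header("203.0.113.9", "198.51.100.1", IPPROTO_TCP, 0)
# Outer header: load balancer unicast -> backend unicast.
packet = encapsulate(inner, "192.0.2.10", "192.0.2.20")
print(len(packet), socket.inet_ntoa(packet[16:20]), socket.inet_ntoa(packet[36:40]))
```

Routers between load balancer and backend only look at the outer destination, so the packet reaches the backend even though the inner destination is still the anycast address.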

In this setup the source IP of the outer header is mostly unused. In principle it is supposed to be a unicast address of the load balancer receiving the packet. However, some services (for example Facebook) copy the client IP from the inner header into the source IP of the outer header. This mistake on Facebook's part can be detected from the outside, because sometimes the tunneled packets trigger an ICMP error which is sent directly back to the client.

There is no need for the inner and outer headers to use the same IP version. So the unicast addresses that are needed for load balancers and backends can all be IPv6, such that the number of load balancers and backends is not limited by the availability of IPv4 addresses.

A design as sketched above has the added advantage that the load balancers typically account for only a minor part of the hardware in this setup, and it is only the load balancers that need to be reached through the anycast address. This means that it is less of a problem if your anycast address needs to be relocated at short notice due to piggybacking on an anycast prefix allocated primarily for a different service.

Pitfalls

Obviously the setup sketched above is more complicated than simply deploying a bunch of standalone web servers. Complicated setups tend to be a source of unavailability. So some amount of work will have to be put into such a scheme to make it robust enough to be more reliable than the alternative. This means this is more likely something that should be deployed as part of a CDN service rather than something deployed for an individual web service.

If you try to do anycast TCP with anything simpler than the setup described above, you may very well run into the problem with routes changing mid-connection, and as a result users will experience resets.

Anycast may do some good for availability, latency, and load balancing. However it is no silver bullet. Anycast does balance load, and you can scale with load by adding more nodes. But don't expect anywhere near perfectly balanced load across the nodes reached by anycast. In the setup described above with a distributed load balancing layer, the load balancers themselves may not get even load, but they could distribute load evenly across backends.

Don't rely on a single anycast IP for availability. If one of your nodes goes down, routing may not pick it up automatically. It does not affect all clients, but a subset of clients may have their packets routed to a node which is down. Hence for those clients, your anycast IP address is down. If you want redundancy, you need multiple anycast IP addresses.

Latency can be good as long as routes don't change in the middle of a connection. But as soon as the TCP handshake has completed, you are committed to using a specific backend for the duration of the TCP connection. Packets have to go from client to load balancer to backend and back to the client. This triangular routing can increase latency. There is a latency reduction from anycast and from being able to pick the closest backend, but having three legs on the round trip rather than just two can increase latency. Only collecting lots of real-world measurements will tell you which of the two factors weighs more.
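
The comparison comes down to simple arithmetic, sketched here with made-up one-way delays; the function names and all numbers are hypothetical, and only real measurements can supply meaningful values:

```python
def direct_rtt_ms(client_to_server_ms: float) -> float:
    """Round trip with two legs: client -> server -> client."""
    return 2 * client_to_server_ms

def triangular_rtt_ms(client_to_lb_ms: float, lb_to_backend_ms: float,
                      backend_to_client_ms: float) -> float:
    """Round trip with three legs: client -> load balancer -> backend -> client."""
    return client_to_lb_ms + lb_to_backend_ms + backend_to_client_ms

# Hypothetical one-way delays in milliseconds.
far_unicast = direct_rtt_ms(40.0)                   # one distant unicast server
nearby_anycast = triangular_rtt_ms(5.0, 10.0, 12.0)  # nearby LB and backend
detour_anycast = triangular_rtt_ms(5.0, 60.0, 45.0)  # nearby LB, distant backend

print(far_unicast, nearby_anycast, detour_anycast)
```

With these invented numbers the nearby anycast path wins and the detour loses against plain unicast, which is exactly why the outcome has to be measured rather than assumed.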

kasperd
  • Just making sure it's understood that only *most* AS filter on a /24. We're using a /29. – Sirex Dec 01 '14 at 23:17
  • @Sirex Is that /29 the only announcement covering those addresses? Or is that /29 part of a shorter prefix (like a /24 or /22) which is also announced? If the /29 is indeed reachable through two announcements of different length, what methodology do you use to know how many networks ignore the /29 and use only the shorter prefix? – kasperd Dec 01 '14 at 23:26
  • Putting some more thought into it, we do own a /19 and a /22 which are also in the BGP tables, as our ISP does our anycast as well, but only the /29 is anycasted. I guess if we hadn't had the larger blocks they'd have complained about table space; dunno though, it was set up before I got here. You may well be right though. – Sirex Dec 01 '14 at 23:53
  • @kasperd, So which 256 addresses does Google own for 8.8.8.8 and 8.8.4.4 anycast? – Pacerier Nov 03 '17 at 20:51
  • @kasperd, Is this answer still true for IPv6? – Pacerier Nov 03 '17 at 20:53
  • Re "distributed hash table", doesn't that table itself need to run on unicast IP? – Pacerier Nov 03 '17 at 23:11
  • Re "but a subset of clients may have their packets routed to a node which is down", seriously? Why would a standard-conforming proper router even broadcast broken routes? – Pacerier Nov 03 '17 at 23:17
  • Re "if you want redundancy, you need multiple anycast IP addresses", do you mean **even for DNS**? – Pacerier Nov 03 '17 at 23:19
  • Re "you need multiple anycast IP addresses", but the whole point of anycast in the first place is such that you manage only 1 IP address isn't it? So if you need to manage multiple IPs anyway, **what's the point of anycast**? – Pacerier Nov 03 '17 at 23:20
  • @Pacerier You can use the link I gave in the post and enter `8.8.8.8` and `8.8.4.4`. The answers you get will be `8.8.8.0/24` and `8.8.4.0/24`. If you read https://serverfault.com/q/49765/214507 you will be able to answer such questions without needing a service to look it up. – kasperd Nov 04 '17 at 00:10
  • @Pacerier The essence of this answer is completely IP agnostic. There is very little difference between the two in that respect. The most significant difference is that the prefix lengths in the examples would be different, for example everywhere I mentioned a `/24` I would have said `/48` if it was IPv6. – kasperd Nov 04 '17 at 00:11
  • @Pacerier Every one of the hosts in the setup still needs an individual unicast IP; that includes the ones storing the distributed hash table. Anycast is not a way to consume fewer IP addresses; a proper anycast setup will need a few more IP addresses than a unicast setup. – kasperd Nov 04 '17 at 00:18
  • @Pacerier Anycast is not about preserving IP addresses or making things easier to manage. Anycast is also only partially about scaling and redundancy. Anycast is primarily a tool to reduce latency in a service that is distributed worldwide and has users worldwide. And there is no standard that prevents hardware from breaking. Assuming that you can make a system that can automatically tell working hardware from broken hardware will lead to an unreliable system. Hardware will find a way to break just enough to cause unavailability but not enough for your detection to notice. – kasperd Nov 04 '17 at 00:23
  • 1
    @Pacerier A distributed hash table is not the only way to solve the problem. This paper has a different approach https://research.google.com/pubs/pub44824.html – kasperd Nov 04 '17 at 00:40
3

This realistic article might also help https://engineering.linkedin.com/network-performance/tcp-over-ip-anycast-pipe-dream-or-reality

LinkedIn used Real User Monitoring (RUM) to assess whether global anycast would perform better than regional anycast. In the end they implemented regional anycast, where a different anycast address is used for each region. They use a mix of DNS-based load balancing and regional anycast.

The solution mentioned above is a good one, as it provides some separation between the locations and the identities of the servers, but it is based on tunneling. I believe a much better approach would be to achieve the same separation without tunneling, though implementations of that are still quite limited. It is an area of active research; for example, traffic engineering through ILNP (Identifier-Locator Network Protocol) provides answers to these entangled issues.

  • Khawar, I see this is your first answer. Thanks for participating in the community, but we tend not to like link-only answers on SF: if the link rots or changes, the answer is useless. If you want to summarise the salient points of the linked article, though, you might have the makings of a good answer here. – MadHatter Sep 19 '15 at 15:44
  • Nice job, Khawar; +1 from me. – MadHatter Sep 19 '15 at 18:33
  • @Khawar, LinkedIn's test case is **completely erroneous**, including the article that they linked. See https://serverfault.com/questions/616412/how-does-anycast-work-with-tcp#comment1136470_616412 for more info. Those LinkedIn engineers are a bunch of noobs, don't trust anyone but Google. And don't trust CDNs because they are trying to sell you something, the more "anycast hype" to justify their higher prices the better. – Pacerier Nov 03 '17 at 23:25
1

You'll need to colo physical webserver hardware with a network provider that can do the anycast for you.

If you go this route, you'll probably also want to set up a tunnel to the management cards (DRAC etc.) on the machines so you don't need to visit them on-site.

We do this for our website.

Sirex