This is a follow-up to a prior question I asked, but with a different ask/approach. In case it matters, I'm on GKE, but I'm hoping there's a cloud-agnostic answer.

I'm trying to run the container factoriotools/factorio, but the application has some particular requirements due to the use of an application-specific public server listing function, as seen in many video games that use user-hosted servers.

So far, I've gotten everything working with host networking, and direct connections working through a NodePort Service. However, the application's public server listing function remains an issue for unprivileged containers.

Here's how Factorio figures out how to manage the public server listing:

  • The container listens on one socket, say UDP Port 34197.
  • The NodePort Service routes that traffic to 20635 publicly.
  • The container sends a ping over its listening port (34197) to a ping-pong server, and the ping-pong server replies with the IP address and port it received the ping from. Outside of k8s, this would be port 34197 still.
  • The container then uses this information to register with the server listing. IP Address, port, server name, and some other information.
  • In GKE, the ping-pong gets routed to an arbitrary unused port (say, 40792).
  • The container believes it is listening on a port other than the one I set up (20635), and then registers with the public server listing using the wrong port (40792, because the ping-pong server said so).
  • Any attempt to connect from the public server listing then fails; the client believes that the server is listening on the port the ping-pong server witnessed (40792), but the container has been interacting with its listening port (34197) the entire time.

(I've been told this process is a variation of how ICE/STUN work)
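
To make the ping-pong step concrete, here is a minimal local sketch of the idea, not Factorio's actual protocol. The names `reflect_once`, `ask_reflector`, and `demo` and the `ip:port` reply format are made up for illustration. A "reflector" replies with the source address it saw, and the "game" socket compares that with the port it is bound to. On loopback there is no NAT in between, so the two match; behind a NodePort Service the reported port would be whatever the outbound translation picked.

```python
import socket
import threading


def reflect_once(sock: socket.socket) -> None:
    """Answer one datagram with the sender's address as seen by the reflector."""
    _, addr = sock.recvfrom(1024)
    sock.sendto(f"{addr[0]}:{addr[1]}".encode(), addr)


def ask_reflector(game_sock: socket.socket, reflector_addr) -> tuple:
    """Ping the reflector from the listening socket and parse the reported (ip, port)."""
    game_sock.sendto(b"ping", reflector_addr)
    reply, _ = game_sock.recvfrom(1024)
    ip, port = reply.decode().rsplit(":", 1)
    return ip, int(port)


def demo() -> tuple:
    """Return (port the game socket is bound to, port the reflector observed)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as reflector, \
            socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as game:
        # Port 0 asks the OS for any free port; a real server would use a fixed one.
        reflector.bind(("127.0.0.1", 0))
        game.bind(("127.0.0.1", 0))
        reflector.settimeout(2.0)
        game.settimeout(2.0)

        worker = threading.Thread(target=reflect_once, args=(reflector,))
        worker.start()
        _, observed_port = ask_reflector(game, reflector.getsockname())
        worker.join()
        return game.getsockname()[1], observed_port


if __name__ == "__main__":
    local_port, observed_port = demo()
    print(f"bound to {local_port}, reflector saw {observed_port}")
```

The server listing only works when those two numbers agree from the outside world's point of view, which is exactly what breaks once the outbound ping leaves through a different port than the one inbound traffic arrives on.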

So that means that if the container listens on 34197, but Kubernetes routes that to 20635 externally, both inbound and outbound traffic need to go through 20635 on the public side in order for the application's built-in server listing function to work.

If I bypass the public server listing, and connect directly to the container's node's public IP address with port 20635, it works flawlessly. But that's a pretty massive compromise for what I'm doing.

Host networking bypasses this entire issue by allowing the container to directly open whatever port it wants to on the host. For hosts that are already publicly exposed, this means nothing on the host (especially not k8s) can re-route the container traffic through extra layers and change the port numbers. So when the container opens port 34197, it gets port 34197. When it sends on port 34197, that traffic is sent on port 34197. The ping-pong server sees the port it's supposed to see. And because it's a public UDP port, it doesn't matter who sent the traffic first; traffic is traffic, the port is the port.

However, if I understand the docs correctly, running a container on the host's network stack requires a privileged container, which is effectively root access on the host, which is Very Bad In Production. So, for unprivileged containers, there needs to be a solution other than relying on host networking. I cannot find that "other solution". I cannot find documentation anywhere about how to do this, or even evidence that anyone is thinking about it. How do I make this work?

xenrelay
  • In general networking, it is not recommended to use the same port for inbound and outbound traffic. When a connection is initiated, the source port is allocated dynamically and is used to track the connection and its status. You mentioned that "the public server listing remains an issue for unprivileged containers"; what does this mean? Does "public server listing" mean a DNS listing? Does it work for a privileged container? Can you elaborate on the setup? – Anant Swaraj Jun 22 '21 at 09:04
  • Question is edited, hope it helps. Further detail that I'm not sure should be in the question: DNS is unrelated. The public server listing I'm talking about is closer to a database with a custom UI. Some centralized server accepts registrations from actual game servers, and the actual game servers have to keep a connection open. The centralized server then facilitates connections from client to actual game servers. All of this is done with custom code specific to Factorio. Important note: Factorio wasn't designed to be containerized. – xenrelay Jun 23 '21 at 02:47
  • Did you ever get a good solution for this? I am currently trying to solve this for my helm chart. – James Rhoat Jan 14 '22 at 00:51
  • Not really. Without the server authentication piece in place, I was forced to postpone this entire setup. I also don't remember most of the experimentation I was doing. I do remember that as far as I could tell, whatever magic Agones was doing did the trick. It's just a shame I can't afford the load balancers it uses. Looks like Traefik supports UDP routing these days, which might not have been the case 6 months ago, so that's a thing I would check if I was working on this now. – xenrelay Jan 15 '22 at 13:47

1 Answer

  • I would suggest using Google Cloud Game Servers. It is a managed service for dedicated game servers that handles your global game server infrastructure for you, so you don't have to run that layer yourself.

  • It manages game server clusters using Kubernetes for container orchestration and Agones for game server fleet orchestration and lifecycle management. Agones is an open-source project for hosting and scaling dedicated game servers, built on top of Kubernetes and flexible enough to be tailored to the needs of your multiplayer game.

  • While creating a game server using Agones, you have to add some firewall rules to open the UDP ports required to connect to the cluster.

  • To create a game server using Agones, you can follow the Quickstart, which guides you through creating a GameServer in Kubernetes using the Agones custom resource.

  • To fetch the GameServer status, use the command below. It shows the lifecycle events of the GameServer.

                watch kubectl describe gameserver
    
  • To set up a GameServer in GCP with Agones, you can check the official Google documentation.

  • Interesting, and workable. Not my game, not my source code, but the fact that they have a sidecar-based integration approach is good. Would be nice if there was something simpler I could do (service mesh, networking ACLs, ....), but if I don't hear any further ideas, I'll give this a try. – xenrelay Jun 25 '21 at 03:57
  • I'm not actually using this right now because there were some issues getting it hooked up and working in the broader system, but I'm also skipping this problem entirely for now, since there's other issues preventing the public server listing from working. I'll probably get back to this some day. – xenrelay Jul 05 '21 at 21:06
  • Unaccepted because under the covers, Agones uses 3 load balancers, which is unacceptably expensive at my scale. – xenrelay Jul 08 '21 at 23:54