
A couple of friends and I are thinking of setting up a Kubernetes cluster where our home servers will act as the nodes.

As our nodes will be spread out between our apartments, I am worried this will create problems when it comes to:

  1. Exposing services outward, since the public IP of the nodes will be different.

  2. Network speed and latency between distributed storage services, since we rely on our ISP network connections to communicate between nodes.

We all have around 100 to 250 Mbit/s up/down, but handling large storage volumes might still become a problem here.

What challenges do we face when spreading our nodes like this?

Are there any specific functions, tutorials or guides I can read into to learn more about how to solve this?

Is our idea even viable at all?

I'm personally very new to Kubernetes, but am very excited about it.

I am thankful for any answers.

  • You could run into multiple issues when running a cluster which is "geolocated". First of all, it will depend heavily on the networking infrastructure between the "nodes". You will also need to consider what your setup would look like (master, worker). As you said that you are new to Kubernetes, I encourage you to read the official docs. They could shed some light on potential solutions: https://kubernetes.io/ . You could also take a look here: https://www.getambassador.io/learn/multi-cluster-kubernetes/ – Dawid Kruk Sep 28 '20 at 16:26
  • Can you add any examples of what you want to run? Are the masters/etcd planned to be on the home nodes? Storage could get a bit finicky hosted over higher latency, unreliable links. – Matt Sep 30 '20 at 23:41
  • The best of worlds would be if we could run everything in a distributed manner. Storage and etcd included. – Hannes Knutsson Oct 04 '20 at 19:26
  • I assume you mean [geographically](https://en.wiktionary.org/wiki/geography), not [geologically](https://en.wiktionary.org/wiki/geology#English). – Flimzy Oct 05 '20 at 10:20
  • @Hannes Knutsson did the provided post answer your questions? – PjoterS Mar 01 '21 at 08:46

1 Answer

Posting this answer as a community wiki, since the topic of the question is broad without exact specifications and won't have a single definitive answer.

First of all, you'll need to know the exact requirements for your cluster, including what kind of control plane you would like to build. There are multiple options (single master, multiple masters). You can refer to the official documentation:

If you would like to solve a particular task, you'd be better off including it in the question, as the potential issues could differ between workloads.

To begin with, you could look at the kubeadm requirements for Kubernetes clusters:


Some of the potential issues you can encounter:

  • Controllers and services are designed to spread traffic evenly between pods. You can run into trouble with network delays when traffic jumps between nodes at different sites.

  • Kubernetes components have predefined timeouts for their operations. With distributed clusters you'll need to tweak them heavily to get predictable behavior without timeouts during cluster operations.

  • Next potential issue: NAT. All Kubernetes nodes are supposed to connect to other nodes without any address translation. You may end up needing to build a reliable VPN connection between sites.

  • You could run into issues with etcd when building a highly available Kubernetes cluster, specifically with synchronization between the etcd members. etcd needs a majority of its members to be reachable (quorum), so a lost connection between sites could cost the cluster its quorum. Without quorum, no changes to the cluster are possible, and that may break the cluster.
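
To make the quorum point concrete, here is a minimal sketch of the majority arithmetic. The function names are made up for illustration and are not part of any etcd tooling; the only fact assumed is that a quorum is a strict majority of the voting members.

```python
def quorum(members: int) -> int:
    """Smallest strict majority for a cluster of `members` voting nodes."""
    if members < 1:
        raise ValueError("a cluster needs at least one member")
    return members // 2 + 1


def tolerated_failures(members: int) -> int:
    """How many members can be unreachable while a majority still exists."""
    return members - quorum(members)


if __name__ == "__main__":
    # Illustrative cluster sizes only; pick whatever matches your setup.
    for n in (1, 2, 3, 4, 5):
        print(f"{n} members: quorum={quorum(n)}, "
              f"can lose {tolerated_failures(n)}")
```

With one member per apartment, two sites tolerate no outage at all and three sites tolerate a single one, so every home connection becomes part of the control plane's availability.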


As for storage, I reckon there could be issues with data replication between sites (latency between sites, the amount of data sent between them, the type of storage solution used, and ensuring that all of the pods have access to the same data at the same time). You can look into the official documentation for storage concepts in Kubernetes:
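
For the "amount of data" part, a back-of-envelope estimate can help before choosing a storage solution. The sketch below is only that: the link speeds are the 100 to 250 Mbit/s range from the question, while the 500 GB volume size and the 0.8 efficiency factor (protocol and VPN overhead) are assumptions chosen for illustration.

```python
def transfer_hours(size_gb: float, link_mbit_s: float,
                   efficiency: float = 0.8) -> float:
    """Hours needed to copy `size_gb` (decimal GB) over one link.

    `efficiency` is an assumed fraction of the nominal link speed that
    is actually usable; measure your own links instead of trusting it.
    """
    if size_gb < 0 or link_mbit_s <= 0 or not 0 < efficiency <= 1:
        raise ValueError("size must be >= 0, speed > 0, 0 < efficiency <= 1")
    bits = size_gb * 8e9                        # 1 GB = 8 * 10^9 bits
    usable_bits_per_s = link_mbit_s * 1e6 * efficiency
    return bits / usable_bits_per_s / 3600


if __name__ == "__main__":
    volume_gb = 500  # hypothetical volume that must be re-replicated
    for speed in (100, 250):
        hours = transfer_hours(volume_gb, speed)
        print(f"{volume_gb} GB at {speed} Mbit/s: about {hours:.1f} h")
```

A full resync of a replica after a node outage would saturate a home uplink for that whole time, which is worth weighing against how often the nodes or links are expected to drop.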

Dawid Kruk