
I am currently doing some research for my company concerning Kubernetes. We want to evaluate hosting our own Kubernetes cluster on our bare-metal servers inside our private computing centre. I have only ever worked with managed Kubernetes (like Google Cloud, etc.) and, of course, minikube for local testing.

Has anyone ever worked with a self-hosted, self-installed Kubernetes cluster and can give me some evaluation of the know-how and time needed to configure and administer such a cluster?

OHithere
  • You would need to provide more details. How big of a cluster? What is intended to run there? – Crou Jun 03 '20 at 16:53
  • Hi, the idea was to first create a small cluster (about 3 or 4 nodes) and then provide it to our developers. I can't define exactly what is meant to run there because we have several projects that could be deployed to Kubernetes. Later, we may want to create more clusters for more projects, and now I'm wondering whether administrating such systems (as I am the only admin) is too much work, especially considering my not-so-present knowledge of Kubernetes' inner workings. – OHithere Jun 08 '20 at 08:14

1 Answer


Has anyone ever worked with a self-hosted, self-installed Kubernetes cluster and can give me some evaluation of the know-how and time needed to configure and administer such a cluster?

I have run a non-trivial number of clusters, back before GKE and EKS were a thing, although I'm thankful I was doing so in an IaaS setup, so I didn't have to rack things, and if something went toes up I just asked the cloud provider to kill it. There are two separate parts to your question, and they involve distinct amounts of work: configuring and administering.

Configuring a cluster could happen in as little as 30 minutes after you have the machines in a shape where they will boot up and read their user-data (I presume even bare metal has a corresponding cloud-init scheme, even if with less emphasis on the "cloud" part), thanks to the utterly magical kubeadm and its friend etcdadm.
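
To give a flavour of that happy path, here's a minimal sketch of the kubeadm flow; the endpoint address, pod CIDR, and node names are placeholders, and the exact flags you want will depend on your CNI and HA choices:

```bash
# On the first control-plane node (addresses and CIDR are illustrative):
sudo kubeadm init \
  --control-plane-endpoint "10.0.0.10:6443" \
  --pod-network-cidr "10.244.0.0/16"

# Make kubectl work for your own user:
mkdir -p "$HOME/.kube"
sudo cp /etc/kubernetes/admin.conf "$HOME/.kube/config"
sudo chown "$(id -u):$(id -g)" "$HOME/.kube/config"

# Install the CNI plugin of your choice (Calico, Flannel, Cilium, ...),
# then print a join command and run it on each worker node:
kubeadm token create --print-join-command
```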

However, after kubernetes is up and running, that's when the real work starts -- often characterized as "day two" operations -- and it's a thick book of things that need monitoring and things that can go toes up.
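
As a small taste of what "day two" looks like in practice, these are the kinds of routine checks that book is built from (this assumes working kubectl access and, for the last line, that metrics-server is installed; it is nowhere near exhaustive):

```bash
kubectl get nodes -o wide          # any NotReady nodes? kubelet/kernel versions drifting?
kubectl get pods -n kube-system    # control plane, CNI, and DNS pods healthy?
kubectl get events -A --sort-by=.metadata.creationTimestamp | tail -n 20
kubectl top nodes                  # needs metrics-server; watch for memory pressure
```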

For absolute clarity, I don't mean to dissuade you: when the cluster(s) (is|are) in good shape, it's like magic and a startling number of things Just Work™. But, like many things magical, when they get angry, if you aren't already familiar with the warning signs or don't recognize the sound of their gunfire, it can make a frustrating situation even more frustrating.

I'm wondering whether administrating such systems (as I am the only admin) is too much work, especially considering my not-so-present knowledge of Kubernetes' inner workings

It's that last part that will be the killer hurdle to overcome, IMHO, since -- like many pieces of software -- once you understand how they're glued together, troubleshooting them is generally a tedious but tractable problem. However, managing kubernetes itself is only one part of the toolset required to keep a kubernetes cluster alive:

  • systemd (yup, still required unless you go with one of the really, really esoteric machine images that boot directly into kubelet and containerd)
    • how systemd's cgroup management and docker's or containerd's cgroup management can fight each other (a quick cgroup-driver check is shown after this list)
  • docker (or containerd)
    • including its auth system, if you have images that require credentials to pull (a pull-secret example follows this list)
  • CNI
    • and every nuance of the CNI provider you decide to go with, because inevitably they are going to puke and putting out that fire is some ... good fun
  • etcd(!!!!!!!!!!!!!!!!!!)
    • membership management
    • backups -- and they're only "backups" if they're tested, so some disaster recovery drills go a long way toward lowering the number of 3am pager calls (a snapshot sketch follows this list)
    • while it might not affect you if you have a small enough cluster, triaging etcd performance stalls is also a nightmare
  • the roles played by the parts of the control plane, in order to know whose logs to go spelunking through based on the observed behavior (the usual places to find those logs are shown after this list):
    • apiserver
    • controller-manager
    • scheduler
    • kube-proxy
    • kubelet
  • intimate knowledge of the major moving parts of all, and I do mean all, of the vanilla resource types: Node, Pod, Deployment, Service, ConfigMap, Secret (including the 4 major sub-types of Secret); StatefulSets are optional, but handy to understand why they exist
  • the RBAC subsystem: Role, RoleBinding, ClusterRole, ClusterRoleBinding, and how auth makes it from an HTTPS request down into the apiserver's handler to get translated into a Subject that can be evaluated against those policies -- which, like all good things related to "security", is its own bottomless well of standards to know and tools to troubleshoot (a minimal Role/RoleBinding example follows this list)
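
On the cgroups point above: the usual failure mode is the kubelet and the container runtime disagreeing on the cgroup driver. A quick way to check, assuming a kubeadm-managed kubelet config and containerd as the runtime (the paths are the kubeadm and containerd defaults):

```bash
# The kubelet's cgroup driver (systemd is the common recommendation on systemd hosts):
grep -i cgroupDriver /var/lib/kubelet/config.yaml
# expected: cgroupDriver: systemd

# containerd's runc runtime should agree; in /etc/containerd/config.toml the
# [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
# section should contain SystemdCgroup = true:
grep -n SystemdCgroup /etc/containerd/config.toml

# After changing either side, restart both:
sudo systemctl restart containerd kubelet
```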
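
For the private-registry credentials, the usual approach is a docker-registry Secret referenced from the pod spec; the registry URL, username, and secret name below are placeholders:

```bash
kubectl create secret docker-registry regcred \
  --docker-server=registry.example.com \
  --docker-username=deploy \
  --docker-password='s3cr3t'

# then reference it from the pod/deployment spec:
#   spec:
#     imagePullSecrets:
#     - name: regcred
```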
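
For the etcd backups, a snapshot plus a membership check is the bare minimum; this sketch assumes a kubeadm-style stacked etcd with the default certificate paths, so adjust if your layout differs:

```bash
sudo ETCDCTL_API=3 etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt \
  --key=/etc/kubernetes/pki/etcd/healthcheck-client.key \
  snapshot save "/var/backups/etcd-$(date +%F).db"

# and keep an eye on quorum/membership while you're there:
sudo ETCDCTL_API=3 etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt \
  --key=/etc/kubernetes/pki/etcd/healthcheck-client.key \
  member list
```

And again: it only counts as a backup once you've actually restored from it in a drill.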
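
As for whose logs to spelunk through: on a kubeadm cluster the control-plane components run as static pods in kube-system, while the kubelet is a systemd unit on each node; the node name below is a placeholder:

```bash
kubectl -n kube-system logs kube-apiserver-master-1
kubectl -n kube-system logs kube-controller-manager-master-1
kubectl -n kube-system logs kube-scheduler-master-1
kubectl -n kube-system logs -l k8s-app=kube-proxy --tail=50

# the kubelet itself lives in the journal on each node:
journalctl -u kubelet --since "1 hour ago"
```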
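
And for RBAC, a minimal namespaced Role/RoleBinding pair is a good way to get a feel for the moving parts; the namespace, group, and resource names here are purely illustrative:

```bash
cat <<'EOF' | kubectl apply -f -
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: team-a
  name: pod-reader
rules:
- apiGroups: [""]
  resources: ["pods", "pods/log"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: team-a
  name: dev-pod-reader
subjects:
- kind: Group
  name: dev
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
EOF

# sanity-check the policy from the admin's seat:
kubectl auth can-i list pods --namespace team-a --as=some-user --as-group=dev
```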

And while a cloud provider's load balancers won't be in the picture with a bare-metal setup, the cluster still has to interact with the outside world somehow -- usually via something like MetalLB plus the CNI's IPAM solution -- and troubleshooting those requires knowing what kubernetes expects them to do and then reconciling that with what they are actually doing (a sketch follows).
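
As a sketch of that MetalLB piece: at the time of writing it is configured with a ConfigMap like the one below (newer releases have moved to CRDs), and the namespace, pool name, and address range are placeholders you'd swap for a slice of your own L2 network:

```bash
cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
    - name: default
      protocol: layer2
      addresses:
      - 192.168.10.240-192.168.10.250
EOF

# any Service of type LoadBalancer should then get an EXTERNAL-IP from that pool:
kubectl create deployment hello --image=nginx
kubectl expose deployment hello --port=80 --type=LoadBalancer
kubectl get svc hello
```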


I personally have not yet taken the CKA, but it may behoove you to at least go through the curriculum to get a sense of what kinds of topics the CNCF considers essential knowledge. If you're not spooked, then, hey, maybe you can get your CKA out of this exercise, too :-)

Good hunting!

mdaniel
  • A few more things to add: logging (getting logs from pods is a PITA unless you have something shipping them to a centralised log server); user authentication (if you are going to have multiple clusters, and, for example, a development cluster, then your users need a way of authenticating); secret storage (HC Vault?); metrics (Prometheus & Grafana?); network policy; service mesh; how will you deploy things to your cluster (GitOps? package managers? Helm?)... The list goes on. Some of these decisions are present whether you run onsite or in the cloud, but you get the idea: there's a lot. – GeoSword Jun 09 '20 at 19:13
  • BTW, to answer your question: yes, it is viable, just harder than cloud IMO. You get the flexibility, but you need to implement pretty much everything yourself, and that's a lot of work. – GeoSword Jun 09 '20 at 19:21
  • Hi guys, first of all @mdaniel thank you very much for the detailed analysis :) Really appreciate it! I took all the points into consideration and handed them to my supervisors, and we decided against running our own Kubernetes at the moment. However, as I have a personal interest in this topic, I'll get into it quite a bit over the next months. So, once again, thank you for your answers and for sharing your own experiences!!! – OHithere Jun 10 '20 at 09:17