
TL;DR: Pod stuck in ContainerCreating, Docker containers show as paused, and /var/lib/kubelet/config.yaml is reported missing in the logs even though it exists and appears in the systemctl status kubelet output. This is on an AWS EC2 instance; details below.

I have a home-brewed Kubernetes cluster on an Amazon EC2 instance.

cat <<EOF | sudo tee /etc/yum.repos.d/kubernetes.repo 
[kubernetes] 
name=Kubernetes 
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-\$basearch
enabled=1 
gpgcheck=1 
repo_gpgcheck=1 
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
exclude=kubelet kubeadm kubectl 
EOF

sudo yum -y install docker iproute-tc kubelet kubectl kubeadm --disableexcludes=kubernetes

/usr/lib/sysctl.d/00-system.conf
net.bridge.bridge-nf-call-iptables = 1
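
For reference, that setting only takes effect once the br_netfilter module is loaded and sysctl has re-read its configuration files; a quick sanity check looks roughly like this:

# Load the bridge netfilter module and re-apply all sysctl configuration
sudo modprobe br_netfilter
sudo sysctl --system
# Should print: net.bridge.bridge-nf-call-iptables = 1
sysctl net.bridge.bridge-nf-call-iptables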

/usr/lib/systemd/system/docker.service
ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock $OPTIONS $DOCKER_STORAGE_OPTIONS $DOCKER_ADD_RUNTIMES --exec-opt native.cgroupdriver=systemd
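
(As an aside, the same cgroup-driver setting is often placed in /etc/docker/daemon.json rather than in the unit file; a minimal sketch of that alternative, assuming no other daemon.json options are in use, is below. Use one approach or the other, not both, or dockerd will refuse to start on the conflicting flag.)

# Alternative: set the cgroup driver via daemon.json instead of the ExecStart flag
sudo tee /etc/docker/daemon.json <<EOF
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
sudo systemctl daemon-reload
sudo systemctl restart docker
# Verify which driver is actually in use
docker info | grep -i cgroup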

Then docker, containerd, and kubelet were all enabled and started.
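
Concretely (the unit names are the ones those packages install, so treat them as an assumption if your layout differs):

sudo systemctl enable --now containerd
sudo systemctl enable --now docker
sudo systemctl enable --now kubelet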

Ran sudo kubeadm init
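
(For reference, flannel's default manifest assumes the 10.244.0.0/16 pod CIDR, so an init that matches it looks roughly like this:)

# Flannel's default manifest expects this pod network CIDR
sudo kubeadm init --pod-network-cidr=10.244.0.0/16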

At this stage the node is not ready.
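
(Checked roughly with the commands below; <node-name> is a placeholder:)

kubectl get nodes
kubectl describe node <node-name>   # the Ready condition usually carries a network-plugin-not-ready message at this point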

Next I ran:

kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/k8s-manifests/kube-flannel-rbac.yml

Flannel may be a red herring here, but it seemed to make progress: the node now shows Ready.

I created a deployment.yaml file based on this: https://v1-18.docs.kubernetes.io/docs/tasks/run-application/run-stateless-application-deployment/

kubectl apply -f deployment.yaml
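
(For reference, the deployment.yaml is essentially the nginx example from that page; the image tag and replica count below are reproduced from memory, so treat them as approximate:)

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 2
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80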

Tried Weave in a vain hope:

kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"

Allow pods to schedule on the master node: kubectl taint nodes --all node-role.kubernetes.io/master-

The pod shows as ContainerCreating. So I guess this is where it gets interesting as we now enter the realm of the logs.
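
(The pod-level view of the same problem, for anyone reproducing this; <pod-name> is a placeholder:)

kubectl get pods -o wide
kubectl describe pod <pod-name>   # the Events section should show the FailedCreatePodSandBox warnings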

kubectl get events --all-namespaces --sort-by='.metadata.creationTimestamp'

Too much to paste here, but lots of lines like Warning FailedCreatePodSandBox.

The command docker ps reports that all containers are paused.
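
(To be precise about "paused": the status column distinguishes a genuinely Paused container from a Kubernetes sandbox container that merely runs the pause image; this is just how I'd check:)

docker ps --format 'table {{.Names}}\t{{.Image}}\t{{.Status}}'
# A paused container shows "(Paused)" in Status; sandbox containers use the pause image but stay "Up"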

journalctl -u kubelet

The first error shows:

server.go:198] failed to load Kubelet config file /var/lib/kubelet/config.yaml, error failed to read kubelet config file "/var/lib/kubelet/config.yaml", error: open /var/lib/kubelet/config.yaml: no such file or directory

ls -alh shows the file exists and is 876B.
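
(For completeness, how I verified the file and how the kubelet is pointed at it; on a kubeadm install the path is normally passed via a systemd drop-in:)

ls -alh /var/lib/kubelet/config.yaml
sudo systemctl cat kubelet   # shows the kubeadm drop-in that passes --config=/var/lib/kubelet/config.yaml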

None of my research seems to address these underlying issues. Please help...

  • Are you using a specific tutorial? Can you describe the nodes and fetch logs after running kubeadm init? – Malgorzata Mar 15 '21 at 13:32
  • I am not using any specific tutorial. I can describe nodes and fetch logs; kubectl works fine. It just looks like there is something missing. I am guessing it has to do with networking, but I'm not sure what. I don't even know how to find the CNI version, if that's different from the kubelet version... – Stephen Feyrer Mar 15 '21 at 18:55

1 Answer


Your CNI version is set in the kube-flannel.yml file, in its ConfigMap:

cni-conf.json: |
    {
      "name": "cbr0",
      "cniVersion": "0.3.1",
      "plugins": [
        {
          "type": "flannel",
          "delegate": {
            "hairpinMode": true,
            "isDefaultGateway": true
          }
        },
        {
          "type": "portmap",
          "capabilities": {
            "portMappings": true
          }
        }
      ]
    }

The /etc/cni/net.d directory is the default location in which the kubelet and the CNI plugins look for network configurations.
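
For example, once the flannel DaemonSet has run on a node you would expect to see a configuration there (the file name below is the usual flannel one, but it can differ between versions):

ls /etc/cni/net.d
cat /etc/cni/net.d/10-flannel.conflist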

Check following steps:

  1. Make sure all of the kubeadm prerequisites are met.

  2. Make sure you don't block egress traffic and that an ingress rule for port 6443 is open to the worker nodes (relevant for the join phase); check your firewall rules (security groups on EC2).

  3. Check that the required ports are not already occupied (see the sketch after this list).

  4. Restart the kubelet with systemctl restart kubelet and then check the latest logs with:
    sudo journalctl -u kubelet -n 100 --no-pager

  5. Check whether your Docker version can be updated to a newer, more stable one.

  6. Try running kubeadm reset and re-run kubeadm init with the latest version, or pin a specific stable version by adding --kubernetes-version=X.Y.Z (sketch below).
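
A rough sketch of points 3 and 6 (the port list is the usual control-plane set, and the version number is only an example):

# 3. Check that the control-plane ports are free (6443, 2379-2380, 10250-10252)
sudo ss -tlnp | grep -E ':(6443|2379|2380|1025[0-2])\b'

# 6. Reset and re-initialise, pinning a version (example values only)
sudo kubeadm reset
sudo kubeadm init --kubernetes-version=v1.20.4 --pod-network-cidr=10.244.0.0/16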

Read more: kubelet-kubernetes-config-file.

Also, if you are using CRI-O, see the solution in kubeadm-crio.

Take a look at similar problems: vpc-cni-k8s, rbac-flannel-kubeadm.

– Malgorzata