
I'm quite new to Kubernetes, even if it doesn't feel like it after I've spent dozens of hours trying to set up a working Kubernetes cluster.

The key parameters:

  • 1 master and 3 nodes
  • set up using kubeadm
  • kubernetes version 1.12.1, Calico 3.2
  • Primary IP addresses of the hosts are in 192.168.1.0/24 (relevant because this collides with Calico's default pod subnet; because of this I set --pod-network-cidr=10.10.0.0/16, see the init call below)
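
For reference, the kubeadm call therefore looked roughly like this (my reconstruction; only the --pod-network-cidr flag is taken from the text above):

kubeadm init --pod-network-cidr=10.10.0.0/16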

Installation using kubeadm init and joining the nodes worked so far. All pods are running; only coredns keeps crashing, but that is not relevant here.

Installation of Calico

Then I started installing Calico. The docs describe two alternatives, "Installing with the etcd datastore" and "Installing with the Kubernetes API datastore (50 nodes or less)", and I applied both:

# Variant 1: etcd datastore
kubectl apply -f https://docs.projectcalico.org/v3.2/getting-started/kubernetes/installation/rbac.yaml

curl https://docs.projectcalico.org/v3.2/getting-started/kubernetes/installation/hosted/calico.yaml -O
# modify calico.yaml  # Here, I feel a lack of documentation: Which etcd is needed? The one of kubernetes or a new one? See below
kubectl apply -f calico.yaml

# Variant 2: Kubernetes API datastore (50 nodes or less)
kubectl apply -f https://docs.projectcalico.org/v3.2/getting-started/kubernetes/installation/hosted/rbac-kdd.yaml

curl https://docs.projectcalico.org/v3.2/getting-started/kubernetes/installation/hosted/kubernetes-datastore/calico-networking/1.7/calico.yaml -O
# modify calico.yaml (here, I have to change the range of CALICO_IPV4POOL_CIDR)
sed -i 's/192.168.0.0/10.10.0.0/' calico.yaml
kubectl apply -f calico.yaml
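
To double-check the sed edit before applying, one can grep for the pool setting (a quick sanity check of mine, not from the docs):

grep -A1 CALICO_IPV4POOL_CIDR calico.yaml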

Test

Now, I use the following definition for testing:

apiVersion: v1
kind: Pod
metadata:
  name: www1
  labels:
    service: testwww
spec:
  containers:
  - name: meinserver
    image: erkules/nginxhostname
    ports:
    - containerPort: 80
---
apiVersion: v1
kind: Pod
metadata:
  name: www2
  labels:
    service: testwww
spec:
  containers:
  - name: meinserver
    image: erkules/nginxhostname
---
kind: Service
apiVersion: v1
metadata:
  name: www-np
spec:
  type: NodePort
  selector:
    service: testwww
  ports:
  - name: http1
    protocol: TCP
    nodePort: 30333
    port: 8080
    targetPort: 80
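
Assuming the manifest above is saved as testwww.yaml (the filename is mine), I apply it and check where the pods landed:

kubectl apply -f testwww.yaml
kubectl get pods -o wide -l service=testwww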

How I test:

curl http://192.168.1.211:30333  # master, no success
curl http://192.168.1.212:30333  # node, no success
curl http://192.168.1.213:30333  # node, only works 50%, with www1 (which is on this node)
curl http://192.168.1.214:30333  # node, only works 50%, with www2 (which is on this node)

The above commands only work if the (randomly chosen) pod is on the node that owns the specified IP address. I expected a 100% success rate on all nodes.
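
One thing worth ruling out (my suggestion, not from the original post): a service with externalTrafficPolicy: Local shows exactly this per-node behaviour by design, so it's worth confirming the service uses the default Cluster policy:

kubectl get svc www-np -o jsonpath='{.spec.externalTrafficPolicy}'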

I saw more success when using the etcd server of Kubernetes itself (pod/etcd-master1). In that case, all of the above commands worked. But pod/calico-kube-controllers didn't start, because it was running on a worker node and thus had no access to etcd.

In the getting started guide, I found an instruction to install an extra etcd:

kubectl apply -f https://docs.projectcalico.org/v3.2/getting-started/kubernetes/installation/hosted/etcd.yaml

It's weird: this line appears only in the "getting started" guide, but not under "installation". Yet the default calico.yaml already contains the correct clusterIP of exactly this etcd server (btw, how is this IP static? Is it generated by a hash?). Anyway: with this, all Calico nodes came up without an error, but I had the described behaviour where not all NodePorts were working. And I'm also concerned that etcd is open to everyone this way, which is not what I want.

So, these are the main questions:

  • What's the correct etcd server to use? A separate one or the one of Kubernetes?
    • If it should be the one of Kubernetes, why isn't pod/calico-kube-controllers configured by default to run on the master where it has access to etcd?
    • If I should run a separate etcd for Calico, why isn't it documented under "installation", and why do I have these NodePort problems?

Btw: I saw the answers recommending to change the iptables default FORWARD policy from DROP to ACCEPT. But this is an ugly hack and probably bypasses all of the security features of Calico.
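
For reference, the hack those answers suggest boils down to:

iptables -P FORWARD ACCEPT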

Requested details (Variant with extra etcd)

$ kubectl get all --all-namespaces=true -o wide; kubectl get nodes -o wide
NAMESPACE     NAME                                          READY   STATUS             RESTARTS   AGE   IP                NODE      NOMINATED NODE
default       pod/www1                                      1/1     Running            0          8s    192.168.104.9     node2     <none>
default       pod/www2                                      1/1     Running            0          8s    192.168.166.136   node1     <none>
kube-system   pod/calico-etcd-46g2q                         1/1     Running            0          22m   192.168.1.211     master1   <none>
kube-system   pod/calico-kube-controllers-f4dcbf48b-88795   1/1     Running            10         23h   192.168.1.212     node0     <none>
kube-system   pod/calico-node-956lj                         2/2     Running            6          21h   192.168.1.213     node1     <none>
kube-system   pod/calico-node-mhtvg                         2/2     Running            5          21h   192.168.1.211     master1   <none>
kube-system   pod/calico-node-s9njn                         2/2     Running            6          21h   192.168.1.214     node2     <none>
kube-system   pod/calico-node-wjqlk                         2/2     Running            6          21h   192.168.1.212     node0     <none>
kube-system   pod/coredns-576cbf47c7-4tcx6                  0/1     CrashLoopBackOff   15         24h   192.168.137.86    master1   <none>
kube-system   pod/coredns-576cbf47c7-hjpgv                  0/1     CrashLoopBackOff   15         24h   192.168.137.85    master1   <none>
kube-system   pod/etcd-master1                              1/1     Running            17         24h   192.168.1.211     master1   <none>
kube-system   pod/kube-apiserver-master1                    1/1     Running            2          24h   192.168.1.211     master1   <none>
kube-system   pod/kube-controller-manager-master1           1/1     Running            3          24h   192.168.1.211     master1   <none>
kube-system   pod/kube-proxy-22mb9                          1/1     Running            2          23h   192.168.1.212     node0     <none>
kube-system   pod/kube-proxy-96tn7                          1/1     Running            2          23h   192.168.1.213     node1     <none>
kube-system   pod/kube-proxy-vb4pq                          1/1     Running            2          24h   192.168.1.211     master1   <none>
kube-system   pod/kube-proxy-vq7qj                          1/1     Running            2          23h   192.168.1.214     node2     <none>
kube-system   pod/kube-scheduler-master1                    1/1     Running            2          24h   192.168.1.211     master1   <none>
kube-system   pod/kubernetes-dashboard-77fd78f978-h8czs     1/1     Running            2          23h   192.168.180.9     node0     <none>

NAMESPACE     NAME                           TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE   SELECTOR
default       service/kubernetes             ClusterIP   10.96.0.1        <none>        443/TCP          24h   <none>
default       service/www-np                 NodePort    10.99.149.53     <none>        8080:30333/TCP   8s    service=testwww
kube-system   service/calico-etcd            ClusterIP   10.96.232.136    <none>        6666/TCP         21h   k8s-app=calico-etcd
kube-system   service/calico-typha           ClusterIP   10.105.199.162   <none>        5473/TCP         23h   k8s-app=calico-typha
kube-system   service/kube-dns               ClusterIP   10.96.0.10       <none>        53/UDP,53/TCP    24h   k8s-app=kube-dns
kube-system   service/kubernetes-dashboard   ClusterIP   10.96.235.235    <none>        443/TCP          23h   k8s-app=kubernetes-dashboard

NAMESPACE     NAME                         DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR                     AGE   CONTAINERS                IMAGES                                                 SELECTOR
kube-system   daemonset.apps/calico-etcd   1         1         1       1            1           node-role.kubernetes.io/master=   21h   calico-etcd               quay.io/coreos/etcd:v3.3.9                             k8s-app=calico-etcd
kube-system   daemonset.apps/calico-node   4         4         4       4            4           beta.kubernetes.io/os=linux       23h   calico-node,install-cni   quay.io/calico/node:v3.2.3,quay.io/calico/cni:v3.2.3   k8s-app=calico-node
kube-system   daemonset.apps/kube-proxy    4         4         4       4            4           <none>                            24h   kube-proxy                k8s.gcr.io/kube-proxy:v1.12.1                          k8s-app=kube-proxy

NAMESPACE     NAME                                      DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE   CONTAINERS                IMAGES                                          SELECTOR
kube-system   deployment.apps/calico-kube-controllers   1         1         1            1           23h   calico-kube-controllers   quay.io/calico/kube-controllers:v3.2.3          k8s-app=calico-kube-controllers
kube-system   deployment.apps/calico-typha              0         0         0            0           23h   calico-typha              quay.io/calico/typha:v3.2.3                     k8s-app=calico-typha
kube-system   deployment.apps/coredns                   2         2         2            0           24h   coredns                   k8s.gcr.io/coredns:1.2.2                        k8s-app=kube-dns
kube-system   deployment.apps/kubernetes-dashboard      1         1         1            1           23h   kubernetes-dashboard      k8s.gcr.io/kubernetes-dashboard-amd64:v1.10.0   k8s-app=kubernetes-dashboard

NAMESPACE     NAME                                                DESIRED   CURRENT   READY   AGE   CONTAINERS                IMAGES                                          SELECTOR
kube-system   replicaset.apps/calico-kube-controllers-f4dcbf48b   1         1         1       23h   calico-kube-controllers   quay.io/calico/kube-controllers:v3.2.3          k8s-app=calico-kube-controllers,pod-template-hash=f4dcbf48b
kube-system   replicaset.apps/calico-typha-5f646c475c             0         0         0       23h   calico-typha              quay.io/calico/typha:v3.2.3                     k8s-app=calico-typha,pod-template-hash=5f646c475c
kube-system   replicaset.apps/coredns-576cbf47c7                  2         2         0       24h   coredns                   k8s.gcr.io/coredns:1.2.2                        k8s-app=kube-dns,pod-template-hash=576cbf47c7
kube-system   replicaset.apps/kubernetes-dashboard-77fd78f978     1         1         1       23h   kubernetes-dashboard      k8s.gcr.io/kubernetes-dashboard-amd64:v1.10.0   k8s-app=kubernetes-dashboard,pod-template-hash=77fd78f978

NAME      STATUS   ROLES    AGE   VERSION   INTERNAL-IP     EXTERNAL-IP   OS-IMAGE           KERNEL-VERSION      CONTAINER-RUNTIME
master1   Ready    master   24h   v1.12.0   192.168.1.211   <none>        Ubuntu 18.04 LTS   4.15.0-20-generic   docker://17.12.1-ce
node0     Ready    <none>   23h   v1.12.0   192.168.1.212   <none>        Ubuntu 18.04 LTS   4.15.0-20-generic   docker://17.12.1-ce
node1     Ready    <none>   23h   v1.12.0   192.168.1.213   <none>        Ubuntu 18.04 LTS   4.15.0-20-generic   docker://17.12.1-ce
node2     Ready    <none>   23h   v1.12.0   192.168.1.214   <none>        Ubuntu 18.04 LTS   4.15.0-20-generic   docker://17.12.1-ce

$ for i in $(seq 20); do timeout 1 curl -so/dev/null http://192.168.1.214:30333 && echo -n x || echo -n -  ;done
x---x-x-x--x-xx-x---

Requested details (Variant with existing etcd)

$ kubectl get all --all-namespaces=true -o wide; kubectl get nodes -o wide
NAMESPACE     NAME                                          READY   STATUS                       RESTARTS   AGE     IP              NODE      NOMINATED NODE
default       pod/www1                                      1/1     Running                      0          9m27s   10.10.2.3       node1     <none>
default       pod/www2                                      1/1     Running                      0          9m27s   10.10.3.3       node2     <none>
kube-system   pod/calico-kube-controllers-f4dcbf48b-qrqnc   0/1     CreateContainerConfigError   1          18m     192.168.1.212   node0     <none>
kube-system   pod/calico-node-j8cwr                         2/2     Running                      2          17m     192.168.1.212   node0     <none>
kube-system   pod/calico-node-qtq9m                         2/2     Running                      2          17m     192.168.1.214   node2     <none>
kube-system   pod/calico-node-qvf6w                         2/2     Running                      2          17m     192.168.1.211   master1   <none>
kube-system   pod/calico-node-rdt7k                         2/2     Running                      2          17m     192.168.1.213   node1     <none>
kube-system   pod/coredns-576cbf47c7-6l9wz                  1/1     Running                      2          21m     10.10.0.11      master1   <none>
kube-system   pod/coredns-576cbf47c7-86pxp                  1/1     Running                      2          21m     10.10.0.10      master1   <none>
kube-system   pod/etcd-master1                              1/1     Running                      19         20m     192.168.1.211   master1   <none>
kube-system   pod/kube-apiserver-master1                    1/1     Running                      2          20m     192.168.1.211   master1   <none>
kube-system   pod/kube-controller-manager-master1           1/1     Running                      1          20m     192.168.1.211   master1   <none>
kube-system   pod/kube-proxy-28qct                          1/1     Running                      1          20m     192.168.1.212   node0     <none>
kube-system   pod/kube-proxy-8ltpd                          1/1     Running                      1          21m     192.168.1.211   master1   <none>
kube-system   pod/kube-proxy-g9wmn                          1/1     Running                      1          20m     192.168.1.213   node1     <none>
kube-system   pod/kube-proxy-qlsxc                          1/1     Running                      1          20m     192.168.1.214   node2     <none>
kube-system   pod/kube-scheduler-master1                    1/1     Running                      5          19m     192.168.1.211   master1   <none>

NAMESPACE     NAME                   TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)          AGE     SELECTOR
default       service/kubernetes     ClusterIP   10.96.0.1      <none>        443/TCP          21m     <none>
default       service/www-np         NodePort    10.106.27.58   <none>        8080:30333/TCP   9m27s   service=testwww
kube-system   service/calico-typha   ClusterIP   10.99.14.62    <none>        5473/TCP         17m     k8s-app=calico-typha
kube-system   service/kube-dns       ClusterIP   10.96.0.10     <none>        53/UDP,53/TCP    21m     k8s-app=kube-dns

NAMESPACE     NAME                         DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR                 AGE   CONTAINERS                IMAGES                                                 SELECTOR
kube-system   daemonset.apps/calico-node   4         4         4       4            4           beta.kubernetes.io/os=linux   18m   calico-node,install-cni   quay.io/calico/node:v3.2.3,quay.io/calico/cni:v3.2.3   k8s-app=calico-node
kube-system   daemonset.apps/kube-proxy    4         4         4       4            4           <none>                        21m   kube-proxy                k8s.gcr.io/kube-proxy:v1.12.1                          k8s-app=kube-proxy

NAMESPACE     NAME                                      DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE   CONTAINERS                IMAGES                                   SELECTOR
kube-system   deployment.apps/calico-kube-controllers   1         1         1            0           18m   calico-kube-controllers   quay.io/calico/kube-controllers:v3.2.3   k8s-app=calico-kube-controllers
kube-system   deployment.apps/calico-typha              0         0         0            0           17m   calico-typha              quay.io/calico/typha:v3.2.3              k8s-app=calico-typha
kube-system   deployment.apps/coredns                   2         2         2            2           21m   coredns                   k8s.gcr.io/coredns:1.2.2                 k8s-app=kube-dns

NAMESPACE     NAME                                                DESIRED   CURRENT   READY   AGE   CONTAINERS                IMAGES                                   SELECTOR
kube-system   replicaset.apps/calico-kube-controllers-f4dcbf48b   1         1         0       18m   calico-kube-controllers   quay.io/calico/kube-controllers:v3.2.3   k8s-app=calico-kube-controllers,pod-template-hash=f4dcbf48b
kube-system   replicaset.apps/calico-typha-5f646c475c             0         0         0       17m   calico-typha              quay.io/calico/typha:v3.2.3              k8s-app=calico-typha,pod-template-hash=5f646c475c
kube-system   replicaset.apps/coredns-576cbf47c7                  2         2         2       21m   coredns                   k8s.gcr.io/coredns:1.2.2                 k8s-app=kube-dns,pod-template-hash=576cbf47c7

NAME      STATUS   ROLES    AGE   VERSION   INTERNAL-IP     EXTERNAL-IP   OS-IMAGE           KERNEL-VERSION      CONTAINER-RUNTIME
master1   Ready    master   21m   v1.12.0   192.168.1.211   <none>        Ubuntu 18.04 LTS   4.15.0-20-generic   docker://17.12.1-ce
node0     Ready    <none>   20m   v1.12.0   192.168.1.212   <none>        Ubuntu 18.04 LTS   4.15.0-20-generic   docker://17.12.1-ce
node1     Ready    <none>   20m   v1.12.0   192.168.1.213   <none>        Ubuntu 18.04 LTS   4.15.0-20-generic   docker://17.12.1-ce
node2     Ready    <none>   20m   v1.12.0   192.168.1.214   <none>        Ubuntu 18.04 LTS   4.15.0-20-generic   docker://17.12.1-ce

$ for i in $(seq 20); do timeout 1 curl -so/dev/null http://192.168.1.214:30333 && echo -n x || echo -n -  ;done
xxxxxxxxxxxxxxxxxxxx

Update: Variant with flannel

I just tried flannel: the result is, surprisingly, the same as with the extra etcd (pods only answer if they run on the queried node). This brings me to the question: is there anything wrong with my OS? Ubuntu 18.04 with the latest updates, installed using debootstrap. No firewall...
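
"No firewall" aside, one quick check in this direction (my suggestion): Docker 17.06+ sets the default policy of the FORWARD chain to DROP, which can produce exactly this only-local-pods-answer symptom. The current policy is visible with:

iptables -L FORWARD -n | head -1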

How I installed it:

kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

Result:

$ kubectl get all --all-namespaces=true -o wide; kubectl get nodes -o wide
NAMESPACE     NAME                                  READY   STATUS    RESTARTS   AGE     IP              NODE      NOMINATED NODE
default       pod/www1                              1/1     Running   0          3m40s   10.10.2.2       node1     <none>
default       pod/www2                              1/1     Running   0          3m40s   10.10.3.2       node2     <none>
kube-system   pod/coredns-576cbf47c7-64wxp          1/1     Running   3          21m     10.10.1.3       node0     <none>
kube-system   pod/coredns-576cbf47c7-7zvqs          1/1     Running   3          21m     10.10.1.2       node0     <none>
kube-system   pod/etcd-master1                      1/1     Running   0          21m     192.168.1.211   master1   <none>
kube-system   pod/kube-apiserver-master1            1/1     Running   0          20m     192.168.1.211   master1   <none>
kube-system   pod/kube-controller-manager-master1   1/1     Running   0          21m     192.168.1.211   master1   <none>
kube-system   pod/kube-flannel-ds-amd64-brnmq       1/1     Running   0          8m22s   192.168.1.214   node2     <none>
kube-system   pod/kube-flannel-ds-amd64-c6v67       1/1     Running   0          8m22s   192.168.1.213   node1     <none>
kube-system   pod/kube-flannel-ds-amd64-gchmv       1/1     Running   0          8m22s   192.168.1.211   master1   <none>
kube-system   pod/kube-flannel-ds-amd64-l9mpl       1/1     Running   0          8m22s   192.168.1.212   node0     <none>
kube-system   pod/kube-proxy-5pmtc                  1/1     Running   0          21m     192.168.1.213   node1     <none>
kube-system   pod/kube-proxy-7ctp5                  1/1     Running   0          21m     192.168.1.212   node0     <none>
kube-system   pod/kube-proxy-9zfhl                  1/1     Running   0          21m     192.168.1.214   node2     <none>
kube-system   pod/kube-proxy-hcs4g                  1/1     Running   0          21m     192.168.1.211   master1   <none>
kube-system   pod/kube-scheduler-master1            1/1     Running   0          20m     192.168.1.211   master1   <none>

NAMESPACE     NAME                 TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE     SELECTOR
default       service/kubernetes   ClusterIP   10.96.0.1        <none>        443/TCP          22m     <none>
default       service/www-np       NodePort    10.101.213.118   <none>        8080:30333/TCP   3m40s   service=testwww
kube-system   service/kube-dns     ClusterIP   10.96.0.10       <none>        53/UDP,53/TCP    22m     k8s-app=kube-dns

NAMESPACE     NAME                                     DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR                     AGE     CONTAINERS     IMAGES                                   SELECTOR
kube-system   daemonset.apps/kube-flannel-ds-amd64     4         4         4       4            4           beta.kubernetes.io/arch=amd64     8m22s   kube-flannel   quay.io/coreos/flannel:v0.10.0-amd64     app=flannel,tier=node
kube-system   daemonset.apps/kube-flannel-ds-arm       0         0         0       0            0           beta.kubernetes.io/arch=arm       8m22s   kube-flannel   quay.io/coreos/flannel:v0.10.0-arm       app=flannel,tier=node
kube-system   daemonset.apps/kube-flannel-ds-arm64     0         0         0       0            0           beta.kubernetes.io/arch=arm64     8m22s   kube-flannel   quay.io/coreos/flannel:v0.10.0-arm64     app=flannel,tier=node
kube-system   daemonset.apps/kube-flannel-ds-ppc64le   0         0         0       0            0           beta.kubernetes.io/arch=ppc64le   8m21s   kube-flannel   quay.io/coreos/flannel:v0.10.0-ppc64le   app=flannel,tier=node
kube-system   daemonset.apps/kube-flannel-ds-s390x     0         0         0       0            0           beta.kubernetes.io/arch=s390x     8m21s   kube-flannel   quay.io/coreos/flannel:v0.10.0-s390x     app=flannel,tier=node
kube-system   daemonset.apps/kube-proxy                4         4         4       4            4           <none>                            22m     kube-proxy     k8s.gcr.io/kube-proxy:v1.12.1            k8s-app=kube-proxy

NAMESPACE     NAME                      DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE   CONTAINERS   IMAGES                     SELECTOR
kube-system   deployment.apps/coredns   2         2         2            2           22m   coredns      k8s.gcr.io/coredns:1.2.2   k8s-app=kube-dns

NAMESPACE     NAME                                 DESIRED   CURRENT   READY   AGE   CONTAINERS   IMAGES                     SELECTOR
kube-system   replicaset.apps/coredns-576cbf47c7   2         2         2       21m   coredns      k8s.gcr.io/coredns:1.2.2   k8s-app=kube-dns,pod-template-hash=576cbf47c7
NAME      STATUS   ROLES    AGE   VERSION   INTERNAL-IP     EXTERNAL-IP   OS-IMAGE           KERNEL-VERSION      CONTAINER-RUNTIME
master1   Ready    master   22m   v1.12.1   192.168.1.211   <none>        Ubuntu 18.04 LTS   4.15.0-20-generic   docker://17.12.1-ce
node0     Ready    <none>   21m   v1.12.1   192.168.1.212   <none>        Ubuntu 18.04 LTS   4.15.0-20-generic   docker://17.12.1-ce
node1     Ready    <none>   21m   v1.12.1   192.168.1.213   <none>        Ubuntu 18.04 LTS   4.15.0-20-generic   docker://17.12.1-ce
node2     Ready    <none>   21m   v1.12.1   192.168.1.214   <none>        Ubuntu 18.04 LTS   4.15.0-20-generic   docker://17.12.1-ce

$ for i in $(seq 20); do timeout 1 curl -so/dev/null http://192.168.1.214:30333 && echo -n x || echo -n -  ;done
-x--xxxxx-x-x---xxxx
Daniel Alder
  • provide 2 outputs: 1) kubectl get all --all-namespaces=true -o wide; 2) kubectl get nodes -o wide – Vit Oct 08 '18 at 15:41

2 Answers


So far, I found 3 problems:

docker version

In my first tries, I used docker.io from the default Ubuntu repositories (17.12.1-ce). In the tutorial https://computingforgeeks.com/how-to-setup-3-node-kubernetes-cluster-on-ubuntu-18-04-with-weave-net-cni/, I discovered that they recommend something different:

apt-get --purge remove docker docker-engine docker.io
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
apt-get update
apt-get install docker-ce

This is now version 18.6.1, and it also no longer causes a warning in the kubeadm preflight check.

cleanup

I used kubeadm reset and deleted some directories when resetting my VMs to an unconfigured state. After reading some bug reports, I decided to extend the list of directories to remove. This is what I do now:

kubeadm reset
rm -rf /var/lib/cni/ /var/lib/calico/ /var/lib/kubelet/ /var/lib/etcd/ /etc/kubernetes/ /etc/cni/
reboot
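
Note that kubeadm reset does not clean up iptables or IPVS rules; a commonly used manual cleanup (my addition, adapt to your setup) is:

iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X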

Calico setup

With the above changes, I was immediately able to init a fully working setup (all pods "Running" and curl working). I used the "Variant with extra etcd".

All this worked until the first reboot. Then I again got the

calico-kube-controllers-f4dcbf48b-qrqnc CreateContainerConfigError

Digging into this problem showed me:

$ kubectl -n kube-system describe pod/calico-kube-controllers-f4dcbf48b-dp6n9
Events:
  Type     Reason            Age                     From               Message
  ----     ------            ----                    ----               -------
  Warning  Failed            4m32s (x10 over 9m)     kubelet, node1     Error: Couldn't find key etcd_endpoints in ConfigMap kube-system/calico-config

Then I realized that I had applied two installation instructions in sequence that were meant to be alternatives. This time I used only the Kubernetes API datastore variant:

kubectl apply -f https://docs.projectcalico.org/v3.2/getting-started/kubernetes/installation/hosted/rbac-kdd.yaml

curl https://docs.projectcalico.org/v3.2/getting-started/kubernetes/installation/hosted/kubernetes-datastore/calico-networking/1.7/calico.yaml -O

cp -p calico.yaml calico.yaml_orig
sed -i 's/192.168.0.0/10.10.0.0/' calico.yaml

kubectl apply -f calico.yaml
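
To verify that only the KDD variant is active, the calico-config ConfigMap should no longer contain an etcd_endpoints key (this check is mine, not part of the instructions):

kubectl -n kube-system get configmap calico-config -o yaml | grep etcd_endpoints  # should print nothing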

Result

$ kubectl get pod,svc,nodes --all-namespaces -owide

NAMESPACE     NAME                                        READY   STATUS    RESTARTS   AGE   IP              NODE      NOMINATED NODE
default       pod/www1                                    1/1     Running   2          71m   10.10.3.4       node1     <none>
default       pod/www2                                    1/1     Running   2          71m   10.10.4.4       node2     <none>
kube-system   pod/calico-node-45sjp                       2/2     Running   4          74m   192.168.1.213   node1     <none>
kube-system   pod/calico-node-bprml                       2/2     Running   4          74m   192.168.1.211   master1   <none>
kube-system   pod/calico-node-hqdsd                       2/2     Running   4          74m   192.168.1.212   master2   <none>
kube-system   pod/calico-node-p8fgq                       2/2     Running   4          74m   192.168.1.214   node2     <none>
kube-system   pod/coredns-576cbf47c7-f2l7l                1/1     Running   2          84m   10.10.2.7       master2   <none>
kube-system   pod/coredns-576cbf47c7-frq5x                1/1     Running   2          84m   10.10.2.6       master2   <none>
kube-system   pod/etcd-master1                            1/1     Running   2          83m   192.168.1.211   master1   <none>
kube-system   pod/kube-apiserver-master1                  1/1     Running   2          83m   192.168.1.211   master1   <none>
kube-system   pod/kube-controller-manager-master1         1/1     Running   2          83m   192.168.1.211   master1   <none>
kube-system   pod/kube-proxy-9jmsk                        1/1     Running   2          80m   192.168.1.213   node1     <none>
kube-system   pod/kube-proxy-gtzvz                        1/1     Running   2          80m   192.168.1.214   node2     <none>
kube-system   pod/kube-proxy-str87                        1/1     Running   2          84m   192.168.1.211   master1   <none>
kube-system   pod/kube-proxy-tps6d                        1/1     Running   2          80m   192.168.1.212   master2   <none>
kube-system   pod/kube-scheduler-master1                  1/1     Running   2          83m   192.168.1.211   master1   <none>
kube-system   pod/kubernetes-dashboard-77fd78f978-9vdqz   1/1     Running   0          24m   10.10.3.5       node1     <none>

NAMESPACE     NAME                           TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE   SELECTOR
default       service/kubernetes             ClusterIP   10.96.0.1        <none>        443/TCP          84m   <none>
default       service/www-np                 NodePort    10.107.205.119   <none>        8080:30333/TCP   71m   service=testwww
kube-system   service/calico-typha           ClusterIP   10.99.187.161    <none>        5473/TCP         74m   k8s-app=calico-typha
kube-system   service/kube-dns               ClusterIP   10.96.0.10       <none>        53/UDP,53/TCP    84m   k8s-app=kube-dns
kube-system   service/kubernetes-dashboard   ClusterIP   10.96.168.213    <none>        443/TCP          24m   k8s-app=kubernetes-dashboard

NAMESPACE   NAME           STATUS   ROLES    AGE   VERSION   INTERNAL-IP     EXTERNAL-IP   OS-IMAGE           KERNEL-VERSION      CONTAINER-RUNTIME
            node/master1   Ready    master   84m   v1.12.1   192.168.1.211   <none>        Ubuntu 18.04 LTS   4.15.0-20-generic   docker://18.6.1
            node/master2   Ready    <none>   80m   v1.12.1   192.168.1.212   <none>        Ubuntu 18.04 LTS   4.15.0-20-generic   docker://18.6.1
            node/node1     Ready    <none>   80m   v1.12.1   192.168.1.213   <none>        Ubuntu 18.04 LTS   4.15.0-20-generic   docker://18.6.1
            node/node2     Ready    <none>   80m   v1.12.1   192.168.1.214   <none>        Ubuntu 18.04 LTS   4.15.0-20-generic   docker://18.6.1


192.168.1.211 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
192.168.1.212 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
192.168.1.213 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
192.168.1.214 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Daniel Alder

Could it be that you did not install the kubernetes-cni package? If no network providers work, this is very likely. AFAIK it is also not mentioned in the docs that you need to do this.

This should also be visible in the kubelet service log.
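
For example, on a systemd-based host:

journalctl -u kubelet --no-pager | tail -n 50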

marenkay
  • Welcome to the club ;) - And thanks for responding - Where do I get this kubernetes-cni? I thought calico or flannel are the cni? If there's something more to install, where do I get information about it? Will check the log tomorrow – Daniel Alder Oct 12 '18 at 09:23
  • Short check: the kubernetes-cni package is installed; kubeadm depends on it. Both (and more) come from "deb http://apt.kubernetes.io/". And docker is the default Ubuntu package, version 17.12.1-ce btw – Daniel Alder Oct 12 '18 at 16:59