I ran into several problems while installing a multi-master k8s cluster with external etcd. I have done this twice before, at other sites, successfully, but this time I need help.

Calico was installed from the YAML recommended in the guide: https://docs.projectcalico.org/manifests/calico.yaml

First, there was a problem installing calico: calico-node could not reach the API server when apiServer.extraArgs.advertise-address was set in the config.

After that, calico-kube-controllers got stuck in the ContainerCreating state. I managed to fix it by using calico-etcd.yaml instead of calico.yaml. Now the calico pods are up and running, and calicoctl can see them in etcd.

But the coredns pods are stuck in ContainerCreating. These are the lines I see in describe pod:

  Warning  FailedScheduling        82s (x2 over 88s)  default-scheduler            0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate.
  Normal   Scheduled               80s                default-scheduler            Successfully assigned kube-system/coredns-6955765f44-clbhk to master01.<removed>
  Warning  FailedCreatePodSandBox  18s                kubelet, master01.<removed>  Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "9ab9fe3bd3d4e145c218fe59f6578169fa09075c59718fbe2f7033d207c4ea4c" network for pod "coredns-6955765f44-clbhk": networkPlugin cni failed to set up pod "coredns-6955765f44-clbhk_kube-system" network: unable to connect to Cilium daemon: failed to create cilium agent client after 30.000000 seconds timeout: Get http:///var/run/cilium/cilium.sock/v1/config: dial unix /var/run/cilium/cilium.sock: connect: no such file or directory
Is the agent running?
  Normal   SandboxChanged          17s                kubelet, master01.<removed>  Pod sandbox changed, it will be killed and re-created.

But I don't use cilium. I use calico. I did try cilium while debugging the first calico problem, but I removed it, rebuilt the cluster multiple times, and wiped the etcd data after every try.
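
A diagnostic sketch for this kind of situation: as far as I know, when the CNI config directory holds several config files, kubelet uses the one that sorts first by name. The function below prints that file. The name first_cni_config is hypothetical, and /etc/cni/net.d is assumed to be the config directory (the usual default); pass another path if kubelet is started with a different --cni-conf-dir.

```shell
# Sketch: print the CNI config file that sorts first by name in a directory.
# Assumption: /etc/cni/net.d is the CNI config directory unless one is passed.
first_cni_config() {
    conf_dir="${1:-/etc/cni/net.d}"
    # Nothing to report if the directory does not exist.
    [ -d "$conf_dir" ] || return 1
    # Consider only regular files with the usual CNI config extensions,
    # sort them bytewise, and keep the first one.
    find "$conf_dir" -mindepth 1 -maxdepth 1 -type f \
        \( -name '*.conf' -o -name '*.conflist' -o -name '*.json' \) -print \
        | LC_ALL=C sort | head -n 1
}
```

If the file it prints belongs to a plugin you no longer use, that would be consistent with an error like the one above.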

Here is the kubeadm config:

apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: "v1.17.2"
controlPlaneEndpoint: "" #balancer ip:port


controllerManager:
#  extraArgs:
#    node-monitor-period: "2s"
#    node-monitor-grace-period: "16s"
#    pod-eviction-timeout: "30s"
networking:
  dnsDomain: "cluster.local"
  podSubnet: ""
  serviceSubnet: ""
apiServer:
  timeoutForControlPlane: "60s"
#  extraArgs:
#    advertise-address: ""
#    bind-address: ""
#    secure-port: "6443"

Kubernetes 1.17.2, etcd 3.3.11, CentOS 7 x64

It feels like the problem is somewhere between the API server pod and etcd, but I can't locate it.

Paul K.

1 Answer


Oh, never mind. I found it.

There were cilium-cni and cilium-cni.old files in /opt/cni/bin/. These files were evidently installed with cilium, so they survived reinstalling the kubernetes-cni rpm. I don't know why, but k8s prefers cilium if it is available. Is it a bug? Should I report it?
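
For anyone hitting the same thing, here is a rough sketch for spotting such leftovers before rebuilding a node. The function name list_cni_leftovers is hypothetical, and the default paths are assumptions: /opt/cni/bin is where I found the files, and /etc/cni/net.d is the usual CNI config directory. It only lists matches and deletes nothing.

```shell
# Sketch: list files in the CNI plugin and config directories whose names
# mention a given plugin (for example "cilium").
# Assumption: /opt/cni/bin and /etc/cni/net.d are the directories in use,
# unless other paths are passed as the second and third arguments.
list_cni_leftovers() {
    plugin="$1"
    bin_dir="${2:-/opt/cni/bin}"
    conf_dir="${3:-/etc/cni/net.d}"
    for dir in "$bin_dir" "$conf_dir"; do
        # Skip directories that do not exist on this host.
        [ -d "$dir" ] || continue
        # Match only direct children, not the directory itself.
        find "$dir" -mindepth 1 -maxdepth 1 -name "*${plugin}*" -print
    done | LC_ALL=C sort
}
```

Running `list_cni_leftovers cilium` on a master should show whether anything from an earlier cilium install is still present; whatever it prints can then be reviewed and removed by hand.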

Paul K.