
I'm running into a really weird situation where kube-proxy doesn't accept connections on some nodes, and frankly I'm stumped.

Background:

  • the Kubernetes cluster is set up on AWS EC2 using kops
  • the cluster is in eu-central-1, spread over three availability zones (a, b, c)
  • the Kubernetes version is v1.14.6, but the same behaviour has also been observed with v1.13.x
  • there doesn't seem to be anything erroneous in the kube-proxy logs; in fact the output looks practically identical on healthy and unhealthy nodes
  • kube-proxy does listen on the given port, but requests time out
  • the issue seems to be related only to the ingress service, the internal load balancer and the HTTPS (443) port

Example below. Both nodes are in the same availability zone.

Healthy node:

curl --connect-timeout 5 -sD - -k https://localhost:32028/healthz

HTTP/2 200
date: Sun, 13 Oct 2019 13:51:32 GMT
content-type: text/html
content-length: 0
sudo netstat -taupen | grep kube-proxy | grep LISTEN                                                                                       
tcp        0      0 127.0.0.1:10249         0.0.0.0:*               LISTEN      0          21284      3162/kube-proxy
tcp6       0      0 :::32618                :::*                    LISTEN      0          23820      3162/kube-proxy
tcp6       0      0 :::32012                :::*                    LISTEN      0          21359      3162/kube-proxy
tcp6       0      0 :::10256                :::*                    LISTEN      0          21280      3162/kube-proxy
tcp6       0      0 :::30259                :::*                    LISTEN      0          21358      3162/kube-proxy
tcp6       0      0 :::30844                :::*                    LISTEN      0          21361      3162/kube-proxy
tcp6       0      0 :::32028                :::*                    LISTEN      0          21360      3162/kube-proxy
tcp6       0      0 :::30048                :::*                    LISTEN      0          21357      3162/kube-proxy

Unhealthy node:

curl -v --connect-timeout 5 -sD - -k https://localhost:32028/healthz

*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connection timed out after 5001 milliseconds
* Curl_http_done: called premature == 1
* stopped the pause stream!
* Closing connection 0
sudo netstat -taupen | grep kube-proxy | grep LISTEN
tcp        0      0 127.0.0.1:10249         0.0.0.0:*               LISTEN      0          23611      2726/kube-proxy
tcp6       0      0 :::32618                :::*                    LISTEN      0          22662      2726/kube-proxy
tcp6       0      0 :::32012                :::*                    LISTEN      0          22654      2726/kube-proxy
tcp6       0      0 :::10256                :::*                    LISTEN      0          21872      2726/kube-proxy
tcp6       0      0 :::30259                :::*                    LISTEN      0          22653      2726/kube-proxy
tcp6       0      0 :::32028                :::*                    LISTEN      0          22656      2726/kube-proxy
tcp6       0      0 :::30844                :::*                    LISTEN      0          22655      2726/kube-proxy
tcp6       0      0 :::30048                :::*                    LISTEN      0          22652      2726/kube-proxy
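
kube-proxy is clearly listening on 32028 on both nodes, so the difference presumably lies in the NAT rules it programs rather than in the listener itself. A comparison one could run on a healthy and an unhealthy node (a sketch, assuming kube-proxy is running in its default iptables mode):

sudo iptables-save -t nat | grep 32028

On a working node this should show the KUBE-NODEPORTS rule for port 32028 jumping into the service's chain; comparing the output between the two nodes should reveal whether kube-proxy programs the rules differently on them.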

Environment:

kubectl get services nginx-ingress
NAME            TYPE           CLUSTER-IP      EXTERNAL-IP                                                                        PORT(S)                                   AGE
nginx-ingress   LoadBalancer   100.67.138.99   xxxx-yyyy.elb.eu-central-1.amazonaws.com   80:30259/TCP,443:32028/TCP,22:32012/TCP   47h


kubectl version
Client Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.1", GitCommit:"d647ddbd755faf07169599a625faf302ffc34458", GitTreeState:"clean", BuildDate:"2019-10-02T23:49:07Z", GoVersion:"go1.12.9", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.6", GitCommit:"96fac5cd13a5dc064f7d9f4f23030a6aeface6cc", GitTreeState:"clean", BuildDate:"2019-08-19T11:05:16Z", GoVersion:"go1.12.9", Compiler:"gc", Platform:"linux/amd64"}

kubectl get deployment nginx-ingress -oyaml:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  annotations:
    checksum/config: 0f815cbf49129a18dacd05c1f35c0e2c0a36d0ad4f8f0272828a558c49c40aed
    configmap.reloader.stakater.com/reload: ingress-nginx,tcp-services,udp-services
    deployment.kubernetes.io/revision: "5"
  creationTimestamp: "2019-10-11T14:16:23Z"
  generation: 18
  labels:
    app: nginx-ingress
    chart: nginx-ingress-0.26.1
    heritage: Tiller
    k8s-addon: ingress-nginx.addons.k8s.io
    k8s-app: nginx-ingress-controller
    release: nginx-ingress-private
  name: nginx-ingress
  namespace: backend
  resourceVersion: "85333311"
  selfLink: /apis/extensions/v1beta1/namespaces/backend/deployments/nginx-ingress
  uid: b4a6f226-ec31-11e9-bd40-066623cdec10
spec:
  progressDeadlineSeconds: 600
  replicas: 3
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: nginx-ingress
      release: nginx-ingress-private
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      annotations:
        prometheus.io/port: "10254"
        prometheus.io/scrape: "true"
      creationTimestamp: null
      labels:
        app: nginx-ingress
        k8s-addon: ingress-nginx.addons.k8s.io
        k8s-app: nginx-ingress-controller
        release: nginx-ingress-private
    spec:
      containers:
      - args:
        - /nginx-ingress-controller
        - --logtostderr=true
        - --stderrthreshold=0
        - --http-port=80
        - --https-port=443
        - --healthz-port=10254
        - --default-backend-service=$(POD_NAMESPACE)/nginx-ingress-default
        - --configmap=$(POD_NAMESPACE)/ingress-nginx
        - --tcp-services-configmap=$(POD_NAMESPACE)/tcp-services
        - --udp-services-configmap=$(POD_NAMESPACE)/udp-services
        - --publish-service=$(POD_NAMESPACE)/nginx-ingress
        - --default-ssl-certificate=$(POD_NAMESPACE)/certs-tls
        - --ingress-class=private
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
        - name: STAKATER_TCP_SERVICES_CONFIGMAP
          value: 060d83d51ec7d01fe8af12484189acf792044690
        - name: STAKATER_UDP_SERVICES_CONFIGMAP
          value: da39a3ee5e6b4b0d3255bfef95601890afd80709
        - name: STAKATER_INGRESS_NGINX_CONFIGMAP
          value: 6a393dab54c63f4117785f635b7c00c64e140853
        image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.26.1
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /healthz
            port: 10254
            scheme: HTTP
          initialDelaySeconds: 30
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 5
        name: nginx-ingress
        ports:
        - containerPort: 80
          name: http
          protocol: TCP
        - containerPort: 443
          name: https
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /healthz
            port: 10254
            scheme: HTTP
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        resources:
          limits:
            cpu: 250m
            memory: 512Mi
          requests:
            cpu: 10m
            memory: 50Mi
        securityContext:
          capabilities:
            add:
            - NET_BIND_SERVICE
            drop:
            - ALL
          runAsUser: 33
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: nginx-ingress
      serviceAccountName: nginx-ingress
      terminationGracePeriodSeconds: 60
status:
  availableReplicas: 3
  conditions:
  - lastTransitionTime: "2019-10-11T14:16:24Z"
    lastUpdateTime: "2019-10-11T21:55:50Z"
    message: ReplicaSet "nginx-ingress-6d994d7b96" has successfully progressed.
    reason: NewReplicaSetAvailable
    status: "True"
    type: Progressing
  - lastTransitionTime: "2019-10-13T11:05:46Z"
    lastUpdateTime: "2019-10-13T11:05:46Z"
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: Available
  observedGeneration: 18
  readyReplicas: 3
  replicas: 3
  updatedReplicas: 3

/etc/kubernetes/manifests/kube-proxy.manifest:

apiVersion: v1
kind: Pod
metadata:
  annotations:
    scheduler.alpha.kubernetes.io/critical-pod: ""
  creationTimestamp: null
  labels:
    k8s-app: kube-proxy
    tier: node
  name: kube-proxy
  namespace: kube-system
spec:
  containers:
  - command:
    - /bin/sh
    - -c
    - mkfifo /tmp/pipe; (tee -a /var/log/kube-proxy.log < /tmp/pipe & ) ; exec /usr/local/bin/kube-proxy
      --cluster-cidr=100.96.0.0/11 --conntrack-max-per-core=131072 --hostname-override=ip-aaa-bbb-ccc-ddd.eu-central-1.compute.internal
      --kubeconfig=/var/lib/kube-proxy/kubeconfig --master=https://api.xxx.yyy.zzz
      --oom-score-adj=-998 --resource-container="" --v=2 > /tmp/pipe 2>&1
    image: k8s.gcr.io/kube-proxy:v1.14.6
    name: kube-proxy
    resources:
      requests:
        cpu: 100m
    securityContext:
      privileged: true
    volumeMounts:
    - mountPath: /var/lib/kube-proxy/kubeconfig
      name: kubeconfig
      readOnly: true
    - mountPath: /var/log/kube-proxy.log
      name: logfile
    - mountPath: /lib/modules
      name: modules
      readOnly: true
    - mountPath: /etc/ssl/certs
      name: ssl-certs-hosts
      readOnly: true
    - mountPath: /run/xtables.lock
      name: iptableslock
  hostNetwork: true
  priorityClassName: system-node-critical
  tolerations:
  - key: CriticalAddonsOnly
    operator: Exists
  volumes:
  - hostPath:
      path: /var/lib/kube-proxy/kubeconfig
    name: kubeconfig
  - hostPath:
      path: /var/log/kube-proxy.log
    name: logfile
  - hostPath:
      path: /lib/modules
    name: modules
  - hostPath:
      path: /usr/share/ca-certificates
    name: ssl-certs-hosts
  - hostPath:
      path: /run/xtables.lock
      type: FileOrCreate
    name: iptableslock
status: {}

kubectl get service nginx-ingress -oyaml:

apiVersion: v1
kind: Service
metadata:
  annotations:
    dns.alpha.kubernetes.io/internal: private.xxx.yyy.zzz
    external-dns.alpha.kubernetes.io/hostname: private.xxx.yyy.zzz
    kubernetes.io/ingress.class: private
    service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "true"
    service.beta.kubernetes.io/aws-load-balancer-internal: 0.0.0.0/0
    service.beta.kubernetes.io/aws-load-balancer-type: nlb
    service.beta.kubernetes.io/do-loadbalancer-enable-proxy-protocol: "false"
  creationTimestamp: "2019-10-11T14:16:23Z"
  labels:
    app: nginx-ingress
    chart: nginx-ingress-0.26.1
    heritage: Tiller
    k8s-addon: ingress-nginx.addons.k8s.io
    release: nginx-ingress-private
  name: nginx-ingress
  namespace: backend
  resourceVersion: "84591201"
  selfLink: /api/v1/namespaces/backend/services/nginx-ingress
  uid: b4a1b347-ec31-11e9-bd40-066623cdec10
spec:
  clusterIP: 100.xxx.yyy.zzz
  externalTrafficPolicy: Local
  healthCheckNodePort: 32618
  ports:
  - name: http
    nodePort: 30259
    port: 80
    protocol: TCP
    targetPort: http
  - name: https
    nodePort: 32028
    port: 443
    protocol: TCP
    targetPort: https
  - name: ssh
    nodePort: 32012
    port: 22
    protocol: TCP
    targetPort: 22
  selector:
    app: nginx-ingress
    release: nginx-ingress-private
  sessionAffinity: None
  type: LoadBalancer
status:
  loadBalancer:
    ingress:
    - hostname: xxx-yyy.elb.eu-central-1.amazonaws.com

kubectl get pods -n backend -o wide:

NAME                                    READY   STATUS    RESTARTS   AGE     IP              NODE                                              NOMINATED NODE   READINESS GATES
drone-drone-server-59586dd487-rmv9b     1/1     Running   0          2d10h   aaa.bbb.13.19    ip-172-xxx-yyy-180.eu-central-1.compute.internal   <none>           <none>
kube-slack-848c9646fd-8w5mw             1/1     Running   0          2d10h   aaa.bbb.13.10    ip-172-xxx-yyy-180.eu-central-1.compute.internal   <none>           <none>
nginx-ingress-5nlzg                     1/1     Running   0          9h      aaa.bbb.14.164   ip-172-xxx-yyy-201.eu-central-1.compute.internal   <none>           <none>
nginx-ingress-7xb54                     1/1     Running   0          9h      aaa.bbb.15.120   ip-172-xxx-yyy-156.eu-central-1.compute.internal   <none>           <none>
nginx-ingress-default-589975445-qdjvz   1/1     Running   0          9h      aaa.bbb.14.150   ip-172-xxx-yyy-201.eu-central-1.compute.internal   <none>           <none>
nginx-ingress-jqtd6                     1/1     Running   0          9h      aaa.bbb.12.84    ip-172-xxx-yyy-29.eu-central-1.compute.internal    <none>           <none>
nginx-ingress-z9nt8                     1/1     Running   0          9h      aaa.bbb.13.57    ip-172-xxx-yyy-180.eu-central-1.compute.internal   <none>           <none>
sonarqube-sonarqube-746cbc858b-ks87c    1/1     Running   0          2d10h   aaa.bbb.12.13    ip-172-xxx-yyy-29.eu-central-1.compute.internal    <none>           <none>
youtrack-0                              1/1     Running   0          2d10h   aaa.bbb.12.52    ip-172-xxx-yyy-29.eu-central-1.compute.internal    <none>           <none>

Any ideas on what could be wrong, or how to debug this further, would be greatly appreciated.

UPDATE: It seems that the "unhealthy" / non-working nodes are exactly the nodes where no nginx pod is deployed. Deploying the nginx pods as a DaemonSet instead of a Deployment solves the issue. This still does not answer the question of why the issue appears only under these specific conditions (non-plain-HTTP port + internal load balancer) and how to solve it properly.
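
For reference, the DaemonSet change is essentially just a different workload kind wrapped around the same pod template; a minimal sketch (not a complete manifest, the pod spec is the one from the Deployment above):

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: nginx-ingress
  namespace: backend
spec:
  selector:
    matchLabels:
      app: nginx-ingress
      release: nginx-ingress-private
  template:
    metadata:
      labels:
        app: nginx-ingress
        k8s-addon: ingress-nginx.addons.k8s.io
        k8s-app: nginx-ingress-controller
        release: nginx-ingress-private
    spec: {} # identical to the Deployment's pod spec above (containers, env, probes, securityContext, ...)

With a DaemonSet there is one nginx-ingress pod per node, so every node's NodePort has a local endpoint.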

boky
  • could you check if you have a pod running in the unhealthy node with kubectl get pods -n backend -o wide – c4f4t0r Oct 13 '19 at 15:35
  • No, pods and nodes are healthy. Although, I do have three instances of `nginx-ingress` running and surprisingly they are running on the three nodes that work properly. At this point I think it might be some kind of a routing issue. – boky Oct 13 '19 at 19:37
  • please, provide the outputs of kubectl get nodes -o wide and kubectl get pods -n backend -o wide – c4f4t0r Oct 13 '19 at 22:05

1 Answer


I would say it's working as designed.

Please note that by setting the following on the Ingress Controller's 'nginx-ingress' Service (the LoadBalancer Service):

 externalTrafficPolicy: Local

you are enforcing that ingress traffic is served only by those Pods (replicas of the Ingress Controller) that are local to the Node. In AWS, the NLB does not know which Nodes (the Load Balancer's registered targets) don't run their own replica of the Nginx Ingress Controller, so traffic that the LB distributes to such Nodes sometimes fails.
To avoid this behavior you would need to remove these Nodes manually from the LB's health-check targets.
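
To see which nodes actually have a local replica from kube-proxy's point of view, you can query the Service's healthCheckNodePort (32618 according to the Service spec in the question); this is the endpoint the cloud load balancer's health checks are supposed to hit (a sketch, the exact response body may vary between versions):

curl -sD - http://localhost:32618/healthz

Nodes running an nginx-ingress Pod should answer with HTTP 200 and a non-zero localEndpoints count, the remaining nodes with HTTP 503.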

Why does switching from a 'Deployment' to a 'DaemonSet' solve the issue in your case? Because a DaemonSet ensures that each Node runs a copy of the Ingress Controller Pod.

Please note that this is cloud-provider-specific Load Balancer behavior, documented here.

If you don't care about preserving the source client IPs, just change 'externalTrafficPolicy' from 'Local' to 'Cluster' to solve your issue.
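
If you go that route, a patch along these lines should be enough (assuming the Service name and namespace from the question):

kubectl -n backend patch service nginx-ingress -p '{"spec":{"externalTrafficPolicy":"Cluster"}}'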

Nepomucen
  • Thanks! Solved! I got the impression that this meant an internal load balancer would be created – I guess I read too quickly through the document. – boky Oct 18 '19 at 15:49