
I have a fresh Kubernetes cluster on GKE.

Whenever I run kubectl top node gke-data-custom-vm-6-25-0cbae9b9-hrkc, I get:

Error from server (NotFound): the server could not find the requested resource (get services http:heapster:)

At the same time, the heapster service does exist:

> kubectl -n kube-system get services
NAME                   TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)         AGE
default-http-backend   NodePort    10.11.241.20    <none>        80:32688/TCP    59d
heapster               ClusterIP   10.11.245.182   <none>        80/TCP          59d
kube-dns               ClusterIP   10.11.240.10    <none>        53/UDP,53/TCP   59d
metrics-server         ClusterIP   10.11.249.26    <none>        443/TCP         59d

and a heapster pod is running (although I can see it has been restarted many times):

> kubectl -n kube-system get pods
NAME                                               READY     STATUS    RESTARTS   AGE
event-exporter-v0.2.3-85644fcdf-kwd6g              2/2       Running   0          16d
fluentd-gcp-scaler-8b674f786-dbrcr                 1/1       Running   0          16d
fluentd-gcp-v3.2.0-2fqgl                           2/2       Running   0          17d
fluentd-gcp-v3.2.0-47586                           2/2       Running   0          17d
fluentd-gcp-v3.2.0-552xm                           2/2       Running   0          16d
heapster-v1.6.0-beta.1-fdc7fd478-8s998             3/3       Running   73         16d

However, the logs of the heapster-nanny container show some errors:

> kubectl logs -n kube-system --tail 10 -f po/heapster-v1.6.0-beta.1-fdc7fd478-8s998 -c heapster-nanny
ERROR: logging before flag.Parse: E0418 23:30:10.075539       1 nanny_lib.go:95] Error while querying apiserver for resources: Get https://10.11.240.1:443/api/v1/namespaces/kube-system/pods/heapster-v1.6.0-beta.1-fdc7fd478-8s998: dial tcp 10.11.240.1:443: getsockopt: connection refused
ERROR: logging before flag.Parse: E0418 23:30:10.971230       1 reflector.go:205] k8s.io/autoscaler/addon-resizer/nanny/kubernetes_client.go:107: Failed to list *v1.Node: Get https://10.11.240.1:443/api/v1/nodes?resourceVersion=0: dial tcp 10.11.240.1:443: getsockopt: connection refused
ERROR: logging before flag.Parse: E0418 23:30:11.972337       1 reflector.go:205] k8s.io/autoscaler/addon-resizer/nanny/kubernetes_client.go:107: Failed to list *v1.Node: Get https://10.11.240.1:443/api/v1/nodes?resourceVersion=0: dial tcp 10.11.240.1:443: getsockopt: connection refused
ERROR: logging before flag.Parse: E0418 23:30:12.973637       1 reflector.go:205] k8s.io/autoscaler/addon-resizer/nanny/kubernetes_client.go:107: Failed to list *v1.Node: Get https://10.11.240.1:443/api/v1/nodes?resourceVersion=0: dial tcp 10.11.240.1:443: getsockopt: connection refused
ERROR: logging before flag.Parse: E0418 23:30:13.975024       1 reflector.go:205] k8s.io/autoscaler/addon-resizer/nanny/kubernetes_client.go:107: Failed to list *v1.Node: Get https://10.11.240.1:443/api/v1/nodes?resourceVersion=0: dial tcp 10.11.240.1:443: getsockopt: connection refused
ERROR: logging before flag.Parse: E0418 23:30:14.976582       1 reflector.go:205] k8s.io/autoscaler/addon-resizer/nanny/kubernetes_client.go:107: Failed to list *v1.Node: Get https://10.11.240.1:443/api/v1/nodes?resourceVersion=0: dial tcp 10.11.240.1:443: getsockopt: connection refused
ERROR: logging before flag.Parse: E0418 23:30:16.063760       1 reflector.go:205] k8s.io/autoscaler/addon-resizer/nanny/kubernetes_client.go:107: Failed to list *v1.Node: Get https://10.11.240.1:443/api/v1/nodes?resourceVersion=0: dial tcp 10.11.240.1:443: getsockopt: connection refused
ERROR: logging before flag.Parse: E0418 23:30:27.065693       1 reflector.go:205] k8s.io/autoscaler/addon-resizer/nanny/kubernetes_client.go:107: Failed to list *v1.Node: Get https://10.11.240.1:443/api/v1/nodes?resourceVersion=0: net/http: TLS handshake timeout
ERROR: logging before flag.Parse: E0418 23:30:30.077159       1 nanny_lib.go:95] Error while querying apiserver for resources: Get https://10.11.240.1:443/api/v1/namespaces/kube-system/pods/heapster-v1.6.0-beta.1-fdc7fd478-8s998: net/http: TLS handshake timeout
ERROR: logging before flag.Parse: E0418 23:30:59.778560       1 reflector.go:205] k8s.io/autoscaler/addon-resizer/nanny/kubernetes_client.go:107: Failed to list *v1.Node: Get https://10.11.240.1:443/api/v1/nodes?resourceVersion=0: dial tcp 10.11.240.1:443: i/o timeout

and so do the logs of the heapster container:

I0423 07:02:10.765134       1 heapster.go:113] Starting heapster on port 8082
W0423 07:16:27.975467       1 manager.go:152] Failed to get all responses in time (got 2/3)
W0423 07:16:43.064110       1 manager.go:107] Failed to get kubelet_summary:10.128.0.49:10255 response in time
W0423 07:20:36.875359       1 manager.go:152] Failed to get all responses in time (got 2/3)
W0423 07:20:44.383790       1 manager.go:107] Failed to get kubelet_summary:10.128.0.49:10255 response in time
W0423 07:22:29.683060       1 manager.go:152] Failed to get all responses in time (got 2/3)
W0423 07:22:40.278962       1 manager.go:107] Failed to get kubelet_summary:10.128.0.49:10255 response in time
W0423 07:31:27.072711       1 manager.go:152] Failed to get all responses in time (got 2/3)
W0423 07:31:54.580031       1 manager.go:107] Failed to get kubelet_summary:10.128.0.49:10255 response in time

How can I fix this?

Is there any additional info that I should provide?

Korjavin Ivan

2 Answers


Heapster Deprecation

Heapster is a deprecated project and may have problems when running in recent Kubernetes versions.

See Heapster Deprecation Timeline:

| Kubernetes Release  | Action              | Policy/Support                                                                   |
|---------------------|---------------------|----------------------------------------------------------------------------------|
| Kubernetes 1.11     | Initial Deprecation | No new features or sinks are added.  Bugfixes may be made.                       |
| Kubernetes 1.12     | Setup Removal       | The option to install Heapster via the Kubernetes setup script is removed.       |
| Kubernetes 1.13     | Removal             | No new bugfixes will be made.  Move to kubernetes-retired organization.          |

Since Kubernetes v1.10, kubectl top relies on metrics-server by default.

CHANGELOG-1.10.md:

  • Support metrics API in kubectl top commands. (#56206, @brancz)

This PR implements support for the kubectl top commands to use the metrics-server as an aggregated API, instead of requesting the metrics from heapster directly. If the metrics.k8s.io API is not served by the apiserver, then this still falls back to the previous behavior.
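
To see which of the two paths applies to a given cluster, one can look for the metrics.k8s.io group in the output of kubectl api-versions. The helper below is a minimal sketch of that check; top_backend is a hypothetical name introduced here, and it only approximates the fallback described in the changelog rather than reproducing kubectl's actual logic.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads the output of `kubectl api-versions` on stdin
# and reports which backend a v1.10+ kubectl top would be expected to use.
top_backend() {
  if grep -q '^metrics\.k8s\.io/'; then
    # The aggregated metrics API is served by the apiserver.
    echo "metrics-server"
  else
    # The metrics API is absent, so kubectl top falls back to heapster.
    echo "heapster"
  fi
}

# Against a live cluster (requires kubectl and cluster access):
#   kubectl api-versions | top_backend
```

If this reports heapster even though a metrics-server service exists, the aggregated API is probably not registered or not healthy, which is worth checking before looking at heapster itself.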


What you should do:

Since Heapster is deprecated and you already have metrics-server deployed, the best option is to use kubectl v1.10 or above, as it fetches the metrics from metrics-server.

However, beware of the kubectl version skew policy:

kubectl is supported within one minor version (older or newer) of kube-apiserver

Check your kube-apiserver version before choosing your kubectl version.
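
As a rough illustration of that rule, the sketch below compares the minor versions of two version strings. The function names minor_of and skew_ok are hypothetical, and the check assumes both versions share the same major version.

```shell
#!/usr/bin/env bash
# Extract the minor version number from a string such as "v1.11.8-gke.6".
minor_of() {
  local v="${1#v}"        # drop a leading "v" if present
  v="${v#*.}"             # drop the major component and its dot
  v="${v%%.*}"            # keep only the minor component
  echo "${v%%[!0-9]*}"    # strip any non-numeric suffix such as "+"
}

# Succeed when the client and server minor versions differ by at most one.
# Assumption: both versions have the same major version.
skew_ok() {
  local c s d
  c="$(minor_of "$1")"
  s="$(minor_of "$2")"
  d=$(( c - s ))
  [ "${d#-}" -le 1 ]      # ${d#-} removes the sign, giving the absolute value
}

# Example: a v1.10.0 client against a v1.11.8-gke.6 server.
if skew_ok "v1.10.0" "v1.11.8-gke.6"; then
  echo "within supported skew"
else
  echo "outside supported skew"
fi
```

The two version strings can be read from the client and server lines printed by kubectl version.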

Eduardo Baitello

I suspect your issue is related to an auto-upgrade of your GKE master.

Mine was recently upgraded to v1.11.8-gke.6, and during the upgrade I observed the same intermittent errors inside the heapster-nanny container:

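(the same E0418-prefixed log lines as in the question; in that log format the prefix is the severity E followed by the date 04/18, not an error code)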

For me, the problem no longer persists, and I can safely get the nodes' metrics with kubectl.

Eduardo Baitello
Nepomucen