I'm running Kubernetes with flannel on a local ESXi server with three VMs: a master and two nodes. All three run Kubernetes 1.15.5, Ubuntu 18.04, and Docker 18.09.7. It is a greenfield install.
Nginx runs fine with a single pod on either node, but after scaling to two pods, curl randomly hangs for a long time and then fails with a connection timeout.
kubectl apply -f nginx.yaml
deployment.apps/nginx configured
service/nginx unchanged
cat nginx.yaml
apiVersion: apps/v1beta2
kind: Deployment
metadata:
  name: nginx
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 1
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1
        ports:
        - name: http
          containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx
spec:
  ports:
  - name: http
    nodePort: 32000
    port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: nginx
  type: NodePort
kubectl get services,pods,deployments,daemonsets -o wide
NAME                 TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE     SELECTOR
service/kubernetes   ClusterIP   10.96.0.1       <none>        443/TCP        6d17h   <none>
service/nginx        NodePort    10.102.48.211   <none>        80:32000/TCP   45m     app=nginx

NAME                         READY   STATUS    RESTARTS   AGE   IP          NODE          NOMINATED NODE   READINESS GATES
pod/nginx-6d4fbdf4df-q7jdt   1/1     Running   0          45m   10.10.2.6   kubernetes3   <none>           <none>

NAME                          READY   UP-TO-DATE   AVAILABLE   AGE   CONTAINERS   IMAGES    SELECTOR
deployment.extensions/nginx   1/1     1            1           45m   nginx        nginx:1   app=nginx
With the single pod running on kubernetes3:
curl http://kubernetes3:32000 returns the nginx page.
curl http://kubernetes2:32000 returns a connection timeout.
Scaling up to two pods:
kubectl scale --replicas=2 deployment nginx
deployment.extensions/nginx scaled
kubectl get services,pods,deployments,daemonsets -o wide
NAME                 TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE     SELECTOR
service/kubernetes   ClusterIP   10.96.0.1       <none>        443/TCP        6d17h   <none>
service/nginx        NodePort    10.102.48.211   <none>        80:32000/TCP   48m     app=nginx

NAME                         READY   STATUS    RESTARTS   AGE   IP          NODE          NOMINATED NODE   READINESS GATES
pod/nginx-6d4fbdf4df-q7jdt   1/1     Running   0          48m   10.10.2.6   kubernetes3   <none>           <none>
pod/nginx-6d4fbdf4df-zg2n5   1/1     Running   0          42s   10.10.1.5   kubernetes2   <none>           <none>

NAME                          READY   UP-TO-DATE   AVAILABLE   AGE   CONTAINERS   IMAGES    SELECTOR
deployment.extensions/nginx   2/2     2            2           48m   nginx        nginx:1   app=nginx
With two pods, curl http://kubernetes3:32000 works about half the time, and curl http://kubernetes2:32000 works almost half the time. The remaining requests end in a connection timeout. I get the same result when I run the commands on node 2 or node 3 themselves. Telnet hits the same random timeouts, even though the port is listening on all nodes and I have full connectivity between all nodes.
curl: (7) Failed to connect to kubernetes2 port 32000: Connection timed out
telnet -d kubernetes3 32000
Trying <IP>...
setsockopt (SO_DEBUG): Permission denied
kubernetes3:~$ netstat -tulpn | grep 3200
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
tcp6 0 0 :::32000 :::* LISTEN -
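To put a number on "half the time", a loop along these lines could be used. This is only a sketch: the probe function name, the CURL override variable, the default of 20 attempts, and the 3-second timeout are illustrative choices of mine, not part of the setup described above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: request a URL several times and count how many
# attempts succeed versus fail (a timeout counts as a failure).
# CURL can be overridden to substitute another command; it defaults to curl.
probe() {
  local url="$1" attempts="${2:-20}" ok=0 fail=0 i=0
  while [ "$i" -lt "$attempts" ]; do
    # --max-time 3 gives up after 3 seconds instead of waiting for the
    # much longer default TCP connect timeout.
    if "${CURL:-curl}" --silent --output /dev/null --max-time 3 "$url"; then
      ok=$((ok + 1))
    else
      fail=$((fail + 1))
    fi
    i=$((i + 1))
  done
  echo "$url ok=$ok fail=$fail"
}
```

Running probe http://kubernetes2:32000 and probe http://kubernetes3:32000 from each node prints one summary line per URL, which makes it easier to compare the failure rate before and after scaling.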
Why am I getting these timeouts when I scale up to two or more instances?