I'm setting up Istio in a new AWS EKS cluster and created a basic nginx deployment to test it. With a single replica, everything works perfectly and responses come back in under 100 ms. As soon as I scale to two replicas, requests routed to the new pod become extremely slow, averaging around 10 seconds.
Based on suggestions from elsewhere, I updated the mesh config to disable automatic retries:
meshConfig:
  defaultHttpRetryPolicy: {}
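For context, here is a minimal sketch of where that fragment would sit, assuming the mesh is installed through an IstioOperator resource (the resource name below is hypothetical and not taken from my cluster). It uses the explicit `attempts: 0` form, which the MeshConfig reference describes as the way to disable the default retry policy globally:

```yaml
# Sketch only: assumes Istio was installed with an IstioOperator resource.
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  name: example-istiocontrolplane   # hypothetical name
  namespace: istio-system
spec:
  meshConfig:
    defaultHttpRetryPolicy:
      attempts: 0   # 0 turns off the mesh-wide default HTTP retries
```

A per-route `retries` setting on a VirtualService would still override this mesh-wide default.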
With retries disabled, I found that requests to the second pod always fail. The ingress gateway's access log shows a 503 with the UF (upstream connection failure) flag after roughly 10 seconds:
"GET / HTTP/1.1" 503 UF upstream_reset_before_response_started{connection_failure} - "-" 0 91 10003 - "108.249.9.111,10.1.0.117" "curl/7.68.0" "6fa51be8-1441-4454-8d1b-a03c93b257dc" "example.com" "10.1.52.62:80" outbound|80||nginx.my-namespace.svc.cluster.local - 10.1.108.189:8080 10.1.0.117:21410 - -
My setup is the following:
# AWS ALB Ingress -> istio-ingressgateway (ClusterIP) -> gateway -> virtualservice -> service -> nginx
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: default
spec:
  selector:
    istio: ingressgateway
  servers:
    - port:
        number: 80
        name: http
        protocol: HTTP
      hosts:
        - "*"
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: nginx
spec:
  hosts:
    - "example.com"
  gateways:
    - default
  http:
    - route:
        - destination:
            host: nginx
---
apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  selector:
    app: nginx
  ports:
    - port: 80
      name: http
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  labels:
    app: nginx
    version: v1
spec:
  replicas: 2
  revisionHistoryLimit: 1
  selector:
    matchLabels:
      app: nginx
      version: v1
  template:
    metadata:
      labels:
        app: nginx
        version: v1
    spec:
      containers:
        - name: nginx
          image: nginx:latest
          ports:
            - containerPort: 80
          resources:
            requests:
              memory: 100Mi
              cpu: 100m
            limits:
              memory: 1500Mi
              cpu: 1000m
Versions:
$ istioctl version
client version: 1.13.2
control plane version: 1.13.2
data plane version: 1.13.2 (1 proxies)
$ kubectl version --short
Client Version: v1.21.11
Server Version: v1.21.5-eks-bc4871b