
I have some kind of routing issue when adding a Windows node to an AKS cluster that is locked down to an internal network. With 0 Windows nodes, ingress works fine and is very reliable. Ingress was set up following these instructions: https://docs.microsoft.com/en-us/azure/aks/ingress-internal-ip?tabs=azure-cli
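
For reference, the ingress controller service is the internal-LoadBalancer variant from that walkthrough. Roughly how I verify it (the `ingress-basic` namespace and the release/service name are assumptions taken from the linked doc, not necessarily my exact names):

```bash
# Confirm the ingress controller Service is an internal LoadBalancer and
# note the IP it was assigned (should match what Fiddler resolves the hostname to).
kubectl get service -n ingress-basic -o wide

# The internal load balancer behaviour comes from the
# service.beta.kubernetes.io/azure-load-balancer-internal: "true" annotation,
# which shows up under Annotations in the describe output.
kubectl describe service nginx-ingress-ingress-nginx-controller -n ingress-basic
```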

However, after adding a Windows node (by scaling the Windows node pool from 0 to 1), and before running any pods on it, roughly 5% of the same requests that worked before sporadically result in 502 errors. Fiddler shows that the hostname resolved to the correct Kubernetes LoadBalancer IP, but the connection was "actively refused" by the target machine.
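
In case it matters, the node was added with something along these lines (resource group, cluster name, and pool name are placeholders):

```bash
# Scale the Windows node pool from 0 to 1 node; no workloads are scheduled on it yet.
az aks nodepool scale \
  --resource-group myResourceGroup \
  --cluster-name myAKSCluster \
  --name npwin \
  --node-count 1
```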

The logs for the ingress pod show no trace of those 502s, so the request apparently reached the load balancer but was never routed to the ingress pod to be handled.

Scaling the ingress pod out from 1 to 2 replicas did not help. Removing the Windows node from the cluster immediately resolves the issue, as if some inner workings of Kubernetes try to route ingress traffic through the Windows node and fail.
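
The checks I have thought of so far are along these lines (namespace and service names are again assumptions based on the linked walkthrough):

```bash
# Which nodes exist, and is the Windows node Ready?
kubectl get nodes -o wide

# Which pods actually back the ingress Service, and on which (Linux) nodes do they run?
kubectl get pods -n ingress-basic -o wide
kubectl get endpoints -n ingress-basic

# Is the Service using Cluster or Local external traffic policy?
# With "Cluster", every node (including the Windows one) is added to the
# Azure load balancer backend pool and is expected to forward traffic.
kubectl get service nginx-ingress-ingress-nginx-controller -n ingress-basic \
  -o jsonpath='{.spec.externalTrafficPolicy}'
```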

Any suggestions on where to look or how to diagnose the problem would be appreciated.
