I set up a Kubernetes cluster using the Kubernetes the Hard Way tutorial, and the connection hangs whenever a Pod connects to another Pod on the same node through a ClusterIP (hairpin traffic).

If I access the pods directly, without going through the ClusterIP, everything works fine.

So, visually, this doesn't work:

PodA -> ServiceA ClusterIP -> PodA 
PodA -> ServiceB ClusterIP -> PodB on same node 

However, this works 100% great, as does any Pod contacting another Pod directly by its IP:

PodA -> ServiceB ClusterIP -> PodB on other node 

I found the Kubernetes documentation on debugging Services and went through it. Everything seems fine up to the section "A Pod can't reach itself via Service VIP".

I see rules added for my services in the iptables-save output (and I confirmed that kube-proxy is using iptables mode):

-A KUBE-SERVICES ! -s 10.200.0.0/16 -d 10.32.0.10/32 -p udp -m comment --comment "kube-system/kube-dns:dns cluster IP" -m udp --dport 53 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.32.0.10/32 -p udp -m comment --comment "kube-system/kube-dns:dns cluster IP" -m udp --dport 53 -j KUBE-SVC-TCOU7JCQXEZGVUNU
-A KUBE-SERVICES ! -s 10.200.0.0/16 -d 10.32.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp cluster IP" -m tcp --dport 53 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.32.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp cluster IP" -m tcp --dport 53 -j KUBE-SVC-ERIFXISQEP7F7OF4

The kubelet's logs show the hairpin mode flag set to promiscuous-bridge:

kubelet[12496]: I1204 04:13:29.761707   12496 flags.go:33] FLAG: --hairpin-mode="promiscuous-bridge"

I don't see a log line such as Hairpin mode set to "promiscuous-bridge" specifically confirming the mode, so I also set it explicitly in kubelet-config.yml.

Also, I edited the CNI plugin to add promiscMode: true (docs), and I see PROMISC on the cnio0 interface:

cnio0: flags=4419<UP,BROADCAST,RUNNING,PROMISC,MULTICAST>  mtu 1500
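For the hairpin-veth variant, the check that the debugging guide describes is reading the hairpin_mode file of each port attached to the bridge. Here is a minimal Python sketch of that check, assuming the bridge is named cnio0 as above and that the usual sysfs layout under /sys/devices/virtual/net applies. The function name read_hairpin_modes is made up for illustration. With promiscuous-bridge the per-port flags are expected to stay off, so this is mainly useful when testing hairpinMode: true.

```python
from pathlib import Path


def read_hairpin_modes(bridge="cnio0", sysfs_root="/sys/devices/virtual/net"):
    """Return {port_name: hairpin_enabled} for every port attached to the bridge.

    Assumes the standard sysfs layout <root>/<bridge>/brif/<port>/hairpin_mode,
    where the file contains "1" when hairpin is enabled and "0" otherwise.
    """
    brif = Path(sysfs_root) / bridge / "brif"
    modes = {}
    if not brif.is_dir():
        # Bridge missing, or the interface is not a bridge: nothing to report.
        return modes
    for port in sorted(brif.iterdir()):
        flag = port / "hairpin_mode"
        if flag.is_file():
            modes[port.name] = flag.read_text().strip() == "1"
    return modes


if __name__ == "__main__":
    for port, enabled in read_hairpin_modes().items():
        print(f"{port}: hairpin {'on' if enabled else 'off'}")
```

Run it on the node itself (not inside a Pod), since it reads the node's sysfs.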

At this point I think either 1) this tutorial doesn't have hairpin traffic working or 2) I screwed up something obscure that's breaking this, but I can't figure out what it is!

Since this Kubernetes the Hard Way tutorial is known to be the canonical setup reference, I'm doubting it would be #1... anyone have other suggestions to determine #2?

aarosil
  • Could you please check if you got `"hairpinMode": true` in your CNI config? – Wytrzymały Wiktor Dec 05 '19 at 12:37
  • @OhHiMark, you can only set one of `promiscMode` or `hairpinMode` to true in the CNI plugin. I set `promiscMode: true` since that seemed to correlate to the kubelet's `hairpinMode` setting of `promiscuous-bridge`. I assumed you'd use `hairpinMode: true` in the CNI plugin if you set the kubelet `hairpinMode` value to `hairpin-veth` instead. I did, however, try both combinations and neither worked. I think you're right, it must be something with the CNI plugin. – aarosil Dec 06 '19 at 05:32
  • In [this GitHub thread](https://github.com/kubernetes/kubernetes/issues/45790) there is a discussion of this particular issue. Indeed, the problem lies within the CNI and not Kubernetes itself. There is [another thread](https://github.com/containernetworking/cni/issues/476) already open for that. – Wytrzymały Wiktor Dec 09 '19 at 13:43
  • Thanks! I saw those issues and figured it would've been resolved given how old they were, but maybe not. I'll look into the CNI situation. I think in another thread I found, someone used Cilium or another CNI plugin instead, and then hairpin traffic worked. – aarosil Dec 09 '19 at 17:04
  • Good to hear that. I will post an answer so that the rest of the community with a similar problem can also benefit from it. – Wytrzymały Wiktor Dec 10 '19 at 08:57
  • Did @Wytrzymały Wiktor's answer help you solve your problem? If yes, please consider accepting and upvoting it. [What should I do when someone answers my question?](https://stackoverflow.com/help/someone-answers) – Fariya Rahmat Apr 05 '22 at 06:38

1 Answer

This is a known issue with CNI documented here and here.

I think the abstract "allows hairpin behavior" is a property of plugins. Some support it. Some do not. Some may support it in multiple ways with configuration required. If you don't support it in a Kubernetes install, one class of Service behavior will not work correctly. That may be OK for some installations and not for others.

However, you can use Flannel with hairpinMode set to true (the default Flannel configuration does not set hairpinMode to true). That part of the config would look like this:

    {
      "name": "<name_here>",
      "type": "flannel",
      "delegate": {
        "hairpinMode": true,
        "isDefaultGateway": true
      }
    }
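
If it helps, here is a small Python sketch for sanity-checking a config of that shape before rolling it out. It assumes the config is a single JSON document with a delegate object as shown above, and it relies on the point raised in the comments that promiscMode and hairpinMode should not both be true. The function name check_hairpin_config is hypothetical and not part of any CNI tooling.

```python
import json


def check_hairpin_config(conf_text):
    """Return a list of problems found in a flannel-style CNI config (JSON text).

    Only inspects the "delegate" object; an empty list means no problem found.
    """
    conf = json.loads(conf_text)
    delegate = conf.get("delegate", {})
    hairpin = delegate.get("hairpinMode") is True
    promisc = delegate.get("promiscMode") is True

    problems = []
    if not hairpin:
        problems.append("delegate.hairpinMode is not true")
    if hairpin and promisc:
        # Per the comments in this thread, only one of the two may be set.
        problems.append("hairpinMode and promiscMode are both true")
    return problems


if __name__ == "__main__":
    example = """
    {
      "name": "example",
      "type": "flannel",
      "delegate": {"hairpinMode": true, "isDefaultGateway": true}
    }
    """
    print(check_hairpin_config(example) or "looks fine")
```

This only validates the JSON, not whether the plugin actually applied the setting on the node.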

Please let me know if that helped.