
Rancher Server Setup

  • Rancher version: 2.6.3
  • Installation option (Docker install/Helm Chart): Helm Chart, Kubernetes v1.21.6 and RKE1

Information about the Cluster

  • Kubernetes version: v1.20.15-rancher1-2
  • Cluster Type (Local/Downstream): Downstream
  • If downstream, what type of cluster? (Custom/Imported or specify provider for Hosted/Infrastructure Provider): RKE Custom (3 nodes on-prem + 1 node on Azure)

User Information

  • What is the role of the user logged in? (Admin/Cluster Owner/Cluster Member/Project Owner/Project Member/Custom): Admin

Describe the bug
To illustrate the inter-pod communication problem, consider these three dcgm-exporter pods, which collect and expose GPU metrics:

  • URL1- http://10.42.0.79:9400/metrics -> Pod 10.42.0.79 running on node-1-on-prem

  • URL2- http://10.42.2.77:9400/metrics -> Pod 10.42.2.77 running on node-2-on-prem

  • URL3- http://10.42.4.54:9400/metrics -> Pod 10.42.4.54 running on node-3-azure

  • On the node-1-on-prem Linux shell: curl to URL1 and URL2 succeeds; curl to URL3 fails

  • On the node-2-on-prem Linux shell: curl to URL1 and URL2 succeeds; curl to URL3 fails

  • On the node-3-azure Linux shell: curl to URL1 and URL2 fails; curl to URL3 succeeds
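For anyone who wants to repeat the curl checks above in one go, here is a minimal Python sketch. It assumes the three metrics URLs listed above; the names `probe` and `reachability` are made up for this illustration and are not part of Rancher or dcgm-exporter.

```python
import http.client
import urllib.request

# Pod metrics endpoints copied from the list above.
URLS = {
    "URL1": "http://10.42.0.79:9400/metrics",
    "URL2": "http://10.42.2.77:9400/metrics",
    "URL3": "http://10.42.4.54:9400/metrics",
}


def probe(url, timeout=3.0):
    """Return True if an HTTP GET to url answers with status 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Covers timeouts, refused connections, unreachable networks
        # and HTTP error responses (urllib's errors derive from OSError).
        return False


def reachability(urls, probe_fn=probe):
    """Map each URL name to True (reachable) or False (not reachable)."""
    return {name: probe_fn(url) for name, url in urls.items()}
```

Running `print(reachability(URLS))` on each node should give the same pass/fail pattern as the curl results listed above.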

Reproduce

  • On-prem subnet is 10.133.100.0/24 and Azure subnet is 10.208.2.0/24
  • The Azure virtual network and the local network are connected by a site-to-site VPN
  • Node to node connections are successful and there are no port restrictions in Azure and on-prem
  • IPv4 port forwarding enabled on all nodes
  • Downstream cluster container network interface (CNI) configuration:

        network:
          mtu: 0
          options:
            flannel_backend_type: vxlan
          plugin: canal
  • Azure node addition to cluster is flawless and all pods come up
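To make the addressing in this setup explicit, the following sketch checks which network each pod IP belongs to. The two node subnets are the ones listed above. The pod CIDR 10.42.0.0/16 is an assumption (it is the RKE default `cluster_cidr` and is not stated in this post), and `classify` is a hypothetical helper name.

```python
import ipaddress

# Node subnets as listed above.
NODE_SUBNETS = {
    "on-prem": ipaddress.ip_network("10.133.100.0/24"),
    "azure": ipaddress.ip_network("10.208.2.0/24"),
}

# Assumption: RKE default cluster_cidr; not confirmed for this cluster.
POD_CIDR = ipaddress.ip_network("10.42.0.0/16")

# Pod IPs taken from the three metrics URLs.
POD_IPS = ["10.42.0.79", "10.42.2.77", "10.42.4.54"]


def classify(ip):
    """Name the network an address belongs to."""
    addr = ipaddress.ip_address(ip)
    for name, net in NODE_SUBNETS.items():
        if addr in net:
            return name
    return "pod-overlay" if addr in POD_CIDR else "unknown"


for ip in POD_IPS:
    print(ip, classify(ip))
```

Under that assumption, none of the pod IPs lies in either node subnet, so pod traffic between the on-prem nodes and the Azure node has to be carried by the overlay (VXLAN in this configuration) across the VPN rather than being routed directly.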

Result

Expected Result

  • Successful inter-pod communication and display of GPU metrics

How can I get these pods to communicate properly? Thanks in advance for your support.
