I am trying to install the mongodb-replicaset Helm chart available on Rancher 2 (although I think this is mostly a k8s problem).

The service is named mongodb-replicaset in the namespace mongodb-replicaset.

On init, the bootstrap container is stuck waiting for the peer-finder command. This log line is printed every few seconds:

lookup mongodb-replicaset on 10.43.0.10:53: read udp 10.42.8.5:54048->10.43.0.10:53: i/o timeout

The IP (10.43.0.10) is the same as in /etc/resolv.conf, but it looks like the DNS server is not responding in time.

> kubectl exec -i -t dnsutils -- cat /etc/resolv.conf  
nameserver 10.43.0.10
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5

I added a log option for coredns in the related ConfigMap, and now I can see the requests made to the coredns service:

> kubectl logs --namespace=kube-system -l k8s-app=kube-dns -f
[INFO] 10.42.8.5:49418 - 22856 "SRV IN mongodb-replicaset.mongodb-replicaset.svc.cluster.local. udp 73 false 512" NOERROR qr,aa,rd 316 0.000214362s
[INFO] 10.42.8.5:47437 - 28407 "SRV IN mongodb-replicaset.svc.cluster.local. udp 54 false 512" NOERROR qr,aa,rd 147 0.000184052s
[INFO] 10.42.8.5:35926 - 3179 "SRV IN mongodb-replicaset.svc.cluster.local. udp 54 false 512" NOERROR qr,aa,rd 147 0.000168232s
[INFO] 10.42.8.5:52756 - 25514 "SRV IN mongodb-replicaset.cluster.local. udp 50 false 512" NXDOMAIN qr,aa,rd 143 0.000166371s
[INFO] 10.42.8.5:45189 - 5389 "SRV IN mongodb-replicaset. udp 36 false 512" NXDOMAIN qr,rd,ra 111 0.001224073s
[INFO] 10.42.8.5:34150 - 3084 "SRV IN mongodb-replicaset. udp 36 false 512" NXDOMAIN qr,aa,rd,ra 111 0.000131951s
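
If I understand the resolver correctly, these queries are just the search-list expansion of the short name mongodb-replicaset. Below is a minimal sketch of that expansion. The candidates function is a hypothetical helper of mine, and the search list passed to it is an assumption: it is what I would expect inside a pod in the mongodb-replicaset namespace, not the one from the dnsutils pod shown above.

```shell
# Hypothetical helper: print the names a resolver would try, in order,
# for a given name, ndots value, and search list.
candidates() {
  name=$1; ndots=$2; shift 2
  # A trailing dot marks the name as fully qualified: no search list.
  case $name in
    *.) printf '%s\n' "$name"; return ;;
  esac
  # Count the dots in the name.
  dots_only=$(printf '%s' "$name" | tr -cd '.')
  dots=${#dots_only}
  # With at least ndots dots, the name is tried as-is first.
  if [ "$dots" -ge "$ndots" ]; then printf '%s.\n' "$name"; fi
  # Then each search domain is appended in turn.
  for domain in "$@"; do printf '%s.%s.\n' "$name" "$domain"; done
  # With fewer than ndots dots, the bare name is tried last.
  if [ "$dots" -lt "$ndots" ]; then printf '%s.\n' "$name"; fi
}

# Assumed search list for a pod in the mongodb-replicaset namespace.
candidates mongodb-replicaset 5 \
  mongodb-replicaset.svc.cluster.local svc.cluster.local cluster.local
```

This prints the four distinct names from the logs above in the same order, so the lookups themselves look normal to me; only the replies seem to get lost.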

When I run dig or nslookup against this DNS server, both commands time out:

> kubectl exec -i -t dnsutils -- dig serverfault.com

; <<>> DiG 9.11.6-P1 <<>> serverfault.com
;; global options: +cmd
;; connection timed out; no servers could be reached
command terminated with exit code 9

But I can see the request in the coredns logs:

> kubectl logs --namespace=kube-system -l k8s-app=kube-dns -f
[INFO] 10.42.6.6:43125 - 1737 "A IN serverfault.com. udp 56 false 4096" NOERROR qr,rd,ra 157 0.001773677s
[INFO] 10.42.6.6:43125 - 1737 "A IN serverfault.com. udp 56 false 4096" NOERROR qr,aa,rd,ra 157 0.000264764s
[INFO] 10.42.6.6:43125 - 1737 "A IN serverfault.com. udp 56 false 4096" NOERROR qr,aa,rd,ra 157 0.000200713s
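
What strikes me in these lines is that the client port (43125) and the query ID (1737) are identical, so I assume they are dig's retries of a single query: coredns answers each one, but the answer never seems to reach the pod. Here is a rough sketch for spotting such repeats in the logs. The count_retries function is a hypothetical helper, and it assumes the default coredns log format shown above, where the second field is the client ip:port and the fourth is the query ID.

```shell
# Hypothetical helper: count how often each (client, query ID) pair occurs
# in coredns log lines read from stdin.
count_retries() {
  awk '$1 == "[INFO]" { seen[$2 " " $4]++ }
       END { for (k in seen) print seen[k], k }'
}

# Example with the three lines above; with a live cluster the input could be
# piped from: kubectl logs --namespace=kube-system -l k8s-app=kube-dns
count_retries <<'EOF'
[INFO] 10.42.6.6:43125 - 1737 "A IN serverfault.com. udp 56 false 4096" NOERROR qr,rd,ra 157 0.001773677s
[INFO] 10.42.6.6:43125 - 1737 "A IN serverfault.com. udp 56 false 4096" NOERROR qr,aa,rd,ra 157 0.000264764s
[INFO] 10.42.6.6:43125 - 1737 "A IN serverfault.com. udp 56 false 4096" NOERROR qr,aa,rd,ra 157 0.000200713s
EOF
```

A count above 1 for a pair would suggest the client resent the same query because it never received a reply.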

I get the same result when I dig the internal FQDN (dig mongodb-replicaset.mongodb-replicaset.svc.cluster.local.): the command times out, but the request shows up in the coredns logs.

By the way, it works as expected when I use an external DNS server:

> kubectl exec -i -t dnsutils -- dig serverfault.com @8.8.8.8  

; <<>> DiG 9.11.6-P1 <<>> serverfault.com @8.8.8.8
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 16124
;; flags: qr rd ra; QUERY: 1, ANSWER: 4, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;serverfault.com.               IN      A

;; ANSWER SECTION:
serverfault.com.        2414    IN      A       151.101.65.69
serverfault.com.        2414    IN      A       151.101.129.69
serverfault.com.        2414    IN      A       151.101.193.69
serverfault.com.        2414    IN      A       151.101.1.69

;; Query time: 15 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Fri Oct 16 10:21:10 UTC 2020
;; MSG SIZE  rcvd: 108

I am a total newbie in the k8s world, and I am a bit lost about how to troubleshoot this problem. I hope a good soul can guide me through this.

Thanks

Michaël