I am trying to install the mongodb-replicaset Helm chart available on Rancher 2 (although I think it is mostly a Kubernetes problem).
The service is named mongodb-replicaset, in the namespace mongodb-replicaset.
On init, the bootstrap container is stuck waiting for the peer-finder command. This log line is printed again and again, every few seconds:
lookup mongodb-replicaset on 10.43.0.10:53: read udp 10.42.8.5:54048->10.43.0.10:53: i/o timeout
The IP (10.43.0.10) is the same as in /etc/resolv.conf, but it looks like the DNS server is not responding in time.
> kubectl exec -i -t dnsutils -- cat /etc/resolv.conf
nameserver 10.43.0.10
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
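To check that I am reading the search list correctly, here is a minimal sketch of how I understand a short name gets expanded with these settings. The function name expand_search is made up for illustration. It assumes the usual glibc-style rule (a name with fewer dots than ndots goes through the search suffixes first and is tried as-is last) and does not handle names that already end with a dot.

```shell
#!/bin/sh
# expand_search NAME NDOTS SUFFIX...
# Hypothetical helper: prints the candidate names a resolver would try, in order,
# assuming glibc-style handling of the "search" and "ndots" options.
expand_search() {
    name=$1
    ndots=$2
    shift 2
    # Count the dots in the name; $(( )) also strips any padding wc may add.
    dots=$(( $(printf '%s' "$name" | tr -cd '.' | wc -c) ))
    # Enough dots: the name is tried as-is before the search list.
    if [ "$dots" -ge "$ndots" ]; then
        printf '%s.\n' "$name"
    fi
    # Each search suffix is appended in the order given in resolv.conf.
    for suffix in "$@"; do
        printf '%s.%s.\n' "$name" "$suffix"
    done
    # Too few dots: the bare name is only tried last.
    if [ "$dots" -lt "$ndots" ]; then
        printf '%s.\n' "$name"
    fi
}

# Values taken from the resolv.conf of the dnsutils pod above.
expand_search mongodb-replicaset 5 \
    default.svc.cluster.local svc.cluster.local cluster.local
```

With the values from the dnsutils pod this lists the three suffixed names followed by the bare name. A pod in the mongodb-replicaset namespace would presumably have mongodb-replicaset.svc.cluster.local as its first suffix instead of default.svc.cluster.local.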
I added the log option for CoreDNS in the related ConfigMap, and now I can see the requests made to the coredns service:
> kubectl logs --namespace=kube-system -l k8s-app=kube-dns -f
[INFO] 10.42.8.5:49418 - 22856 "SRV IN mongodb-replicaset.mongodb-replicaset.svc.cluster.local. udp 73 false 512" NOERROR qr,aa,rd 316 0.000214362s
[INFO] 10.42.8.5:47437 - 28407 "SRV IN mongodb-replicaset.svc.cluster.local. udp 54 false 512" NOERROR qr,aa,rd 147 0.000184052s
[INFO] 10.42.8.5:35926 - 3179 "SRV IN mongodb-replicaset.svc.cluster.local. udp 54 false 512" NOERROR qr,aa,rd 147 0.000168232s
[INFO] 10.42.8.5:52756 - 25514 "SRV IN mongodb-replicaset.cluster.local. udp 50 false 512" NXDOMAIN qr,aa,rd 143 0.000166371s
[INFO] 10.42.8.5:45189 - 5389 "SRV IN mongodb-replicaset. udp 36 false 512" NXDOMAIN qr,rd,ra 111 0.001224073s
[INFO] 10.42.8.5:34150 - 3084 "SRV IN mongodb-replicaset. udp 36 false 512" NXDOMAIN qr,aa,rd,ra 111 0.000131951s
When I run dig or nslookup against the cluster DNS server, both commands time out:
> kubectl exec -i -t dnsutils -- dig serverfault.com
; <<>> DiG 9.11.6-P1 <<>> serverfault.com
;; global options: +cmd
;; connection timed out; no servers could be reached
command terminated with exit code 9
But I can see the request in the coredns logs:
> kubectl logs --namespace=kube-system -l k8s-app=kube-dns -f
[INFO] 10.42.6.6:43125 - 1737 "A IN serverfault.com. udp 56 false 4096" NOERROR qr,rd,ra 157 0.001773677s
[INFO] 10.42.6.6:43125 - 1737 "A IN serverfault.com. udp 56 false 4096" NOERROR qr,aa,rd,ra 157 0.000264764s
[INFO] 10.42.6.6:43125 - 1737 "A IN serverfault.com. udp 56 false 4096" NOERROR qr,aa,rd,ra 157 0.000200713s
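What strikes me here is that the same query (same client port 43125, same ID 1737) is logged three times. I read that as dig retrying because the answer never reaches the pod, although that is only my interpretation. Below is a small sketch for spotting such repeats in the log. The function name count_retries is made up, and it assumes the log format shown above, with the client address in the second field and the query ID in the fourth.

```shell
#!/bin/sh
# count_retries: reads CoreDNS "log" plugin lines on stdin and prints
# "<count> <client ip:port> <query id>" for every query seen more than once.
# Hypothetical helper; assumes the field layout of the log lines above.
count_retries() {
    awk '$1 == "[INFO]" { seen[$2 " " $4]++ }
         END { for (k in seen) if (seen[k] > 1) print seen[k], k }'
}

# Example with the three lines logged for the dig above.
count_retries <<'EOF'
[INFO] 10.42.6.6:43125 - 1737 "A IN serverfault.com. udp 56 false 4096" NOERROR qr,rd,ra 157 0.001773677s
[INFO] 10.42.6.6:43125 - 1737 "A IN serverfault.com. udp 56 false 4096" NOERROR qr,aa,rd,ra 157 0.000264764s
[INFO] 10.42.6.6:43125 - 1737 "A IN serverfault.com. udp 56 false 4096" NOERROR qr,aa,rd,ra 157 0.000200713s
EOF
```

The same function can be fed from the kubectl logs command used above (without -f) by piping its output into count_retries.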
I get the same results as above when I dig the internal FQDN: dig mongodb-replicaset.mongodb-replicaset.svc.cluster.local.
By the way, it works as expected when I use an external DNS server:
> kubectl exec -i -t dnsutils -- dig serverfault.com @8.8.8.8
; <<>> DiG 9.11.6-P1 <<>> serverfault.com @8.8.8.8
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 16124
;; flags: qr rd ra; QUERY: 1, ANSWER: 4, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;serverfault.com. IN A
;; ANSWER SECTION:
serverfault.com. 2414 IN A 151.101.65.69
serverfault.com. 2414 IN A 151.101.129.69
serverfault.com. 2414 IN A 151.101.193.69
serverfault.com. 2414 IN A 151.101.1.69
;; Query time: 15 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Fri Oct 16 10:21:10 UTC 2020
;; MSG SIZE rcvd: 108
I am a total newbie in the k8s world, and I am a bit lost among the options for resolving this problem. I hope a good soul can guide me through this.
Thanks