
A few weeks ago I set up a PowerDNS authoritative server for a third-level domain. After two weeks, some public DNS resolvers still don't resolve my records: for example, Google (8.8.8.8) resolves them, but OpenDNS (208.67.222.220) does not. I have tried some online DNS checker tools, and only about 50% of public DNS resolvers work with my records.

How can I find out why? Could it be related to DNSSEC (I did not enable it)?

Tobia
  • Unless you could share the actual domain, it's hard for us to check what's wrong with its configuration. What we can say: DNS [doesn't propagate](https://esajokinen.net/dns/) (push) but is requested (pull) and cached. Also, it's very unlikely the problem would be with DNSSEC if it's really disabled for the domain, but if it's enabled and misconfigured it might be the cause. – Esa Jokinen Jul 03 '20 at 11:54
  • This is an example record: `test.ep.cinebot.it`. It is a PowerDNS server connected to a MySQL database (the service is similar to DDNS). – Tobia Jul 03 '20 at 13:29
  • 1) There is no propagation, as DNS is not top-down; if you query the authoritative name servers you should see the new content immediately, and then you can query the recursive ones to see what is happening. As Esa said, `dnsviz` can be a great help. 2) Do a `dig` query with and without `+cd`. If you get `SERVFAIL` without it but a correct reply with `+cd`, it means you have a DNSSEC problem 99.99% of the time. But if you get `SERVFAIL` in both cases, it means the problem is elsewhere. – Patrick Mevzek Jul 04 '20 at 19:03
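
The `+cd` comparison from the last comment can be sketched in a few lines of Python. This is only an illustration: it assumes `dig` is installed and on the PATH, and the helper names (`extract_status`, `run_dig`, `diagnose`) are made up for this example. It reads the `status:` field that `dig` prints in its header line and applies the rule of thumb described above.

```python
import re
import subprocess


def extract_status(dig_output):
    """Return the status (e.g. 'NOERROR', 'SERVFAIL') from dig's header line, or None."""
    match = re.search(r"status:\s*([A-Z]+)", dig_output)
    return match.group(1) if match else None


def run_dig(name, resolver, cd=False):
    """Run dig against a recursive resolver, optionally with +cd (checking disabled)."""
    cmd = ["dig", f"@{resolver}", name]
    if cd:
        cmd.append("+cd")
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
    return extract_status(result.stdout)


def diagnose(status_plain, status_cd):
    """Apply the rule of thumb: compare the status without and with +cd."""
    if status_plain != "SERVFAIL":
        return "no SERVFAIL without +cd; this check does not apply"
    if status_cd == "SERVFAIL":
        return "SERVFAIL with and without +cd; the problem is elsewhere"
    if status_cd == "NOERROR":
        return "SERVFAIL only without +cd; most likely a DNSSEC problem"
    return "inconclusive"
```

For the record from this question, one might call `diagnose(run_dig("test.ep.cinebot.it", "208.67.222.220"), run_dig("test.ep.cinebot.it", "208.67.222.220", cd=True))`.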

2 Answers


So, it seems you have delegated the control for ep.cinebot.it. to another name server:

;; AUTHORITY SECTION:
ep.cinebot.it.          3600    IN      NS      mbox.cinebot.it.
ep.cinebot.it.          3600    IN      NS      srv1.cinebot.it.

;; ADDITIONAL SECTION:
mbox.cinebot.it.        3600    IN      A       51.255.48.120
srv1.cinebot.it.        3600    IN      A       51.255.48.120

;; SERVER: 213.251.128.129#53(213.251.128.129)  ### ns10.ovh.net

Now, there are some problems:

  • Both NS records point to the same IP address, so you effectively have only one name server, a configuration prone to failure. You are required to have at least two name servers on separate networks (IANA Technical requirements for authoritative name servers).

  • 51.255.48.120 doesn't answer everyone. The result is status: SERVFAIL instead of NXDOMAIN. Is there some kind of firewall? Or maybe Fail2Ban with too strict a configuration?

    E.g. while DNSViz for test.ep.cinebot.it mainly shows there's no DNSSEC for cinebot.it (proving there's no problem with DNSSEC), it also gives a clear error suggesting communication problems:

    ep.cinebot.it zone: The server(s) were not responsive to queries over TCP. (51.255.48.120)

    With +trace I get consistent results from 1.1.1.1 (Cloudflare) and 208.67.222.220 (OpenDNS), and occasionally even from 8.8.8.8 / 8.8.4.4 (Google):

     $ dig test.ep.cinebot.it +trace +tcp @1.1.1.1
    
     ;; communications error to 51.255.48.120#53: end of file
     ;; communications error to 51.255.48.120#53: end of file
    

    This made me also check whether the problem is with UDP vs. TCP connections, but it seems your server answers similarly with both dig +tcp and +notcp:

     ;; ANSWER SECTION:
     test.ep.cinebot.it.     60      IN      A       93.42.126.242
    
     ;; Query time: 27 msec
     ;; SERVER: 51.255.48.120#53(51.255.48.120)
    
  • Also, your TTL is extremely low (60 seconds). This means that recursive name servers won't cache the responses for long, which emphasizes the importance of responsive and redundant name servers.
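
The UDP vs. TCP comparison above can also be reproduced without dig. The following is a minimal sketch using only the Python standard library; the function names are invented for this example, and it only reads the 12-byte DNS header (response code, truncation flag, answer count) rather than decoding the records themselves.

```python
import socket
import struct


def build_query(name, qtype=1, query_id=0x1234):
    """Build a minimal DNS query (class IN, qtype 1 = A, recursion not requested)."""
    header = struct.pack("!HHHHHH", query_id, 0x0000, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def parse_header(message):
    """Extract the response code, truncation flag and answer count from a DNS message."""
    if len(message) < 12:
        raise ValueError("DNS message shorter than the 12-byte header")
    _id, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", message[:12])
    return {"rcode": flags & 0x000F, "truncated": bool(flags & 0x0200), "answers": ancount}


def recv_exact(sock, count):
    """Read exactly `count` bytes, raising EOFError if the peer closes early."""
    buf = b""
    while len(buf) < count:
        chunk = sock.recv(count - len(buf))
        if not chunk:
            raise EOFError("connection closed before the full message arrived")
        buf += chunk
    return buf


def query_udp(server, name, timeout=3.0):
    """Send the query over UDP and return the parsed header of the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server, 53))
        data, _addr = sock.recvfrom(4096)
    return parse_header(data)


def query_tcp(server, name, timeout=3.0):
    """Send the query over TCP (2-byte length prefix) and return the parsed header."""
    query = build_query(name)
    with socket.create_connection((server, 53), timeout=timeout) as sock:
        sock.sendall(struct.pack("!H", len(query)) + query)
        (length,) = struct.unpack("!H", recv_exact(sock, 2))
        data = recv_exact(sock, length)
    return parse_header(data)
```

Calling `query_udp("51.255.48.120", "test.ep.cinebot.it")` and `query_tcp("51.255.48.120", "test.ep.cinebot.it")` from several networks, and catching `OSError` and `EOFError`, should show whether one transport fails where the other succeeds.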

Esa Jokinen
  • Thank you @Esa for the very clear and complete answer. About the second point of your list: I have checked, and my server is listening on TCP and UDP port 53, and my fail2ban has no jail enabled for DNS. So I cannot understand all those "unreachable" errors. – Tobia Jul 06 '20 at 06:54
  • Might need some debugging with `traceroute` then, if the configuration seems ok. – Esa Jokinen Jul 06 '20 at 08:38
  • In the end it was a bug in the PowerDNS service :'-( Thank you for your help. – Tobia Jul 06 '20 at 09:04
  • Could you explain the bug and your workaround in your own answer, please? That way you may help others with the same problem. – Esa Jokinen Jul 06 '20 at 09:12
  • The DNS reply was empty (but the server was reachable!). I found that there was an error in the database backend structure of PowerDNS; I had to drop the DB and recreate it from the latest version of the schema, and the problem was solved. This is a PowerDNS-related problem more than a DNS problem, which is why I think this is not really an answer. I will add an answer to say that the error can be a wrong reply (empty in my case) from the server. – Tobia Jul 06 '20 at 09:25

After a bit of debugging, thanks to Esa's answer, the problem turned out to be my DNS backend. I discovered that ";; communications error to ..." was not a reachability problem but an empty answer from the DNS server.

The empty answer was caused by an error in the DNS backend (PowerDNS with MySQL).
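
To tell these two cases apart without dig, something like the following sketch might help. It is not what I actually ran; the names `read_dns_reply`, `probe` and `EXAMPLE_QUERY` are made up for illustration. It opens a TCP connection and reports separately whether the connection failed (unreachable) or whether it succeeded but the server closed it without sending a reply (empty answer).

```python
import socket
import struct

# Wire-format query for test.ep.cinebot.it, type A, class IN (illustrative constant).
EXAMPLE_QUERY = (
    b"\x12\x34\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00"
    b"\x04test\x02ep\x07cinebot\x02it\x00"
    b"\x00\x01\x00\x01"
)


def read_dns_reply(sock):
    """Return one length-prefixed DNS message, or None if the peer closed early."""

    def recv_exact(count):
        buf = b""
        while len(buf) < count:
            chunk = sock.recv(count - len(buf))
            if not chunk:  # peer closed the connection: "end of file"
                return None
            buf += chunk
        return buf

    prefix = recv_exact(2)
    if prefix is None:
        return None
    (length,) = struct.unpack("!H", prefix)
    return recv_exact(length)


def probe(server, query=EXAMPLE_QUERY, port=53, timeout=3.0):
    """Describe whether the server is unreachable, silent, or actually answering."""
    try:
        sock = socket.create_connection((server, port), timeout=timeout)
    except OSError as exc:
        return f"unreachable: {exc}"
    with sock:
        try:
            sock.sendall(struct.pack("!H", len(query)) + query)
            reply = read_dns_reply(sock)
        except OSError as exc:
            return f"connected, but the exchange failed: {exc}"
    if reply is None:
        return "connected, but empty reply"
    return f"reply of {len(reply)} bytes"
```

A result of "connected, but empty reply" from `probe("51.255.48.120")` would point at the server software or its backend rather than at a firewall or routing problem.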

Tobia
  • BTW the error was not so easy to discover because locally it was working... – Tobia Jul 06 '20 at 09:43
  • Any insight on why it sometimes answered correctly and sometimes with an empty response? – Esa Jokinen Jul 06 '20 at 10:34
  • Somehow some requests failed and others did not. If I understood correctly, it was an SQL error originating from a table structure error, so I guess some requests had more elements to read from or store in the DB (UDP/TCP? DNSSEC requests?). Unfortunately I did not have a full SQL log from PowerDNS. – Tobia Jul 06 '20 at 16:39