I am trying to migrate my DNS server from 178.32.xx.xx to 162.243.xx.xx, but as soon as I turn off the first server (178), some users report that the site is offline. I also checked with WhatsMyDNS and as soon as I turn named off on 178, almost all of the worldwide DNS replicators show an X for the website's A entries.

I have disabled slave DNS and changed all IP entries to 162, including the NS records, that now all point to 162. It has been over a week now, and I cannot turn off the old server, or everything will be offline.

I use ZPanel for DNS entry management, but I have checked the zone files and they are all correct, pointing to 162. If I leave the old server on and check WhatsMyDNS, everything is fine. If I turn it off, the horror!

Is there anything that I am missing? Thank you for any assistance.

  • What is the domain name in question? – MadHatter Dec 05 '13 at 09:49
  • It's azulvirtual.org – henriquesirot Dec 05 '13 at 09:51
  • From here, everything looks fine. The `whois`, the NS records for the nameservers, the A records for `azulvirtual.org`, all line up as being in 162. Could you paste into your question some evidence about what troubles you? – MadHatter Dec 05 '13 at 09:55
  • I suspect MadHatter is already writing an answer but if you want to know how to figure this out yourself, ask yourself the question "How do users know what DNS server to ask for my NS records?" – Ladadadada Dec 05 '13 at 09:56
  • Ladadadada, thank you for that - your faith in me is touching! Sadly, I can't initially see anything wrong, so if you have a better idea, please: go for it! – MadHatter Dec 05 '13 at 09:58
  • The domain is registered to ns1.sirothost.com.br (162.243.84.110) and ns2.sirothost.com.br (54.207.4.127) and both respond correctly. It could be that your clients use 178.32.xx.xx as their primary/only resolver and switching that off simply breaks all DNS? – HBruijn Dec 05 '13 at 10:02
  • It is all working because the 178 DNS server is online. I'm turning it off now; would you please check again in about 2 minutes? You could see there were no 178 entries anywhere... – henriquesirot Dec 05 '13 at 10:08
  • Everything still looks fine. As I said, if you're seeing otherwise when the 178 server is down, **please cut and paste evidence into your question**. Otherwise, there's nothing to diagnose. – MadHatter Dec 05 '13 at 10:59

1 Answer

The problem is in the glue records held by the com.br name servers.

To get to this point, I started by finding the name servers for org, because we want azulvirtual.org:

> dig org NS

; <<>> DiG 9.6-ESV-R4-P3 <<>> org NS
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 12744
;; flags: qr rd ra; QUERY: 1, ANSWER: 6, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;org.               IN  NS

;; ANSWER SECTION:
org.            5827    IN  NS  a0.org.afilias-nst.info.
org.            5827    IN  NS  a2.org.afilias-nst.info.
org.            5827    IN  NS  b0.org.afilias-nst.org.
org.            5827    IN  NS  b2.org.afilias-nst.org.
org.            5827    IN  NS  c0.org.afilias-nst.info.
org.            5827    IN  NS  d0.org.afilias-nst.org.

;; Query time: 10 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Thu Dec  5 10:41:41 2013
;; MSG SIZE  rcvd: 159

And then asking for your name servers from one of those:

> dig @a0.org.afilias-nst.info azulvirtual.org NS

; <<>> DiG 9.6-ESV-R4-P3 <<>> @a0.org.afilias-nst.info azulvirtual.org NS
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 63061
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 2, ADDITIONAL: 0
;; WARNING: recursion requested but not available

;; QUESTION SECTION:
;azulvirtual.org.       IN  NS

;; AUTHORITY SECTION:
azulvirtual.org.    86400   IN  NS  ns2.sirothost.com.br.
azulvirtual.org.    86400   IN  NS  ns1.sirothost.com.br.

;; Query time: 213 msec
;; SERVER: 199.19.56.1#53(199.19.56.1)
;; WHEN: Thu Dec  5 10:41:57 2013
;; MSG SIZE  rcvd: 85

This is the correct result so far. But now we need to find ns1.sirothost.com.br or ns2.sirothost.com.br. For that, we need the name servers for com.br:

> dig com.br NS

; <<>> DiG 9.6-ESV-R4-P3 <<>> com.br NS
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 39126
;; flags: qr rd ra; QUERY: 1, ANSWER: 6, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;com.br.                IN  NS

;; ANSWER SECTION:
com.br.         21600   IN  NS  c.dns.br.
com.br.         21600   IN  NS  b.dns.br.
com.br.         21600   IN  NS  d.dns.br.
com.br.         21600   IN  NS  e.dns.br.
com.br.         21600   IN  NS  f.dns.br.
com.br.         21600   IN  NS  a.dns.br.

;; Query time: 65 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Thu Dec  5 10:42:22 2013
;; MSG SIZE  rcvd: 124

And then ask one of them for ns2.sirothost.com.br. The referral it returns carries glue for both name servers, and this is where we get the wrong IP addresses.

> dig @b.dns.br ns2.sirothost.com.br

; <<>> DiG 9.6-ESV-R4-P3 <<>> @b.dns.br ns2.sirothost.com.br
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 46093
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 2, ADDITIONAL: 2
;; WARNING: recursion requested but not available

;; QUESTION SECTION:
;ns2.sirothost.com.br.      IN  A

;; AUTHORITY SECTION:
sirothost.com.br.   86400   IN  NS  ns1.sirothost.com.br.
sirothost.com.br.   86400   IN  NS  ns2.sirothost.com.br.

;; ADDITIONAL SECTION:
ns1.sirothost.com.br.   86400   IN  A   178.32.65.90
ns2.sirothost.com.br.   86400   IN  A   54.213.72.90

;; Query time: 208 msec
;; SERVER: 200.189.41.10#53(200.189.41.10)
;; WHEN: Thu Dec  5 10:42:53 2013
;; MSG SIZE  rcvd: 102

Note that I am digging directly from authoritative sources at all times, so there is no caching involved here. This problem will not resolve itself by waiting.

These are the glue records that your registrar (or rather the registrar for sirothost.com.br) gives to the com.br name servers.
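The mismatch can also be checked mechanically. The following is a minimal Python sketch, not part of the original diagnosis: it parses the ADDITIONAL section copied from the `b.dns.br` response and compares it with the addresses HBruijn reported in the comments on the question. The function names `parse_a_records` and `stale_glue` are illustrative, and the expected addresses are an assumption taken from that comment rather than from a fresh lookup.

```python
# ADDITIONAL section copied from the b.dns.br response shown earlier.
DIG_ADDITIONAL = """\
ns1.sirothost.com.br.   86400   IN  A   178.32.65.90
ns2.sirothost.com.br.   86400   IN  A   54.213.72.90
"""

# Assumption: the addresses the name servers are meant to have,
# as reported in the comments on the question.
EXPECTED = {
    "ns1.sirothost.com.br.": "162.243.84.110",
    "ns2.sirothost.com.br.": "54.207.4.127",
}


def parse_a_records(text):
    """Return {name: address} for every 'name ttl IN A address' line."""
    records = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 5 and fields[2] == "IN" and fields[3] == "A":
            records[fields[0]] = fields[4]
    return records


def stale_glue(glue, expected):
    """List (name, glue_address, expected_address) wherever the two disagree."""
    return [
        (name, glue.get(name), address)
        for name, address in sorted(expected.items())
        if glue.get(name) != address
    ]


if __name__ == "__main__":
    for name, got, want in stale_glue(parse_a_records(DIG_ADDITIONAL), EXPECTED):
        print(f"{name} glue={got} expected={want}")
```

Both glue entries differ from the expected addresses, which is why turning off the 178 server breaks resolution even though the zone files themselves are correct.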

You can fix this by either telling the registrar for azulvirtual.org that you want to use different name servers (not ns1.sirothost.com.br and ns2.sirothost.com.br) or by telling the registrar for sirothost.com.br that the name servers for that domain have a new IP address.

To understand why I had to make all those requests, have a read through the canonical "Why does DNS work the way it does" question and Andrew B's answer on how caching can affect +trace queries using dig.
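As a rough illustration of that chain of requests, here is a small static model in Python. It uses only the referral and glue data from the dig outputs in this answer and performs no network lookups. The names `NS_REFERRALS`, `GLUE`, `hosting_zone_of`, and `bootstrap_addresses` are hypothetical, and the model ignores caching, retries, and everything else a real resolver does.

```python
# NS names handed out by the parent zone's servers for each delegated zone
# (from a0.org.afilias-nst.info and b.dns.br respectively).
NS_REFERRALS = {
    "azulvirtual.org.": ["ns1.sirothost.com.br.", "ns2.sirothost.com.br."],
    "sirothost.com.br.": ["ns1.sirothost.com.br.", "ns2.sirothost.com.br."],
}

# Glue A records returned by b.dns.br alongside the sirothost.com.br referral.
GLUE = {
    "ns1.sirothost.com.br.": "178.32.65.90",
    "ns2.sirothost.com.br.": "54.213.72.90",
}


def hosting_zone_of(host):
    """Return the most specific delegated zone in NS_REFERRALS that contains host."""
    candidates = [zone for zone in NS_REFERRALS if host.endswith("." + zone)]
    return max(candidates, key=len) if candidates else None


def bootstrap_addresses(zone):
    """Addresses a cold-cache resolver must contact to locate zone's name servers."""
    addresses = set()
    for ns_name in NS_REFERRALS[zone]:
        hosting_zone = hosting_zone_of(ns_name)
        if hosting_zone is None:
            continue
        # The NS host lives in another delegated zone, so the resolver
        # has to start from the glue published for that zone.
        for hosting_ns in NS_REFERRALS[hosting_zone]:
            if hosting_ns in GLUE:
                addresses.add(GLUE[hosting_ns])
    return sorted(addresses)


if __name__ == "__main__":
    print(bootstrap_addresses("azulvirtual.org."))
```

In this model, every path to azulvirtual.org passes through the glue addresses, so the old 178 server stays on the critical path until the glue is updated.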

Ladadadada
  • Thank you! It did clarify why it is happening and how to fix it! – henriquesirot Dec 05 '13 at 11:37
  • I would upvote this many times if I could: it's a textbook example of how to use the tools we have for patiently querying all the steps involved in DNS, to see who's stuffed up, and should be carefully studied by anyone who has to debug such a problem. In this case, we find it's your DNS provider whose own infrastructure isn't correctly set up. That given, I'd be inclined to move providers again; they're clearly not competent. – MadHatter Dec 05 '13 at 12:45