25

The engineering team I work with has been in the process of moving equipment from one datacenter to another. Ten days ago we moved one of our name servers authoritative for our client's domains (ns1.faithhiway.com) and updated its IP address with its respective DNS provider (register.com) to point to the new datacenter. All tests done show that this name server is correctly running at its new location and when queried, returning the correct response for any domains it is responsible for.

The problem is that well after 72 hours had gone by we were still seeing more DNS activity at its old IP address than at the new. The good news is that we kept a name server responding on the old IP address for the time being so we are not seeing any issues with the domains our nameserver is responsible for but the goal is to retire that as soon as possible. As you can see from WhatsMyDNS.net, a decent amount of propagation has occurred over the last 10 days since we made this change, but still there are some locations reporting our original IP.

enter image description here

Considering that the TTL is only 3600 with the name servers responsible for this domain, it does not make any sense to myself or the other engineers working with me that we are having this issue.

Now if I run a DNS check using one of the Register.com DNS servers (direct nameservers for faithhiway.com), I get the following (correct) result:

# dig @dns01.gpn.register.com ns1.faithhiway.com A

; <<>> DiG 9.3.6-P1-RedHat-9.3.6-4.P1.el5_5.3 <<>> @dns01.gpn.register.com. ns1.faithhiway.com A
; (1 server found)
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 43232
;; flags: qr aa; QUERY: 1, ANSWER: 1, AUTHORITY: 5, ADDITIONAL: 5

;; QUESTION SECTION:
;ns1.faithhiway.com.  IN A

;; ANSWER SECTION:
ns1.faithhiway.com. 3601 IN A 206.127.2.71

;; AUTHORITY SECTION:
faithhiway.com.  3600 IN NS dns01.gpn.register.com.
faithhiway.com.  3600 IN NS dns02.gpn.register.com.
faithhiway.com.  3600 IN NS dns03.gpn.register.com.
faithhiway.com.  3600 IN NS dns04.gpn.register.com.
faithhiway.com.  3600 IN NS dns05.gpn.register.com.

;; ADDITIONAL SECTION:
dns01.gpn.register.com. 3600 IN A 98.124.192.1
dns02.gpn.register.com. 3600 IN A 98.124.197.1
dns03.gpn.register.com. 3600 IN A 98.124.193.1
dns04.gpn.register.com. 3600 IN A 69.64.145.225
dns05.gpn.register.com. 3600 IN A 98.124.196.1

;; Query time: 50 msec
;; SERVER: 98.124.192.1#53(98.124.192.1)
;; WHEN: Thu Jan 27 15:16:57 2011
;; MSG SIZE  rcvd: 269

Just as a reference, here are the results when the same query is checked against a variety of Public DNS servers:

Google:

# dig @8.8.8.8 ns1.faithhiway.com A

; <<>> DiG 9.3.6-P1-RedHat-9.3.6-4.P1.el5_5.3 <<>> @8.8.8.8. ns1.faithhiway.com A
; (1 server found)
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 12773
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;ns1.faithhiway.com.  IN A

;; ANSWER SECTION:
ns1.faithhiway.com. 997 IN A 206.127.2.71

;; Query time: 29 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Thu Jan 27 15:17:31 2011
;; MSG SIZE  rcvd: 52

Level 3:

# dig @4.2.2.1 ns1.faithhiway.com A

; <<>> DiG 9.3.6-P1-RedHat-9.3.6-4.P1.el5_5.3 <<>> @4.2.2.1. ns1.faithhiway.com A
; (1 server found)
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 46505
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;ns1.faithhiway.com.  IN A

;; ANSWER SECTION:
ns1.faithhiway.com. 2623 IN A 206.127.2.71

;; Query time: 7 msec
;; SERVER: 4.2.2.1#53(4.2.2.1)
;; WHEN: Thu Jan 27 15:18:35 2011
;; MSG SIZE  rcvd: 52

Verizon:

# dig @151.197.0.38 ns1.faithhiway.com A

; <<>> DiG 9.3.6-P1-RedHat-9.3.6-4.P1.el5_5.3 <<>> @151.197.0.38. ns1.faithhiway.com A
; (1 server found)
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 32658
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;ns1.faithhiway.com.  IN A

;; ANSWER SECTION:
ns1.faithhiway.com. 3601 IN A 206.127.2.71

;; Query time: 81 msec
;; SERVER: 151.197.0.38#53(151.197.0.38)
;; WHEN: Thu Jan 27 15:19:15 2011
;; MSG SIZE  rcvd: 52

Cisco:

# dig @64.102.255.44 ns1.faithhiway.com A

; <<>> DiG 9.3.6-P1-RedHat-9.3.6-4.P1.el5_5.3 <<>> @64.102.255.44. ns1.faithhiway.com A
; (1 server found)
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 39689
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 5, ADDITIONAL: 0

;; QUESTION SECTION:
;ns1.faithhiway.com.  IN A

;; ANSWER SECTION:
ns1.faithhiway.com. 3601 IN A 206.127.2.71

;; AUTHORITY SECTION:
faithhiway.com.  3600 IN NS dns01.gpn.register.com.
faithhiway.com.  3600 IN NS dns04.gpn.register.com.
faithhiway.com.  3600 IN NS dns05.gpn.register.com.
faithhiway.com.  3600 IN NS dns02.gpn.register.com.
faithhiway.com.  3600 IN NS dns03.gpn.register.com.

;; Query time: 105 msec
;; SERVER: 64.102.255.44#53(64.102.255.44)
;; WHEN: Thu Jan 27 15:20:05 2011
;; MSG SIZE  rcvd: 165

OpenDNS:

# dig @208.67.222.222 ns1.faithhiway.com A

; <<>> DiG 9.3.6-P1-RedHat-9.3.6-4.P1.el5_5.3 <<>> @208.67.222.222. ns1.faithhiway.com A
; (1 server found)
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 12328
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;ns1.faithhiway.com.  IN A

;; ANSWER SECTION:
ns1.faithhiway.com. 169507 IN A 207.200.19.162

;; Query time: 6 msec
;; SERVER: 208.67.222.222#53(208.67.222.222)
;; WHEN: Thu Jan 27 15:19:29 2011
;; MSG SIZE  rcvd: 52

SpeakEasy:

# dig @66.93.87.2 ns1.faithhiway.com A

; <<>> DiG 9.3.6-P1-RedHat-9.3.6-4.P1.el5_5.3 <<>> @66.93.87.2. ns1.faithhiway.com A
; (1 server found)
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 9342
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;ns1.faithhiway.com.  IN A

;; ANSWER SECTION:
ns1.faithhiway.com. 169323 IN A 207.200.19.162

;; Query time: 69 msec
;; SERVER: 66.93.87.2#53(66.93.87.2)
;; WHEN: Thu Jan 27 15:19:51 2011
;; MSG SIZE  rcvd: 52

As you can see above, the majority of queries are returning the correct result. But a few (OpenDNS and SpeakEasy in the examples above) are still showing the old IP address. Considering the length of time that has gone by, it seems obvious to me that either we have made a mistake and not thoroughly handled the DNS changes on our end (likely) or there is a problem with either the DNS provider for this domain (Register) or with some of the DNS servers out in the wild (rather unlikely).

Any advice on how I can proceed with this?

UPDATE (January 31, 2011):

First of all, I apologize for the length of both the original question and this update. I contemplated removing some of the excess from the original post but just in case this problem and its solution are helpful to someone else in the future I'm just going to leave everything as it is.

Anyway, I've been doing some more research into this problem, and have discovered the following interesting occurrence. While running a check on the glue records for faithhiway.com always resolve correctly, if I go and check a client domain (where ns1.faithhiway.com is authoritative), I get a strange response. It looks like the root servers are returning nsX.faithhiway.com as their old IP addresses still (under Additional Section). Because we have a server still there responding to DNS queries, the trace finishes and returns the correct IP addresses as the final step (again, under Additional Section). The example below uses one of the domains that we use that uses ns1.faithhiway.com as its authoritative DNS server.

# dig +trace +nosearch +all +norecurse ignitemail.com

; <<>> DiG 9.2.4 <<>> +trace +nosearch +all +norecurse ignitemail.com
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 46856
;; flags: qr ra; QUERY: 1, ANSWER: 13, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;.    IN NS

;; ANSWER SECTION:
.   7986 IN NS a.root-servers.net.
.   7986 IN NS b.root-servers.net.
.   7986 IN NS c.root-servers.net.
.   7986 IN NS d.root-servers.net.
.   7986 IN NS e.root-servers.net.
.   7986 IN NS f.root-servers.net.
.   7986 IN NS g.root-servers.net.
.   7986 IN NS h.root-servers.net.
.   7986 IN NS i.root-servers.net.
.   7986 IN NS j.root-servers.net.
.   7986 IN NS k.root-servers.net.
.   7986 IN NS l.root-servers.net.
.   7986 IN NS m.root-servers.net.

;; Query time: 39 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Mon Jan 31 09:22:17 2011
;; MSG SIZE  rcvd: 228

;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 16325
;; flags: qr; QUERY: 1, ANSWER: 0, AUTHORITY: 13, ADDITIONAL: 14

;; QUESTION SECTION:
;ignitemail.com.   IN A

;; AUTHORITY SECTION:
com.   172800 IN NS h.gtld-servers.net.
com.   172800 IN NS m.gtld-servers.net.
com.   172800 IN NS i.gtld-servers.net.
com.   172800 IN NS l.gtld-servers.net.
com.   172800 IN NS c.gtld-servers.net.
com.   172800 IN NS k.gtld-servers.net.
com.   172800 IN NS d.gtld-servers.net.
com.   172800 IN NS f.gtld-servers.net.
com.   172800 IN NS b.gtld-servers.net.
com.   172800 IN NS a.gtld-servers.net.
com.   172800 IN NS e.gtld-servers.net.
com.   172800 IN NS g.gtld-servers.net.
com.   172800 IN NS j.gtld-servers.net.

;; ADDITIONAL SECTION:
a.gtld-servers.net. 172800 IN A 192.5.6.30
a.gtld-servers.net. 172800 IN AAAA 2001:503:a83e::2:30
b.gtld-servers.net. 172800 IN A 192.33.14.30
b.gtld-servers.net. 172800 IN AAAA 2001:503:231d::2:30
c.gtld-servers.net. 172800 IN A 192.26.92.30
d.gtld-servers.net. 172800 IN A 192.31.80.30
e.gtld-servers.net. 172800 IN A 192.12.94.30
f.gtld-servers.net. 172800 IN A 192.35.51.30
g.gtld-servers.net. 172800 IN A 192.42.93.30
h.gtld-servers.net. 172800 IN A 192.54.112.30
i.gtld-servers.net. 172800 IN A 192.43.172.30
j.gtld-servers.net. 172800 IN A 192.48.79.30
k.gtld-servers.net. 172800 IN A 192.52.178.30
l.gtld-servers.net. 172800 IN A 192.41.162.30

;; Query time: 64 msec
;; SERVER: 198.41.0.4#53(a.root-servers.net)
;; WHEN: Mon Jan 31 09:22:17 2011
;; MSG SIZE  rcvd: 504

;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 12860
;; flags: qr; QUERY: 1, ANSWER: 0, AUTHORITY: 2, ADDITIONAL: 2

;; QUESTION SECTION:
;ignitemail.com.   IN A

;; AUTHORITY SECTION:
ignitemail.com.  172800 IN NS ns1.faithhiway.com.
ignitemail.com.  172800 IN NS ns2.faithhiway.com.

;; ADDITIONAL SECTION:
ns1.faithhiway.com. 172800 IN A 207.200.19.162
ns2.faithhiway.com. 172800 IN A 207.200.50.142

;; Query time: 152 msec
;; SERVER: 192.54.112.30#53(h.gtld-servers.net)
;; WHEN: Mon Jan 31 09:22:17 2011
;; MSG SIZE  rcvd: 111

;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 43016
;; flags: qr aa; QUERY: 1, ANSWER: 1, AUTHORITY: 2, ADDITIONAL: 2

;; QUESTION SECTION:
;ignitemail.com.   IN A

;; ANSWER SECTION:
ignitemail.com.  3600 IN A 206.127.2.64

;; AUTHORITY SECTION:
ignitemail.com.  3600 IN NS ns1.faithhiway.com.
ignitemail.com.  3600 IN NS ns2.faithhiway.com.

;; ADDITIONAL SECTION:
ns1.faithhiway.com. 3600 IN A 206.127.2.71
ns2.faithhiway.com. 3600 IN A 206.127.2.72

;; Query time: 25 msec
;; SERVER: 206.127.2.71#53(ns1.faithhiway.com)
;; WHEN: Mon Jan 31 09:22:18 2011
;; MSG SIZE  rcvd: 127

I really think this is a problem we have somewhere in our setup, but whether it is ignorance of something with DNS on my or my fellow engineer's end or just a dumb mistake we made, I have yet to find it.

runlevelsix
  • 2,609
  • 21
  • 19

3 Answers3

5

Problem solved. Finally. Apparently Register.com did not update the glue records for ns1 and ns2.faithhiway.com despite our initial request for them to do so (and their confirmation that it had been done).

The tests that I posted above in my update showed that despite their confirmation of the update, the glue records were not propagating correctly. I went ahead and pushed another update to our glue records and it looks like this time we are seeing propagation:

# dig +trace +nosearch +all +norecurse ignitemail.com

; <<>> DiG 9.2.4 <<>> +trace +nosearch +all +norecurse ignitemail.com
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 12706
;; flags: qr ra; QUERY: 1, ANSWER: 13, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;.              IN  NS

;; ANSWER SECTION:
.           79883   IN  NS  a.root-servers.net.
.           79883   IN  NS  b.root-servers.net.
.           79883   IN  NS  c.root-servers.net.
.           79883   IN  NS  d.root-servers.net.
.           79883   IN  NS  e.root-servers.net.
.           79883   IN  NS  f.root-servers.net.
.           79883   IN  NS  g.root-servers.net.
.           79883   IN  NS  h.root-servers.net.
.           79883   IN  NS  i.root-servers.net.
.           79883   IN  NS  j.root-servers.net.
.           79883   IN  NS  k.root-servers.net.
.           79883   IN  NS  l.root-servers.net.
.           79883   IN  NS  m.root-servers.net.

;; Query time: 293 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Mon Jan 31 13:24:02 2011
;; MSG SIZE  rcvd: 228

;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 43910
;; flags: qr; QUERY: 1, ANSWER: 0, AUTHORITY: 13, ADDITIONAL: 14

;; QUESTION SECTION:
;ignitemail.com.            IN  A

;; AUTHORITY SECTION:
com.            172800  IN  NS  i.gtld-servers.net.
com.            172800  IN  NS  c.gtld-servers.net.
com.            172800  IN  NS  k.gtld-servers.net.
com.            172800  IN  NS  f.gtld-servers.net.
com.            172800  IN  NS  b.gtld-servers.net.
com.            172800  IN  NS  l.gtld-servers.net.
com.            172800  IN  NS  g.gtld-servers.net.
com.            172800  IN  NS  e.gtld-servers.net.
com.            172800  IN  NS  d.gtld-servers.net.
com.            172800  IN  NS  a.gtld-servers.net.
com.            172800  IN  NS  m.gtld-servers.net.
com.            172800  IN  NS  j.gtld-servers.net.
com.            172800  IN  NS  h.gtld-servers.net.

;; ADDITIONAL SECTION:
a.gtld-servers.net. 172800  IN  A   192.5.6.30
a.gtld-servers.net. 172800  IN  AAAA    2001:503:a83e::2:30
b.gtld-servers.net. 172800  IN  A   192.33.14.30
b.gtld-servers.net. 172800  IN  AAAA    2001:503:231d::2:30
c.gtld-servers.net. 172800  IN  A   192.26.92.30
d.gtld-servers.net. 172800  IN  A   192.31.80.30
e.gtld-servers.net. 172800  IN  A   192.12.94.30
f.gtld-servers.net. 172800  IN  A   192.35.51.30
g.gtld-servers.net. 172800  IN  A   192.42.93.30
h.gtld-servers.net. 172800  IN  A   192.54.112.30
i.gtld-servers.net. 172800  IN  A   192.43.172.30
j.gtld-servers.net. 172800  IN  A   192.48.79.30
k.gtld-servers.net. 172800  IN  A   192.52.178.30
l.gtld-servers.net. 172800  IN  A   192.41.162.30

;; Query time: 336 msec
;; SERVER: 198.41.0.4#53(a.root-servers.net)
;; WHEN: Mon Jan 31 13:24:03 2011
;; MSG SIZE  rcvd: 504

;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 44133
;; flags: qr; QUERY: 1, ANSWER: 0, AUTHORITY: 2, ADDITIONAL: 2

;; QUESTION SECTION:
;ignitemail.com.            IN  A

;; AUTHORITY SECTION:
ignitemail.com.     172800  IN  NS  ns1.faithhiway.com.
ignitemail.com.     172800  IN  NS  ns2.faithhiway.com.

;; ADDITIONAL SECTION:
ns1.faithhiway.com. 172800  IN  A   206.127.2.71
ns2.faithhiway.com. 172800  IN  A   206.127.2.72

;; Query time: 2411 msec
;; SERVER: 192.43.172.30#53(i.gtld-servers.net)
;; WHEN: Mon Jan 31 13:24:06 2011
;; MSG SIZE  rcvd: 111

;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 50833
;; flags: qr aa; QUERY: 1, ANSWER: 1, AUTHORITY: 2, ADDITIONAL: 2

;; QUESTION SECTION:
;ignitemail.com.            IN  A

;; ANSWER SECTION:
ignitemail.com.     3600    IN  A   206.127.2.64

;; AUTHORITY SECTION:
ignitemail.com.     3600    IN  NS  ns1.faithhiway.com.
ignitemail.com.     3600    IN  NS  ns2.faithhiway.com.

;; ADDITIONAL SECTION:
ns1.faithhiway.com. 3600    IN  A   206.127.2.71
ns2.faithhiway.com. 3600    IN  A   206.127.2.72

;; Query time: 1495 msec
;; SERVER: 206.127.2.71#53(ns1.faithhiway.com)
;; WHEN: Mon Jan 31 13:24:09 2011
;; MSG SIZE  rcvd: 127
runlevelsix
  • 2,609
  • 21
  • 19
4

You have 2 problems:

  1. The queries for ns1.faithhiway.com are returning incorrect results.

  2. The name servers listed for your domain are wrong.

You're actually testing a little backward. You're testing for what ip address is being returned when querying for ns1.faithhiway.com but what you should be testing for first is what name servers are actually being returned for faithhiway.com. A Whois lookup and an nslookup return the following servers as being the name servers for faithhiway.com:

dns01.gpn.register.com

dns02.gpn.register.com

dns03.gpn.register.com

dns04.gpn.register.com

dns05.gpn.register.com

So you need to get that fixed first.

joeqwerty
  • 108,377
  • 6
  • 80
  • 171
  • first -- ns1 and ns2 are name servers are are not authoritative for themselves. Register hosts our NS server's DNS, whilst everyone else points their DNS records to us for an authoritative response.| I looked up the NS records for faithhiway.com on all the above servers and they also resolve register as the authoritative name servers of the domain. – grufftech Jan 27 '11 at 22:16
  • I misunderstood the problem then. The name servers for faithhiway.com are not nsX.faithhiway.com. nsX.faithhiway.com are the name servers for DNS zones that you host for your customers. Is that right? If so, my apologies for the misunderstanding. – joeqwerty Jan 27 '11 at 22:23
  • That is correct - Register.com is authoritative for faithhiway.com and nsX.faithhiway.com are the responsible name servers for our clients. – runlevelsix Jan 27 '11 at 22:23
  • (for my own understanding) Wouldn't having your domain name server be authoritative for itself be a relatively bad idea, since if for some (god forsaken) reason you loose access to the IP's in which those domains point to, you no longer have any authoritative name servers telling your domains where to go, nor where to go to get updates, such as the new location for themselves ns#.yourdomain.com. in the same vein, if I buy awesomedomain.com, and point it to ns1 & ns2.awesomedomain.com, however no rootservers have ever been told where those two subdomains point to. Shouldn't nothing happen? – grufftech Jan 27 '11 at 22:37
  • I don't know how bad of an idea it is but I know plenty of company's that host their own name servers for their public namespace. If I'm understanding what your asking, that's what glue records at the parent servers are for. – joeqwerty Jan 27 '11 at 22:41
2

Lots of servers ignore your TTL and cache the info much longer than they should. The easiest way to solve this is usually to contact the affected network's operators and let them know. They're usually really good about fixing it pretty quickly.

Chris S
  • 77,337
  • 11
  • 120
  • 212