This is obviously a staged Q&A, but the topic confuses people often and I can't find a canonical question covering it.
dig +trace is a great diagnostic tool, but one aspect of its design is widely misunderstood: the IP of every server that will be queried is obtained from your resolver library. This is very easily overlooked and often only becomes a problem when your local cache holds the wrong address for a nameserver.
Detailed Analysis
This is easier to break down with a sample of the output; I'll omit everything past the first NS delegation.
; <<>> DiG 9.7.3 <<>> +trace +additional serverfault.com
;; global options: +cmd
. 121459 IN NS d.root-servers.net.
. 121459 IN NS e.root-servers.net.
. 121459 IN NS f.root-servers.net.
. 121459 IN NS g.root-servers.net.
. 121459 IN NS h.root-servers.net.
. 121459 IN NS i.root-servers.net.
. 121459 IN NS j.root-servers.net.
. 121459 IN NS k.root-servers.net.
. 121459 IN NS l.root-servers.net.
. 121459 IN NS m.root-servers.net.
. 121459 IN NS a.root-servers.net.
. 121459 IN NS b.root-servers.net.
. 121459 IN NS c.root-servers.net.
e.root-servers.net. 354907 IN A 192.203.230.10
f.root-servers.net. 100300 IN A 192.5.5.241
f.root-servers.net. 123073 IN AAAA 2001:500:2f::f
g.root-servers.net. 354527 IN A 192.112.36.4
h.root-servers.net. 354295 IN A 128.63.2.53
h.root-servers.net. 108245 IN AAAA 2001:500:1::803f:235
i.root-servers.net. 355208 IN A 192.36.148.17
i.root-servers.net. 542090 IN AAAA 2001:7fe::53
j.root-servers.net. 354526 IN A 192.58.128.30
j.root-servers.net. 488036 IN AAAA 2001:503:c27::2:30
k.root-servers.net. 354968 IN A 193.0.14.129
k.root-servers.net. 431621 IN AAAA 2001:7fd::1
l.root-servers.net. 354295 IN A 199.7.83.42
;; Received 496 bytes from 75.75.75.75#53(75.75.75.75) in 10 ms
com. 172800 IN NS m.gtld-servers.net.
com. 172800 IN NS k.gtld-servers.net.
com. 172800 IN NS f.gtld-servers.net.
com. 172800 IN NS g.gtld-servers.net.
com. 172800 IN NS b.gtld-servers.net.
com. 172800 IN NS e.gtld-servers.net.
com. 172800 IN NS j.gtld-servers.net.
com. 172800 IN NS c.gtld-servers.net.
com. 172800 IN NS l.gtld-servers.net.
com. 172800 IN NS d.gtld-servers.net.
com. 172800 IN NS i.gtld-servers.net.
com. 172800 IN NS h.gtld-servers.net.
com. 172800 IN NS a.gtld-servers.net.
a.gtld-servers.net. 172800 IN A 192.5.6.30
a.gtld-servers.net. 172800 IN AAAA 2001:503:a83e::2:30
b.gtld-servers.net. 172800 IN A 192.33.14.30
b.gtld-servers.net. 172800 IN AAAA 2001:503:231d::2:30
c.gtld-servers.net. 172800 IN A 192.26.92.30
d.gtld-servers.net. 172800 IN A 192.31.80.30
e.gtld-servers.net. 172800 IN A 192.12.94.30
f.gtld-servers.net. 172800 IN A 192.35.51.30
g.gtld-servers.net. 172800 IN A 192.42.93.30
h.gtld-servers.net. 172800 IN A 192.54.112.30
i.gtld-servers.net. 172800 IN A 192.43.172.30
j.gtld-servers.net. 172800 IN A 192.48.79.30
k.gtld-servers.net. 172800 IN A 192.52.178.30
l.gtld-servers.net. 172800 IN A 192.41.162.30
;; Received 505 bytes from 192.203.230.10#53(e.root-servers.net) in 13 ms
- The initial query for . IN NS (root nameservers) hits the local resolver, which in this case is Comcast (75.75.75.75). This is easy to spot.
- The next query is for serverfault.com. IN A and runs against e.root-servers.net., randomly selected from the list of root nameservers we just got. It has an IP address of 192.203.230.10, and since we have +additional enabled it appears to be coming from the glue.
- Since it is not authoritative for serverfault.com, this gets delegated to the com. TLD nameservers.
- What isn't obvious from the output here is that dig did not derive the IP address of e.root-servers.net. from the glue.
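The per-hop footer lines are the quickest way to see who answered each step. Here is a minimal sketch for pulling them out; the helper name trace_hops is my own, and the pattern assumes the footer format printed by the DiG 9.7.3 sample above.

```python
import re

# Per-hop footer as printed in the sample above, e.g.
# ";; Received 496 bytes from 75.75.75.75#53(75.75.75.75) in 10 ms"
# Assumption: other dig versions may word this line differently.
RECEIVED = re.compile(
    r"^;; Received (\d+) bytes from (\S+?)#(\d+)\((\S+?)\) in (\d+) ms"
)


def trace_hops(dig_output):
    """Return (address, server_label) for every hop found in +trace output."""
    hops = []
    for line in dig_output.splitlines():
        match = RECEIVED.match(line.strip())
        if match:
            hops.append((match.group(2), match.group(4)))
    return hops


sample = """\
;; Received 496 bytes from 75.75.75.75#53(75.75.75.75) in 10 ms
;; Received 505 bytes from 192.203.230.10#53(e.root-servers.net) in 13 ms
"""
for address, server in trace_hops(sample):
    print(address, server)
```

This tells you which address answered each hop. It cannot tell you where dig got that address from, which is the whole point of this answer.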
In the background, this is what really happened:
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth1, link-type EN10MB (Ethernet), capture size 65535 bytes
02:03:43.301022 IP 192.0.2.1.59900 > 75.75.75.75.53: 63418 NS? . (17)
02:03:43.327327 IP 75.75.75.75.53 > 192.0.2.1.59900: 63418 13/0/14 NS k.root-servers.net., NS l.root-servers.net., NS m.root-servers.net., NS a.root-servers.net., NS b.root-servers.net., NS c.root-servers.net., NS d.root-servers.net., NS e.root-servers.net., NS f.root-servers.net., NS g.root-servers.net., NS h.root-servers.net., NS i.root-servers.net., NS j.root-servers.net. (512)
02:03:43.333047 IP 192.0.2.1.33120 > 75.75.75.75.53: 41110+ A? e.root-servers.net. (36)
02:03:43.333096 IP 192.0.2.1.33120 > 75.75.75.75.53: 5696+ AAAA? e.root-servers.net. (36)
02:03:43.344301 IP 75.75.75.75.53 > 192.0.2.1.33120: 41110 1/0/0 A 192.203.230.10 (52)
02:03:43.344348 IP 75.75.75.75.53 > 192.0.2.1.33120: 5696 0/1/0 (96)
02:03:43.344723 IP 192.0.2.1.37085 > 192.203.230.10.53: 28583 A? serverfault.com. (33)
02:03:43.423299 IP 192.203.230.10.53 > 192.0.2.1.37085: 28583- 0/13/14 (493)
+trace cheated and consulted the local resolver to obtain the IP address of the next-hop nameserver instead of consulting the glue. Sneaky!
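If you have a capture like the one above, the side lookups are easy to pick out mechanically: they are the queries sent to your resolver with the recursion-desired bit set, which tcpdump marks with a "+" after the query ID. A rough sketch follows, with the function name being my own invention and the pattern only covering the simple IPv4 line format seen in this capture.

```python
import re

# Outbound DNS query as tcpdump prints it in the capture above, e.g.
# "... IP 192.0.2.1.33120 > 75.75.75.75.53: 41110+ A? e.root-servers.net. (36)"
# A "+" after the query ID means the recursion-desired bit is set.
# Assumption: IPv4 only, port 53, no extra flag annotations on the line.
QUERY = re.compile(r"IP \S+ > (\d+(?:\.\d+){3})\.53: (\d+)(\+?) (\w+)\? (\S+)")


def recursive_lookups(capture, resolver_ip):
    """List (qtype, qname) for recursive queries sent to resolver_ip."""
    found = []
    for line in capture.splitlines():
        match = QUERY.search(line)
        if match and match.group(1) == resolver_ip and match.group(3) == "+":
            found.append((match.group(4), match.group(5)))
    return found


capture = """\
02:03:43.301022 IP 192.0.2.1.59900 > 75.75.75.75.53: 63418 NS? . (17)
02:03:43.333047 IP 192.0.2.1.33120 > 75.75.75.75.53: 41110+ A? e.root-servers.net. (36)
02:03:43.333096 IP 192.0.2.1.33120 > 75.75.75.75.53: 5696+ AAAA? e.root-servers.net. (36)
02:03:43.344723 IP 192.0.2.1.37085 > 192.203.230.10.53: 28583 A? serverfault.com. (33)
"""
for qtype, qname in recursive_lookups(capture, "75.75.75.75"):
    print(qtype, qname)
```

Any nameserver name that shows up in this list had its address supplied by the local cache rather than by the referral.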
This is usually "good enough" and won't cause a problem for most people. Unfortunately, there are edge cases. If for whatever reason your upstream DNS cache is providing the wrong answer for the nameserver, this model breaks down entirely.
Real-world example:
- domain expires
- glue is repointed at registrar redirection nameservers
- bogus IPs are cached for ns1 and ns2.yourdomain.com
- domain is renewed with restored glue
- any caches with the bogus nameserver IPs continue to send people to a website that says the domain is for sale
In the above case, +trace will suggest that the domain owner's own nameservers are the source of the problem, and you're one call away from incorrectly telling a customer that their servers are misconfigured. Whether you can (or are willing to) do anything about it is another story, but it's important to have the right information.
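One way to catch this before making that call is to compare the glue the parent zone hands out with what your cache returns for the same nameserver names. The sketch below assumes you have already collected both sets of addresses by hand; the function name and the sample data (documentation-range IPs) are hypothetical.

```python
def mismatched_nameservers(glue, cached):
    """Return names whose cached addresses differ from the parent's glue.

    Both arguments map a nameserver name to a set of address strings.
    Names missing from the cache are skipped, since there is nothing to compare.
    """
    suspects = {}
    for name, glue_addrs in glue.items():
        cached_addrs = cached.get(name)
        if cached_addrs is not None and cached_addrs != glue_addrs:
            suspects[name] = {"glue": glue_addrs, "cached": cached_addrs}
    return suspects


# Hypothetical data: documentation-range addresses, not real servers.
glue = {
    "ns1.yourdomain.com.": {"192.0.2.53"},
    "ns2.yourdomain.com.": {"192.0.2.54"},
}
cached = {
    "ns1.yourdomain.com.": {"198.51.100.10"},
    "ns2.yourdomain.com.": {"192.0.2.54"},
}
print(sorted(mismatched_nameservers(glue, cached)))  # prints ['ns1.yourdomain.com.']
```

A mismatch is a reason to look closer, not proof of a stale cache; glue and authoritative data can legitimately differ.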
dig +trace is a great tool, but like any tool, you need to know what it does and doesn't do, and how to troubleshoot the issue manually when it proves insufficient.
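Troubleshooting manually mostly means following each referral yourself, querying the address from the glue directly with recursion turned off. As a sketch under my own assumptions (the helper names are hypothetical, and the parser only understands the five-field record lines dig prints), this builds the next command from a referral:

```python
def glue_addresses(referral_text):
    """Map owner name -> list of A/AAAA addresses found in dig-style record lines."""
    glue = {}
    for line in referral_text.splitlines():
        fields = line.split()
        # Record lines look like: owner TTL class type rdata
        if len(fields) == 5 and fields[2] == "IN" and fields[3] in ("A", "AAAA"):
            glue.setdefault(fields[0], []).append(fields[4])
    return glue


def next_query(glue, nameserver, qname, qtype="A"):
    """Build a dig command that asks the glue address directly, without recursion."""
    address = glue[nameserver][0]
    return ["dig", "+norecurse", "@" + address, qname, qtype]


referral = """\
com. 172800 IN NS a.gtld-servers.net.
a.gtld-servers.net. 172800 IN A 192.5.6.30
a.gtld-servers.net. 172800 IN AAAA 2001:503:a83e::2:30
"""
command = next_query(glue_addresses(referral), "a.gtld-servers.net.", "serverfault.com.")
print(" ".join(command))  # prints: dig +norecurse @192.5.6.30 serverfault.com. A
```

Because the server is given as an IP address, dig has no name to resolve and your local cache stays out of the picture.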
Edit:
It should also be noted that dig +trace will not warn you about NS records that point at CNAME aliases. This is an RFC violation that ISC BIND (and possibly others) will not attempt to correct. +trace will be completely happy to accept the A record it gets from your locally configured nameserver, whereas if BIND were performing full recursion it would reject the entire zone with a SERVFAIL.
This can be tricky to troubleshoot if glue is present; resolution will work just fine until the NS records are refreshed, then suddenly break. Glueless delegations will always break BIND's recursion when an NS record points at an alias.
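Since +trace won't flag this for you, it is worth checking separately. The sketch below assumes you have already gathered the relevant records into a list of (owner, type, rdata) tuples; the function name and the sample records are hypothetical.

```python
def ns_targets_that_are_aliases(records):
    """Return NS targets that also appear as the owner of a CNAME record.

    records is a list of (owner, rtype, rdata) tuples. Names are compared
    case-insensitively, as DNS names are.
    """
    cname_owners = {owner.lower() for owner, rtype, _ in records if rtype == "CNAME"}
    ns_targets = {rdata.lower() for _, rtype, rdata in records if rtype == "NS"}
    return sorted(ns_targets & cname_owners)


# Hypothetical zone data for illustration only.
records = [
    ("yourdomain.com.", "NS", "ns1.yourdomain.com."),
    ("yourdomain.com.", "NS", "ns2.yourdomain.com."),
    ("ns1.yourdomain.com.", "CNAME", "host.example.net."),
    ("ns2.yourdomain.com.", "A", "192.0.2.54"),
]
print(ns_targets_that_are_aliases(records))  # prints ['ns1.yourdomain.com.']
```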