27

If I have the hosts example.com and leaf.intermediate.example.com in DNS records for example.com, but do not have any records for intermediate.example.com itself, does that cause a problem in some situations or is it bad style or etiquette for some reason? I have web servers set up like this and everything seems to work fine, but just wanted to check if there's something I'm missing.

Lassi

4 Answers

41

TL;DR: yes, intermediate subdomains need to exist, at least when queried for, per the definition of the DNS; they do not need to appear in the zonefile though.

A possible confusion to eliminate first; Definition of "Empty Non-Terminal"

You may be confusing two things, as other answers seem also to do. Namely, what happens when querying for names versus how you configure your nameserver and the content of the zonefile.

The DNS is hierarchical. For any leaf node to exist, all components leading to it MUST exist, in the sense that if they are queried for, the responsible authoritative nameserver should reply for them without an error.

As explained in RFC 8020 (which merely restates what was always the rule, because some DNS providers needed a reminder), if an authoritative nameserver replies NXDOMAIN to any query (that is: this name does not exist), then it means that no name "below" it exists either.

In your example, if a query for intermediate.example.com returns NXDOMAIN, then any proper recursive nameserver can immediately reply NXDOMAIN for leaf.intermediate.example.com, because a name cannot exist if any of the names above it does not exist.

This was already stated in the past in RFC 4592 about wildcards (which are unrelated here):

The domain name space is a tree structure. Nodes in the tree either
own at least one RRSet and/or have descendants that collectively own
at least one RRSet. A node may exist with no RRSets only if it has
descendants that do; this node is an empty non-terminal.

A node with no descendants is a leaf node. Empty leaf nodes do not exist.
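The RFC 8020 rule described above can be sketched as a toy negative cache. This is only an illustration under simplifying assumptions: the class and method names are hypothetical, TTLs and DNSSEC are ignored, and no real resolver is claimed to be implemented this way.

```python
class NxdomainCutCache:
    """Toy negative cache applying the RFC 8020 rule (hypothetical helper)."""

    def __init__(self):
        self._nxdomain = set()

    @staticmethod
    def _labels(name):
        # "Leaf.Example.COM." -> ("leaf", "example", "com")
        return tuple(name.lower().rstrip(".").split("."))

    def record_nxdomain(self, name):
        """Remember that an authoritative server answered NXDOMAIN for name."""
        self._nxdomain.add(self._labels(name))

    def is_known_nonexistent(self, name):
        """True if name, or any ancestor of it, was answered with NXDOMAIN."""
        labels = self._labels(name)
        # Every suffix of the label list is the name itself or one of its ancestors.
        return any(labels[i:] in self._nxdomain for i in range(len(labels)))


cache = NxdomainCutCache()
cache.record_nxdomain("intermediate.example.com.")
print(cache.is_known_nonexistent("leaf.intermediate.example.com."))
print(cache.is_known_nonexistent("other.example.com."))
```

With such a cache, one NXDOMAIN for intermediate.example.com is enough to answer for everything below it without asking again, which is exactly why an authoritative server must not return NXDOMAIN for an empty non-terminal.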

A practical example with .US domain names

Let us take a working example from a TLD with a lot of labels historically, that is .US. Picking any example online, let us use www.teh.k12.ca.us.

Of course if you query for this name, or even teh.k12.ca.us you can get back A records. Nothing conclusive here for our purpose (there is even a CNAME in the middle of it, but we do not care about that) :

$ dig www.teh.k12.ca.us A +short
CA02205882.schoolwires.net.
107.21.20.201
35.172.15.22
$ dig teh.k12.ca.us A +short
162.242.146.30
184.72.49.125
54.204.24.19
54.214.44.86

Let us query now for k12.ca.us (I am not querying the authoritative nameserver of it, but that does not change the result in fact):

$ dig k12.ca.us A

; <<>> DiG 9.11.5-P1-1ubuntu2.5-Ubuntu <<>> k12.ca.us A
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 59101
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1480
;; QUESTION SECTION:
;k12.ca.us.         IN  A

;; AUTHORITY SECTION:
us.         3587    IN  SOA a.cctld.us. hostmaster.neustar.biz. 2024847624 900 900 604800 86400

;; Query time: 115 msec
;; SERVER: 127.0.0.10#53(127.0.0.10)
;; WHEN: mer. juil. 03 01:13:20 EST 2019
;; MSG SIZE  rcvd: 104

What do we learn from this answer?

First, it is a success because the status is NOERROR. If it had been anything else, and specifically NXDOMAIN, then neither teh.k12.ca.us nor www.teh.k12.ca.us could exist.

Second, the ANSWER section is empty. There are no A records for k12.ca.us. This is not an error: this type (A) does not exist for this name, but maybe other record types exist for it, or this name is an ENT, aka "Empty Non-Terminal": it is empty, but it is not a leaf, there are things "below" it (see definition in RFC 7719), as we already know (normally the resolution is top down, so we would reach this step before going one level below, and not the opposite as we are doing here for demonstration purposes).

This is why, as a shortcut, we say the status code is NODATA. This is not a real status code; it just means NOERROR plus an empty ANSWER section, which means there is no data for this specific record type but there may be for others.

You can repeat the same experiment for the same result if you query with the next "up" label, that is the name ca.us.
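The NODATA shortcut described above can be expressed as a small classifier over the two header lines that dig prints. This is a minimal sketch: the function name is hypothetical, and it deliberately ignores referrals, which also arrive as NOERROR with an empty ANSWER section but carry NS records in the AUTHORITY section.

```python
import re


def classify_dig_header(header_text):
    """Classify a dig header as NODATA, ANSWER, or the raw status (e.g. NXDOMAIN)."""
    status = re.search(r"status: (\w+)", header_text).group(1)
    answers = int(re.search(r"ANSWER: (\d+)", header_text).group(1))
    if status == "NOERROR" and answers == 0:
        # Not a real RCODE: the name exists, but has nothing of the requested type.
        return "NODATA"
    if status == "NOERROR":
        return "ANSWER"
    return status


# Header lines copied from the k12.ca.us query above.
header = (
    ";; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 59101\n"
    ";; flags: qr rd ra ad; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1\n"
)
print(classify_dig_header(header))
```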

Queries' results vs zonefile content

Now, where can the confusion come from? I believe it may come from the false idea that any dot in a DNS name means there is a delegation. This is false. Said differently, your example.com zonefile can look like this, and it is totally valid and working:

example.com. IN SOA ....
example.com. IN NS ....
example.com. IN NS ....

leaf.intermediate.example.com. IN A 192.0.2.37

With such a zonefile, querying this nameserver you will get exactly the behavior observed above: a query for intermediate.example.com will return NOERROR with an empty answer. You do not need to create it specifically in the zonefile (if you do not need it for other reasons), the authoritative nameserver will take care of synthesizing the "intermediate" replies, because it sees it needs this empty non-terminal (and any others "in-between" if there had been other labels) as it sees the leaf name leaf.intermediate.example.com.
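To make the idea of implied intermediate names concrete, here is a small sketch that derives the empty non-terminals from a flat list of record owner names. The function name is hypothetical, and the sketch assumes a single zone with no delegations; it is not how any particular nameserver is implemented.

```python
def empty_non_terminals(apex, owners):
    """Return the names strictly between the apex and record owners that own no records."""
    def norm(name):
        return name.lower().rstrip(".")

    apex = norm(apex)
    owned = {norm(owner) for owner in owners}
    ents = set()
    for name in owned:
        if name == apex or not name.endswith("." + apex):
            continue  # the apex itself, or a name outside this zone
        # Walk up from the owner's parent towards the apex.
        parent = name.split(".", 1)[1]
        while parent != apex:
            if parent not in owned:
                ents.add(parent)
            parent = parent.split(".", 1)[1]
    return ents


zone_owners = ["example.com.", "leaf.intermediate.example.com."]
print(empty_non_terminals("example.com.", zone_owners))
```

For the zonefile above, the only implied name is intermediate.example.com, which is the node the server answers for with NOERROR and an empty answer.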

Note that this is a widespread case in fact in some areas, but you might not see it because it targets more "infrastructure" records that people are not exposed to:

  • in reverse zones like in-addr.arpa or ip6.arpa, and specifically the last one. You will have records like 1.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.a.1.d.e.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa. 1h IN PTR text-lb.eqiad.wikimedia.org. and there is obviously not a delegation at each dot, nor resource records attached at each label
  • in SRV records, like _nicname._tcp.fr. 12h IN SRV 0 0 43 whois.nic.fr., a domain can have many _proto._tcp.example.com and _proto._udp.example.com SRV records because by design they must have this form, but at the same time _tcp.example.com and _udp.example.com will remain Empty Non-Terminals because never used as records
  • you have in fact many other cases of specific construction of names based on "underscore labels" for various protocols such as DKIM. DKIM mandates you to have DNS records like whatever._domainkey.example.com, but obviously _domainkey.example.com by itself will never be used, so it will remain an empty non-terminal. This is the same for TLSA records in DANE (ex: _25._tcp.somehost.example.com. TLSA 3 1 1 BASE64==), or URI records (ex: _ftp._tcp IN URI 10 1 "ftp://ftp1.example.com/public")
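As an illustration of the reverse-zone case in the first bullet, Python's standard ipaddress module can generate the PTR owner name for an IPv6 address, which shows how many labels such a name has. The address below is simply the one encoded in the ip6.arpa name quoted above; where the zone cuts actually fall is not shown and would have to be discovered with queries.

```python
import ipaddress

# Address decoded from the ip6.arpa PTR name quoted above.
addr = ipaddress.ip_address("2620:0:861:ed1a::1")

# reverse_pointer yields one label per hexadecimal nibble, plus "ip6" and "arpa".
ptr_name = addr.reverse_pointer
labels = ptr_name.split(".")

print(ptr_name)
print(len(labels), "labels")
```

Every one of those labels is a node in the tree, yet only a handful of them carry NS or PTR records; the rest are empty non-terminals.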

Nameserver behavior and generation of intermediate replies

Why does the nameserver synthesize automatically such intermediate answers? The core resolution algorithm for the DNS, as detailed in RFC 1034 section 4.3.2 is the reason for that, let us take it and summarize in our case when querying the above authoritative nameserver for the name intermediate.example.com (this is the QNAME in protocol below):

  2. Search the available zones for the zone which is the nearest ancestor to QNAME. If such a zone is found, go to step 3, otherwise step 4.

The nameserver finds zone example.com as nearest ancestor of QNAME, so we can go to step 3.

We have now this:

  3. Start matching down, label by label, in the zone. [..]

a. If the whole of QNAME is matched, we have found the node. [..]

b. If a match would take us out of the authoritative data, we have a referral. This happens when we encounter a node with NS RRs marking cuts along the bottom of a zone. [..]

c. If at some label, a match is impossible (i.e., the corresponding label does not exist), look to see if a the "*" label exists. [..]

We can eliminate cases b and c, because our zonefile has no delegation (hence there will be never a referral to other nameservers, no case b), nor wildcards (so no case c).

We only have to deal here with case a.

We start matching down, label by label, in the zone. So even if we had a long sub.sub.sub.sub.sub.sub.sub.sub.example.com name, at some point, we arrive at case a: we did not find a referral, nor a wildcard, but we ended up at the final name we wanted a result for.

Then we apply the rest of the content of case a:

If the data at the node is a CNAME

Not our case, we skip that.

Otherwise, copy all RRs which match QTYPE into the answer section and go to step 6.

Whatever QTYPE we choose (A, AAAA, NS, etc.) we have no RRs for intermediate.example.com as it does not appear in the zonefile. So the copy here is empty. Now we finish at step 6:

Using local data only, attempt to add other RRs which may be useful to the additional section of the query. Exit.

Not relevant for us here, hence we finish with success.

This exactly explains the behavior observed: such queries will return NOERROR but no data either.

Now, you may ask yourself: "but then if I use any name, like another.example.com then by the above algorithm I should get the same reply (no error)", but observations would instead report NXDOMAIN in that case.

Why?

Because the whole algorithm as explained, starts with this:

The following algorithm assumes that the RRs are organized in several tree structures, one for each zone, and another for the cache

This means that the above zonefile is transformed into this tree:

+-----+
| com |  (just to show the delegation, does not exist in this nameserver)
+-----+
   |
   |
   |
+---------+
| example | SOA, NS records
+---------+
   |
   |
   |
+--------------+
| intermediate | no records
+--------------+
   |
   |
   |
+------+
| leaf | A record
+------+

So when following the algorithm from the top, you can indeed find a path: com > example > intermediate (because the path com > example > intermediate > leaf exists). But for another.example.com, after com > example you do not find the another label in the tree as a child node of example. Hence we fall into the second part of case c from above:

If the "*" label does not exist, check whether the name we are looking for is the original QNAME in the query or a name we have followed due to a CNAME. If the name is original, set an authoritative name error in the response and exit. Otherwise just exit.

Label * does not exist, and we did not follow a CNAME, hence we are in case: set an authoritative name error in the response and exit, aka NXDOMAIN.
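The walk through cases a and c above can be condensed into a short sketch. It is a simplification under stated assumptions: the function names are hypothetical, the tree is rooted above com only to mirror the drawing, and zone selection (step 2), delegations (case b), wildcards and CNAMEs are all left out.

```python
def build_tree(records):
    """Build a nested-dict tree from (owner, type, rdata) tuples."""
    root = {"children": {}, "rrsets": {}}
    for owner, rtype, rdata in records:
        node = root
        # Insert labels from the right (com) to the left (leaf).
        for label in reversed(owner.lower().rstrip(".").split(".")):
            node = node["children"].setdefault(label, {"children": {}, "rrsets": {}})
        node["rrsets"].setdefault(rtype, []).append(rdata)
    return root


def lookup(root, qname, qtype):
    """Match down label by label; return (rcode, matching records)."""
    node = root
    for label in reversed(qname.lower().rstrip(".").split(".")):
        child = node["children"].get(label)
        if child is None:
            # Case c with no "*" label and no CNAME followed: name error.
            return "NXDOMAIN", []
        node = child
    # Case a: the whole QNAME matched; copy the RRs of the requested type, if any.
    return "NOERROR", node["rrsets"].get(qtype, [])


tree = build_tree([
    ("example.com.", "SOA", "..."),
    ("example.com.", "NS", "..."),
    ("leaf.intermediate.example.com.", "A", "192.0.2.37"),
])
print(lookup(tree, "leaf.intermediate.example.com.", "A"))
print(lookup(tree, "intermediate.example.com.", "A"))
print(lookup(tree, "another.example.com.", "A"))
```

The intermediate node exists only because inserting the leaf created it on the way down, which is the same reason the real nameserver can answer for it without any explicit entry in the zonefile.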

Note that all the above did create confusion in the past. This is collected in some RFCs. See for example this unexpected place (the joy of DNS specifications being so impenetrable) defining wildcards: RFC 4592 "The Role of Wildcards in the Domain Name System" and notably its section 2.2 "Existence Rules", also cited in part at the beginning of my answer but here it is more complete:

Empty non-terminals [RFC2136, section 7.16] are domain names that own no resource records but have subdomains that do. In section 2.2.1,
"_tcp.host1.example." is an example of an empty non-terminal name.
Empty non-terminals are introduced by this text in section 3.1 of RFC 1034:

# The domain name space is a tree structure.  Each node and leaf on
# the tree corresponds to a resource set (which may be empty).  The
# domain system makes no distinctions between the uses of the
# interior nodes and leaves, and this memo uses the term "node" to
# refer to both.

The parenthesized "which may be empty" specifies that empty non-
terminals are explicitly recognized and that empty non-terminals
"exist".

Pedantically reading the above paragraph can lead to an
interpretation that all possible domains exist--up to the suggested
limit of 255 octets for a domain name [RFC1035]. For example,
www.example. may have an A RR, and as far as is practically
concerned, is a leaf of the domain tree. But the definition can be
taken to mean that sub.www.example. also exists, albeit with no data. By extension, all possible domains exist, from the root on down.

As RFC 1034 also defines "an authoritative name error indicating that the name does not exist" in section 4.3.1, so this apparently is not the intent of the original definition, justifying the need for an updated definition in the next section.

And then the definition in next section is the paragraph I quoted at the beginning.

Note that RFC 8020 (on NXDOMAIN really meaning NXDOMAIN, that is if you reply NXDOMAIN for intermediate.example.com, then leaf.intermediate.example.com can not exist) was mandated in part because various DNS providers did not follow this interpretation and that created havoc, or they were just bugs, see for example this one fixed in 2013 in one opensource authoritative nameserver code: https://github.com/PowerDNS/pdns/issues/127

People then needed to put specific countermeasures in place just for those providers: not caching NXDOMAIN aggressively, because with them an NXDOMAIN at one node could still be followed by something other than NXDOMAIN at a node below it.

And this was making QNAME minimization (RFC 7816) impossible to obtain (see https://indico.dns-oarc.net/event/21/contributions/298/attachments/267/487/qname-min.pdf for longer details), while it was wanted to increase privacy. Existence of empty non-terminals in case of DNSSEC also created problems in the past, around handling of non-existence (see https://indico.dns-oarc.net/event/25/contributions/403/attachments/378/647/AFNIC_OARC_Dallas.pdf if interested, but you really need a good understanding of DNSSEC before).
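Why a wrong NXDOMAIN on an empty non-terminal breaks QNAME minimization can be sketched as follows. The helper names and the two fake servers are hypothetical, and the model is deliberately crude: a real minimizing resolver adds labels relative to known zone cuts and chooses its own query type, none of which is modelled here.

```python
def minimized_qnames(qname):
    """Names queried when one label is added at a time, starting from the right."""
    labels = qname.rstrip(".").split(".")
    return [".".join(labels[i:]) + "." for i in range(len(labels) - 1, -1, -1)]


def first_nxdomain(qname, rcode_for):
    """Return the name at which resolution stops, or None if it reaches qname."""
    for name in minimized_qnames(qname):
        if rcode_for(name) == "NXDOMAIN":
            return name
    return None


def compliant_server(name):
    # An empty non-terminal is answered with NOERROR (and an empty answer).
    return "NOERROR"


def broken_server(name):
    # A non-compliant server that answers NXDOMAIN for the empty non-terminal.
    return "NXDOMAIN" if name == "intermediate.example.com." else "NOERROR"


target = "leaf.intermediate.example.com."
print(minimized_qnames(target))
print(first_nxdomain(target, compliant_server))
print(first_nxdomain(target, broken_server))
```

Against the broken server the walk stops at intermediate.example.com and never reaches the leaf, even though the leaf has an A record.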

The following two messages give an example of the problems one provider had in properly enforcing this rule on Empty Non-Terminals; they give some perspective on the issues and on why we were there:

Patrick Mevzek
  • 1
    Excellent answer. Is the synthesizing of replies for intermediate domains mandated by an RFC or is it just a de facto convention? – Lassi Jul 03 '19 at 08:56
  • 1
    @Lassi see my edited answer, besides putting it in sections, I added a full explanation of the resolver algorithm (so no it is not a convention, but really something coming out of the RFCs, even if the bible of DNS aka RFC 1034 and 1035 are full of imprecision and ambiguities hence needed a lot of other RFCs to refine language and rules) and I hope useful links to learn more if interested. – Patrick Mevzek Jul 03 '19 at 19:35
  • 1
    @Lassi I added multiple examples of ENTs in the wild in infrastructure records: PTR, SRV, TXT for DKIM, TLSA, URI – Patrick Mevzek Jul 03 '19 at 20:31
  • 1
    Incredibly thorough work. Thanks a lot for your efforts! – Lassi Jul 05 '19 at 07:50
11

It's possible that I misunderstand Khaled's answer, but the lack of intermediate records should in no wise be a problem with the resolution of the subzoned name. Note that this dig output is not from, nor directed to, an authoritative DNS server for teaparty.net or any subzone thereof:

[me@nand ~]$ dig very.deep.host.with.no.immediate.parents.teaparty.net
[...]
;; ANSWER SECTION:
very.deep.host.with.no.immediate.parents.teaparty.net. 3600 IN A 198.51.100.200

Indeed, you should be able to do that dig yourself, and get that answer - teaparty.net is a real domain, under my control, and really does contain that A record. You can verify that there are no records for any of those zones between very and teaparty.net, and that it has no impact on your resolution of the above hostname.

MadHatter
  • 1
    I'm starting to be out of my depth here, but based on Patrick's answer this probably works because you have all of `teaparty.net`'s records in a single zonefile so empty records are synthesized for the intermediate domains. Can somebody explain what would happen if `parents.teaparty.net` is a delegation and only `very.deep.host.with.no.immediate` has a record in the delegate zonefile? – Lassi Jul 03 '19 at 09:31
  • @Lassi exactly the same thing you see above, because it *is exactly the same case*: `teaparty.net` is a delegated subdomain of `net`; if the only A record in its zonefile were `very.deep...` it wouldn't matter. – MadHatter Jul 03 '19 at 12:02
  • 1
    Example links should use the RFC compliant example domains - meta discussion here: https://meta.stackexchange.com/questions/186529/help-users-create-dummy-links-that-are-not-to-unrelated-commercial-sites – HomoTechsual Jul 04 '19 at 21:26
  • 2
    It isn't an example link. It actually works (did you even bother to try it?) which is germane to the point at hand. As you will see from [this meta discussion](https://meta.serverfault.com/questions/963/what-information-should-i-include-or-obfuscate-in-my-posts) there are lots of domain names that should *not* be obfuscated, in both questions and answers. – MadHatter Jul 05 '19 at 04:34
  • 1
    I was confused by it also, but tried it. For a while I was sure it was some kind of wildcard or such... Until I figured out you are the DNS admin of that domain so you were able to put the record! Which is not something easy to get from just reading the answer, so in general I side with @HomoTechsual. The problem being that in some future you may remove the record, or the domain move, etc. and then this answer will not work anymore... (you can surely say the same thing with my own examples on .US names). Nevertheless, publishing private IP addresses in the public DNS is not a good idea ;-) – Patrick Mevzek Jul 05 '19 at 19:22
  • I think we're going to have to disagree about the choice of domain name. It's mine, and has been for a lot longer than `serverfault.com` has existed; if either of us is going to fear the other's ephemerality, it's not me the finger should point at. I do, however, agree that the choice of IP could have been better, so I have amended both the `A` record and the answer to use an RFC5737 `TEST-NET-2` example address. I also feel you might have guessed that it was my domain from my choice of username, but accept that people are not here for brainteasers, and have amended the answer accordingly. – MadHatter Jul 06 '19 at 07:28
2

If you are directly querying the authoritative DNS server, you will get answers without problems.

However, you will not get a valid answer if you are querying via another DNS server which does not have a valid cache. Querying for intermediate.example.com will result in NXDOMAIN error.

Khaled
  • Thanks. Does that imply that if I query such a DNS server for `leaf.intermediate.example.com` it will not find that subdomain either? – Lassi Jul 02 '19 at 11:22
  • 4
    It shouldn't result in `NXDOMAIN`, it should result in a `NOERROR` code and an empty Answer section. – Barmar Jul 02 '19 at 17:47
  • 4
    I don't see the point of this answer. There's no reason why anyone would need to query for `intermediate.example.com` if it's not being used for anything. So even if it returns an error (it doesn't), what difference does it make? – Barmar Jul 02 '19 at 17:49
  • 1
    @Barmar It's just that there's a difference between a nonexisting domain (that cannot have subdomains either) and a domain that exists, but has no record of any type associated. Anyone who obtained `NXDOMAIN` in a query for `intermediate.example.com` would rightfully assume that they need not even bother querying for `leaf.intermediate.example.com` – Hagen von Eitzen Jul 02 '19 at 21:33
  • 5
    You won't get `NXDOMAIN`, you get `NOERROR`. That's the response for a node that exists in the DNS hierarchy, but doesn't have any records of the type requested. – Barmar Jul 02 '19 at 21:42
  • 3
    Even if the domain exists, you'll get that response if you ask for a different record type than the ones it has; e.g. if it has `NS` records, but you ask for `A` records, you'll get `NOERROR` with an empty response. – Barmar Jul 02 '19 at 21:43
  • 1
    But the normal process of DNS resolution doesn't perform queries for each level in the hierarchy. If you're trying to go to `leaf.intermediate.example.com`, there's no need to query for `intermediate.example.com` first. – Barmar Jul 02 '19 at 21:44
  • 1
    Name resolution starts with the root name servers and works downwards (i.e. right to left) to prevent the injection of fake domains, so as Khaled says, the missing subdomain will generate an NXDOMAIN. You can see this in bind traces when searching for a non-local domain, but not in resolver traces. – grahamj42 Jul 02 '19 at 22:52
  • 4
    This is wrong. Per RFC 8020 if an authoritative nameserver responds `NXDOMAIN` for `intermediate.example.com` then it means there is nothing "below" and then `leaf.intermediate.example.com` CAN NOT exist. Some aggressive recursive resolvers can even cache that and deduce things by themselves. – Patrick Mevzek Jul 03 '19 at 06:03
  • 1
    @Barmar "But the normal process of DNS resolution doesn't perform queries for each level in the hierarchy. " Now it does, with QNAME minimization. See RFC 7816. It is also needed for DNSSEC to find out trust anchors. – Patrick Mevzek Jul 03 '19 at 06:35
  • 1
    @Barmar "There's no reason why anyone would need to query for intermediate.example.com" there is, for any kind of resolver needing to find the zone cuts (finding authoritative nameservers for a zone, also needed for DNSSEC), it starts at root and does label by label, so eventually coming to that one, seeing that there is nothing but not an error, and then continuing below (if it was given `leaf.intermediate.example.com` as a starting point). – Patrick Mevzek Jul 03 '19 at 08:16
  • Great discussion. This kind of thing is exactly why I asked the question. If a system is hierarchical in principle and you cheat by leaving out levels of the hierarchy that "nobody needs to use", edge cases are likely to bite eventually. – Lassi Jul 03 '19 at 09:05
2

To directly answer the question, no you do not need to add records for intermediate names that you are not actually using, however that doesn't mean that those names do not exist.

As for whether these names exist or not, that is actually a whole separate question for which I hope to provide a brief and rather intuitive answer.

It all boils down to that DNS is a tree structure, where each label in a domain name is a tree node. Eg www.example.com. has the labels www, example, com and `` (root node), which are the tree nodes that form the path all the way to the root.

What maybe makes this fundamental nature of DNS non-obvious is that, almost always when managing DNS data, there is no tree to be seen and we don't generally work directly with the tree nodes themselves. Instead, we typically have a flattened list of what record data should exist at different domain names (effectively tree paths, as per above).

What happens when this flattened list is used is that the DNS server software constructs the tree based on the existing records, and if there are gaps between the nodes that have records (eg there are records for foo.bar.example.com. and example.com. but not bar.example.com.) these are simply considered empty tree nodes. That is, these are domain names / nodes that do in fact exist, the tree is not broken, these nodes just don't have any data associated with them.

Consequently, if you query one of these empty nodes you will get a NODATA response (NOERROR status + SOA in authority section), saying that the requested record type did not exist at this node. If you instead query some name that actually doesn't exist you will get an NXDOMAIN response, saying that the requested domain name does not exist in the tree.

Now, if you want the nitty gritty details, do read Patrick Mevzek's very thorough answer.

Håkan Lindqvist