
I ran into a weird issue where a wildcard CNAME record (e.g. *.example.com) was overriding specific A records (e.g. host1.example.com, host2.example.com). It only affected Verizon Wireless's nameservers. The authoritative nameservers are controlled by Network Solutions (ns1.dnsbycomodo.net and ns2.dnsbycomodo.net).

Other providers' nameservers (OpenDNS and mxtoolbox.com) returned the correct results. It can't be a caching issue: the incorrect IPs being returned (via the CNAME lookup) had never been used before, and on top of that, the change was made 12 hours prior and the TTL on the records was only 7200 seconds.

Deleting the wildcard CNAME record appears to have solved the issue. Any thoughts on what happened? Has anyone else run into this? Is this just some bug with Verizon's DNS servers talking to Network Solutions'? Supposedly wildcard CNAME records have been valid for a while (see "Is a wildcard CNAME DNS record valid?").

EDIT:

Here's the order in which things happened:

Original config:
A *.example.com -> 1.1.1.1
A host1.example.com -> 2.2.2.2
A host2.example.com -> 3.3.3.3

Changed to:
Removed "A" *.example.com
Added CNAME *.example.com -> hostalias.example.net which resolves to 4.4.4.4

Outcome:
On Verizon, queries for host1.example.com and host2.example.com started returning 4.4.4.4, whereas OpenDNS and mxtoolbox.com still correctly returned 2.2.2.2 and 3.3.3.3, respectively.

sa289
  • The TTL on *which* record though, the wildcard CNAME or its target? The distinction is important. – Andrew B Jan 21 '16 at 19:27
  • @AndrewB There was no CNAME previously - just the wildcard A record, but the A record had a TTL of 7200. I probably shouldn't have even mentioned that since it's not like it could be because of any old record since the new IP is what was pulling in incorrectly. – sa289 Jan 22 '16 at 17:42
  • I'm confused now. The first sentence of the question describes a wildcard CNAME record that is conflicting with static A records, but your comment is describing a wildcard A record. It will probably help us if you edit your question to include specific examples (obfuscated to `example.com` if necessary) and the exact bulleted order in which they were created. – Andrew B Jan 22 '16 at 19:23
  • @AndrewB Good idea - I've edited my question – sa289 Jan 22 '16 at 19:39

1 Answer


Thanks for updating your question; this makes the order of events much clearer. Unfortunately, the behavior you're describing remains very puzzling from the perspective of a recursive DNS resolver. This is best illustrated through examples.


When the query isn't in cache, a recursive DNS server is going to send the following query to the authoritative nameserver:

Question: host1.example.com. IN A

The remote authoritative server will respond like so when an explicit A record is defined:

Answer: host1.example.com. IN A 2.2.2.2

Or like this if it's hitting the CNAME record:

Answer: host1.example.com. IN CNAME hostalias.example.net.


In both cases, the choice of whether the A record or the CNAME record is served is made by the authoritative server, not the recursive server. The recursive server is, as the name implies, recursing: the upstream client's request for host1.example.com. IN A gets passed along unmodified unless additional queries are needed to arrive at the answer. (In this case, that would be an additional lookup for hostalias.example.net, unless the authoritative nameserver can provide that answer within the same response.)
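The precedence between an explicit record and a wildcard can be sketched in a few lines of Python. This is only an illustration of the wildcard rule from RFC 4592 (a wildcard is considered only when the queried name does not exist in the zone), not how any particular nameserver is implemented. The `ZONE` dict and the `authoritative_answer` function are hypothetical names, and the sketch ignores details such as empty non-terminals and DNSSEC.

```python
# Hypothetical zone data mirroring the "Changed to" configuration in the question.
ZONE = {
    "*.example.com.": {"CNAME": "hostalias.example.net."},
    "host1.example.com.": {"A": "2.2.2.2"},
    "host2.example.com.": {"A": "3.3.3.3"},
}


def authoritative_answer(zone, qname, qtype):
    """Return (owner, rtype, rdata) for a query, or None if there is no data.

    Simplified: an exact match on the name always wins; a wildcard is only
    consulted when the name itself is absent from the zone.
    """
    node = zone.get(qname)
    if node is None:
        # The name does not exist, so look for a wildcard at each ancestor,
        # starting with the closest one.
        labels = qname.split(".")
        for i in range(1, len(labels) - 1):
            node = zone.get("*." + ".".join(labels[i:]))
            if node is not None:
                break
    if node is None:
        return None
    if qtype in node:
        return (qname, qtype, node[qtype])
    if "CNAME" in node:
        # The client must follow the alias to get the final address.
        return (qname, "CNAME", node["CNAME"])
    return None


print(authoritative_answer(ZONE, "host1.example.com.", "A"))
# ('host1.example.com.', 'A', '2.2.2.2')
print(authoritative_answer(ZONE, "other.example.com.", "A"))
# ('other.example.com.', 'CNAME', 'hostalias.example.net.')
```

Under this model, a query for host1.example.com can never be answered from the wildcard, which is why the reported result is so puzzling.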

Given this well-understood behavior, the starting assumptions have to be treated as suspect. At least one of these is not 100% accurate:

  • The order in which those records were defined.
  • Verizon returning a response of 4.4.4.4. (i.e. the response came from somewhere closer to your network, or even a hosts file)
  • That all of your authoritative servers were returning the same response to that request.
  • Operator error.
  • Human memory.
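One way to test the third item is to ask each authoritative server directly, with recursion disabled, and compare the answers. The sketch below assumes the `dig` utility is installed and shells out to it. The function names `dig_short`, `compare_servers`, and `is_consistent` are hypothetical, and the `query` parameter exists only so the comparison logic can be exercised without network access.

```python
import subprocess


def dig_short(server, name, rtype="A"):
    """Ask one server directly (no recursion) and return its sorted answers."""
    result = subprocess.run(
        ["dig", "+short", "+norecurse", f"@{server}", name, rtype],
        capture_output=True,
        text=True,
        check=True,
    )
    return sorted(line for line in result.stdout.splitlines() if line)


def compare_servers(servers, name, rtype="A", query=dig_short):
    """Collect each server's answer for the same question."""
    return {server: query(server, name, rtype) for server in servers}


def is_consistent(results):
    """True when every server returned the same set of answers."""
    return len({tuple(answers) for answers in results.values()}) <= 1


# Example usage (requires dig and network access):
# results = compare_servers(
#     ["ns1.dnsbycomodo.net", "ns2.dnsbycomodo.net"], "host1.example.com."
# )
# print(results, is_consistent(results))
```

If the servers disagree, the difference between resolvers could simply reflect which authoritative server each one happened to query.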

I know that this is kind of a non-answer, but I don't think we can give you a better one without better documentation of what was going on during the event.

Andrew B
  • I don't blame you for suspecting the starting assumptions, but let me respond to each. Order shouldn't matter because the `host1.example.com` and `host2` always pointed to `2.2.2.2` and `3.3.3.3`. Verizon for sure was what was returning the response because I told nslookup to query Verizon's nameservers directly using the `server` command. I think it really must be either something with the authoritative nameservers and/or how Verizon asked the authoritative nameservers for the records. Continued in next comment... – sa289 Jan 22 '16 at 21:32
  • It couldn't have been operator error or human memory because when I logged in to investigate the issue, host1 and host2 were still A records pointing to the correct IPs and I also took a screenshot the night before showing they were correct then as well. For some reason somehow the wildcard CNAME was overriding the individual host records. – sa289 Jan 22 '16 at 21:33
  • Also, any explanation must take into account that OpenDNS and mxtoolbox.com were working fine, but Verizon was broken until we deleted the wildcard CNAME record. – sa289 Jan 22 '16 at 21:34
  • *"Also, any explanation must take into account that OpenDNS and mxtoolbox.com were working fine, but Verizon was broken until we deleted the wildcard CNAME record"* -- This unfortunately means we're at an impasse. The answer you're looking for requires Verizon to learn of a wildcard CNAME record in a context that is incomprehensible. The authoritative server doesn't announce the presence of a wildcard. It synthesizes responses due to its presence. You are looking at authoritative servers returning inconsistent responses, a typo, or an incorrect assumption. – Andrew B Jan 22 '16 at 23:58
  • Okay, I guess we won't know for sure - I'm thinking maybe whatever ns1.dnsbycomodo.net is, it's something NetSol acquired at some point in the past since it's different from their normal nameservers, so I'm going to chalk it up to some bug with that. – sa289 Jan 23 '16 at 00:22
  • NetSol didn't acquire them. [Comodo Group owns it.](https://en.wikipedia.org/wiki/DNS.com) – Andrew B Jan 23 '16 at 00:53
  • Hmm... well somehow logging into networksolutions.com and making changes to host records via their advanced DNS manager affects the records even though that's what the nameservers are set to. – sa289 Jan 23 '16 at 01:17
  • I've thought of an *extremely* rare, complex chain of interactions that may result in what you've described, but I will need the name of the domain to verify it. Not worth a write-up unless I can confirm it. – Andrew B Jan 23 '16 at 04:07
  • I'm very interested but default to not posting client information =(. I know you have plenty of reputation, but I'd give you a +50 bounty if you spent the time to write it up anyways. – sa289 Jan 23 '16 at 04:49