Background:

I have an environment with two different AD domains, each in its own forest, each with two Windows Server 2008 R2 domain controllers acting as DNS servers. There is no trust between the domains.

Each DNS server manages the main DNS zone for its AD domain, plus some other zones, including the reverse lookup zone for its IP subnet. All zones are AD-integrated, and every DNS server that manages a zone is correctly listed as an authoritative name server for that zone.

So, the situation is like this (using fake names and IP addresses):

Domain A:
DNS domain: domainA.dom
IP subnet: 192.168.1
DCs/DNS Servers: serverA1.domainA.dom (192.168.1.1), serverA2.domainA.dom (192.168.1.2)
Authoritative zones: domainA.dom, 1.168.192.in-addr.arpa, somezone.local

Domain B:
DNS domain: domainB.dom
IP subnet: 10.0.0
DCs/DNS Servers: serverB1.domainB.dom (10.0.0.1), serverB2.domainB.dom (10.0.0.2)
Authoritative zones: domainB.dom, 0.0.10.in-addr.arpa, someotherzone.local

DNS servers in domain A have conditional forwarders defined for each zone managed by DNS servers in domain B, forwarding to both domain B's DNS servers; DNS servers in domain B have the opposite configuration. All forwarders are stored in Active Directory.

All is working perfectly, and computers in each domain can resolve forward and reverse DNS queries for both domains, using their domain's DNS servers.


The problem:

I have SCOM 2012 deployed in domain A, with the SCOM agent installed on both DCs; the management packs for Active Directory and DNS Server are installed and up-to-date.

I have a series of alerts like the following on both domain controllers; one alert is generated for each forwarded zone and for each forwarder server:

Forwarder someotherzone.local (10.0.0.1) cannot resolve the host name 192.168.1.1,someotherzone.local for serverA1.domainA.dom
Forwarder someotherzone.local (10.0.0.2) cannot resolve the host name 192.168.1.1,someotherzone.local for serverA1.domainA.dom
Forwarder someotherzone.local (10.0.0.1) cannot resolve the host name 192.168.1.2,someotherzone.local for serverA2.domainA.dom
Forwarder someotherzone.local (10.0.0.2) cannot resolve the host name 192.168.1.2,someotherzone.local for serverA2.domainA.dom
Forwarder 0.0.10.in-addr.arpa (10.0.0.1) cannot resolve the host name 192.168.1.1,0.0.10.in-addr.arpa for serverA1.domainA.dom
Forwarder 0.0.10.in-addr.arpa (10.0.0.2) cannot resolve the host name 192.168.1.1,0.0.10.in-addr.arpa for serverA1.domainA.dom
Forwarder 0.0.10.in-addr.arpa (10.0.0.1) cannot resolve the host name 192.168.1.2,0.0.10.in-addr.arpa for serverA2.domainA.dom
Forwarder 0.0.10.in-addr.arpa (10.0.0.2) cannot resolve the host name 192.168.1.2,0.0.10.in-addr.arpa for serverA2.domainA.dom
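To see at a glance which forwarded zones are affected, the alert text can be split into its parts. The following is a minimal Python sketch, not part of SCOM; the field names (`zone`, `forwarder_ip`, `listening_ip`, `query_name`, `server`) are my own labels, and reading the comma-separated "host name" as "local listening IP, queried name" is an assumption based only on how the alerts look.

```python
import re
from collections import defaultdict

# Pattern for the alert text shown above. The group names are illustrative
# labels, not official SCOM property names.
ALERT_RE = re.compile(
    r"^Forwarder (?P<zone>\S+) \((?P<forwarder_ip>[^)]+)\) "
    r"cannot resolve the host name (?P<listening_ip>[^,]+),(?P<query_name>\S+) "
    r"for (?P<server>\S+)$"
)


def parse_alert(line):
    """Return the alert fields as a dict, or None if the line does not match."""
    match = ALERT_RE.match(line.strip())
    return match.groupdict() if match else None


def alerts_by_zone(lines):
    """Group (server, forwarder IP) pairs by the forwarded zone they refer to."""
    grouped = defaultdict(set)
    for line in lines:
        alert = parse_alert(line)
        if alert:
            grouped[alert["zone"]].add((alert["server"], alert["forwarder_ip"]))
    return dict(grouped)


SAMPLE_ALERTS = [
    "Forwarder someotherzone.local (10.0.0.1) cannot resolve the host name "
    "192.168.1.1,someotherzone.local for serverA1.domainA.dom",
    "Forwarder 0.0.10.in-addr.arpa (10.0.0.2) cannot resolve the host name "
    "192.168.1.2,0.0.10.in-addr.arpa for serverA2.domainA.dom",
]

for zone, pairs in sorted(alerts_by_zone(SAMPLE_ALERTS).items()):
    print(zone, sorted(pairs))
```

Grouping by zone makes the pattern visible: every forwarded zone shows up except "domainB.dom".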

The only exception is the main AD DNS zone managed by domain B's DNS servers ("domainB.dom"): for that conditional forwarder, no alert is generated and the forwarder availability monitor is green.

Ok, what does this mean?
What are those monitors trying to tell me?
What are they checking?
What's actually wrong?

And why is there no error for the "domainB.dom" zone, which is configured in exactly the same way as the others, both as a zone on domain B's DNS servers and as a forwarder on domain A's DNS servers?

Massimo

1 Answer

Answer found, and it's a bit unpleasant (at least if you were expecting the people creating management packs to actually know what they are doing).

Extracted from the monitor's description:

This monitor verifies forwarder availability by performing an NSLOOKUP on the targeted forwarder.

The script executes nslookup -timeout=<value> -querytype=<type> <name> <server>

<type> is A, NS, SOA.

<name> is target DNS name to resolve. If unconditional forwarder, the override value is used else the forwarder domain name is used.

<server> is the name of the server where the NSLOOKUP query occurs.
The value is $Target/Property[Type="DNS!Microsoft.Windows.DNSServer.Library.Server"]/ListeningIP$ which provides a listing of all IP addresses that the current DNS server is listening on.

<value> is timeout seconds/3 because timeout seconds is used to set the maximum time the script can run.

Unconditional Forwarder Example: NSLOOKUP -timeout=30 -querytype=a www.microsoft.com 10.0.0.5 would query server 10.0.0.5 for the DNS name for www.microsoft.com. In this case, you would set the actual timeout seconds to 90 monitor override.

Conditional Forwarder Example: NSLOOKUP -timeout=30 -querytype=a www.msn.com 10.0.0.5 would query server 10.0.0.5 for the actual name of the conditional forwarder targeted by this monitor.

What this means is that the SCOM agent will try to perform queries like these:

nslookup -timeout=30 -querytype=a domainB.dom <server>
nslookup -timeout=30 -querytype=a someotherzone.local <server>
nslookup -timeout=30 -querytype=a 0.0.10.in-addr.arpa <server>
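The queries above can be reproduced by hand to check each forwarded zone. Here is a minimal Python sketch that only builds the command lines following the template in the monitor description; it is not the management pack's actual script. The function names are hypothetical, the use of integer division for "timeout seconds/3" is an assumption, and 192.168.1.1 stands in for `<server>` (the listening IP of the local DNS server).

```python
def nslookup_timeout(monitor_timeout_seconds):
    """Per-query timeout: the description says it is 'timeout seconds/3'.

    Integer division is an assumption; the description does not say how
    a non-multiple of 3 is rounded.
    """
    return monitor_timeout_seconds // 3


def build_nslookup_command(name, server, query_type="a", monitor_timeout_seconds=90):
    """Build the argument list for: nslookup -timeout=<value> -querytype=<type> <name> <server>."""
    return [
        "nslookup",
        f"-timeout={nslookup_timeout(monitor_timeout_seconds)}",
        f"-querytype={query_type}",
        name,
        server,
    ]


# Conditional forwarders defined on domain A's DNS servers (from the question).
FORWARDED_ZONES = ["domainB.dom", "someotherzone.local", "0.0.10.in-addr.arpa"]

for zone in FORWARDED_ZONES:
    # For a conditional forwarder, <name> is the forwarder's own domain name.
    print(" ".join(build_nslookup_command(zone, "192.168.1.1")))
```

Passing `query_type="ns"` or `query_type="soa"` produces the equivalent command for the other two query types the monitor supports.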

This would work for the main AD domain's DNS zone (because DCs automatically register blank-name A records, i.e. A records for the zone name itself, in that zone), but would fail for "someotherzone.local", which had no such A record (I can confirm that, after manually creating one, the alert disappeared and the monitor returned to green).

The third query, of course, would always fail, because it just doesn't make any sense at all to look for an A record in a reverse lookup zone.
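The reasoning above can be summed up as a toy model of which record types exist at the top of each forwarded zone. This is only an illustrative Python sketch with hard-coded data taken from this post; it does not query any DNS server. Every zone has SOA and NS records at its top, while only "domainB.dom" also has A records there.

```python
# Record types present at the top (apex) of each forwarded zone, as described
# in this post. Hard-coded for illustration; nothing here talks to a DNS server.
APEX_RECORDS = {
    "domainB.dom": {"SOA", "NS", "A"},       # DCs register blank-name A records
    "someotherzone.local": {"SOA", "NS"},    # no A record for the zone name
    "0.0.10.in-addr.arpa": {"SOA", "NS"},    # reverse zone: A records make no sense
}


def apex_query_has_answer(zone, query_type):
    """True if a query of this type for the zone name itself would return data."""
    return query_type.upper() in APEX_RECORDS[zone]


def monitor_results(query_type):
    """Simulate the forwarder check for every zone with the given query type."""
    return {zone: apex_query_has_answer(zone, query_type) for zone in APEX_RECORDS}


for qtype in ("a", "ns", "soa"):
    print(qtype, monitor_results(qtype))
```

With an A query only "domainB.dom" comes back healthy, which matches the alerts; with NS or SOA all three zones do.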

Resolution: override the DNS Forwarder Availability Monitor to perform an NS or SOA query for the forwarded zone instead of an A query.
Which is what it should have been doing from the very beginning.

Massimo