2

To make this easier to wrap my head around, here's what I'm using in my examples:

deecee = my domain controller
dctoo = another domain controller
internal.foo.bar = the full DNS domain name of my Windows domain.
foo = the short (NetBIOS) name of my Windows domain.
oursite = The only site in our domain

We have all of the logging turned on for MS DNS Server and see plenty of NXDOMAINs for requests of this form: _ldap._tcp.deecee.internal.foo.bar. Note that I am not talking about _ldap._tcp.internal.foo.bar. Those are working fine. Here is an error entry from the log:

2/19/2015 8:07:06 AM 0960 PACKET  0000000002F885B0 UDP Snd 10.0.0.87       5052 R Q [8385 A DR NXDOMAIN] SRV    (5)_ldap(4)_tcp(6)deecee(8)internal(3)foo(3)bar(0)
UDP response info at 0000000002F885B0
  Socket = 332
  Remote addr 10.0.0.87, port 54309
  Time Query=178201, Queued=0, Expire=0
  Buf length = 0x0fa0 (4000)
  Msg length = 0x006d (109)
  Message:
    XID       0x5052
    Flags     0x8583
      QR        1 (RESPONSE)
      OPCODE    0 (QUERY)
      AA        1
      TC        0
      RD        1
      RA        1
      Z         0
      CD        0
      AD        0
      RCODE     3 (NXDOMAIN)
    QCOUNT    1
    ACOUNT    0
    NSCOUNT   1
    ARCOUNT   0
    QUESTION SECTION:
    Offset = 0x000c, RR count = 0
    Name      "(5)_ldap(4)_tcp(6)deecee(8)internal(3)foo(3)bar(0)"
      QTYPE   SRV (33)
      QCLASS  1
    ANSWER SECTION:
      empty
    AUTHORITY SECTION:
    Offset = 0x0030, RR count = 0
    Name      "(8)internal(3)foo(3)bar(0)"
      TYPE   SOA  (6)
      CLASS  1
      TTL    3600
      DLEN   38
      DATA   
        PrimaryServer: (6)deecee[C030](8)internal(3)foo(3)bar(0)
        Administrator: (5)admin[C030](8)internal(3)foo(3)bar(0)
        SerialNo     = 247565
        Refresh      = 900
        Retry        = 600
        Expire       = 86400
        MinimumTTL   = 3600
    ADDITIONAL SECTION:
      empty
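For anyone reading the dump above, here is a minimal sketch of how the `Flags` word breaks down into the individual bits the log prints. The function name `decode_dns_flags` is made up for illustration; the bit positions follow the standard DNS header layout (RFC 1035, with AD/CD from the DNSSEC RFCs).

```python
def decode_dns_flags(flags: int) -> dict:
    """Split a 16-bit DNS header flags word into its named fields."""
    return {
        "QR": (flags >> 15) & 0x1,      # 1 = response
        "OPCODE": (flags >> 11) & 0xF,  # 0 = standard query
        "AA": (flags >> 10) & 0x1,      # authoritative answer
        "TC": (flags >> 9) & 0x1,       # truncated
        "RD": (flags >> 8) & 0x1,       # recursion desired
        "RA": (flags >> 7) & 0x1,       # recursion available
        "Z": (flags >> 6) & 0x1,        # reserved
        "AD": (flags >> 5) & 0x1,       # authentic data
        "CD": (flags >> 4) & 0x1,       # checking disabled
        "RCODE": flags & 0xF,           # 3 = NXDOMAIN
    }


# Flags value taken from the log entry above.
fields = decode_dns_flags(0x8583)
print(fields)
```

The point of interest is that AA is set together with RCODE 3: the server is answering authoritatively that the name does not exist, rather than failing to reach another server.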

Note that the client is requesting _ldap._tcp.deecee.internal.foo.bar. According to Microsoft's documentation, the proper request should be _ldap._tcp.internal.foo.bar.
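The DNS debug log writes names in a length-prefixed form such as `(5)_ldap(4)_tcp(6)deecee...`. A small sketch for turning that back into a dotted name, which makes it easier to grep for the odd queries. The helper name `decode_debug_name` is my own, and I'm assuming the bracketed markers like `[C030]` can simply be skipped because the log prints the expanded labels right after them.

```python
import re

# One "(length)label" pair; a label never contains parentheses or brackets.
LABEL = re.compile(r"\((\d+)\)([^()\[\]]*)")


def decode_debug_name(raw: str) -> str:
    """Convert '(5)_ldap(4)_tcp(3)foo(0)' style notation to '_ldap._tcp.foo'."""
    labels = []
    for length, text in LABEL.findall(raw):
        n = int(length)
        if n == 0:  # "(0)" is the root label and ends the name
            break
        if len(text) != n:
            raise ValueError(f"label {text!r} does not match length {n}")
        labels.append(text)
    return ".".join(labels)


queried = decode_debug_name("(5)_ldap(4)_tcp(6)deecee(8)internal(3)foo(3)bar(0)")
print(queried)
```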

The requests come in from all of our AD joined machines. They include Windows 7, Server 2008, 2008 R2, 2012, and 2012 R2.

Our DNS servers do have the appropriate SRV entries for _ldap._tcp.internal.foo.bar and they do resolve correctly. So that's not the issue.

A coworker opened a case with Microsoft and the tech finally claimed after a few days that this is normal. I don't buy it. Why is there no mention of this behavior at all in any documentation?

So, does anyone else see this behavior, with clients looking up SRV records for _ldap._tcp.deecee.internal.foo.bar? If so, are they getting NXDOMAIN results?

Any ideas how to fix this?

Thanks in advance.

Update A - There's more

In my domain I'm seeing these invalid queries, listed from most to least common:

_ldap._tcp.oursite._sites.deecee.internal.foo.bar  
_ldap._tcp.deecee.internal.foo.bar  
_ldap._tcp.oursite._sites.dctoo.internal.foo.bar  
_ldap._tcp.dctoo.internal.foo.bar  
_ldap._tcp.deecee                           <- only from our SharePoint hosts  
_ldap._tcp.oursite._sites.deecee  
_ldap._tcp.oursite._sites.dctoo  
_ldap._tcp.dctoo                            <- only from our SharePoint hosts  
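To sort queries like the ones listed above out of a large log, something along these lines could work. It is only a sketch: `classify_ldap_srv_query` is a hypothetical helper, and it only knows the two documented forms `_ldap._tcp.<domain>` and `_ldap._tcp.<site>._sites.<domain>`; anything else (including the `_msdcs` forms) falls into "other".

```python
def classify_ldap_srv_query(name: str, domain: str, dc_hosts: set) -> str:
    """Return 'expected', 'dc-as-domain', or 'other' for an SRV query name."""
    labels = name.lower().rstrip(".").split(".")
    if labels[:2] != ["_ldap", "_tcp"]:
        return "other"
    rest = labels[2:]
    # Strip the optional site-specific prefix: <site>._sites.<domain>
    if len(rest) >= 2 and rest[1] == "_sites":
        rest = rest[2:]
    if ".".join(rest) == domain.lower():
        return "expected"
    # A DC host name sits where the domain name should begin.
    if rest and rest[0] in dc_hosts:
        return "dc-as-domain"
    return "other"


dcs = {"deecee", "dctoo"}
for q in (
    "_ldap._tcp.internal.foo.bar",
    "_ldap._tcp.oursite._sites.deecee.internal.foo.bar",
    "_ldap._tcp.dctoo",
):
    print(q, "->", classify_ldap_srv_query(q, "internal.foo.bar", dcs))
```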

Update B - There's something in SharePoint

I turned on netlogon debugging on one of the affected machines and found some interesting stuff. First, this is what I believe is a successful query being sent:

02/26 22:31:00 [MISC] [6824] DsGetDcName function called: client PID=1884, Dom:FOO Acct:(null) Flags: DS NETBIOS RET_NETBIOS 
02/26 22:31:00 [MISC] [6824] NetpDcInitializeContext: DSGETDC_VALID_FLAGS is c07ffff1
02/26 22:31:00 [MISC] [6824] NetpDcGetName: internal.foo.bar. using cached information ( NlDcCacheEntry = 0x0000007051E732F0 )
02/26 22:31:00 [MISC] [6824] DsGetDcName: results as follows: DCName:\\DEECEE DCAddress:\\10.1.1.80 DCAddrType:0x1 DomainName:FOO DnsForestName:internal.hlc.com Flags:0x800031fc DcSiteName:oursite ClientSiteName:oursite
02/26 22:31:00 [MISC] [6824] DsGetDcName function returns 0 (client PID=1884): Dom:FOO Acct:(null) Flags: DS NETBIOS RET_NETBIOS

And here's what an unsuccessful query being sent looks like:

02/27 09:13:01 [MISC] [308] DsGetDcName function called: client PID=1884, Dom:DEECEE Acct:(null) Flags: WRITABLE LDAPONLY RET_DNS 
02/27 09:13:01 [MISC] [308] DsIGetDcName: DNS suffix search list allowed but single label DNS disallowed for name DEECEE
02/27 09:13:01 [MISC] [308] NetpDcInitializeContext: DSGETDC_VALID_FLAGS is c07ffff1
02/27 09:13:01 [CRITICAL] [308] NetpDcGetNameIp: DEECEE: No data returned from DnsQuery.
02/27 09:13:01 [MISC] [308] NetpDcGetName: NetpDcGetNameIp for DEECEE returned 1355
02/27 09:13:01 [MAILSLOT] [308] Sent 'Sam Logon' message to DEECEE[1C] on all transports.
02/27 09:13:03 [CRITICAL] [308] NetpDcGetNameNetbios: DEECEE: Cannot NlBrowserSendDatagram. (ALT) 53
02/27 09:13:03 [MISC] [308] NetpDcGetName: NetpDcGetNameNetbios for DEECEE returned 1355
02/27 09:13:03 [CRITICAL] [308] NetpDcGetName: DEECEE: IP and Netbios are both done.
02/27 09:13:03 [MISC] [308] DsGetDcName function returns 1355 (client PID=1884): Dom:DEECEE Acct:(null) Flags: WRITABLE LDAPONLY RET_DNS 

If my understanding is correct (please correct me if not), the first line of this indicates that the process with PID 1884 is asking netlogon to log in to a domain named "DEECEE". It literally thinks the domain name is DEECEE. Of course, the previous snippet (and others) show that this process, pid=1884, is shotgunning out requests, some of which are legit, and some aren't.
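To see which processes are passing odd names, the netlogon debug log can be summarized by pulling out the "function returns" lines. A rough sketch, assuming the line format shown in the snippets above; the function name `summarize_dsgetdc` and the regular expression are my own, not anything netlogon ships.

```python
import re

# Matches e.g. "DsGetDcName function returns 1355 (client PID=1884): Dom:DEECEE ..."
RETURN_LINE = re.compile(
    r"DsGetDcName function returns (\d+) \(client PID=(\d+)\): Dom:(\S+)"
)


def summarize_dsgetdc(lines):
    """Return a list of (pid, requested_domain, status) tuples."""
    results = []
    for line in lines:
        match = RETURN_LINE.search(line)
        if match:
            status, pid, dom = match.groups()
            results.append((int(pid), dom, int(status)))
    return results


sample = [
    "02/26 22:31:00 [MISC] [6824] DsGetDcName function returns 0 (client PID=1884): Dom:FOO Acct:(null) Flags: DS NETBIOS RET_NETBIOS",
    "02/27 09:13:03 [MISC] [308] DsGetDcName function returns 1355 (client PID=1884): Dom:DEECEE Acct:(null) Flags: WRITABLE LDAPONLY RET_DNS",
]
calls = summarize_dsgetdc(sample)
# Status 1355 is ERROR_NO_SUCH_DOMAIN; keep only the failed lookups.
failures = [c for c in calls if c[2] != 0]
print(failures)
```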

Checking the process list on that machine tells me it's a w3wp process. So I found out the application pool:

C:\Windows\System32\inetsrv>appcmd list wps
WP "1856" (applicationPool:SharePoint - 80)
WP "6540" (applicationPool:SharePoint Central Administration v4)
WP "1884" (applicationPool:272b926088ea454c8a4b4caa8526d3bb)
WP "8468" (applicationPool:6997d03e3ea94018841409e8b821d8da)
WP "6696" (applicationPool:SecurityTokenServiceApplicationPool)

And then I checked which applications are running in that pool:

PS C:\Users\administrator.HLC> Get-SPServiceApplication | foreach { if($_.ApplicationPool.Id -eq "272b9260-88ea-454c-8a4b-4caa8526d3bb") { $_ } }

DisplayName          TypeName             Id
-----------          --------             --
PerformancePoint ... PerformancePoint ... 8681c71c-81b9-41e5-ac19-58d0ccf11227
Managed Metadata ... Managed Metadata ... ef99af38-a3f8-4864-8c88-9ee421f3dfa0
App Management Se... App Management Se... 183ca7a4-825a-4807-91fc-4fe1c9fe93e0
Excel Services       Excel Services Ap... 46557c93-3d60-47f0-99ab-45cc32258137
Subscription Sett... Microsoft SharePo... 9fd75bbe-1464-4a4c-8bd0-3382c0c03dce
Search Administra... Search Administra... ee519543-e311-41fd-a8a4-0b952f731ff8
User Profile Service User Profile Serv... fe6886ab-4a2d-4216-8bcf-5160dad5c037
Business Data Con... Business Data Con... 813bb77c-9eb4-43d0-b2cc-09e8162e58e7
Work Management S... Work Management S... 81dbd284-2506-43a0-be93-2820759bb804
Search Service Ap... Search Service Ap... d641f112-b299-4318-baaf-817ef96107c4
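In case it isn't obvious how I got from the pool name in the appcmd output to the Id used in the PowerShell filter: the pool name appears to be the GUID with the dashes removed. A quick sketch of that conversion (assuming that naming convention holds, which it did on my farm):

```python
import uuid

# Application pool name as printed by "appcmd list wps" for PID 1884.
pool_name = "272b926088ea454c8a4b4caa8526d3bb"

# uuid.UUID accepts 32 bare hex digits; str() gives the dashed form.
pool_id = str(uuid.UUID(hex=pool_name))
print(pool_id)
```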

So I spent some time enabling and disabling these SharePoint services and watching the DNS queries go out. It appears that the User Profile Service is causing the queries for at least _ldap._tcp.deecee.

I know the whole thing isn't SharePoint's fault; as I said earlier, these queries are coming from all over the place. The ones for just _ldap._tcp.deecee, though, are coming only from our SharePoint hosts.

So that adds another question: what is the User Profile Service doing that's causing the lookups for _ldap._tcp.deecee? It still leaves the question for the rest of our servers, though.

Keith Twombley
  • Nothing to fix, because AD is based on DNS. Your AD client needs to resolve the domain with DNS and store the result. Clients look for FSMO and other names that return NXDOMAIN. It's not very well documented, but it's nothing strange (and nothing dangerous is being read from your DNS records). – YuKYuK Feb 19 '15 at 14:27
  • `A coworker opened a case with Microsoft and the tech finally claimed after a few days that this is normal. I don't buy it.` +1 for hubris. `Why is there no mention of this behavior at all in any documentation?` Two seconds of Googling later: https://technet.microsoft.com/en-us/library/cc961719.aspx?f=255&MSPPError=-2147217396 – Wesley Feb 19 '15 at 14:32
  • Your linked article says that clients will query for _ldap._tcp.dnsdomainname. Not _ldap._tcp.domaincontroller.dnsdomainname. I have read that article. – Keith Twombley Feb 19 '15 at 15:00
  • The documentation explicitly says that service records will be queried for in the form of `_ldap._tcp.domaincontroller.dnsdomainname.` (among other forms as well) – Wesley Feb 19 '15 at 15:16
  • Where does it say that? None of the examples show clients querying for the actual names of the domain controllers. – Keith Twombley Feb 19 '15 at 15:24
  • Scroll down to the bottom of your article and look at the example DC, phoenix.reskit.com. Nowhere in their examples does it show that a DNS record is registered for _ldap._tcp.phoenix. They don't mention it. – Keith Twombley Feb 19 '15 at 15:26
  • Interesting. I was trying to catch this in a few environments that I manage and I finally managed to catch it in one of them. I see the same queries you're seeing, so it doesn't appear to be an aberration, but I have no idea what those queries are for. I'm hoping someone can weigh in with a definitive answer. – joeqwerty Feb 19 '15 at 18:59

3 Answers

2

This is a bug.

Microsoft has known about it for a long time (since Win2000) but no one has convinced them to fix it.

Ryan Ries
  • I have read the documentation. My clients are not querying _ldap._tcp.dnsdomainname. They are querying _ldap._tcp.domaincontroller.dnsdomainname. – Keith Twombley Feb 19 '15 at 15:03
  • All the devices making the weird queries are running windows and are joined to the domain. All of the clients do issue the expected queries (such as _ldap._tcp.dnsdomainname), but they issue these NXDOMAIN ones as well. Thanks. – Keith Twombley Feb 19 '15 at 16:14
  • I wonder if it's a matter of the services you have running, and/or domain controller configuration. On my legacy, horribly undesigned forest, where every service imaginable going back to the NT days was probably run at one time, I see queries like the one in @KeithTwombley's question, referencing the FSMO master role holder (which is also the DNS server configured as primary for most clients). On my greenfield forest that we're migrating everything to, which I set up myself, and set up right, I don't see that. Either way, I agree with MS support that this isn't something that needs fixing, though. – HopelessN00b Feb 20 '15 at 01:47
1

With netlogon debugging enabled, I found the same result on my Win7 SP1 machines (domain controllers are 2008 R2 SP1). It also caused an 8-second delay in processing, so far as I can tell. Looks like a faulty API call to netlogon to me.

You can replicate the same 1355 error by running the following on a workstation:

nltest /dsgetdc:domaincontroller.domain.com

returns:

Getting DC name failed: Status = 1355 0x54b ERROR_NO_SUCH_DOMAIN

clearly because it's calling DsGetDcName with the wrong parameter (a domain controller's name where a domain name is expected).

Though I agree with everyone else, it's most likely nothing wrong with your infrastructure. It would be nice to get to the bottom of it though.

masegaloeh
0

No need to fix this; these lookups are being done to find the corresponding LDAP server for your AD tree.

vautee