What are some possible reasons why my company's DNS server sometimes fails to resolve the host names of some of my VMs?

I'm in charge of a group of VMs that are used by members of the group that I belong to for accessing web apps, web services, build servers, and stuff of that nature. IT is in charge of the DNS servers and I do not have access to them.

All of the VMs are part of the same local domain. Each has its own unique static IP address, and all are configured to use the same default gateway, subnet mask, and DNS servers (dns1 and dns2).

As I have tested repeatedly from different client computers, the DNS servers never fail to resolve the host names of some of the VMs (the Windows Server VMs), but half the time they fail to resolve the host names of a specific set of VMs (the Windows 7 VMs).

For example, if I run the following commands on a client PC immediately after an attempt to access a machine results in a "server not found" error message, I get the following output:

ipconfig /displaydns

vm1host.mycompany.local
----------------------------------------
Name does not exist.

nslookup vm1host

Server:  dnsserver1.mycompany.local
Address:  <dnsserver1-ip-address>

*** dnsserver1.mycompany.local can't find vm1host: Non-existent domain

nslookup vm1host.mycompany.local

Server:  dnsserver1.mycompany.local
Address:  <dnsserver1-ip-address>

*** dnsserver1.mycompany.local can't find vm1host.mycompany.local: Non-existent domain

nslookup <vm1-ip-address>

Server:  dnsserver1.mycompany.local
Address:  <dnsserver1-ip-address>

Name:    vm1host.mycompany.local
Address:  <vm1-ip-address>

ping vm1host

Ping request could not find host vm1host. Please check the name and try again.

ping vm1host.mycompany.local

Ping request could not find host vm1host.mycompany.local. Please check the name and try again.

ping <vm1-ip-address>

Pinging <vm1-ip-address> with 32 bytes of data:
Reply from <vm1-ip-address>: bytes=32 time=1ms TTL=127
Reply from <vm1-ip-address>: bytes=32 time<1ms TTL=127
Reply from <vm1-ip-address>: bytes=32 time=2ms TTL=127
Reply from <vm1-ip-address>: bytes=32 time<1ms TTL=127

Ping statistics for <vm1-ip-address>:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 0ms, Maximum = 2ms, Average = 0ms
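To put a number on "half the time", the lookups above could be repeated from a script. What follows is only a rough sketch in Python, not something from my actual setup: the function names `resolve_ipv4` and `failure_rates` are made up for illustration, and it goes through the operating system's resolver, so the client's DNS cache and hosts file may influence the result.

```python
import socket
import time
from typing import Callable, Dict, Iterable


def resolve_ipv4(hostname: str) -> str:
    """Forward lookup through the operating system's resolver."""
    return socket.gethostbyname(hostname)


def failure_rates(
    hostnames: Iterable[str],
    attempts: int = 20,
    delay_seconds: float = 0.0,
    resolver: Callable[[str], str] = resolve_ipv4,
) -> Dict[str, float]:
    """Return, per hostname, the fraction of lookups that failed."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    rates: Dict[str, float] = {}
    for name in hostnames:
        failures = 0
        for _ in range(attempts):
            try:
                resolver(name)
            except OSError:
                # socket.gaierror and socket.herror are OSError subclasses
                failures += 1
            time.sleep(delay_seconds)
        rates[name] = failures / attempts
    return rates
```

Calling `failure_rates(["vm1host.mycompany.local"], attempts=10, delay_seconds=30)` would return something like a mapping from the name to a value between 0.0 and 1.0. A delay between attempts is probably sensible, since the client appears to cache the "Name does not exist" answer (see the ipconfig /displaydns output above).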

I discussed the problem with IT, and after I was able to prove to them that the problem was not the client PCs, I was given the choice of living with the problem, modifying the hosts file of each client, or filing a ticket to have someone from IT manually add an entry to the DNS server(s) mapping the hostname/ip-address of each problematic machine.

The last course of action will likely solve the current problem, but it will not get me any closer to understanding the cause of the problem nor will it make me less reliant on IT the next time a new VM experiences this problem.
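For completeness, the hosts-file option would look roughly like the following. This is only a sketch: `hosts_lines` is a hypothetical helper, and the 192.0.2.x addresses are documentation placeholders, not the real addresses of my VMs. The generated lines would be appended to C:\Windows\System32\drivers\etc\hosts on each client.

```python
from typing import Dict, List


def hosts_lines(ip_by_host: Dict[str, str], domain: str) -> List[str]:
    """Build hosts-file lines: IP address, then the FQDN and the short name."""
    lines = []
    for short_name, ip in sorted(ip_by_host.items()):
        fqdn = f"{short_name}.{domain}"
        lines.append(f"{ip}\t{fqdn} {short_name}")
    return lines


# Placeholder data; substitute the real static IP of each problematic VM.
for line in hosts_lines({"vm1host": "192.0.2.10"}, "mycompany.local"):
    print(line)
```

The obvious drawback is that every client has to be updated again whenever a VM is added or re-addressed, which is why I would rather fix the underlying problem.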

Attempts on my part to solve the problem include doing the following on each problematic machine:

  1. running ipconfig /registerdns

  2. running batch scripts that periodically call ipconfig /registerdns

  3. changing the machine's name, having it join a workgroup, restarting the machine, changing the machine's name back to its original name, having it rejoin the local domain, and restarting the machine again.

  4. Adding a DWORD entry named "DisabledComponents" with a value of 0x20 to the following registry key: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip6\Parameters\ (in case the machines are incorrectly registering their IPv6 address instead of their IPv4 address)

Concerning step 3, when I had the machines rejoin the local domain, I received the message "Changing the primary domain DNS name of this computer to "" failed."

According to this article http://support.microsoft.com/kb/2018583 the cause of the message could be one of the following:

  1. The "Disable NetBIOS over TCP/IP" checkbox has been disabled in the IPv4 properties of the computer being joined
  2. Connectivity over UDP port 137 is blocked between client and the helper DC servicing the join operation in the target domain
  3. The TCP/IPv4 protocol has been disabled so that the client being joined or the DC in the destination domain targeted by the LDAP BIND is running TCP/IPv6 only.

I confirmed that none of those is the case. I also confirmed that the machines' firewalls are using the domain profile and have the following rules concerning UDP port 137:

Domain Inbound Enabled:

  • File and Printer Sharing (NB-Name-In), local UDP port 137
  • NetBIOS Name Service, local UDP port 137, plus a specific remote address that I suppose is set by the domain's firewall policy

Domain Inbound Disabled:

  • Network Discovery (NB-Name-In), local UDP port 137

Domain Outbound Enabled:

  • File and Printer Sharing (NB-Name-Out), remote UDP port 137

Domain Outbound Disabled:

  • Network Discovery (NB-Name-Out), remote UDP port 137

Then again, the firewall settings for the Windows Server machines, the ones whose host names the DNS servers never fail to resolve, are set similarly.

The KB article I linked to ultimately concedes that as long as a "NetpCompleteOfflineDomainJoin SUCCESS: Requested a reboot :0x0" entry appears in C:\Windows\debug\NetSetup.LOG, the error message is little more than an annoyance.

To make the story even longer: in an attempt to determine whether the problem was the way the Windows 7 VMs are configured or the DNS server, I set up my own DNS server and pointed my client PC at it. To my surprise, my DNS server never failed to resolve the host names of any of my VMs. Alas, as soon as I shared the good news with IT, they told me to take it down because it could potentially mess up the company's DNS servers.

Other than never again relying on a DNS server to resolve the host name of a Windows 7 machine (and I'm going out on a limb here and assuming that the problem is somehow related to the fact that the affected machines are all running Windows 7), what else can I do to solve the current problem and prevent similar problems in the future?

J Smith

Posted 2014-01-18T20:06:16.167

Reputation: 177

You could try not obscuring and not falsifying your data when asking for help. Do your domain names really end in local.?

– JdeBP – 2014-01-20T20:55:52.777

If your domain does end in .local, keep in mind that Multicast DNS may be interfering if your clients are also attempting to resolve using that method. – cpugeniusmv – 2014-02-05T07:19:56.080

Answers

1

If the problem machines are not members of an Active Directory domain that the DNS server trusts, then they will not have permission to insert entries into the DNS server.

If your company's DNS server hosts a .local domain, then it will interfere with the zeroconf functionality of mDNS/DNS-SD available to OSes like Windows 7.
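As background, RFC 6762 reserves the .local top-level label for Multicast DNS, which is where the potential clash comes from. A minimal sketch of how you might flag affected names in an inventory of hosts follows; the function name `uses_mdns_suffix` is made up for illustration and is not part of any library.

```python
def uses_mdns_suffix(fqdn: str) -> bool:
    """True if the name's last label is 'local' (reserved for mDNS by RFC 6762)."""
    labels = fqdn.rstrip(".").lower().split(".")
    return labels[-1] == "local"


print(uses_mdns_suffix("vm1host.mycompany.local"))
```

This only tells you that a name falls in the reserved namespace; whether a given client actually tries mDNS for it depends on what resolver software is installed on that client.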

The question "Windows DNS sometimes can't pick up my VM's hostname" is similar. Although the problem machine there is not Windows-based, similar issues may be involved.

BeowulfNode42

Posted 2014-01-18T20:06:16.167

Reputation: 1 629