
Last Saturday we promoted a new virtual 2008 R2 server to DC and transferred all FSMO roles to it. Today we spent all day chasing down a problem with the domain browser function.

First some background:

  • We have a WAN with about 30 sites/subnets. Most sites have a DC running either 2000 or 2003. On the main site we had two 2000 (ntads01, ntads02) and one 2003 (ntads03) DCs before installing the 2008R2 (ntads04).
  • A 2003 server never owned the FSMO roles. The roles were transferred directly from ntads01 to ntads04.
  • The IP address of the old primary DC (ntads01) was changed, and its old address was assigned to ntads04. All DCs are still up and running.
  • WINS was installed on all three DCs on the main site. The two 2000 DCs were listed as WINS servers in DHCP.

Problem information

The problem this morning was that net send between XP machines stopped working. We started to investigate.

In the event log on ntads04 we have messages saying that DCs at the other sites are detected as master browsers and that the local master browser is shut down or an election is forced. (The message is in German, sorry.) The event log XML is as follows:

<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="bowser" /> 
    <EventID Qualifiers="49152">8003</EventID> 
    <Level>2</Level> 
    <Task>0</Task> 
    <Keywords>0x80000000000000</Keywords> 
    <TimeCreated SystemTime="2011-12-06T16:46:16.873033800Z" /> 
    <EventRecordID>19226</EventRecordID> 
    <Channel>System</Channel> 
    <Computer>ntads04.domain.local</Computer> 
    <Security /> 
  </System>
  <EventData>
    <Data>\Device\LanmanDatagramReceiver</Data> 
    <Data>NTFIL_BIET03</Data> 
    <Data>NetBT_Tcpip_{53FF57AE-2EC4-44D7-B96F-0A9F89AD9909}</Data> 
    <Binary>000000000300320000000000431F00C00000000000000000B1010000000000000000000000000000</Binary> 
  </EventData>
</Event>
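
For readers who want to pull the relevant fields out of many such events, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The `summarize_event` helper and the shortened `EVENT_XML` copy are illustrative names, not something from the original setup. The assumption that the second `Data` item is the computer claiming to be master browser and the third is the transport follows the usual wording of bowser event 8003.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the xmlns attribute of the event shown above.
NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}

# Shortened copy of the event above (raw string keeps the backslashes intact).
EVENT_XML = r"""<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="bowser" />
    <EventID Qualifiers="49152">8003</EventID>
    <TimeCreated SystemTime="2011-12-06T16:46:16.873033800Z" />
    <Computer>ntads04.domain.local</Computer>
  </System>
  <EventData>
    <Data>\Device\LanmanDatagramReceiver</Data>
    <Data>NTFIL_BIET03</Data>
    <Data>NetBT_Tcpip_{53FF57AE-2EC4-44D7-B96F-0A9F89AD9909}</Data>
  </EventData>
</Event>"""


def summarize_event(xml_text):
    """Return the fields that matter when counting 8003 events per site."""
    root = ET.fromstring(xml_text)
    system = root.find("e:System", NS)
    data = [d.text for d in root.findall("e:EventData/e:Data", NS)]
    return {
        "provider": system.find("e:Provider", NS).get("Name"),
        "event_id": int(system.find("e:EventID", NS).text),
        "time": system.find("e:TimeCreated", NS).get("SystemTime"),
        "computer": system.find("e:Computer", NS).text,
        # Assumed layout for event 8003: data[1] = claiming computer,
        # data[2] = transport on which the announcement arrived.
        "claimed_master": data[1],
        "transport": data[2],
    }


summary = summarize_event(EVENT_XML)
print(summary["event_id"], summary["claimed_master"], summary["time"])
```

Grouping the results by `claimed_master` would show which remote DCs trigger the event and how often.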

This morning we had some browser events with ID 8021 on ntads01, ntads02 and ntads03, stating that the server list could not be retrieved from ntads04. After browser event ID 8032 they disappeared, even though I restarted the Computer Browser service on ntads01 and ntads03.

WINS on the four DCs on the main site has a different number of entries on each server. On ntads04 we have the following WINS event IDs (all translated):

  • 4342 WINS could not read the cache size parameter "minimum". The default is used
  • 4325 WINS could not get the initial challenge retry interval from the registry
  • 4326 WINS could not get the maximum number of challenge retries from the registry

Steps taken so far:

  • Activated Computer Browser on ntads04 (solved only some sites in network neighborhood)
  • Disabled WINS on ntads01 and ntads02, re-installed on ntads04 and after first sync cleared database on ntads03 which seemed to solve some replication problems
  • Analyzed old event logs of ntads01. It had the MRxSmb 8003 event once a day per site; we now have it every 12 minutes.
  • Ran Wireshark on ntads04 (lots of browser elections and no domain master announcement)

    .... .... .... .... .... .... .... ...1 = Workstation: This is a Workstation
    .... .... .... .... .... .... .... ..1. = Server: This is a Server
    .... .... .... .... .... .... .... .0.. = SQL: This is NOT an SQL server
    .... .... .... .... .... .... .... 1... = Domain Controller: This is a Domain Controller
    .... .... .... .... .... .... ...0 .... = Backup Controller: This is NOT a Backup Controller
    .... .... .... .... .... .... ..1. .... = Time Source: This is a Time Source
    .... .... .... .... .... .... .0.. .... = Apple: This is NOT an Apple host
    .... .... .... .... .... .... 0... .... = Novell: This is NOT a Novell server
    .... .... .... .... .... ...0 .... .... = Member: This is NOT a Domain Member server
    .... .... .... .... .... ..0. .... .... = Print: This is NOT a Print Queue server
    .... .... .... .... .... .0.. .... .... = Dialin: This is NOT a Dialin server
    .... .... .... .... .... 0... .... .... = Xenix: This is NOT a Xenix server
    .... .... .... .... ...1 .... .... .... = NT Workstation: This is an NT Workstation
    .... .... .... .... ..0. .... .... .... = WfW: This is NOT a WfW host
    .... .... .... .... 0... .... .... .... = NT Server: This is NOT an NT Server
    .... .... .... ...0 .... .... .... .... = Potential Browser: This is NOT a Potential Browser
    .... .... .... ..0. .... .... .... .... = Backup Browser: This is NOT a Backup Browser
    .... .... .... .1.. .... .... .... .... = Master Browser: This is a Master Browser
    .... .... .... 0... .... .... .... .... = Domain Master Browser: This is NOT a Domain Master Browser
    .... .... ...0 .... .... .... .... .... = OSF: This is NOT an OSF host
    .... .... ..0. .... .... .... .... .... = VMS: This is NOT a VMS host
    .... .... .0.. .... .... .... .... .... = Windows 95+: This is NOT a Windows 95 or above host
    .... .... 1... .... .... .... .... .... = DFS: This is a DFS server
    .0.. .... .... .... .... .... .... .... = Local: This is NOT a local list only request
    0... .... .... .... .... .... .... .... = Domain Enum: This is NOT a Domain Enum request
    
  • Set the IsDomainMaster key on ntads04 and restarted computer browser (no change)
  • Checked with browstat which machine is the master browser (seems OK, nothing strange seen), but we are not sure about the output of browstat sta:

    C:\temp>browstat sta
    
    Status for domain DOMAIN on transport \Device\NetBT_Tcpip_{08C65950-9A5B-4E16-8CD8-F2890F7E81C7}
        Browsing is active on domain.
        Master browser name is: NTADS04
            Master browser is running build 7601
        3 backup servers retrieved from master NTADS04
            \\NTADS03
            \\NTADS01
            \\NTADS04
        Unable to retrieve server list from NTADS04: 64
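
The server-type flags in the Wireshark capture above can be decoded mechanically, which makes it easier to compare announcements from several hosts. The following is a minimal sketch: the bit positions and names are copied from the capture lines, while the hex value `0x0084102B` is reconstructed from the bits shown as set and is therefore an inference, since the capture excerpt does not show the raw field.

```python
# Bit position -> flag name, as displayed in the capture (bit 0 is rightmost).
SERVER_TYPE_FLAGS = {
    0: "Workstation",
    1: "Server",
    2: "SQL",
    3: "Domain Controller",
    4: "Backup Controller",
    5: "Time Source",
    6: "Apple",
    7: "Novell",
    8: "Member",
    9: "Print",
    10: "Dialin",
    11: "Xenix",
    12: "NT Workstation",
    13: "WfW",
    15: "NT Server",
    16: "Potential Browser",
    17: "Backup Browser",
    18: "Master Browser",
    19: "Domain Master Browser",
    20: "OSF",
    21: "VMS",
    22: "Windows 95+",
    23: "DFS",
    30: "Local",
    31: "Domain Enum",
}


def decode_server_type(value):
    """Return the names of all flags set in a 32-bit server-type field."""
    return [
        name
        for bit, name in sorted(SERVER_TYPE_FLAGS.items())
        if value & (1 << bit)
    ]


# Reconstructed from the set bits in the capture: 0, 1, 3, 5, 12, 18 and 23.
CAPTURED = 0x0084102B

flags = decode_server_type(CAPTURED)
print(flags)
print("Domain Master Browser" in flags)
```

With the reconstructed value, the decoded list contains "Master Browser" but not "Domain Master Browser", which matches the observation that no domain master announcement was seen.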
    

Questions open

  • Why do the master browser events occur so much more often than before? Can we get rid of this event completely? And is it true that ntads04 does not promote itself to Domain Master Browser? (See Edit 2)
  • Why did net send stop working? (See Edit 1)
  • Is WINS all right? I.e. is it normal that, of two WINS servers, one has about 200 extra entries (out of 4000 in total) even after a couple of hours? (They have the same intervals set up.)

Thanks for helping, I hope all the relevant information is included.

Edit 1:

The net send problem is solved. During the update, GPOs were cleaned up, which involved deleting all service configurations from the default domain GPO. This in turn made the Messenger service revert to its default (disabled). We made a new GPO that sets it to start automatically again. Fixed.

Edit 2:

The problem with the interval of the master browser events is fixed too. We got rid of them altogether. The cause was the DHCP helper on the Cisco routers: in the default configuration they forwarded not only DHCP and BOOTP but also a lot of other protocols. See http://www.cisco.com/en/US/docs/ios/12_3/ipaddr/command/reference/ip1_i1g.html#wp1108053 for details.
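
As an illustration of what "a lot of other protocols" means here: once `ip helper-address` is configured, Cisco IOS forwards broadcasts for a default set of UDP services, including the NetBIOS name service (137) and NetBIOS datagram service (138) that carry browser traffic. The sketch below generates `no ip forward-protocol udp <port>` lines for everything except DHCP/BOOTP. It does not reproduce the exact commands used in this fix, which were not recorded; the helper name `no_forward_commands` and the choice to keep only ports 67 and 68 are assumptions.

```python
# UDP services forwarded by default when `ip helper-address` is configured.
DEFAULT_FORWARDED_UDP = {
    37: "Time",
    49: "TACACS",
    53: "DNS",
    67: "BOOTP/DHCP server",
    68: "BOOTP/DHCP client",
    69: "TFTP",
    137: "NetBIOS name service",
    138: "NetBIOS datagram service",
}

# Assumption: only DHCP/BOOTP relaying is actually wanted.
KEEP = {67, 68}


def no_forward_commands(keep=KEEP):
    """Build global-config lines that stop forwarding the unwanted services."""
    return [
        f"no ip forward-protocol udp {port}"
        for port in sorted(DEFAULT_FORWARDED_UDP)
        if port not in keep
    ]


commands = no_forward_commands()
for line in commands:
    print(line)
```

Disabling only ports 137 and 138 would likely have been enough to stop the browser events, since announcements and elections travel over the NetBIOS datagram service; whether the other services need forwarding depends on the site.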

Jonathan
  • It looks like you've answered your own question, but I'm upvoting the question anyway because it's detailed, technical and interesting. :) – Ryan Ries Jan 23 '12 at 13:59
  • I'm glad your problem was solved. When you solve your own issue, you shouldn't edit it into your question, but rather post it as an answer and then accept it so that others can easily see that there is a solution, and this question won't be listed as "unanswered" – MDMarra Jan 23 '12 at 13:59

1 Answer

See the edits to the question.

WINS still has a different number of entries per server, but it seems to function well, so I'm closing this.


The problem with the interval of the master browser events is fixed too. We got rid of them altogether. The cause was the DHCP helper on the Cisco routers: in the default configuration they forwarded not only DHCP and BOOTP but also a lot of other protocols. See http://www.cisco.com/en/US/docs/ios/12_3/ipaddr/command/reference/ip1_i1g.html#wp1108053 for details.

MDMarra
Jonathan