BIND9 SERVFAIL Issue with Windows 2008 R2 DNS Server

I have been looking into a strange problem with BIND 9 when one of my Windows 2008 R2 DNS servers uses it as a forwarder. Specifically, when DNSSEC is turned on in BIND, some domain names fail to resolve under specific circumstances. The problem disappears as soon as the forwarder is switched to a public DNS server, such as Google's 8.8.8.8.

Looking at this further, it appears that when EDNS is turned on in the Windows 2008 R2 DNS server (the default, which accepts DNSSEC responses), resolution occasionally fails with a SERVFAIL when NODATA is returned to BIND (i.e. 0 answers with a status code of NOERROR).
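To make the NODATA condition concrete, here is a minimal Python sketch that classifies a dig response from its header lines. The function name classify_dig_header is hypothetical, and the sketch assumes the response comes from a recursive resolver, so NOERROR with zero answers is treated as NODATA rather than as a referral.

```python
import re


def classify_dig_header(dig_output):
    """Classify a dig response as ANSWER, NODATA, or its error status.

    Assumption: the response comes from a recursive resolver, so NOERROR
    with zero answers means NODATA (not a delegation/referral).
    """
    status = re.search(r"status: (\w+)", dig_output)
    answers = re.search(r"ANSWER: (\d+)", dig_output)
    if not status or not answers:
        return "UNKNOWN"
    if status.group(1) == "NOERROR":
        return "NODATA" if int(answers.group(1)) == 0 else "ANSWER"
    # NXDOMAIN, SERVFAIL, REFUSED, ... are reported unchanged.
    return status.group(1)


# Header lines copied from two of the dig runs in this question.
bind_direct = (
    ";; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 42484\n"
    ";; flags: qr rd ra ad; QUERY: 1, ANSWER: 0, AUTHORITY: 4, ADDITIONAL: 1\n"
)
via_windows = (
    ";; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 57054\n"
    ";; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1\n"
)
print(classify_dig_header(bind_direct))  # NODATA
print(classify_dig_header(via_windows))  # SERVFAIL
```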

For example, an SRV query for mx2.comcast.com fails when sent to the Windows 2008 R2 DNS server with BIND as its forwarder, but the same query for bat.comcast.com works just fine.

Doing the query locally with dig, I get these results:

mx2.comcast.com SRV - Local BIND query

; <<>> DiG 9.9.2-P1 <<>> @127.0.0.1 mx2.comcast.com SRV +dnssec
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 42484
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 0, AUTHORITY: 4, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags: do; udp: 4096
;; QUESTION SECTION:
;mx2.comcast.com.               IN      SRV

;; AUTHORITY SECTION:
mx2.comcast.com.        3600    IN      RRSIG   NSEC 5 3 3600 20130711200520 20130704170020 2643 comcast.com. pmOHJX7dSNuFSRiFvxNIIuhQk/Sh6/9xSiZ2wj2I6RDKkrQlDScdFjDB nSpeWt9068Wq+aQE36dbTsvyyCKgtrPcJIUxKVCtsXzTavXdx9XVGwG9 cKF6TrQx+MGPRwRwjPorDmPJxImveGMeE7X4Nl1mkGk/lRJwbvk1yFWV w1w=
mx2.comcast.com.        3600    IN      NSEC    mx3.comcast.com. A RRSIG NSEC
comcast.com.            3600    IN      SOA     dns101.comcast.net. domregtech.comcastonline.com. 2009085823 7200 3600 1209600 3600
comcast.com.            3600    IN      RRSIG   SOA 5 2 3600 20130711200520 20130704170020 2643 comcast.com. Te6jKcUXakWpPGQYpZICPShPZYEHHEcCnfFoof6VfOLPhhQP5MlWMbni QSQTY1UZLLCqU0j2U5n48wAMrSLSXoye+9W+pFnHtSl00fCQoQJ2ts+x DDQkdcJo2jWhNHGr6zsP6y9clhLUkFRW7ZVdqCV62KtTumU8Qe4UOjNK R3s=

Same query, but made with Google's DNS server:

; <<>> DiG 9.9.2-P1 <<>> @8.8.8.8 mx2.comcast.com SRV +dnssec
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 3537
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 0, AUTHORITY: 4, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags: do; udp: 512
;; QUESTION SECTION:
;mx2.comcast.com.               IN      SRV

;; AUTHORITY SECTION:
comcast.com.            1800    IN      SOA     dns101.comcast.net. domregtech.comcastonline.com. 2009085823 7200 3600 1209600 3600
comcast.com.            1800    IN      RRSIG   SOA 5 2 3600 20130711200520 20130704170020 2643 comcast.com. Te6jKcUXakWpPGQYpZICPShPZYEHHEcCnfFoof6VfOLPhhQP5MlWMbni QSQTY1UZLLCqU0j2U5n48wAMrSLSXoye+9W+pFnHtSl00fCQoQJ2ts+x DDQkdcJo2jWhNHGr6zsP6y9clhLUkFRW7ZVdqCV62KtTumU8Qe4UOjNK R3s=
mx2.comcast.com.        3600    IN      NSEC    mx3.comcast.com. A RRSIG NSEC
mx2.comcast.com.        3600    IN      RRSIG   NSEC 5 3 3600 20130711200520 20130704170020 2643 comcast.com. pmOHJX7dSNuFSRiFvxNIIuhQk/Sh6/9xSiZ2wj2I6RDKkrQlDScdFjDB nSpeWt9068Wq+aQE36dbTsvyyCKgtrPcJIUxKVCtsXzTavXdx9XVGwG9 cKF6TrQx+MGPRwRwjPorDmPJxImveGMeE7X4Nl1mkGk/lRJwbvk1yFWV w1w=

When using Windows with the BIND server as forwarder:

; <<>> DiG 9.9.3-P1 <<>> mx2.comcast.com SRV @127.0.0.1
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 57054
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4000
;; QUESTION SECTION:
;mx2.comcast.com.               IN      SRV

and with Google's DNS as forwarder:

; <<>> DiG 9.9.3-P1 <<>> mx2.comcast.com SRV @127.0.0.1
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 56582
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4000
;; QUESTION SECTION:
;mx2.comcast.com.               IN      SRV

;; AUTHORITY SECTION:
comcast.com.            900     IN      SOA     dns101.comcast.net. domregtech.comcastonline.com. 2009085823 7200 3600 1209600 3600

Now, trying this with bat.comcast.com:

; <<>> DiG 9.9.2-P1 <<>> @127.0.0.1 bat.comcast.com SRV +dnssec
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 2383
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 0, AUTHORITY: 4, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags: do; udp: 4096
;; QUESTION SECTION:
;bat.comcast.com.               IN      SRV

;; AUTHORITY SECTION:
comcast.com.            1603    IN      SOA     dns101.comcast.net. domregtech.comcastonline.com. 2009085823 7200 3600 1209600 3600
comcast.com.            1603    IN      RRSIG   SOA 5 2 3600 20130711200520 20130704170020 2643 comcast.com. Te6jKcUXakWpPGQYpZICPShPZYEHHEcCnfFoof6VfOLPhhQP5MlWMbni QSQTY1UZLLCqU0j2U5n48wAMrSLSXoye+9W+pFnHtSl00fCQoQJ2ts+x DDQkdcJo2jWhNHGr6zsP6y9clhLUkFRW7ZVdqCV62KtTumU8Qe4UOjNK R3s=
awrelaypool02.comcast.com. 1603 IN      RRSIG   NSEC 5 3 3600 20130711200520 20130704170020 2643 comcast.com. U87nbvAj7j7pAk4kigqMyVy8XDeHqRP9756PTQsucrRTEchtScfBKWLl Eo7cWJc4Vcsfept+ixg0IiAxpwHATqwNTmq/giAeglFfeFmMHlXrhdOl Bl5myReo1gSXlpm0+bvinOFRek/MUlYGLvDAq17noJag2k1oXrvhaNBo qWo=
awrelaypool02.comcast.com. 1603 IN      NSEC    www.bat.comcast.com. A RRSIG NSEC

and Google's resolver:

; <<>> DiG 9.9.2-P1 <<>> @8.8.8.8 bat.comcast.com SRV +dnssec
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 28253
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 0, AUTHORITY: 4, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags: do; udp: 512
;; QUESTION SECTION:
;bat.comcast.com.               IN      SRV

;; AUTHORITY SECTION:
comcast.com.            1800    IN      SOA     dns101.comcast.net. domregtech.comcastonline.com. 2009085823 7200 3600 1209600 3600
comcast.com.            1800    IN      RRSIG   SOA 5 2 3600 20130711200520 20130704170020 2643 comcast.com. Te6jKcUXakWpPGQYpZICPShPZYEHHEcCnfFoof6VfOLPhhQP5MlWMbni QSQTY1UZLLCqU0j2U5n48wAMrSLSXoye+9W+pFnHtSl00fCQoQJ2ts+x DDQkdcJo2jWhNHGr6zsP6y9clhLUkFRW7ZVdqCV62KtTumU8Qe4UOjNK R3s=
awrelaypool02.comcast.com. 3600 IN      NSEC    www.bat.comcast.com. A RRSIG NSEC
awrelaypool02.comcast.com. 3600 IN      RRSIG   NSEC 5 3 3600 20130711200520 20130704170020 2643 comcast.com. U87nbvAj7j7pAk4kigqMyVy8XDeHqRP9756PTQsucrRTEchtScfBKWLl Eo7cWJc4Vcsfept+ixg0IiAxpwHATqwNTmq/giAeglFfeFmMHlXrhdOl Bl5myReo1gSXlpm0+bvinOFRek/MUlYGLvDAq17noJag2k1oXrvhaNBo qWo=

Once again with Windows (BIND Resolver):

; <<>> DiG 9.9.3-P1 <<>> bat.comcast.com SRV @127.0.0.1
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 11140
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4000
;; QUESTION SECTION:
;bat.comcast.com.               IN      SRV

;; AUTHORITY SECTION:
comcast.com.            900     IN      SOA     dns101.comcast.net. domregtech.comcastonline.com. 2009085823 7200 3600 1209600 3600

Once again with Windows (Google Resolver):

; <<>> DiG 9.9.3-P1 <<>> bat.comcast.com SRV @127.0.0.1
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 22907
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4000
;; QUESTION SECTION:
;bat.comcast.com.               IN      SRV

;; AUTHORITY SECTION:
comcast.com.            900     IN      SOA     dns101.comcast.net. domregtech.comcastonline.com. 2009085823 7200 3600 1209600 3600

Looking at these results, Windows resolution fails on mx2.comcast.com yet succeeds on bat.comcast.com. From the error code that Windows reports (SERVFAIL), it may initially seem that DNSSEC validation is failing, but that is not the case: every response to the direct +dnssec queries against BIND and Google had the 'ad' (authenticated data) bit set. That said, BIND appears to change the order in which the authority section RRs appear. In Google's response for mx2.comcast.com, the authority section appears in this order (which is also how the authoritative server responds):

  • SOA
  • RRSIG - SOA
  • NSEC
  • RRSIG - NSEC

whereas BIND returns responses in this order:

  • RRSIG - NSEC
  • NSEC
  • SOA
  • RRSIG - SOA

For bat.comcast.com, Google responds in this order:

  • SOA
  • RRSIG - SOA
  • NSEC
  • RRSIG - NSEC

and BIND responds in this order:

  • SOA
  • RRSIG - SOA
  • RRSIG - NSEC
  • NSEC

Given that the first query fails in Windows while the second works just fine, it seems that Windows 2008 R2 requires the SOA record to appear first in the authority section when there are no answers and the return code is NOERROR. (Note that if the remote server returns an NXDOMAIN, the ordering of these RRs does not seem to matter, and Windows returns NXDOMAIN to the client accordingly.)
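The ordering comparison above can be automated. The following is a minimal Python sketch that pulls the authority-section order out of saved dig output and checks whether SOA comes first. The helper names authority_order and soa_first are hypothetical, and the sample input is the BIND response for mx2.comcast.com from above with the signature data left out.

```python
import re

# One dig RR line: owner, TTL, class IN, type, then the remaining rdata.
RR_LINE = re.compile(r"^(\S+)\s+(\d+)\s+IN\s+(\S+)\s*(.*)$")


def authority_order(dig_output):
    """Return the RR types of the AUTHORITY section in printed order."""
    order = []
    in_authority = False
    for raw in dig_output.splitlines():
        line = raw.strip()
        if line.startswith(";; AUTHORITY SECTION:"):
            in_authority = True
            continue
        if not in_authority:
            continue
        if not line or line.startswith(";"):
            break  # a blank line or the next section header ends the section
        match = RR_LINE.match(line)
        if match:
            rr_type, rest = match.group(3), match.group(4)
            if rr_type == "RRSIG" and rest:
                # The first rdata field of an RRSIG is the type it covers.
                rr_type = "RRSIG(" + rest.split()[0] + ")"
            order.append(rr_type)
    return order


def soa_first(order):
    """True if the authority section starts with the SOA record."""
    return bool(order) and order[0] == "SOA"


BIND_MX2 = """\
;; AUTHORITY SECTION:
mx2.comcast.com. 3600 IN RRSIG NSEC 5 3 3600 20130711200520 20130704170020 2643 comcast.com. (signature omitted)
mx2.comcast.com. 3600 IN NSEC mx3.comcast.com. A RRSIG NSEC
comcast.com. 3600 IN SOA dns101.comcast.net. domregtech.comcastonline.com. 2009085823 7200 3600 1209600 3600
comcast.com. 3600 IN RRSIG SOA 5 2 3600 20130711200520 20130704170020 2643 comcast.com. (signature omitted)
"""

order = authority_order(BIND_MX2)
print(order)             # ['RRSIG(NSEC)', 'NSEC', 'SOA', 'RRSIG(SOA)']
print(soa_first(order))  # False
```

Running this over more names that fail and succeed would show whether "SOA first" really separates the two groups, or whether two samples just happen to fit.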

I looked through the BIND documentation for configuration options that control the ordering of these RRs, but to no avail. Here's what I have tried:

    rfc2308-type1 yes;
    minimal-responses yes;
    rrset-order {order fixed;};

I have also tried upgrading the local BIND version from 9.9.2-P1 to 9.9.3-P1, but the behavior did not change.

Lastly, I could theoretically disable EDNS support in Windows 2008 R2 as a workaround and have these queries work (since disabling EDNS also suppresses the DO flag for DNSSEC, which omits the RRSIG and NSEC RRs from the response), although I would rather leave EDNS turned on for its efficiency over UDP.

Does anyone know what I am missing here, or has anyone run into a similar situation?

Any comments would be greatly appreciated!

user235909

Posted 2013-07-05T09:29:25.567

Reputation: 1

Answers

Have you been able to validate your assumption that it is the order of the records that causes MSDNS to return SERVFAIL? (It's plausible from what you show in the question but it's not clear to me that other possibilities have been ruled out.)

Also, is there anything logged on the MSDNS side relating to the failure?
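One way to test the ordering hypothesis in isolation would be a throwaway responder that returns the same NODATA response with the authority records in an order you choose, and then pointing the MSDNS forwarder at it. Below is a rough Python sketch of that idea, not a tested tool. All function names are hypothetical, the RRSIG signatures are zero-filled placeholders (so this assumes MSDNS does not validate signatures itself), every query is answered as if it were for mx2.comcast.com, and an OPT record with the DO bit is always appended whether or not the query used EDNS.

```python
import socket
import struct

TYPE = {"SOA": 6, "OPT": 41, "RRSIG": 46, "NSEC": 47}


def encode_name(name):
    """Encode a domain name in uncompressed DNS wire format."""
    out = b""
    for label in name.rstrip(".").split("."):
        out += bytes([len(label)]) + label.encode("ascii")
    return out + b"\x00"


def rr(owner, rtype, ttl, rdata):
    """Build one class IN resource record."""
    fixed = struct.pack("!HHIH", TYPE[rtype], 1, ttl, len(rdata))
    return encode_name(owner) + fixed + rdata


def rrsig_rdata(covered, labels):
    # Placeholder: zeroed validity times and signature, NOT a valid RRSIG.
    fixed = struct.pack("!HBBIIIH", TYPE[covered], 5, labels, 3600, 0, 0, 2643)
    return fixed + encode_name("comcast.com.") + b"\x00" * 128


def records():
    """Authority RRs modelled on the mx2.comcast.com response in the question."""
    soa = (encode_name("dns101.comcast.net.")
           + encode_name("domregtech.comcastonline.com.")
           + struct.pack("!IIIII", 2009085823, 7200, 3600, 1209600, 3600))
    # Next name plus the type bitmap for "A RRSIG NSEC" (window 0, 6 octets).
    nsec = encode_name("mx3.comcast.com.") + bytes([0, 6, 0x40, 0, 0, 0, 0, 0x03])
    return {
        "SOA": rr("comcast.com.", "SOA", 3600, soa),
        "RRSIG-SOA": rr("comcast.com.", "RRSIG", 3600, rrsig_rdata("SOA", 2)),
        "NSEC": rr("mx2.comcast.com.", "NSEC", 3600, nsec),
        "RRSIG-NSEC": rr("mx2.comcast.com.", "RRSIG", 3600, rrsig_rdata("NSEC", 3)),
    }


def build_nodata(query, order):
    """Answer `query` with NOERROR, zero answers, and authority RRs in `order`."""
    pos = 12  # the question starts right after the 12-octet header
    while query[pos] != 0:
        pos += 1 + query[pos]
    question = query[12:pos + 5]  # name, root label, QTYPE, QCLASS
    rrs = records()
    # Flags 0x8180 = QR + RD + RA with RCODE 0 (NOERROR).
    header = query[:2] + struct.pack("!HHHHH", 0x8180, 1, 0, len(order), 1)
    # OPT pseudo-RR: root owner, 4096-octet UDP size, DO bit set, no options.
    opt = b"\x00" + struct.pack("!HHIH", TYPE["OPT"], 4096, 0x8000, 0)
    return header + question + b"".join(rrs[key] for key in order) + opt


def serve(order, host="127.0.0.1", port=5353):
    """Answer every UDP query on host:port with the canned response."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    while True:
        query, addr = sock.recvfrom(4096)
        sock.sendto(build_nodata(query, order), addr)
```

Calling serve(["RRSIG-NSEC", "NSEC", "SOA", "RRSIG-SOA"]) would reproduce the BIND ordering, and serve(["SOA", "RRSIG-SOA", "NSEC", "RRSIG-NSEC"]) the Google ordering. The default address and port are placeholders; a forwarder can normally only be pointed at port 53, so the responder would need to listen there on an address the Windows server can reach. If MSDNS fails only with the first ordering, that would confirm the assumption; if it handles both, something else in the BIND response is the trigger.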

I am not aware of any BIND options that would affect how the RRSIG/NSEC/SOA records are ordered in this situation.

Of the settings you mention, rrset-order is the only one that should affect ordering, but to my knowledge it is intended for scenarios such as a response with multiple A records and how those should be ordered, rather than this.

Either way the value fixed is not supported by default:

In this release of BIND 9, the rrset-order statement does not support "fixed" ordering by default. Fixed ordering can be enabled at compile time by specifying "--enable-fixed-rrset" on the "configure" command line.

It seems to me that if the ordering is what causes your problem, either MSDNS or BIND has a bug.

It's obvious that BIND has changed the order of the records in its response but it's not obvious (to me anyway) why that would be a problem.

Håkan Lindqvist

Posted 2013-07-05T09:29:25.567

Reputation: 916