DNS/resolv.conf settings for a Primary DNS Server failure?

Question

I currently administer some RHEL machines in a mixed network. Our DNS servers are Windows AD domain controllers, and they occasionally need to come down for maintenance (e.g., patching). This means that at some point the primary DNS server for my Linux machines will be unreachable.

In the Windows world, this is handled pretty well. When DNS queries to the primary fail, Windows clients stop using it for 15 minutes. So, barring the initial hiccup, they all putt along pretty smoothly. But Linux keeps trying the same (failed) primary server. By default it will wait at least 5 seconds before trying a secondary server. This translates into EVERYTHING taking a long time, and even applications timing out if there are a good number of DNS lookups.

So, I'm looking into making my servers more robust. My current plan is to A) modify resolv.conf to wait only half a second for a response and not retry, and B) possibly add some strategic entries to /etc/hosts so that major servers are still reachable quickly.

All that being said, I'd love to have a better solution. Alternatively, I'd like to hear what other people are doing with their setups. Or just theoretical "Your idea is good/bad, here's why."

--Christopher Karel

score 2 · Accepted Answer · edited Jul 08 '14 at 20:50

You might look at using dnsmasq instead of relying solely on the resolver library. dnsmasq queries the upstream servers in parallel rather than serially, so having one drop out shouldn't cause so many problems.
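As a rough sketch of what that could look like: a local dnsmasq instance listens on the loopback address and forwards to both AD DNS servers. The addresses 192.0.2.10 and 192.0.2.11 are placeholders, not values from the question, and the cache size is an arbitrary choice. The `all-servers` option is included to make the parallel behavior explicit, since the dnsmasq man page describes it as the flag that sends every query to all available upstream servers.

```
# /etc/dnsmasq.conf (sketch; upstream addresses are placeholders)
# Only answer queries from this machine.
listen-address=127.0.0.1
# Do not read upstream servers from /etc/resolv.conf; use the list below.
no-resolv
server=192.0.2.10
server=192.0.2.11
# Send each query to every upstream server and use the first reply.
all-servers
# Cache answers locally (size is an arbitrary example value).
cache-size=1000
```

The resolver library would then be pointed at the local instance:

```
# /etc/resolv.conf (sketch)
nameserver 127.0.0.1
```

With this layout, a domain controller going down for patching should mostly show up as one ignored upstream rather than a per-lookup timeout, though it is worth testing against your own maintenance window.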

edited Jul 08 '14 at 20:50

Andrew B

answered Aug 25 '10 at 20:18

gbroiles

score 2 · Answer 2 · answered Aug 26 '10 at 00:28

Running nscd and adding

    options rotate

to /etc/resolv.conf may already do the trick for you.
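For illustration, a resolv.conf along these lines combines `rotate` with the shorter timeout the question asks about. The nameserver addresses are placeholders. Note that, as far as I know, `timeout` is given in whole seconds, so 1 is the smallest usable value and a half-second timeout cannot be expressed here.

```
# /etc/resolv.conf (sketch; nameserver addresses are placeholders)
nameserver 192.0.2.10
nameserver 192.0.2.11
# rotate:     spread queries across the listed servers instead of always starting with the first
# timeout:1   wait 1 second (instead of the default 5) before moving on to the next server
# attempts:1  go through the server list once before giving up
options rotate timeout:1 attempts:1
```

With `rotate`, only some lookups hit the dead server first, and those that do fall through to the other one after about a second; a caching daemon such as nscd then reduces how often lookups go out at all.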

answered Aug 26 '10 at 00:28

al.

score 1 · Answer 3 · answered Aug 25 '10 at 15:55

An easier solution is to redirect the traffic for a certain time (the maintenance window).

If you have a spare machine, you could temporarily give it the IP of your primary server. Otherwise, you could set up the redirection in the router: if a packet's destination is your primary server, redirect it to your secondary server.

answered Aug 25 '10 at 15:55

Nikolaidis Fotis

Those are *quite* bad things to do with the IP address of a domain controller. – Massimo Aug 25 '10 at 20:49
I agree ... but it's still a dirty solution :> Maybe something easier would be to change the dhcpd configuration and advertise the secondary DNS as primary. In any case, I think these solutions are much more elegant and easier to deploy than changing the timeout manually (even if he uses Puppet or something like that, it's quite a mess). – Nikolaidis Fotis Aug 25 '10 at 21:15

score 1 · Answer 4 · answered Aug 25 '10 at 17:31

A harder but more robust solution would be to set up a couple of name servers (slaved off AD) and use anycast: http://en.wikipedia.org/wiki/Anycast#Domain_Name_System

answered Aug 25 '10 at 17:31

Mark Wagner
