
Dan Kaminsky described how DNS servers could be poisoned with spoofed DNS responses [1]. As I understand it, the problem was that Kaminsky found a way to account for most other sources of randomness in a DNS query such that the main barrier to an attacker was in guessing the DNS query id (16 bits of entropy) when generating a spoofed response. An attacker could, on average, spoof the response within 32k guesses. So, the recommended mitigation was to randomize the source port, and everyone applied their patches and all was well.

Except that this only brought the number of guesses up from 32k to somewhere between 134m and 4b. Sure, it couldn't be done quickly, but a patient attacker could still do this slowly. In fact, Bert Hubert calculated that an attack at 100 qps has a 50% chance of success within 6 weeks [2].
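To make the arithmetic behind those figures concrete, here is a rough sketch. The port counts are assumptions on my part: 2048 ports is simply the value that reproduces 134m (2^27), and 64512 ports corresponds to an unprivileged range of 1024 to 65535. Note that 32k is an average over a 65,536-value space, while 134m and 4b appear to be total search-space sizes.

```python
QUERY_ID_BITS = 16  # size of the DNS query ID field


def search_space(port_count: int = 1) -> int:
    """Number of (query ID, source port) combinations a spoofer has to cover."""
    return (2 ** QUERY_ID_BITS) * port_count


def average_guesses(port_count: int = 1) -> int:
    """Roughly half the search space, assuming no guess is repeated."""
    return search_space(port_count) // 2


# Fixed source port: only the query ID is unknown.
fixed_port_average = average_guesses(1)           # 32768, the "32k" figure

# Assumed port counts (not stated in the post) that reproduce the quoted range.
low_end = search_space(2048)                      # 134217728, about 134m
high_end = search_space(65535 - 1024 + 1)         # 4227858432, about 4b
```

This only counts combinations. It does not reproduce Hubert's 6-week estimate, which also depends on timing assumptions from his slides.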

I don't have sufficient reputation to post more links. However, I see that many technical approaches have been considered, such as draft-wijngaards-dnsext-resolver-side-mitigation-01 and draft-vixie-dnsext-dns0x20-00 on tools.ietf.org, RFC5452 as well as the Google Public DNS security docs:

  1. DNS label bit 0x20 (ie, cAsE.gAmEs)

    • does Bind do this? I can't believe that Bind wouldn't implement something that Vixie proposed
    • still, an attacker could force the query of domains whose case cannot be significantly munged, e.g. "d293823."
  2. RTT banding, IPv4/IPv6 selection, source address randomisation (from wijngaards)

    • I don't think this will add significant entropy, but I'd do it if Bind can.
  3. Authority query for NS/nameserver/A/AAAA after referral (from wijngaards)

    • This seems to be an elegant solution. Don't understand the problems with it. It might not be the preferred solution for large scale deployment, but it seems reasonable for my site. Can Bind do this?
  4. Attack mode trouble counter

    • Can Bind do this?
    • However, if I have a stateful firewall in front of the DNS server, the DNS server's trouble counter will never see spoofed responses arriving with the wrong destination port, right?
  5. Fallback to TCP (especially after going into attack mode)

  6. Ask twice/thrice (especially after going into attack mode)

  7. Removing duplicate queries (from Google Public DNS)

Unfortunately, I don't see any configuration options for turning these on even in the latest version of Bind.
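As an illustration of item 1 above, here is a minimal sketch of the 0x20 idea, not Bind's implementation. The function names are hypothetical. Each ASCII letter in the query name contributes one bit, which is why a name like "d293823." gains almost nothing.

```python
import random


def case_entropy_bits(qname: str) -> int:
    """Each ASCII letter can be sent in either case, so it contributes one bit."""
    return sum(1 for ch in qname if ch.isascii() and ch.isalpha())


def randomize_case(qname: str, rng: random.Random) -> str:
    """Randomly flip the case of every letter; digits, dots and hyphens stay as-is."""
    return "".join(
        (ch.upper() if rng.getrandbits(1) else ch.lower())
        if ch.isascii() and ch.isalpha()
        else ch
        for ch in qname
    )


def response_matches(sent_qname: str, echoed_qname: str) -> bool:
    """Accept a response only if it echoes the question name with identical case."""
    return sent_qname == echoed_qname


# random.SystemRandom() is the safer choice for a real resolver;
# a seeded random.Random is only useful for repeatable experiments.
rng = random.SystemRandom()
sent = randomize_case("case.games", rng)
```

With this sketch, `case_entropy_bits("cAsE.gAmEs")` is 9 and `case_entropy_bits("d293823.")` is 1, which matches the concern that an attacker can pick names with few letters.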

So I would like to ask what else can be done to protect against this style of spoofing attacks, specifically when I am running Bind. If the mitigation remains a probabilistic thing, I'd like for the odds to be stacked against the attack succeeding in a million years.

[1] http://s3.amazonaws.com/dmk/DMK_BO2K8.ppt

[2] http://ds9a.nl/har-presentation-bert-hubert-3.pdf slide 24.

  • `2.` is entirely possible in Bind, see the `query-source` option. "The BIND default is any server interface IP address and a random unprivileged port (1024 to 65535)." http://www.zytrax.com/books/dns/ch7/queries.html#query-source – NickW May 23 '14 at 14:59
  • It also adds significant entropy, from 10 seconds `@50k qps` (non random) to 36 hours `@50k qps`. – NickW May 23 '14 at 15:01
  • I understand that it adds ~ 15.8 bits of source port randomization which takes the number of possibilities up to ~ 4b. Does it also do source address randomization though, or does that just let the OS pick the default IP on the interface? Anyway, I don't really have tons of IPv4 addresses to throw at this :) DNS address randomization is not an acceptable reason for the RIR. – Bingu Bingme May 23 '14 at 15:34
  • I think you'd find that the source address is probably OS dependent, if they're in the same subnet. Routes probably have a non trivial impact as well. – NickW May 23 '14 at 15:41

2 Answers


Most TTLs are less than 30 days, right? Clear your cache every month.

There is no way to stop someone from hacking into your servers.
Security is just putting enough hurdles in the way that 99.9% of people give up.

You could also try putting an IPS such as Snort in front of your DNS, and create rules to alert you when someone makes an excessive number of DNS queries within a given amount of time.
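The thresholding logic behind such a rule can be sketched as follows. This is plain Python for illustration, not Snort rule syntax, and the class name and parameters are hypothetical.

```python
from collections import defaultdict, deque


class QueryRateMonitor:
    """Flags a source that sends more than max_queries within window_seconds."""

    def __init__(self, max_queries: int, window_seconds: float):
        self.max_queries = max_queries
        self.window_seconds = window_seconds
        self._seen = defaultdict(deque)  # source address -> recent query timestamps

    def record(self, source: str, timestamp: float) -> bool:
        """Record one query and return True if the source is over the threshold."""
        times = self._seen[source]
        times.append(timestamp)
        # Drop timestamps that have fallen out of the sliding window.
        while times and timestamp - times[0] > self.window_seconds:
            times.popleft()
        return len(times) > self.max_queries


monitor = QueryRateMonitor(max_queries=3, window_seconds=10.0)
alerts = [monitor.record("192.0.2.1", t) for t in (0.0, 1.0, 2.0, 3.0)]
```

Because the count is kept per source address, an attack spread across many clients would stay under any single-source threshold.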

Just so you know, I've run a cluster of DNS servers with hundreds of zones. Not only are these servers authoritative, but they are recursive. They have been used in DNS amplification attacks, but not a single time has there been an issue of cache poisoning. They've been running for 15 years.
One of them is a domain controller. Yes, it is ridiculous. I didn't design it.

Vasili Syrakis
  • Any attacker worth his salt would be using a botnet to do a distributed attack, so you would have an even lower volume from any single client. – NickW May 23 '14 at 09:50
  • Maybe they are worth a pile of salt, but there are myriad ways to mitigate it. Rate limit the server to 50 qps; as noted above, 100 qps over 6 weeks has a 50% chance of success. Plus clearing the cache every 2 to 4 weeks... It becomes pointless to attack "slowly". – Vasili Syrakis May 23 '14 at 09:56
  • BTW, look at page 21 of the second PDF: 50k qps can compromise a server in 36 hours with random source ports (10 seconds with a static source port). If you're going to rate limit your server to 50 qps, you're going to want a pretty big load-balanced pool, or you're actually not serving that many clients. Plus, if your server is authoritative, you'll need DNSSEC enabled to avoid enabling someone's cache poisoning attacks on your domains (they just flood questions from the IP, and push fake replies to the resolver, per https://conference.apnic.net/data/37/apricot-2014-rrl_1393309768.pdf). – NickW May 23 '14 at 10:56
  • I think that goes beyond the issue of cache poisoning, and into the realm of DDoS, no??? – Vasili Syrakis May 23 '14 at 11:14
  • The scary thing is it's all kinda linked, if you limit your replies too much, you allow other people to spoof your replies. I mean, if they're determined enough to try slow cache poisoning, why not something like that? – NickW May 23 '14 at 12:00
  • Clearing my resolvers' cache monthly/weekly/daily still means that a successful attacker could intercept my traffic for, on average, half that period, which is something I would like to avoid. Although my site is smallish, I do believe that it is sufficiently high profile that attackers may be interested. In addition, I already have separate authoritative and recursive name servers, where the recursive name servers only serve internal users. However, it is trivial for an attacker to trigger DNS resolution of an arbitrary domain at an arbitrary time, e.g. via SMTP HELO, an email address, or a compromised user. – Bingu Bingme May 23 '14 at 15:08
  • Also, rate limiting my recursive servers is just punishing my users for an attack that hasn't taken place (yet). Isn't Bind able to do more than the standard source port randomization? – Bingu Bingme May 23 '14 at 15:35
  • I would not leave this up to bind. If your nameservers are high profile you will need to make use of other options like scrubbing centres, ddos protection... if they are important then it is worth the money. – Vasili Syrakis May 23 '14 at 22:09
  • How would scrubbing centres and ddos protection help in the context of a resolver (not an authoritative name server)? – Bingu Bingme May 23 '14 at 23:50

While I realize that DNSSEC isn't ubiquitous, signed zones should add another layer of complexity for an attacker, as Bert mentions on pages 30-31. With the later versions of Bind, it's quite easy to implement (though your logs can get a bit verbose when talking to non-DNSSEC-enabled servers).

NickW
  • Roger that. However, this depends on the zone being DNSSEC-enabled already, which is still not very common. Even my ccTLD is not, yet. DNSSEC, as well as approaches like DNSCurve, requires the other party to support it, which is not within my (resolver's) control. I would prefer measures that can be implemented on the resolver, running Bind. – Bingu Bingme May 23 '14 at 14:53