I am studying how algorithms are constructed and how they can be weak against resource exhaustion. One vulnerability that really caught my eye was the Apache Range Header DoS Vulnerability. The following quote was taken from Apache developers discussing the flaw:

From looking at the code, I think the problem is the bucket structs.
With N the number of requested ranges, the initial brigade is
partitioned into 2*N buckets at the maximum. Then those buckets are
copied into the output brigade N times, which means that O(N^2)
buckets are created. The data is not copied, and only N "A-B" strings
are allocated from the pool.

Does anyone know of other resources about building algorithms that are resistant to resource exhaustion attacks? Does anyone know of interesting papers on algorithmic analysis and susceptibility to resource exhaustion? Do you know of other resource consumption vulnerabilities that were caused by flawed algorithms?

CodesInChaos
rook

2 Answers

This kind of attack is known as Hash DoS or, more generally, as an algorithmic complexity attack.

There are several ways to implement lookup tables:

  1. Balanced trees
    The relevant operations have logarithmic performance, regardless of the data. These are naturally immune to this attack, but they're typically slower than a well-behaved hashtable.
  2. Hashtables
    If the hashes are well distributed, the relevant operations are O(1) on average and really fast, but if many keys land in the same bucket they degrade to O(n) per operation, so an attacker who can choose n colliding keys forces O(n^2) total work.
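The O(n) degradation of a hashtable can be made visible with a small experiment. The following is only a sketch, assuming CPython's built-in `dict`; the `Key` class and `comparisons_for` helper are hypothetical names invented for this illustration. It counts how many equality comparisons the table performs while inserting keys that either all share one hash value or have well-spread hashes.

```python
class Key:
    """Hypothetical key type whose hash can be forced to collide (demo only)."""

    eq_calls = 0  # global counter of equality comparisons made by the table

    def __init__(self, value, collide):
        self.value = value
        self.collide = collide

    def __hash__(self):
        # An attacker-style worst case: every key reports the same hash.
        return 0 if self.collide else hash(self.value)

    def __eq__(self, other):
        Key.eq_calls += 1
        return self.value == other.value


def comparisons_for(n, collide):
    """Insert n distinct keys into a dict and return the number of __eq__ calls."""
    Key.eq_calls = 0
    table = {}
    for i in range(n):
        table[Key(i, collide)] = True
    return Key.eq_calls


if __name__ == "__main__":
    for n in (100, 200, 400):
        print(n, comparisons_for(n, collide=False), comparisons_for(n, collide=True))
```

When every key collides, each insert has to be compared against the keys already stored, so doubling n should roughly quadruple the comparison count, while the well-distributed case stays small.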

There are two common strategies to avoid Hash DoS: you can switch to trees, or you can use keyed hashes.

For keyed hashes the server chooses a random secret key. Depending on the situation this key can be per-process, per-table, per-request, and so on. Longer key lifetimes can allow adaptive attacks over multiple requests, but I'm not sure how big a problem that is in practice.

Then it uses a keyed hash to determine the bucket, instead of an unkeyed hash function. If the keyed hash is good, this prevents the attacker from producing collisions quickly. Earlier keyed hashes often suffered from key-independent collisions, and thus didn't prevent the attack once these were found. Currently SipHash is gaining popularity, since it's fast but still cryptographically secure.

My recommendation is to use SipHash and to avoid key reuse across requests where possible.
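As a rough sketch of keyed bucket selection: Python's standard library does not expose SipHash as a callable function, so this example substitutes keyed BLAKE2b from `hashlib` as a stand-in. That substitution is an assumption made for illustration, not the SipHash construction itself. The `KeyedBucketIndex` class is a hypothetical name, and the key here is per-table.

```python
import hashlib
import secrets


class KeyedBucketIndex:
    """Hypothetical helper: maps byte strings to buckets using a secret key."""

    def __init__(self, num_buckets):
        self.num_buckets = num_buckets
        # Random secret chosen by the server; never revealed to clients.
        self.key = secrets.token_bytes(16)

    def bucket(self, data: bytes) -> int:
        # Keyed BLAKE2b stands in for SipHash here (assumption for the demo).
        digest = hashlib.blake2b(data, key=self.key, digest_size=8).digest()
        return int.from_bytes(digest, "big") % self.num_buckets


if __name__ == "__main__":
    table_a = KeyedBucketIndex(1024)
    table_b = KeyedBucketIndex(1024)
    for name in (b"alpha", b"beta", b"gamma"):
        # The same input generally lands in different buckets per table,
        # so collisions precomputed offline do not carry over.
        print(name, table_a.bucket(name), table_b.bucket(name))
```

Because the bucket depends on a key the attacker does not know, a set of inputs that collide under one key is not expected to collide under another.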


  • Breaking Murmur: Hash-flooding DoS Reloaded (On Martin Boßlet's blog)

  • SipHash/Attacks

    Attacks
    Jointly with Martin Boßlet, we demonstrated weaknesses in MurmurHash (used in Ruby, Java, etc.), CityHash (used in Google), and in Python's hash. Some of the technologies affected have switched to SipHash. See this oCERT advisory, and the following resources:

    • Slides of the presentation "Hash-flooding DoS reloaded: attacks and defenses" at Application Security Forum Western Switzerland 2012 (Aumasson, Boßlet)
  • Hash DoS against BTRFS

  • Many programming languages are currently upgrading their standard libraries to avoid Hash DoS. These attacks hit dynamically typed languages very hard, since they typically use hashtables almost everywhere (member names to values etc.)

    Some big projects upgrading their hashtables:

CodesInChaos
  • @bonsaiviking This is a great answer. I am looking for other vulnerable algorithms, and he found a whole set! – rook Dec 18 '12 at 00:44
  • Actually the Apache Range Header bug is not a hash collision problem. Apache had hash problems as well. – eckes Dec 21 '12 at 04:57

The specific resource exhaustion bug is caused by the transmission buffers being replicated for each requested range. As such, a configuration with a 16 kB send buffer would actually consume 1 MB of memory if 64 full-length ranges are requested, or 10 MB if 640 ranges are requested. As long as the connection is kept open, the unsent buffers remain in memory. Obviously this can be used to perform an effective DoS, since just 10 open connections with 640 ranges each could eat up 100 MB. Your mileage may vary with different configuration settings.
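The arithmetic behind those figures can be checked with a few lines. This is just a back-of-the-envelope sketch that assumes one full send buffer is held per requested range, as described above; `buffered_bytes` is a hypothetical helper name, and kB/MB are treated as 1024-based units.

```python
KIB = 1024
MIB = 1024 * KIB


def buffered_bytes(send_buffer_bytes, ranges_per_request, connections=1):
    """Memory held if every requested range pins one full send buffer."""
    return send_buffer_bytes * ranges_per_request * connections


if __name__ == "__main__":
    print(buffered_bytes(16 * KIB, 64) // MIB, "MB for 64 ranges")
    print(buffered_bytes(16 * KIB, 640) // MIB, "MB for 640 ranges")
    print(buffered_bytes(16 * KIB, 640, connections=10) // MIB, "MB for 10 connections")
```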

I don't know of any papers on algorithm security analysis, but one of the best ways to look for resource exhaustion bugs is to identify any point in the algorithm where an unrestricted input can alter the amount of memory allocated, or where the client does significantly less work than the server has to do in response.
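One way to act on that advice is to put an explicit bound on any client-controlled count before allocating anything per item. The sketch below is a simplified, hypothetical `Range` header parser, not Apache's actual code; the `parse_byte_ranges` function, the `TooManyRanges` exception, and the default limit of 16 are all assumptions chosen for illustration.

```python
class TooManyRanges(ValueError):
    """Raised when a request asks for more ranges than the server allows."""


def parse_byte_ranges(header_value, max_ranges=16):
    """Parse 'bytes=a-b,c-d,...' into (start, end) pairs, capped at max_ranges."""
    unit, sep, spec = header_value.partition("=")
    if not sep or unit.strip() != "bytes":
        raise ValueError("unsupported range unit")

    # Split at most max_ranges times, so oversized input is rejected
    # before a list proportional to the attacker's input is built.
    parts = spec.split(",", max_ranges)
    if len(parts) > max_ranges:
        raise TooManyRanges(f"more than {max_ranges} ranges requested")

    ranges = []
    for part in parts:
        start, dash, end = part.strip().partition("-")
        if not dash or (not start and not end):
            raise ValueError(f"malformed range: {part!r}")
        ranges.append((int(start) if start else None, int(end) if end else None))
    return ranges


if __name__ == "__main__":
    print(parse_byte_ranges("bytes=0-99,200-"))
    try:
        parse_byte_ranges("bytes=" + ",".join(["0-"] * 1000))
    except TooManyRanges as exc:
        print("rejected:", exc)
```

The point of the design is that the rejection happens while the server has done only a bounded amount of work, so the cost of an oversized request stays with the client.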

Polynomial