8

A typical strategy for defeating ASLR is to find both a buffer overflow bug and an information disclosure bug. But when attacking servers that are automatically restarted whenever they crash/die, is a buffer overflow bug enough? Can we use that buffer overflow bug to also give us the information disclosure capability?

Let me flesh out the scenario. Suppose I have a server program that processes a request from the network and will be automatically restarted on a crash, and suppose I have found a buffer overrun vulnerability (of a heap-allocated buffer B) in the server that I can reliably exploit by sending an appropriately crafted request to the server. Suppose also that I can detect when the server crashes (e.g., it might crash because I sent it a request that corrupted its memory and caused it to segfault; in any case, assume I can send it a request of my choice and determine whether it crashed or not while trying to handle that request). The server is using ASLR, and I want to derandomize ASLR so I can mount a code injection attack, but I don't know of a separate information disclosure bug -- the buffer overrun vulnerability is all I've got to work with.

Can I use this buffer overrun vulnerability for information disclosure, to learn the contents of memory after B?

Here's an example of the sort of attack I am imagining:

Suppose that the overflowable buffer B is 512 bytes long, and suppose there is a secret 8-byte pointer P stored immediately after B. Suppose field F of the request is copied byte-for-byte over B without any length check.

If I send a request with a 513-byte value for F, that'll be copied over B. If the 513th byte of my value differs from the first byte of P, then the value of P will be corrupted, and (assuming the program later dereferences P) then program will probably crash during processing of this request. On the other hand, if the 513 byte of my value matches the first byte of P, then P will remain unchanged, and the program will probably not crash.

So, I can imagine sending 256 requests, each with a different value for the 513th byte of field F; if 255 of them cause the server to crash and one does not, then I immediately know the value of the first byte of P. Now I can continue until I learn each of the bytes of P. This might be useful if the program is using ASLR: by learning the value of the pointer P, I can derandomize part of memory.

This is just a simple example. In practice, I would imagine there might be unused space after the end of B and before the next object stored in the heap, but you can imagine ways to adapt these techniques to deal with that situation as well (e.g., if the byte after B is unused, then you can overwrite it with anything and the server won't crash, so it's easy to detect locations that are unused, and continue the attack until you find the next object after B).

Does this attack work in practice? Does it provide an effective way to defeat ASLR, when you have a heap overflow in a server that's automatically restarted and when you have a way to detect crashes?

Are there any hurdles that I've overlooked that prevent this from working? For instance, I can imagine that if memory allocation for objects in the heap were non-deterministic and random, the attack would fail; but do platforms do that? Are the relative offsets between objects in the heap deterministic in practice, if you run the same program twice on the same inputs?


I am assuming that the buffer overflow allows overwriting B with arbitrary binary data that's totally under the attacker's control. (The attack won't work with a strcpy() or string-related overflow, since then the data is forced to be nul-terminated.) Also, let's assume that either the server is re-started using fork(), or for some other reason, part of the memory layout is the same each time the server is restarted. (For instance, this automatically holds on on Windows and Mac, since libraries are at the same base address every time you restart the server, and it holds for non-PIE processes on Linux.)

Credit: I am inspired by a method I recently read about for exploiting a buffer overflow bug of a stack-allocated buffer for information disclosure purposes. This was described in the Blind ROP paper recently published at IEEE Security & Privacy 2014. They show how to do it when the buffer B is allocated on the stack. In this question, I am asking whether their technique can be generalized to the case where the buffer B is on the heap.

D.W.
  • 98,420
  • 30
  • 267
  • 572

1 Answers1

5

As you note, after each failed attempt, the program crashes. Depending on the operating system/robustness of ASLR implementation and how the program is restarted, this may result in:

  1. the same layout (e.g. if fork without exec is used to restart the program);
  2. a partially different layout (e.g. non-library code addresses for non-PIE executable on Linux, or for library code addresses on some (maybe all?) versions of Windows and Mac OS X);
  3. a completely random layout (e.g. for a PIE Linux executable).

Therefore, if the value you are trying to guess is not static between program restarts, you can determine if you guessed correctly, but as soon as you guess incorrectly, the program restarts and you no longer have any information; thus, you don't gain anything by guessing incrementally.

Clearly ASLR does not provide any absolute protection, and its effectiveness is particularly limited when only some areas are randomized or the same layout is used across multiple program runs, but it does seem that making a generic attack with a heap overflow that does not depend on very specific details of the program in question, as done with stack overflows in the Blind ROP paper, would be rather difficult.

jbms
  • 466
  • 2
  • 3
  • 1
    Actually, in the Blind ROP paper, the authors do specifically note that their method only works when ASLR is not applied to some of the executable code, or the server is restarted using fork(), which prevents re-randomization. If the secret value you are trying to learn is a function pointer (as opposed to some other secret key that nonetheless causes a crash if modified), then the same restrictions would apply. The assumption that the heap layout remains the same seems plausible if no one else is accessing the server. – jbms May 30 '14 at 17:54
  • Good point, I was considering just Linux; I revised my answer, but I see that you already revised your question to take this into account. – jbms May 30 '14 at 20:29