
I'm wondering if anyone has previously proposed, evaluated, or deployed the following measure to harden systems against heap-based buffer overruns: basically, stack canaries, but applied before function pointers in objects stored in the heap rather than before return addresses stored in the stack.

Consider a struct like

struct whatever {
    int blah;
    char buf[256];
    void (*fp)(); // a function pointer
};

Notice that if there is an overrun that writes past the end of the buf field, it will be possible to overwrite the function pointer field fp.

A compiler could plausibly defend against this by introducing a canary -- a secret random value -- stored between the buffer and the function pointer. Basically, the compiler would transform the layout of the structure to

struct whatever {
    int blah;
    char buf[256];
    unsigned int canary; // inserted by compiler; not exposed to source code
    void (*fp)(); // a function pointer
};

For instance, the compiler could arrange to write the canary field with a global secret value any time the program writes to fp, and could check that the value of the canary remains unchanged any time the program reads from fp.
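To make the write/check rule concrete, here is a minimal hand-written sketch of what the instrumented accesses might look like. Everything in it is an assumption for illustration: the names `g_canary`, `set_fp`, `get_fp`, `simulate_overrun`, and `noop` are hypothetical, the canary is a fixed placeholder rather than a random secret, and a real compiler would emit this logic itself and would abort on a mismatch rather than return NULL.

```c
#include <stddef.h>
#include <string.h>

/* Placeholder secret; a real scheme would choose it randomly at startup. */
static unsigned int g_canary = 0x5bd1e995u;

struct whatever {
    int blah;
    char buf[256];
    unsigned int canary; /* maintained by hand here to emulate the compiler */
    void (*fp)(void);
};

/* Emulates an instrumented write to fp: refresh the canary, then store. */
static void set_fp(struct whatever *w, void (*f)(void)) {
    w->canary = g_canary;
    w->fp = f;
}

/* Emulates an instrumented read of fp: NULL signals a clobbered canary. */
static void (*get_fp(const struct whatever *w))(void) {
    if (w->canary != g_canary)
        return NULL; /* a real implementation would abort here */
    return w->fp;
}

/* Stands in for an overrun of buf: fills every byte from the start of buf
   up to (but not including) fp, which covers the canary. */
static void simulate_overrun(struct whatever *w) {
    size_t start = offsetof(struct whatever, buf);
    size_t end = offsetof(struct whatever, fp);
    memset((unsigned char *)w + start, 'A', end - start);
}

/* Trivial target for the function pointer. */
static void noop(void) {}
```

In this sketch, `get_fp` returns the stored pointer after `set_fp`, and returns NULL once `simulate_overrun` has written across the canary. An overrun that stops before the canary, or an attacker who already knows the secret, would not be detected.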

This is basically the analog of stack canaries, but where now we focus on protecting function pointers in the heap instead of return addresses in the stack. It seems like a natural idea.

Has anyone proposed this before? Has anyone prototyped it or evaluated the performance cost of doing something like this? Are there any non-obvious barriers to deployment (beyond the fact that it requires changes to compilers, just like stack canaries do)?


Research I've done: I'm aware of the idea of inserting guard pages between objects in the heap, but that's different (it protects against heap overflows that go beyond the bounds of a single object, whereas I'm talking about something to protect against heap overflows that stay within the region of a single heap object). I'm familiar with Cruiser and ContraPolice, which place canaries between objects in the heap, but they too focus on cross-object overflows rather than intra-object overflows. I'm also familiar with the use of stack canaries or pointer encryption for protecting malloc's metadata, but again, that doesn't protect against intra-object overflows and is intended to protect malloc's metadata rather than function pointers.

D.W.

2 Answers


Canaries within an object run into one practical problem: they change the in-memory layout of the object. This layout needs to be consistent when the object is passed between, say, the program and its libraries. If you passed a pointer of type struct whatever * from a program compiled with this instrumentation to a library compiled without it (but with the same declaration of struct whatever), things would break, because some of the fields would be at different offsets. As a result, using internal canaries for anything other than theoretical research would break compatibility with almost all existing code and libraries.
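The offset mismatch can be illustrated with a small sketch that declares both views of the struct side by side. The names `whatever_plain`, `whatever_canary`, `fp_offset_plain`, and `fp_offset_canary` are hypothetical. The sketch also assumes a pointer-sized canary (`uintptr_t`) instead of the question's `unsigned int`, so that the shift shows up regardless of how much alignment padding the target inserts before `fp`.

```c
#include <stddef.h>
#include <stdint.h>

typedef void (*callback_t)(void);

/* Layout as an uninstrumented library would see it. */
struct whatever_plain {
    int blah;
    char buf[256];
    callback_t fp;
};

/* Layout as code built with the hypothetical canary instrumentation sees it. */
struct whatever_canary {
    int blah;
    char buf[256];
    uintptr_t canary; /* assumption: pointer-sized canary */
    callback_t fp;
};

/* Offset at which each side expects to find fp. */
static size_t fp_offset_plain(void) {
    return offsetof(struct whatever_plain, fp);
}

static size_t fp_offset_canary(void) {
    return offsetof(struct whatever_canary, fp);
}
```

On typical targets the two offsets differ, so a library reading `fp` at the plain offset from an instrumented object would load the canary (or part of it) and treat it as a function pointer.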

D.W.
Mark
  • Thank you. Can you elaborate on which standard would be violated if the compiler changes the in-memory layout of a struct along the lines I suggested? I thought the compiler is always free to insert padding between fields (e.g., to align fields as it sees fit), and there are no promises that the compiler won't do that. Have I misunderstood? Can you give an example where compatibility would break? (e.g., are you thinking that a program might write out the bytes of a struct to disk, and then read them back in? But that seems unlikely to work at all if the struct contains a function pointer.) – D.W. Jun 05 '14 at 06:28
  • @D.W. the compatibility problem is similar to that with encryption - instrumented code interfacing with non-instrumented. A big problem with the heap; with the stack (and also malloc internals) that can't happen. – paj28 Jun 05 '14 at 07:09
  • @D.W., the C language specification says that the compiler is free to change the layout. However, in order to interoperate with other libraries and programs (and the operating system), the layout needs to be consistent: this is part of the [application binary interface](https://en.wikipedia.org/wiki/Application_binary_interface) for a platform. – Mark Jun 05 '14 at 07:59
  • Mark, I have edited your answer to try to be more specific about what I think the problem you are getting at is. Please take a look and see if I did it justice or if I am still misunderstanding. – D.W. Jun 05 '14 at 20:47
  • @paj28, yup, qualitatively similar... but maybe differing quantitatively. With function pointer encryption, passing a function pointer (e.g., a callback) to an uninstrumented library screws you up. For instance, calling `qsort()` would break. But with internal canaries, that case doesn't break. Things break only if there is a `struct` that is declared in both the program header *and* the library header, and where the program was compiled with instrumentation and the library without. I suspect those cases might be rarer... but Mark is absolutely right that they're an issue. – D.W. Jun 05 '14 at 20:50

A related approach is function pointer encryption, along the lines of the pointer encryption used to protect malloc metadata. It is proposed in section 2.4 of "Protection Against Overflow Attacks".

The book's analysis is favorable. There is also an older paper about this technique (and another one). However, as far as I know, this technique is not in active use. I have absolutely no idea why not; perhaps it is performance and compatibility concerns; or perhaps the NSA have silenced the people who proposed it.
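As a rough sketch of the general idea (not the specific scheme from the book or the papers), a function pointer can be stored XORed with a secret key and decoded only at the point of use. The names `g_key`, `handler_slot`, `store_handler`, `load_handler`, and `add_one` are hypothetical, the key is a fixed placeholder rather than a per-process random value, and the code assumes a platform where function pointers round-trip through `uintptr_t`.

```c
#include <stdint.h>

typedef int (*handler_t)(int);

/* Placeholder key; a real scheme would randomize it per process. */
static uintptr_t g_key = (uintptr_t)0x5a5aa5a5u;

/* The slot never holds the raw pointer, only its XOR-masked form. */
struct handler_slot {
    uintptr_t enc;
};

/* Encode on store. */
static void store_handler(struct handler_slot *s, handler_t f) {
    s->enc = (uintptr_t)f ^ g_key;
}

/* Decode on load; an overwritten slot decodes to an unpredictable address. */
static handler_t load_handler(const struct handler_slot *s) {
    return (handler_t)(s->enc ^ g_key);
}

/* Example target function. */
static int add_one(int x) {
    return x + 1;
}
```

An attacker who overwrites `enc` without knowing the key cannot choose where the decoded pointer lands. The same property is why uninstrumented code that reads the slot directly would jump to a garbage address.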

paj28
  • Thank you, good stuff! One concern/criticism I've heard about function pointer encryption has to do with compatibility: if instrumented code passes a pointer to a location holding a function pointer to an uninstrumented library, then things will break horribly. This was why I asked about canaries rather than function pointer encryption; canaries seem potentially more resilient to the presence of uninstrumented libraries (maybe). – D.W. Jun 05 '14 at 06:25