How do compilers detect buffer overflow?

Question

I just started researching security at the systems level and its challenges, especially with respect to low-level languages such as C/C++ and Objective-C. I understand buffer overflows and how they work, and I have been experimenting with them on OS X and Ubuntu. Of course, these systems have ASLR and stack guards implemented, which we can disable at compile time. So I have a couple of questions about this:

How does a compiler detect an overflow at compile time? I understand that while compiling it will add a canary, and if any instruction tries to overwrite it, an error is raised. But what is the exact algorithm? If someone can point me to the GCC functions for this, that would be great.
Is disabling the stack guard and making the stack/heap executable on Ubuntu sufficient? Or can ASLR still make it difficult to exploit?
If I have a binary (no source code) and I know it crashes due to buffer overflow, how do I detect it?

Compiling in itself doesn't catch buffer overflows, static code analysis does. Of course, compilers may opt to perform static code analysis themselves (e.g. in Java `(new byte[2])[2]` may fail at compile time, even though valid byte code can be produced). What is detected by what is a muddy subject, at least in Java there is quite a lot of duplicate checks in `javac` or the Eclipse compiler, source code analysis such as CheckStyle and byte code analysis such as findbugs. — Maarten Bodewes, Apr 03 '15 at 14:03
Darn, it actually doesn't fail compilation in Eclipse. Need new example... sheesh. — Maarten Bodewes, Apr 03 '15 at 14:05

score 6 · Answer 1 · edited Mar 17 '17 at 13:14

Buffer overflows aren't detected at compile time. There are code analysis tools such as Sparse or Lint (cpplint, PC-lint) that perform further analysis on either source code files or compiled binaries. Each analysis tool has its own algorithms for detecting a buffer overflow, but they all come down to recognizing common, well-known code patterns that lead to buffer overflows.

You can also add bounds checking at compile time, which inserts bounds information for each allocated block of memory. This bounds information is then checked at run time to ensure that accesses stay within the buffer's limits. A common implementation uses "fat" pointers, which contain both the address of the data and additional data describing the region in which it is located. I believe Firefox does this for its memory allocations, but I could be mistaken.

Canaries are inserted at compile time to help detect buffer overflows: a word of data is placed between a buffer and the control data on the stack. Before the function returns, the canary is verified to be intact.

ASLR has nothing to do with stack protection. It randomizes the addresses at which your program is loaded in memory. This means that you can't rely on functions being at the same address each time the program is run, which prevents the hardcoding of library and function addresses. Making the stack or heap (or whatever piece of memory you're trying to execute from) executable is necessary regardless, but ASLR will still cause you problems. If you're just experimenting and trying to understand exploit code, I would do the following:

Disable ASLR
Disable stack protections
Attack each problem individually, and re-enable them one at a time until your exploit can handle both.

If you have a binary that crashes due to... well, anything (not just a buffer overflow), run the program in a debugger. The debugger will catch the crash at the exact point where the program fails. Keep in mind that this might not be the exact location of the overflow, but a stack trace should help determine the root cause. I would suggest reading these posts if you're not familiar with reverse engineering tools or x86 computer architecture:

Why are buffer overflows executed in the direction they are?
Reverse Engineering Tools

user93353 · Answer 2 · 2015-05-09T20:49:05.833

Consider the following code

#include <string.h>

void f1(void)
{
    char buf[20];
    strcpy(buf, "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"); /* the buffer overrun */
}

void f2(void)
{
    /* ... */
    f1();
    /* The address of this place is pushed onto the stack
       as the return address before calling f1. */
    /* ... */
}

When f2 calls f1, by default the return address (i.e. the address where execution resumes after f1 has finished) is pushed onto the stack.

Typically, when buffer buf is overrun in f1, the overrun is exploited by overwriting the return address on the stack, so that when f1 finishes, execution returns to a location holding attacker-controlled data, which then runs as code and carries out the exploit.

Microsoft compilers implement a protection in the following way.
When the function is entered, it puts a security cookie (a random value) between the end of the buffer and the place on the stack where the return address is stored. The compiler also adds some code at the end of the function f1 to check that the security cookie still has the value that was placed there.
So let's say there was a buffer overrun, and it was used to change everything after the end of the buffer, including the return address. This means that the security cookie was overwritten as well. When the function f1 finishes execution, the code added at the end of the function will detect this and shut down the program, which means the exploit (the code the attacker wanted to run) will not be run.

Here is a more detailed explanation
