String Termination Vulnerability
Thinking about this some more, strncpy() is probably the most common source of null termination errors that I can think of, because people generally think of a buffer's length as not including the terminating \0. So you'll see something like the following:
strncpy(a, "0123456789abcdef", sizeof(a));
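Assuming that a is declared as char a[16], the string in a will not be null terminated: the literal is exactly 16 characters long, so strncpy() fills the whole buffer and has no room left to write the \0. So why is this an issue? Well, in memory you now have something like: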
30 31 32 33 34 35 36 37 38 39 61 62 63 64 65 66
e0 f3 3f 5a 9f 1c ff 94 49 8a 9e f5 3a 5b 64 8e
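A minimal sketch of how you could check this yourself without reading past the end of the buffer. The helper names here (has_terminator and the two *_is_terminated functions) are hypothetical and not part of the original example; memchr() is used because it only inspects the number of bytes you tell it to.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: reports whether buf holds a '\0' within its first
 * `size` bytes. memchr() never reads beyond `size`, so this is safe to call
 * even on a buffer that is not null terminated. */
static bool has_terminator(const char *buf, size_t size) {
    return memchr(buf, '\0', size) != NULL;
}

/* Reproduces the copy from the text: 16 characters into a 16-byte buffer. */
static bool full_copy_is_terminated(void) {
    char a[16];
    strncpy(a, "0123456789abcdef", sizeof(a));  /* no room left for '\0' */
    return has_terminator(a, sizeof(a));
}

/* Same buffer, shorter source: strncpy() pads the remainder with '\0'. */
static bool short_copy_is_terminated(void) {
    char a[16];
    strncpy(a, "0123456789", sizeof(a));
    return has_terminator(a, sizeof(a));
}
```

The first function should report that no terminator is present, while the second should find one, since strncpy() only writes \0 bytes when the source is shorter than the count it is given.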
Without a null terminator, standard string functions won't know where the string ends. For example, strlen(a) will continue to count until it reaches a 0x00 byte. When is that? Who knows. (Strictly speaking, reading past the end of the array is undefined behavior, so anything can happen.) But whenever it finds one, it will return a length much larger than your buffer; let's say 78. Let's look at an example:
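#include <string.h>

void do_something_with_a(char *a);

int main(int argc, char **argv) {
    char a[16];
    strncpy(a, "0123456789abcdef", sizeof(a));  // fills all 16 bytes, no '\0'
    // ... lots of code passes, functions are called ...
    // ... we finally come back to array a ...
    do_something_with_a(a);
    return 0;
}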
void do_something_with_a(char *a) {
    int a_len = 0;
    char new_array[16];
    // Don't know what the length of the 'a' string is, but it's a string, so let's use strlen()!
    a_len = strlen(a);
    // Gonna munge the 'a' string, so let's copy it first into new_array
    strncpy(new_array, a, a_len);  // BUG: a_len (78 here) is larger than sizeof(new_array)
}
You've now just written 78 bytes to a variable that only has 16 bytes allocated to it.
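One common way to avoid the original mistake is to reserve the last byte of the destination for the terminator and write it explicitly. The sketch below assumes that silently truncating the string is acceptable for your use case and that src itself is a properly terminated string; copy_terminated is a hypothetical helper name, not a standard function.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies at most dst_size - 1 characters from src and
 * always null terminates dst. Returns 1 if src fit completely, 0 if it was
 * truncated (or if dst_size is 0 and nothing could be written).
 * Assumes src is a valid, null-terminated string. */
static int copy_terminated(char *dst, size_t dst_size, const char *src) {
    if (dst_size == 0) {
        return 0;
    }
    strncpy(dst, src, dst_size - 1);  /* leave the last byte alone */
    dst[dst_size - 1] = '\0';         /* guarantee a terminator */
    return strlen(src) < dst_size;
}
```

Calling copy_terminated(a, sizeof(a), "0123456789abcdef") with char a[16] keeps only the first 15 characters, so a later strlen(a) cannot run past the end of the buffer. The return value lets the caller notice that truncation happened.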
Buffer Overflows
A buffer overflow occurs when more data is written to a buffer than was allocated for it. Strings are no different, except that many of the string.h functions rely on the null byte to signal the end of a string, as we saw above.
In the example we wrote 78 bytes to a buffer that only has 16 allocated. Not only that, but it's a local variable, which means the buffer has been allocated on the stack. So the last 62 bytes that were written (78 - 16) just overwrote 62 bytes of the stack.
If you write enough data past the end of that buffer, you'll overwrite the other local variable a_len (also not good if you use it later), any frame pointer that was saved on the stack, and then the return address of the function (the exact layout depends on the compiler and platform). Now you have really gone and screwed things up, because the return address is something completely wrong. When the end of do_something_with_a() is reached, bad things happen.
Now we can take the example above a little further.
void do_something_with_a(char *a, char *new_a) {
int a_len = 0;
char new_array[16];
// Don't know what the length of the 'a' string is, but it's a string, so
// let's use strlen()!
a_len = strlen(a);
//
// By the way, copying anything based on a length that's not what you
// initialized the array with is horrible horrible coding. But it's
// just an example.
//
// Gonna munge the 'a' string, so let's copy it first into new_array
strncpy(new_array, a, a_len);
// 'a_len' was on the stack, which we just blew away by writing 62 extra
// bytes past the end of the 'new_array' buffer. So the first 4 bytes after
// the 16 have now been written into a_len. This will still be interpreted
// as a signed int. So if you use the example memory, a_len is now 0xe0f33f5a
//
// ... did some more munging ...
//
// Now I want to return the new munged string in the *new_a variable
strncpy(new_a, new_array, a_len);
// Everything burns
}
I think my comments pretty much explain everything. But in the end you've written a huge amount of data into an array, most likely thinking that you were only writing 16 bytes (with a 32-bit int, 0xe0f33f5a is negative, and a negative int converted to the size_t count that strncpy() expects becomes a very large number). Depending on how this vulnerability manifests itself, it could lead to exploitation via remote code execution.
This is a very contrived example of poor coding, but you can see how things can escalate quickly if you're not careful when working with memory, and copying data. Most of the time the vulnerability will not be this obvious. With large programs you have so much going on that the vulnerability might not be easy to spot, and could be triggered by code multiple function calls away.
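For contrast, here is a rough sketch of how the function could be written if the caller passes the buffer sizes along with the pointers. The function name do_something_with_a_checked and the a_size and new_a_size parameters are assumptions added for illustration; they are not part of the original example, and refusing to copy is just one possible policy.

```c
#include <stddef.h>
#include <string.h>

/* Sketch of a bounds-checked variant. Returns 0 on success, -1 if 'a' is not
 * terminated within a_size bytes or if the string does not fit.
 * a_size and new_a_size are hypothetical parameters added for illustration. */
static int do_something_with_a_checked(const char *a, size_t a_size,
                                       char *new_a, size_t new_a_size) {
    char new_array[16];

    /* Look for the terminator only inside the caller's buffer. */
    const char *end = memchr(a, '\0', a_size);
    if (end == NULL) {
        return -1;                    /* 'a' is not a valid string */
    }
    size_t len = (size_t)(end - a);

    /* Check both destinations before copying anything. */
    if (len >= sizeof(new_array) || len >= new_a_size) {
        return -1;
    }

    memcpy(new_array, a, len + 1);    /* +1 copies the terminator too */
    /* ... munge new_array here ... */
    memcpy(new_a, new_array, len + 1);
    return 0;
}
```

Because every length is checked against the real size of its destination before any bytes move, the unterminated 16-byte array from the earlier example is rejected instead of being copied.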
For more on how buffer overflows work, see the links under Further Reading below.
And before anyone mentions it, I ignored endianness when referencing the memory for the sake of simplicity.
Further Reading
Full Description of the Vulnerability
Common Weakness Enumeration (CWE) entry
Secure Coding Strings Presentation (PDF automatically downloads)
University of Pittsburgh - Secure Coding C/C++: String Vulnerabilities (PDF)