
According to reports on the recent VLC vulnerability and other sources, buffer overreads can apparently cause arbitrary code execution. How is that possible? Consider the following toy example:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void badcpy(const char* src, char* dst, int n) {
    for(int i = 0; i < n; i++)
        dst[i] = src[i];
    dst[n] = '\0';
}

int main(int argc, char** argv) {
    const char* str = "I'm being overread!";

    int n = argc > 1 ? atoi(argv[1]) : strlen(str);

    char* buf = (char*)malloc(n + 1);
    badcpy(str, buf, n);

    for(int i = 0; i < n; i++)
        buf[i] += 42;
    printf("%s", buf);
    free(buf);

    return 0;
}

The worst that could happen here is that the application crashes or leaks some value from memory that it shouldn't; no arbitrary code execution ever takes place.

user212957
  • "Reading" data from a buffer doesn't just mean printing it out. Data stored in memory can be used for a lot of things. – Nic Jul 24 '19 at 14:32
  • Remember also that a lot of buffer overreads cause later buffer overflows. – forest Jul 25 '19 at 02:37
  • Also, technically since you're not checking the return value of your `malloc` this could cause an out-of-bound write. It'd be at `NULL` though, so it would still just crash. – CBHacking Jul 25 '19 at 02:47
  • There is a lot of misinformation about the VLC "vulnerability". VLC has publicly commented on it, and you can read the statement here: https://twitter.com/videolan/status/1153963312981389312 TL;DR: It's a vulnerability in a 3rd party library that has been patched since version 3.0.3 (end of May 2018), and the media and MITRE Corp blew the incident out of proportion. – Artog Jul 25 '19 at 09:51

3 Answers


Here's a simple example:

struct {
    char userControlledIndex[2]; // single-character string
    char whocares[6]; // some more data of arbitrary purpose
} blob;

Our user is malicious:

memcpy(blob.userControlledIndex, "62", 2); // Look ma, no null terminator!
strcpy(blob.whocares, "177"); // or {'1','7','7',0,0,0}, maybe not attacker controlled?

Here's where the input is consumed:

typedef void (*func)(); // func is a pointer to a parameterless function
func func_table[10] = { f0, f1, ... , f9 }; // an array of 10 nice safe functions
int index = atoi(blob.userControlledIndex); // totally safe, will be one of 0-9
#if DEBUG
printf("index is %d\n", index); // no possible way this prints "62177", right?
#endif
func_table[index](); // invokes one of ten safe functions

Overreading the string (char buffer) resulted in control over the program flow. If the attacker knows where the func_table variable will be relative to executable program code, and can find a suitable target within the range of "indices" they can select, then they can use this as the entry point for a ROP or return-to-libc attack.
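As a rough, self-contained sketch of the parsing step only (not the function-table call), the blob can be modelled as one flat 8-byte buffer so that the overread stays inside a single array. The names `parse_index_unsafe`, `parse_index_checked`, `fill_malicious_blob` and the size constants are hypothetical, introduced here for illustration; the checked variant is just one possible way to bound the read.

```c
#include <stdlib.h>
#include <string.h>

#define INDEX_FIELD_LEN 2   /* bytes reserved for the index "string" */
#define BLOB_LEN        8   /* index field followed by 6 unrelated bytes */
#define TABLE_SIZE      10  /* number of entries in the function table */

/* Unsafe: assumes the 2-byte index field is NUL-terminated. */
int parse_index_unsafe(const char *blob) {
    /* atoi keeps consuming digits past the end of the index field. */
    return atoi(blob);
}

/* Checked: read at most the field width, terminate, then range-check.
 * Returns -1 if the value is not a valid table index. */
int parse_index_checked(const char *blob) {
    char tmp[INDEX_FIELD_LEN + 1];
    memcpy(tmp, blob, INDEX_FIELD_LEN);
    tmp[INDEX_FIELD_LEN] = '\0';
    int index = atoi(tmp);
    return (index >= 0 && index < TABLE_SIZE) ? index : -1;
}

/* Builds the malicious input: "62" with no terminator, then "177". */
void fill_malicious_blob(char blob[BLOB_LEN]) {
    memset(blob, 0, BLOB_LEN);
    memcpy(blob, "62", INDEX_FIELD_LEN);
    memcpy(blob + INDEX_FIELD_LEN, "177", 3);
}
```

With the malicious blob, the unsafe parser yields 62177 while the checked parser rejects the input, because even the two bytes it does read ("62") fall outside 0-9.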

Of course, such attacks are complicated by the presence of ASLR... but buffer overreads can also give you the information necessary to defeat ASLR, by exposing a masked pointer's value inadvertently.
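The information-leak side can be sketched in the same spirit. This assumes a made-up record layout (a 16-byte name field followed by a stored address) and an "echo" routine that trusts a caller-supplied length; `record_init`, `echo_unsafe`, `echo_checked` and `recover_address` are hypothetical names, not taken from any real program. The record is again one flat byte array, so the sketch itself stays within bounds.

```c
#include <stdint.h>
#include <string.h>

#define NAME_LEN   16
#define RECORD_LEN (NAME_LEN + sizeof(uintptr_t))

/* Hypothetical layout: NAME_LEN bytes of name, then a saved address. */
void record_init(unsigned char rec[RECORD_LEN], const char *name,
                 uintptr_t secret_addr) {
    memset(rec, 0, RECORD_LEN);
    strncpy((char *)rec, name, NAME_LEN - 1);
    memcpy(rec + NAME_LEN, &secret_addr, sizeof secret_addr);
}

/* Unsafe echo: copies however many bytes the requester asks for,
 * so a request longer than NAME_LEN reads past the name field. */
size_t echo_unsafe(const unsigned char *rec, size_t requested,
                   unsigned char *out) {
    memcpy(out, rec, requested);
    return requested;
}

/* Checked echo: clamps the length to the name field. */
size_t echo_checked(const unsigned char *rec, size_t requested,
                    unsigned char *out) {
    size_t n = requested < NAME_LEN ? requested : NAME_LEN;
    memcpy(out, rec, n);
    return n;
}

/* Attacker side: pull the address out of an over-long reply. */
uintptr_t recover_address(const unsigned char *reply) {
    uintptr_t leaked;
    memcpy(&leaked, reply + NAME_LEN, sizeof leaked);
    return leaked;
}
```

Requesting `RECORD_LEN` bytes from `echo_unsafe` hands back the stored address along with the name. A leaked address like this is the kind of information that lets an attacker compute where other code or data lives despite ASLR.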

CBHacking
  • Is that more technically not a _write_ buffer overflow, in allowing the **two** characters of user-controlled data (`62`) to be written to `userControlledIndex` where only **one** character should have been allowed to be written? The fact that the _effects_ of the overwrite don't trigger until `userControlledIndex` is read doesn't (IMHO) make this a read overflow. – TripeHound Jul 25 '19 at 09:41
  • No, `userControlledIndex` is a two-byte buffer *and that's all it receives*. It was certainly a failure of input validation to allow the second byte to be anything other than `'\0'`, but the buffer received no more than it had room for. Another error was assuming that the index was in range without validating it, but pretty much any overread or overflow will be due to missing (or incorrect) input and/or range validation. Even the actual exploit was technically an **overread**, not an overflow; I'm not *writing* anything to `func_table`, just reading a value (way) off the end of the array. – CBHacking Jul 25 '19 at 18:57

It depends entirely on what the read data is used for. Imagine you could trigger an overread that reads into some attacker-controlled data. All that's needed for code execution is for the read data to be used in a way that makes an exploit possible, whether or not the data is executed directly. Unlike an overflow, however, an overread does not automatically imply a risk of code execution. In fact, most buffer overreads cause either "harmless" crashes (DoS) or infoleaks, but not all.

forest

Imagine a program looks something like this in memory

|--- program code ----|-- 64 bytes of buffer space --|-- more program code ---|

If you can trick the program into reading more than 64 bytes then you can change what is in the more program code section. In most cases this will be gibberish and the program will simply crash, but if you are a serious attacker then you will have a copy of the program, and a compiler, and you can carefully construct the string you send so that the overwritten section is in fact valid, executable program code. Then when the program naturally reaches that section, it will execute whatever you have written there. You can use techniques like NOP sliding to give yourself a little leeway here.

A variant on this is ROP, where memory looks like this

|-- some variable to be passed to a function --|- the return address -|

You overflow that variable and then you can set the return address to whatever you want. Again, if you have a copy of the program and a replica of the target environment, you can conduct extensive experiments and only deploy your attack when you are ready. (A replica is very easy to build these days, since everything is x86 or x86_64 running either a free OS or commonly available Windows; you would figure this out during the reconnaissance phase.) You don't do these kinds of hacks "on the fly" the way hacking is shown on TV; they are done through meticulous preparation, known as the weaponisation phase.

Gaius
  • This question is about buffer overREADS (See https://cwe.mitre.org/data/definitions/125.html ). You are talking about writing to memory in your answer - it's not useful for this. I believe your answer more correctly addresses "Out-of-bound write", "buffer overflow", etc... – the_endian Jun 22 '21 at 19:11