I am trying to understand how "Integer Overflow" happens here and how it works.

The vulnerability is in the handling of the “tx3g” chunk. The attacker-controlled chunk_size is added to size, and that sum can overflow. That is to say, the buffer that gets allocated is smaller than the data written into it, so the memcpy (and the readAt that follows) will cause a heap overflow.

case FOURCC('t', 'x', '3', 'g'):
{
    uint32_t type;
    const void *data;
    size_t size = 0;
    if (!mLastTrack->meta->findData(
            kKeyTextFormatData, &type, &data, &size)) {
        size = 0;
    }
    uint8_t *buffer = new uint8_t[size + chunk_size]; // <---- Integer overflow here
    if (size > 0) {
        memcpy(buffer, data, size);                   // <---- Oh dear.
    }
    if ((size_t)(mDataSource->readAt(*offset, buffer + size, chunk_size))
            < chunk_size) {
        delete[] buffer;
        buffer = NULL;
        return ERROR_IO;
    }
    mLastTrack->meta->setData(
            kKeyTextFormatData, 0, buffer, size + chunk_size);
    delete[] buffer;
    *offset += chunk_size;
    break;
}

Note that chunk_size is a uint64_t that is parsed from the file; it’s completely controlled by the attacker and is not validated with regards to the remaining data available in the file.

If we try to exploit it with an MP4 file like this:

0000000: 0000 0014 6674 7970 6973 6f6d 0000 0001  ....ftypisom....
0000010: 6973 6f6d 0000 0020 7472 616b 0000 0018  isom... trak....
0000020: 7478 3367 4141 4141 4141 4141 4141 4141  tx3gAAAAAAAAAAAA
0000030: 4141 4141 0000 0001 7478 3367 ffff ffff  AAAA....tx3g....
0000040: ffff ffff 4242 4242 4242 4242 4242 4242  ....BBBBBBBBBBBB
0000050: 4242 4242 4242 4242 4242 4242 4242 4242  BBBBBBBBBBBBBBBB
0000060: 4242 4242                                BBBB

This is what should show up while debugging:

MPEG4Extractor: Identified supported mpeg4 through LegacySniffMPEG4.
MPEG4Extractor: trak: new Track[20] (0xb6048160)
MPEG4Extractor: trak: mLastTrack = 0xb6048160
MPEG4Extractor: tx3g: size 0 chunk_size 24
MPEG4Extractor: tx3g: new[24] (0xb6048130)
MPEG4Extractor: tx3g: mDataSource->readAt(*offset, 0xb6048130, 24)
MPEG4Extractor: tx3g: size 24 chunk_size 18446744073709551615
MPEG4Extractor: tx3g: new[23] (0xb6048130)
MPEG4Extractor: tx3g: memcpy(0xb6048130, 0xb6048148, 24)
MPEG4Extractor: tx3g: mDataSource->readAt(*offset, 0xb6048148, 18446744073709551615)

Here is my question:

If I set the chunk size to 0xffffffffffffffff, why is it interpreted as "-1", so that the sum becomes "24-1", i.e. "23", in this line:

uint8_t *buffer = new uint8_t[size + chunk_size]; // <---- Integer overflow here

I see it here in debug:

MPEG4Extractor: tx3g: new[23] (0xb6048130)

and not "24+18446744073709551615", which I think should result in "0"?

Maybe I didn't explain it well enough, or there is an error in my thinking. Here is the link to the original blog entry explaining this integer overflow.

RoraΖ
dev
  • 0xffffffffffffffff is the maximum value of a 64-bit unsigned variable. If you add anything to it, it wraps around the limit, effectively being interpreted as -1 – paj28 Sep 19 '15 at 18:08
  • Thanks paj28! Hmmmm, I thought that an unsigned variable cannot have a sign ("-") and wraps to 0 when exceeded. So -1 is the maximum value it can have no matter how much I add to it? Can I somehow wrap it to a different value, for example -2? – dev Sep 19 '15 at 18:30
  • is 0xFFFFFFFFFFFFFFFE -2 decimal? will it fit into this variable (64-bit unsigned) ? But it is signed ("-")? Hhmmmm? Can you explain this? I think I am missing something here .... – dev Sep 19 '15 at 18:38
  • Roughly speaking, yes 0xFFFFFFFFFFFFFFFE is -2. The distinction between signed and unsigned is subtle. This is about the limit of what I can explain here. I suggest you read up more about this - and crucially, experiment by writing some code and seeing how these things work. – paj28 Sep 19 '15 at 19:20
  • Thanks again! Do you know some good resource/tutorial on this? Maybe I don't have the right skills, but I am trying to understand it more with help from the community. I wrote a small C program. "uint64_t chunk_size=0xffffffffffffffff; printf("chunk_size: %"PRId64"\n", chunk_size);" prints "chunk_size: -1". Huh? I didn't add anything to it .... – dev Sep 19 '15 at 19:57
  • and uint64_t chunk_size=0xfffffffffffffffe; printf("chunk_size: %"PRId64"\n", chunk_size+24); 0xfffffffffffffffe fits into unsigned 64 var and is seen as -2 and it prints out "chunk_size: 22" – dev Sep 19 '15 at 19:58
  • Nebula Exploit Exercises is a good training course (although I struggled to find it online just now). It covers a whole load of stuff, and I think integer overflows are in there. – paj28 Sep 19 '15 at 21:06
  • You got -1 because you didn't print it as an unsigned so `0xffffffffffffffff` was interpreted as a signed number and evaluates to -1. Computers use 2's complement to represent negative numbers. You can read about that representation at http://www.cs.cornell.edu/~tomf/notes/cps104/twoscomp.html – Neil Smithline Sep 19 '15 at 22:16
  • OK thanks for the tip, I read through them, but still have questions. "printf("%" PRIu64 "\n", chunk_size);" prints it correctly, but why does "printf("%" PRIu64 "\n", chunk_size+24);" evaluate to 23 and not to -1? paj28: As far as I understood, you said if I add anything, for example 24, to 0xffffffffffffffff it will evaluate to -1. How is this addition made? – dev Sep 20 '15 at 11:21
  • I think I found the answer to wrapping. http://stackoverflow.com/questions/16056758/c-c-unsigned-integer-overflow "UINT_MAX + 1 == 0 UINT_MAX + 2 == 1 UINT_MAX + 3 == 2 .. and so on". So in my case 0xffffffffffffffff+24 = 23. So this is the integer overflow? That with max chunk_size it wraps? So I can also make "uint8_t *buffer = new uint8_t[size + chunk_size];" allocate 0 bytes with a chunk_size of 0xffffffffffffffe8, and then I will be writing out of bounds in "memcpy(buffer, data, size);"? – dev Sep 20 '15 at 12:36
  • @android_dev - no, I said if you add anything to UINT_MAX it wraps, which means UINT_MAX is effectively -1, not the result. – paj28 Sep 20 '15 at 15:14

2 Answers

"24+18446744073709551615", which I think should result in "0"?

The concept you may be missing is called two's complement, and is by far the most common way to represent integers in computing. Two's complement has many interesting properties, such as each possible binary value mapping to exactly one integer (for a given type, including signedness), and (for signed integers) there being one more negative number than positive (e.g. signed char where char is 8 bits has a range from -128 to +127). The relevant ones here, though, are:

  • Signed and unsigned integers share the same binary representation; only the interpretation of the bits differs. For example, with a byte, 0 to 127 are the same for signed and unsigned, and the bit patterns for 128 to 255 (unsigned) map exactly onto -128 to -1 (signed), i.e. the unsigned value minus 256. It also means that, for any integer TYPE, UTYPE_MAX (the maximum value for the unsigned variant of TYPE) is equal to -1 in the signed representation of that type. In other words, ((int)UINT_MAX) == -1 is true on a two's complement machine.
  • There is no need to special case negative numbers for addition and subtraction. Adding a positive and a negative number naturally operates like the subtraction of the absolute values, and vice versa. In other words, for addition and subtraction, the ALU doesn't need different logic for signed and unsigned (it still does for some other operations), and nor does the compiler.
  • Overflows (and underflows) just naturally wrap around; for an unsigned type UTYPE (at least as wide as int), UTYPE_MAX + 1 always equals 0, and (UTYPE)0 - 1 always equals UTYPE_MAX. This behavior is actually required by the C specification for unsigned types, so it's very convenient. [1]

Given these facts, the result makes sense. 0xffffffffffffffff + 1 is 0. 0xffffffffffffffff + 2 is 1. 0xffffffffffffffff + 24 is 23. In other words, even though 0xffffffffffffffff has an unsigned type, it is mathematically equivalent to -1, at least for addition and subtraction.


[1] It also means that for a signed TYPE, (TYPE_MAX + 1) == TYPE_MIN and (TYPE_MIN - 1) == TYPE_MAX. However, this is not only not required behavior in C, it is explicitly undefined behavior; per the language spec, the computer is permitted to burn your house down and shoot your dog if signed integer over- or underflow ever happens. In practice, since the vast majority of CPUs use two's complement, this is only relevant in compiler optimizations; such as if you ever write code like this:

int v = getSignedVal();
int s = v + 5;
// But what if v > (INT_MAX - 5)? Better check for overflow!
if (s < v) {
    // It doesn't matter what you write here!
    // The compiler will optimize it out - and the check itself -
    // because this "can't happen" since adding a positive value (5)
    // to any integer (v) can "never" make the sum less.
}

There are ways to perform this check safely - for example, simply checking directly whether v > (INT_MAX - 5), in which case s would overflow, works fine - but you have to be cognizant of this risk!

CBHacking
Simply test it yourself by writing small C++ files, which should be self-explanatory:

vul.cpp

integeroverflow

In this example you would allocate a buffer of only size 1.