Crash Course in Computer Architecture
In Intel's x86 and x64 architectures there is something called the stack. This is essentially where everything needed to determine the execution path is stored: parameters to functions, local variables, and return addresses all live on the stack. CPU registers keep track of where in the stack the program is executing. You can push values and memory addresses onto the stack, each up to the architecture's bit-size.
What do I mean by that? If you're on a 32-bit system, then each value you push onto the stack will essentially be an unsigned int 32 bits in size (4 bytes). If you're on a 64-bit system it will be 64 bits in size (8 bytes). Below is an example[1] of what the stack looks like for a function:
uint32_t function(int a, int b, int c, int d, int e, int f, int g, int h);
The example above is for x64 (using the System V calling convention found on Linux and macOS). CPU registers RDI, RSI, RDX, RCX, R8 and R9 store the first 6 parameters, and the rest are pushed onto the stack. The return address, g, and h will be 8-byte values. g might only equal 0x10, but when you push it on the stack it will look like 0x0000000000000010.
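To make the padding concrete, here is a minimal sketch in C that formats a small value the way it would appear in a full 8-byte stack slot. The function name format_stack_slot is made up for this illustration; it only prints the value and does not touch the real stack.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: write `value` into `out` as "0x" followed by
 * 16 hex digits, i.e. the way a 64-bit stack slot would display it.
 * `out` needs room for at least 19 bytes (2 + 16 + terminating NUL).
 * Returns the number of characters written, not counting the NUL. */
int format_stack_slot(uint64_t value, char *out, size_t out_len)
{
    return snprintf(out, out_len, "0x%016" PRIx64, value);
}
```

Calling format_stack_slot(0x10, buf, sizeof buf) with a 32-byte buf leaves the string 0x0000000000000010 in buf, matching the value of g described above.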
After the parameters are pushed onto the stack the call instruction is executed. This pushes the return address onto the stack and jumps to the function for execution. Because the stack grows towards low address space, each time you push something onto the stack the top moves closer to zero. In the picture above you'll also see the local variables xx, yy, and zz. Each of these also sits further down the stack, towards lower memory. Of course, realize that the program is always manipulating the top of the stack.
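The downward growth can be sketched with a toy model in C. This is not how the CPU implements the stack; it is an array of 8-byte slots with an index standing in for the stack pointer, and all names here (ToyStack, toy_push, toy_pop) are invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_SLOTS 8

/* Toy model of a stack made of 8-byte slots. Index 0 is the lowest
 * "address"; index STACK_SLOTS - 1 is the highest. */
typedef struct {
    uint64_t slots[STACK_SLOTS];
    size_t sp; /* index of the current top; STACK_SLOTS means empty */
} ToyStack;

void toy_init(ToyStack *s)
{
    s->sp = STACK_SLOTS;
}

/* Push moves the top towards index 0, i.e. towards lower memory.
 * Returns 0 on success, -1 if the toy stack is full. */
int toy_push(ToyStack *s, uint64_t value)
{
    if (s->sp == 0) {
        return -1;
    }
    s->sp--;
    s->slots[s->sp] = value;
    return 0;
}

/* Pop reads the top value and moves the top back towards higher memory.
 * Returns 0 on success, -1 if the toy stack is empty. */
int toy_pop(ToyStack *s, uint64_t *value)
{
    if (s->sp == STACK_SLOTS) {
        return -1;
    }
    *value = s->slots[s->sp];
    s->sp++;
    return 0;
}
```

Pushing a parameter and then a return address leaves the return address in the lower slot, and every push makes sp smaller, which is the "moves closer to zero" behaviour described above.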
Stack Overflow
Let's say you create a local variable that is a buffer with a maximum of 12 bytes, something like unsigned char buffer[12];. The stack makes space for 12 bytes of data. Let's say we fill this buffer with "012345678912"; it's going to look like this on the stack (note 1):
High
...
32313938
37363534
33323130
...
Low
The beginning of the buffer will always be towards lower memory[2], so when you write into a buffer you're always writing from Low to High. If you write more data than you have allocated space for, you have a buffer overflow.
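The three words in the diagram can be reproduced with a short C sketch. The helper names read_le32 and buffer_words are made up for this example, and the words are assembled by hand with shifts so the result does not depend on the byte order of the machine running it.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read 4 bytes starting at p as a little-endian 32-bit word:
 * the byte at the lowest address becomes the least significant byte. */
uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Fill a 12-byte buffer with the digits from the text and report the
 * three 4-byte words, lowest address first. */
void buffer_words(uint32_t words[3])
{
    unsigned char buffer[12];

    /* Exactly 12 bytes are copied, so the string's NUL is left out. */
    memcpy(buffer, "012345678912", sizeof buffer);

    for (size_t i = 0; i < 3; i++) {
        words[i] = read_le32(buffer + 4 * i);
    }
}
```

After buffer_words(w), w[0] is 0x33323130, w[1] is 0x37363534, and w[2] is 0x32313938, which is the diagram read from Low to High.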
Endianness
Now you want to overwrite the return address on the stack. You put the new address at the end of your string so that, as the copy writes over higher memory, it overwrites the return address with the address you want there instead (note 2). Intel throws another curve ball at you: x86 and x64 systems are little-endian, which means that the least significant byte is stored at the smallest address.
So you'll want the least significant byte of the address (0x32) written to lower memory. Because you're writing from low to high memory, you'll write the address into memory backwards. When viewing the address from high to low (as the example above shows) the memory address will look correct. However, when viewing from low to high it will be backwards. The important aspect to remember is that the architecture requires the LSB to be written to lower memory.
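Here is a rough simulation of the idea in C, under some loud assumptions: a 32-bit frame where the 12-byte buffer is followed directly by a 4-byte saved EBP and then the 4-byte return address. The frame is an ordinary byte array, so nothing is actually overflowed, and every name (write_le32, build_input, simulate_overflow, the offsets) is hypothetical. Real compilers may lay out frames differently.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum {
    BUF_SIZE   = 12, /* the buffer from the text */
    RET_OFFSET = 16, /* assumed: buffer, then 4-byte saved EBP, then return address */
    FRAME_SIZE = 20  /* size of the simulated frame in bytes */
};

/* Store a 32-bit value least significant byte first (little-endian). */
void write_le32(unsigned char *dst, uint32_t value)
{
    dst[0] = (unsigned char)(value & 0xFF);
    dst[1] = (unsigned char)((value >> 8) & 0xFF);
    dst[2] = (unsigned char)((value >> 16) & 0xFF);
    dst[3] = (unsigned char)((value >> 24) & 0xFF);
}

/* Read 4 bytes back as a little-endian 32-bit value. */
uint32_t read_le32(const unsigned char *src)
{
    return (uint32_t)src[0]
         | ((uint32_t)src[1] << 8)
         | ((uint32_t)src[2] << 16)
         | ((uint32_t)src[3] << 24);
}

/* Build the input: filler for the buffer, filler for the saved EBP
 * slot, then the new address with its LSB at the lowest position. */
size_t build_input(unsigned char *out, uint32_t new_return)
{
    memset(out, 'A', BUF_SIZE);
    memset(out + BUF_SIZE, 'B', RET_OFFSET - BUF_SIZE);
    write_le32(out + RET_OFFSET, new_return);
    return FRAME_SIZE;
}

/* Simulate an unchecked copy into the buffer at the bottom of the
 * frame and return whatever ends up in the return address slot. */
uint32_t simulate_overflow(uint32_t original_return, uint32_t new_return)
{
    unsigned char frame[FRAME_SIZE] = {0};
    unsigned char input[FRAME_SIZE];

    write_le32(frame + RET_OFFSET, original_return);

    size_t n = build_input(input, new_return);
    memcpy(frame, input, n); /* stays inside frame[], so this is safe */

    return read_le32(frame + RET_OFFSET);
}
```

With these assumptions, simulate_overflow(0x08048abc, 0xdeadbeef) returns 0xdeadbeef: the copy ran from low to high, past the 12 bytes of the buffer, and replaced the original return address. Inside the input the address bytes sit in the order ef, be, ad, de, which is the "backwards" layout the paragraph describes.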
1 - I'm using a 32-bit system here because a 4-byte width is easier to draw out.
2 - The saved EBP also sits on the stack here. I'm ignoring it because it's not important to the discussion. It's very important to remember, but just kind of ignore its existence for now.