Static and dynamic analysis of binaries


Reverse Engineering 101: Static and dynamic analysis of binaries

Preamble
This presentation will focus on reverse engineering x86 Linux ELF binaries. The same ideas can be transferred over to Windows PE and Mach-O binaries.

What is Reverse Engineering?
Traditionally it means understanding something by taking it apart and analysing its components. In terms of IT, we can take apart binaries to learn exactly how they work at the instruction level. There are two main techniques: Static Analysis and Dynamic Analysis.

Why learn it?
It’s fun to learn new skills. It’s a problem-solving skill that’ll help you in many other areas of InfoSec research, such as Binary Exploitation. Learning RE techniques can give you a greater appreciation for the complexities of computers. IDA looks pretty dope when you have it open. It is also a great skill to have.

Why learn it? Continued
You can analyse vulnerable programs to find out exactly why they’re vulnerable. This can lead to developing patches for software. You could also develop exploits tailored to a binary, if you really wanted to… 😏

What will I need to get started?
Tools you’ll need to start your journey into RE:
- IDA (demo version available): https://www.hex-rays.com/products/ida/support/download.shtml
- GNU Binutils: install via your *nix package manager
- GDB: as above
- A text editor: to take notes
If you don’t want to use IDA you can use a similar tool such as Hopper or Binary Ninja.

Okay, so let’s get started.

Static Analysis Part I

What is Static Analysis?
To analyse a binary statically is to do so without executing it. The binary is analysed as a file, not as a running process. Static Analysis is done with disassemblers, decompilers, hex editors, and other similar tools. We’ll focus on disassemblers.

Disassemblers
A disassembler takes your binary as input and translates the machine instructions it contains into a low-level language such as assembly. For the x86 Linux ELFs we’ll see later, the output will be x86 Intel Syntax Assembly.

Disassemblers Continued
Most modern disassemblers can:
- Find embedded symbols (basically, names for variables, functions, etc.).
- Explicitly interpret specific parts of the binary as either data or text (code), i.e. map the binary’s regions.
- Find structs, enums, strings, functions, types, etc.
- Make a pretty graph showing the control flow of the program.
- Show imports / exports of libraries and external functions.
- … and more!
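As a rough illustration of the "find strings" capability listed above, here is a minimal sketch of how a tool might scan raw bytes for printable runs. The function name find_strings and the minimum-length parameter are assumptions made for this example; real disassemblers use more elaborate heuristics (encodings, cross-references, section information).

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: print every run of at least min_len printable ASCII
 * bytes found in buf, together with its offset, and return how many runs
 * were printed. Pass min_len >= 1.
 */
size_t find_strings(const unsigned char *buf, size_t len, size_t min_len)
{
    size_t found = 0;
    size_t start = 0;                     /* offset where the current run began */

    for (size_t i = 0; i <= len; i++) {
        /* Treat the end of the buffer like a non-printable byte. */
        if (i < len && isprint(buf[i]))
            continue;

        if (i - start >= min_len) {       /* run [start, i) is long enough */
            printf("%#zx: %.*s\n", start, (int)(i - start),
                   (const char *)buf + start);
            found++;
        }
        start = i + 1;                    /* next run starts after this byte */
    }
    return found;
}
```

Calling find_strings on the contents of a binary with a minimum length of 4 is a common starting point; shorter thresholds tend to report a lot of noise from bytes that merely happen to be printable.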

Static Analysis of a Very Simple C Program
To introduce the functionality of static and dynamic analysis we’ll be using the following code.

    #include <stdio.h>
    int main() {
        int a = 1;
        int b = 2;
        a = a + b;
        return (0);
    }

This C code is compiled into an ELF, simple.elf, using gcc -m32 -O0 -o simple.elf followed by the name of the C source file. The compiled x86 machine instructions, represented in hex, look like this:

    … 55 89 e5 83 ec 0c 31 c0 c7 45 fc 00 00 00 00 c7 45 f8 01 00 00 00 c7 45 f4 02 00 00 00 8b 4d f8 03 4d f4 89 4d f8 83 c4 0c 5d c3 …

On its own this doesn’t mean much, but you could translate it into assembly if you had an x86 opcode lookup table. You can see the hex by using a hex editor or objdump -d simple.elf, which prints the raw bytes next to each disassembled instruction.
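To make the "opcode lookup table" idea concrete, here is a minimal sketch of a table-driven decoder that handles only the handful of 32-bit encodings appearing in the bytes above. The names decode_one, disassemble and SIMPLE_MAIN are assumptions for this example, and the decoder is nowhere near a complete x86 disassembler: anything it does not recognise makes it stop.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

static const char *const REG32[8] = {"eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"};
static const char *const GRP1[8]  = {"add", "or", "adc", "sbb", "and", "sub", "xor", "cmp"};

/* The machine code of main, copied from the hex dump on the slide. */
static const uint8_t SIMPLE_MAIN[] = {
    0x55, 0x89, 0xe5, 0x83, 0xec, 0x0c, 0x31, 0xc0,
    0xc7, 0x45, 0xfc, 0x00, 0x00, 0x00, 0x00,
    0xc7, 0x45, 0xf8, 0x01, 0x00, 0x00, 0x00,
    0xc7, 0x45, 0xf4, 0x02, 0x00, 0x00, 0x00,
    0x8b, 0x4d, 0xf8, 0x03, 0x4d, 0xf4, 0x89, 0x4d, 0xf8,
    0x83, 0xc4, 0x0c, 0x5d, 0xc3,
};

/* Format the r/m operand starting at the ModRM byte p[0] (len >= 1).
 * Supports a plain register (mod = 3) and [reg +/- disp8] (mod = 1, no SIB).
 * Returns the bytes consumed (ModRM + displacement), or 0 if unsupported. */
static size_t format_rm(const uint8_t *p, size_t len, char *out, size_t n)
{
    unsigned mod = p[0] >> 6, rm = p[0] & 7;

    if (mod == 3) {
        snprintf(out, n, "%s", REG32[rm]);
        return 1;
    }
    if (mod == 1 && rm != 4 && len >= 2) {
        int disp = (int8_t)p[1];          /* displacement is a signed byte */
        snprintf(out, n, "DWORD PTR [%s%c0x%x]", REG32[rm],
                 disp < 0 ? '-' : '+', (unsigned)(disp < 0 ? -disp : disp));
        return 2;
    }
    return 0;
}

/* Decode one instruction into out; returns its length, or 0 if unsupported. */
size_t decode_one(const uint8_t *c, size_t len, char *out, size_t n)
{
    char rm[32];
    size_t k;
    uint32_t imm;

    if (len == 0) return 0;
    if ((c[0] & 0xf8) == 0x50) { snprintf(out, n, "push %s", REG32[c[0] & 7]); return 1; }
    if ((c[0] & 0xf8) == 0x58) { snprintf(out, n, "pop %s", REG32[c[0] & 7]); return 1; }
    if (c[0] == 0xc3)          { snprintf(out, n, "ret"); return 1; }

    /* Everything else handled here is followed by a ModRM byte. */
    if (len < 2 || (k = format_rm(c + 1, len - 1, rm, sizeof rm)) == 0) return 0;
    unsigned reg = (c[1] >> 3) & 7;       /* register number or opcode extension */

    switch (c[0]) {
    case 0x89: snprintf(out, n, "mov %s, %s", rm, REG32[reg]); return 1 + k;
    case 0x31: snprintf(out, n, "xor %s, %s", rm, REG32[reg]); return 1 + k;
    case 0x8b: snprintf(out, n, "mov %s, %s", REG32[reg], rm); return 1 + k;
    case 0x03: snprintf(out, n, "add %s, %s", REG32[reg], rm); return 1 + k;
    case 0x83:                            /* group 1: r/m32, sign-extended imm8 */
        if (len < 1 + k + 1) return 0;
        imm = (uint32_t)(int32_t)(int8_t)c[1 + k];
        snprintf(out, n, "%s %s, 0x%x", GRP1[reg], rm, (unsigned)imm);
        return 1 + k + 1;
    case 0xc7:                            /* /0: mov r/m32, little-endian imm32 */
        if (reg != 0 || len < 1 + k + 4) return 0;
        imm = (uint32_t)c[1 + k] | (uint32_t)c[2 + k] << 8 |
              (uint32_t)c[3 + k] << 16 | (uint32_t)c[4 + k] << 24;
        snprintf(out, n, "mov %s, 0x%x", rm, (unsigned)imm);
        return 1 + k + 4;
    }
    return 0;
}

/* Print a listing of code and return the number of instructions decoded. */
size_t disassemble(const uint8_t *code, size_t len)
{
    char text[64];
    size_t count = 0, pos = 0, used;

    while (pos < len && (used = decode_one(code + pos, len - pos, text, sizeof text)) != 0) {
        printf("%4zx: %s\n", pos, text);
        pos += used;
        count++;
    }
    return count;
}
```

Calling disassemble(SIMPLE_MAIN, sizeof SIMPLE_MAIN) from a main function decodes the 43 bytes into 13 instructions, starting with push ebp and ending with ret. Keeping the ModRM handling in one helper mirrors how the encoding itself is organised: most of the opcodes here differ only in mnemonic and operand order.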

Static Analysis of a Very Simple C Program Continued I
Here’s what the output looks like from objdump (from GNU Binutils) using objdump -M intel -d simple.elf. Make sure you include “-M intel” so it produces Intel Syntax Assembly. This is what we’ll be using. I did clean up the output, just to get the assembly code.

objdump, file: simple.elf

    push ebp
    mov ebp, esp
    sub esp, 0xc
    xor eax, eax
    mov DWORD PTR [ebp-0x4], 0x0
    mov DWORD PTR [ebp-0x8], 0x1
    mov DWORD PTR [ebp-0xc], 0x2
    mov ecx, DWORD PTR [ebp-0x8]
    add ecx, DWORD PTR [ebp-0xc]
    mov DWORD PTR [ebp-0x8], ecx
    add esp, 0xc
    pop ebp
    ret

Static Analysis of a Very Simple C Program Continued II
Using IDA we can see it has given us some symbols (var_C, var_8, etc.). These are offsets that can be used with ebp to find local variables in our main function’s stack frame. In more complicated programs that use conditions, we’ll be able to see control flow. We’ll look at that later.
[Figure: simple.elf loaded into IDA]

Static Analysis of a Very Simple C Program Continued III

    push ebp                      ; Save the previous stack frame's base pointer.
    mov ebp, esp                  ; Copy ESP into EBP to set up the new stack frame.
    sub esp, 0xc                  ; Reserve 0xc bytes for local variables.
    xor eax, eax                  ; Clear eax (eax holds the function's return value).
    mov DWORD PTR [ebp-0x4], 0x0  ; Move 0x0 into local variable ebp-0x4.
    mov DWORD PTR [ebp-0x8], 0x1  ; Move 0x1 into local variable ebp-0x8.
    mov DWORD PTR [ebp-0xc], 0x2  ; Move 0x2 into local variable ebp-0xc.
    mov ecx, DWORD PTR [ebp-0x8]  ; Move local variable ebp-0x8 into ecx.
    add ecx, DWORD PTR [ebp-0xc]  ; Add local variable ebp-0xc to ecx.
    mov DWORD PTR [ebp-0x8], ecx  ; Move the value of ecx into local variable ebp-0x8.
    add esp, 0xc                  ; Set ESP back to where it was before the locals were reserved.
    pop ebp                       ; Restore the previous base pointer.
    ret                           ; Pop the return address into EIP.

You should have some background in reading basic Intel Syntax Assembly. If not, that’s fine. Most instructions are self-explanatory. If you want to know more about how registers, stack frames, and calling conventions work you can view the “Binary Exploitation” slides on the Wiki.
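One way to check your reading of the annotated listing is to replay it by hand in C. The sketch below is only a model built for this purpose: the Machine struct, the slot helper and run_simple_main are hypothetical names, the locals are stored in a small array rather than on a real stack, and the prologue and epilogue (push ebp, sub esp, pop ebp and so on) are deliberately left out.

```c
#include <stdint.h>

/* Hypothetical model of the state that main touches: two registers and the
 * three 4-byte locals. local[n] stands for the DWORD at [ebp - 4*n];
 * local[0] is unused. */
typedef struct {
    uint32_t eax, ecx;
    uint32_t local[4];
} Machine;

/* Address a local by its byte offset below ebp: slot(m, 0x8) is [ebp-0x8]. */
static uint32_t *slot(Machine *m, unsigned offset)
{
    return &m->local[offset / 4];
}

/* Replay the body of main one instruction at a time and return eax. */
uint32_t run_simple_main(Machine *m)
{
    m->eax = 0;                   /* xor eax, eax                  */
    *slot(m, 0x4) = 0x0;          /* mov DWORD PTR [ebp-0x4], 0x0  */
    *slot(m, 0x8) = 0x1;          /* mov DWORD PTR [ebp-0x8], 0x1  (a = 1) */
    *slot(m, 0xc) = 0x2;          /* mov DWORD PTR [ebp-0xc], 0x2  (b = 2) */
    m->ecx = *slot(m, 0x8);       /* mov ecx, DWORD PTR [ebp-0x8]  */
    m->ecx += *slot(m, 0xc);      /* add ecx, DWORD PTR [ebp-0xc]  */
    *slot(m, 0x8) = m->ecx;       /* mov DWORD PTR [ebp-0x8], ecx  (a = a + b) */
    return m->eax;                /* ret: the caller reads eax     */
}
```

After run_simple_main returns, the slot for [ebp-0x8] holds 3 and eax holds 0, which matches a = a + b followed by return (0) in the C source.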

Dynamic Analysis Part II

What is Dynamic Analysis?
Unlike Static Analysis, you analyse a binary by executing it and following its process of execution. You can perform all the same actions as if you were statically analysing it, but with the advantage of running the code and seeing how it actually modifies registers and memory. This is often quicker. The two main tools used are Debuggers and Memory Editors. We’ll focus on Debuggers (though debuggers can edit memory too).

Debuggers
A debugger takes your binary as input, creates a running process, and attaches itself to that process. The debugger can halt, step through, and modify all aspects of your binary’s running process. We’ll be using GDB.

Debuggers Continued
Most modern debuggers can:
- Do almost everything a disassembler can do, and more.
- Disassemble the instructions in the program, see which instruction is going to run next, and then step through those instructions.
- Read / write memory (heap, stack) and map memory regions.
- Modify and inspect register values.
- Manipulate and track state.

Dynamic Analysis of a Very Simple C Program
Opening simple.elf in GDB using gdb simple.elf and disassembling the main function:

    gdb simple.elf -q
    Reading symbols from simple.elf...(no debugging symbols found)...done.
    (gdb) disassemble main
    Dump of assembler code for function main:
       0x00001f80 <+0>:  push ebp
       0x00001f81 <+1>:  mov ebp,esp
       0x00001f83 <+3>:  sub esp,0xc
       0x00001f86 <+6>:  xor eax,eax
       0x00001f88 <+8>:  mov DWORD PTR [ebp-0x4],0x0
       0x00001f8f <+15>: mov DWORD PTR [ebp-0x8],0x1
       0x00001f96 <+22>: mov DWORD PTR [ebp-0xc],0x2
       0x00001f9d <+29>: mov ecx,DWORD PTR [ebp-0x8]
       0x00001fa0 <+32>: add ecx,DWORD PTR [ebp-0xc]
       0x00001fa3 <+35>: mov DWORD PTR [ebp-0x8],ecx
       0x00001fa6 <+38>: add esp,0xc
       0x00001fa9 <+41>: pop ebp
       0x00001faa <+42>: ret
    End of assembler dump.

Dynamic Analysis of a Very Simple C Program Continued
Setting a breakpoint at the start of the main function, running the program, and disassembling to see which instruction we’ve landed on when it hits the breakpoint (marked with =>):

    (gdb) break *main
    Breakpoint 1 at 0x1f80
    (gdb) run
    Starting program: /Users/nandayo/Desktop/simple.elf
    Breakpoint 1, 0x00001f80 in main ()
    (gdb) disassemble
    Dump of assembler code for function main:
    => 0x00001f80 <+0>:  push ebp
       0x00001f81 <+1>:  mov ebp,esp
       0x00001f83 <+3>:  sub esp,0xc
       0x00001f86 <+6>:  xor eax,eax
       0x00001f88 <+8>:  mov DWORD PTR [ebp-0x4],0x0
       0x00001f8f <+15>: mov DWORD PTR [ebp-0x8],0x1
       0x00001f96 <+22>: mov DWORD PTR [ebp-0xc],0x2
       0x00001f9d <+29>: mov ecx,DWORD PTR [ebp-0x8]
       0x00001fa0 <+32>: add ecx,DWORD PTR [ebp-0xc]
       0x00001fa3 <+35>: mov DWORD PTR [ebp-0x8],ecx
       0x00001fa6 <+38>: add esp,0xc
       0x00001fa9 <+41>: pop ebp
       0x00001faa <+42>: ret
    End of assembler dump.

Dynamic Analysis of a Very Simple C Program Continued
What if we wanted to see the final result of a + b from our C program? Well, we know this line:

    0x00001fa3 <+35>: mov DWORD PTR [ebp-0x8],ecx

is where it moves the value of ecx (the sum of a and b) back into memory location [ebp-0x8], which is the memory address of a. We can print this location after the instruction has executed. Set a breakpoint AFTER the instruction (at main+38), so we know it has executed. Then we can examine 1 DWORD in hex at memory location [ebp-0x8].

    (gdb) break *main+38
    Breakpoint 2 at 0x1fa6
    (gdb) continue
    Continuing.
    Breakpoint 2, 0x00001fa6 in main ()
    (gdb) x/dwx $ebp-0x8
    0xbffffae0: 0x00000003

The value 0x00000003 is THE RESULT!

Quick Static Analysis Test

    push ebp
    mov ebp, esp
    sub esp, 0xc
    xor eax, eax
    mov DWORD PTR [ebp-0x8], 0x5
    mov DWORD PTR [ebp-0xc], 0x4
    mov ecx, DWORD PTR [ebp-0x8]
    sub ecx, DWORD PTR [ebp-0xc]
    mov DWORD PTR [ebp-0x8], ecx
    add esp, 0xc
    pop ebp
    ret

What is the value inside the local variable [ebp-0x8] after this code has run?

Try a real challenge!
Try the “Firetruck” challenge from the C2C 2016 CTF Event. You can find the challenge at: ctf.hacktheplanet.club/challenges#Firetruck
Give it a go before you watch the solution. You can solve this challenge through Static Analysis, but you can use whatever tool you like. I used IDA.

Useful Resources
- IDA Basics: https://www.youtube.com/watch?v=zvWc-XsBKrAs
- GDB / Reversing Basics: https://www.youtube.com/watch?v=VroEiMOJPm8
- Assembly Basics: https://www.youtube.com/watch?v=6jSKldt7Eqs