Reverse Engineering Workshop

Presentation transcript:

Reverse Engineering Workshop
Cathal Mullaney / nemo@rb
Alan Neville / anev@rb

Agenda
1. Reverse Engineering in Security
2. PE File Format
3. Crash Course in Assembly
4. Tools
5. Challenges
Copyright © 2015 Symantec Corporation

Reverse Engineering in Security
Video Time!

A Peek at the PE File Format

Let's start with an example
Simple.exe: 9,216 bytes, compiled in VS2015
1. Using Visual Studio, we can write a simple Hello, World! example in C.
2. We compile the code as an application. This is what it looks like on disk: it takes up 9,216 bytes of space.
3. Looking at the assembly code disassembled from those bytes, we have the following block.
4. On the left side, you can see the op-codes that are generated. This is what the machine reads.
5. And when we run the application, we get the output we expect.

Inside Simple.exe
- C code is translated into op-codes
- Op-codes are represented by bytes
- Highlighted bytes represent the C code written for simple.exe
- Only a total of 20 bytes! But what's all the other stuff?!
- PE Format to save the day!

PE Format Layout
- Standard binary file format for Windows
- Introduced in Windows NT 3.1
- Win32 SDK contains file winnt.h
    - Describes the structure and variables used in PE files
- File imagehlp.dll contains functions to manipulate PE files
- PE files are broken into regions that can be examined
- Developed by Mark Zbikowski at MS in 1990
- Used to bridge gap between DOS and Windows executables
PE file layout: DOS Header, PE Header, Optional Header, Section Table, Sections (code, data, imports), Resources, Overlay

DOS Header
- MS-DOS header starts at offset 0x0
- First two letters are always 'MZ' (bytes 0x4D 0x5A)
    - Aka "magic number" or file signature
    - Determines the type of file
- Many file types use magic numbers
    - Java: 0xCAFEBABE
    - ZIP: 0x504B ("PK")
- MS-DOS header ensures backwards compatibility
    - If the file is executed in the wrong environment, an error message is displayed
    - "This program cannot be run in DOS mode."
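As a rough illustration of the magic-number idea, the sketch below checks the first bytes of a buffer against the three signatures named on the slide. The `identify` helper and the `MAGIC_NUMBERS` table are hypothetical names made up for this example, and real file identification needs far more signatures than these.

```python
# Hypothetical lookup table: only the three signatures mentioned on the slide.
MAGIC_NUMBERS = {
    b"MZ": "DOS/PE executable",
    b"PK": "ZIP archive",
    b"\xCA\xFE\xBA\xBE": "Java class file",
}


def identify(data: bytes) -> str:
    """Return a label for the file type based on its leading bytes."""
    for magic, name in MAGIC_NUMBERS.items():
        if data.startswith(magic):
            return name
    return "unknown"


if __name__ == "__main__":
    print(identify(b"MZ\x90\x00"))          # leading bytes of a typical PE file
    print(identify(b"PK\x03\x04"))          # leading bytes of a ZIP local header
    print(identify(b"\xCA\xFE\xBA\xBE\x00"))
```

To try it on a real file, read the first few bytes with `open(path, "rb").read(4)` and pass them to `identify`.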

PE Header
- Known as the 'File Header'
- Contains useful information
    - Machine: what architecture the file was compiled for
    - Number of sections
    - TimeDateStamp: when the file was compiled
    - SizeOfOptionalHeader: length of the following header
    - Characteristics
        - 0x02: executable file
        - 0x2000: file is a DLL, not an EXE
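The fields above can be read with nothing more than the standard `struct` module. This is a minimal sketch, not a full parser: `build_sample` fabricates a tiny header-only buffer (it is not a runnable executable) so the example is self-contained, and both function names are invented for illustration. It relies on the `e_lfanew` field at offset 0x3C of the DOS header, which holds the file offset of the "PE\0\0" signature.

```python
import struct

FILE_HEADER_FORMAT = "<HHIIIHH"  # 20-byte file header, little-endian


def build_sample() -> bytes:
    """Fabricate a minimal DOS header + PE signature + file header (illustrative only)."""
    dos = bytearray(64)
    dos[0:2] = b"MZ"
    struct.pack_into("<I", dos, 0x3C, 64)  # e_lfanew: PE signature follows the DOS header
    file_header = struct.pack(
        FILE_HEADER_FORMAT,
        0x014C,      # Machine: Intel 386
        3,           # NumberOfSections
        0x55AA55AA,  # TimeDateStamp (arbitrary sample value)
        0, 0,        # symbol table pointer and symbol count
        224,         # SizeOfOptionalHeader
        0x0102,      # Characteristics: executable image, 32-bit machine
    )
    return bytes(dos) + b"PE\0\0" + file_header


def parse_file_header(data: bytes) -> dict:
    """Locate the PE signature via e_lfanew and decode the file header."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    (machine, n_sections, timestamp, _sym_ptr, _n_syms,
     opt_size, characteristics) = struct.unpack_from(FILE_HEADER_FORMAT, data, e_lfanew + 4)
    return {
        "machine": machine,
        "number_of_sections": n_sections,
        "time_date_stamp": timestamp,
        "size_of_optional_header": opt_size,
        "is_executable": bool(characteristics & 0x0002),
        "is_dll": bool(characteristics & 0x2000),
    }


if __name__ == "__main__":
    for key, value in parse_file_header(build_sample()).items():
        print(key, value)
```

On a real file, `parse_file_header(open(path, "rb").read())` should work the same way, since only the header bytes are inspected.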

Optional PE Header
- Despite its name, it is not optional
    - Required in executable files, not in object files
- Occurs directly after the PE Header
- Contains LOTS of information
    - Version info of the compiler
    - Size of the image
    - Checksum: ensures file integrity
    - And more!
- AddressOfEntryPoint
    - Most important field here!

Section Table
- PE header defines a number of sections
- Code is placed in these sections
- Each section definition is 40 bytes in length
    - Section name (.text, .data etc.)
    - Size of the section once loaded into memory
    - Location of section (RVA)
    - Physical size of section on disk
    - Physical location etc.
- Conventional section names: resources in .rsrc, imports in .idata, exports in .edata
- Flags are used to describe the type of data in the section
PE file layout: DOS Header, PE Header, Optional Header, Section Table, Sections 1..n (code, data, imports), Resources, Overlay
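To make the 40-byte section entry concrete, here is a small sketch that decodes one entry and uses it to translate an RVA (such as AddressOfEntryPoint) into an offset in the file on disk. The sample entry is fabricated, and `parse_section` and `rva_to_offset` are illustrative names rather than part of any library.

```python
import struct
from collections import namedtuple

# Name[8], VirtualSize, VirtualAddress, SizeOfRawData, PointerToRawData,
# PointerToRelocations, PointerToLinenumbers, NumberOfRelocations,
# NumberOfLinenumbers, Characteristics
SECTION_FORMAT = "<8sIIIIIIHHI"

Section = namedtuple(
    "Section", "name virtual_size virtual_address raw_size raw_pointer characteristics"
)


def parse_section(entry: bytes) -> Section:
    """Decode one 40-byte section table entry."""
    (name, vsize, vaddr, raw_size, raw_ptr,
     _reloc_ptr, _line_ptr, _n_reloc, _n_line, flags) = struct.unpack(SECTION_FORMAT, entry)
    return Section(name.rstrip(b"\0").decode("ascii"), vsize, vaddr, raw_size, raw_ptr, flags)


def rva_to_offset(rva, sections):
    """Map an RVA to a file offset, or None if no section holds it on disk."""
    for s in sections:
        if s.virtual_address <= rva < s.virtual_address + s.raw_size:
            return rva - s.virtual_address + s.raw_pointer
    return None


# Fabricated .text entry: loaded at RVA 0x1000, stored at file offset 0x400.
SAMPLE_ENTRY = struct.pack(
    SECTION_FORMAT, b".text", 0x100, 0x1000, 0x200, 0x400, 0, 0, 0, 0, 0x60000020
)

if __name__ == "__main__":
    text = parse_section(SAMPLE_ENTRY)
    print(text)
    print(hex(rva_to_offset(0x1010, [text])))
```

The same subtraction (RVA minus the section's virtual address, plus its raw pointer) is what a hex editor needs in order to jump from a memory address to the matching bytes in the file.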

Parsing the PE file structure yourself
- pefile: Python module - https://github.com/erocarrera/pefile
- "multi-platform Python module to parse and work with PE files"
- Self-contained: no dependencies!
    - Access the PE header
    - Retrieve embedded data
    - Read strings
    - Identify malformed values
- Explore all features of the PE format
- Can be used to manipulate the PE structure
    - Allows writing to some of the fields
    - Experiment by changing the data and see how Windows reacts

Assembly Crash Course


What is assembly language?
HLL (Java, C, C++ etc.) -> Assembly -> Op-codes -> Machine code

Registers
- EAX, EBX, ECX, EDX: general-purpose registers
- ESI, EDI: source and destination index pointers
- ESP: stack pointer, points to the top of the stack
- EBP: base pointer, points to the base of the current stack frame (the return address sits just above it, at EBP+4)
- Also important is EIP: the instruction pointer, which points to the next instruction to run

The Stack
- Part of memory where the program stores variables and function arguments
- Last-in-first-out (LIFO)
    - push: things are added to the top of the stack
    - pop: things are removed from the top of the stack
- It grows backwards
    - Grows from the highest memory address to the lowest
- ESP and EBP are the registers used to work with the stack
    - ESP points to the top of the stack
    - EBP points to the base of the current frame; local variables are addressed relative to it
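A toy model can make the "grows backwards" point easier to see. The sketch below is only an illustration: the `Stack` class, the starting address 0x1000 and the 4-byte slot size are assumptions chosen for a 32-bit example, not anything taken from the slides.

```python
class Stack:
    """Toy model of a 32-bit x86 stack: push moves ESP down, pop moves it up."""

    def __init__(self, top=0x1000):
        self.esp = top      # ESP starts at the highest address
        self.memory = {}    # address -> 32-bit value

    def push(self, value):
        self.esp -= 4                       # the stack grows toward lower addresses
        self.memory[self.esp] = value & 0xFFFFFFFF

    def pop(self):
        value = self.memory[self.esp]       # read the value at the top of the stack
        self.esp += 4                       # releasing a slot moves ESP back up
        return value


if __name__ == "__main__":
    s = Stack()
    s.push(10)
    s.push(5)
    print(hex(s.esp))   # two slots below the starting address
    print(s.pop())      # last in, first out
    print(s.pop())
```

Note that `pop` does not erase the old value from memory; it only moves ESP, which mirrors how stale data stays on a real stack until it is overwritten.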

The Stack - Example
Caller:
    int a = 10
    int b = 5
    int c = 2
    push 2          ; c (arguments are pushed right to left under cdecl)
    push 5          ; b
    push 10         ; a
    call addNums    ; pushes the return address
Callee:
    addNums: int result = a+b+c; return result;
Stack inside addNums after the prologue, from high address to low address:
    ebp+16    c
    ebp+12    b
    ebp+8     a (first argument passed into the function)
    ebp+4     return address
    ebp       saved EBP (the old value of EBP; used to restore the prior frame)
    ebp-4     result (first local variable), ESP points here
EBP register
- Points to the base of the current frame
- EBP+4 holds the return address
- EBP+8 points to the first argument passed into the function
- EBP-4 points to the first local variable

The Stack - Example
This is the standard prologue:
    push ebp
    mov ebp, esp
    sub esp, X
X is the total size in bytes of the variables used in the function.
    void myfunc() { int a, b, c; … }
    _myfunc:
        push ebp
        mov ebp, esp
        sub esp, 12
Local variables can be accessed by referencing ebp, e.g. mov eax, [ebp-8]
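The prologue and the EBP-relative offsets can be traced step by step in a small simulation. This is a sketch under stated assumptions: arguments are pushed right to left (cdecl), the return address 0x401005 and the initial register values are made up, and `call_add_nums` is a hypothetical name for the addNums example.

```python
def call_add_nums(a, b, c):
    """Simulate a cdecl call to addNums(a, b, c) on a toy 32-bit stack."""
    mem = {}
    regs = {"esp": 0x2000, "ebp": 0x3000}   # arbitrary starting values

    def push(value):
        regs["esp"] -= 4
        mem[regs["esp"]] = value

    def pop():
        value = mem[regs["esp"]]
        regs["esp"] += 4
        return value

    # Caller: push arguments right to left, then "call" pushes the return address.
    push(c)
    push(b)
    push(a)
    push(0x401005)                          # made-up return address

    # Callee prologue: push ebp / mov ebp, esp / sub esp, 4
    push(regs["ebp"])
    regs["ebp"] = regs["esp"]
    regs["esp"] -= 4                        # room for one local: result

    ebp = regs["ebp"]
    mem[ebp - 4] = mem[ebp + 8] + mem[ebp + 12] + mem[ebp + 16]   # result = a+b+c
    eax = mem[ebp - 4]                      # return value goes in EAX

    layout = {
        "ebp+16": mem[ebp + 16],
        "ebp+12": mem[ebp + 12],
        "ebp+8": mem[ebp + 8],
        "ebp+4": mem[ebp + 4],
        "ebp": mem[ebp],
        "ebp-4": mem[ebp - 4],
    }

    # Callee epilogue: mov esp, ebp / pop ebp / ret
    regs["esp"] = regs["ebp"]
    regs["ebp"] = pop()
    pop()                                   # ret pops the return address
    return eax, layout, regs


if __name__ == "__main__":
    eax, layout, regs = call_add_nums(10, 5, 2)
    print(eax)
    for slot, value in layout.items():
        print(slot, hex(value))
```

After the simulated `ret`, ESP still sits 12 bytes below its starting value because under cdecl the caller, not the callee, removes the three arguments.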

Basic Instructions: Data Movement
mov, push, pop, lea
    mov dst, src

Basic Instructions: Data Movement
mov, push, pop, lea
    mov eax, 0x3
Result: EAX = 0x00000003

Basic Instructions: Data Movement
mov, push, pop, lea
    mov ebx, 0x40100000
    mov eax, [ebx]
Result: EAX = 0x6F6C6568 (the four bytes stored at address 0x40100000, read in little-endian order)
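The byte order in that result is easy to get backwards, so here is a short check using the standard `struct` module. It assumes the four bytes at the address are 68 65 6C 6F (the ASCII characters "helo"), which is what the EAX value on the slide implies; `load_dword` is a hypothetical helper name.

```python
import struct


def load_dword(memory: bytes, offset: int = 0) -> int:
    """Read 4 bytes as x86 does for mov eax, [addr]: little-endian, lowest address first."""
    (value,) = struct.unpack_from("<I", memory, offset)
    return value


if __name__ == "__main__":
    memory = bytes([0x68, 0x65, 0x6C, 0x6F])   # 'h', 'e', 'l', 'o' in address order
    print(hex(load_dword(memory)))
```

The byte at the lowest address ('h', 0x68) ends up in the least significant byte of EAX, which is why the register reads as the string reversed.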

Basic Instructions: Data Movement
mov, push, pop, lea
    push 10
    call myfunc

Basic Instructions: Data Movement
mov, push, pop, lea
    push offset aSimpleHello
    call printf

Basic Instructions: Data Movement
mov, push, pop, lea
    push 10
    ..
    pop

Basic Instructions: Data Movement
mov, push, pop, lea
    mov eax, 5
    push eax
    ..
    pop eax
Result: EAX = 0x00000005

Basic Instructions: Data Movement
mov, push, pop, lea
    lea ebx, [eax]
    call ebx
LEA: load effective address

Basic Instructions: Arithmetic and Logic
- add, sub: integer addition and subtraction
- inc, dec: increment, decrement
- mul, div: integer multiplication and division
- and, or, xor: bitwise logical and, or and exclusive or
- not, neg: bitwise logical not and negate
- shl, shr: shift left and shift right
    mov eax, 2
    add eax, 2
Result: EAX = 0x00000004

Basic Instructions: Arithmetic and Logic
    mov eax, 65
    xor edx, edx    ; div divides EDX:EAX, so EDX must be cleared first
    mov ecx, 4
    div ecx
Result: EAX = 0x00000010 (quotient 16), EDX = 0x00000001 (remainder 1)
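The arithmetic behind `div ecx` can be sketched in a few lines. The 32-bit form of `div` treats EDX:EAX as one 64-bit unsigned dividend, stores the quotient in EAX and the remainder in EDX, and raises a divide error if the divisor is zero or the quotient does not fit in 32 bits. The function name `div32` is made up for this illustration.

```python
def div32(edx: int, eax: int, divisor: int):
    """Model unsigned 32-bit div: (EDX:EAX) / divisor -> (new EAX, new EDX)."""
    if divisor == 0:
        raise ZeroDivisionError("divide error: divisor is zero")
    dividend = (edx << 32) | eax              # EDX holds the high 32 bits
    quotient, remainder = divmod(dividend, divisor)
    if quotient > 0xFFFFFFFF:
        raise OverflowError("divide error: quotient does not fit in EAX")
    return quotient, remainder


if __name__ == "__main__":
    eax, edx = div32(0, 65, 4)
    print(eax, edx)
```

Calling `div32(1, 0, 1)` raises `OverflowError`, which is the situation a stale, non-zero EDX can cause in real code.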

Basic Instructions: Control Flow
- jmp: jump to location
- j<condition>: jump to location when the condition is met
    - je, jne, jz, jg, jge, jl, jle
- cmp: compare
- call, ret: subroutine call and return
    call myfunc
    cmp eax, 0
    jnz fail
    ..continue..
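How `cmp` and the conditional jumps fit together can be modelled briefly: `cmp a, b` computes a - b, discards the result and keeps only the flags, and each `j<condition>` tests some combination of those flags. The sketch below covers just the jumps listed on the slide plus `jnz`; `cmp32` and `jump_taken` are hypothetical names.

```python
MASK = 0xFFFFFFFF
SIGN = 0x80000000


def cmp32(a: int, b: int) -> dict:
    """Model cmp a, b on 32-bit values: subtract and keep only the flags."""
    a &= MASK
    b &= MASK
    result = (a - b) & MASK
    return {
        "ZF": result == 0,                              # operands were equal
        "SF": bool(result & SIGN),                      # result looks negative
        "CF": a < b,                                    # unsigned borrow
        "OF": bool((a ^ b) & (a ^ result) & SIGN),      # signed overflow
    }


def jump_taken(mnemonic: str, f: dict) -> bool:
    """Return True if the conditional jump would be taken for these flags."""
    conditions = {
        "je": f["ZF"], "jz": f["ZF"],
        "jne": not f["ZF"], "jnz": not f["ZF"],
        "jl": f["SF"] != f["OF"],
        "jge": f["SF"] == f["OF"],
        "jg": (not f["ZF"]) and f["SF"] == f["OF"],
        "jle": f["ZF"] or f["SF"] != f["OF"],
    }
    return conditions[mnemonic]


if __name__ == "__main__":
    flags = cmp32(0, 0)                  # e.g. myfunc returned 0 in EAX
    print(jump_taken("jnz", flags))      # False: execution continues past jnz fail
    flags = cmp32(1, 0)
    print(jump_taken("jnz", flags))      # True: jump to fail
```

Toggling ZF by hand in a debugger has the same effect as flipping the value returned by `jump_taken` for `jz` and `jnz`.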

Basic Instructions: Control Flow
    mov eax, 3
    push eax
    call incFunc
    cmp eax, 0      ; second operand is missing on the slide; 0 is assumed here
    jz error
    call printf

    incFunc:
        push ebp
        mov ebp, esp
        add eax, 1
        mov esp, ebp
        pop ebp
        retn

Opcode map

Experiment
- Play around and experiment!
- Write programs, then disassemble them
    - Windows, Linux, Android
    - Other architectures: AMD64, ARM
    - gdb ./prog, then disas main
- Play around with other file formats
    - ELF, Mach-O, Java etc.
- Netwide Assembler for 80x86 (NASM) - http://www.nasm.us/
- Adam Stanislav - http://www.int80h.org/
- Lots of tutorials online!

Tools

Hiew
- Hex editor for Windows.
- Curses-based interface.
- Understands and is capable of parsing the PE format.
- Easily allows you to navigate through the PE format.
- Allows simple manipulation of hex bytes/opcodes, assembly language and ASCII strings embedded in the binary.
- A handy tool to quickly triage a new binary.

Hiew
- Contains three distinct views of the binary.
    - Pure hex, with ASCII strings displayed on the right-hand side.
    - Disassembly view of the binary.
    - Binary presented in a plaintext view.
- Looks old-fashioned and ugly, but is very powerful and extremely fast.
- Lots of nice features and great shortcuts.
- Bind it to a shortcut key or a context menu for ease of use.

Hiew

OllyDbg
- OllyDbg is a 32-bit assembler-level analysing debugger for Microsoft Windows.
- Emphasis on binary code analysis makes it particularly useful in cases where source is unavailable.
- A go-to malware analysis tool for reverse engineers.
- Can be used to reverse engineer and debug binaries for which you don't have the original source code.
- Immediately lets you see disassembly, registers, stack and arbitrary memory locations.
- Allows you to rename functions and locations, and to add comments to the disassembly to ease analysis.
- Allows you to add breakpoints directly to the disassembly.
- Contains a number of very useful plugins which can ease malware analysis and defeat nasty tricks in the code.
- "Smart" debugger that recognises loops, API calls, switches etc.
- Allows you to debug tricky applications easily.

Disasm view
- Displays disassembly, opcodes and some useful hints about the binary/executable.
- Allows single stepping into (F7) or over (F8) commands or sequences of commands.
- Allows setting breakpoints (F2) on "interesting" instructions you want execution to break at (breakpoints are highlighted in red).
- Clearly displays jumps (highlighted in yellow) and function calls (highlighted in turquoise).

Registers view
- Displays the common x86 registers (EAX, EBX, ECX, EDX, ESI, EDI) and allows direct modification of the registers.
- Useful to modify the current program's state during execution.
- Can also be useful to redirect current instructions to point at new sections of memory.
- Displays the x86 flags and allows toggling them (can be used to control conditional jumps).

Stack View
- Displays the current stack of the debugged program and allows modification of entries in the stack.
- Allows a user to easily track stack frames for current functions and procedures.
- Can be used to modify stack frames and alter stack memory directly.

Memory View
- Allows an arbitrary view of memory from the currently debugged program.
- Allows a user to easily manipulate memory in the current program.
- Can be used to quickly jump all over the program's memory layout using Ctrl+G and an address!

Frequently used shortcuts
    Ctrl+F2     Restart program
    Alt+F2      Close program
    F3          Open new program
    F5          Maximize/restore active window
    Alt+F5      Make OllyDbg topmost
    F7          Step into (entering functions)
    Ctrl+F7     Animate into (entering functions)
    F8          Step over (executing function calls at once)
    Ctrl+F8     Animate over (executing function calls at once)
    F9          Run
    Shift+F9    Pass exception to standard handler and run
    Ctrl+F9     Execute till return
    Alt+F9      Execute till user code
    Ctrl+F11    Trace into
    F12         Pause
    Ctrl+F12    Trace over
    Alt+B       Open Breakpoints window
    Alt+C       Open CPU window
    Alt+E       Open Modules window
    Alt+L       Open Log window
    Alt+M       Open Memory window
    Alt+O       Open Options dialog
    Ctrl+T      Set condition to pause Run trace
    Alt+X       Close OllyDbg

Frequently used shortcuts
    F2             Toggle breakpoint
    Shift+F2       Set conditional breakpoint
    F4             Run to selection
    Alt+F7         Go to previous reference
    Alt+F8         Go to next reference
    Ctrl+A         Analyse code
    Ctrl+B         Start binary search
    Ctrl+C         Copy selection to clipboard
    Ctrl+E         Edit selection in binary format
    Ctrl+F         Search for a command
    Ctrl+G         Follow expression
    Ctrl+J         Show list of jumps to selected line
    Ctrl+K         View call tree
    Ctrl+L         Repeat last search
    Ctrl+N         Open list of labels (names)
    Ctrl+O         Scan object files
    Ctrl+R         Find references to selected command
    Ctrl+S         Search for a sequence of commands
    Asterisk (*)   Origin
    Enter          Follow jump or call
    Plus (+)       Go to next location/next run trace item
    Minus (-)      Go to previous location/previous run trace item
    Space          Assemble
    Colon (:)      Add label
    Semicolon (;)  Add comment

IDA Free
- Free version of the IDA program.
- More emphasis on binary code analysis makes it particularly useful in cases where source is unavailable.
- A go-to malware analysis tool for reverse engineers.
- Can be used to reverse engineer and debug binaries for which you don't have the original source code.
- Immediately lets you see disassembly, registers, stack and arbitrary memory locations.
- Complicated program with lots of options (it has an entire book devoted to it).
- More of a static file analysis tool than OllyDbg, but contains a brilliant debugger that is well worth learning.
- The free version is slightly crippled but more than enough for our simple programs.
- Has a lot more features for static analysis of programs than Olly; using Olly side by side with IDA is very powerful.
    - Use IDA as a database for all analysis performed with Olly.
    - Use Olly to fill in any "blanks" you may have while statically analysing programs with IDA.
    - Code that looks like gibberish in IDA will make more sense when executed with Olly.

Disasm View
- Displays disassembly, opcodes and some useful hints about the binary/executable.
- Has "smart" code recognition, which helps to identify loops, stack variables, functions (including parameters) and return values.
- Will track variables deep into assembly code, allowing you to identify their use accurately.
- Allows commenting of almost every line of code.
- Has loads of shortcuts for renaming code, setting bookmarks, and getting cross references to specific parts of code and memory locations.


Overview Navigator
- Represents a linear view of the whole address space of the loaded program.
- Colour coded so you can quickly see interesting parts of memory.
    - Turquoise: library function
    - Blue: regular function
    - Red: instruction
    - Grey: data item
    - Pink: external symbol
- Allows you to jump to a part of memory with left click.
- Allows you to zoom in with right click.

Strings View
- Lists the strings present in the binary and their associated data section addresses.
- Allows you to jump directly to the location where the string is defined.
- From there you can take an Xref, by pressing the shortcut X, to see where the string is referenced from.
- Can be useful for tracking interesting parts of code.

Functions View
- Lists all functions in the binary and their associated addresses.
- Useful for tracking interesting functions in the binary.
- When used in conjunction with function renaming, it makes the binary simple to navigate.
- Try using it to find the _main function in each of the challenges!
- Once you find the _main function, you can then set a breakpoint on its first instruction using Olly!

Challenges
Demo!


Alan Neville / anev@rb Cathal Mullaney / nemo@rb
