Reverse Engineering Workshop Cathal Mullaney / nemo@rb Alan Neville / anev@rb
Agenda 1 Reverse Engineering in Security 2 PE File Format 3 Crash Course in Assembly 4 Tools 5 Challenges Copyright © 2015 Symantec Corporation
Reverse Engineering in Security Video Time!
A Peek at the PE File Format
Let’s start with an example
  Simple.exe, compiled in VS2015, 9,216 bytes
  1. Using Visual Studio, we can write a simple Hello, World! example in C.
  2. We compile the code as an application. This is what it looks like on disk: it takes up 9,216 bytes of space.
  3. Looking at the assembly code generated for the program, we have the following block.
  4. On the left side, you can see the opcodes that are generated. This is what the machine reads.
  5. When we run the application, we get the output we expect.
Inside Simple.exe
  C code is translated into opcodes
  Opcodes are represented by bytes
  Highlighted bytes represent the C code written for Simple.exe
  Only 20 bytes in total! But what’s all the other stuff?!
  PE format to save the day!
PE Format Layout
  Standard binary file format for Windows
    Introduced in Windows NT 3.1
  Win32 SDK contains the file winnt.h
    Describes the structures and variables used in PE files
  The file imagehlp.dll contains functions to manipulate PE files
  PE files are broken into regions that can be examined
  Developed by Mark Zbikowski at MS in 1990
    Used to bridge the gap between DOS and Windows executables
  PE file layout, in order:
    DOS Header
    PE Header
    Optional Header
    Section Table
    Sections (code, data, imports)
    Resources
    Overlay
DOS Header
  MS-DOS header starts at offset 0x0
  First two letters are always ‘MZ’ (0x4D5A)
    Aka “magic number” or file signature
    Determines the type of file
  Many file types use magic numbers
    Java – 0xCAFEBABE
    ZIP – 0x504B – “PK”
  MS-DOS header ensures backwards compatibility
    If the file is executed in the wrong environment, an error message is displayed
    “This program cannot be run in DOS mode.”
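As a rough illustration of the magic-number check, the sketch below maps the leading bytes of a buffer to a file type. The function name `identify_magic` and the lookup table are hypothetical, and only the three signatures listed on the slide are included.

```python
# Signatures taken from the slide; a real tool would know many more.
MAGIC_NUMBERS = {
    b"MZ": "DOS/PE executable",           # 0x4D 0x5A
    b"\xCA\xFE\xBA\xBE": "Java class",    # 0xCAFEBABE
    b"PK": "ZIP archive",                 # 0x50 0x4B
}


def identify_magic(data: bytes) -> str:
    """Return a file-type guess based on the first bytes of `data`."""
    for magic, name in MAGIC_NUMBERS.items():
        if data.startswith(magic):
            return name
    return "unknown"
```

Reading the first few bytes of Simple.exe and passing them to `identify_magic` should report a DOS/PE executable, since every PE file begins with the MS-DOS header.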
PE Header
  Known as the ‘File Header’
  Contains useful information
    Machine: the architecture the file was compiled for
    NumberOfSections: number of sections
    TimeDateStamp: when the file was compiled
    SizeOfOptionalHeader: length of the following header
    Characteristics
      0x0002 – executable file
      0x2000 – file is a DLL, not an EXE
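The fields above can be decoded by hand with Python's `struct` module. This is a minimal sketch: the 20-byte layout follows the standard file header, while `parse_file_header` and the sample values are made up for illustration.

```python
import struct
from datetime import datetime, timezone

# File header layout (20 bytes): Machine, NumberOfSections, TimeDateStamp,
# PointerToSymbolTable, NumberOfSymbols, SizeOfOptionalHeader, Characteristics.
FILE_HEADER = struct.Struct("<HHIIIHH")

IMAGE_FILE_EXECUTABLE_IMAGE = 0x0002
IMAGE_FILE_DLL = 0x2000


def parse_file_header(raw: bytes) -> dict:
    """Decode the fields discussed on the slide from a raw file header."""
    (machine, n_sections, timestamp, _symtab, _nsyms,
     opt_size, characteristics) = FILE_HEADER.unpack_from(raw)
    return {
        "machine": machine,
        "sections": n_sections,
        "compiled": datetime.fromtimestamp(timestamp, tz=timezone.utc),
        "optional_header_size": opt_size,
        "is_executable": bool(characteristics & IMAGE_FILE_EXECUTABLE_IMAGE),
        "is_dll": bool(characteristics & IMAGE_FILE_DLL),
    }


# Hypothetical header: x86 machine type (0x014C), 3 sections, executable image.
sample = FILE_HEADER.pack(0x014C, 3, 0, 0, 0, 0xE0, 0x0102)
info = parse_file_header(sample)
```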
Optional PE Header
  Despite its name, it is not optional
    Required in executable (image) files, but not in object files
  Occurs directly after the PE Header
  Contains LOTS of information
    Version info of the linker that built the file
    Size of the image once loaded
    Checksum – ensures file integrity
    And more!
  AddressOfEntryPoint
    Most important field here! It holds the address (RVA) where execution starts
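To show where AddressOfEntryPoint lives, here is a small sketch that walks from the DOS header to the optional header using only `struct`. The function name `entry_point_rva` is hypothetical; the offsets (0x3C for the PE header pointer, 16 bytes into the optional header for the entry point) follow the published PE layout.

```python
import struct


def entry_point_rva(data: bytes) -> int:
    """Return AddressOfEntryPoint (an RVA) from a PE image held in `data`."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # The DWORD at offset 0x3C of the DOS header holds the PE header offset.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # The optional header follows the 4-byte signature and 20-byte file header.
    opt = pe_offset + 4 + 20
    (magic,) = struct.unpack_from("<H", data, opt)
    if magic not in (0x10B, 0x20B):  # PE32 or PE32+
        raise ValueError("unexpected optional header magic")
    # AddressOfEntryPoint sits 16 bytes into the optional header.
    (entry,) = struct.unpack_from("<I", data, opt + 16)
    return entry
```

The returned value is relative to the image base, so a debugger will show the entry point at image base plus this RVA.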
Section Table
  PE header defines the number of sections
  Code and data are placed in these sections
  Each section definition is 40 bytes in length
    Section name (.text, .data, etc.)
    Size of the section once loaded into memory
    Location of the section in memory (RVA)
    Physical size of the section on disk
    Physical location on disk, etc.
  Common section names
    Resources: .rsrc
    Imports: .idata
    Exports: .edata
  Flags are used to describe the type of data in the section
  PE file layout, in order:
    DOS Header
    PE Header
    Optional Header
    Section Table
    Sections 1..n (code, data, imports)
    Resources
    Overlay
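A 40-byte section definition can be unpacked field by field. The sketch below assumes the standard section header layout; `parse_section` and the sample `.text` entry are illustrative only.

```python
import struct

# Section header: 8-byte name followed by nine numeric fields, 40 bytes total.
SECTION_HEADER = struct.Struct("<8sIIIIIIHHI")

IMAGE_SCN_MEM_EXECUTE = 0x20000000
IMAGE_SCN_MEM_WRITE = 0x80000000


def parse_section(raw: bytes, offset: int = 0) -> dict:
    """Decode one section table entry starting at `offset`."""
    (name, virtual_size, virtual_address, raw_size, raw_pointer,
     _reloc_ptr, _line_ptr, _n_relocs, _n_lines,
     flags) = SECTION_HEADER.unpack_from(raw, offset)
    return {
        "name": name.rstrip(b"\0").decode("ascii", errors="replace"),
        "virtual_size": virtual_size,        # size once loaded into memory
        "virtual_address": virtual_address,  # RVA of the section
        "raw_size": raw_size,                # physical size on disk
        "raw_pointer": raw_pointer,          # physical location on disk
        "executable": bool(flags & IMAGE_SCN_MEM_EXECUTE),
        "writable": bool(flags & IMAGE_SCN_MEM_WRITE),
    }


# Hypothetical .text entry: code, executable and readable.
sample = SECTION_HEADER.pack(b".text", 0x1000, 0x1000, 0x200, 0x400,
                             0, 0, 0, 0, 0x60000020)
section = parse_section(sample)
```

In a real file, the entries sit back to back after the optional header, so entry `i` starts at the table offset plus `40 * i`.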
Parsing the PE file structure yourself
  pefile – Python module – https://github.com/erocarrera/pefile
    “multi-platform Python module to parse and work with PE files”
    Self-contained – no dependencies!
  Access the PE header
  Retrieve embedded data
  Read strings
  Identify malformed values
  Explore all features of the PE format
  Can be used to manipulate the PE structure
    Allows writing to some of the fields
    Experiment by changing the data – see how Windows reacts
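A minimal sketch of reading headers with pefile might look like the following. It assumes pefile is installed (`pip install pefile`); the helper name `summarize_pe` is hypothetical, while the attributes it reads are the ones pefile exposes for the file header, optional header, and section table.

```python
def summarize_pe(path: str) -> None:
    """Print a few header fields and the section table of the PE file at `path`."""
    import pefile  # third-party module, imported here so the sketch loads without it

    pe = pefile.PE(path)
    print("Machine:", hex(pe.FILE_HEADER.Machine))
    print("Sections:", pe.FILE_HEADER.NumberOfSections)
    print("Entry point RVA:", hex(pe.OPTIONAL_HEADER.AddressOfEntryPoint))
    for section in pe.sections:
        # Section names are 8 bytes, padded with NULs.
        name = section.Name.rstrip(b"\x00").decode(errors="replace")
        print(name, hex(section.VirtualAddress),
              section.Misc_VirtualSize, section.SizeOfRawData)
```

Calling `summarize_pe("Simple.exe")` on the example binary would list its sections; the file name here is only an example path.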
Assembly Crash Course
What is assembly language?
  From highest to lowest level:
    HLL – high-level language (Java, C, C++, etc.)
    Assembly
    Opcodes
    Machine code
Registers
  EAX, EBX, ECX, EDX – general-purpose registers
  ESI, EDI – source and destination index pointers
  ESP – stack pointer, points to the top of the stack
  EBP – base pointer, points to the base of the current stack frame (the saved EBP; the return address sits at EBP+4)
  Also important is EIP – instruction pointer, points to the next instruction to run
The Stack
  Part of memory where the program stores local variables + function args
  Last-in-first-out (LIFO)
    Things are added to the top of the stack: push
    Things are removed from the top of the stack: pop
  It grows backwards
    Grows from the highest memory address to the lowest
  ESP + EBP – registers to work with the stack
    ESP points to the top of the stack
    EBP points to the base of the current frame; local variables are addressed relative to it
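The push and pop behaviour can be modelled in a few lines. This is a toy sketch, not how a CPU is implemented: the `Stack` class and its starting address are hypothetical, and every slot is assumed to be 4 bytes wide.

```python
class Stack:
    """Toy model of a 32-bit x86 stack that grows toward lower addresses."""

    def __init__(self, top: int = 0x0012FF00):
        self.esp = top      # ESP starts at a hypothetical high address
        self.memory = {}    # address -> 32-bit value

    def push(self, value: int) -> None:
        self.esp -= 4                               # ESP moves down first
        self.memory[self.esp] = value & 0xFFFFFFFF  # then the value is stored

    def pop(self) -> int:
        value = self.memory.pop(self.esp)           # read the top of the stack
        self.esp += 4                               # ESP moves back up
        return value


stack = Stack()
start = stack.esp
stack.push(10)
stack.push(5)
```

After the two pushes, ESP is 8 bytes lower than where it started, and the first pop returns 5, the last value pushed.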
The Stack – Example
  Caller:
    int a = 10
    int b = 5
    int c = 2
    push 2
    push 5
    push 10
    call addNums
  Arguments are pushed right to left (cdecl), so a ends up closest to the return address
  addNums:
    int result = a+b+c;
    return result;
  Stack layout inside addNums, after push ebp / mov ebp, esp:
    High address
    c                 ebp+16
    b                 ebp+12
    a                 ebp+8
    Return address    ebp+4
    Saved EBP         ebp      <- EBP
    result            ebp-4    <- ESP
    Low address
  EBP register
    Points to the base of the current frame
    [EBP] usually holds the old value of EBP
      Can use this to restore the prior frame
    EBP+4 holds the return address
    EBP+8 points to the first argument passed into the function
    EBP-4 points to the first local variable
The Stack – Example
  This is the standard prologue:
    push ebp
    mov ebp, esp
    sub esp, X
  X is the total size in bytes of the local variables used in the function
  void myfunc() { int a, b, c; … }
  _myfunc:
    push ebp
    mov ebp, esp
    sub esp, 12
  Local variables can be accessed by referencing ebp, e.g. mov eax, [ebp-8]
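The frame that the prologue builds can be worked out with simple arithmetic. The sketch below assumes the cdecl convention (arguments pushed right to left, 4 bytes each); `frame_layout` and the starting address are hypothetical.

```python
def frame_layout(esp_before_call: int, arg_count: int, local_bytes: int) -> dict:
    """Label each stack slot created by a call followed by the standard prologue."""
    layout = {}
    esp = esp_before_call
    # The caller pushes arguments right to left, so arg0 is pushed last.
    for index in reversed(range(arg_count)):
        esp -= 4
        layout[esp] = f"arg{index}"
    esp -= 4
    layout[esp] = "return address"   # pushed by call
    esp -= 4
    layout[esp] = "saved ebp"        # push ebp
    ebp = esp                        # mov ebp, esp
    esp -= local_bytes               # sub esp, X
    return {"ebp": ebp, "esp": esp, "layout": layout}


frame = frame_layout(0x1000, arg_count=3, local_bytes=12)
```

With three arguments and 12 bytes of locals, the first argument lands at `ebp+8`, the return address at `ebp+4`, and ESP ends up 12 bytes below EBP.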
Basic Instructions
  Data movement: mov, push, pop, lea
  mov dst, src
Basic Instructions
  Data movement: mov, push, pop, lea
  Example:
    mov eax, 0x3
  Result: EAX = 0x3
Basic Instructions
  Data movement: mov, push, pop, lea
  Example:
    mov ebx, 0x40100000
    mov eax, [ebx]
  The square brackets dereference ebx: EAX receives the four bytes stored at that address
  EAX: 6F 6C 65 68
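Because x86 is little-endian, the byte at the lowest address becomes the least significant byte of the register. The sketch below models the load with `int.from_bytes`; the memory contents are hypothetical bytes chosen so the register reads 6F 6C 65 68.

```python
def load_dword(memory: bytes, offset: int) -> int:
    """Model `mov eax, [ebx]`: read four bytes and combine them little-endian."""
    return int.from_bytes(memory[offset:offset + 4], "little")


memory = bytes([0x68, 0x65, 0x6C, 0x6F])   # hypothetical bytes stored at [ebx]
eax = load_dword(memory, 0)
```

Printed as eight hex digits, the register value has the memory bytes in reverse order, which is why strings look backwards when viewed in a register.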
Basic Instructions
  Data movement: mov, push, pop, lea
  Example:
    push 10
    call myfunc
Basic Instructions
  Data movement: mov, push, pop, lea
  Example:
    push offset aSimpleHello
    call printf
Basic Instructions
  Data movement: mov, push, pop, lea
  Example:
    push 10
    ..
    pop
Basic Instructions
  Data movement: mov, push, pop, lea
  Example:
    mov eax, 5
    push eax
    ..
    pop eax
  Result: EAX = 5 again, restored from the stack
Basic Instructions
  Data movement: mov, push, pop, lea
  Example:
    lea ebx, [eax]
    call ebx
  LEA – load effective address: computes the address of the operand and stores it in the destination without reading memory
Basic Instructions
  Arithmetic and logic instructions
    add, sub – integer addition and subtraction
    inc, dec – increment, decrement
    mul, div – integer multiplication and division
    and, or, xor – bitwise logical and, or and exclusive or
    not, neg – bitwise logical not and negate
    shl, shr – shift left and shift right
  Example:
    mov eax, 2
    add eax, 2
  Result: EAX = 4
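Python integers are unbounded, so modelling these instructions means masking every result back to 32 bits. The helpers below are a hypothetical sketch; they ignore the CPU flags and assume shift counts between 0 and 31.

```python
MASK32 = 0xFFFFFFFF


def add32(a: int, b: int) -> int:
    return (a + b) & MASK32       # add: wraps around on overflow


def sub32(a: int, b: int) -> int:
    return (a - b) & MASK32       # sub: wraps around on borrow


def not32(a: int) -> int:
    return ~a & MASK32            # not: flip every bit


def neg32(a: int) -> int:
    return -a & MASK32            # neg: two's complement negation


def shl32(a: int, count: int) -> int:
    return (a << count) & MASK32  # shl: bits shifted past bit 31 are lost


def shr32(a: int, count: int) -> int:
    return (a & MASK32) >> count  # shr: logical shift, zero-filled
```

For example, `add32(2, 2)` mirrors the slide's `mov eax, 2` / `add eax, 2`, while `add32(0xFFFFFFFF, 1)` shows the wrap-around to zero that a 32-bit register would produce.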
Basic Instructions
  Arithmetic and logic instructions
    add, sub – integer addition and subtraction
    inc, dec – increment, decrement
    mul, div – integer multiplication and division
    and, or, xor – bitwise logical and, or and exclusive or
    not, neg – bitwise logical not and negate
    shl, shr – shift left and shift right
  Example:
    mov eax, 65
    mov ecx, 4
    div ecx
  Result: EAX = 16 (quotient), EDX = 1 (remainder)
  div divides the 64-bit value EDX:EAX by its operand, so this result assumes EDX is 0 before the div
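The way `div` combines EDX and EAX can be sketched with `divmod`. The function `div32` is hypothetical and models only the unsigned 32-bit form of the instruction.

```python
def div32(edx: int, eax: int, divisor: int) -> tuple:
    """Model unsigned `div r32`: returns (EAX, EDX) = (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("the CPU raises a divide error")
    dividend = (edx << 32) | eax   # EDX supplies the high 32 bits
    quotient, remainder = divmod(dividend, divisor)
    if quotient > 0xFFFFFFFF:
        raise OverflowError("quotient does not fit in EAX")
    return quotient, remainder
```

`div32(0, 65, 4)` reproduces the slide's example; passing a non-zero `edx` shows why stale data in EDX changes the result.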
Basic Instructions
  Control flow instructions
    jmp – jump to location
    j<condition> – jump to location when the condition is met
      je, jne, jz, jg, jge, jl, jle
    cmp – compare
    call, ret – subroutine call and return
  Example:
    call myfunc
    cmp eax, 0
    jnz fail
    ..continue..
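`cmp` works by subtracting its operands and keeping only the flags, which the conditional jump then tests. The sketch below models three of those flags; `cmp_flags` and `after_call` are hypothetical names, and `after_call` mirrors the `cmp eax, 0` / `jnz fail` pattern.

```python
MASK32 = 0xFFFFFFFF


def cmp_flags(a: int, b: int) -> dict:
    """Model the zero, carry, and sign flags that `cmp a, b` sets."""
    a &= MASK32
    b &= MASK32
    result = (a - b) & MASK32
    return {
        "ZF": result == 0,                # set when the operands are equal
        "CF": a < b,                      # set on unsigned borrow
        "SF": bool(result & 0x80000000),  # set when the result looks negative
    }


def after_call(return_value: int) -> str:
    flags = cmp_flags(return_value, 0)    # cmp eax, 0
    if not flags["ZF"]:                   # jnz fail
        return "fail"
    return "continue"
```

In a debugger, toggling ZF just before the `jnz` has the same effect as changing the function's return value.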
Basic Instructions
  Control flow instructions
    jmp – jump to location
    j<condition> – jump to location when the condition is met
      je, jne, jz, jg, jge, jl, jle
    cmp – compare
    call, ret – subroutine call and return
  Example:
    mov eax, 3
    push eax
    call incFunc
    cmp eax, 0
    jz error
    call printf
  incFunc:
    push ebp
    mov ebp, esp
    add eax, 1
    mov esp, ebp
    pop ebp
    retn
Opcode map
Experiment
  Play around and experiment!
  Write programs, then disassemble them
    Windows, Linux, Android
    Other architectures – x86-64 (AMD64), ARM
    gdb ./prog
    disas main
  Play around with other file formats
    ELF, Mach-O, Java, etc.
  Netwide Assembler for 80x86 – NASM
    http://www.nasm.us/
  Adam Stanislav – http://www.int80h.org/
  Lots of tutorials online!
Tools
Hiew
  Hex editor for Windows.
  Text-mode (curses-style) interface.
  Understands and is capable of parsing the PE format.
  Easily allows you to navigate through the PE format.
  Allows simple manipulation of hex bytes/opcodes, assembly language and ASCII strings embedded in the binary.
  A handy tool to quickly triage a new binary.
Hiew
  Contains three distinct views of the binary.
    Pure hex, with ASCII strings displayed on the right-hand side.
    Disassembly view of the binary.
    Binary presented in a plaintext view.
  Looks old-fashioned and ugly but is very powerful and extremely fast.
  Lots of nice features and great shortcuts.
  Bind it to a shortcut key or add it to a context menu for ease of use.
Hiew
OllyDbg
  OllyDbg is a 32-bit assembler-level analysing debugger for Microsoft Windows.
  Its emphasis on binary code analysis makes it particularly useful in cases where source is unavailable.
  A go-to malware analysis tool for reverse engineers.
  Can be used to reverse engineer and debug 32-bit binaries for which you don’t have the original source code.
  Immediately lets you see disassembly, registers, stack and arbitrary memory locations.
  Allows you to rename functions and locations, and to add comments to the disassembly to ease analysis.
  Allows you to add breakpoints directly to the disassembly.
  Contains a number of very useful plugins which can ease malware analysis and counter nasty tricks in the code.
  “Smart” debugger that recognises loops, API calls, switches, etc.
  Allows you to debug tricky applications easily.
Disasm view
  Displays disassembly, opcodes and some useful hints about the binary/executable.
  Allows single stepping into (F7) or over (F8) commands or sequences of commands.
  Allows setting breakpoints (F2) on “interesting” instructions you want execution to break at (breakpoints are highlighted in red).
  Clearly displays jumps (highlighted in yellow) and function calls (highlighted in turquoise).
Registers view
  Displays the common x86 registers (EAX, EBX, ECX, EDX, ESI, EDI) and allows direct modification of them.
  Useful to modify the current program’s state during execution.
  Can also be useful to redirect execution to new sections of memory.
  Displays the x86 flags and allows toggling of individual flags (can be used to control conditional jumps).
Stack View
  Displays the current stack of the debugged program and allows modification of entries in the stack.
  Allows a user to easily track stack frames for current functions and procedures.
  Can be used to modify stack frames and alter stack memory directly.
Memory View
  Allows an arbitrary view of memory from the currently debugged program.
  Allows a user to easily manipulate memory in the current program.
  Can be used to quickly jump all over the program’s memory layout using Ctrl+G and an address!
Frequently used shortcuts
  Ctrl+F2     Restart program
  Alt+F2      Close program
  F3          Open new program
  F5          Maximize/restore active window
  Alt+F5      Make OllyDbg topmost
  F7          Step into (entering functions)
  Ctrl+F7     Animate into (entering functions)
  F8          Step over (executing function calls at once)
  Ctrl+F8     Animate over (executing function calls at once)
  F9          Run
  Shift+F9    Pass exception to standard handler and run
  Ctrl+F9     Execute till return
  Alt+F9      Execute till user code
  Ctrl+F11    Trace into
  F12         Pause
  Ctrl+F12    Trace over
  Alt+B       Open Breakpoints window
  Alt+C       Open CPU window
  Alt+E       Open Modules window
  Alt+L       Open Log window
  Alt+M       Open Memory window
  Alt+O       Open Options dialog
  Ctrl+T      Set condition to pause Run trace
  Alt+X       Close OllyDbg
Frequently used shortcuts
  F2             Toggle breakpoint
  Shift+F2       Set conditional breakpoint
  F4             Run to selection
  Alt+F7         Go to previous reference
  Alt+F8         Go to next reference
  Ctrl+A         Analyse code
  Ctrl+B         Start binary search
  Ctrl+C         Copy selection to clipboard
  Ctrl+E         Edit selection in binary format
  Ctrl+F         Search for a command
  Ctrl+G         Follow expression
  Ctrl+J         Show list of jumps to selected line
  Ctrl+K         View call tree
  Ctrl+L         Repeat last search
  Ctrl+N         Open list of labels (names)
  Ctrl+O         Scan object files
  Ctrl+R         Find references to selected command
  Ctrl+S         Search for a sequence of commands
  Asterisk (*)   Origin
  Enter          Follow jump or call
  Plus (+)       Go to next location/next run trace item
  Minus (-)      Go to previous location/previous run trace item
  Space ( )      Assemble
  Colon (:)      Add label
  Semicolon (;)  Add comment
IDA Free
  Free version of the IDA program.
  Its emphasis on binary code analysis makes it particularly useful in cases where source is unavailable.
  A go-to malware analysis tool for reverse engineers.
  Can be used to reverse engineer and debug binaries for which you don’t have the original source code.
  Immediately lets you see disassembly, registers, stack and arbitrary memory locations.
  Complicated program with lots of options (has an entire book devoted to it).
  More of a static file analysis tool than OllyDbg, but contains a brilliant debugger that is well worth learning.
  The free version is slightly crippled but more than enough for our simple programs.
  Has a lot more features for static analysis of programs than Olly; using Olly side by side with IDA is very powerful.
    Use IDA as a database for all analysis performed with Olly.
    Use Olly to fill in any “blanks” you may have while statically analysing programs with IDA.
    Code that looks like gibberish in IDA will make more sense when executed with Olly.
Disasm View
  Displays disassembly, opcodes and some useful hints about the binary/executable.
  Has “smart” code recognition, which helps to identify loops, stack variables, functions (including parameters) and return values.
  Will track variables deep into assembly code, allowing you to identify their use accurately.
  Allows commenting of almost every line of code.
  Has loads of shortcuts for renaming code, setting bookmarks, and getting cross references to specific parts of code and memory locations.
Overview Navigator
  Represents a linear view of the whole address space of the loaded program.
  Colour coded so you can quickly see interesting parts of memory.
    Turquoise: library function
    Blue: regular function
    Red: instruction
    Grey: data item
    Pink: external symbol
  Allows you to jump to a part of memory with left click.
  Allows you to zoom in with right click.
Strings View
  Lists the strings present in the binary and their associated data section addresses.
  Allows you to jump directly to the location where the string is defined.
  Then allows you to take an Xref, by pressing shortcut X, to see where the string is referenced from.
  Can be useful for tracking interesting parts of code.
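A rough equivalent of a strings listing can be produced with a regular expression over the raw bytes. This is a minimal sketch rather than what IDA does internally: `ascii_strings` is a hypothetical helper that only finds printable ASCII runs and reports file offsets instead of section addresses.

```python
import re


def ascii_strings(data: bytes, min_length: int = 4) -> list:
    """Return (offset, text) pairs for runs of printable ASCII in `data`."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [(match.start(), match.group().decode("ascii"))
            for match in pattern.finditer(data)]


# Hypothetical data: two strings separated by non-printable bytes and a short run.
sample = b"\x00\x01Hello, World!\x00ab\x00\xffprintf\x00"
found = ascii_strings(sample)
```

Runs shorter than `min_length`, such as the two-byte `ab` above, are skipped to cut down on noise from bytes that only happen to be printable.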
Functions View
  Lists all functions and their associated addresses in the binary.
  Useful for tracking interesting functions in the binary.
  When used in conjunction with function renaming, makes the binary simple to navigate.
  Try using it to find the _main function in each of the challenges!
  Once you find the _main function, you can then set a breakpoint on its first instruction using Olly!
Challenges Demo!
Alan Neville / anev@rb Cathal Mullaney / nemo@rb