1 Software Reverse Engineering
Dr. Khaled Salah, Associate Professor of Computer Science
http://faculty.kfupm.edu.sa/ics/salah
salah@kfupm.edu.sa
October 7, 2009

2 Outline
- What is Reverse Engineering?
- What is Software Cracking?
- Legality and Ethics
- Executable Format
- Reversing Tools
- Keygenning and Code Injection
- Strive for Unbreakable Copyright Protection Scheme
  - Design Considerations
  - Existing Techniques
  - Cryptoprocessors
- AntiReversing Techniques
  - Tamper-resistant software
- Reversing JAVA, C#, and .NET

3 What is Reverse Engineering?
- The process of extracting the knowledge or design blueprints from anything man-made.
- It differs from conventional scientific research (e.g., coming up with blueprints of the atom):
  - With reverse engineering, the artifact being investigated is man-made.
  - With scientific research, the artifact is a natural phenomenon.
- Software Reverse Engineering
  - About opening up a program’s “box” and looking inside
  - No screwdrivers needed, but it integrates several arts: code breaking, puzzle solving, programming, and logical analysis

4 “Software Cracking is a scientific process as much as it is an art,” Greg Hoglund
- Describe the phenomenon, formulate a hypothesis, validate the hypothesis, take measurements, analyze
- A repetitive process until it converges to a solution
- Using the proper tools skillfully

5 Why Reverse Engineering?
“Sometimes, the best way to advance is in reverse,” Eldad Eilam

6 Why Reverse Engineering?
- Application development
  - Why reinvent the wheel?
  - Undocumented APIs
  - Interoperability with proprietary software (used enormously for all MS products)
  - Software quality and robustness
  - COTS validation
  - Multilingual applications
- Security-related
  - Malicious software
    - Black-hat hackers: to find vulnerabilities and exploit them
    - White-hat hackers: to identify, dissect, and analyze malware
  - Reversing ciphers
  - Defeating copyright protection
  - A must-do when it comes to proprietary software
    - Otherwise it is unknown what the code really does and how secure it is
    - Vulnerability research

7 Is reversing legal?
- The jury is still out
- Bottom line
  - What is the “reverse engineering” used for?

8 US/EU DMCA Laws: Exceptions
- Interoperability
- Security auditing and testing
- Educational purposes
- Government investigation
- Regulation compliance
- Protection of privacy
- Evaluation of encryption technology

9 What is Software Cracking?

10 Reversing 101
- Books
  - Greg Hoglund, “Exploiting Software: How to Break Code,” Addison-Wesley, 2004
  - Kris Kaspersky, “Hacker Disassembling Uncovered,” A-List Publishing, 2003
  - Eldad Eilam, “Reversing: Secrets of Reverse Engineering,” Wiley, 2005
  - Mike Perry and Nasko Oskov, “Introduction to Reverse Engineering Software,” incomplete online book
- Web sites
  - Learn to crack: http://www.learn2crack.com/, http://woodmann.net/krobar/
  - Crack Find: http://www.crackfind.com/, http://www.cracksearchengine.net/, http://astalavista.box.sk, http://ttdown.com/

11 Approaches to reverse engineering
- White Box Analysis
  - Analyzing and understanding source code
  - Code coverage: walking down dark alleys
- Black Box Analysis
  - Analyzing a running program by probing it with various inputs
- Gray Box Analysis
  - A combination of the two

12 Compilation Process

13 Java Compilation Process

14 Executable Format
- Windows: PE (EXE, DLL)
- Unix/Linux: a.out, COFF, and ELF
- Specifies a standard format for mapping code on disk into a complete executable image in memory, consisting of code, data, stack, heap (for malloc), and all the libraries
- http://www.wotsit.org/ is a great resource for all types of file formats
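As a rough illustration of how the formats above can be told apart, here is a minimal sketch that inspects the leading bytes of a file header. It assumes the well-known signatures (`\x7fELF` for ELF, `MZ` plus a `PE\0\0` signature located via the 4-byte offset at 0x3C for PE); the function name `detect_format` is made up for this example and is not from the slides.

```python
def detect_format(header: bytes) -> str:
    """Guess the executable format from the first bytes of a file."""
    # ELF files start with the 4-byte magic 0x7F 'E' 'L' 'F'.
    if header[:4] == b"\x7fELF":
        return "ELF"
    # PE files start with a DOS stub ("MZ"); the little-endian field at
    # offset 0x3C holds the file offset of the "PE\0\0" signature.
    if header[:2] == b"MZ":
        if len(header) >= 0x40:
            pe_offset = int.from_bytes(header[0x3C:0x40], "little")
            if header[pe_offset:pe_offset + 4] == b"PE\x00\x00":
                return "PE"
        return "MZ (DOS executable, or PE signature beyond the bytes read)"
    return "unknown"


def detect_file(path: str, read_size: int = 4096) -> str:
    """Read the start of a file and classify it (read_size is an assumption)."""
    with open(path, "rb") as handle:
        return detect_format(handle.read(read_size))
```

This only checks signatures; it does not validate section tables or any other header field.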

15 ELF Layout

16 PE Sections

17 Multilingual Applications
- Resource Hackers
- Modifying Resources:
  - Resources can be modified by replacing the resource with a resource located in another file (*.ico, *.bmp, *.res, etc.) or by using the internal resource script compiler (for menus, dialogs, etc.).
  - Dialog controls can also be visually moved and/or resized by clicking and dragging the respective dialog controls prior to recompiling with the internal compiler.

18 Demo on Resource Hacker

19 Reversing Tools
- No all-in-one tool
- Disassemblers and low-level debuggers
  - IDA Pro (Interactive Disassembler from www.datarescue.com, and now hex-rays.com)
  - ILDasm
  - OllyDbg
  - W32Dasm
  - WinDbg
  - Numega SoftICE
- Decompilers
  - A dream still to come true
  - Hex-Rays Decompiler
  - In development: Andromeda and Boomerang
  - For Java and C#, the dream has come true
- System monitors
  - FileMon, TCPView, TDIMon, RegMon, PortMon, WinObj, Process Explorer, Show Traffic, etc.
- Hex Editors
  - Hex Workshop
  - Hiew
  - WinHex

20 Unix/Linux Platform
- Black Box Analysis
  - /proc filesystem (comparable to Process Explorer from Sysinternals)
  - nm, lsof
  - netstat
  - ethereal, wireshark, tcpdump
- White Box Analysis
  - gcc, gdb
  - strace
  - DDD
  - readelf, dumpelf

21 IDA-generated function flowchart

22 IDA-generated function interactions at high level

23 Program Structure
- Static libraries
  - Third-party library external to the program
  - Added to the program while it is being built
  - Becomes an integral part of the final executable
- Dynamic libraries
  - The idea is to use memory efficiently
  - Multiple processes can use the same DLL
  - Remains separate from the executable
  - An update to the DLL requires no update to the program
  - Simplifies the reversing process and gives many hints about program behavior

24 The .exe format and DLLs
- Before the DLLs are loaded, the addresses of DLL functions point to dummy addresses in an import table.
- When the process is loaded, the OS loader loads every module listed in the import table and resolves the addresses of each of the functions listed for each module. The addresses are found in the export table of the module.

25 Load Address 0x00400000
- Typically the default load address of an executable is 0x00400000 (32-bit DLLs commonly default to 0x10000000)
- This is called the preferred load address
- However, the loader may load the image at a different address
- Example:
  - If the executable loads at 0x00500000, the OS loader has to add the offset to all referenced addresses within the image
  - The debugger does the same for seamless translation
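The address adjustment in the example above is simple arithmetic: the loader computes the difference between the actual and preferred base and adds it to every absolute address in the image. A minimal sketch, assuming 32-bit addresses (the function name `rebase` is illustrative only):

```python
PREFERRED_BASE = 0x00400000  # preferred load address from the slide


def rebase(address: int, actual_base: int,
           preferred_base: int = PREFERRED_BASE) -> int:
    """Translate an address linked for preferred_base to actual_base."""
    delta = actual_base - preferred_base  # 0 when loaded where preferred
    return address + delta


# An address linked at 0x00401000, with the image loaded at 0x00500000:
print(hex(rebase(0x00401000, 0x00500000)))  # 0x501000
```

A real loader applies this delta only at the locations recorded in the image's relocation information; this sketch shows just the arithmetic.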

26 Important Flags

27 Important Machine Instructions
Here are some basic x86 machine code hex values:
- jmp (short) = EB
- je = 74
- jne = 75
- ret = C3
- nop = 90
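To make the table above concrete, the following toy sketch labels a byte sequence using only those five opcodes. It assumes that the short jump and the two conditional jumps are each followed by a one-byte signed displacement; everything else is printed as unknown. This is not a disassembler, and the name `annotate` is made up for the example.

```python
# Opcode values taken from the slide.
OPCODES = {0xEB: "jmp short", 0x74: "je", 0x75: "jne", 0xC3: "ret", 0x90: "nop"}
# These three are followed by a signed 8-bit relative displacement.
HAS_REL8 = {0xEB, 0x74, 0x75}


def annotate(code: bytes) -> list:
    """Return one text line per recognized instruction or unknown byte."""
    lines = []
    i = 0
    while i < len(code):
        op = code[i]
        name = OPCODES.get(op)
        if name is None:
            lines.append(f"{op:02X}  ??")
            i += 1
        elif op in HAS_REL8:
            if i + 1 >= len(code):
                lines.append(f"{op:02X}  {name} <truncated>")
                break
            rel = int.from_bytes(code[i + 1:i + 2], "little", signed=True)
            lines.append(f"{op:02X} {code[i + 1]:02X}  {name} {rel:+d}")
            i += 2
        else:
            lines.append(f"{op:02X}  {name}")
            i += 1
    return lines


for line in annotate(bytes([0x90, 0x74, 0x05, 0xC3])):
    print(line)
```

Note that je (74) and jne (75) differ by a single bit, which is why a one-byte change in a hex editor can invert a branch.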

28 Demos
- Removing Time Limit
  - Xtheme
  - SmartCapture
- Removing Unregistered Title from JpegOptimizer
- Finding out the valid key in JpegOptimizer
- Tour of W32Dasm & IDA Pro

29 Keygenning
- Definition: the process of creating programs that mimic the key-generation algorithm for everyone to use
- A leaked key can be blacklisted if the software always works with the same (username, serial number) pair
- Better yet:
  - Make the key depend on the username plus computer-specific information (e.g., disk drive serial number)
- Ripping Keygen Algorithms
  - The process is somewhat easy
  - Locate/copy/paste/modify
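From the vendor's side, the "better yet" idea above can be sketched as a key derived from the username together with a machine-specific value. The sketch below uses an HMAC for the derivation; that choice, the secret, the key layout, and the function names are all assumptions for illustration and are not described in the slides.

```python
import hashlib
import hmac

# Hypothetical vendor secret; a real product would not hard-code it like this.
VENDOR_SECRET = b"example-vendor-secret"


def make_key(username: str, disk_serial: str,
             secret: bytes = VENDOR_SECRET) -> str:
    """Derive a license key bound to both the user and the machine."""
    message = f"{username}|{disk_serial}".encode("utf-8")
    digest = hmac.new(secret, message, hashlib.sha256).hexdigest().upper()
    short = digest[:20]
    # Format as XXXXX-XXXXX-XXXXX-XXXXX for readability.
    return "-".join(short[i:i + 5] for i in range(0, 20, 5))


def is_valid(username: str, disk_serial: str, key: str,
             secret: bytes = VENDOR_SECRET) -> bool:
    """Recompute the expected key and compare in constant time."""
    return hmac.compare_digest(make_key(username, disk_serial, secret), key)
```

If the validation routine and its secret ship inside the client binary, they can be located and copied, which is exactly the "ripping" weakness the slide points out; binding the key to machine data only prevents one key from working everywhere.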

30 Keygenning static demo

31 Code Injection
- Write the code to inject offline, then compile and link it
- Disassemble the executable and extract the related machine code
- Overwrite parts of the code segment (CS) and data segment (DS), and transfer control to the injected code
  - The original execution flow is changed
- More popular is the inline hooking used by rootkits
  - The behavior and dynamics of the application remain the same
  - The injected code is executed seamlessly

32 Rootkit Hooking
- What is a rootkit?
  - A set of programs and code that allows a permanent or consistent, undetectable presence on a computer
  - Goals:
    - Hide malicious resources (processes, files, registry keys, open ports, etc.)
    - Provide hidden backdoor access

33 In-Line Hooking
- Overwrite the first few bytes of the target function with a jump to the rootkit code
- Create a “trampoline” function that first executes the overwritten bytes from the original function, then jumps back to the original function
- When the function is called, the rootkit code executes
- The rootkit code calls the trampoline, which executes the original function

34 In-line Hooking

35 Copyright Protection Schemes
- Building “Tamper-resistant Software”
- Is it a losing battle?

36 Design Considerations
- Resistance to attack
- End-user transparency

37 Types of Protection
- Media-Based Protections
  - Deliberately writing bad sectors on floppy or CD (e.g., No-Crack CD)
  - Easy to break
- Watermarking
  - Check for a watermark on the media before running (against software counterfeiting)
  - Can be broken
- Serial Numbers
  - The program has a “secret validation algorithm” to check for good serial numbers
  - Easy to break
- Checksum
  - Check for a valid checksum before or during running
  - Can be broken
- Challenge Response and Online Activations
  - A secret key is shared
  - Can be broken
- Hardware-Based Protections
  - The program runs or keeps running only if a dongle is attached
  - Can be broken
- Software as a Service (SaaS)
  - Highly secure
  - Cloud computing
  - The vendor is in control
- Advanced Protection Concepts: Cryptoprocessors
  - Violates Open Architecture
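The checksum protection listed above can be sketched as follows: the program hashes a region of its own code and compares the result with a value recorded at build time. SHA-256 and the names `checksum` and `is_untampered` are assumptions made for this example; the slides do not specify an algorithm.

```python
import hashlib
import hmac


def checksum(code: bytes) -> str:
    """Hash the protected bytes (SHA-256 is an illustrative choice)."""
    return hashlib.sha256(code).hexdigest()


def is_untampered(code: bytes, expected_hex: str) -> bool:
    """Compare the current hash with the value recorded at build time."""
    return hmac.compare_digest(checksum(code), expected_hex)


original = bytes([0x74, 0x05, 0x90, 0xC3])  # je +5; nop; ret
expected = checksum(original)               # stored when the program is built
patched = bytes([0x75, 0x05, 0x90, 0xC3])   # je flipped to jne

print(is_untampered(original, expected))  # True
print(is_untampered(patched, expected))   # False
```

As the slide notes, this can be broken: whoever patches the code can also patch the stored checksum or the comparison itself.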

38 Pinpointing the weakness
- Open Architecture
  - The CPU cannot check for “authorized” or “legal” code
  - The only checks are execution mode and process memory protection by the OS: useless here
  - When a program runs, its data and code have to be exposed or decrypted
  - The key and the decrypted data are impossible to hide, because it is the CPU that performs the decryption

39 Cryptoprocessors
- Originated by Robert M. Best in his patent, “Executing Enciphered Programs.”
- Decrypt code, then execute
- Each cryptoprocessor has (issued by a CA)
  - Two pairs of keys
  - A serial number
- The Steps
  - Software is shipped by the vendor with the binary executable encrypted
  - The cryptoprocessor decrypts it with its private key
  - Machine code is placed in a special memory that is not software-accessible
    - Similar to microcode for each machine instruction
    - Only accessed by the process itself; any other access by the OS, the IDT, or any software is a violation
  - Code is executed from this (theoretically) inaccessible memory

40 Is this unbreakable?
- Black box analysis
- Place a microprobing needle on pins
  - Can be sealed off
  - If tampered with, may self-destruct or zero out keys (using photodiodes or thermite)
- DPA (Differential Power Analysis)
  - A real killer for smartcards these days
  - By Paul Kocher, Joshua Jaffe, and Benjamin Jun (in 1996)
  - Watch the power consumption (by measuring with great precision the current drawn) and correlate it with the computations being performed to deduce the value of the crypto keys

41
- DEA (Differential Electromagnetic Analysis)
  - A tiny coil (magnetic sensor) is placed on top of the chip to deduce internal activities
  - Combined with DPA to deduce more information
- Optical and Laser Probing
  - Focused light/laser/X-ray -> change the state of a transistor -> turn memory bits on/off -> cause faults, and therefore break protections and deduce information
- Radiation Analysis
  - Non-CRT monitors and gas plasma TVs are recommended, as they have lower radiated emissions
- Usenix 2009 Paper

44 Antireversing Techniques
- Let's get one thing out of the way
  - It is never possible to entirely prevent reversing
  - What is possible is to hinder and obstruct reversers by wearing them out and making the process slow and painful
    - Drown the reverser in a sea of irrelevant information
    - Outcome: they give up

45 Basic Antireversing Techniques
- Eliminating symbolic information
  - Use inlining, and addresses instead of function names
  - Ship the executable without debugging information
  - Be careful when shipping products as bytecode (Java and .NET)
    - Bytecode is the intermediate code of a compiler
    - It uses internal cross-references (names) rather than addresses
    - This makes it very easy to decompile to source code
    - More is exposed: class names, class member names, names of instantiated global objects
- Code obfuscation or randomization
  - Functionally identical yet far less readable
  - E.g., rename all symbolic names into meaningless symbols
- Code encryption or checksum
  - Part of the binary executable is encrypted and decrypted at runtime
  - Only hinders static analysis
- Embedding antidebugger code
  - A separate thread checks for the existence of a debugger or a long stoppage time
  - If so, terminate or, better yet, confuse it
  - The workaround is static analysis
- Powerful combination: use encryption with antidebugging
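The renaming idea above can be illustrated on Python source rather than on a compiled binary. The toy sketch below uses the standard `ast` module (Python 3.9+ for `ast.unparse`) to replace function, argument, and assigned variable names with meaningless ones while leaving behavior unchanged. It is an assumption-laden illustration: it ignores keyword arguments at call sites, attributes, classes, and imports, and the names `Renamer` and `obfuscate` are invented for the example.

```python
import ast


class Renamer(ast.NodeTransformer):
    """Rewrite identifiers according to a fixed old-name -> new-name mapping."""

    def __init__(self, mapping):
        self.mapping = mapping

    def visit_FunctionDef(self, node):
        node.name = self.mapping.get(node.name, node.name)
        self.generic_visit(node)  # also rename inside arguments and body
        return node

    def visit_arg(self, node):
        node.arg = self.mapping.get(node.arg, node.arg)
        return node

    def visit_Name(self, node):
        node.id = self.mapping.get(node.id, node.id)
        return node


def collect_defined_names(tree):
    """Names introduced by the code itself (builtins are never collected)."""
    names = []
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            names.append(node.name)
        elif isinstance(node, ast.arg):
            names.append(node.arg)
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            names.append(node.id)
    return names


def obfuscate(source: str) -> str:
    """Return equivalent source with defined names replaced by _0, _1, ..."""
    tree = ast.parse(source)
    mapping = {}
    for name in collect_defined_names(tree):
        mapping.setdefault(name, f"_{len(mapping)}")
    return ast.unparse(Renamer(mapping).visit(tree))


SOURCE = (
    "def check_serial(serial):\n"
    "    total = sum(ord(ch) for ch in serial)\n"
    "    return total % 7 == 0\n"
)
print(obfuscate(SOURCE))
```

The output is functionally identical, but the descriptive names that would guide a reverser are gone; the control flow is still fully visible, which is why renaming is usually combined with other techniques.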

46 JAVA, C#, and .NET
- Reversing is an incredibly trivial task
- Bytecode and MSIL are highly detailed
  - Full definition of every data structure
  - Name of every symbol: every object, data member, and member function
- Decompile and modify
- Countermeasures
  - Renaming symbols
  - Control flow obfuscation
  - Breaking decompilation and disassembly
    - Corrupting the metadata with bogus references, strings, methods, etc.

47 Decompiling C# function

48 Example of Obfuscation: Code Interleaving

49 Detecting a Debugger
- TF (trap flag) set in EFLAGS
- Check handlers in the IDT
- Enable TF and see if the exception is swallowed by the debugger
- Code checksum
  - Slows down runtime
  - Use only for sensitive functions
- Use a special thread to see if the program was stopped for a long time
- Inserting junk bytes into code (opaque predicate)
  - Use db or ds to insert junk bytes
    - Data is confused as code, e.g. ‘mov’ may become ‘ret’
  - Disassemblers get confused
    - Linear sweep vs. recursive traversal
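The "special thread" check above can be sketched as a watcher that wakes at a fixed interval and compares how much time actually passed. If the whole process was frozen (for example at a breakpoint), the gap between two wake-ups is much larger than the interval. The interval, tolerance, and function names below are assumptions chosen for illustration.

```python
import threading
import time


def gap_is_suspicious(previous: float, current: float,
                      interval_s: float, tolerance_s: float) -> bool:
    """True if far more time passed between wake-ups than was requested."""
    return (current - previous) > interval_s + tolerance_s


def watch(stop_event: threading.Event, on_stall,
          interval_s: float = 0.1, tolerance_s: float = 1.0,
          clock=time.monotonic) -> None:
    """Run in a separate thread; call on_stall(gap) after a long stoppage."""
    previous = clock()
    # Event.wait returns False on timeout, True once stop_event is set.
    while not stop_event.wait(interval_s):
        current = clock()
        if gap_is_suspicious(previous, current, interval_s, tolerance_s):
            on_stall(current - previous)
        previous = current


# Example wiring: start the watcher, run the program, then stop it.
stop = threading.Event()
watcher = threading.Thread(
    target=watch, args=(stop, lambda gap: print(f"stalled for {gap:.1f}s")),
    daemon=True,
)
watcher.start()
stop.set()
watcher.join()
```

The tolerance has to be generous, since heavy system load or the machine going to sleep can also produce long gaps; a check like this is a heuristic rather than proof of a debugger.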

50 Summary
- Open Architecture is the real enemy of software protection.
- To date, it is virtually impossible to create a totally uncrackable copy protection scheme or tamper-resistant software.
- “Using non-traditional cryptanalysis attacks (like timing, power consumption, radiation, …) force cryptography to do something new, different and unexpected,” Bruce Schneier.
- Most effective/practical technique
  - A combination of encryption and anti-debugging

