CS266 Software Reverse Engineering (SRE) Applying Anti-Reversing Techniques to Machine Code Teodoro (Ted) Cipresso,

Slides:



Advertisements
Similar presentations
Chapter 11 Introduction to Programming in C
Advertisements

Software & Services Group, Developer Products Division Copyright© 2010, Intel Corporation. All rights reserved. *Other brands and names are the property.
Systems Software.
CS266 Software Reverse Engineering (SRE) Applying Anti-Reversing Techniques to Java Bytecode Teodoro (Ted) Cipresso,
.NET IL Obfuscation Presented by: Sarath Chandra Dorbala.
VBA Modules, Functions, Variables, and Constants
16/13/2015 3:30 AM6/13/2015 3:30 AM6/13/2015 3:30 AMIntroduction to Software Development What is a computer? A computer system contains: Central Processing.
Computers: Tools for an Information Age
Run-Time Storage Organization
Software Uniqueness: How and Why? Puneet Mishra Dr. Mark Stamp Department of Computer Science San José State University, San José, California.
1.3 Executing Programs. How is Computer Code Transformed into an Executable? Interpreters Compilers Hybrid systems.
Compiled by Benjamin Muganzi 3.2 Functions and Purposes of Translators Computing 9691 Paper 3 1.
1 Introduction to Tool chains. 2 Tool chain for the Sitara Family (but it is true for other ARM based devices as well) A tool chain is a collection of.
CS102 Introduction to Computer Programming
Chapter Seven Advanced Shell Programming. 2 Lesson A Developing a Fully Featured Program.
Dr. Pedro Mejia Alvarez Software Testing Slide 1 Software Testing: Building Test Cases.
Debugging, Build and Version Control Rudra Dutta CSC Spring 2007, Section 001.
Unit Testing & Defensive Programming. F-22 Raptor Fighter.
Trying to like a boss… REVERSE ENGINEERING. WHAT EVEN IS… REVERSE ENGINEERING?? Reverse engineering is the process of disassembling and analyzing a particular.
® IBM Software Group © 2009 IBM Corporation Rational Publishing Engine RQM Multi Level Report Tutorial David Rennie, IBM Rational Services A/NZ
JIT in webkit. What’s JIT See time_compilation for more info. time_compilation.
Improving Program Performance Function Visibility in z/TPF C++ Load Modules October 2, /2/20151American Express Public.
Chapter 1 What is Programming? Lecture Slides to Accompany An Introduction to Computer Science Using Java (2nd Edition) by S.N. Kamin, D. Mickunas, E.
9 Chapter Nine Compiled Web Server Programs. 9 Chapter Objectives Learn about Common Gateway Interface (CGI) Create CGI programs that generate dynamic.
KEVIN COOGAN, GEN LU, SAUMYA DEBRAY DEPARTMENT OF COMUPUTER SCIENCE UNIVERSITY OF ARIZONA 報告者:張逸文 Deobfuscation of Virtualization- Obfuscated Software.
Binary Auditing Geller Bedoya Michael Wozniak. Background  Binary auditing is a technique used to test the security and discover the inner workings of.
Chapter 12 Recursion, Complexity, and Searching and Sorting
Old Chapter 10: Programming Tools A Developer’s Candy Store.
1 COMP 3438 – Part II-Lecture 1: Overview of Compiler Design Dr. Zili Shao Department of Computing The Hong Kong Polytechnic Univ.
CS266 Software Reverse Engineering (SRE) Reversing and Patching Java Bytecode Teodoro (Ted) Cipresso,
EECS 354 Network Security Reverse Engineering. Introduction Preventing Reverse Engineering Reversing High Level Languages Reversing an ELF Executable.
CSC 230: C and Software Tools Rudra Dutta Computer Science Department Course Introduction.
Debugging in Java. Common Bugs Compilation or syntactical errors are the first that you will encounter and the easiest to debug They are usually the result.
Testing and Debugging Version 1.0. All kinds of things can go wrong when you are developing a program. The compiler discovers syntax errors in your code.
Replay Compilation: Improving Debuggability of a Just-in Time Complier Presenter: Jun Tao.
© Janice Regan, CMPT 300, May CMPT 300 Introduction to Operating Systems Memory: Relocation.
Debugging and Profiling With some help from Software Carpentry resources.
Chapter 1 Introduction. Chapter 1 - Introduction 2 The Goal of Chapter 1 Introduce different forms of language translators Give a high level overview.
1. 2 Preface In the time since the 1986 edition of this book, the world of compiler design has changed significantly 3.
CS 460/660 Compiler Construction. Class 01 2 Why Study Compilers? Compilers are important – –Responsible for many aspects of system performance Compilers.
U NIVERSITY OF D ELAWARE C OMPUTER & I NFORMATION S CIENCES D EPARTMENT Optimizing Compilers CISC 673 Spring 2009 Overview of Compilers and JikesRVM John.
Buffer Overflow Proofing of Code Binaries By Ramya Reguramalingam Graduate Student, Computer Science Advisor: Dr. Gopal Gupta.
The Software Development Process
 Programming - the process of creating computer programs.
Introduction to OOP CPS235: Introduction.
Quick Test Professional 9.2. Testing Process Preparing to Record Recording Enhancing a Test Debugging Running the Test and Analyzing the Results Reporting.
Defensive Programming. Good programming practices that protect you from your own programming mistakes, as well as those of others – Assertions – Parameter.
Lecture #1: Introduction to Algorithms and Problem Solving Dr. Hmood Al-Dossari King Saud University Department of Computer Science 6 February 2012.
PROGRAMMING FUNDAMENTALS INTRODUCTION TO PROGRAMMING. Computer Programming Concepts. Flowchart. Structured Programming Design. Implementation Documentation.
LECTURE 3 Translation. PROCESS MEMORY There are four general areas of memory in a process. The text area contains the instructions for the application.
Some of the utilities associated with the development of programs. These program development tools allow users to write and construct programs that the.
Debugging and Testing Hussein Suleman March 2007 UCT Department of Computer Science Computer Science 1015F.
Chapter 1 Introduction Samuel College of Computer Science & Technology Harbin Engineering University.
INTRO. To I.T Razan N. AlShihabi
Lecture 3 Translation.
Advanced Computer Systems
Visit for more Learning Resources
Application of Obfuscation Techniques on Android Applications
Static and dynamic analysis of binaries
Introduction to Compiler Construction
Types for Programs and Proofs
Testing and Debugging PPT By :Dr. R. Mall.
CSCI-235 Micro-Computer Applications
Compiler Construction (CS-636)
Un</br>able’s MySecretSecrets
Compiler Construction
Software testing strategies 2
Test Case Test case Describes an input Description and an expected output Description. Test case ID Section 1: Before execution Section 2: After execution.
CISC 7120X Programming Languages and Compilers
SPL – PS1 Introduction to C++.
Presentation transcript:

CS266 Software Reverse Engineering (SRE) Applying Anti-Reversing Techniques to Machine Code Teodoro (Ted) Cipresso, Department of Computer Science San José State University Spring 2015 The information in this presentation is taken from the thesis “Software reverse engineering education” available at where all citations can be found.

Applying Anti-Reversing Techniques to Machine Code Introduction, Motivation, and Considerations  Extreme care must be taken when applying anti-reversing techniques because most techniques ultimately change the machine code.  In the end, if a program does not run correctly, measuring how efficient or difficult to reverse engineer it is becomes meaningless [18].18  Anti-reversing transformations performed on source code make a program more difficult to understand in both source and executable formats. These transforms can expose compiler bugs because the program no longer looks like something a human would write. [18] states “any compiler is going to have at least some pathological programs which it will not compile correctly.”18 2

Applying Anti-Reversing Techniques to Machine Code Introduction, Motivation, and Considerations  Compiler failures on “pathological” programs occur because compiler test cases are most often coded by people.  Test cases are not typically generated by some sophisticated tool that knows how to try every fringe case and surface every bug.  Therefore we should not be surprised if some compilers have difficulty with obfuscated source code.  We now investigate the technique Eliminating Symbolic Information as it applies to machine code. We previously looked at this technique in Applying Anti-Reversing Techniques to Java Bytecode.Applying Anti-Reversing Techniques to Java Bytecode 3

Applying Anti-Reversing Techniques to Machine Code Eliminating Symbolic Information in Machine Code  Eliminating Symbolic Information  Eliminating Symbolic Information calls for the removal of any meaningful symbolic information in machine code that is not important to the execution of the program, but serves to ease debugging or reuse of it by another program..idata For example, if a Windows program references functions (methods) exported by one or more libraries (DLLs), the names of those methods will appear in the.idata (import data) section of the program binary.  In production versions of a program, the machine code doesn't directly contain any symbolic information from the original source code. In the executable: Method names, variable names (etc..), and line numbers are all absent. Only the machine instructions produced directly or indirectly by the compiler [9] are present.9 4

Applying Anti-Reversing Techniques to Machine Code Eliminating Symbolic Information in Machine Code  This lack of information about the connection between the machine instructions and HLL source is unacceptable for purposes of debugging. Therefore most modern compilers, like GCC, include an option to include debugging information into an executable. The included debugging information allows a debugger to map one or more machine instructions back to a particular HLL statement [9].9  To demonstrate symbolic information that is inserted into machine code to enable debugging of an application: Calculator.cpp Calculator.cpp was compiled using the GNU C++ compiler (g++) with options to include debugging information and to generate assembly source instead of an executable (machine code).debugging information 5

Applying Anti-Reversing Techniques to Machine Code Eliminating Symbolic Information in Machine Code 6 Calculator.cpp

 The GNU compiler stores debug information in the symbol tables (.stabs) section of the Windows PE header so that it will be loaded into memory as part of the program image. CalculatorDebug.s  The generated assembly language files CalculatorDebug.s on the next slide shows some of the debugging information inserted by GCC. While this information is not a replacement for the original source code it does provide some helpful information to a reverser. The GNU Project Debugger (GDB) is a source-level debugger and therefore must access the HLL source file to make use of the debugging information. 7

Applying Anti-Reversing Techniques to Machine Code Eliminating Symbolic Information in Machine Code 8 CalculatorDebug.cpp

 Debugging information can give plenty of hints to a reverse engineer, such as the count and type of parameters one must pass to a given method.  An obvious recommendation is to ensure the code is not compiled with debugging information before shipment to end-users.  The hunt for symbolic information doesn't end with information embedded by debuggers; actually it rarely begins there.  Eliminating symbolic information in machine code is difficult, therefore we’ll manually implement some familiar techniques to obfuscate the information.  First, we will take a detour to look at obfuscation of source code and what use cases it might address. 9

Applying Anti-Reversing Techniques to Machine Code Source Code Obfuscation  Obfuscating the Program  Obfuscating the Program calls for performing transformations to the source or machine code that would render either code extremely difficult to understand but functionally equivalent to the original.  When delivering software to customers, they may require the source code so that the product can be compiled using in-house build and audit procedures.  In the likely event that the source code contains intellectual property, it can be obfuscated without changing the the resultant machine code. COBF  To demonstrate source code obfuscation, COBF [23], a free C/C++ source code obfuscator was configured and given Calculator.cpp.23 Stunnix C/C++ is a commercial source code obfuscator. Stunnix C/C++ 10

11 cobf.h CalculatorObf.cpp

Applying Anti-Reversing Techniques to Machine Code Basic Obfuscation of Machine Code  [19] states “Obfuscation of Java bytecode is possible for the same reasons that decompiling is possible: Java bytecode is standardized and well documented.”19  Machine code is not standardized; instruction sets, formats, and program image layouts vary depending on the target platform architecture.  Tools to assist with obfuscating machine code are much more challenging to implement and expensive to acquire. (I have not found any free tools).  EXECryptor is an industrial-strength machine code obfuscator. EXECryptor When applied to the machine code for the Password Vault application, EXECryptor rendered it extremely difficult to understand.  EXECryptor fails to start in Windows 8.1 due to an anti-debug error. 12

Applying Anti-Reversing Techniques to Machine Code Basic Obfuscation of Machine Code  Direct machine code obfuscations can be hard to analyze or follow, so we'll perform obfuscations at the source code level and observe differences in the assembly code generated by the GNU C/C++ compiler.  Success is achieved when an obfuscated program has the same functionality as the original, but is more difficult to understand during live or static analysis.  There are no standards for code obfuscation, but it's relatively important to ensure that the obfuscations applied to a program are not easily undone because deobfuscation tools can be used to eliminate easily identified obfuscations [5].5 VerifyPassword.cpp  We now look at the source and disassembly for VerifyPassword.cpp, a simple C++ program that contains a simple if-test for a password. We then embed a simple cipher to protect sensitive constants. 13

Applying Anti-Reversing Techniques to Machine Code Basic Obfuscation of Machine Code 14 VerifyPassword.cpp

15.text section.text section.rdata section.rdata section

Applying Anti-Reversing Techniques to Machine Code Basic Obfuscation of Machine Code 16 VerifyPasswordObf.cpp

17.text section.text section.rdata section.rdata section

18

19 Run As Admin Run As Admin Python Editor Python Editor

20 Dump of.rdata 0x Run exe in new window

21 Run Frida Script Run Frida Script Plaintext extracted Plaintext extracted

22 Run As Admin Run As Admin Python Editor Python Editor

23 Run exe in new window Dump of.rdata 0x445000

24 Run Frida Script Run Frida Script Ciphertext extracted Ciphertext extracted

25

26 max.c We want to call max() dynamically using Frida We want to call max() dynamically using Frida Keep the program resident Keep the program resident

27 Debug statements from int max(x,y) function Debug statements from int max(x,y) function

28 Run max.exe using OllyDbg File->Open Specify arguments

29 int max(x,y) 0x4012F0 int max(x,y) 0x4012F0 X >= Y ?

30

31 Run max.exe in new window max.exe stays resident

32 Run max.py 2x to invoke max(x,y) in max.exe

33 Applying Anti-Reversing Techniques to Machine Code Password Vault Anti-Reversing Exercise  The machine code anti-reversing exercise asks one to create a new executable version of the Password Vault application where the following transformations are applied: data obfuscation Encryption of string literals (data obfuscation). computation obfuscation Obfuscation of the numeric representation of the password record limit (computation obfuscation). control flow obfuscation Obfuscation of the method that performs the record limit check (control flow obfuscation). Contains both unobfuscated and Obfuscated source

34 Applying Anti-Reversing Techniques to Machine Code Data Obfuscation data obfuscation  Encryption of String Literals (data obfuscation): Strings are decrypted each time they are used using a bundled cipher.

35 Applying Anti-Reversing Techniques to Machine Code Computation Obfuscation computation obfuscation  Obfuscation of the numeric representation of the password record limit (computation obfuscation): Complex evaluations obscure the actual condition.

36 Applying Anti-Reversing Techniques to Machine Code Computation Obfuscation (cont’d)  Obfuscation of the numeric representation of the password record limit (computation obfuscation) (cont’d): Testing for a function of a number can slow a reverser down.

37 Applying Anti-Reversing Techniques to Machine Code Control Flow Obfuscation control flow obfuscation  Obfuscation of the method that performs the record limit check (control flow obfuscation): We introduce some non-essential, recursive, and randomized logic to the password limit check to make it more difficult for a reverser to perform static and/or live analysis. Since no standards exist for control flow obfuscation, a custom algorithm was designed to hinder live and static analysis through use of recursive and randomized procedure calls. Recursion grows the stack considerably, making stepping through the code difficult, while randomization makes execution unpredictable (breakpoints may not trigger & run traces differ).

38 Applying Anti-Reversing Techniques to Machine Code Control Flow Obfuscation (cont’d) A control flow obfuscation algorithm for the record limit check. Depth of the recursion is randomized on each check of the limit. Random procedure call targets generate and return a number that is added to an instance variable, preventing the procedures from being identified as NOOPs by a code optimizer.

39 Applying Anti-Reversing Techniques to Machine Code Control Flow Obfuscation (cont’d)  To measure the effectiveness of the control flow algorithm in hindering analysis, three execution traces of the section of the code containing the record limit check were compared.  The Levenshtein Distance (LD) was computed between the three traces where each instruction in the trace was compared. LD was modified to consider each line as opposed to each character.Levenshtein Distance  The execution traces were collected using OllyDbg and had to be cleaned of disassembly artifacts such as line numbers, base addresses, and comments in order to ensure that the analysis was fair.

40 Applying Anti-Reversing Techniques to Machine Code Control Flow Obfuscation (cont’d) Comparison of executions of record limit check on identical program input.

41