RIVERSIDE RESEARCH INSTITUTE Deobfuscator: An Automated Approach to the Identification and Removal of Code Obfuscation Eric Laspe, Reverse Engineer Jason.

Slides:



Advertisements
Similar presentations
Practical Malware Analysis
Advertisements

Saumya Debray The University of Arizona Tucson, AZ
A simple register allocation optimization scheme.
Smita Thaker 1 Polymorphic & Metamorphic Viruses Presented By : Smita Thaker Dated : Nov 18, 2003.
PRESENTED BY: © Mandiant Corporation. All rights reserved. X86 Binary Rewriting Many Binaries. Such Secure. Wow. Richard Wartell 06/29/14.
RIVERSIDE RESEARCH INSTITUTE Helikaon Linux Debugger: A Stealthy Custom Debugger For Linux Jason Raber, Team Lead - Reverse Engineer.
1 Detection of Injected, Dynamically Generated, and Obfuscated Malicious Code (DOME) Subha Ramanathan & Arun Krishnamurthy Nov 15, 2005.
IDA and obfuscated code Hex-Rays Ilfak Guilfanov.
Chapter 5 Anti-Anti-Virus. Anti-Anti-Virus  All viruses self-replicate  Anti-anti-virus means it’s “openly hostile” to AV  Anti-anti-virus techniques?
Binary Obfuscation Using Signals Igor V. Popov ( University of Arizona)‏ Saumya K. Debray (University of Arizona)‏ Gregory R. Andrews (University of Arizona)
TK 2633 Microprocessor & Interfacing Lecture 3: Introduction to 8085 Assembly Language Programming (2) 1 Prepared By: Associate Prof. Dr Masri Ayob.
Accessing parameters from the stack and calling functions.
X86 ISA Compiler Baojian Hua Front End source code abstract syntax tree lexical analyzer parser tokens IR semantic analyzer.
1. 2 FUNCTION INLINE FUNCTION DIFFERENCE BETWEEN FUNCTION AND INLINE FUNCTION CONCLUSION 3.
C Prog. To Object Code text text binary binary Code in files p1.c p2.c
SRE  Introduction 1 Software Reverse Engineering (SRE)
2  Problem Definition  Project Purpose – Building Obfuscator  Obfuscation Quality  Obfuscation Using Opaque Predicates  Future Planning.
Software Analysis & Deobfuscation Engine. Page  2  Project Name: SADE  Project Members: Faiza Khalid, Komal Babar and Abdul Wahab  Project Supervisor.
Topic #10: Optimization EE 456 – Compiling Techniques Prof. Carl Sable Fall 2003.
Application Security Tom Chothia Computer Security, Lecture 14.
Abstraction Interpretation Abstract Interpretation is a general theory for approximating the semantics of dynamic systems (Cousot & Cousot 1977) Abstract.
Dr. José M. Reyes Álamo 1.  The 80x86 memory addressing modes provide flexible access to memory, allowing you to easily access ◦ Variables ◦ Arrays ◦
A Model for Self-Modifying Code Bertrand Anckaert, Matias Madou and Koen De Bosschere 8 th Information Hiding Conference, July 11 th 2006.
KEVIN COOGAN, GEN LU, SAUMYA DEBRAY DEPARTMENT OF COMUPUTER SCIENCE UNIVERSITY OF ARIZONA 報告者:張逸文 Deobfuscation of Virtualization- Obfuscated Software.
Discovering Similarity of Short Programs by Canonical Form Baohua Wu University of Pennsylvania.
Learning, Monitoring, and Repair in Application Communities Martin Rinard Computer Science and Artificial Intelligence Laboratory Massachusetts Institute.
Windows PE files Infections and Heuristic Detection Nicolas BRULEZ / Digital River PACSEC '04.
Binary Auditing Geller Bedoya Michael Wozniak. Background  Binary auditing is a technique used to test the security and discover the inner workings of.
EECS 354 Network Security Reverse Engineering. Introduction Preventing Reverse Engineering Reversing High Level Languages Reversing an ELF Executable.
Introduction to Information Security מרצים : Dr. Eran Tromer: Prof. Avishai Wool: מתרגלים : Itamar Gilad
Auther: Kevian A. Roudy and Barton P. Miller Speaker: Chun-Chih Wu Adviser: Pao, Hsing-Kuo.
Christopher Kruegel University of California Engin Kirda Institute Eurecom Clemens Kolbitsch Thorsten Holz Secure Systems Lab Vienna University of Technology.
Exploitation Of Windows Buffer Overflows. What is a Buffer Overflow A buffer overflow is when memory is copied to a location that is outside of its allocated.
Analyzing Memory Accesses in Obfuscated x86 Executables Michael Venable Mohamed R. Choucane Md. Enamul Karim Arun Lakhotia (Presenter) DIMVA 2005 Wien.
Buffer Overflow Proofing of Code Binaries By Ramya Reguramalingam Graduate Student, Computer Science Advisor: Dr. Gopal Gupta.
Disclaimer The Content, Demonstration, Source Code and Programs presented here is "AS IS" without any warranty or conditions.
A Generic Approach to Automatic Deobfuscation of Executable Code Paper by Babak Yadegari, Brian Johannesmeyer, Benjamin Whitely, Saumya Debray.
Where’s the FEEB?: Effectiveness of Instruction Set Randomization Nora Sovarel, David Evans, Nate Paul University of Virginia Computer Science USENIX Security.
Operating System Protection Through Program Evolution Fred Cohen Computers and Security 1992.
Buffer Overflow Attack- proofing of Code Binaries Ramya Reguramalingam Gopal Gupta Gopal Gupta Department of Computer Science University of Texas at Dallas.
Introduction to Information Security מרצים : Dr. Eran Tromer: Prof. Avishai Wool: מתרגלים : Itamar Gilad
Formal Refinement of Obfuscated Codes Hamidreza Ebtehaj 1.
Lecture 11 Example Rootkit. Intel internship Intel CTG (Corporate Technology Group) –Advanced research & development –System integrity services using.
Calling Procedures C calling conventions. Outline Procedures Procedure call mechanism Passing parameters Local variable storage C-Style procedures Recursion.
Improvements to the Compiler Lecture 27 Mon, Apr 26, 2004.
Practical Session 8. Position Independent Code- self sufficiency of combining program Position Independent Code (PIC) program has everything it needs.
Preocedures A closer look at procedures. Outline Procedures Procedure call mechanism Passing parameters Local variable storage C-Style procedures Recursion.
Correct RelocationMarch 20, 2016 Correct Relocation: Do You Trust a Mutated Binary? Drew Bernat
Paradyn Project Paradyn / Dyninst Week Madison, Wisconsin April 12-14, 2010 Paradyn Project Safe and Efficient Instrumentation Andrew Bernat.
Software Engineering Algorithms, Compilers, & Lifecycle.
Reverse Engineering Contemporary Countermeasures By: Joshua Schwartz.
Polymorphic Virus Analysis Nicolas BRULEZ Senior Virus Researcher Websense Security Labs IMPROVISED TALK MMMKAY?!
Section 5: Procedures & Stacks
Shellcode COSC 480 Presentation Alison Buben.
Return Oriented Programming
Static and dynamic analysis of binaries
Harvesting Runtime Values in Android Applications That Feature Anti-Analysis Techniques Presented by Vikraman Mohan.
Techniques, Tools, and Research Issues
Un</br>able’s MySecretSecrets
Attacking Obfuscated Code with IDA Pro
Computer Architecture and Assembly Language
Ramblr Making Reassembly Great Again
C Prog. To Object Code text text binary binary Code in files p1.c p2.c
Understanding Program Address Space
Week 2: Buffer Overflow Part 2.
CMSC 491/691 Malware Analysis
Instruction Set Summary
Computer Architecture and System Programming Laboratory
Return-to-libc Attacks
Presentation transcript:

RIVERSIDE RESEARCH INSTITUTE Deobfuscator: An Automated Approach to the Identification and Removal of Code Obfuscation Eric Laspe, Reverse Engineer Jason Raber, Lead Reverse Engineer

RECON2008 Overview The Problem: Obfuscation Malware Example: RustockB The Solution: Deobfuscator Demonstration RustockB: Before & After Sample Source Code Summary

RECON2008 Overview The Problem: Obfuscation Malware Example: RustockB The Solution: Deobfuscator Demonstration RustockB: Before & After Sample Source Code Summary

RECON2008 The Problem: Obfuscated Code Malware authors use code obfuscation techniques to hide their malicious code Obfuscation costs reverse engineers time: –Complicates instruction sequences –Disrupts control flow –Makes algorithms difficult to understand Manual obfuscation removal is a tedious and error-prone process

RECON2008 Example: PUSH_POP_MATH PUSH an immediate, then POP into a register and do some math on it Obfuscated code: Resolves to: Emulate Result NOP Unnecessary Instructions Math on EDX POP it into EDX PUSH a value

RECON2008 Overview The Problem: Obfuscation Malware Example: RustockB The Solution: Deobfuscator Demonstration RustockB: Before & After Sample Source Code Summary

RECON2008 Malware Example: RustockB Good malware example that implemented obfuscation patterns to hide a decryption routine Many useless and confusing instructions –Push regs, math, pop regs –Pushes and pops in various obfuscated forms Control flow obscured –Mangled jumps –Unnecessary data cross-references

RECON2008 RustockB Control Flow

RECON2008 RustockB Control Flow Obfuscated Jump Obfuscated Pop Unref’d Instruction Obfuscated Jump Obfuscated Push Obfuscated Jump

RECON2008 Overview The Problem: Obfuscation Malware Example: RustockB The Solution: Deobfuscator Demonstration RustockB: Before & After Sample Source Code Summary

RECON2008 The Solution: The Deobfuscator IDA Pro Plug-in Combines instruction emulation and pattern recognition Determines proper code control flow Interprets and transforms instruction sequences to enhance code readability Uses a binary injector to make both static and dynamic analysis easier

RECON2008 Modes of Operation The plug-in has six modes: –Anti-disassembly – replaces anti-disassembly with simplified code –Passive – simple peep-hole rules –Aggressive – uses aggressive assumptions about memory contents –Ultra – more aggressive assumptions –Remove NOPs – jumps over slack space –Collapse – moves consecutive code blocks together to eliminate NOPs and JMPs N E W !

RECON2008 IDA Pro Integration Deobfuscator plug-in invoked with Alt-Z Uses structures created by IDA Pro disassembly and analysis Depending on the mode selected, it can: –Follow jumps and calls –Track registers and emulate the stack

RECON2008 Overview The Problem: Obfuscation Malware Example: RustockB The Solution: Deobfuscator Demonstration RustockB: Before & After Sample Source Code Summary

RECON2008 Demonstration Demo code protected with anti-disassembly code and obfuscation Note the obfuscated jump at the end of this graph Run iteratively, the Deobfuscator will remove obfuscation and improve code flow readability

RECON2008 Run 1 – Anti-Disassembly Two matching patterns –JZ_JMP –CALL_MATH

RECON2008 Pattern: JZ_JMP Before Deobfuscation: After Deobfuscation: Useless Jumps NOP’d Jumps Two useless jumps

RECON2008 Pattern: CALL_MATH Before Deobfuscation: After Deobfuscation: EDX gets the return address of the CALL $5 Then, there is some math on EDX EDX = NOP’d Pop & Math Emulated Result

RECON2008 Output Injection A text file is generated by the Deobfuscator plug-in Then, we inject the binary with a PERL script Or just modify the IDA Pro database N E W !

RECON2008 Reload Now, we see the obfuscated code begin to disappear The Deobfuscator replaces obfuscation patterns and injects NOPs over useless code to create slack space

RECON2008 Slack Space Slack space is useful for patterns that need additional bytes to create a simplified instruction Example: Obfuscated Code PUSHEAX NOPNOPNOPNOP POPEBX Transformed Code 2 MOVEBX, IMMED NOP Transformed Code 1 MOVEBX, EAX NOPNOPNOPNOP * Code that was removed by an earlier run of the Deobfuscator * Needs two bytes Needs five bytes

RECON2008 Run 2 – Passive, Aggressive, & Ultra Three matching patterns –MOV_MATH –MATH_MOV_OR_POP

RECON2008 Pattern: MOV_MATH Before Deobfuscation: After Deobfuscation: Move into EAX EAX Math Move an immediate into EAX and XOR it with another known register value Emulated Result NOP’d Math

RECON2008 Pattern: MATH_MOV_OR_POP Do math on EDX, then MOV an immediate or POP from the stack into EDX before using it again Before Deobfuscation: After Deobfuscation: NOP’d Math EDX Math

RECON2008 Finishing Up The Deobfuscator has finished matching obfuscation patterns Slack space is no longer needed, so we run one of the clean-up modes to simplify the appearance of the control flow “NOP Remove” injects JMPs to remove NOPs from control flow “Collapse” mode moves code to slack space to eliminate NOPs and JMPs

RECON2008 NOP Remove Before:After:

RECON2008 Overview The Problem: Obfuscation Malware Example: RustockB The Solution: Deobfuscator Demonstration RustockB: Before & After Sample Source Code Summary

RECON2008 RustockB: Before & After Deobfuscated!

RECON2008 RustockB Decryption Pseudo-code for (i = 7; i > 0; i--) { Address = 0x00401B82// Starting address of encrypted region Key1 = 0x4DFEE1C0// Decryption key 1 Key2 = 0x0869ECC5// Decryption key 2 Key3 = 0// Decryption key 3 Key4 = 0// Decryption key 4 (Accumulator) for (j = 0x44DC; j > 0; j--, Address += 4)// 0x44DC = size of encrypted region { for (k = 2; k > 0; k--) { Key4 = k * 4 XOR Key4, 0x5E57B7DE XOR Key4, Key3 Key4 += Key2 XOR Key1, k [Address] -= Key4 Key3 += Key1 } for (i = 0x44DC, Address = 0x00401B82, Sum = 0; i > 0; i--, Address += 4) Sum += [Address]// Add up the encrypted region (a DWORD at a time) in EAX for (i = 0x44DC, Address = 0x00401B82; i > 0; i--, Address += 4) XOR [Address], Sum// XOR each DWORD of the encrypted region with the sum in EAX

RECON2008 Overview The Problem: Obfuscation Malware Example: RustockB The Solution: Deobfuscator Demonstration RustockB: Before & After Sample Source Code Summary

RECON2008 // // CALL NULL - A function call that just returns // int CALL_NULL(insn_t call, FILE *outfile, int *instr_offset) { if (call.itype == NN_call && call.Operands[0].type == o_near) { if (!get_next_instruction(call.Operands[0].addr)) return 0; insn_t ret = cmd; // Function that just returns if (ret.itype == NN_retn) { *instr_offset = call.size; msg("\n%a CALL_NULL\n", call.ea); // NOP the call fprintf(outfile, "%X \n", get_fileregion_offset(call.ea)); // NOP the return fprintf(outfile, "%X 1 90\n", get_fileregion_offset(ret.ea)); return 1; } return 0; } Sample Source Code The Simple Solution: A Simple Problem:

RECON2008 Overview The Problem: Obfuscation Malware Example: RustockB The Solution: Deobfuscator Demonstration RustockB: Before & After Sample Source Code Summary

RECON2008 Summary Most malware authors that wish to protect their IP use obfuscation techniques The Deobfuscator detects and simplifies many of these obfuscation and anti- disassembly patterns Over time, the repository of patterns will be developed to characterize most generic cases of obfuscation

RECON2008 Future Development Iterative patching of IDA database

RECON2008 Future Development Iterative patching of IDA database Code collapsing

RECON2008 Future Development Iterative patching of IDA database Code collapsing Grammar Black-box control flow

RECON2008 Contact For more information on this and other tools, contact: Eric Laspe, Reverse Engineer Visit us online: