Hybrid Analysis and Control of Malware (RAID 2010). Barton P. Miller, Kevin A. Roundy

Presentation transcript:

Hybrid Analysis and Control of Malware
Barton P. Miller, Kevin A. Roundy
Computer Science Department
RAID 2010 (slide footer: Hybrid Analysis of Program Binaries)

Need for forensic analysis
• Malware attacks cost billions of dollars annually [1]
• 65% of users feel the effect of cyber crime [2]
• 28 days to resolve an average cybercrime [2]
• 90% of malware resists analysis [3]
Our approach
• analyze code before executing it
• CFG-based interface for instrumentation
• bring malware under analyst's control
[Figure: hex dump of a malware binary]
[1] Computer Economics [2] Norton [3] McAfee. 2008

Malware analysis factory
[Figure: a malware binary (hex dump) is fed to SD-Dyninst, which applies code coverage instrumentation and network call instrumentation]
Outputs
• Stack trace at 1st network communication
• Control flow graph showing code coverage
• Trace of Win API calls
• Defensive tactics report
  - unpacked code
  - overwritten code
  - control flow obfuscations

Storm worm: obfuscated control flow
[Figure: control flow graph starting at the entry point. Instruction labels: CALL, JMP, POP, INC, PUSH, RET (addresses 40d002, 40d00a, 40d00e; register ebp); an unresolved CALL ptr[eax]; XOR eax,eax followed by MOV ecx,*[eax], which leads to an exception handler with an unresolved target]
Defensive tactics: obfuscated control flow, handler-based ctrl flow, unpacked code, overwritten code

Storm worm: unpacked code
[Figure: hex dump of code reached from the entry point]
Defensive tactics: obfuscated control flow, handler-based ctrl flow, unpacked code, overwritten code

Upack packer: overwritten code
[Figure: hex dump of code reached from the entry point]
Defensive tactics: obfuscated control flow, handler-based ctrl flow, unpacked code, overwritten code

Factory results for Conficker A
[Figure: control flow graph with regions labeled "initial bootstrap code" and "packed payload"]

Factory results for Conficker A
[Figure: control flow graph. Legend: API func, non-executed block, static block, unpacked block]

Factory results for Conficker A
Instrument select and perform a stack-walk
Stack-walk of Conficker's communications thread:
Frame pc=0x7c func: DbgBreakPoint at 7x901230 [Win DLL]
Frame pc=0x10003c83 func: DYNbreakPoint at 0x100003c70 [instrument.]
Frame pc=0x100016f7 func: DYNstopThread at 0x [instrument.]
Frame pc=0x71ab2dc0 func: select at 0x71ab2dc0 [Win DLL]
Frame pc=0x401f34 func: nosym1f058 at 0x41f058 [Conficker]

Outline
• Related work (R.W.)
• Hybrid analysis algorithm (H.A.)
• Parsing (Par.)
• Dynamic analysis components (D.A.)
• Results (Res.)

Related work: non-defensive binary analysis
[Figure: a program binary (static code) is analyzed pre-execution by a static tool that produces a CFG; a dynamic instrumenter then acts on the process during un-controlled execution]
• Static tools: parsing, value-set analysis, binary slicing
  e.g., Dyninst, CodeSurfer-x86
• Instrumenters: CFG-based API for instrumentation
  e.g., ATOM, Vulcan (static), Dyninst (dynamic)

Related work: non-defensive binary analysis
[Figure: the same pipeline applied to an analysis-resistant binary that contains obfuscated code, static code, and dynamic code]
• Static tools: parsing, value-set analysis, binary slicing
  e.g., Dyninst, CodeSurfer-x86
• Instrumenters: CFG-based API for instrumentation
  e.g., ATOM, Vulcan (static), Dyninst (dynamic)

Related work: non-defensive binary analysis
[Figure: an analysis-resistant binary (obfuscated, static, and dynamic code) runs under a dynamic instrumenter during un-controlled execution; the resulting trace goes to post-execution trace analysis, which produces a CFG]
• Dynamic instrumenters: instruction-filter based API for instrumentation
  e.g.: PIN, Valgrind, DynamoRIO, DIOTA
• Post-execution trace analysis
  e.g.: Madou et al; Quist, Liebrock. 2009

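Our approach
[Figure: SD-Dyninst combines a parser and a dynamic instrumenter. The parser builds a CFG of the analysis-resistant binary (obfuscated and static code) pre-execution; while the process runs, the dynamic instrumenter passes (source, dest) pairs back to the parser so the CFG also covers dynamic code. The "un-controlled execution" label from the earlier slides appears here as well]
• CFG-based API for instrumentation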


Code discovery algorithm
Hybrid algorithm:
1. Parse from known entry points
2. Instrument control flow that may lead to new code
3. Resume execution
[Figure: CFG with unresolved targets marked "?". Labels: instrument, exception, overwrite; example instructions CALL ptr[eax] and DIV eax, 0]
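The three steps above can be sketched as a loop over a toy program. This is only an illustration under stated assumptions, not SD-Dyninst code: the instruction encoding, `PROGRAM`, `RUNTIME_REGS`, `parse`, and `hybrid_discover` are all hypothetical names, and only the "instrument" trigger (an indirect jump) is modeled, not exceptions or overwrites.

```python
# Toy program (hypothetical): address -> (kind, operand).
#   "op"   falls through to addr + 1
#   "jmp"  direct jump to a known address
#   "ijmp" indirect jump; target is only known at run time
#   "halt" stops execution
PROGRAM = {
    0: ("op", None),
    1: ("ijmp", "r0"),   # statically unresolvable
    10: ("op", None),
    11: ("jmp", 20),
    20: ("ijmp", "r1"),  # statically unresolvable
    30: ("halt", None),
}
RUNTIME_REGS = {"r0": 10, "r1": 30}  # values visible only during execution


def parse(entry, known):
    """Step 1: static control-flow traversal from an entry point.

    Adds discovered addresses to `known` and returns the set of
    indirect transfers that need instrumentation.
    """
    unresolved, work = set(), [entry]
    while work:
        addr = work.pop()
        if addr in known or addr not in PROGRAM:
            continue
        known.add(addr)
        kind, arg = PROGRAM[addr]
        if kind == "op":
            work.append(addr + 1)
        elif kind == "jmp":
            work.append(arg)
        elif kind == "ijmp":
            unresolved.add(addr)
    return unresolved


def hybrid_discover(entry):
    """Alternate between parsing and (simulated) execution."""
    known = set()
    instrumented = parse(entry, known)         # steps 1 and 2
    pc, reparse_rounds = entry, 0
    while True:                                # step 3: resume execution
        kind, arg = PROGRAM[pc]
        if kind == "halt":
            break
        if kind == "op":
            pc += 1
        elif kind == "jmp":
            pc = arg
        else:
            # Instrumentation fires here: the target is now observable.
            assert pc in instrumented
            target = RUNTIME_REGS[arg]
            if target not in known:
                instrumented |= parse(target, known)
                reparse_rounds += 1
            pc = target
    return known, reparse_rounds
```

Calling `hybrid_discover(0)` on this toy program parses two instructions up front and then extends the known code twice, once per indirect jump that reaches code the parser had not seen.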



Accurate parsing
• Standard control-flow traversal [1]
  - start from known entry points
  - follow control flow to find code
• New conservative assumption
  - un-analyzed calls (pointer-based) may not return
• New stack tamper detection
  - backwards slice at return instruction
[Figure: call 40d00a followed by garbage bytes; instruction sequence pop ebp, inc ebp, push ebp, ret]
[1] Sites et al., Binary Translation
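The stack tamper check can be illustrated on the pop/inc/push/ret sequence from the slide. The sketch below is an assumption-laden simplification: it evaluates the callee forward over a symbolic stack instead of computing a backwards slice from the return instruction, it understands only four instruction kinds, and `return_adjustment` and `safe_to_parse_fallthrough` are hypothetical names.

```python
def return_adjustment(body):
    """Return k if the callee returns to (return address + k), else None.

    `body` is a list of tuples such as ("pop", "ebp") or ("ret",).
    Values derived from the return address are tracked as ("RA", offset);
    anything else is treated as unknown (None).
    """
    stack = [("RA", 0)]  # at entry, the return address is on top of the stack
    regs = {}
    for ins in body:
        op = ins[0]
        if op == "pop":
            if not stack:
                return None
            regs[ins[1]] = stack.pop()
        elif op == "push":
            stack.append(regs.get(ins[1]))
        elif op == "inc":
            val = regs.get(ins[1])
            regs[ins[1]] = (val[0], val[1] + 1) if val else None
        elif op == "ret":
            top = stack[-1] if stack else None
            return top[1] if top else None
        else:
            return None  # unmodeled instruction: give up conservatively
    return None


def safe_to_parse_fallthrough(body):
    """Parse the bytes after a call only if the callee returns there."""
    return return_adjustment(body) == 0


# Sequence from the slide: the callee bumps its own return address by one.
TAMPERING = [("pop", "ebp"), ("inc", "ebp"), ("push", "ebp"), ("ret",)]
```

For `TAMPERING` the adjustment is 1, so the byte right after the call is not an instruction start and a parser should not treat the call's fall-through address as code.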


Instrumentation-based discovery
• Invalid control transfers (call into an invalid region)
• Indirect jumps/calls (call ptr [eax], jmp eax)
• Abnormal return instructions (push eax; ret)

Instrumentation-based discovery
[Figure: the process reaches call ptr[eax], whose target is unknown. Instrumentation calls findTarget(ptr[eax]) in SD-Dyninst, which obtains the new target 0x402d8a; the process then resumes execution]

Overwritten code discovery: overwrite detection
Possible strategies
• Check each executed instruction for changes [1]
• Monitor writes to code
Page-level write detection [2]
• Remove write permissions from code pages
• Write to code causes exception
• Handle exception
[Figure: SD-Dyninst changes code page permissions from RWE to R E; a write invokes the code write handler]
[1] Royal et al. PolyUnpack. ACSAC '06 [2] Maebe, De Bosschere. AADEBUG '03

Overwritten code discovery: when to update
Cases to consider
• large incremental overwrites
• writes to data
• writes to own page
Delaying the update
• until write routine terminates
[Figure: a write to an R E code page invokes SD-Dyninst's code write handler, followed by the CFG update routine]

Overwritten code discovery: delayed updates
Two components
1. Handle overwrite signal
   a) instrument write loop
   b) copy overwritten page
   c) restore write permissions
2. Update CFG when writes end
   a) remove overwritten and unreachable blocks
   b) parse at entry points to overwritten regions
   c) remove write permissions
[Figure: a write to an R E code page invokes SD-Dyninst's code write handler, which sets the page to RWE; when the writes end, the CFG update routine sets it back to R E. Points in the monitored code are marked "cb"]
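The two components of the delayed update can be sketched with a small simulation. Everything in it is hypothetical: real page protection goes through the operating system, whereas this `Monitor` class keeps a per-page flag; the page size is tiny for readability; and the end of the write loop is signaled by an explicit `finish_writes` call rather than by instrumentation. CFG maintenance itself is left out: the sketch only reports which bytes changed.

```python
PAGE = 16  # toy page size; real x86 pages are 4096 bytes


class Monitor:
    """Page-level overwrite detection with a delayed update (simulation)."""

    def __init__(self, memory):
        self.mem = bytearray(memory)
        n_pages = (len(self.mem) + PAGE - 1) // PAGE
        self.writable = [False] * n_pages  # code pages start as R E
        self.snapshots = {}                # page index -> copy of old bytes
        self.faults = 0

    def write(self, addr, value):
        page = addr // PAGE
        if not self.writable[page]:
            self._handle_overwrite_signal(page)
        self.mem[addr] = value

    def _handle_overwrite_signal(self, page):
        # Component 1: copy the page, then restore write permissions so
        # the rest of the write loop runs without further faults.
        self.faults += 1
        start = page * PAGE
        self.snapshots[page] = bytes(self.mem[start:start + PAGE])
        self.writable[page] = True

    def finish_writes(self):
        # Component 2: once the writes end, diff against the copies to
        # find overwritten bytes, then remove write permissions again.
        changed = []
        for page, old in sorted(self.snapshots.items()):
            start = page * PAGE
            for i, old_byte in enumerate(old):
                if self.mem[start + i] != old_byte:
                    changed.append(start + i)
            self.writable[page] = False
        self.snapshots.clear()
        return changed
```

A loop that overwrites four bytes of one page costs a single fault here instead of four, which is the point of delaying the update; the addresses returned by `finish_writes` are where a parser would remove stale blocks and re-parse.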

Handler-based CF obfuscations [1]
[Figure: the monitored program executes
   xor eax,eax
   mov ecx,*[eax]
   push eax
   ...
and raises an access violation. The operating system passes the exception state (eip) to the program's access violation handler:
   …
   mov *[ebp+10],eax
   mov 402d8a,edx
   mov edx,*[eax+b8]
The eip in the exception state becomes 402d8a]
[1] Popov, Debray, Andrews. Usenix Danekhar

Resolving handler-based CF
[Figure: same scenario as the previous slide. The monitored program (xor eax,eax; mov ecx,*[eax]; push eax; ...) raises an access violation, and its handler (… mov *[ebp+10],eax; mov 402d8a,edx; mov edx,*[eax+b8]) changes the eip in the exception state to 402d8a]
SD-Dyninst
• instrument exit of the access violation handler
• analyze code at new target
[1] Popov, Debray, Andrews. Usenix Danekhar
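The idea of checking the exception state when the handler exits can be sketched as follows. This is a simulation under assumptions, not the SD-Dyninst mechanism: the exception state is a plain dictionary with an "eip" entry, `dispatch_exception` stands in for the operating system plus the exit instrumentation, and the faulting address 0x401000 is made up. Only the redirect target 0x402d8a comes from the slide.

```python
def dispatch_exception(handler, context, parse_queue):
    """Run a program-supplied handler and inspect the saved eip at its exit.

    If the handler changed the saved instruction pointer, execution will
    resume somewhere the static parser may not have seen, so that address
    is queued for parsing.
    """
    faulting_eip = context["eip"]
    handler(context)               # the handler may rewrite the exception state
    resume_eip = context["eip"]    # observed by the exit instrumentation
    if resume_eip != faulting_eip:
        parse_queue.append(resume_eip)
    return resume_eip


def redirecting_handler(context):
    # Mirrors the slide: the handler stores 402d8a into the saved eip.
    context["eip"] = 0x402D8A


def benign_handler(context):
    # Leaves the exception state alone; execution retries the same address.
    pass
```

With `redirecting_handler`, the queue receives 0x402d8a and parsing can continue from there; with `benign_handler`, nothing new is queued.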


Fully analyzed packed programs

Packer             Malware market share [1]
UPX                9.45%
PolyEnE            6.21%
EXECryptor         4.06%
Themida            2.95%
PECompact          2.59%
Upack              2.08%
nPack              1.74%
Aspack             1.29%
FSG                1.26%
Nspack             0.89%
Asprotect          0.43%
Armadillo          0.37%
Yoda's Protector   0.33%
WinUPack           0.17%
MEW                0.13%

[Table columns whose per-packer "yes" entries did not survive in the transcript: Self-checksumming, Self-modifying, Exception-based ctrl, Obfuscated]
[1] Packer (r)evolution. Panda Research, Two-month average Feb-March 2008.

Fully analyzed packed programs
[Table: the 15 packers and malware market shares [1] from the previous slide (MEW 0.13% through UPX 9.45%), with added columns "SD-Dyninst" (yes entries) and "Time to unpack"; the per-packer entries did not survive in the transcript]
Notes on the figure:
• Self-checksumming techniques
• Time to unpack: uninstrumented times are about .02 secs
• unoptimized overwrite detection
• expensive overwrite detection
[1] Packer (r)evolution. Panda Research, Two-month average Feb-March

Instrumentation costs
[Table, partly lost in extraction]
Columns: Packer | Pre-payload execution time (SD-Dyninst, Renovo, Saffron Intel-PIN, Ether Unpack) | Instrumented locations (SD-Dyninst, Renovo, Saffron Intel-PIN)
Surviving row fragments:
UPX        ,2784,526
Aspack     4.45 fail ,0454,141
FSG        ,82231,854
WinUpack   ,82632,945
MEW        4.06 fail ,18635,466

Conclusion
• Analysis before execution allows for
  - Understanding & control of code before execution
  - Selective monitoring
  - Build-your-own analysis factory
• Ongoing work
  - Handling self-checksumming code
  - Releasing Dyninst w/ SD-Dyninst inside