By David Brumley, James Newsome, Dawn Song and Hao Wang and Somesh Jha.

Slides:



Advertisements
Similar presentations
Data-Flow Analysis II CS 671 March 13, CS 671 – Spring Data-Flow Analysis Gather conservative, approximate information about what a program.
Advertisements

Compilation 2011 Static Analysis Johnni Winther Michael I. Schwartzbach Aarhus University.
Control Flow Analysis (Chapter 7) Mooly Sagiv (with Contributions by Hanne Riis Nielson)
The Assembly Language Level
Programming Languages Marjan Sirjani 2 2. Language Design Issues Design to Run efficiently : early languages Easy to write correctly : new languages.
The Big Picture Chapter 3. We want to examine a given computational problem and see how difficult it is. Then we need to compare problems Problems appear.
SMU SRG reading by Tey Chee Meng: Automatic Patch-Based Exploit Generation is Possible: Techniques and Implications by David Brumley, Pongsin Poosankam,
David Brumley, Pongsin Poosankam, Dawn Song and Jiang Zheng Presented by Nimrod Partush.
Bouncer securing software by blocking bad input Miguel Castro Manuel Costa, Lidong Zhou, Lintao Zhang, and Marcus Peinado Microsoft Research.
FIT FIT1002 Computer Programming Unit 19 Testing and Debugging.
Breno de MedeirosFlorida State University Fall 2005 Buffer overflow and stack smashing attacks Principles of application software security.
Vertically Integrated Analysis and Transformation for Embedded Software John Regehr University of Utah.
Cpeg421-08S/final-review1 Course Review Tom St. John.
Validating High-Level Synthesis Sudipta Kundu, Sorin Lerner, Rajesh Gupta Department of Computer Science and Engineering, University of California, San.
Overview of program analysis Mooly Sagiv html://
CSE S. Tanimoto Syntax and Types 1 Representation, Syntax, Paradigms, Types Representation Formal Syntax Paradigms Data Types Type Inference.
On Deriving Unknown Vulnerabilities from Zero-Day Polymorphic Worm Exploits.
Describing Syntax and Semantics
Program Analysis Mooly Sagiv Tel Aviv University Sunday Scrieber 8 Monday Schrieber.
Composing Dataflow Analyses and Transformations Sorin Lerner (University of Washington) David Grove (IBM T.J. Watson) Craig Chambers (University of Washington)
Dynamic Test Generation To Find Integer Bugs in x86 Binary Linux Programs David Molnar Xue Cong Li David Wagner.
Vulnerability-Specific Execution Filtering (VSEF) for Exploit Prevention on Commodity Software Authors: James Newsome, James Newsome, David Brumley, David.
High level & Low level language High level programming languages are more structured, are closer to spoken language and are more intuitive than low level.
Software (Program) Analysis. Automated Static Analysis Static analyzers are software tools for source text processing They parse the program text and.
Stamping out worms and other Internet pests Miguel Castro Microsoft Research.
Computer Security and Penetration Testing
Automatic Diagnosis and Response to Memory Corruption Vulnerabilities Authors: Jun Xu, Peng Ning, Chongkyung Kil, Yan Zhai, Chris Bookholt In ACM CCS’05.
Compiler Construction Lexical Analysis. The word lexical means textual or verbal or literal. The lexical analysis implemented in the “SCANNER” module.
Vulnerability-Specific Execution Filtering (VSEF) for Exploit Prevention on Commodity Software Authors: James Newsome, James Newsome, David Brumley, David.
CSC-682 Cryptography & Computer Security Sound and Precise Analysis of Web Applications for Injection Vulnerabilities Pompi Rotaru Based on an article.
Packet Vaccine: Black-box Exploit Detection and Signature Generation
Algorithms and Algorithm Analysis The “fun” stuff.
COMPUTER SECURITY MIDTERM REVIEW CS161 University of California BerkeleyApril 4, 2012.
Jose Sanchez 1 o Tielei Wang†, TaoWei†, Zhiqiang Lin‡, Wei Zou†. o Purdue University & Peking University o Proceedings of NDSS'09: Network and Distributed.
Automatic Diagnosis and Response to Memory Corruption Vulnerabilities Presenter: Jianyong Dai Jun Xu, Peng Ning, Chongkyung Kil, Yan Zhai, Chris Bookhot.
IEEE Communications Surveys & Tutorials 1st Quarter 2008.
Unit-1 Introduction Prepared by: Prof. Harish I Rathod
1 Limits of Learning-based Signature Generation with Adversaries Shobha Venkataraman, Carnegie Mellon University Avrim Blum, Carnegie Mellon University.
1.  10% Assignments/ class participation  10% Pop Quizzes  05% Attendance  25% Mid Term  50% Final Term 2.
Introduction to Problem Solving. Steps in Programming A Very Simplified Picture –Problem Definition & Analysis – High Level Strategy for a solution –Arriving.
Stamping out worms and other Internet pests Miguel Castro Microsoft Research.
Dealing with Malware By: Brandon Payne Image source: TechTips.com.
Buffer Overflow Attack Proofing of Code Binary Gopal Gupta, Parag Doshi, R. Reghuramalingam, Doug Harris The University of Texas at Dallas.
Java Basics Hussein Suleman March 2007 UCT Department of Computer Science Computer Science 1015F.
Compiler Introduction 1 Kavita Patel. Outlines 2  1.1 What Do Compilers Do?  1.2 The Structure of a Compiler  1.3 Compilation Process  1.4 Phases.
Polygraph: Automatically Generating Signatures for Polymorphic Worms Presented by: Devendra Salvi Paper by : James Newsome, Brad Karp, Dawn Song.
 Static Analysis ◦ Towards Automatic Signature Generation of Vulnerability-based Signature  Dynamic Analysis ◦ Unleashing Mayhem on Binary Code  Automatic.
Dynamic Taint Analysis for Automatic Detection, Analysis, and Signature Generation of Exploits on Commodity Software Paper by: James Newsome and Dawn Song.
Error Explanation with Distance Metrics Authors: Alex Groce, Sagar Chaki, Daniel Kroening, and Ofer Strichman International Journal on Software Tools for.
Network-based and Attack-resilient Length Signature Generation for Zero-day Polymorphic Worms Zhichun Li 1, Lanjia Wang 2, Yan Chen 1 and Judy Fu 3 1 Lab.
Compilers Computer Symbol Table Output Scanner (lexical analysis)
Polygraph: Automatically Generating Signatures for Polymorphic Worms Authors: James Newsome (CMU), Brad Karp (Intel Research), Dawn Song (CMU) Presenter:
Onlinedeeneislam.blogspot.com1 Design and Analysis of Algorithms Slide # 1 Download From
1 Test Coverage Coverage can be based on: –source code –object code –model –control flow graph –(extended) finite state machines –data flow graph –requirements.
MOPS: an Infrastructure for Examining Security Properties of Software Authors Hao Chen and David Wagner Appears in ACM Conference on Computer and Communications.
Overview of Compilation Prepared by Manuel E. Bermúdez, Ph.D. Associate Professor University of Florida Programming Language Principles Lecture 2.
Beyond Stack Smashing: Recent Advances In Exploiting Buffer Overruns Jonathan Pincus and Brandon Baker Microsoft Researchers IEEE Security and.
@Yuan Xue Worm Attack Yuan Xue Fall 2012.
Maitrayee Mukerji. INPUT MEMORY PROCESS OUTPUT DATA INFO.
Advanced Computer Systems
Chapter 1 Introduction.
POLYGRAPH: Automatically Generating Signatures for Polymorphic Worms
Representation, Syntax, Paradigms, Types
Chapter 1 Introduction.
Secure Software Development: Theory and Practice
High Coverage Detection of Input-Related Security Faults
Representation, Syntax, Paradigms, Types
Representation, Syntax, Paradigms, Types
Representation, Syntax, Paradigms, Types
CSC-682 Advanced Computer Security
Presentation transcript:

By David Brumley, James Newsome, Dawn Song and Hao Wang and Somesh Jha

Presenter: Xin Zhao

 Definition ◦ Vulnerability - A vulnerability is a type of bug that can be used by an attacker to alter the intended operation of the software in a malicious way. ◦ Exploit - An exploit is an actual input that triggers a software vulnerability, typically with malicious intent and devastating consequences

 Zero-day attacks that exploit unknown vulnerabilities represent a serious threat  No patch or signature available  Symantec:20 unknown vulnerabilities exploited 07/2005 – 06/2007  Current practice is new vulnerability analysis and protection generation is mostly manual  Our goal: automate the process of protection generation for unknown vulnerabilities

 Beware the lion ◦ New year 2001 ◦ 10,000 systems affected ◦ invades Linux systems through a network exploit ◦ infiltrates BIND DNS through TCP or UDP Protocol ◦ allows infiltration through a legit request, but then can execute arbitrary commands through additional string of characters ◦ incident report March 30 by CERT

 Software Patch: patch the binary of vulnerable application  Input Filter: a network firewall or a module on the I/O path  Data Patch: patch the data input instead of binary  Signature: signature-based input filtering Data Input Input Filter Vulnerable application Dropped

 Automatic signature generation  Reason: ◦ Manual signature generation is slow and error ◦ Fast generation is important – previously unknown or unpatched vulnerabilities can be exploited orders of magnitude faster than a human can respond ◦ More accurate

 There are usually several different polymorphic exploit variants that can trigger a software vulnerability  Exploit variants may differ syntactically but be semantically equivalent  To be effective -- the signature should be constructed based on the property of the vulnerability, instead of an exploit

 Require manual steps  Employ heuristics which may fail in many settings  Techniques rely on specific properties of an exploit – return addresses  Be limited by underlying signature representation they can generate  Only work for specific vulnerabilities in specific circumstances

 At a high level, our main contribution is a new class of signature, that is not specific to details such as whether an exploit successfully hijacks control of the program, but instead whether executing an input will (potentially) result in an unsafe execution state.

 vulnerability signature ◦ whether executing an input potentially results in an unsafe program state  T(P, x) ◦ the execution trace obtained by executing a program P on input x  Vulnerability condition ◦ representation (how to express a vulnerability as a signature) ◦ coverage (measured by false positive rate)

 vulnerability signature ◦ representation for set of inputs that define a specified vulnerability condition  trade-offs ◦ representation: matching accuracy vs. efficiency ◦ signature creation: creation time vs. coverage  {P,T,x,c} ◦ binary program (P), instruction trace (T), exploit string (x), vulnerability condition (c)

 (P,c) = (,c)  T(P,x) is the execution trace of running P with input x means T satisfies vulnerability condition c  L P,c consists of the set of all inputs x to a program P such that  Formally:  An exploit for a vulnerability (P,c) is an input

 P given in box  x = g/AAAA  T={1,2,3,4,6,7, 8,9,8,10,11,10, 11,10,11,10, 11,10,11}  c = heap overflow (on 5th iteration of line 11)

 A vulnerability signature is a matching function MATCH which for an input x returns either EXPLOIT or BENIGN for a program P without running the program  A perfect vulnerability signature satisfies  Completeness:  Soundness:

 C: Ґ×D×M×K×I ->{BENIGN, EXPLOIT}  Ґ is a memory  D is the set of variables defined  M is the program’s map from memory to values  K is the continuation stack  I is the next instruction to execute

 Turing machine signatures ◦ precise (no false positive or negatives) ◦ may not terminate (in presence of loops, e.g.)  symbolic constraint signatures ◦ approximates looping, aliasing ◦ guaranteed to terminate  regular expression signatures ◦ approximates elementary constructs (counting) ◦ very efficient

 Can provide a precise, even exact, characterization of the vulnerability condition in a particular program  A TM that exactly emulates the program has no error rate

 says that for 10-char input, the first char is ‘g’ or ‘G’, up to four of the next chars may be spaces and at least 5 chars are non-spaces

 says ‘g’ or ‘G’ followed by 0 or more spaces and at least 5 non-spaces  E.g: [g|G][ ]*[ˆ ]{5,}

 TM - inlining vulnerability condition takes poly time  Symb. Constraint - poly-time transformations on TM  Regexp - solve constraint (exp time; PSPACE- complete)  or data-flow on TM (poly time)

 MEP is a straight-line program -- e.g. the path that the exploit took to reach the vulnerability  PEP includes different paths to the vulnerability  a complete PEP coverage signature accepts all inputs in L P,c  complete coverage through a chop of the program includes all paths from the input read (v init ) to the vulnerability point (v final )

Presenter: Xitao Wen

 Input: ◦ Vulnerable program P ◦ Vul condition c ◦ Sample exploit x ◦ Instruction trace T  Output: ◦ TM sig ◦ Symbolic constraint sig ◦ RegEx sig

PPre-process ◦D◦Disassemble binary ◦C◦Convert to an intermediate representation (IR) CChop P w.r.t. trace T ◦A◦A chop is a partial program P’ that starts at T 0 and ends at exploit point ◦C◦Call-graph level CCompute the sig ◦G◦Get TM sig ◦T◦TM -> Symbolic constraint ◦S◦Symbolic constraint -> RegEx

 Chopping reduces the size of program to be analyzed  Performed on call- graph level  No function pointer support yet

 Replace outgoing JMP with RET BENIGN

 Statically estimate effects of memory updates and loops  Memory updates: SSA analysis  Loops: static unrolling

 Solution 1: Solve constraint system S and or- ing together all members  Solution 2: Data-flow analysis optimization

 9000 lines C++ code ◦ CBMC model checker to build/solve symbolic constraints, generate RegEx’s ◦ disassembler based on Kruegel; IR new  ATPhttpd ◦ various vulnerabilities; sprintf-style string too long ◦ 10 distinct subpaths to RegEx in sec  BIND ◦ stack overflow vulnerability; TSIG vulnerability ◦ 10 distinct graphs in symbolic constraint ◦ 30ms for chopping ◦ 88% of functions were reachable between entry and vulnerability