Edward J. Schwartz, Thanassis Avgerinos, David Brumley

Slides:



Advertisements
Similar presentations
All You Ever Wanted to Know About Dynamic Taint Analysis & Forward Symbolic Execution (but might have been afraid to ask) Edward J. Schwartz, ThanassisAvgerinos,
Advertisements

Data-Flow Analysis II CS 671 March 13, CS 671 – Spring Data-Flow Analysis Gather conservative, approximate information about what a program.
Architecture-dependent optimizations Functional units, delay slots and dependency analysis.
Compiler Optimized Dynamic Taint Analysis James Kasten Alex Crowell.
Dynamic Programming.
Linear Obfuscation to Combat Symbolic Execution Zhi Wang 1, Jiang Ming 2, Chunfu Jia 1 and Debin Gao 3 1 Nankai University 2 Pennsylvania State University.
Unleashing Mayhem on Binary Code
Precise Inter-procedural Analysis Sumit Gulwani George C. Necula using Random Interpretation presented by Kian Win Ong UC Berkeley.
Algorithms and Problem Solving-1 Algorithms and Problem Solving.
Algorithms and Problem Solving. Learn about problem solving skills Explore the algorithmic approach for problem solving Learn about algorithm development.
AUTOMATIC CONCOLIC TEST GENERATION WITH VIRTUAL PROTOTYPES FOR POST-SILICON VALIDATION Reviewer: Shin-Yann Ho Instructor: Jie-Hong Jiang.
Dynamic Taint Analysis and Forward Symbolic Execution Ankush Tyagi.
Week 2 CS 361: Advanced Data Structures and Algorithms
Presented by: Tom Staley. Introduction Rising security concerns in the smartphone app community Use of private data: Passwords Financial records GPS locations.
Effective Real-time Android Application Auditing
EECS 583 – Class 21 Research Topic 3: Dynamic Taint Analysis University of Michigan December 5, 2012.
1 Hybrid-Formal Coverage Convergence Dan Benua Synopsys Verification Group January 18, 2010.
Dana Nau: Lecture slides for Automated Planning Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike License:
What's The Plan? Algorithmic Thinking Step-by-step directions for whatever someone, or the computer, needs to do © 2004 Lawrence Snyder.
Theory of Programming Languages Introduction. What is a Programming Language? John von Neumann (1940’s) –Stored program concept –CPU actions determined.
Application of Design Heuristics in the Designing and Implementation of Object Oriented Informational Systems.
Daniel Kroening and Ofer Strichman 1 Decision Procedures An Algorithmic Point of View BDDs.
1 Chapter 3 Complexity of Classical Planning. 2 Review: Classical Representation Function-free first-order language L Statement of a classical planning.
Static WCET Analysis vs. Measurement: What is the Right Way to Assess Real-Time Task Timing? Worst Case Execution Time Prediction by Static Program Analysis.
Algorithm Analysis (Big O)
Warm-up. Systems of Equations: Substitution Solving by Substitution 1)Solve one of the equations for a variable. 2)Substitute the expression from step.
Space Complexity. Reminder: P, NP classes P is the class of problems that can be solved with algorithms that runs in polynomial time NP is the class of.
1 Section 5.1 Analyzing Algorithms Let P be a problem and A an algorithm to solve P. The running time of A can be analyzed by counting the number of certain.
The NP class. NP-completeness
Theoretical analysis of time efficiency
Algorithms and Problem Solving
Software Testing.
Testing and Debugging PPT By :Dr. R. Mall.
Matching Logic An Alternative to Hoare/Floyd Logic
CS 3343: Analysis of Algorithms
Seminar in automatic tools for analyzing programs with dynamic memory
Backtracking And Branch And Bound
MobiSys 2017 Symbolic Execution of Android Framework with Applications to Vulnerability Discovery and Exploit Generation Qiang Zeng joint work with Lannan.
CSE373: Data Structures & Algorithms Lecture 7: AVL Trees
A Test Case + Mock Class Generator for Coding Against Interfaces
Taint tracking Suman Jana.
COSC160: Data Structures Linked Lists
Course Description Algorithms are: Recipes for solving problems.
State your reasons or how to keep proofs while optimizing code
Backtracking And Branch And Bound
Microsoft Visual Basic 2005 BASICS
Symbolic Implementation of the Best Transformer
CS 3343: Analysis of Algorithms
Software Testing (Lecture 11-a)
Multi-Way Search Trees
Problem Solving and Searching
Software Security Lesson Introduction
All You Ever Wanted to Know About Dynamic Taint Analysis & Forward Symbolic Execution (but might have been afraid to ask) Edward J. Schwartz, Thanassis.
What to do when you don’t know anything know nothing
Closure Representations in Higher-Order Programming Languages
Problem Solving and Searching
Chapter 11 Limitations of Algorithm Power
Recurrences (Method 4) Alexandra Stefan.
Algorithms and Problem Solving
Solving the Minimum Labeling Spanning Tree Problem
Teaching with angr: A Symbolic Execution Curriculum and CTF
CSE 373: Data Structures and Algorithms
CSC-682 Advanced Computer Security
CUTE: A Concolic Unit Testing Engine for C
Backtracking And Branch And Bound
Basic Concepts of Algorithm
CSE 373 Data Structures and Algorithms
CSE 373: Data Structures and Algorithms
Course Description Algorithms are: Recipes for solving problems.
Lab 8: GUI testing Software Testing LTAT
Presentation transcript:

Edward J. Schwartz, Thanassis Avgerinos, David Brumley All You Ever Wanted to Know About Dynamic Taint Analysis & Forward Symbolic Execution (but might have been afraid to ask) (Yes, we were trying to overflow the title length field on the submission server) Edward J. Schwartz, Thanassis Avgerinos, David Brumley 11/15/2018 Carnegie Mellon University

Edward J. Schwartz, Thanassis Avgerinos, David Brumley A Few Things You Need to Know About Dynamic Taint Analysis & Forward Symbolic Execution (but might have been afraid to ask) Edward J. Schwartz, Thanassis Avgerinos, David Brumley 11/15/2018 Carnegie Mellon University

The Root of All Evil Humans write programs This Talk: Computers Analyzing Programs Dynamically at Runtime 11/15/2018 Carnegie Mellon University

Two Essential Runtime Analyses Detect Exploits [Costa2005,Crandall2005, Newsome2005,Suh2004] Detect packing in malware [Bayer2009,Yin2007] Dynamic Taint Analysis: What values are derived from user input? Forward Symbolic Execution: What input will make execution reach this line of code? Input Filter Generation [Costa2007,Brumley2008] Automated Test Case Generation [Cadar2008,Godefroid2005,Sen2005] 11/15/2018 Carnegie Mellon University

Our Contributions 1: Turn English descriptions into an algorithm Computers Analyzing Programs Dynamically at Runtime 1: Turn English descriptions into an algorithm Operational Semantics 2: Algorithm highlights caveats, issues, and unsolved problems that are deceptively hard Dynamic Taint Analysis: Is this value affected by user input? Forward Symbolic Execution: What input will make execution reach this line of code? 11/15/2018 Carnegie Mellon University

Our Contributions (cont’d) 3: Systematize recurring themes in a wealth of previous work 11/15/2018 Carnegie Mellon University

Dynamic Taint Analysis: What values are derived from user input? How it works – example Desired properties Example issue. Paper has many more. 11/15/2018 Carnegie Mellon University

Carnegie Mellon University Δ Var Val tainted untainted x = get_input( ) y = x + 42 … goto y x 7 Input is tainted Tainted? Var τ Taint Introduction Input t = IsUntrusted(src) get_input(src)↓ t x T 11/15/2018 Carnegie Mellon University

Δ τ t1 = τ[x1] , t2 = τ[x2] BinOp x1 + x2 ↓ t1 v t2 Var Val x 7 tainted untainted x = get_input( ) y = x + 42 … goto y y 49 Data derived from user input is tainted Tainted? T Var x τ Taint Propagation BinOp t1 = τ[x1] , t2 = τ[x2] x1 + x2 ↓ t1 v t2 y T 11/15/2018 Carnegie Mellon University

Δ τ Pgoto(ta) = ¬ ta (Must be true to execute) Var Val x 7 y 49 tainted untainted x = get_input( ) y = x + 42 … goto y Policy Violation Detected Tainted? T Var x y τ Taint Checking Pgoto(ta) = ¬ ta (Must be true to execute) 11/15/2018 Carnegie Mellon University

x = get_input( ) y = … … goto y Real Use: Exploit Detection x = get_input( ) y = … … goto y Jumping to overwritten return address … strcpy(buffer,argv[1]) ; return ; 11/15/2018 Carnegie Mellon University

Carnegie Mellon University Memory Load Variables Memory Δ Var Val x 7 μ Addr Val 7 42 Tainted? T Var x τ Tainted? F Addr 7 τμ 11/15/2018 Carnegie Mellon University

Problem: Memory Addresses Δ Var Val x = get_input( ) y = load( x ) … goto y x 7 μ Addr Val 7 42 All values derived from user input are tainted?? Tainted? F Addr 7 τμ 11/15/2018 Carnegie Mellon University

Δ μ τμ Policy 1: Undertainting Failing to identify tainted values Taint depends only on the memory cell Δ Var Val x = get_input( ) y = load( x ) … goto y Undertainting Failing to identify tainted values - e.g., missing exploits x 7 Jump target could be any untainted memory cell value μ Addr Val 7 42 Taint Propagation Tainted? F Addr 7 τμ Load v = Δ[x] , t = τμ[v] load(x) ↓ t 11/15/2018 Carnegie Mellon University

Policy 2: Overtainting Unaffected values are tainted If either the address or the memory cell is tainted, then the value is tainted Memory printa printb x = get_input( ) y = load(jmp_table + x % 2 ) … goto y Overtainting Unaffected values are tainted - e.g., exploits on safe inputs Address expression is tainted jmp_table Policy Violation? Taint Propagation Load v = Δ[x] , t = τμ[v], ta = τ[x] load(x) ↓ t v ta 11/15/2018 Carnegie Mellon University

Research Challenge State-of-the-Art is not perfect for all programs Undertainting: Policy may miss taint Overtainting: Policy may wrongly detect taint 11/15/2018 Carnegie Mellon University

Carnegie Mellon University Forward Symbolic Execution: What input will make execution reach this line of code? How it works – example Inherent problems of symbolic execution Proposed solutions 11/15/2018 Carnegie Mellon University

Carnegie Mellon University The Challenge bad_abs(x is input) if (x < 0) then return -x if (x = 0x12345678) then return -x return x 232 possible inputs 0x12345678 Forward Symbolic Execution: What input will make execution reach this line of code? 11/15/2018 Carnegie Mellon University

A Simple Example x symbolic can have any value Interpreter bad_abs(x is input) If (x < 0) If x == 0x12345678 return -x return x What input will make execution reach this line of code? Interpreter Interpreter x ≥ 0 t f Interpreter Interpreter t f x < 0 x ≥ 0 Λ x != 0x12345678 x ≥ 0 Λ x == 0x12345678 11/15/2018 Carnegie Mellon University

One Problem: Exponential Blowup Due to Branches Interpreter Branch 1 Branch 2 Branch 3 Exponential Number of Interpreters/formulas in # of branches 11/15/2018 Carnegie Mellon University

Path Selection Heuristics Symbolic Execution Tree However, these are heuristics. In the worst case all create an exponential number of formulas in the tree height. Depth-First Search (bounded) ,Random Search [Cadar2008] Concolic Testing [Sen2005,Godefroid2008] … 11/15/2018 Carnegie Mellon University

Symbolic Execution is not Easy Exponential number of interpreters/formulas Exponentially-sized formulas Solving a formula is NP-Complete! branching s + s + s + s + s + s + s + s == 42 substitution 11/15/2018 Carnegie Mellon University

Other Important Issues Formalization Π = (s + s + s + s + s + s + s + s) == 42 More complex policies 11/15/2018 Carnegie Mellon University

Carnegie Mellon University Conclusion Dynamic taint analysis and forward symbolic execution used extensively in literature Formal algorithm and what is done for each possible step of execution often not emphasized We provided a formal definition and summarized Critical issues State-of-the-art solutions Common tradeoffs 11/15/2018 Carnegie Mellon University

Carnegie Mellon University Thank You! thanassis@cmu.edu Questions? 11/15/2018 Carnegie Mellon University