Finding bugs: Analysis Techniques & Tools Symbolic Execution & Constraint Solving CS161 Computer Security Cho, Chia Yuan
Lab
Q1: Manual reasoning on code – Mergesort implementation published in Wikibooks
Q2: Constraint Solving – ‘Solve’ for collisions in the ELFHash function
Q3: Whitebox & blackbox fuzzing – Use a dynamic symbolic execution tool to find bugs automatically
Start early!
Big Picture Attacks & Defenses Mobile Security (Android) Web Security Network Security Crypto Program Analysis & Verification Symbolic Execution & Constraint Solving Why?
A little history … Can we build a machine that can automatically reason and prove mathematical facts about programs?
1967
1976 “From one simple view, it is an enhanced testing technique. Instead of executing a program on a set of sample inputs, a program is "symbolically" executed for a set of classes of inputs.”
Why now?
Advances in SAT Solvers Source: Sanjit Seshia
Significance
How do we know our program is “correct”? In general, we don’t know.
– Test it
– Let users test it for us
– Fuzz it
– Try to prove it’s correct
– Static analysis
– Symbolic Execution & Constraint Solving
[Diagram axes: Precision, Coverage]
Dynamic Sym Exec is Directed Testing
Path-by-path exploration
[Control-flow diagram: len = input + 3; branch on (len < 10); branch on (len % 2 == 0); the leaves assign s = len, s = len + 2, s = len; then buf = malloc(s); read(fd, buf, len);]
Path formula: (len == input + 3) && !(len < 10) && !(len % 2 == 0)
Dynamic Sym Exec is Directed Testing (next path)
Path formula: (len == input + 3) && !(len < 10) && (len % 2 == 0)
Can we combine all paths into 1 single formula? Bounded Model Checking
How do we construct the formula & use a solver?
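To make "solving a path formula" concrete, the second path formula above can be handed to a brute-force search that plays the role of the solver. This is only a sketch: the names path_formula and solve_path are hypothetical, the search range is assumed to be small and non-negative, and a real SMT solver reasons about the formula symbolically instead of enumerating inputs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Path formula: (len == input + 3) && !(len < 10) && (len % 2 == 0).
   len is eliminated by substituting len = input + 3. */
bool path_formula(int32_t input) {
    int32_t len = input + 3;
    return !(len < 10) && (len % 2 == 0);
}

/* Hypothetical stand-in for a solver: scan inputs in [lo, hi] and return
   the first one that satisfies the path formula, or -1 if none does.
   Assumes 0 <= lo and hi small enough that input + 3 cannot overflow. */
int32_t solve_path(int32_t lo, int32_t hi) {
    for (int32_t input = lo; input <= hi; input++) {
        if (path_formula(input)) {
            return input;
        }
    }
    return -1;
}
```

Calling solve_path(0, 100) yields an input that drives execution down exactly this path, which is what a dynamic symbolic execution tool would feed back into the program as the next test case.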
Q2 Goal: ‘Solve’ for Hash Collisions
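As an illustration of what a collision means here, the sketch below uses the widely circulated form of the ELF hash on 32-bit values and searches two-character printable strings by brute force. The lab's ELFHash may differ in details, and find_two_char_collision is a hypothetical helper: in the lab, the search is replaced by a constraint solver working on a formula built from the hash code.

```c
#include <stdint.h>
#include <string.h>

/* Commonly published ELF hash, computed on 32-bit values (assumption:
   the lab's version behaves the same way). */
uint32_t elf_hash(const char *s) {
    uint32_t h = 0;
    for (; *s != '\0'; s++) {
        h = (h << 4) + (unsigned char)*s;
        uint32_t g = h & 0xF0000000u;   /* top four bits */
        if (g != 0) {
            h ^= g >> 24;
        }
        h &= ~g;
    }
    return h;
}

/* Brute-force stand-in for the solver: look for a two-character printable
   string, different from target, with the same hash. Returns 1 and fills
   out on success, 0 otherwise. */
int find_two_char_collision(const char *target, char out[3]) {
    uint32_t want = elf_hash(target);
    for (int c1 = 0x20; c1 <= 0x7E; c1++) {
        for (int c2 = 0x20; c2 <= 0x7E; c2++) {
            out[0] = (char)c1;
            out[1] = (char)c2;
            out[2] = '\0';
            if (strcmp(out, target) != 0 && elf_hash(out) == want) {
                return 1;
            }
        }
    }
    out[0] = '\0';
    return 0;
}
```

For two-character inputs the hash reduces to 16*c1 + c2, so for example "ab" and "bR" hash to the same value. Longer inputs involve the shift-and-xor step, which is where a solver becomes useful.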
Constructing Logic Formulas from Code Convert statements into Static Single Assignment (SSA) form Encode SSA into target solver input format
Static Single Assignment Equations
Unroll loops to form a loop-free program
– for(i=0; i<2; i++){a=a+1;} becomes a=a+1; a=a+1;
Rename the LHS of each assignment to a new local variable
– a1=a+1; a2=a+1;
Whenever a variable is read (e.g., on the RHS), replace it with the last assigned variable name
– a1=a0+1; a2=a1+1;
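The three steps can be checked side by side in C. This is a minimal sketch; the function names loop_version and ssa_version are made up for illustration, and the const qualifiers merely emphasize that each SSA variable is assigned exactly once.

```c
/* Original code: the loop body runs twice. */
int loop_version(int a) {
    for (int i = 0; i < 2; i++) {
        a = a + 1;
    }
    return a;
}

/* Unrolled and renamed: a0 is the input, a1 and a2 are each assigned once,
   and every read refers to the most recently assigned name. */
int ssa_version(int a0) {
    const int a1 = a0 + 1;
    const int a2 = a1 + 1;
    return a2;
}
```

Because every name has a single definition, each line of ssa_version can be read as a mathematical equation, which is what makes the translation to solver input mechanical.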
Conditional (if) statements
Dynamic Symbolic Execution:
– 2 separate path formulas
Bounded Model Checking:
– Merge both branches into 1 formula
Example
int example1(int x) {
    int ret;
    if (x > 0)
        ret = x;
    else
        ret = -x;
    assert(ret >= 0);
    return ret;
}
SSA
ret1 = x0
ret2 = -x0
ret3 = (x0>0 ? ret1 : ret2)
Q: Is !(ret3 >= 0) satisfiable? Is this program correct?
Constructing Logic Formulas from Code
Convert statements into Static Single Assignment (SSA) form
= Bit-vector equations in quantifier-free first-order logic
Encode SSA into target solver input format
– Bit-vector arithmetic logic
– “SMT” solver
– SMT-LIB 1.0 standard
Example SMT-LIB
:extrafuns(x0 BitVec[32])
:extrafuns(ret1 BitVec[32])
:extrafuns(ret2 BitVec[32])
:extrafuns(ret3 BitVec[32])
:extrapreds(branchcond1)
:assumption (= ret1 x0)
:assumption (= ret2 (bvneg x0))
:assumption (iff branchcond1 (bvsgt x0 bv0[32]))
:assumption (= ret3 (ite branchcond1 ret1 ret2))
:assumption (not (bvsge ret3 bv0[32]))
:formula true
SSA
ret1 = x0
ret2 = -x0
ret3 = (x0>0 ? ret1 : ret2)
Is !(ret3 >= 0) satisfiable?
Querying the Solver
$ ./z3 example1.smt -m
ret3 -> bv2147483648[32]
ret1 -> bv2147483648[32]
branchcond1 -> false
ret2 -> bv2147483648[32]
x0 -> bv2147483648[32]
sat
x0 = 0x80000000
int example1(int x) { … (x is 32 bits)
Two’s complement system
– Positive range: [0 .. 2^(N-1) – 1]
– Or: [0x00 .. 0x7FFFFFFF]
– 0x80000000 is a negative signed 32-bit value: -2^31
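Example
int example1(int x) {
    int ret;
    if (x > 0)
        ret = x;
    else
        ret = -x;
    assert(ret >= 0);
    return ret;
}
SSA
ret1 = x0
ret2 = -x0
ret3 = (x0>0 ? ret1 : ret2)
Q: Is !(ret3 >= 0) satisfiable?
Assertion violated if x = 0x80000000 (the most negative 32-bit value): in 32-bit two’s complement arithmetic, -x wraps around to the same negative value.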
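The violating input can be reproduced without a solver by evaluating the SSA equations on 32-bit bit-vectors. In this sketch the bit-vectors are modeled as uint32_t so that negation wraps like bvneg; negating the most negative int directly in C would be undefined behavior. The function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Signed interpretation of a 32-bit bit-vector: negative iff the top bit is set. */
bool bv_is_negative(uint32_t v) {
    return (v & 0x80000000u) != 0;
}

/* ret3 from the SSA equations, evaluated for a concrete x0. */
uint32_t example1_ret3(uint32_t x0) {
    uint32_t ret1 = x0;
    uint32_t ret2 = 0u - x0;                            /* bvneg x0 (wraps) */
    bool branchcond1 = !bv_is_negative(x0) && x0 != 0;  /* bvsgt x0 bv0[32] */
    return branchcond1 ? ret1 : ret2;                   /* ite */
}

/* True when x0 satisfies !(ret3 >= 0), i.e. the assertion is violated. */
bool violates_assertion(uint32_t x0) {
    return bv_is_negative(example1_ret3(x0));
}
```

violates_assertion(0x80000000u) is true, while ordinary inputs such as 0, 5, or 0xFFFFFFFFu (that is, -1) are not violations.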
Slightly Modified Example
int example1(char x) {
    int ret;
    if (x > 0)
        ret = x;
    else
        ret = -x;
    assert(ret >= 0);
    return ret;
}
SSA
ret1 = x0
ret2 = -x0
ret3 = (x0>0 ? ret1 : ret2)
Q: Is !(ret3 >= 0) satisfiable?
Example
:extrafuns(x0 BitVec[8])
:extrafuns(ret1 BitVec[32])
:extrafuns(ret2 BitVec[32])
:extrafuns(ret3 BitVec[32])
:extrapreds(branchcond1)
:assumption (= ret1 (sign_extend[24] x0))
:assumption (= ret2 (bvneg (sign_extend[24] x0)))
:assumption (iff branchcond1 (bvsgt x0 bv0[8]))
:assumption (= ret3 (ite branchcond1 ret1 ret2))
:assumption (not (bvsge ret3 bv0[32]))
:formula true
SSA
ret1 = x0
ret2 = -x0
ret3 = (x0>0 ? ret1 : ret2)
Is !(ret3 >= 0) satisfiable?
Querying the Solver
$ ./z3 example1.smt -m
unsat
int example1(char x) {
    int ret;
    if (x > 0)
        ret = x;
    else
        ret = -x;
    assert(ret >= 0);
    return ret;
}
No satisfying assignment exists ==> Assertion holds for all possible inputs!
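Since a char input has only 256 possible values, the unsat answer can be cross-checked by exhaustive enumeration. The sketch below assumes an 8-bit signed char, as the sign_extend[24] encoding implies, and again models bit-vectors with unsigned types; the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* sign_extend[24]: widen an 8-bit vector to 32 bits by copying its sign bit. */
uint32_t sign_extend_8_to_32(uint8_t v) {
    return (v & 0x80u) ? (0xFFFFFF00u | v) : (uint32_t)v;
}

/* ret3 for the char version of example1. */
uint32_t example1_char_ret3(uint8_t x0) {
    uint32_t wide = sign_extend_8_to_32(x0);
    uint32_t ret1 = wide;
    uint32_t ret2 = 0u - wide;                          /* bvneg */
    bool branchcond1 = (x0 & 0x80u) == 0 && x0 != 0;    /* bvsgt x0 bv0[8] */
    return branchcond1 ? ret1 : ret2;
}

/* Number of 8-bit inputs for which ret3 is negative (assertion violated). */
int count_violations(void) {
    int count = 0;
    for (unsigned x = 0; x < 256; x++) {
        if (example1_char_ret3((uint8_t)x) & 0x80000000u) {
            count++;
        }
    }
    return count;
}
```

count_violations() returning 0 agrees with the solver: even the most negative char, -128, negates to 128, which fits comfortably in a 32-bit int.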
SMT-LIB “Cheat” Sheet: Bit-vectors
Declare 32-bit “variable” ‘a’:    :extrafuns( a BitVec[32] )
n-bit sign extension of ‘a’:      (sign_extend[n] a)
32-bit constant ‘1234’:           bv1234[32]
Unary functions:
  ~a        (bvnot a)
  -a        (bvneg a)
Binary functions, written (bvfoo a b):
  a & b     bvand
  a | b     bvor
  a ^ b     bvxor
  a + b     bvadd
  a << b    bvshl
  a >> b    bvlshr (logical right shift)
Binary predicates, written (bvfoo a b):
  a > b     bvsgt (signed)
  a >= b    bvsge (signed)
SMT-LIB “Cheat” Sheet: Booleans
Declare a predicate ‘C’:    :extrapreds( C )
Unary connectives:
  !C            (not C)
Binary connectives, written (foo C D):
  C => D        implies
  C && D        and
  C || D        or
  also: xor, iff
Ternary connectives:
  C ? a : b     (ite C a b), where a, b can be bit-vectors
Exercise: C Operator Precedence
a = (b >> c) + d; b = -(a ^ ~c);
1. SSA equations?
2. SMT-LIB formula?
Exercise: C Operator Precedence
int a,b; char d;
a = (b >> 3) + d;
b = -(a ^ ~d);
SSA
a1 = (b0 >> 3) + d0;
b1 = -(a1 ^ ~d0);
SMT-LIB
:extrafuns(a1 BitVec[32])
:extrafuns(b0 BitVec[32])
:extrafuns(b1 BitVec[32])
:extrafuns(d0 BitVec[8])
:assumption (= a1 (bvadd (bvlshr b0 bv3[32]) (sign_extend[24] d0)))
:assumption (= b1 (bvneg (bvxor (bvnot (sign_extend[24] d0)) a1)))
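The two assumptions can be evaluated on concrete values to sanity-check the encoding. This sketch mirrors the SMT-LIB terms operator by operator on uint32_t values. Note one assumption carried over from the slide: bvlshr is a logical shift, which matches C's >> only for non-negative b. The names exercise_a1 and exercise_b1 are hypothetical.

```c
#include <stdint.h>

/* sign_extend[24]: widen an 8-bit vector to 32 bits by copying its sign bit. */
uint32_t sign_extend_8_to_32(uint8_t v) {
    return (v & 0x80u) ? (0xFFFFFF00u | v) : (uint32_t)v;
}

/* a1 = (bvadd (bvlshr b0 bv3[32]) (sign_extend[24] d0)) */
uint32_t exercise_a1(uint32_t b0, uint8_t d0) {
    return (b0 >> 3) + sign_extend_8_to_32(d0);
}

/* b1 = (bvneg (bvxor (bvnot (sign_extend[24] d0)) a1)) */
uint32_t exercise_b1(uint32_t a1, uint8_t d0) {
    return 0u - (~sign_extend_8_to_32(d0) ^ a1);
}
```

For b0 = 16 and d0 = 1, a1 is 3 and b1 is 3, matching what the C statements compute for the same inputs: (16 >> 3) + 1 = 3 and -(3 ^ ~1) = -(-3) = 3.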
Additional References An enjoyable read on verification history: – Vijay D’Silva, Tales from Verification History More about “constraint solvers”: – Daniel Kroening & Ofer Strichman, Decision Procedures: An Algorithmic Point of View