Finding the Weakest Characterization of Erroneous Inputs
Dzintars Avots and Benjamin Livshits

The Art of Hiding Your Sources
- Our approach: fleece as many papers as possible
- You will most likely find similarities with:
  - Korat: Automated Testing Based on Java Predicates
  - Automatic Predicate Abstraction of C Programs
  - From Symptom to Cause: Localizing Errors in Counterexample Traces
  - Parametric shape analysis via 3-valued logic
  - Weakest precondition reasoning, etc.

Problem Statement
- A lot of static tools produce error traces
  - Metal
  - Intrinsa
  - Others
- However, testing for false negatives in error traces is often hard
- Why?
  - Need to determine if the error trace is feasible
  - How to trigger that particular path?
  - What conditions on the input and environment need to hold?

More Concrete Examples
- Comes from (real) research motivation
- Buffer overruns (last year’s FSE)
  - A buffer overrun is a “tainted” user value copied to a statically sized buffer
  - Generated buffer overruns across many procedure invocations
  - How to test if it may actually be exploitable?
- Fault injection in Java (current research)
  - Introduce “bad” values into the system
  - Start with HttpRequest
  - Populate its fields
  - Push the request through the system
  - See if we get an exception thrown

Exploring Possibilities
- Assume: varying the input influences the outcome
- Input:
  - string buffers
  - elements of Java structures
- Korat:
  - try “small” inputs and see what happens
- Want:
  - weakest condition on the input that always causes a failure

Observations
- Would be nice to have summarized representations of input which leads to definite failure, or definite success
- Could use TVLA to show whether this input succeeds or fails, or both
- Can we automatically derive classes of inputs through program analysis?

Stores describe program input
[Diagram: stdin and input nodes u1, u2, u3]
- Properties: Int_val(u1) > 0, char_val(u2) > 0, char_val(u3) = 0
- Edges: “is followed by”
- Represents: 5“abcde\0”, 1“x\0”, etc.
- Current stream position also represented
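As a rough illustration of what such a store "represents", the sketch below checks a concrete input against the three node properties listed on this slide. It assumes that u2 is a summary node standing for one or more characters (the slide does not say so explicitly, but the examples suggest it), and the function name `matches` is hypothetical. Note that nothing here ties the integer to the string length, since the listed properties do not.

```python
def matches(int_val, chars):
    """Check a concrete input against the abstract store u1 -> u2 -> u3.

    Assumption: u2 is a summary node, i.e. it stands for one or more
    characters, and u3 is the single terminating character.
    """
    if int_val <= 0:            # u1: Int_val(u1) > 0
        return False
    if len(chars) < 2:          # need at least one u2 character plus u3
        return False
    *body, last = chars
    # u2: every summarized character is nonzero; u3: the terminator is zero
    return all(ord(c) > 0 for c in body) and ord(last) == 0


print(matches(5, "abcde\0"))    # the first example from the slide
print(matches(1, "x\0"))        # the second example from the slide
```

Under these assumptions an input such as 7 followed by "ab\0" is also accepted, which is one way to see why comparing lengths needs extra predicates.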

Imitating Pred Abstraction
- Define predicate update formula using predicates satisfying weakest precondition
  - pred’ = WP(pred) ∨ (¬WP(¬pred) ∧ 1/2)
- Enforce construct is taken care of by TVLA coerce optimization
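Assuming the update formula reads pred’ = WP(pred) ∨ (¬WP(¬pred) ∧ 1/2) in Kleene's three-valued logic, a minimal sketch of how it evaluates is given below. The encoding of truth values as fractions and the helper names (`k_not`, `k_and`, `k_or`, `update`) are choices made for this illustration, not part of TVLA.

```python
from fractions import Fraction

# Kleene truth values: 0 (false), 1/2 (unknown), 1 (true)
FALSE, HALF, TRUE = Fraction(0), Fraction(1, 2), Fraction(1)


def k_not(a):
    return 1 - a


def k_and(a, b):
    return min(a, b)


def k_or(a, b):
    return max(a, b)


def update(wp_pred, wp_not_pred):
    """pred' = WP(pred) OR (NOT WP(NOT pred) AND 1/2)."""
    return k_or(wp_pred, k_and(k_not(wp_not_pred), HALF))


# pred' is definite when one of the weakest preconditions holds,
# and falls back to 1/2 when neither does.
for wp, wp_neg in [(TRUE, FALSE), (FALSE, TRUE), (FALSE, FALSE)]:
    print(wp, wp_neg, update(wp, wp_neg))
```

The third case is the interesting one: when neither WP(pred) nor WP(¬pred) is known to hold, the updated predicate becomes 1/2 rather than being guessed.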

Problems
- Length properties
  - How to compare lengths of summarized lists with iterator position
- Deriving input shape
  - Input store properties are initially unknown
  - Reads “create” or reuse input nodes
  - Branch conditions assert properties of input shape, which isn’t that interesting if “unknown”

Where do we need precision?
- Local pointer relations (same as before)
- Current stream position
- Relevant branch condition predicates
    if (x) { if (y) { …; FAIL(); } else { …; } }
    else   { if (y) { …; FAIL(); } else { …; } }
  - y is relevant, x is not?
  - What if (¬x, y) and (x, ¬y) are both infeasible?
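The relevance question on this slide can be made concrete with a small brute-force sketch. It models the branch structure as a function that fails exactly when y holds, then asks whether a single predicate's value fixes the outcome over a given set of feasible (x, y) combinations. The names `fails` and `determines_failure` are hypothetical and exist only for this illustration.

```python
from itertools import product


def fails(x, y):
    # Mirrors the slide: FAIL() is reached exactly when y holds,
    # on either side of the branch on x.
    return y


def determines_failure(pred_index, feasible):
    """True if the chosen predicate alone fixes the outcome on feasible inputs."""
    outcomes = {}
    for assignment in feasible:
        outcomes.setdefault(assignment[pred_index], set()).add(fails(*assignment))
    return all(len(seen) == 1 for seen in outcomes.values())


all_inputs = list(product([False, True], repeat=2))
correlated = [(False, False), (True, True)]    # (¬x, y) and (x, ¬y) infeasible

print(determines_failure(0, all_inputs))   # does x alone decide failure?
print(determines_failure(1, all_inputs))   # does y alone decide failure?
print(determines_failure(0, correlated))   # x alone, once x and y are tied together
```

When every combination is feasible, only y decides the outcome. Once the two mixed combinations are ruled out, x carries the same information as y, which is why syntactic relevance alone does not settle which predicates to track.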

Classifying Predicates
- Classify all paths through program:
  - Erroneous “evil” paths
  - Good paths
- Classify all predicates in the program:
  - P_1: Located on erroneous paths only
  - P_0: Located on good paths only
  - P_1/2: Located on both types of paths (most fall in the last category)
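The classification above amounts to a few set operations. The sketch below assumes a finite, already enumerated list of paths, each tagged as erroneous or good together with the predicates that occur on it. Real programs with loops would need some bounded or abstract notion of path, and the name `classify_predicates` is hypothetical.

```python
def classify_predicates(paths):
    """Split predicates into (P_0, P_1, P_1/2).

    paths: iterable of (predicates_on_path, is_erroneous) pairs.
    """
    on_bad, on_good = set(), set()
    for preds, is_erroneous in paths:
        (on_bad if is_erroneous else on_good).update(preds)
    p1 = on_bad - on_good        # on erroneous paths only
    p0 = on_good - on_bad        # on good paths only
    p_half = on_bad & on_good    # on both kinds of paths
    return p0, p1, p_half


paths = [
    ({"a", "c"}, True),    # an erroneous path
    ({"b", "c"}, False),   # a good path
]
p0, p1, p_half = classify_predicates(paths)
print(p0, p1, p_half)
```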

Iteratively Run TVLA
  I = P_0 ∪ P_1; // set of instrumentation predicates
  do {
    1. use I as instrumentation predicates
    2. run TVLA on the program
    3. add input TVLA structures leading to error to S
    4. include more predicates into I if have ½ values
  } while (I changes && not tired yet);
  // simplify structures leading to error
  w = empty
  foreach (configuration c in S) {
    OR c with w // w is the weakest input leading to error
  }
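The loop on this slide can be sketched as an ordinary fixed-point iteration. In the sketch below, the analysis itself is passed in as a callback `analyze`, which is a hypothetical stand-in and not the TVLA interface: it takes the current instrumentation predicates and returns the input structures found to lead to an error plus the predicates that still evaluated to 1/2. The sketch adds all such predicates each round, whereas the slides leave room for heuristics. The "not tired yet" condition is modeled as a round limit.

```python
def refine(p0, p1, analyze, max_rounds=10):
    """Grow the instrumentation set I until it stops changing.

    analyze(I) -> (error_structures, half_valued_predicates) is an assumed
    callback standing in for one analysis run.
    """
    instrumentation = set(p0) | set(p1)      # I = P_0 union P_1
    error_inputs = []                        # S
    for _ in range(max_rounds):              # "not tired yet"
        structures, half_valued = analyze(frozenset(instrumentation))
        for s in structures:
            if s not in error_inputs:
                error_inputs.append(s)
        new = set(half_valued) - instrumentation
        if not new:                          # I did not change
            break
        instrumentation |= new
    # w is the disjunction (OR) of everything in S; here S itself is
    # returned and read as that disjunction.
    return instrumentation, error_inputs


def toy_analyze(preds):
    # Toy stand-in: the error cannot be pinned down until "y" is tracked.
    if "y" not in preds:
        return [], {"y"}
    return ["y holds at the failing branch"], set()


final_preds, w = refine({"a"}, {"b"}, toy_analyze)
print(sorted(final_preds), w)
```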

Bottom Line
- Identify weakest input w leading to errors
- TVLA provides a sound proof that it will always lead to an error
- Have a choice of which predicates to add to I next, can try heuristics
- Get a qualitatively much stronger answer than Korat
