Optimistic Hybrid Analysis:

Presentation transcript:

Optimistic Hybrid Analysis: Accelerating Dynamic Analysis through Predicated Static Analysis. David Devecsery, Peter Chen, Jason Flinn, Satish Narayanasamy. Do you know what your software is doing? We’d like to think X… We really don’t know! Talking points: It’s <time>; do you know what your software is doing? Sure, you know what it’s supposed to be doing (email, web pages, and so on), but do you actually know what it’s doing? Is it <vuln>? Unfortunately, today’s software runs almost completely unmonitored, so the software on your computer could be doing any of these things, and as we’ve seen time and time again, the results of these behaviors are disastrous.

Systems Today are Unsafe. Runtime systems are largely unmonitored. Consequently, they can misbehave to catastrophic effect. [Figure: Reality vs. Ideal World]

Unmonitored Software is Dangerous. [Figure: incident callouts (143,000,000 users’ data leaked; 1 billion accounts; 209,000 credit cards; users dial 911; Internet traffic drop of 40%; Therac-25) shown across application domains (Transportation, Banking, Medical, Social Networking, Shopping)]

Why didn’t the Therac-25 use data-race detection? At runtime, monitor every single memory access. This is called dynamic data-race detection. [Figure: Therac-25 execution with a check() added at each step]
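The idea of running a check() on every memory access can be sketched as follows. This is only a toy Eraser-style lockset check, chosen for brevity; it is not the algorithm used by FastTrack or ThreadSanitizer, and the class, method, and parameter names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class LocksetMonitor:
    """Toy dynamic race detector: check() runs on every memory access."""
    candidate_locks: dict = field(default_factory=dict)  # var -> locks held on every access so far
    accessors: dict = field(default_factory=dict)        # var -> threads that touched it
    written: set = field(default_factory=set)            # vars with at least one write
    checks_run: int = 0
    races: list = field(default_factory=list)

    def check(self, thread, var, is_write, locks_held):
        self.checks_run += 1
        held = frozenset(locks_held)
        # Keep only the locks that were held on every access to this variable.
        self.candidate_locks[var] = self.candidate_locks.get(var, held) & held
        self.accessors.setdefault(var, set()).add(thread)
        if is_write:
            self.written.add(var)
        shared = len(self.accessors[var]) > 1
        # Report a potential race: shared, written, and no common lock.
        if shared and var in self.written and not self.candidate_locks[var]:
            if var not in self.races:
                self.races.append(var)


monitor = LocksetMonitor()
monitor.check("T1", "a", is_write=True, locks_held={"m"})   # a is protected by lock m
monitor.check("T2", "a", is_write=False, locks_held={"m"})
monitor.check("T1", "b", is_write=True, locks_held=set())   # b is unprotected
monitor.check("T2", "b", is_write=True, locks_held=set())
print(monitor.races, monitor.checks_run)
```

Every access pays for a check, whether or not the variable could ever race, which is the cost the following slides set out to reduce.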

But, Fine-Grained Monitoring is Slow. It instruments most instructions and causes order-of-magnitude overheads: ThreadSanitizer ~23x [Serebryany et al.], FastTrack ~10x [Flanagan et al.]. [Figure: execution with check() calls; analysis cost vs. baseline runtime] Transition: Let me give you two examples.

Many Analysis Checks are Unnecessary. Observation: by analyzing code, we can prove some misbehaviors are impossible, then remove the unneeded checks. This is known as hybrid analysis. [Figure: code and its execution with check() calls; analysis cost vs. baseline runtime]

Many Analysis Checks are Unnecessary. Observation: by looking at code, we can prove some misbehaviors are impossible, then remove the unneeded checks. [Figure: sound static analysis applied to an execution with check() calls; analysis cost vs. baseline runtime] Talking points: Point out on the slide that this is ideal.
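A minimal sketch of how a static pass might remove unneeded checks, assuming a hypothetical program representation in which each access site is a (site id, thread, variable, is-write) tuple. The rule used here (keep a check only if the variable is touched by more than one thread and written at least once) is an illustrative simplification, not the analysis from the talk.

```python
from collections import defaultdict


def sites_needing_checks(access_sites):
    """Return the ids of access sites that still need a dynamic check.

    access_sites: iterable of (site_id, thread, var, is_write) tuples.
    """
    sites = list(access_sites)
    threads = defaultdict(set)     # var -> threads that may access it
    has_write = defaultdict(bool)  # var -> whether any site writes it
    for _, thread, var, is_write in sites:
        threads[var].add(thread)
        has_write[var] = has_write[var] or is_write
    # A variable can only race if two threads touch it and one of them writes.
    return [sid for sid, _, var, _ in sites
            if len(threads[var]) > 1 and has_write[var]]


program = [
    (1, "T1", "a", True),
    (2, "T1", "tmp", True),      # thread-local: never shared
    (3, "T2", "a", False),
    (4, "T2", "counter", True),  # only ever touched by T2
    (5, "T1", "tmp", False),
]
kept = sites_needing_checks(program)
print(kept)
```

Only the two accesses to the shared variable keep their checks; the other three sites run uninstrumented.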

Sound Whole Program Analysis is Overly Conservative. Known hard problems: complex language features, undecidable problems. Reachable: all reachable program states. Sound: the over-approximation used by analysis. [Figure: Sound and Reachable state sets] Talking points: Define program state: the state of variables and the function call stack. “The way program analysis deals with these problems is to over-approximate the state.” Over-approximation is both needed and harmful. Sound here means a sound approximation. Can consider more states. As the set gets larger, it’s harder to prove the absence of errors.

Static analysis is inefficient. Sound static analysis over-approximates to retain soundness. How can we create a more efficient static analysis? [Figure: a check() asks “Do I need to check if this is a race?”; the answers shown are “Maybe!” and “No, it’s not racy!”. Labels: Sound Static Analysis, Reachable, Analysis Cost, Baseline Runtime]

Preview: Optimistic Hybrid Analysis. 1) Replace sound static analysis with unsound static analysis. 2) Use speculative execution to retain soundness. [Figure: execution with check() calls and speculation checks. Labels: Sound Static Analysis, Reachable, Predicated, Spec. Checks, Analysis Cost, Baseline Runtime]

Outline. Motivation / vision; Background; Predicated static analysis; Optimistic hybrid analysis; Profiling; Speculative dynamic analysis; Example predicate; Evaluation: race detection (OptFT), program slicing (OptSlice).

Most Reachable States are Never Reached. The vast majority of executions only visit a small set of states. If we’re not using many states, why analyze them? Reachable: all reachable program states. Sound: the over-approximation used by analysis. [Figure: Sound and Reachable state sets, with a marker for an execution]

Key Insight: To optimize a dynamic analysis, we only need our analysis to be correct for the states we’re actually going to visit during execution.

Use Unsound Analysis. If we’re not using many states, why analyze them? For hybrid analysis, unsound analysis is perfectly fine. Reachable: all reachable program states. Sound: the over-approximation used by analysis. Unsound: unsound static analysis. [Figure: Sound, Reachable, and Unsound state sets, with a marker for an execution]

Using unsound static analysis. Unsound static analysis removes more checks. [Figure: execution with check() calls. Labels: Sound Static Analysis, Reachable, Unsound, Analysis Cost, Baseline Runtime]

Using unsound static analysis. How unsound can our analysis be? 1) It needs to be correct in the common case. 2) We need to know when we are running unanalyzed states. [Figure: execution with check() calls. Labels: Reachable, Unsound, Analysis Cost, Baseline Runtime]

Predicated Static Analysis. Predicated static analysis is an unsound analysis. Start with a sound static analysis. Assume additional predicates, which may or may not be true. As long as the predicates are true, we’re within the analyzed states. [Figure: sound static analysis over Reachable, and predicated static analysis over the subset of Reachable selected by the predicates] Talking points: We can learn invariants in many ways.

Predicated Static Analysis. Requirements: 1) It needs to be correct in the common case. 2) We need to know when we are running unanalyzed states. Insight: some things that are difficult to prove statically are simple to observe dynamically. If our predicates are observed in common executions, the analysis will target common states. We can identify if an execution leaves the analyzed states by monitoring the predicates at runtime. [Figure: Predicated state set within Reachable, selected by the predicates]

You may have been wondering: What happens if an execution falls outside of our analyzed state sets? Use the dynamic runtime to recover and retain sound analysis. [Figure: Predicated and Reachable state sets]

Maintaining soundness of dynamic analysis. Verify dynamically that our predicates hold true. [Figure: execution with check() calls and predicate checks. Labels: Reachable, Predicated, Pred. Checks, Analysis Cost, Baseline Runtime] Talking points: Predicate checks are cheaper than the checks we’ve removed. Take more time here and make a slower transition to the next slide: “we have bounded unsoundness…”

Speculation and Recovery. What about executions outside of our analyzed state set? If an execution goes outside Predicated, the speculation checks fail. We roll back and re-execute with conservative analysis. Fast in the common case, still correct in the uncommon case. [Figure: a speculative execution with predicate checks followed by a re-execution with the full set of check() calls. Labels: Predicated, Reachable, Pred. Checks, Analysis Cost, Baseline Runtime] Talking points: The conservative analysis is what the old analysis used, so it’s really not that bad.
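The roll-back-and-re-execute step can be sketched as below. This is a simplified model under stated assumptions: an execution is represented as a replayable list of site ids (so re-execution is deterministic), and the names PredicateViolation, run_with_analysis, and optimistic_run are hypothetical rather than taken from the authors' system.

```python
class PredicateViolation(Exception):
    """Raised when a run leaves the states the predicated analysis covered."""


def run_with_analysis(trace, checked_sites, assumed_unused=frozenset()):
    """Replay a trace of site ids, counting full checks at checked_sites only."""
    checks = 0
    for site in trace:
        if site in assumed_unused:
            # The cheap predicate check fails: the assumption was wrong.
            raise PredicateViolation(site)
        if site in checked_sites:
            checks += 1  # stands in for the expensive check()
    return checks


def optimistic_run(trace, optimistic_sites, conservative_sites, assumed_unused):
    """Try the optimistic instrumentation first; fall back if a predicate fails."""
    try:
        return "optimistic", run_with_analysis(trace, optimistic_sites, assumed_unused)
    except PredicateViolation:
        # Roll back: discard speculative results and re-execute from the start
        # with the conservative (traditional hybrid) instrumentation.
        return "conservative", run_with_analysis(trace, conservative_sites)


common = optimistic_run([1, 2, 3, 1], {1}, {1, 2, 3, 4}, {4})
rare = optimistic_run([1, 4, 3], {1}, {1, 2, 3, 4}, {4})
print(common, rare)
```

The common trace never touches the assumed-unused site and pays for two checks; the rare trace trips the predicate and is re-run with every site checked, so the result stays sound.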

Optimistic Hybrid Analysis. 1) Profiling. 2) Predicated static analysis. 3) Speculative dynamic analysis. [Figure: pipeline diagram with labels Binary, Profiling, Reachable, Predicated, Dynamic Checks, Predicate Checks]

Outline. Motivation / vision; Background; Predicated static analysis; Optimistic hybrid analysis; Profiling; Speculative dynamic analysis; Example predicate; Evaluation: race detection (OptFT), program slicing (OptSlice).

Example Predicate: Likely Unreachable Code (LUC). Identifies code that is unlikely to be executed. Think of it as “likely dead code elimination”. Example: if (getyear() > 1980) { a = foo } else { a = bar }, with the else branch, which runs only when the year is 1980 or earlier, marked as likely unused code.

Profiling: Likely Unreachable Code. Profile to find unused instructions. [Figure: code for Thread 1 and Thread 2 (a = new(), a = null, a.foo()) next to a profiling-results table with columns Line and Used?, in which a = null is marked Unused]
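A minimal sketch of the profiling step, reusing the getyear() example from the predicate slide. The helper names (branch_taken, profile) and the profiled years are assumptions made for illustration; a real profiler would record executed instructions rather than strings.

```python
def branch_taken(year):
    """Mirror of the slide's example: which assignment runs for a given year."""
    return "a = foo" if year > 1980 else "a = bar"


def profile(all_lines, runs):
    """Return the lines that no profiled run executed (likely unreachable)."""
    executed = set()
    for lines in runs:
        executed.update(lines)
    return [line for line in all_lines if line not in executed]


all_lines = ["a = foo", "a = bar"]
profiled_years = [1999, 2008, 2018, 2024]  # hypothetical profiling inputs
likely_unreachable = profile(all_lines, ([branch_taken(y)] for y in profiled_years))
print(likely_unreachable)
```

The else branch never runs in any profiled execution, so it is reported as likely unreachable. That is an observation about the profiled runs, not a proof, which is why the predicate still has to be checked at runtime.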

Predicated Analysis: LUC. Reduce checks using instruction counts. Profiling shows the line is unused. Because Thread 2 doesn’t access “a”, we don’t need to monitor it. [Figure: the same code and profiling results, with check() calls attached to the accesses; Reachable and Predicated state sets]

Predicated Analysis: LUC. Reduce checks using instruction counts. Also add runtime speculation checks. [Figure: the same code and profiling results, with check() calls and an added pred_check(); Reachable and Predicated state sets]
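Putting the profile and the static pass together, an instrumentation plan might be computed as in the sketch below. The assignment of lines to threads is an assumption, since the slide's layout did not survive extraction, and the instrument function and its sharing rule are hypothetical simplifications.

```python
def instrument(sites, likely_unused):
    """Decide what to insert at each line.

    sites: list of (line, thread, var, is_write) tuples.
    likely_unused: lines the profile never saw executed.
    Returns a dict mapping each line to "check()", "pred_check()", or "none".
    """
    # The predicated analysis ignores lines assumed to be unreachable.
    live = [s for s in sites if s[0] not in likely_unused]
    threads, writes = {}, set()
    for line, thread, var, is_write in live:
        threads.setdefault(var, set()).add(thread)
        if is_write:
            writes.add(var)
    plan = {}
    for line, thread, var, is_write in sites:
        if line in likely_unused:
            plan[line] = "pred_check()"  # cheap guard on the assumption
        elif len(threads[var]) > 1 and var in writes:
            plan[line] = "check()"       # still a possible race
        else:
            plan[line] = "none"
    return plan


sites = [
    ("a = new()", "T1", "a", True),
    ("a.foo()", "T1", "a", False),
    ("a = null", "T2", "a", True),  # assumed to be Thread 2's only access to a
]
sound_plan = instrument(sites, likely_unused=set())
predicated_plan = instrument(sites, likely_unused={"a = null"})
print(sound_plan)
print(predicated_plan)
```

Without the predicate, all three lines need a full check. With it, the two Thread 1 accesses need nothing and the unused line carries only a pred_check(), which triggers recovery if the line ever runs.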

Outline. Motivation / vision; Background; Predicated static analysis; Optimistic hybrid analysis; Profiling; Speculative dynamic analysis; Example predicate; Evaluation: race detection (OptFT), program slicing (OptSlice).

Evaluation. Two prototypes: OptFT, a fast data-race detector, and OptSlice, a dynamic backwards slicer. Six types of predicates: Likely Unused Code, Likely Callee Sites, Likely Unused Call Contexts, Likely Singleton Threads, Likely Guarding Locks, No Custom Synchronization. Talking points: Optimistic Hybrid Analysis is generally applicable. To demonstrate its effectiveness, I’m going to give you two examples of how to use OHA.

Runtime: OptFT. 1.8x speedup vs. traditional hybrid analysis. [Graph]

Runtime: OptSlice. On average, 8.3x speedup. [Graph]

Conclusion. Optimistic Hybrid Analysis accelerates analysis. It uses bounded unsoundness to accelerate analysis, with no loss of soundness in the dynamic results. Two implementations: OptFT, the fastest data-race detector, and OptSlice, with an 8.3x speedup for backward slicing. We’re excited to explore the limits of this technique. As far as we know, OHA is useful for any analysis.

Questions?