Artemis: Practical Runtime Monitoring of Applications for Execution Anomalies
Long Fei and Samuel P. Midkiff
School of Electrical and Computer Engineering, Purdue University, West Lafayette
PLDI 2006. A subproject of PROBE.
Motivation
Bugs are expensive: an estimated cost of $60 billion (0.6% of GDP) in 2002.
Debugging approaches:
- Manual debugging – inefficient, often infeasible
- Extensive testing – inefficient; path explosion
- Static analysis – conservative; false alarms
- Manual annotation – inefficient; does not scale
- Runtime debugging – high overhead
What is Artemis?
Artemis is not a bug-detection tool itself; it makes existing tools more efficient at bug detection. Instead of feeding the entire program execution directly to a runtime analysis, Artemis sits between the two and filters what the analysis examines.
Existing schemes: a slowdown of a few times is common, and it can reach two orders of magnitude. With Artemis, much less data is examined by the runtime analysis, reducing overhead to under 10% in long-running programs.
Outline for the Rest of the Talk
- Bird's-eye view of related work
- The Artemis framework
- Experimental results
- Conclusions
Bird's-Eye View of Compiler-Aided Debugging Techniques
Compiler techniques for software debugging divide into static and dynamic; dynamic techniques attack runtime overhead in three ways:
- Faster: more efficient, problem-specific designs that usually involve assumptions about the OS, compiler, or hardware
- Parallel: exploit parallelism – a shadow checking process (Patil, SPE '97) or thread-level speculation (Oplinger, ASPLOS '02)
- Selective: perform fewer checks
  - Without program information: sampling, either random (Liblit, PLDI '03) or adaptive (Chilimbi, ASPLOS '04)
  - Using program information: Artemis
Artemis Design Goals
- General: works with multiple pre-existing debugging schemes; a pure software approach that runs on general hardware, OSes, and compilers
- Effective: improves overhead in general; low asymptotic overhead in long-running programs
- Adaptive: adjusts monitoring coverage to system load
Key Idea
Because runtime monitoring is expensive, ideally we would monitor only when a bug occurs. Our goal is to approximate this ideal: avoid re-monitoring executions whose outcome has been previously observed.
How to Determine Where Bugs Are Likely to Be Seen
A code region's behavior (including buggy behavior) is determined by the region's context, so we monitor a region the first time it executes under a given context:
- If the execution is buggy, the bug is monitored.
- If it is not buggy, the region is monitored under this context only once.
Over time, almost all executions of a region occur under a previously seen context, which yields low asymptotic monitoring overhead.
The hard part: efficiently representing, storing, and comparing contexts for a region – the context could be the whole program state!
Decision to Monitor
At each code segment entrance:
- First entrance? If yes, initialize the context record and use the monitored version.
- Otherwise, has this context been seen before?
  - Yes: use the unmonitored version.
  - No: update the context record (add the current context) and use the monitored version.
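The decision steps above can be sketched as follows. This is a minimal illustration, not Artemis's actual implementation: `region_contexts`, `should_monitor`, and the context representation are names invented here for the example.

```python
# Sketch of the monitor-once-per-context decision (illustrative names only).
region_contexts = {}  # region id -> set of previously observed (approximated) contexts

def should_monitor(region_id, context):
    """True iff this region has not yet been seen under this context."""
    seen = region_contexts.setdefault(region_id, set())  # first entrance: init record
    if context in seen:
        return False        # context seen before: run the unmonitored version
    seen.add(context)       # new context: record it and run the monitored version
    return True
```

For example, the first call `should_monitor("foo", (4, 3))` returns True; every later call with the same region and context returns False, so the unmonitored version runs.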
Target Programs
Our prototype targets sequential code regions (a consequence of how contexts are defined). It can be used with race-free programs without loss of precision, targeting the sequential regions of those programs. Use with programs containing races is ongoing research.
Implementation Issues
- Defining code regions
- Representing and comparing contexts
- Interfacing with existing runtime debugging schemes
- Adhering to overhead constraints
- Adapting to system load
Defining Code Regions
- Spatial granularity
- Temporal granularity: context-check frequency and context-check efficiency
Ideal case: a small context dominates the behavior of a large piece of code.
Our choice: the procedure – a natural logical boundary.
Approximating Context for Efficiency
The exact context is too large to store and check (it might be the entire program state), so we represent it approximately – a tradeoff between precision and efficiency.
The approximated context consists of in-scope global variables, method parameters, and in-scope pointers:
- Values of non-pointer variables are mapped into a compact form (a value invariant, as in DIDUCE, ICSE '02). This requires 2 integer fields per variable, 2 bitwise operations per check, and 3 bitwise operations per update.
- Pointers are tracked by declared (not actual) types; argv is approximated by its vector length.
- Correlations between context elements are lost: if {a=4, b=3} and {a=5, b=8} are two contexts of a region, we track {a=(4,5), b=(3,8)}.
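The DIDUCE-style value invariant mentioned above can be sketched with two integer fields: a reference value and a mask of the bits that have never been observed to change. The class below is an illustrative 32-bit sketch, not the paper's code; note the check costs 2 bitwise operations and the update 3, matching the counts on the slide.

```python
# Sketch of a DIDUCE-style value invariant: two integer fields per variable.
class ValueInvariant:
    def __init__(self, first_value):
        self.value = first_value    # reference value (the first observation)
        self.mask = 0xFFFFFFFF      # bits still believed constant (32-bit sketch)

    def check(self, v):
        """2 bitwise ops: a violation is a value differing in a 'constant' bit."""
        return ((v ^ self.value) & self.mask) == 0

    def update(self, v):
        """3 bitwise ops: clear the mask bits where the new value differs."""
        self.mask &= ~(v ^ self.value)
```

After observing 4 and then updating with 5, the invariant accepts any value that differs from 4 only in the lowest bit (4 and 5 differ only there), and flags 6 or 7 as violations.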
Simulating Monitoring Schemes
We need to measure performance over a wide range of runtime monitoring schemes, so we use a generic monitoring scheme:
- Inserts instrumentation into the application with probability p.
- Calls a dummy monitoring function that simulates the overhead of some real monitoring scheme; the overhead can be adjusted from zero to arbitrarily large.
- Disabling the dummy monitoring reveals the asymptotic overhead of Artemis: the context checks associated with monitoring are still performed, but not the monitoring itself, which allows measuring the context-checking overhead alone.
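A generic scheme like the one described can be sketched as below. The function name, the site list, and the busy-wait dummy are assumptions made for this illustration; the paper does not specify its implementation.

```python
import random
import time

def instrument(sites, p, dummy_cost_s=0.0, rng=None):
    """Instrument each candidate site with probability p; each instrumented
    site gets a dummy monitor that burns dummy_cost_s seconds per call to
    simulate a real scheme's overhead (zero to arbitrarily large)."""
    rng = rng or random.Random(0)

    def dummy_monitor():
        end = time.perf_counter() + dummy_cost_s
        while time.perf_counter() < end:    # tunable, possibly zero, overhead
            pass

    return [dummy_monitor for s in sites if rng.random() < p]
```

With p=1.0 every site is instrumented; with p=0.0 none are, which corresponds to disabling the dummy monitoring to expose only the context-checking cost.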
Experiment – Asymptotic Overhead Measured by Simulation
Two Findings
- Performance floor: as the monitoring scheme's overhead approaches zero, Artemis's overhead is 5.57% of the unmonitored program's execution time.
- Break-even baseline monitoring overhead: solving x = 0.0045x + 0.0557 gives x = 5.60%, so whenever baseline monitoring overhead exceeds 5.6%, Artemis helps. This covers most monitoring techniques.
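The break-even arithmetic is a one-liner: collecting the x terms in x = 0.0045x + 0.0557 gives x(1 − 0.0045) = 0.0557.

```python
# Break-even baseline overhead: x = 0.0045*x + 0.0557  =>  x*(1 - 0.0045) = 0.0557
x = 0.0557 / (1 - 0.0045)
print(f"{x:.2%}")   # prints 5.60%
```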
An Optimization: Reuse Context Sets Across Runs
This eliminates the initial building of the sets of observed contexts and converges faster to the asymptotic overhead.
Invariant profile: dump the context invariants to a file at program exit, and load the dumped invariants at the next run. The invariant profile is 0.4–4.7% of the program binary size (average 1.7%, standard deviation 0.95%).
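The dump-at-exit / load-at-startup idea can be sketched as follows. The JSON format, the function names, and the per-region mapping are hypothetical; the slide does not describe Artemis's actual on-disk profile format.

```python
import json

def dump_profile(path, region_contexts):
    """At program exit: persist the observed contexts (hypothetical format)."""
    with open(path, "w") as f:
        json.dump({r: sorted(ctxs) for r, ctxs in region_contexts.items()}, f)

def load_profile(path):
    """At the next run: reload, so monitoring starts near asymptotic overhead."""
    try:
        with open(path) as f:
            return {r: set(map(tuple, ctxs)) for r, ctxs in json.load(f).items()}
    except FileNotFoundError:
        return {}   # no profile yet: fall back to building context sets from scratch
```

A run that starts from a loaded profile skips re-monitoring every region/context pair already seen in training, which is why convergence is faster.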
Using Artemis (with an Invariant Profile)
Workflow: source → instrumentation → build (baseline and Artemis versions) → training runs with Artemis produce the invariant profile → Artemis production run → bug report.
Convergence to Asymptotic Overhead – e.g., bzip2 from SPECint
Asymptotic overhead reduced from ~280% to under 7.5% (~7.3%).
Experiments with Real Monitoring Schemes
We measure how well a monitoring scheme guided by Artemis approximates the capabilities of the original scheme:
- Artemis with hardware-based monitoring (AccMon): detected 3/3 bugs with a 2.67x overhead improvement, in very short-running programs.
- Artemis with value-invariant detection and checking (C-DIDUCE, source-level instrumentation): covered 75% of violations with a 4.6x improvement, in short-running programs.
Full results and details are in the paper.
Conclusions
- A general framework that eliminates redundant runtime monitoring through context checking.
- Improves overhead, with low asymptotic overhead in long-running programs and small precision loss.
- Has enabled practical runtime monitoring of long-running programs.