Artemis: Practical Runtime Monitoring of Applications for Execution Anomalies
Long Fei and Samuel P. Midkiff
School of Electrical and Computer Engineering, Purdue University, West Lafayette
PLDI 2006. A subproject of PROBE.
Motivation
Bugs are expensive: an estimated cost of $60 billion (0.6% of GDP) in 2002.
Debugging approaches:
- Manual debugging – inefficient, often infeasible
- Extensive testing – inefficient; path explosion
- Static analysis – conservative; false alarms
- Manual annotation – inefficient; does not scale
- Runtime debugging – high overhead
What is Artemis?
Artemis is not a bug-detection tool itself; it makes existing tools more efficient at bug detection. Instead of feeding the entire program execution directly to a runtime analysis, Artemis sits between the two and filters what the analysis examines.
Existing schemes: a slowdown of a few times is common, and it can reach two orders of magnitude. With Artemis, much less data is examined by the runtime analysis, reducing overhead to under 10% in long-running programs.
Outline for the Rest of the Talk
- Bird's-eye view of related work
- The Artemis framework
- Experimental results
- Conclusions
Bird's-Eye View of Compiler-Aided Debugging Techniques
Compiler techniques for software debugging divide into static and dynamic; dynamic techniques attack runtime overhead in three ways:
- Faster: more efficient, problem-specific designs that usually involve assumptions about the OS, compiler, or hardware
- Parallel: exploit parallelism – a shadow checking process (Patil, SPE '97) or thread-level speculation (Oplinger, ASPLOS '02)
- Selective: perform fewer checks
  - Without program information: sampling, either random (Liblit, PLDI '03) or adaptive (Chilimbi, ASPLOS '04)
  - Using program information: Artemis
Artemis Design Goals
- General: works with multiple pre-existing debugging schemes; a pure software approach that runs on general hardware, OSes, and compilers
- Effective: improves overhead in general; low asymptotic overhead in long-running programs
- Adaptive: adjusts monitoring coverage to system load
Key Idea
Because runtime monitoring is expensive, ideally we would monitor only when a bug occurs. Our goal is to approximate this ideal: avoid re-monitoring executions whose outcome has been previously observed.
How to Determine Where Bugs Are Likely to Be Seen
A code region's behavior (including buggy behavior) is determined by the region's context, so we monitor a region the first time it executes under a given context:
- If the execution is buggy, the bug is monitored.
- If it is not buggy, the region is monitored under this context only once.
Over time, almost all executions of a region occur under a previously seen context, which yields low asymptotic monitoring overhead.
The hard part: efficiently representing, storing, and comparing contexts for a region – the context could be the whole program state!
Decision to Monitor
At each code segment entrance:
- First entrance? If yes, initialize the context record and use the monitored version.
- Otherwise, has this context been seen before?
  - Yes: use the unmonitored version.
  - No: update the context record (add the current context) and use the monitored version.
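The decision steps above can be sketched as follows. This is a minimal illustration, not Artemis's actual implementation: `region_contexts`, `should_monitor`, and the context representation are names invented here for the example.

```python
# Sketch of the monitor-once-per-context decision (illustrative names only).
region_contexts = {}  # region id -> set of previously observed (approximated) contexts

def should_monitor(region_id, context):
    """True iff this region has not yet been seen under this context."""
    seen = region_contexts.setdefault(region_id, set())  # first entrance: init record
    if context in seen:
        return False        # context seen before: run the unmonitored version
    seen.add(context)       # new context: record it and run the monitored version
    return True
```

For example, the first call `should_monitor("foo", (4, 3))` returns True; every later call with the same region and context returns False, so the unmonitored version runs.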
Target Programs
Our prototype targets sequential code regions (a consequence of how contexts are defined). It can be used with race-free programs without loss of precision, targeting the sequential regions of those programs. Use with programs containing races is ongoing research.
Implementation Issues
- Defining code regions
- Representing and comparing contexts
- Interfacing with existing runtime debugging schemes
- Adhering to overhead constraints
- Adapting to system load
Defining Code Regions
- Spatial granularity
- Temporal granularity: context-check frequency and context-check efficiency
Ideal case: a small context dominates the behavior of a large piece of code.
Our choice: the procedure – a natural logical boundary.
Approximating Context for Efficiency
The exact context is too large to store and check (it might be the entire program state), so we represent it approximately – a tradeoff between precision and efficiency.
The approximated context consists of in-scope global variables, method parameters, and in-scope pointers:
- Values of non-pointer variables are mapped into a compact form (a value invariant, as in DIDUCE, ICSE '02). This requires 2 integer fields per variable, 2 bitwise operations per check, and 3 bitwise operations per update.
- Pointers are tracked by declared (not actual) types; argv is approximated by its vector length.
- Correlations between context elements are lost: if {a=4, b=3} and {a=5, b=8} are two contexts of a region, we track {a=(4,5), b=(3,8)}.
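The DIDUCE-style value invariant mentioned above can be sketched with two integer fields: a reference value and a mask of the bits that have never been observed to change. The class below is an illustrative 32-bit sketch, not the paper's code; note the check costs 2 bitwise operations and the update 3, matching the counts on the slide.

```python
# Sketch of a DIDUCE-style value invariant: two integer fields per variable.
class ValueInvariant:
    def __init__(self, first_value):
        self.value = first_value    # reference value (the first observation)
        self.mask = 0xFFFFFFFF      # bits still believed constant (32-bit sketch)

    def check(self, v):
        """2 bitwise ops: a violation is a value differing in a 'constant' bit."""
        return ((v ^ self.value) & self.mask) == 0

    def update(self, v):
        """3 bitwise ops: clear the mask bits where the new value differs."""
        self.mask &= ~(v ^ self.value)
```

After observing 4 and then updating with 5, the invariant accepts any value that differs from 4 only in the lowest bit (4 and 5 differ only there), and flags 6 or 7 as violations.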
Simulating Monitoring Schemes
We need to measure performance over a wide range of runtime monitoring schemes, so we use a generic monitoring scheme:
- Inserts instrumentation into the application with probability p.
- Calls a dummy monitoring function that simulates the overhead of some real monitoring scheme; the overhead can be adjusted from zero to arbitrarily large.
- Disabling the dummy monitoring reveals the asymptotic overhead of Artemis: the context checks associated with monitoring are still performed, but not the monitoring itself, which allows measuring the context-checking overhead alone.
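A generic scheme like the one described can be sketched as below. The function name, the site list, and the busy-wait dummy are assumptions made for this illustration; the paper does not specify its implementation.

```python
import random
import time

def instrument(sites, p, dummy_cost_s=0.0, rng=None):
    """Instrument each candidate site with probability p; each instrumented
    site gets a dummy monitor that burns dummy_cost_s seconds per call to
    simulate a real scheme's overhead (zero to arbitrarily large)."""
    rng = rng or random.Random(0)

    def dummy_monitor():
        end = time.perf_counter() + dummy_cost_s
        while time.perf_counter() < end:    # tunable, possibly zero, overhead
            pass

    return [dummy_monitor for s in sites if rng.random() < p]
```

With p=1.0 every site is instrumented; with p=0.0 none are, which corresponds to disabling the dummy monitoring to expose only the context-checking cost.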
Experiment – Asymptotic Overhead Measured by Simulation
Two Findings
- Performance floor: as the monitoring scheme's overhead approaches zero, Artemis's overhead is 5.57% of the unmonitored program's execution time.
- Break-even baseline monitoring overhead: solving x = 0.0045x + 0.0557 gives x = 5.60%, so whenever baseline monitoring overhead exceeds 5.6%, Artemis helps. This covers most monitoring techniques.
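The break-even arithmetic is a one-liner: collecting the x terms in x = 0.0045x + 0.0557 gives x(1 − 0.0045) = 0.0557.

```python
# Break-even baseline overhead: x = 0.0045*x + 0.0557  =>  x*(1 - 0.0045) = 0.0557
x = 0.0557 / (1 - 0.0045)
print(f"{x:.2%}")   # prints 5.60%
```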
An Optimization: Reuse Context Sets Across Runs
This eliminates the initial building of the sets of observed contexts and converges faster to the asymptotic overhead.
Invariant profile: dump the context invariants to a file at program exit, and load the dumped invariants at the next run. The invariant profile is 0.4–4.7% of the program binary size (average 1.7%, standard deviation 0.95%).
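The dump-at-exit / load-at-startup idea can be sketched as follows. The JSON format, the function names, and the per-region mapping are hypothetical; the slide does not describe Artemis's actual on-disk profile format.

```python
import json

def dump_profile(path, region_contexts):
    """At program exit: persist the observed contexts (hypothetical format)."""
    with open(path, "w") as f:
        json.dump({r: sorted(ctxs) for r, ctxs in region_contexts.items()}, f)

def load_profile(path):
    """At the next run: reload, so monitoring starts near asymptotic overhead."""
    try:
        with open(path) as f:
            return {r: set(map(tuple, ctxs)) for r, ctxs in json.load(f).items()}
    except FileNotFoundError:
        return {}   # no profile yet: fall back to building context sets from scratch
```

A run that starts from a loaded profile skips re-monitoring every region/context pair already seen in training, which is why convergence is faster.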
Using Artemis (with an Invariant Profile)
Workflow: source → instrumentation → build (baseline and Artemis versions) → training runs with Artemis produce the invariant profile → Artemis production run → bug report.
Convergence to Asymptotic Overhead – e.g., bzip2 from SPECint
Asymptotic overhead reduced from ~280% to under 7.5% (~7.3%).
Experiments with Real Monitoring Schemes
We measure how well a monitoring scheme guided by Artemis approximates the capabilities of the original scheme:
- Artemis with hardware-based monitoring (AccMon): detected 3/3 bugs with a 2.67x overhead improvement, in very short-running programs.
- Artemis with value-invariant detection and checking (C-DIDUCE, source-level instrumentation): covered 75% of violations with a 4.6x improvement, in short-running programs.
Full results and details are in the paper.
Conclusions
- A general framework that eliminates redundant runtime monitoring through context checking.
- Improves overhead, with low asymptotic overhead in long-running programs and small precision loss.
- Has enabled practical runtime monitoring of long-running programs.