Probabilistic Calling Context Michael D. Bond Kathryn S. McKinley University of Texas at Austin.

Slides:



Advertisements
Similar presentations
Department of Computer Sciences Dynamic Shape Analysis via Degree Metrics Maria Jump & Kathryn S. McKinley Department of Computer Sciences The University.
Advertisements

Uncovering Performance Problems in Java Applications with Reference Propagation Profiling PRESTO: Program Analyses and Software Tools Research Group, Ohio.
Michael Bond Milind Kulkarni Man Cao Minjia Zhang Meisam Fathi Salmi Swarnendu Biswas Aritra Sengupta Jipeng Huang Ohio State Purdue.
Efficient, Context-Sensitive Dynamic Analysis via Calling Context Uptrees Jipeng Huang, Michael D. Bond Ohio State University.
Goldilocks: Efficiently Computing the Happens-Before Relation Using Locksets Tayfun Elmas 1, Shaz Qadeer 2, Serdar Tasiran 1 1 Koç University, İstanbul,
A Randomized Dynamic Program Analysis for Detecting Real Deadlocks Koushik Sen CS 265.
Building a Better Backtrace: Techniques for Postmortem Program Analysis Ben Liblit & Alex Aiken.
Building a Better Backtrace: Techniques for Postmortem Program Analysis Ben Liblit & Alex Aiken.
KLIMAX: Profiling Memory Write Patterns to Detect Keystroke-Harvesting Malware Stefano Ortolani 1, Cristiano Giuffrida 1, and Bruno Crispo 2 1 Vrije Universiteit.
Resurrector: A Tunable Object Lifetime Profiling Technique Guoqing Xu University of California, Irvine OOPSLA’13 Conference Talk 1.
Department of Computer Sciences Cork: Dynamic Memory Leak Detection with Garbage Collection Maria Jump Kathryn S. McKinley
CORK: DYNAMIC MEMORY LEAK DETECTION FOR GARBAGE- COLLECTED LANGUAGES A TRADEOFF BETWEEN EFFICIENCY AND ACCURATE, USEFUL RESULTS.
SOS: Saving Time in Dynamic Race Detection with Stationary Analysis Du Li, Witawas Srisa-an, Matthew B. Dwyer.
Heap Shape Scalability Scalable Garbage Collection on Highly Parallel Platforms Kathy Barabash, Erez Petrank Computer Science Department Technion, Israel.
Free-Me: A Static Analysis for Individual Object Reclamation Samuel Z. Guyer Tufts University Kathryn S. McKinley University of Texas at Austin Daniel.
The Use of Traces for Inlining in Java Programs Borys J. Bradel Tarek S. Abdelrahman Edward S. Rogers Sr.Department of Electrical and Computer Engineering.
Automatic Extraction of Object-Oriented Component Interfaces John Whaley Michael C. Martin Monica S. Lam Computer Systems Laboratory Stanford University.
Automatic Detection of Previously-Unseen Application States for Deployment Environment Testing and Analysis Chris Murphy, Moses Vaughan, Waseem Ilahi,
Comparison of JVM Phases on Data Cache Performance Shiwen Hu and Lizy K. John Laboratory for Computer Architecture The University of Texas at Austin.
Quarantine: A Framework to Mitigate Memory Errors in JNI Applications Du Li , Witawas Srisa-an University of Nebraska-Lincoln.
Bell: Bit-Encoding Online Memory Leak Detection Michael D. Bond Kathryn S. McKinley University of Texas at Austin.
Presenter: Chi-Hung Lu 1. Problems Distributed applications are hard to validate Distribution of application state across many distinct execution environments.
Improving the Performance of Object-Oriented Languages with Dynamic Predication of Indirect Jumps José A. Joao *‡ Onur Mutlu ‡* Hyesoon Kim § Rishi Agarwal.
Tolerating Memory Leaks Michael D. Bond Kathryn S. McKinley.
Michael Bond Kathryn McKinley The University of Texas at Austin.
Taking Off The Gloves With Reference Counting Immix
Impact Analysis of Database Schema Changes Andy Maule, Wolfgang Emmerich and David S. Rosenblum London Software Systems Dept. of Computer Science, University.
An Efficient Runtime for Detecting Defects in Deployed Systems Matt ArnoldMartin VechevEran Yahav.
Reverse Engineering State Machines by Interactive Grammar Inference Neil Walkinshaw, Kirill Bogdanov, Mike Holcombe, Sarah Salahuddin.
Michael Bond Varun Srivastava Kathryn McKinley Vitaly Shmatikov University of Texas at Austin.
Understanding Parallelism-Inhibiting Dependences in Sequential Java Programs Atanas (Nasko) Rountev Kevin Van Valkenburgh Dacong Yan P. Sadayappan Ohio.
P ath & E dge P rofiling Michael Bond, UT Austin Kathryn McKinley, UT Austin Continuous Presented by: Yingyi Bu.
1 Fast and Efficient Partial Code Reordering Xianglong Huang (UT Austin, Adverplex) Stephen M. Blackburn (Intel) David Grove (IBM) Kathryn McKinley (UT.
Dynamic Object Sampling for Pretenuring Maria Jump Department of Computer Sciences The University of Texas at Austin Stephen M. Blackburn.
Free-Me: A Static Analysis for Automatic Individual Object Reclamation Samuel Z. Guyer, Kathryn McKinley, Daniel Frampton Presented by: Dimitris Prountzos.
DoubleChecker: Efficient Sound and Precise Atomicity Checking Swarnendu Biswas, Jipeng Huang, Aritra Sengupta, and Michael D. Bond The Ohio State University.
Cloud Testing Haryadi Gunawi Towards thousands of failures and hundreds of specifications.
Breadcrumbs: Efficient Context Sensitivity for Dynamic Bug Detection Analyses Michael D. Bond University of Texas at Austin Graham Z. Baker Tufts / MIT.
Finding Your Cronies: Static Analysis for Dynamic Object Colocation Samuel Z. Guyer Kathryn S. McKinley T H E U N I V E R S I T Y O F T E X A S A T A U.
Aritra Sengupta, Swarnendu Biswas, Minjia Zhang, Michael D. Bond and Milind Kulkarni ASPLOS 2015, ISTANBUL, TURKEY Hybrid Static-Dynamic Analysis for Statically.
Efficient Deterministic Replay of Multithreaded Executions in a Managed Language Virtual Machine Michael Bond Milind Kulkarni Man Cao Meisam Fathi Salmi.
1 Test Selection for Result Inspection via Mining Predicate Rules Wujie Zheng
Targeted Path Profiling : Lower Overhead Path Profiling for Staged Dynamic Optimization Systems Rahul Joshi, UIUC Michael Bond*, UT Austin Craig Zilles,
Highly Scalable Distributed Dataflow Analysis Joseph L. Greathouse Advanced Computer Architecture Laboratory University of Michigan Chelsea LeBlancTodd.
Drinking from Both Glasses: Adaptively Combining Pessimistic and Optimistic Synchronization for Efficient Parallel Runtime Support Man Cao Minjia Zhang.
Practical Path Profiling for Dynamic Optimizers Michael Bond, UT Austin Kathryn McKinley, UT Austin.
Michael Bond Katherine Coons Kathryn McKinley University of Texas at Austin.
1 Garbage Collection Advantage: Improving Program Locality Xianglong Huang (UT) Stephen M Blackburn (ANU), Kathryn S McKinley (UT) J Eliot B Moss (UMass),
Targeted Path Profiling : Lower Overhead Path Profiling for Staged Dynamic Optimization Systems Rahul Joshi, UIUC Michael Bond*, UT Austin Craig Zilles,
CoCo: Sound and Adaptive Replacement of Java Collections Guoqing (Harry) Xu Department of Computer Science University of California, Irvine.
BarrierWatch: Characterizing Multithreaded Workloads across and within Program-Defined Epochs Socrates Demetriades and Sangyeun Cho Computer Frontiers.
Object-Relative Addressing: Compressed Pointers in 64-bit Java Virtual Machines Kris Venstermans, Lieven Eeckhout, Koen De Bosschere Department of Electronics.
Aritra Sengupta, Man Cao, Michael D. Bond and Milind Kulkarni PPPJ 2015, Melbourne, Florida, USA Toward Efficient Strong Memory Model Support for the Java.
Tracking Bad Apples: Reporting the Origin of Null & Undefined Value Errors Michael D. Bond UT Austin Nicholas Nethercote National ICT Australia Stephen.
GC Assertions: Using the Garbage Collector To Check Heap Properties Samuel Z. Guyer Tufts University Edward Aftandilian Tufts University.
Dynamic Bug Detection & Tolerance Kathryn S McKinley The University of Texas at Austin.
Prescient Memory: Exposing Weak Memory Model Behavior by Looking into the Future MAN CAO JAKE ROEMER ARITRA SENGUPTA MICHAEL D. BOND 1.
Optimistic Hybrid Analysis
Cork: Dynamic Memory Leak Detection with Garbage Collection
No Bit Left Behind: The Limits of Heap Data Compression
Aritra Sengupta Man Cao Michael D. Bond and Milind Kulkarni
Ravi Mangal Mayur Naik Hongseok Yang
Efficient, Context-Sensitive Detection of Real-World Semantic Attacks
Jipeng Huang, Michael D. Bond Ohio State University
Inlining and Devirtualization Hal Perkins Autumn 2011
Inlining and Devirtualization Hal Perkins Autumn 2009
Correcting the Dynamic Call Graph Using Control Flow Constraints
No Bit Left Behind: The Limits of Heap Data Compression
Garbage Collection Advantage: Improving Program Locality
Presentation transcript:

Probabilistic Calling Context Michael D. Bond Kathryn S. McKinley University of Texas at Austin

Why Context Sensitivity? at com.mckoi.db.jdbcserver.JDBCInterface.execQuery():213 Static program location not enough

at com.mckoi.db.jdbcserver.JDBCInterface.execQuery():213 at com.mckoi.db.jdbc.MConnection.executeQuery():348 at com.mckoi.db.jdbc.MStatement.executeQuery():110 at com.mckoi.db.jdbc.MStatement.executeQuery():127 at Test.main():48 Static program location not enough Why Context Sensitivity?

at com.mckoi.db.jdbcserver.JDBCInterface.execQuery():213 at com.mckoi.db.jdbc.MConnection.executeQuery():348 at com.mckoi.db.jdbc.MStatement.executeQuery():110 at com.mckoi.db.jdbc.MStatement.executeQuery():127 at Test.main():48 Motivated by  Complex programs  Small methods  Virtual dispatch Static program location not enough Why Context Sensitivity?

at com.mckoi.db.jdbcserver.JDBCInterface.execQuery():213 at com.mckoi.db.jdbc.MConnection.executeQuery():348 at com.mckoi.db.jdbc.MStatement.executeQuery():110 at com.mckoi.db.jdbc.MStatement.executeQuery():127 at Test.main():48 Motivated by  Complex programs  Small methods  Virtual dispatch Static program location not enough C/Fortran method Java/C# method return call return Why Context Sensitivity?

Context Is Nontrivial API calls ProgramCall sitesDistinct contexts antlr4,184128,627 bloat3,306600,947 chart2,335202,603 eclipse9,611226,020 fop2,22537,710 hsqldb94716,050 jython1,830628,048 luindex654102,556 lusearch pmd1,890847,108 xalan1,53017,905

Example: Residual Testing class EditorWindow { close() {... } Does behavior occur at production time that did not occur at testing time? class SimpleWindow { close() {... }

Example: Residual Testing class EditorWindow { close() {... } Does behavior occur at production time that did not occur at testing time? class SimpleWindow { close() {... } inputHandler() {... case CLICK_EXIT: w.checkUnsaved(); w.close();... } autoUpdate() {... for all windows w w.close();... }

class EditorWindow { close() {... } class SimpleWindow { close() {... } Example: Residual Testing Does behavior occur at production time that did not occur at testing time? inputHandler() {... case CLICK_EXIT: w.checkUnsaved(); w.close();... } autoUpdate() {... for all windows w w.close();... } Bug!

class EditorWindow { close() {... } class SimpleWindow { close() {... } Example: Residual Testing Does behavior occur at production time that did not occur at testing time? inputHandler() {... case CLICK_EXIT: w.checkUnsaved(); w.close();... } autoUpdate() {... for all windows w w.close();... } Bug! New behavior indicates bugs Context sensitivity helps find new behavior

Two-Phase Dynamic Analyses TrainingProduction Behavior observed New or anomalous behavior detected

Two-Phase Dynamic Analyses Residual testing Anomaly-based intrusion detection Anomaly-based bug detection TrainingProduction Behavior observed New or anomalous behavior detected What behavior occurs at production time that did not occur at testing time? [Vaswani et al. ’07] What new behavior occurs during a buggy program run? [Hangal & Lam ’02] Does a program exhibit anomalous behavior? [Inoue ’05]

Probabilistic Calling Context TrainingProduction Behavior observed New or anomalous behavior detected Adds context sensitivity to dynamic analyses Maintains value representing context  Unique with high probability  New value  new context  walk stack High accuracy: <0.1% false negatives Low overhead: 3% overhead, 0-8% for clients

Outline Introduction Previous approaches Maintaining the PCC value Evaluation  Accuracy  Performance

Previous Approaches Tracking context [Ammons et al. ’97] [Spivey ‘04]  Maintain CCT position at each call/return Walking the stack [Nethercote & Seward ‘07] Path profiling [Ball & Larus ’96] [Melski & Reps ’99]  Call graphs large  path explosion  Virtual dispatch complicates instrumentation

Previous Approaches Tracking context [Ammons et al. ’97] [Spivey ‘04]  Maintain CCT position at each call/return Walking the stack [Nethercote & Seward ‘07] Path profiling [Ball & Larus ’96] [Melski & Reps ’99]  Call graphs large  path explosion  Virtual dispatch complicates instrumentation Sampling [Zhuang et al. ’06]  Sacrifices coverage for low overhead

Outline Introduction Previous approaches Maintaining the PCC value Evaluation  Accuracy  Performance

PCC Function f ( V, cs )  V is PCC value  cs is call site ID

PCC Function f ( V, cs ) V ← f ( V, cs 1 ) V ← V saved V ← f ( V, cs 2 ) cs 2 cs 1  V is PCC value  cs is call site ID

PCC Function f ( V, cs ) ≡ 3V + cs (mod 2 32 )  V is PCC value  cs is call site ID

PCC Function f ( V, cs ) ≡ 3V + cs (mod 2 32 ) Motivated by MPI datatype hashing [Langou et al. ’05] [Gropp ’00] Cheap to compute Desirable properties:  Non-commutative  Composition efficient to compute

Differentiating Similar Contexts C A B V ← 3V + cs 1 V ← 3V + cs 2 C A B V ← 3V + cs 1 V ← 3V + cs 2 …  A()  B()  … …  B()  A()  …

Differentiating Similar Contexts Non-commutative f ( f (V, cs 1 ), cs 2 ) ≠ f ( f (V, cs 2 ), cs 1 ) C A B V ← 3V + cs 1 V ← 3V + cs 2 C A B V ← 3V + cs 1 V ← 3V + cs 2

Efficiency at Inlined Calls C A B V ← 3V + cs 1 V ← 3V + cs 2

Efficiency at Inlined Calls C A B V ← 3 ( 3V + cs 1 ) + cs 2 C A B V ← 3V + cs 1 V ← 3V + cs 2

Efficiency at Inlined Calls C A B V ← 9V + 3cs 1 + cs 2 C A B V ← 3V + cs 1 V ← 3V + cs 2

Efficiency at Inlined Calls Composition efficient to compute C A B V ← 9V + 3cs 1 + cs 2 C A B V ← 3V + cs 1 V ← 3V + cs 2

Outline Introduction Previous approaches Maintaining the PCC value Evaluation  Methodology  Evaluating potential clients  Accuracy  Performance

Methodology Implementation in Jikes RVM  Available on Jikes RVM Research Archive Deterministic calling context profiling  Maintains CCT node at each call & return Benchmarks: DaCapo, SPEC JBB2000, SPEC JVM98 Platform: 3.6 GHz Pentium 4 w/Linux

TrainingProduction Behavior observed New or anomalous behavior detected Record values New value  new context  walk stack How Clients Use PCC

TrainingProduction Behavior observed New or anomalous behavior detected Evaluating Potential Clients Record values Check values (no new values) Global hash table

TrainingProduction Behavior observed New or anomalous behavior detected Evaluating Potential Clients Record values Check values (no new values) Global hash table Memory overhead: proportional to contexts

Evaluating Potential Clients Residual testing Check PCC value at Java API calls (calls to java.* ) Check PCC value at system calls (Network, I/O, OS) Anomaly-based intrusion detection Upper bound Check PCC value at all calls

Ideal Accuracy PCC maps context to value  New PCC value  new context  Familiar PCC value  probably familiar context

Ideal Accuracy Expected conflicts (false negatives) Distinct contexts32-bit values64-bit values 100,0001 (0.0%)0 (0.0%) 1,000, (0.0%)0 (0.0%) 10,000,00011,632 (0.1%)0 (0.0%) 100,000,0001,155,170 (1.2%)0 (0.0%) 1,000,000,000107,882,641 (10.8%)0 (0.0%) 10,000,000,0006,123,623,065 (61.2%)3 (0.0%) PCC maps context to value  New PCC value  new context  Familiar PCC value  probably familiar context

Ideal Accuracy Expected conflicts (false negatives) Distinct contexts32-bit values64-bit values 100,0001 (0.0%)0 (0.0%) 1,000, (0.0%)0 (0.0%) 10,000,00011,632 (0.1%)0 (0.0%) 100,000,0001,155,170 (1.2%)0 (0.0%) 1,000,000,000107,882,641 (10.8%)0 (0.0%) 10,000,000,0006,123,623,065 (61.2%)3 (0.0%) PCC maps context to value  New PCC value  new context  Familiar PCC value  probably familiar context API calls

Ideal Accuracy Expected conflicts (false negatives) Distinct contexts32-bit values64-bit values 100,0001 (0.0%)0 (0.0%) 1,000, (0.0%)0 (0.0%) 10,000,00011,632 (0.1%)0 (0.0%) 100,000,0001,155,170 (1.2%)0 (0.0%) 1,000,000,000107,882,641 (10.8%)0 (0.0%) 10,000,000,0006,123,623,065 (61.2%)3 (0.0%) PCC maps context to value  New PCC value  new context  Familiar PCC value  probably familiar context All calls

Ideal Accuracy Expected conflicts (false negatives) Distinct contexts32-bit values64-bit values 100,0001 (0.0%)0 (0.0%) 1,000, (0.0%)0 (0.0%) 10,000,00011,632 (0.1%)0 (0.0%) 100,000,0001,155,170 (1.2%)0 (0.0%) 1,000,000,000107,882,641 (10.8%)0 (0.0%) 10,000,000,0006,123,623,065 (61.2%)3 (0.0%) PCC maps context to value  New PCC value  new context  Familiar PCC value  probably familiar context Near-perfect accuracy

PCC’s Accuracy System callsJava API calls ProgramDynamicDistinctConf.DynamicDistinctConf. antlr211,4901,567024,422,013128,6273 bloat121001,159,281,573600,94740 chart ,891,525202,6034 eclipse14, ,507,343226,0205 fop181709,918,27537,7100 hsqldb12 081,161,54116,0500 jython5,9294, ,845,772628,04848 luindex2, ,733,214102,5560 lusearch ,511, pmd1, ,017,118847,10879 xalan137, ,105,838,67017,9050

PCC’s Accuracy System callsJava API calls ProgramDynamicDistinctConf.DynamicDistinctConf. antlr211,4901,567024,422,013128,6273 bloat121001,159,281,573600,94740 chart ,891,525202,6034 eclipse14, ,507,343226,0205 fop181709,918,27537,7100 hsqldb12 081,161,54116,0500 jython5,9294, ,845,772628,04848 luindex2, ,733,214102,5560 lusearch ,511, pmd1, ,017,118847,10879 xalan137, ,105,838,67017,9050

PCC’s Accuracy All calls ProgramDynamicDistinctConf. antlr490,363,2111,006, bloat6,276,446,0591,980, chart908,459,469845,43291 eclipse1,266,810,5044,815,9012,652 fop44,200,446174,9552 hsqldb877,680,667110,7951 jython5,326,949,1583,859,5451,738 luindex740,053,104374,20112 lusearch1,439,034,3366,0390 pmd2,726,876,9578,043,0967,653 xalan10,083,858,546163,2056

PCC’s Execution Time Overhead 3%

PCC’s Execution Time Overhead 3%

Summary PCC maintains calling context value  New value indicates new behavior Low overhead  Maintaining PCC value adds 3%  Checking PCC value 0-8%  Memory overhead proportional to contexts High accuracy  Less than 0.1% false negative rate PCC adds context sensitivity to clients that detect anomalous behavior

Summary PCC maintains calling context value  New value indicates new behavior Low overhead  Maintaining PCC value adds 3%  Checking PCC value 0-8%  Memory overhead proportional to contexts High accuracy  Less than 0.1% false negative rate PCC adds context sensitivity to clients that detect anomalous behavior Thank you!

Extra slides

Do paths capture enough behavior? C/Fortran method Java/C# method return call return Context Sensitivity Mostly Unused