Cooperative Crug Isolation. Aditya Thakur, Rathijit Sen, Ben Liblit, Shan Lu. University of Wisconsin–Madison. Workshop on Dynamic Analysis 2009.

Presentation transcript:

Crug = concurrency bug

Race!
    Thread 1: read(x)
    Thread 2: write(x)

Atomicity violation!
    Thread 1: read(x); write(x)
    Thread 2: write(x)
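The atomicity violation can be made concrete by replaying fixed interleavings. The sketch below is only an illustration: the function `run`, the step names, and the values written (`tmp + 1`, `10`) are hypothetical and do not come from the slides.

```python
def run(schedule):
    """Replay one interleaving of Thread 1 (read x, then write x) and Thread 2 (write x)."""
    x = 0          # shared variable
    tmp = None     # Thread 1's private copy of x
    for step in schedule:
        if step == "T1.read":
            tmp = x
        elif step == "T1.write":
            x = tmp + 1      # hypothetical update based on the earlier read
        elif step == "T2.write":
            x = 10           # hypothetical value written by Thread 2
    return x


serial = run(["T1.read", "T1.write", "T2.write"])
interleaved = run(["T1.read", "T2.write", "T1.write"])
print(serial, interleaved)  # 10 1
```

In the interleaved schedule Thread 1 writes a value computed from a stale read, so Thread 2's write is lost. Which schedule occurs depends on timing, which is why such bugs are hard to reproduce.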

Non-determinism!
The developer and the user both run: threaded.exe file.in
More cores, more threads, more crugs.

Simplified crug from PBZIP2 (mut is a global variable)

    Thread 1            Thread 2
    lock(mut);
                        mut = NULL;
    unlock(mut);

Goal: identify the root cause of the crug

    Thread 1            Thread 2
    lock(mut);
                        mut = NULL;
    unlock(mut);

Current techniques
- Not scalable, high overhead
- Report benign crugs
- Target specific types of crugs and synchronization

Cooperative Crug Isolation
- Scalable, low overhead
- Does not report benign crugs
- Multiple types of crugs and synchronization

Cooperative Bug Isolation (diagram)
Program Source -> Compiler (Sampler, Predicates) -> Shipping Application -> Counts & success/failure -> Statistical Debugging -> Top bugs with likely causes

Cooperative Bug Isolation
CBI predicates are inadequate for crug isolation. The values of the predicates are the same for successful and failing runs.

    Thread 1: lock(mut); unlock(mut);
    Thread 2: mut = NULL;
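A predicate whose value is the same in successful and failing runs carries no signal for a statistical ranking. The sketch below illustrates this with the Increase score from Liblit et al.'s Scalable Statistical Bug Isolation work. It is an assumption that this score is the relevant one here, since the slides do not name the scoring function, and all run counts are hypothetical.

```python
def increase(f_true, s_true, f_obs, s_obs):
    """Increase(P) = Failure(P) - Context(P).

    f_true / s_true: failing / successful runs in which P was observed to be true.
    f_obs / s_obs:   failing / successful runs in which P's site was observed at all.
    """
    if f_true + s_true == 0 or f_obs + s_obs == 0:
        return 0.0  # assumption: an unobserved predicate gets a neutral score
    failure = f_true / (f_true + s_true)
    context = f_obs / (f_obs + s_obs)
    return failure - context


# Hypothetical counts: 50 failing and 50 successful runs, site reached in all of them.
same_in_both = increase(f_true=50, s_true=50, f_obs=50, s_obs=50)
only_in_failures = increase(f_true=40, s_true=0, f_obs=50, s_obs=50)
print(same_in_both, only_in_failures)  # 0.0 0.5
```

A predicate that is true equally often in both kinds of runs scores zero and is never reported, while one that is true only in failing runs scores well above zero.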

Cooperative Bug Isolation
CBI sampling is inadequate for crug isolation. Sampling is thread-local and independent.

    Thread 1: lock(mut); unlock(mut);
    Thread 2: mut = NULL;

Cooperative Bug Isolation
CBI was unable to diagnose crugs in any of the benchmarks used. No bug predictors reported!

CCI extends the CBI framework to target crugs
- New predicate capturing interleaving events
- New cross-thread sampling scheme

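Predicate Design

    Thread 1            Thread 2
    lock(mut);
                        mut = NULL;
 S: unlock(mut);

- remote S is true: the location accessed at S was last accessed by a different thread.
- local S is true: the location accessed at S was last accessed by the same thread.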

Predicate Instrumentation
At runtime, maintain a hashtable that maps each address to the id of the thread that last accessed it.

    Address     Thread Id
    0xb1ab1a    1
    0xf00f00    2
    0xb1af00    1

Predicate Instrumentation

    lock(glock);
    differs = test_and_insert(&x, curTid);
    record(S, differs);
    access(x);
    unlock(glock);

- curTid is the thread id of the currently executing thread.
- test_and_insert checks whether curTid was the thread which previously accessed x, sets differs to true if it was not, and updates the hashtable.
- record increments the counter for the predicate at S.
- lock(glock) and unlock(glock) make the block execute atomically.

Handles accesses through pointers. No need for static pointer analysis.

Sampling Mechanism

    if (gsample == 0)                             Is sampling on? Sampling not on
        gsample = curTid;                         Turn on sampling
        insert(&x, curTid);                       Update hashtable
        access(x);
    else                                          Sampling already on
        lock(glock);
        differs = test_and_insert(&x, curTid);
        record(S, differs);
        access(x);
        unlock(glock);
        if (gsample == curTid)                    Did current thread initiate sampling?
            gsample = 0;                          Stop sampling, clear hashtable
            clear();

Sampling Mechanism (example)

    Thread 1            Thread 2
    lock(mut);
                        mut = NULL;
 S: unlock(mut);

1. Before Thread 1 executes lock(mut): the hashtable is empty and gsample = 0.
2. Thread 1 executes lock(mut): sampling is turned on. Hashtable: &mut -> 1. gsample = 1.
3. Thread 2 executes mut = NULL: Hashtable: &mut -> 2. gsample = 1.
4. Thread 1 executes S: unlock(mut): the hashtable still holds &mut -> 2, so record that remote S is true. gsample = 1.
5. Stop sampling: the hashtable is cleared and gsample = 0.
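The walkthrough can be replayed with a small simulation. This is a sketch under stated assumptions, not the authors' code: the class `CrossThreadSampler` and its `start` parameter are hypothetical, sampling is assumed to stop at the initiating thread's next sampled access (as in the walkthrough), thread ids must be nonzero because 0 means sampling is off, and the global lock is omitted because the replay is single-threaded.

```python
class CrossThreadSampler:
    def __init__(self):
        self.gsample = 0     # 0: sampling off; otherwise id of the thread that started it
        self.table = {}      # address -> id of the thread that last accessed it
        self.records = []    # (site, "remote" | "local") observations

    def access(self, site, addr, cur_tid, start=True):
        if self.gsample == 0:
            # Sampling not on. `start` stands in for the decision to begin sampling,
            # which the slides do not specify.
            if start:
                self.gsample = cur_tid
                self.table[addr] = cur_tid
            return
        # Sampling already on: test-and-insert, then record the predicate.
        prev = self.table.get(addr)
        self.table[addr] = cur_tid
        differs = prev is not None and prev != cur_tid
        self.records.append((site, "remote" if differs else "local"))
        if self.gsample == cur_tid:
            # The thread that initiated sampling stops it and clears the table.
            self.gsample = 0
            self.table.clear()


s = CrossThreadSampler()
s.access("lock(mut)", "&mut", cur_tid=1)        # turns sampling on
s.access("mut = NULL", "&mut", cur_tid=2)       # Thread 2 is sampled too
s.access("S: unlock(mut)", "&mut", cur_tid=1)   # remote S recorded, sampling stops
print(s.records[-1], s.gsample, s.table)
```

The point of the shared `gsample` flag is that every thread is sampled during the same window, so an access by Thread 2 is visible when Thread 1 later evaluates its predicate.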

Experimental Evaluation
- Benchmarks used
    - Apache HTTP server, PBZIP2
    - SPLASH-2: FFT, LU
- Machine used
    - dual-core Intel P4
- Questions answered
    - Runtime overhead
    - Accuracy of predictors

Runtime Overhead

    Benchmark    No sampling    Sampling
    Apache       25%            2%
    PBZIP2       200%           7%
    FFT          650%           25%
    LU           1,300%         800%

Overhead is measured relative to uninstrumented code.
- Low overheads for both real-world applications
- Large difference between no sampling and sampling

Predictor Accuracy (R marks a remote predicate)

Apache
    Predictor                                  Function
    R: buf->outcnt += len                      ap_buffered_log_writer()

PBZIP2
    Predictor                                  Function
    R: pthread_mutex_unlock(fifo->mut);        consumer_decompress()

Predictor Accuracy (L marks a local predicate)

FFT
    Predictor                                      Function
    R: Global->finishtime=finish                   SlaveStart()
    R: Global->initdonetime=initdone               SlaveStart()
    R: printf("..",Global->transtime[0]…)          main()
    L: malloc(2*(rootN-1)*sizeof(double));         SlaveStart()

LU
    Predictor                                      Function
    R: Global->rf=rf                               OneSolve()
    L: (Global->start).gsense=-lsense;             OneSolve()

Conclusion
- CCI is a low-overhead, scalable approach for root cause analysis of crugs
- Effective on two widely-deployed applications
- Simple predicates are effective because of the use of statistical models

Next time:
- What other events are useful for crug isolation?
- Scope for static analysis to help?
- Other cross-thread sampling mechanisms (e.g. bursty sampling)?
- Crug isolation to crug tolerance?

Thank you!