Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.


Motivation
- Software bugs cost the U.S. economy about $59.5 billion each year [NIST 02].
- Embedded systems perform mission-critical / safety-critical tasks, where a failure can lead to loss of mission or life:
  - (Ariane 5) arithmetic overflow led to shutdown of the guidance computer.
  - (Mars Climate Orbiter) missed unit conversion led to faulty navigation data.
  - (Mariner I) a missing superscripted bar in the specification for the guidance program led to its destruction 293 seconds after launch.
  - (Mars Pathfinder) priority inversion error causing system reset.
  - (Boeing ) loss of engine & flight displays while in flight.
  - (Toyota hybrid Prius) VSC, gasoline-powered engine shut off.
  - (Therac-25) wrong dosage during radiation therapy.

Overview
[Overview diagram: Fault Location (offline, via dynamic slicing) and Fault Avoidance (online, for environment faults), applied to program executions; scalability of tracing + logging for long-running, multi-threaded programs.]

Fault Location
Goal: assist the programmer in debugging by automatically narrowing the fault to a small section of the code.
- Dynamic information: data dependences, control dependences, values.
- Execution runs: one failed execution and its perturbations.

Dynamic Information
[Diagram: a program and its execution yield a dynamic dependence graph, with data and control dependence edges.]

Approach
Detect execution of a statement s such that
- faulty code affects the value computed by s; or
- faulty code is affected by the value computed by s through a chain of dependences.
Estimate the set of potentially faulty statements from s:
- Affects: statements from which s is reachable in the dynamic dependence graph (backward slice).
- Affected-by: statements that are reachable from s in the dynamic dependence graph (forward slice).
- Intersect slices to obtain a smaller fault candidate set.
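The affects/affected-by computation above amounts to graph reachability. A minimal sketch over a hand-made dynamic dependence graph (the statement names s1..s5 and the edges are illustrative assumptions, not from the talk):

```python
from collections import deque

def reachable(graph, start):
    """All nodes reachable from start in graph = {node: set_of_successors}."""
    seen, work = {start}, deque([start])
    while work:
        for m in graph.get(work.popleft(), ()):
            if m not in seen:
                seen.add(m)
                work.append(m)
    return seen

def invert(graph):
    """Reverse every edge so reachability runs in the other direction."""
    inv = {}
    for n, succs in graph.items():
        for m in succs:
            inv.setdefault(m, set()).add(n)
    return inv

# Edges point from a statement to the statements it dynamically depends on.
deps = {"s5": {"s3", "s4"}, "s4": {"s2"}, "s3": {"s1"}, "s2": {"s1"}}

backward = reachable(deps, "s5")          # statements affecting s5 (backward slice)
forward = reachable(invert(deps), "s2")   # statements affected by s2 (forward slice)
candidates = backward & forward           # smaller fault candidate set
```

Here the backward slice of s5 covers all five statements, while intersecting it with the forward slice of s2 narrows the candidate set to {'s2', 's4', 's5'}.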

Backward & Forward Slices
[Diagram: backward slice of the erroneous output [Korel & Laski, 1988]; forward slice of the failure-inducing input [ASE-05].]

Backward & Forward Slices
[Diagram: intersection of the backward slice of the erroneous output and the forward slice of the failure-inducing input.]
- For memory bugs, the number of statements in the intersection is very small (< 5). [ASE-05]

Bidirectional Slices
Critical predicate: an execution instance of a predicate such that changing its outcome "repairs" the program state.
Combined slice = backward slice of the critical predicate + forward slice of the critical predicate (bidirectional slice). [ICSE-06]
- Found critical predicates in 12 out of 15 bugs.
- Search for the critical predicate: brute force, 32 to 155K predicates; after filtering and ordering, 1 to 7K predicates.

Pruning Slices
- Confidence in a value v, C(v) in [0,1]:
  0 - all alternate values of v produce the same output;
  1 - any change in v will change the output.
- How? Value profiles. [PLDI-06]
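One way to read the C(v) scale is via value profiles: of the values v could plausibly take, how many still produce the observed output? A toy sketch of such an estimate (the log-ratio formula and the counts below are illustrative assumptions, not the exact PLDI-06 definition):

```python
import math

def confidence(num_alternatives, num_output_preserving):
    """Toy confidence that value v is correct, in [0,1]:
    0 when every alternative value of v leaves the output unchanged
    (the output tells us nothing about v), 1 when only v itself yields
    the observed correct output (any change in v would change it)."""
    if num_alternatives <= 1:
        return 1.0
    return 1.0 - math.log(num_output_preserving) / math.log(num_alternatives)

# Any change alters the output -> full confidence; prune the statement.
full = confidence(8, 1)
# Every alternative preserves the output -> zero confidence; keep it.
none = confidence(8, 8)
```

Statements computing only high-confidence values are deemed correct and removed from the slice; the rest remain as fault candidates.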

Test Programs
Real reported bugs:
- Nine logical bugs (incorrect output) in Unix utilities: grep 2.5, grep 2.5.1, flex, make.
- Six memory bugs (program crashes) in Unix utilities: gzip, ncompress, polymorph, tar, bc, tidy.
Injected bugs:
- Siemens suite (numerous versions): schedule, schedule2, replace, print_tokens, ...
- Unix utilities: gzip, flex

Dynamic Slice Sizes Buggy RunsBSFSBiS flex (a) flex (b) NA flex (c) NA grep 2.5NA73188 grep 2.5.1(a)NA32111 grep 2.5.1(b)NA599NA grep 2.5.1(c)NA12453 make 3.80(a) make 3.80(b) gzip ncompress polymorph tar bc tidy

Combined Slices Buggy Runs BS BS^FS^BiS (%BS) flex (a)69527 (3.9%) flex (b) (37.5%) flex (c)505 (10%) grep 2.5NA86 (7.4%*EXEC) grep 2.5.1(a)NA25 (4.9%*EXEC) grep 2.5.1(b)NA599 (53.3%*EXEC) grep 2.5.1(c)NA12 (0.9%*EXEC) make 3.80(a) (81.4%) make 3.80(b) (75.3%) gzip (8.8%) ncompress (14.3%) polymorph (14.3%) tar (42.9%) bc (50%) tidy (29.1%)

Evaluation of Pruning
[Table: program, description, LOC, versions, and tests for print_tokens, print_tokens2, replace, schedule, schedule2 (Siemens suite) and gzip, flex (Unix utilities).]
- A single error is injected in each version.
- Not all versions are included:
  - no output, or the very first output is wrong;
  - root cause is not contained in the BS (code-missing error).

ProgramBSPruned SlicePruned Slice / BS print_tokens % print_tokens % replace % schedule % schedule % gzip % flex % Evaluation of Pruning

Effectiveness
- Backward slice [AADEBUG-05] (from the erroneous output): ~31% of executed statements.
- Combined slice [ASE-05, ICSE-06] (adds failure-inducing input and critical predicate): ~36% of the backward slice, ~11% of executed statements.
- Pruned slice [PLDI-06] (confidence analysis): ~41% of the backward slice, ~13% of executed statements.

Effectiveness
- Slicing is effective in locating faults: no more than 10 static statements had to be inspected.
Program-bug | Inspected Stmts.
mutt - heap overflow | 8
pine - stack overflow | 3
pine - heap overflow | 10
mc - stack overflow | 2
squid - heap overflow | 5
bc - heap overflow | 3

Execution Omission Errors
[Diagram: an implicit dependence - because the assignment "X=" guarded by "A<0" did not execute, the use "=X" silently depends on that predicate.]
- Inspect the pruned slice.
- Dynamically detect an implicit dependence.
- Incrementally expand the pruned slice. [PLDI-07]

Scalability of Tracing
Dynamic information needed:
- Dynamic dependences - for all slicing.
- Values - for confidence analysis (pruning slices).
Whole Execution Trace (WET):
- annotates the static program representation;
- trace size ~15 bytes / instruction.

Trace Sizes & Collection Overheads
- Trace sizes are very large for even 10s of execution.
Program | Running Time | Dep. Trace | Collection Time
mysql | 13 s | 21 GB | 2886 s
prozilla | 8 s | 6 GB | 2640 s
proxyC | 10 s | 456 MB | 880 s
mc | 10 s | 55 GB | 418 s
mutt | 20 s | 388 GB | 3238 s
pine | 14 s | 156 GB | 2088 s
squid | 15 s | 88 GB | 1132 s

Compacting Whole Execution Traces
- Explicitly remember the dynamic control flow trace.
- Infer as many dynamic dependences as possible from control flow (94%); remember the remaining dependences explicitly (~6%).
  - Specialized graph representation to enable inference.
- Explicitly remember the value trace.
- Use a context-based method to compress the dynamic control flow, value, and address traces.
- Bidirectional traversal with equal ease. [MICRO-04, TACO-05]

Dependence Graph Representation
Program:
1: z=0
2: a=0
3: b=2
4: p=&b
5: for i = 1 to N do
6:   if (i%2 == 0) then
7:     p=&a
     endif
8:   a=a+1
9:   z=2*(*p)
   endfor
10: print(z)
Execution trace (input N=2): 1^1 2^1 3^1 4^1 5^1 6^1 8^1 9^1 5^2 6^2 7^1 8^2 9^2 10^1

Dependence Graph Representation
[Diagram: the program's control flow graph - blocks {1,2,3,4}, {5,6}, {7}, {8,9}, {10} with T/F branch edges; the execution for input N=2 annotates the graph with the instances that exercised each statement and dependence.]

Transform: Traces of Blocks

Infer: Local Dependence Labels
[Diagram: a local (intra-block) dependence X= -> Y=X executed at timestamps 10, 20, 30 carries the labels (10,10), (20,20), (30,30); such matching labels can be inferred from the block's execution timestamps instead of being stored.]

Transform: Local Dep. Labels
[Diagram: when an intervening store *P= never interferes, the dependence X= -> Y=X keeps matching timestamp labels, e.g. (10,10) and (20,20), and the labels can be dropped after transformation.]

Transform: Local Dep. Labels
[Diagram: when the store *P= does interfere in some instances, only the non-inferable labels, e.g. (10,11) and (20,21), are remembered explicitly.]

Group: Non-Local Dep. Edges
[Diagram: non-local dependence edges, e.g. (10,21) and (20,11), grouped at the target block.]

Compacted WET Sizes
[Table: statements executed (millions) and WET sizes (MB) before/after compaction for nine SPEC programs - 300.twolf, 256.bzip2, 255.vortex, 197.parser, 181.mcf, 164.gzip, 130.li, 126.gcc, 099.go - with before sizes of roughly 5,238-11,921 MB per run.]
- Compacted size ~4 bits / instruction.

[PLDI-04] vs. [ICSE-03] Slicing Times

Dep. Graph Generation Times
- Offline post-processing after collecting address and control flow traces: ~35x of execution time.
- Online techniques [ICSM 2007]:
  - information flow: 9x to 18x slowdown;
  - basic block opt.: 6x to 10x slowdown;
  - trace level opt.: 5.5x to 7.5x slowdown;
  - dual core: ~1.5x slowdown.
- Online filtering techniques:
  - forward slice of all inputs;
  - user-guided bypassing of functions.

Reducing Online Overhead
- Record non-deterministic events online:
  - less than 2x overhead;
  - deterministic replay of executions.
- Trace faulty executions off-line:
  - replay the execution, switch on tracing, collect and inspect traces.
- Trace analysis is still a problem:
  - the traces correspond to huge executions;
  - off-line overhead of trace collection is still significant.

Reducing Trace Sizes
Checkpointing schemes:
- Trace from the most recent checkpoint.
- Checkpoints are of the order of minutes; better, but the trace sizes are still very large.
Exploiting program characteristics [ISSTA-07, FSE-06]:
- Multithreaded and server-like programs (examples: mysql, apache).
- Each request spawns a new thread; do not trace irrelevant threads.

Beyond Tracing
- Checkpoint: capture the memory image.
- Execute and record (log) events.
- Upon a crash, roll back to the checkpoint.
- Reduce the log and replay the execution using the reduced log.
- Turn on tracing during replay.
Applicable to multithreaded programs. [ISSTA-07]

An Example
- A mysql bug: the "load ..." command will crash the server if no database is specified.
- Without typing "use database_name", thd->db is NULL.

Example – Execution and log file open path=/etc/my.cnf … Wait for connection Create Thread 1 Wait for command Create Thread 2 Wait for command Recv “show databases” Handle command Recv “load data …” Handle -- ( server crashes ) Recv “use test; select * from b” Handle command Run mysql server User 1 connects to the server User 2 connects to the server User 1: “show databases” User 2: “use test” “ select * from b” User 1: “load data into table1” Time Blue – T0 Red – T1 Green – T2 Gray - Scheduler

Execution Replay using Reduced log open path=/etc/my.cnf … Wait for connection Create Thread 1 Recv “load data …” Handle -- ( server crashes ) Run mysql server User 1 connects to the server User 2 connects to the server User 1: “show databases” User 2: “show databases” “ select * from b” User 1: “load data into table1” Time

Execution Reduction
- Effects of reduction:
  - irrelevant threads;
  - replay-only vs. replay & trace.
- How? By identifying inter-thread dependences:
  - event dependences - found using the log;
  - file dependences - found using the log;
  - shared-memory dependences - found using replay.
- Space requirement reduced by 4x; time requirement reduced by 2x.
- A naive approach requires the thread id of the last writer of each address.
- Space- and time-efficient detection:
  - memory regions: non-shared vs. shared;
  - locality of references to regions.

Experimental Results

Experimental Results
[Chart: trace sizes (number of dependences) per program-bug, original vs. optimized.]

Experimental Results
[Chart: execution times in seconds per program-bug - logging, original, and optimized.]

Debugging System
[Architecture diagram: the application binary runs on the Valgrind execution engine, which instruments code and emits compressed traces; the Diablo static binary analyzer supplies control dependence information; Jockey provides record/replay with checkpoint + log, yielding a reduced log; the slicing module computes slices over the WET.]

Fault Avoidance
A large number of faults in server programs are caused by the environment (56% of faults in the Apache server).
Types of faults handled:
- Atomicity violation faults: try alternate scheduling decisions.
- Heap buffer overflow faults: pad memory requests.
- Bad user request faults: drop bad requests.
Avoidance strategy:
- Recover the first time, prevent later: record the change that avoided the fault.

Experiments
Program | Type of Bug | Env. Change | # of Trials | Time Taken (secs.)
mysql-1 | Atomicity Violation | Scheduler | 1 | 130
mysql-2 | Atomicity Violation | Scheduler | 1 | 65
mysql-3 | Atomicity Violation | Scheduler | 1 | 65
mysql-4 | Buffer Overflow | Mem. Padding | 1 | 700
pine-1 | Buffer Overflow | Mem. Padding | 1 | 325
pine-2 | Buffer Overflow | Mem. Padding | 1 | 270
mutt-1 | Bad User Req. | Drop Req. | 3 | 205
bc-1 | Bad User Req. | Drop Req. | 3 | 290
bc-2 | Bad User Req. | Drop Req. | 3 | 195

Summary
[Summary diagram: Fault Location (offline, via dynamic slicing) and Fault Avoidance (online, for environment faults); scalability of tracing + logging for long-running, multi-threaded programs.]

Dissertations
- Xiangyu Zhang, Purdue University - Fault Location via Precise Dynamic Slicing; SIGPLAN Outstanding Doctoral Dissertation Award.
- Sriraman Tallam, Google - Fault Location and Avoidance in Long-Running Multithreaded Programs, 2007.

Fault Location via State Alteration CS 206 Fall 2011

Value Replacement: Overview
Aggressive state alteration to locate faulty program statements. [Jeffrey et al., ISSTA 2008]
INPUT: faulty program and test suite (1+ failing runs)
TASK: (1) perform value replacements in failing runs; (2) rank program statements according to the collected information
OUTPUT: ranked list of program statements

Alter State by Replacing Values
[Diagram: a passing execution produces correct output; a failing execution with an error produces incorrect output; replacing values in the failing execution alters its state - is the resulting output correct or incorrect?]

Example of a Value Replacement
PASSING EXECUTION (output: ?):
1: read (x, y);
2: a := x - y;
3: if (x < y)   (F)
4:   write (a);
   else
5:   write (a + 1);

Example of a Value Replacement
FAILING EXECUTION (expected output: 1; actual output: ?):
1: read (x, y);
2: a := x + y;    // ERROR: plus should be minus
3: if (x < y)   (F)
4:   write (a);
   else
5:   write (a + 1);

Example of a Value Replacement
STATE ALTERATION: replace values at statement 2.
1: read (x, y);
2: a := x + y;    // ERROR
3: if (x < y)
4:   write (a);
   else
5:   write (a + 1);
Interesting Value Mapping Pair (IVMP):
- Location: statement 2, instance 1
- Original: {a = 2, x = 1, y = 1}
- Alternate: {a = 1, x = 0, y = 1}
(With the alternate values, the branch at statement 3 becomes T and the expected output 1 is produced.)

Searching for IVMPs in a Failing Run
- Step 1: Compute the value profile - the set of values used at each statement with respect to all available test case executions.
- Step 2: Replace values to search for IVMPs:
  for each statement instance in the failing run
    for each alternate set of values in the value profile
      replace the values to see if an IVMP is found
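The two-step search can be sketched as a pair of nested loops over statement instances and profiled alternate values; `execute` below is a hypothetical replay hook, and the tiny model of the slide's example program is only for illustration:

```python
def find_ivmps(failing_run, value_profile, execute, expected_output):
    """failing_run: list of (stmt, instance, original_values);
    value_profile: {stmt: [alternate value dicts seen across all tests]}.
    An IVMP is recorded when replaying with swapped values fixes the output."""
    ivmps = []
    for stmt, inst, orig in failing_run:
        for alt in value_profile.get(stmt, []):
            if alt != orig and execute(stmt, inst, alt) == expected_output:
                ivmps.append((stmt, inst, orig, alt))
    return ivmps

def execute(stmt, inst, values):
    """Toy replay of the slide's faulty program (a := x + y, + should be -)
    on the failing input x=1, y=1, swapping values at one statement instance."""
    x, y = 1, 1
    a = x + y
    if stmt == 1:                       # replace at 'read (x, y)'
        x, y = values["x"], values["y"]
        a = x + y                       # downstream statements re-execute
    elif stmt == 2:                     # replace at 'a := x + y'
        x, y, a = values["x"], values["y"], values["a"]
    return a if x < y else a + 1        # statements 3-5

failing_run = [(1, 1, {"x": 1, "y": 1}), (2, 1, {"x": 1, "y": 1, "a": 2})]
profile = {1: [{"x": 0, "y": 0}, {"x": -1, "y": 0}],
           2: [{"x": 0, "y": 0, "a": 0}]}
ivmps = find_ivmps(failing_run, profile, execute, expected_output=1)
```

On this model, the replacements {x=0, y=0} at statement 1 and {x=0, y=0, a=0} at statement 2 both repair the output, so two IVMPs are found, matching the slide's example.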

Searching for IVMPs: Example
Program (error at statement 2: + should be -):
1: read (x, y);
2: a := x + y;
3: if (x < y)
4:   write (a);
   else
5:   write (a + 1);
Test cases (x, y) -> actual output / expected output:
(1, 1) -> 3 / 1 (failing); (-1, 0) -> -1 / -1; (0, 0) -> 1 / 1
Value profile (values used at each statement across the three runs):
1: {x=1, y=1}, {x=-1, y=0}, {x=0, y=0}
2: {x=1, y=1, a=2}, {x=-1, y=0, a=-1}, {x=0, y=0, a=0}
3: {x=1, y=1, branch=F}, {x=-1, y=0, branch=T}, {x=0, y=0, branch=F}
4: {a=-1, output=-1}
5: {a=2, output=3}, {a=0, output=1}
Value replacements in the failing run (1, 1):
- stmt 1, inst 1: {x=1, y=1} -> {x=0, y=0} yields output 1 -> IVMP
- stmt 2, inst 1: {x=1, y=1, a=2} -> {x=0, y=0, a=0} yields output 1 -> IVMP
- stmt 3, inst 1: flipping the branch F -> T yields output 2 -> no IVMP
- stmt 5, inst 1: {a=2, output=3} -> {a=0, output=1} yields output 1 -> IVMP
(Other alternates, e.g. {x=-1, y=0}, do not yield the expected output.)
IVMPs identified:
- stmt 1, inst 1: ( {x=1, y=1} -> {x=0, y=0} )
- stmt 2, inst 1: ( {x=1, y=1, a=2} -> {x=0, y=0, a=0} )
- stmt 5, inst 1: ( {a=2, output=3} -> {a=0, output=1} )

IVMPs at Non-Faulty Statements
- Causes of IVMPs at non-faulty statements:
  - statements in the same dependence chain;
  - coincidence.
- Consider multiple failing runs:
  - a statement with IVMPs in more runs is more likely to be faulty;
  - a statement with IVMPs in fewer runs is less likely to be faulty.

Multiple Failing Runs: Example
Program (error at statement 2):
1: read (x, y);
2: a := x + y;
3: if (x < y)
4:   write (a);
   else
5:   write (a + 1);
Test cases (x, y) -> actual / expected output:
[A] (1, 1) -> 3 / 1 (failing); [B] (0, 1) -> 1 / -1 (failing); [C] (-1, 0) -> -1 / -1; [D] (0, 0) -> 1 / 1
Test case [A] IVMPs (stmts with IVMPs: {1, 2, 5}):
- stmt 1, inst 1: ( {x=1, y=1} -> {x=0, y=1} ) and ( {x=1, y=1} -> {x=0, y=0} )
- stmt 2, inst 1: ( {x=1, y=1, a=2} -> {x=0, y=1, a=1} ) and ( {x=1, y=1, a=2} -> {x=0, y=0, a=0} )
- stmt 5, inst 1: ( {a=2, output=3} -> {a=0, output=1} )
Test case [B] IVMPs (stmts with IVMPs: {1, 2, 4}):
- stmt 1, inst 1: ( {x=0, y=1} -> {x=-1, y=0} )
- stmt 2, inst 1: ( {x=0, y=1, a=1} -> {x=-1, y=0, a=-1} )
- stmt 4, inst 1: ( {a=1, output=1} -> {a=-1, output=-1} )
Ranking: {1, 2} (IVMPs in both failing runs) most likely to be faulty; {4, 5} next; {3} least likely to be faulty.

Ranking Statements using IVMPs
- Sort in decreasing order of: the number of failing runs in which the statement is associated with at least one IVMP.
- Break ties using the Tarantula technique [Jones et al., ICSE 2002]:
  suspiciousness(s) = (fraction of failing runs exercising s) / (fraction of failing runs exercising s + fraction of passing runs exercising s)
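A sketch of that two-key sort (the data structure and example counts are made up for illustration; the tie-break formula follows Jones et al.):

```python
def tarantula(fail_frac, pass_frac):
    """Tarantula suspiciousness: failing coverage relative to total coverage."""
    total = fail_frac + pass_frac
    return fail_frac / total if total else 0.0

def rank_statements(stats):
    """stats: {stmt: (failing_runs_with_ivmp, fail_frac, pass_frac)}.
    Primary key: number of failing runs with an IVMP (descending);
    tie-break: Tarantula suspiciousness (descending)."""
    return sorted(stats,
                  key=lambda s: (stats[s][0],
                                 tarantula(stats[s][1], stats[s][2])),
                  reverse=True)

ranked = rank_statements({
    "s2": (2, 1.0, 0.5),   # IVMPs in both failing runs, higher Tarantula score
    "s1": (2, 1.0, 1.0),   # same IVMP count, lower Tarantula score
    "s4": (1, 0.5, 0.5),   # IVMP in only one failing run
})
```

The tuple key makes the IVMP count dominate, so Tarantula only decides among statements with equal counts.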

Techniques Evaluated
- Value Replacement technique:
  - consider all available failing runs (ValRep-All);
  - consider only 2 failing runs (ValRep-2);
  - consider only 1 failing run (ValRep-1).
- Tarantula technique (Tarantula):
  - considers all available test cases;
  - most effective technique known for our benchmarks.
- Only statements exercised by failing runs are ranked.

Metric for Comparison
- Score for each ranked statement list:
  score = 100% x (size of list - rank of the faulty stmt) / (size of list)
- Represents the percentage of statements that need not be examined before the error is located; a higher score is better.
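The score formula above as a one-liner (ranks are 1-based; the sample numbers are illustrative):

```python
def score(rank_of_faulty_stmt, list_size):
    """Percentage of ranked statements that need not be examined
    before the error is located (higher is better)."""
    return 100.0 * (list_size - rank_of_faulty_stmt) / list_size

best = score(1, 100)     # fault at the top: 99% of the list is skipped
worst = score(100, 100)  # fault at the bottom: nothing is skipped
```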

Benchmark Programs
[Table: program | LOC | # faulty versions | avg. suite size (pool size) for tcas, totinfo, sched, sched2, ptok, ptok2, replace.]
- 129 faulty programs (errors) derived from 7 base programs.
- Each faulty program is associated with a branch-coverage-adequate test suite containing at least 5 failing and 5 passing test cases.
- The test suite is used by Value Replacement; the test pool is used by Tarantula.

Effectiveness Results
Value Replacement technique - number (%) of faulty programs:
Score | ValRep-All | ValRep-2 | ValRep-1
>= 99% | 23 (17.8%) | 21 (16.3%) | 18 (14.0%)
>= 90% | 89 (69.0%) | 84 (65.1%) | 75 (58.1%)

Effectiveness Results
Comparison to Tarantula - number (%) of faulty programs:
Score | ValRep-All | ValRep-2 | ValRep-1 | Tarantula
>= 99% | 23 (17.8%) | 21 (16.3%) | 18 (14.0%) | 7 (5.4%)
>= 90% | 89 (69.0%) | 84 (65.1%) | 75 (58.1%) | 48 (37.2%)

Value Replacement: Summary
- Highly effective:
  - precisely locates 39 / 129 errors (30.2%);
  - most effective previously known: 5 / 129 (3.9%).
- Limitations:
  - can require significant computation time to search for IVMPs;
  - assumes multiple failing runs are caused by the same error.

Handling Multiple Errors
- Effectively locate multiple simultaneous errors [Jeffrey et al., ICSM 2009]:
  - iteratively compute a ranked list of statements to find and fix one error at a time.
- Three variations of this technique:
  - MIN: minimal computation; use the same list each time.
  - FULL: full computation; produce a new list each time.
  - PARTIAL: partial computation; revise the list each time.

Multiple-Error Techniques
Single error: faulty program and test suite -> Value Replacement -> ranked list of program statements -> developer finds/fixes the error -> done.
Multiple errors (MIN): faulty program and test suite -> Value Replacement -> ranked list -> developer finds/fixes an error -> if a failing run remains, reuse the same list; else done.

Multiple-Error Techniques
Multiple errors (FULL): after each fix, rerun Value Replacement to produce a new ranked list, while a failing run remains.
Multiple errors (PARTIAL): after each fix, run partial Value Replacement to revise the ranked lists, while a failing run remains.

PARTIAL Technique
- Step 1: Initialize ranked lists and locate the first error.
  - For each statement s, compute a ranked list by considering only the failing runs exercising s.
  - Report the ranked list with the highest suspiciousness value at the front of the list.
- Step 2: Iteratively revise the ranked lists and locate each remaining error.
  - For each remaining failing run that exercises the statement just fixed, recompute IVMPs.
  - Update any affected ranked lists.
  - Report the ranked list with the most different elements at the front of the list, compared to previously-selected lists.

PARTIAL Technique: Example
Program (2 faulty statements):
Failing Run | Execution Trace | Statements with IVMPs
A | (1, 2, 3, 5) | {2, 5}
B | (1, 2, 3, 5) | {1, 2}
C | (1, 2, 4, 5) | {2, 4, 5}
Computed ranked lists (statement with suspiciousness subscript):
- List 1 [based on runs A, B, C]: 2_3, 5_2, 1_1, 4_1, 3_0
- List 2 [based on runs A, B]: 2_2, 1_1, 5_1, 3_0
- List 3 [based on run C]: 2_1, 4_1, 5_1, 1_0
- List 4 [based on runs A, B, C]: 2_3, 5_2, 1_1, 4_1, 3_0
Report list 1, 2, or 5 (assume 1) -> fix faulty statement 2.

PARTIAL Technique: Example (continued)
Program (1 remaining faulty statement):
Failing Run | Execution Trace | Statements with IVMPs
C | (1, 2, 4, 5) | {4}
Computed ranked lists (statement with suspiciousness subscript):
- [based on runs A, B, C] (C updated): 2_2, 1_1, 4_1, 5_1, 3_0
- [based on runs A, B] (no updates): 2_2, 1_1, 5_1, 3_0
- [based on run C] (C updated): 4_1, 1_0, 2_0, 3_0
- [based on runs A, B, C] (C updated): 2_2, 1_1, 4_1, 5_1, 3_0
Report the list headed by statement 4 -> fix faulty statement 4 -> done.

Techniques Compared
- MIN: only compute the ranked list once.
- FULL: fully recompute the ranked list each time.
- PARTIAL: compute IVMPs for a subset of failing runs and revise the ranked lists each time.
- ISOLATED: locate each error in isolation.

Benchmark Programs
Program | # 5-Error Faulty Versions | Avg. Suite Size (# Failing / # Passing Runs)
tcas | 20 | 11 (5 / 6)
totinfo | 20 | 22 (10 / 12)
sched | 20 | 29 (10 / 19)
sched2 | 20 | 30 (9 / 21)
ptok | 2 | 32 (8 / 24)
ptok2 | 11 | 29 (5 / 24)
replace | 20 | 38 (9 / 29)
- Each faulty program contains 5 seeded errors, each in a different statement.
- Each faulty program is associated with a statement-coverage-adequate test suite such that at least one failing run exercises each error.

Effectiveness Results
[Chart: average score per ranked list (%) for tcas, totinfo, sched, sched2, ptok, ptok2, replace, comparing the Isolated, Full, Partial, and Min value replacement techniques.]

Efficiency of Value Replacement
- Searching for IVMPs is time-consuming:
  5 failing runs x 50,000 stmt instances per run x 15 alt. value sets per instance = 3.75 million value replacement program executions - over 10 days if each execution requires a quarter-second.
- Lossy techniques:
  - reduce the search space for finding IVMPs;
  - may result in some missed IVMPs;
  - performed for the single-error benchmarks.
- Lossless techniques:
  - only affect the efficiency of the implementation;
  - result in no missed IVMPs;
  - performed for the multi-error benchmarks.
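The cost estimate on this slide works out as follows (pure arithmetic, using the slide's numbers):

```python
runs = 5             # failing runs
instances = 50_000   # statement instances per run
alt_sets = 15        # alternate value sets per instance

executions = runs * instances * alt_sets   # value replacement executions
seconds = executions * 0.25                # a quarter-second per execution
days = seconds / (24 * 60 * 60)
```

This gives 3.75 million executions and a little under 11 days of compute, hence the need for the lossy and lossless reductions that follow.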

79 Lossy Techniques
- Limit considered statement instances
  Find IVMP -> skip all subsequent instances of the same statement in the current run
  Don't find IVMP -> skip the statement in subsequent runs
- Limit considered alternate value sets
  Relative to the original value, only use the minimum and maximum alternate values below it and above it; skip all other alternate values in between
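The alternate-value reduction can be sketched as below; the exact selection rule (keep only the extreme observed values on each side of the original) is an assumption drawn from the slide's diagram.

```python
def limited_alternates(original, observed):
    """Lossy reduction of an alternate value set: relative to the
    original value, keep only the min and max observed values below it
    and the min and max above it; skip everything in between."""
    below = sorted(v for v in observed if v < original)
    above = sorted(v for v in observed if v > original)
    keep = set()
    if below:
        keep.update({below[0], below[-1]})   # min< and max<
    if above:
        keep.update({above[0], above[-1]})   # min> and max>
    return keep

print(sorted(limited_alternates(10, [1, 3, 7, 12, 15, 40])))
# [1, 7, 12, 40]: extremes on each side of 10; 3 and 15 are skipped
```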

80 Lossless Techniques
[diagram] Original execution through stmt instances 1, 2, and 3 (assume 2 alternate value sets at each stmt instance). Regular value replacement executions are independent of each other, so portions of the original execution are duplicated multiple times (x6, x4, x2).
Efficiency improvements:
(1) Fork a child process to do each value replacement within the original failing execution
(2) Perform value replacements in parallel

81 Lossless Techniques
With redundant execution removed: no duplication of any portion of the original execution.
With parallelization: the total time required to perform all value replacements is reduced.
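The fork-based idea on these two slides can be sketched as follows. This is a toy model, not the paper's implementation: `replay` stands in for finishing the run with one replacement applied, and the names are illustrative. The key point is that each child inherits the execution prefix through fork, so no portion of the original run is re-executed, and the children run in parallel.

```python
import os

def search_ivmps(trace, alternates, replay):
    """Walk the original failing execution once; at each statement
    instance, fork one child per alternate value. The child applies the
    replacement and runs to completion; the parent continues the
    original execution. replay(trace, i, alt) returns True if the
    replaced run produces correct output (i.e., the replacement is an IVMP)."""
    children = []
    for i, stmt in enumerate(trace):
        for alt in alternates(stmt):
            pid = os.fork()
            if pid == 0:                       # child: one value replacement
                os._exit(0 if replay(trace, i, alt) else 1)
            children.append((stmt, alt, pid))  # parent: continue original run
    found = []
    for stmt, alt, pid in children:
        _, status = os.waitpid(pid, 0)
        if os.WEXITSTATUS(status) == 0:
            found.append((stmt, alt))
    return found

# Toy failing run: the value defined at "b := 3" should have been 4.
trace = ["a := 1", "b := 3", "c := a + b"]
ivmps = search_ivmps(trace,
                     alternates=lambda stmt: [2, 4],
                     replay=lambda t, i, alt: t[i] == "b := 3" and alt == 4)
print(ivmps)  # [('b := 3', 4)]
```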

82 Search Reduction by Lossy Techniques
Reduction in # of executions by lossy techniques (single-error benchmarks):

# val replacements needed | Full | Limited
Mean | 2.0 M | 0.03 M
Max | 21.5 M | 0.4 M

83 Search Reduction by Lossy Techniques (cont.)
On average, the total number of executions is reduced by a factor of 67.

84 Time Required for Reduced Search
Time required to search using lossy techniques (single-error benchmarks):

Mean | 55.6 min
Max | 846.5 min
< 1 min | 39% of progs
< 10 min | 60% of progs
< 100 min | 87% of progs

85 Time Required for Reduced Search (cont.)
Only 13% of faulty programs required more than 100 minutes of IVMP search time.

86 Time Required with Lossless Techniques
[chart] Time to search in each faulty program using lossless techniques (multi-error benchmarks), comparing Full, Partial, and Min on tcas, totinfo, sched, sched2, ptok, ptok2, and replace; y-axis: Avg. Time (seconds).
With lossless techniques, multiple errors in a program can be located in minutes. With lossy techniques, some single errors require hours to locate.

87 Execution Suppression
- Efficient location of memory errors through targeted state alteration [Jeffrey et al., TOPLAS 2010]
- Alter state in a way that will definitely get closer to the goal each time
- Goal: identify the first point of memory corruption in a failing execution

88 Memory Errors and Corruption
- Memory errors: buffer overflow, uninitialized read, dangling pointer, double free, memory leak
- Memory corruption: an incorrect memory location is accessed, or an incorrect value is assigned to a pointer variable

89 Study of Memory Corruption
Traversal of error -> first point of memory corruption -> failure

Program | LOC | Memory Error Type | Analyzed Input Types
gzip | 6.3 K | Global overflow | No crash, Crash 1
man | 10.8 K | Global overflow | Crash 1
bc | 10.7 K | Heap overflow | No crash, Crash 1
pine | 211.9 K | Heap overflow | No crash, Crash 1
mutt | 65.9 K | Heap overflow | No crash, Crash 1
ncompress | 1.4 K | Stack overflow | No crash, Crash 1, Crash 2
polymorph | 1.1 K | Stack overflow | No crash, Crash 1, Crash 2
xv | 69.2 K | Stack overflow | No crash, Crash 1
tar | 28.4 K | NULL dereference | Crash 1
tidy | 35.9 K | NULL dereference | Crash 1
cvs | 104.1 K | Double free | Crash 1

90 Observations from Study
- The total distance from the point of error traversal until failure can be large
- Different inputs triggering memory corruption may result in different crashes or no crashes
- The distance from error traversal to first memory corruption is considerably less than the distance from first memory corruption to failure

91 Execution Suppression: High-Level
- A program crash reveals memory corruption
- Key assumption: memory corruption leads to a crash
- Component 1: suppression - iteratively identify the first point of memory corruption by omitting the effects of certain statements during execution
- Component 2: variable re-ordering - expose crashes where they may not otherwise occur; helpful since the key assumption does not always hold

92 Suppression: How it Works
While a crash occurs:
  Identify the accessed location L directly causing the crash
  Identify the last definition D of location L
  Re-execute the program, omitting execution of D and anything dependent on it
Endwhile
Report the statement associated with the most recent D -> first point of memory corruption
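The loop above can be sketched in a few lines. Here `execute` is a stand-in for re-running the program with a set of definitions suppressed; it returns the last definition D of the crash-causing location, or None once no crash occurs.

```python
def suppression_loop(execute):
    """Sketch of the suppression loop: repeatedly suppress the last
    definition of the crash-causing location until no crash occurs;
    the most recent D is the first point of memory corruption."""
    suppressed, last_def = set(), None
    while True:
        d = execute(frozenset(suppressed))
        if d is None:
            break
        last_def = d
        suppressed.add(d)   # omit D (and, in a real tool, its effects)
    return last_def

# Toy model mirroring the example on the following slides:
# crash chain 11 -> 8 -> 4, then no crash once stmt 4 is suppressed.
chain = {frozenset(): 11,
         frozenset({11}): 8,
         frozenset({11, 8}): 4,
         frozenset({11, 8, 4}): None}
print(suppression_loop(lambda s: chain[s]))  # 4
```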

93 Suppression: Example

1: int *p1 = &x[1];
2: int *p2 = &x[0];
3: int *q1 = &y[1];
4: int *q2 = &x[0];
5: *p1 = readInt();
6: *p2 = readInt();
7: *q1 = readInt();
8: *q2 = readInt();
9: int a = *p1 + *p2;
10: int b = *q1 + *q2;
11: int c = a + b + 1;
12: intArray[c] = 0;
13: structArray[*p2]->f = 0;
14: free(p2);
15: free(q2);

Stmt 4: copy-paste error: "x" should be "y"
Stmt 8: clobbers stmt 6
Stmts 9-11: propagation
Stmt 12: potential buffer overflow
Stmt 13: potential overflow or NULL dereference
Stmt 15: double free

94 Suppression: Example
Stmt 4 is the error as well as the first point of memory corruption (located in 4 executions).

1: int *p1 = &x[1];
2: int *p2 = &x[0];
3: int *q1 = &y[1];
4: int *q2 = &x[0];
5: *p1 = readInt();
6: *p2 = readInt();
7: *q1 = readInt();
8: *q2 = readInt();
9: int a = *p1 + *p2;
10: int b = *q1 + *q2;
11: int c = a + b + 1;
12: intArray[c] = 0;
13: structArray[*p2]->f = 0;
14: free(p2);
15: free(q2);

95 Example: Execution 1 of 4

1: int *p1 = &x[1];
2: int *p2 = &x[0];
3: int *q1 = &y[1];
4: int *q2 = &x[0];
5: *p1 = readInt();
6: *p2 = readInt();
7: *q1 = readInt();
8: *q2 = readInt();
9: int a = *p1 + *p2;
10: int b = *q1 + *q2;
11: int c = a + b + 1;
12: intArray[c] = 0;
13: structArray[*p2]->f = 0;
14: free(p2);
15: free(q2);

Trace (stmt: loc defined): 1: p1, 2: p2, 3: q1, 5: *p1, 6: *p2, 7: *q1, 4: q2, 8: *p2/*q2, 9: a, 10: b, 11: c, 12: CRASH
Action: Suppress the definition of c at stmt 11 and all of its effects

96 Example: Execution 2 of 4

1: int *p1 = &x[1];
2: int *p2 = &x[0];
3: int *q1 = &y[1];
4: int *q2 = &x[0];
5: *p1 = readInt();
6: *p2 = readInt();
7: *q1 = readInt();
8: *q2 = readInt();
9: int a = *p1 + *p2;
10: int b = *q1 + *q2;
11: int c = a + b + 1;
12: intArray[c] = 0;
13: structArray[*p2]->f = 0;
14: free(p2);
15: free(q2);

Trace (stmt: loc defined): 1: p1, 2: p2, 3: q1, 5: *p1, 6: *p2, 7: *q1, 4: q2, 8: *p2/*q2, 9: a, 10: b, CRASH
Action: Suppress the def of *p2/*q2 at stmt 8 and all of its effects

97 Example: Execution 3 of 4

1: int *p1 = &x[1];
2: int *p2 = &x[0];
3: int *q1 = &y[1];
4: int *q2 = &x[0];
5: *p1 = readInt();
6: *p2 = readInt();
7: *q1 = readInt();
8: *q2 = readInt();
9: int a = *p1 + *p2;
10: int b = *q1 + *q2;
11: int c = a + b + 1;
12: intArray[c] = 0;
13: structArray[*p2]->f = 0;
14: free(p2);
15: free(q2);

Trace (stmt: loc defined): 1: p1, 2: p2, 3: q1, 5: *p1, 6: *p2, 7: *q1, 4: q2, CRASH
Action: Suppress q2 at stmt 4 and its effects

98 Example: Execution 4 of 4

1: int *p1 = &x[1];
2: int *p2 = &x[0];
3: int *q1 = &y[1];
4: int *q2 = &x[0];
5: *p1 = readInt();
6: *p2 = readInt();
7: *q1 = readInt();
8: *q2 = readInt();
9: int a = *p1 + *p2;
10: int b = *q1 + *q2;
11: int c = a + b + 1;
12: intArray[c] = 0;
13: structArray[*p2]->f = 0;
14: free(p2);
15: free(q2);

Trace (stmt: loc defined): 1: p1, 2: p2, 3: q1, 5: *p1, 6: *p2, 7: *q1, 4 (suppressed); no crash occurs
Result: stmt 4 identified

99 Example: Summary
[diagram] Memory-state snapshots across the four executions:
Execution 1: runs to stmt 12, CRASH -> suppress
Execution 2: runs to stmt 13, CRASH -> suppress
Execution 3: runs to stmt 15, CRASH -> suppress
Execution 4: runs to stmt 15 with no crash -> REPORT 4

100 Variable Re-Ordering
- Re-order variables in memory prior to execution
- Try to cause a crash due to corruption in cases where a crash does not occur
- Can overcome limitations of suppression: does not terminate prematurely when corruption does not cause a crash; applicable to executions that do not crash
- Position address variables after buffers

101 Variable Re-Ordering: Example
From program ncompress:

void comprexx(char **fileptr)
{
  int fdin;
  int fdout;
  char tempname[1024];
  strcpy(tempname, *fileptr);
  ...
}

On the call stack, the original variable ordering and the re-ordering place tempname, fdout, fdin, and the return address differently.
Original ordering -> no stack smash; re-ordering -> stack smash
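The layout effect can be illustrated with a toy model (not the tool's actual stack layout): memory is a flat list of named slots, an overflow writes past the end of a 2-slot buffer, and whether an address variable gets clobbered depends entirely on whether re-ordering placed it after the buffer.

```python
def overflow_clobbers_ptr(order, overflow=4):
    """Toy model of variable re-ordering: lay out named slots
    low-to-high, overflow `overflow` slots past the 2-slot buffer,
    and report whether the address variable 'ptr' is clobbered."""
    mem = []
    for name, size in order:
        mem.extend([name] * size)
    buf_start = mem.index("buf")
    clobbered = set(mem[buf_start + 2 : buf_start + 2 + overflow])
    return "ptr" in clobbered

original  = [("ptr", 1), ("pad", 8), ("buf", 2)]   # ptr before the buffer
reordered = [("pad", 8), ("buf", 2), ("ptr", 1)]   # ptr after the buffer

print(overflow_clobbers_ptr(original))   # False: corruption stays silent
print(overflow_clobbers_ptr(reordered))  # True: overflow smashes ptr
```

Re-ordering does not fix anything; it only makes the existing corruption visible as a crash, which is exactly what suppression needs.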

102 Execution Suppression: The Complete Algorithm

exec := original failing execution;
Do
  (A) identifiedStmt, exec := run suppression component using exec;
  (B) reordering, exec := run variable re-ordering component using exec;
While (crashing reordering is found);
Report identifiedStmt;

Explanation:
(A) Runs suppression until no further crashes occur
(B) Attempts to expose an additional crash
The Do/While loop iterates as long as variable re-ordering exposes a new crash

103 Execution Suppression: Evaluation
Suppression-only results (no variable re-ordering):

Program | Input Type | # Exec. Required | Max Static Dependence Distance From Located Stmt To 1st Memory Corruption | ...To Error
gzip | Crash 1 | 2 | 0 | 0
man | Crash 1 | 2 | 1 | 2
bc | Crash 1 | 2 | 0 | 1
pine | Crash 1 | 2 | 0 | 5
mutt | Crash 1 | 3 | 0 | 1
ncompress | Crash 1, Crash 2 | ... | ... | ...
polymorph | Crash 1, Crash 2 | ... | ... | ...
xv | Crash 1 | 4 | 0 | 2
tar | Crash 1 | 2 | 0 | 0
tidy | Crash 1 | 2 | 0 | 0
cvs | Crash 1 | 2 | 0 | 0

104 Execution Suppression: Evaluation
Suppression and variable re-ordering results:

Program | Input Type | # Crashes Exposed | # Var R-O Exec. | Max Static Dependence Distance From Located Stmt To 1st Memory Corruption | ...To Error
gzip | No Crash | ... | ... | ... | ...
man | Crash 1 | ... | ... | ... | ...
bc | No Crash | ... | ... | ... | ...
pine | No Crash | ... | ... | ... | ...
mutt | No Crash | ... | ... | ... | ...
ncompress | No Crash | 1 | 5 | 0 | 0
polymorph | No Crash | 1 | 6 | 0 | 1
xv | No Crash | 1 | 135 | 0 | 2

105 Memory Errors in Multithreaded Programs
- Assume programs run on a single processor
- Two main enhancements required for Execution Suppression:
  Reproduce the failure on multiple executions
    Record the failure-inducing thread interleaving
    Replay the same interleaving on subsequent executions
    In general, other factors should be recorded/replayed
  Identify data race errors
    Data race: concurrent, unsynchronized access of a shared memory location by multiple threads, at least one of which is a write
    Identified on-the-fly during suppression
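Replaying a recorded interleaving can be sketched as a turn-taking scheduler; this is a minimal model, not the tool's record/replay mechanism. The recorded schedule is a list of thread names, and each thread must wait for its turn before performing its next shared-memory access (modeled here as a log append).

```python
import threading

class ReplayScheduler:
    """Replay a recorded failure-inducing interleaving: threads block
    until the schedule says it is their turn to access shared memory."""
    def __init__(self, schedule):
        self.schedule, self.pos = list(schedule), 0
        self.cv = threading.Condition()

    def access(self, name, log):
        with self.cv:
            while self.schedule[self.pos] != name:
                self.cv.wait()
            log.append(name)          # the scheduled shared-memory access
            self.pos += 1
            self.cv.notify_all()

sched, log = ReplayScheduler(["T1", "T2", "T1"]), []
t1 = threading.Thread(target=lambda: [sched.access("T1", log) for _ in range(2)])
t2 = threading.Thread(target=lambda: sched.access("T2", log))
t2.start(); t1.start(); t1.join(); t2.join()
print(log)  # ['T1', 'T2', 'T1'] regardless of OS scheduling
```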

106 Identifying Data Races
- Data races involve WAR, WAW, or RAW dependences
- Identified points of suppression are writes: they can be involved in a WAR or WAW dependence before that point, and in a RAW dependence after that point
- Monitor for an involved data race on-the-fly during a suppression execution

107 On-the-fly Data Race Detection Identified suppression point (thread T 1 writes to location L) Last access to L by a thread other than T 1 Next read from L by a thread other than T 1 Monitor for synchronization on L Suppression Execution

108 On-the-fly Data Race Detection Last access to L by a thread other than T 1 Monitor for synchronization on L Suppression Execution Next read from L by a thread other than T 1 WAR or WAW data race may be identified at this point

109 On-the-fly Data Race Detection Last access to L by a thread other than T 1 Monitor for synchronization on L Suppression Execution RAW data race may be identified at this point WAR or WAW data race may be identified at this point
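The monitoring described on the three slides above can be sketched offline over an access log; the representation (timestamped events per location, with a sync event assumed to order the two accesses) is an illustration, not the tool's data structures.

```python
def races_at_suppression_point(events, point):
    """Sketch of on-the-fly monitoring around a suppression point
    (thread T1's write to location L). `events` are (time, thread, op)
    tuples for location L only, op in {"read", "write", "sync"}; any
    sync between the two accesses is assumed to order them."""
    t_p, t1, _ = point
    races = []

    def synced(lo, hi):
        return any(op == "sync" and lo < t < hi for t, _, op in events)

    # last access to L by a thread other than T1 before the point -> WAR/WAW
    before = [e for e in events
              if e[0] < t_p and e[1] != t1 and e[2] in ("read", "write")]
    if before:
        t, _, op = max(before)
        if not synced(t, t_p):
            races.append("WAR" if op == "read" else "WAW")

    # next read of L by a thread other than T1 after the point -> RAW
    after = [e for e in events
             if e[0] > t_p and e[1] != t1 and e[2] == "read"]
    if after and not synced(t_p, min(after)[0]):
        races.append("RAW")
    return races

events = [(1, "T2", "write"), (2, "T1", "write"), (3, "T2", "read")]
print(races_at_suppression_point(events, (2, "T1", "write")))  # ['WAW', 'RAW']
```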

110 Potentially-Harmful Data Races
Given two memory accesses involved in a data race, force other thread interleavings to see if the state is altered:
- Memory access point 1: access to L from thread T1
- Memory access point 2: access to L from thread T2
- For each ready thread besides T1, re-execute from this point and schedule it in place of T1
- If the value in L is changed at this point, the data race is potentially harmful
(Harmful data race checking executions)
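The check can be condensed to a few lines; `run_from_access` is a stand-in for re-executing from the first racing access under a chosen schedule and reporting the value left in L.

```python
def potentially_harmful(run_from_access, racing_thread, ready_threads):
    """Sketch of the harmful-race check: re-execute scheduling each
    other ready thread in place of the racing one; the race is flagged
    as potentially harmful if the value in the raced-on location L
    differs under any alternate schedule."""
    baseline = run_from_access(racing_thread)
    return any(run_from_access(t) != baseline
               for t in ready_threads if t != racing_thread)

# Hypothetical outcome model: scheduling T2 first leaves a different value in L.
outcomes = {"T1": 0, "T2": 7}
print(potentially_harmful(outcomes.get, "T1", ["T1", "T2"]))  # True
```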

111 Evaluation with Multithreaded Programs
Multithreaded benchmark programs and results:

Program | LOC | Error Type | # Executions Required | Precisely Identifies Error?
apache | 191 K | Data race | 3 | yes
mysql-1 | 508 K | Data race | 3 | yes
mysql-2 | 508 K | Data race | 3 | yes
mysql-3 | 508 K | Uninitialized read | 2 | yes
prozilla-1 | 16 K | Stack overflow | 2 | yes
prozilla-2 | 16 K | Stack overflow | 4 | yes
axel | 3 K | Stack overflow | 3 | yes

112 Implementing Suppression: General
- Global variables
  count: dynamic instruction count value
  suppress: suppression mode flag (boolean)
- Variables associated with each register and memory word
  lastDef: count value associated with the instruction that last defined it
  corrupt: whether associated effects need to be suppressed

At a program instruction (defines target, uses src1 and src2):

Ensure the instruction responsible for a crash can be identified:
  target.lastDef := ++count;

Carry out suppression as necessary:
  if (current instruction is a suppression point)
    suppress := true;
    target.corrupt := true;
  if (suppress)
    if (src1.corrupt or src2.corrupt)
      target.corrupt := true;
    else
      execute instruction;
      target.corrupt := false;
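A minimal interpreter sketch of this instrumentation is below, assuming a simplified three-address form where every instruction defines `target` from `src1`/`src2`; the representation is illustrative, not the actual binary-level implementation. Once the suppression point is reached, its definition and everything data-dependent on it are skipped while independent instructions still execute.

```python
class Cell:
    """Per-location metadata kept by the suppression interpreter."""
    def __init__(self):
        self.last_def = 0      # dynamic count of the defining instruction
        self.corrupt = False   # must this location's effects be suppressed?

def interpret(instrs, suppression_point, state):
    """instrs: list of (target, src1, src2); returns the indices of
    instructions that actually execute once suppression kicks in."""
    count, suppress, executed = 0, False, []
    cells = {v: Cell() for v in state}
    for idx, (target, src1, src2) in enumerate(instrs):
        count += 1
        cells[target].last_def = count    # so a crash can be traced to its def
        if idx == suppression_point:
            suppress = True
            cells[target].corrupt = True  # skip this def and its effects
        elif suppress and (cells[src1].corrupt or cells[src2].corrupt):
            cells[target].corrupt = True  # corruption propagates, skip
        else:
            executed.append(idx)          # independent: executes normally
            cells[target].corrupt = False
    return executed

instrs = [("a", "z", "z"), ("b", "z", "z"), ("c", "b", "z"), ("d", "a", "z")]
print(interpret(instrs, suppression_point=1, state={"a", "b", "c", "d", "z"}))
# [0, 3]: the def of b is suppressed, c depends on b and is skipped,
# while d depends only on a and still executes
```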

113 Software/Hardware Support
- A software-only implementation can incur relatively high overhead (SW)
- Reduce overhead with hardware support:
  Existing support in Itanium processors for deferred exception handling: an extra bit for registers (HW1)
  Further memory augmentation: an extra bit for memory words (HW1 + HW2)
- Overheads compared in a simulator

114 Performance Overhead Comparison
Average overhead: 7.2x (SW), 2.7x (HW1), 1.8x (HW1 + HW2)

115 Other Dynamic Error Location Techniques
- Other state-alteration techniques
  Delta Debugging [Zeller et al., FSE 2002, TSE 2002, ICSE 2005]
    Search in space for values relevant to a failure
    Search in time for failure cause transitions
  Predicate Switching [Zhang et al., ICSE 2006]
    Alter predicate outcomes to correct failing output
- Value Replacement is more aggressive; Execution Suppression is better targeted at memory errors

116 Other Dynamic Error Location Techniques
- Program slicing-based techniques
  Pruning Dynamic Slices with Confidence [Zhang et al., PLDI 2006]
  Failure-Inducing Chops [Gupta et al., ASE 2005]
- Invariant-based techniques
  Daikon [Ernst et al., IEEE TSE Feb. 2001]
  AccMon [Zhou et al., HPCA 2007]

117 Other Dynamic Error Location Techniques
- Statistical techniques
  Cooperative Bug Isolation [Ben Liblit, doctoral dissertation, 2005]
  SOBER [Liu et al., FSE 2005]
  Tarantula [Jones et al., ICSE 2002]
- Spectra-based techniques
  Nearest Neighbor [Renieris and Reiss, ASE 2003]

118 Future Directions
- Enhancements to Value Replacement
  Improve scalability
  Study when IVMPs cannot be found at faulty statements
- Enhancements to Execution Suppression
  Improve scalability of variable re-ordering
  Other techniques to expose crashes
  Handle memory errors that do not involve corruption
- Applications to Fixing Errors
  IVMPs can be used in BugFix [Jeffrey et al., ICPC 2009]
  Comparatively little research exists on automated techniques for fixing errors
- Applications to Tolerating Errors
  Suppression can be used to recover from failures in server programs [Nagarajan et al., ISMM 2009]
  Other applications?

Dissertations
- Dennis Jeffrey, Google
  Dynamic State Alteration Techniques for Automatically Locating Software Errors.
- Vijay Nagarajan, University of Edinburgh
  IMPRESS: Improving Multicore Performance and Reliability via Efficient Support for Software Monitoring, 2007.

Dissertations
- Chen Tian, Samsung R&D Center
  Speculative Parallelization on Multicore Processors.
- Min Feng
  The SpiceC Parallel Programming System, 2012 (expected).

Ongoing Work
- Yan Wang
  Qzdb: The QuickZoom Debugger
- Li Tan
  Debugging SpiceC Programs

Ongoing Work
- Kishore Kumar Pusukuri
  OS/Architecture Interaction on Multicore Processors
- Sai Charan Koduru
  Resource Allocation Issues in Multicore Systems
- Changhui Lin
  Memory Consistency Models for Multicore Systems