CHESS: Systematic Concurrency Testing
Tom Ball, Sebastian Burckhardt, Madan Musuvathi, Shaz Qadeer
Microsoft Research

Testing concurrent programs is HARD
- Rare thread interleavings expose bugs
- Coverage problem
  - Testing misses thread interleavings that expose errors
- Reproducibility problem
  - Concurrency bugs == Heisenbugs
  - Not reproducible => hard to debug
  - Crash dumps don’t help

Thread interleavings
[Example on slide: x++; and x*=2;]

Concurrency testing today
- Concurrency testing == stress testing
- Example: testing a concurrent queue
  - Create 100 threads performing queue operations
  - Run for days/weeks
- Stress increases the interleaving variety, but
  - Not systematic: might miss interleavings
  - Not predictable: cannot find the same error again
  - Makes any error found hard to debug

1 Why stress is not sufficient

Concurrency testing: what we need
- Methodology and tools to systematically and predictably test thread interleavings

CHESS in a nutshell
- Replace the OS scheduler with a demonic scheduler
- Systematically explore all scheduling choices
[Diagram: Concurrent Program, Win32 API, Kernel Scheduler, Demonic Scheduler]

CHESS will run this program 6 times, exploring all the different interleavings
[Example on slide: x++; and x*=2;]
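Where the count of 6 can come from, as a minimal sketch. It assumes (the slide does not say this) that the two statements run in separate threads, that each one is a load followed by a store, and that x starts at 0. The names THREAD_A, THREAD_B, interleavings and run are made up for the illustration.

```python
from itertools import combinations

# Assumed decomposition: each statement is a load followed by a store.
THREAD_A = ["A:load", "A:store"]   # x++
THREAD_B = ["B:load", "B:store"]   # x *= 2


def interleavings(a, b):
    """Yield every merge of a and b that keeps each thread's own order."""
    total = len(a) + len(b)
    for slots in combinations(range(total), len(a)):
        ai, bi = iter(a), iter(b)
        yield [next(ai) if i in slots else next(bi) for i in range(total)]


def run(schedule, x=0):
    """Execute one schedule against a shared x and return its final value."""
    reg = {}
    for step in schedule:
        thread, op = step.split(":")
        if op == "load":
            reg[thread] = x          # read shared x into a thread-local register
        elif thread == "A":
            x = reg["A"] + 1         # write back the increment
        else:
            x = reg["B"] * 2         # write back the doubling
    return x


schedules = list(interleavings(THREAD_A, THREAD_B))
finals = sorted({run(s) for s in schedules})
print(len(schedules), finals)
```

Under these assumptions there are 6 schedules and three distinct final values (0, 1 and 2); only 1 and 2 correspond to the two statements running atomically in some order.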

2 Don’t stress, use CHESS

CHESS architecture
- CHESS runs the scenario in a loop: While(not done) { TestScenario() }
  - Every run takes a different interleaving
  - Every run is repeatable
- Intercept synch. & threading calls (Win32 API)
  - To control and introduce nondeterminism
- Detect
  - Assertion violations
  - Deadlocks
  - Data races
  - Livelocks
[Diagram: Program containing TestScenario() { … }, CHESS, Win32 API, Kernel: Threads, Scheduler, Synchronization Objects]
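The "run the scenario in a loop, a different interleaving each time, every run repeatable" idea can be sketched in a few lines. This is an illustration only, not CHESS's implementation: threads are modelled as Python generators that yield at each scheduling point, and scenario, run_once and explore are hypothetical names.

```python
def scenario():
    """Build a fresh shared state and two 'threads' (generators)."""
    state = {"x": 0}

    def inc():                       # x++ as load, scheduling point, store
        tmp = state["x"]
        yield
        state["x"] = tmp + 1

    def dbl():                       # x *= 2 as load, scheduling point, store
        tmp = state["x"]
        yield
        state["x"] = tmp * 2

    return state, [inc(), dbl()]


def run_once(make, prefix, todo):
    """Run once: follow prefix, then always pick the first enabled thread.

    Every alternative choice seen after the prefix is pushed onto todo.
    """
    state, threads = make()
    live = dict(enumerate(threads))
    schedule = []
    while live:
        enabled = sorted(live)
        if len(schedule) < len(prefix):
            choice = prefix[len(schedule)]
        else:
            choice = enabled[0]
            for other in enabled[1:]:
                todo.append(schedule + [other])
        schedule.append(choice)
        try:
            next(live[choice])       # run the thread up to its next scheduling point
        except StopIteration:
            del live[choice]         # thread finished
    return schedule, state


def explore(make, check):
    """Re-execute the scenario once per schedule; stop at the first failure."""
    todo, runs = [[]], 0
    while todo:
        runs += 1
        schedule, state = run_once(make, todo.pop(), todo)
        if not check(state):
            return runs, schedule    # the schedule is enough to replay the bug
    return runs, None


print(explore(scenario, lambda s: True))             # visits every schedule
runs, bad = explore(scenario, lambda s: s["x"] in (1, 2))
print(runs, bad, run_once(scenario, bad, [])[1])     # replaying bad reproduces x == 0
```

No program state is stored between runs; the explorer only remembers lists of scheduling choices, which is also what makes a failing run reproducible.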

CHESS methodology generalizes
- Need wrappers for every concurrency API
- CHESS has wrappers for Win32, .NET, Singularity
- Wrappers understand the semantics of the API
  - Expose nondeterminism in the API
- Looking for volunteers to build wrappers for Linux and Java
[Diagram: three stacks, each with CHESS: .NET Program on the .NET CLR; Win32 Program on Win32 / OS; Singularity Program on Singularity]

CHESS clients
- PCP = Parallel Computing Platform (for multi/many-cores)
- PLINQ: Parallel LINQ
- CDS: Concurrent Data Structures
- STM: Software Transactional Memory
- TPL: Task Parallel Library
- ConcRT: Concurrency RunTime
- CCR: Concurrency Coordination Runtime
- Dryad (part of COSMOS)
- Singularity/Midori
  - CHESS can systematically test the boot and shutdown process

Stateless model checking [Verisoft ’97]
- Systematically enumerate all paths in a state-space graph
- Don’t capture program states
  - Capturing states is extremely hard for large programs
- Effective for message-passing programs
- CHESS applies stateless model checking to shared-memory multithreaded programs

Outline
- Preemption bounding [PLDI ’07]
- Fair stateless model checking [PLDI ’08]
- Sober [CAV ’08, EC2 ’08]
- FeatherLite
- Concurrency Explorer [EC2 ’08]

Outline
- Preemption bounding
  - Makes CHESS effective on deep state spaces
- Fair stateless model checking
- Sober
- FeatherLite
- Concurrency Explorer

State space explosion
- n threads, k steps each
- Number of executions = $O(n^{nk})$
- Exponential in both n and k
- Typically: n < 10, k > 100
- Limits scalability to large programs
- Goal: Scale CHESS to large programs (large k)
[Figure: n threads, each running x = 1; … y = k;]
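To get a feel for the growth, the exact number of interleavings can be computed for small cases. The sketch below assumes n threads of exactly k atomic steps each, all always enabled, in which case the count is the multinomial coefficient (nk)! / (k!)^n, which never exceeds n^(nk). The function name interleavings is made up for the illustration.

```python
from math import factorial


def interleavings(n, k):
    """Exact number of interleavings of n threads with k atomic steps each."""
    return factorial(n * k) // factorial(k) ** n


for n, k in [(2, 2), (2, 10), (3, 10), (2, 100)]:
    count = interleavings(n, k)
    # Print the number of decimal digits as well, since the counts get long.
    print(f"n={n} k={k}: {len(str(count))} digits, bound n^(nk) has "
          f"{len(str(n ** (n * k)))} digits")
```

Even with only two threads, raising k from 10 to 100 takes the count from six digits to dozens of digits, which is why exhaustive enumeration stops scaling long before realistic values of k.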

Preemption bounding
- Prioritize executions with small number of preemptions
- Two kinds of context switches:
  - Preemptions: forced by the scheduler, e.g. time-slice expiration
  - Non-preemptions: a thread voluntarily yields, e.g. blocking on an unavailable lock, thread end
[Example on slide: one thread runs x = 1; if (p != 0) { x = p->f; } and another runs p = 0;. The switch to p = 0; between the null check and x = p->f; is labeled a preemption; the switch back after p = 0; ends is labeled a non-preemption.]

Polynomial state space
- Terminating program with fixed inputs and deterministic threads
- n threads, k steps each, c preemptions
- Number of executions $\le \binom{nk}{c} \cdot (n+c)! = O((n^2 k)^c \cdot n!)$
- Exponential in n and c, but not in k
- Choose c preemption points
- Permute n+c atomic blocks
[Figure: n threads, each running x = 1; … y = k;, split into atomic blocks at the chosen preemption points]
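The bound can be checked against a brute-force enumeration on tiny inputs. The sketch below assumes every thread has exactly k always-enabled steps, and counts a context switch as a preemption only when the thread being switched away from still has steps left. The names schedules and bound are hypothetical.

```python
from math import comb, factorial


def schedules(n, k, c):
    """Yield schedules of n threads with k steps each, using at most c preemptions."""
    def go(remaining, current, used, prefix):
        if not any(remaining):
            yield tuple(prefix)
            return
        for t in range(n):
            if remaining[t] == 0:
                continue
            # Leaving a thread that could still run is a preemption;
            # leaving a finished thread is a non-preempting switch.
            preempt = current is not None and t != current and remaining[current] > 0
            if used + preempt > c:
                continue
            remaining[t] -= 1
            yield from go(remaining, t, used + preempt, prefix + [t])
            remaining[t] += 1

    yield from go([k] * n, None, 0, [])


def bound(n, k, c):
    """The slide's upper bound: C(nk, c) * (n + c)!"""
    return comb(n * k, c) * factorial(n + c)


for c in range(3):
    print(c, len(list(schedules(2, 2, c))), bound(2, 2, c))
print(bound(2, 100, 2))
```

For two threads of two steps each, the enumeration yields 2, 4 and 6 schedules for c = 0, 1, 2, each within the bound. For n = 2, k = 100, c = 2 the bound is 477600 executions, a number that is practical to run, unlike the unbounded count.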

3 Preemption bounding

Find lots of bugs with 2 preemptions

Program            Lines of code   Bugs
Work Stealing Q    4K              4
CDS                6K              1
CCR                9K              3
ConcRT             16K             4
Dryad              18K             7
APE                19K             4
STM                20K             2
TPL                24K             9
PLINQ              24K             1
Singularity        175K            2
Total                              37

Acknowledgement: testers from PCP team

So, is CHESS unsound?
- Soundness: prove that the program is correct for a given input test harness
  - Need to exhaustively explore all interleavings
- For small programs, CHESS is sound
  - Iteratively increase the preemption bound
- Preemption bounding helps scale to large programs
  - A good “knob” to trade resources for coverage
- Better search algorithms => more coverage faster
  - Partial-order reduction
  - Modular testing of loosely-coupled programs

Outline
- Preemption bounding
  - Makes CHESS effective on deep state spaces
- Fair stateless model checking
  - Makes CHESS effective on cyclic state spaces
  - Enables CHESS to find liveness violations (livelocks)
- Sober
- FeatherLite
- Concurrency Explorer

Concurrent programs have cyclic state spaces
- Spinlocks
- Non-blocking algorithms
- Implementations of synchronization primitives
- Periodic timers
- …
[Example on slide: one thread runs L1: while( ! done) { L2: Sleep(); } and another runs M1: done = 1;. State diagram nodes: (! done, L1), (! done, L2), (done, L1), (done, L2).]

A demonic scheduler unrolls any cycle ad infinitum
[Figure: execution tree for while( ! done) { Sleep(); } and done = 1;, alternating ! done and done nodes]

Depth bounding
- Prune executions beyond a bounded number of steps
[Figure: execution tree of ! done and done nodes, cut off at the depth bound]

Problem 1: Ineffective state coverage
- Bound has to be large enough to reach the deepest bug
  - Typically, greater than 100 synchronization operations
- Every unrolling of a cycle redundantly explores reachable state space
[Figure: execution tree with the depth bound marked]

Problem 2: Cannot find livelocks
- Livelocks: lack of progress in a program
[Example on slide: one thread runs temp = done; while( ! temp) { Sleep(); } and another runs done = 1;]

Key idea
- This test terminates only when the scheduler is fair
- Fairness is assumed by programmers
- All cycles in correct programs are unfair
- A fair cycle is a livelock
[Example on slide: one thread runs while( ! done) { Sleep(); } and another runs done = 1;]

We need a fair demonic scheduler
- Avoid unrolling unfair cycles
  - Effective state coverage
- Detect fair cycles
  - Find livelocks (violations of fair termination)
[Diagram: Concurrent Program, Test Harness, Win32 API, Demonic Scheduler, Fair Demonic Scheduler]
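A toy simulation of why fairness matters for the spin-loop test. This is a deliberately simplified stand-in, not the fair scheduling algorithm CHESS uses: the rule assumed here is that a thread which calls Sleep() may not be scheduled again until every other live thread has taken a step. The names spinner, setter and run are made up.

```python
def spinner(state):
    """while (!done) { Sleep(); }"""
    while not state["done"]:
        yield "sleep"


def setter(state):
    """done = 1;"""
    state["done"] = True
    yield "step"


def run(fair, max_steps=1000):
    """Return the number of steps until both threads finish, or None if the budget runs out."""
    state = {"done": False}
    live = {"spinner": spinner(state), "setter": setter(state)}
    waiting = {name: set() for name in live}   # threads each one must wait for
    for step in range(max_steps):
        if not live:
            return step
        ready = [t for t in live if not waiting[t]]
        t = ready[0]                           # demonic choice: prefer the spinner
        try:
            event = next(live[t])
        except StopIteration:
            del live[t]
            event = None
        for others in waiting.values():
            others.discard(t)                  # t has now taken a step
        if fair and event == "sleep":
            waiting[t] = set(live) - {t}       # yielding thread lets the others run
    return None


print(run(fair=False), run(fair=True))
```

The unfair scheduler keeps picking the spinning thread and exhausts its step budget, while the fair variant finishes after a handful of steps because the yield forces done = 1 to run.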

Fair termination allows CHESS to check for arbitrary liveness properties
- Example: Good Samaritan assumption
  - Forall threads t : GF scheduled(t) => GF yield(t)
  - A thread when scheduled infinitely often yields the processor infinitely often
- Examples of yield: Sleep(), ScheduleThread(), asm {rep nop;}, thread completion
[Example on slide: one thread runs while( ! done) { Sleep(); } and another runs done = 1;]

Outline
- Preemption bounding
  - Makes CHESS effective on deep state spaces
- Fair stateless model checking
  - Makes CHESS effective on cyclic state spaces
  - Enables CHESS to find liveness violations (livelocks)
- Sober
  - Detect relaxed-memory model errors
  - Do not miss behaviors only possible in a relaxed memory model
- FeatherLite
- Concurrency Explorer

C# Example

    volatile bool isIdling;
    volatile bool hasWork;

    // Consumer thread
    void BlockOnIdle() {
        lock (condVariable) {
            isIdling = true;
            if (!hasWork)
                Monitor.Wait(condVariable);
            isIdling = false;
        }
    }

    // Producer thread
    void NotifyPotentialWork() {
        hasWork = true;
        if (isIdling)
            lock (condVariable) {
                Monitor.Pulse(condVariable);
            }
    }

Example: Store Buffer Vulnerability
- Key pieces of code on previous slide: volatile int ii = 0; volatile int hw = 0;
  - Consumer: Store ii, 1; Load hw, 0
  - Producer: Store hw, 1; Load ii, 1
- On x86, hardware may perform store late
- Bug: Producer thread does not notice waiting Consumer, does not send signal
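The effect of a late store can be reproduced with a small model. The sketch below is an assumption-laden simplification of x86 behaviour: each thread gets a FIFO store buffer, a load sees the thread's own newest buffered store or else memory, and buffered stores drain to memory at arbitrary times. The variable names ii and hw come from the slide; PROGRAMS and outcomes are hypothetical.

```python
PROGRAMS = {
    "consumer": [("store", "ii", 1), ("load", "hw")],
    "producer": [("store", "hw", 1), ("load", "ii")],
}


def outcomes(store_buffers):
    """Return every (hw seen by consumer, ii seen by producer) pair some schedule produces."""
    seen = set()

    def explore(memory, buffers, pcs, loaded):
        moved = False
        for name, ops in PROGRAMS.items():
            if pcs[name] < len(ops):           # run this thread's next instruction
                moved = True
                mem, got = dict(memory), dict(loaded)
                buf = {t: list(b) for t, b in buffers.items()}
                kind, var, *value = ops[pcs[name]]
                if kind == "store" and store_buffers:
                    buf[name].append((var, value[0]))   # store waits in the local buffer
                elif kind == "store":
                    mem[var] = value[0]                 # SC: store reaches memory at once
                else:
                    own = [v for (x, v) in buf[name] if x == var]
                    got[name] = own[-1] if own else mem[var]
                explore(mem, buf, {**pcs, name: pcs[name] + 1}, got)
            if buffers[name]:                  # or drain its oldest buffered store
                moved = True
                mem = dict(memory)
                buf = {t: list(b) for t, b in buffers.items()}
                var, val = buf[name].pop(0)
                mem[var] = val
                explore(mem, buf, pcs, loaded)
        if not moved:                          # both threads done, buffers empty
            seen.add((loaded["consumer"], loaded["producer"]))

    explore({"ii": 0, "hw": 0}, {t: [] for t in PROGRAMS},
            {t: 0 for t in PROGRAMS}, {})
    return seen


print(sorted(outcomes(store_buffers=False)))
print(sorted(outcomes(store_buffers=True)))
```

Without store buffers, at least one thread always observes the other's store. With them, the pair (0, 0) becomes reachable: the consumer sees no work and the producer sees no idle consumer, which is the missed-signal bug described on the slide.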

Sober algorithm
- Programmers assume sequential consistency (SC)
  - Insert synchronizations & fences to counter memory-model relaxations
- Sober checks if a program is memory-model safe
  - i.e., program has only SC executions in a memory model
  - Reports any such violation as an error
- Sober is a dynamic monitor that checks if any SC execution can be extended to a non-SC execution
- Theorem: CHESS + Sober guarantees memory-model safety

Outline
- Preemption bounding
  - Makes CHESS effective on deep state spaces
- Fair stateless model checking
  - Makes CHESS effective on cyclic state spaces
  - Enables CHESS to find liveness violations (livelocks)
- Sober
  - Detect relaxed-memory model errors
  - Do not miss behaviors only possible in a relaxed memory model
- FeatherLite
  - A light-weight data-race detection engine (<20% overhead)
- Concurrency Explorer

Outline
- Preemption bounding
  - Makes CHESS effective on deep state spaces
- Fair stateless model checking
  - Makes CHESS effective on cyclic state spaces
  - Enables CHESS to find liveness violations (livelocks)
- Sober
  - Detect relaxed-memory model errors
  - Do not miss behaviors only possible in a relaxed memory model
- FeatherLite
  - A light-weight data-race detection engine (<20% overhead)
- Concurrency Explorer
  - First-class concurrency debugging

Conclusion
- Don’t stress, use CHESS
  - CHESS binary and papers available at
- Stateless model checking is very effective
  - Preemption bounding to scale to deep state spaces
  - Fair demonic scheduler to handle nonterminating programs
- Need better testing and debugging methodologies for concurrent programs

Questions