
1 Distributed Dynamic Partial Order Reduction based Verification of Threaded Software
Yu Yang (PhD student; summer intern at CBL), Xiaofang Chen (PhD student; summer intern at IBM), Ganesh Gopalakrishnan, Robert M. Kirby
School of Computing, University of Utah
SPIN 2007 Workshop Presentation
Supported by: Microsoft HPC Institutes, NSF CNS

2 Thread programming will become more prevalent; FV of thread programs will grow in importance

3 Why FV for Threaded Programs: > 80% of chips shipped will be multi-core (photo courtesy of Intel Corporation)

4 Model Checking will increasingly be through Dynamic Methods (also known as Runtime or In-Situ methods)

5 Why Dynamic Verification Methods
- Even after early life-cycle modeling and validation, the final code will have far more details
- Early life-cycle modeling is often impossible
  - Use of libraries (APIs) such as MPI, OpenMP, Shmem, …
  - Library function semantics can be tricky
  - The bug may be in the library function implementation

6 Model Checking will often be “stateless”

7 Why Stateless
- One may not be able to access a lot of the state, e.g., the state of the OS
- It is expensive to hash and look up revisited states
- Stateless search is easier to parallelize

8 Partial Order Reduction is Crucial !

9 Why POR?
Process P0:          Process P1:
0: MPI_Init          0: MPI_Init
1: MPI_Win_lock      1: MPI_Win_lock
2: MPI_Accumulate    2: MPI_Accumulate
3: MPI_Win_unlock    3: MPI_Win_unlock
4: MPI_Barrier       4: MPI_Barrier
5: MPI_Finalize      5: MPI_Finalize
The two MPI_Accumulate calls are the ONLY DEPENDENT OPERATIONS.
504 interleavings without POR: (2 * (10!)) / (5!)^2
2 interleavings with POR !!

10 Dynamic POR is almost a "must"! (Dynamic POR as in Flanagan and Godefroid, POPL 2005)

11 Why Dynamic POR?
Thread 1: a[ j ]++    Thread 2: a[ k ]--
The ample set depends on whether j == k
- Can be very difficult to determine statically
- Can be determined dynamically

12 Why Dynamic POR ? The notion of action dependence (crucial to POR methods) is a function of the execution

13 Computation of "ample" sets in Static POR versus in DPOR
- Static POR: the ample set is determined using "local" criteria
- DPOR: from the current state, take the next move of a process, look back in the stack for the nearest dependent transition, and add that process to the "backtrack set" recorded there
- This builds the ample set incrementally, based on observed dependencies
- Each stack entry carries a pair of sets { BT }, { Done }; a process whose move has already been explored is in the "Done" set

14 Putting it all together …
- We target C/C++ PThread programs
- Instrument the given program (largely automated)
- Run the concurrent program "till the end"
- Record interleaving variants while advancing
- When the number of recorded backtrack points reaches a soft limit, spill work to other nodes
- In one larger example, an 11-hour run was finished in 11 minutes using 64 nodes
- A heuristic to avoid recomputations was essential for speed-up
- First known distributed DPOR

15 A Simple DPOR Example
Program: t0: lock(t); unlock(t) | t1: lock(t); unlock(t) | t2: lock(t); unlock(t)
Stack: empty; sets at the initial state: {}, {}

16 Execute t0: lock

17 Execute t0: unlock

18 Execute t1: lock

19 t1's operation is dependent with t0's, so t1 is added to the backtrack set at the initial state: {t1}, {t0}

20 Execute t1: unlock, then t2: lock; the state after t0's operations gets fresh sets {}, {}

21 t2's operation is dependent with t1's, so the state after t0's operations becomes {t2}, {t1}

22 Execute t2: unlock; the first interleaving is complete

23 Backtrack: pop the completed execution off the stack

24 The stack is popped back to t0: lock; t0: unlock

25 Explore t2 from there; the initial state becomes {t1,t2}, {t0} and the state after t0 becomes {}, {t1, t2}

26 Execute t2: lock; t2: unlock …

27 Pop back again to t0: lock; t0: unlock

28 Pop to the initial state, whose sets are now {t2}, {t0,t1}

29 Explore t1: lock; t1: unlock from {t2}, {t0, t1} …

30 For this example, DPOR explores all the paths; for other programs it explores a proper subset

31 Idea for parallelization: explore computations from the backtrack set in other processes. "Embarrassingly parallel" – or so it seems, anyway!

32 We first built a sequential DPOR explorer for C / Pthreads programs, called "Inspect"
[Architecture: the multithreaded C/C++ program is instrumented and compiled against a thread-library wrapper into an executable; at run time, threads 1..n exchange request/permit messages with a central scheduler]

33 We then made the following observations
- Stateless search does not maintain search history
- Different branches of an acyclic space can be explored concurrently
- A simple master-slave scheme can work here: one load balancer + workers

34 We then devised a work-distribution scheme…
[Diagram: workers and a load balancer exchange unloading requests, idle node ids, work descriptions, and result reports]

35 We got zero speedup! Why? Deeper investigation revealed that multiple nodes ended up exploring the same interleavings

36 Illustration of the problem (1 of 5)
Stack: t0: lock; t0: unlock; t1: lock; t1: unlock; t2: lock; t2: unlock
Sets: {t1}, {t0} at the initial state and {t2}, {t1} after t0's operations

37 Illustration of the problem (2 of 5)
Same stack and sets; one backtrack point is handed to Node 1
Heuristic: hand off the DEEPEST backtrack point for another node to explore
Reason: the largest number of paths emanate from there

38 Detail of (2 of 5)
Node 0: same stack; its sets become { }, {t0,t1} at the initial state and {t2}, {t1} after t0's operations

39 Detail of (2 of 5)
Node 1: starts from t0: lock with sets {t1}, {t0}

40 Detail of (2 of 5)
t1 is forced into the Done set on Node 0 before the work is handed to Node 1; Node 1 keeps t1 in its backtrack set

41 Illustration of the problem (3 of 5)
One backtrack point goes to Node 1; Node 0 decides to do the other work itself…

42 Illustration of the problem (4 of 5)
The prefix t0: lock; t0: unlock, with sets {}, {t0,t1} and {t2}, {t1}, is being expanded by Node 0, while the branch with sets {t1}, {t0} is being expanded by Node 1

43 Illustration of the problem (5 of 5)
Node 0 continues with t2: lock; t2: unlock, producing sets {t2}, {t0,t1} and {}, {t2}

44 Illustration of the problem (5 of 5)
Node 1, expanding {t1}, {t0}, runs t1: lock; t1: unlock and then t2: lock; t2: unlock

45 Illustration of the problem (5 of 5)
Node 1 also reaches {t2}, {t0, t1} and {}, {t2}: both nodes explore the same t2 branch. Redundancy!

46 New Backtrack Set Computation: Aggressively mark up the stack!
Stack: t0: lock; t0: unlock; t1: lock; t1: unlock; t2: lock; t2: unlock, with sets {t1,t2}, {t0} and {t2}, {t1}
- Update the backtrack sets of ALL dependent operations!
- Forms a good allocation scheme
- Does not involve any synchronizations
- Redundant work may still be performed
- The likelihood is reduced, because a node aggressively "owns" one operation and all its dependents

47 Implementation and Evaluation
- Using MPI for communication among nodes
- Experiments on a 72-node cluster
  - 2.4 GHz Intel Xeon processors, 2 GB memory/node
  - Two (small) benchmarks: indexer & the file-system benchmark used in Flanagan and Godefroid's DPOR paper
  - Aget: a multithreaded ftp client
  - Bbuf: an implementation of a bounded buffer

48 Sequential Checking Time
Benchmark   Threads   Runs       Time (sec)
fsbench     2         68,…       …
indexer     16        32,…       …
aget        6         113,…      …
bbuf        8         1,938,…    …

49 Speedup on indexer & fs (small examples), so diminishing returns beyond 40 nodes…

50 Speedup on aget

51 Speedup on bbuf

52 Conclusions and Future Work
- The method described is VERY promising
- We have an in-situ model checker for MPI programs also! (EuroPVM / MPI 2007)
  - Will be parallelized using MPI for work distribution!
- The C/PThread work needs to be pushed a lot more:
  - Automate instrumentation
  - Try many new examples
  - Improve the work-distribution heuristic in response to findings
  - Release the tool

53 Questions?

54 Answers !
- Properties, currently:
  - Local "assert"s
  - Deadlocks
  - Uninitialized variables
- No plans for liveness
- Tool release likely in 6 months
- That is a very good question. Let's talk!

55 Extra Slides

56 Concurrent operations on some database

Class A operations:
pthread_mutex_lock(mutex);
a_count++;
if (a_count == 1)
    pthread_mutex_lock(res);
pthread_mutex_unlock(mutex);
…
pthread_mutex_lock(mutex);
a_count--;
if (a_count == 0)
    pthread_mutex_unlock(res);
pthread_mutex_unlock(mutex);

Class B operations:
pthread_mutex_lock(mutex);
b_count++;
if (b_count == 1)
    pthread_mutex_lock(res);
pthread_mutex_unlock(mutex);
…
pthread_mutex_lock(mutex);
b_count--;
if (b_count == 0)
    pthread_mutex_unlock(res);
pthread_mutex_unlock(mutex);

57 Initial random execution
a1 : acquire mutex      b1 : acquire mutex
a2 : a_count++          b2 : b_count++
a3 : a_count == 1       b3 : b_count == 1
a4 : acquire res        b4 : acquire res
a5 : release mutex      b5 : release mutex
a6 : acquire mutex      b6 : acquire mutex
a7 : a_count--          b7 : b_count--
a8 : a_count == 0       b8 : b_count == 0
a9 : release res        b9 : release res
a10 : release mutex     b10 : release mutex

70 Dependent operations? (same trace and code as above)

71 Start an alternative execution (same trace and code as above)

72 Get a deadlock!
Executed so far: a1 : acquire mutex; a2 : a_count++; a3 : a_count == 1; a4 : acquire res; a5 : release mutex; b1 : acquire mutex; b2 : b_count++; b3 : b_count == 1
Now b4 : acquire res blocks (thread A still holds res), and a6 : acquire mutex blocks (thread B still holds mutex): deadlock