Scaling Model Checking of Dataraces Using Dynamic Information Ohad Shacham Tel Aviv University IBM Haifa Lab Mooly Sagiv Tel Aviv University Assaf Schuster Technion
Datarace Happens when two threads access a memory location concurrently At least one access is a write Unpredictable results Can indicate bugs Hard to detect Hard to reproduce
Datarace example TicketPurchase(NumOfTickets) { if (NumOfTickets · FreeTickets) FreeTickets -= NumOfTickets else Print “ Full ” ; }
Datarace example Thread IThread II TicketPurchase(2) if (NumOfTickets · FreeTickets) TicketPurchase(4) if (NumOfTickets · FreeTickets) FreeTickets -= NumOfTickets {FreeTickets = -2} {FreeTickets = 4} TicketPurchase(NumOfTickets) { if (NumOfTickets · FreeTickets) FreeTickets -= NumOfTickets else Print “ Full ” ; }
Datarace example TicketPurchase(NumOfTickets) { Lock(lock FreeTickets ) if (NumOfTickets · FreeTickets) FreeTickets -= NumOfTickets else Print “ Full ” ; Unlock(lock FreeTickets ) }
Datarace detection Static datarace detection tools Racex [Engler and Ashcraft] TVLA [Sagiv et. al.] Dynamic datarace detection tools: Lamport ’ s happens-before partial order (Djit) Lock based techniques (Lockset)
Difficulties in model checking dataraces Infinite state space Huge number of interleavings Huge transition systems Size problem
Observation Programs maintaining a locking discipline are dataraces free
Hybrid solution Dynamically check a locking discipline Produce witnesses for dataraces using a model checker Explore suffixes of the trace
Basic idea
Algorithm flow Lockset Multithreaded program List of Warnings Find a 1 11 22 Model Checker
Lockset Savage et. al. SOSP 1997 Lockset invariant multiple accesses to a specific memory location are guarded by a unique lock
Lockset example Lock(lock x ) X = 7 Unlock(lock x ) Lock(lock y ) Z = Y Lock(lock y ) Y = 2 Unlock(lock y ) Lock(lock y ) Y = X {lock x } {lock y } Thread IThread IIC(X) {lock x, lock y } {lock x } {lock y } Unlock(lock y ) Locks I Locks II
Lockset Advantage Predict dataraces which may occur in a different thread interleaving Disadvantages Spurious dataraces Hard to use Lack of trace
Lockset strength Lock(lock x ); X = 7; Unlock(lock x ); Lock(lock y ); Z = Y; Lock(lock y ); Y = 2; Unlock(lock y ); Lock(lock y ); Y = X; {lock x } {lock y } Thread IThread IIC(X) {lock x, lock y } {lock x } {lock y } Unlock(lock y ); Locks I Locks II
Our hybrid solution Combine Lockset & Model Checking Provide witnesses for dataraces Rare dataraces Dataraces in large programs Model Checking Provide witnesses for rare DR Lockset scale for large programs+
A witness for a datarace a1a1 a2a2 11 22 m a 1 = m a 2 a 1 =Write Ç a 2 =Write
Required data from Lockset X=7 Y=2 Z=Y Y=X Thread IThread II
Lockset provides for each warning only a single access event a 2 Find a prior access event a 1 which can take part in a race with a 2 a1a1 a2a2 Using Lockset data
X = 7 Z = Y Y = 2 Y = X A Warning on X X=7 Z=Y Y=2 Y=X {lock x } {lock y }
Prefix a1a1 a2a2
MODEL without t 1 Building a model a1a1
Using a model checker a1a1 11 a2a2 Is a 2 reachable by t 2 ?
Using a model checker a2a2 a1a1 a1a1
Reduce the model checker cost Reduction in the model size Elimination of thread t 1 Providing a single new initial configuration Heuristically reducing the number of steps that the model checker should carry out
Example Lock(lock x ); X = 7; Unlock(lock x ); Lock(lock y ); Z = Y; Lock(lock y ); Y = 2; Unlock(lock y ); Lock(lock y ); Y = X; {lock x } {lock y } Thread IThread IIC(X) {lock x, lock y } {lock x } {lock y } Unlock(lock y ); Lock(lock y ); Y = 2; Unlock(lock y ); Lock(lock y ); Y = X; 11 22 X = 7; Locks I Locks II
Prototype implementation A prototype tool based on IBM tools Lockset – The IBM Watson tool Wolf – IBM Haifa ’ s software model checker
Prototype implementation Lockset Multi-threaded program List of Warnings Find a 1 Extend 1 Wolf 11 22
Benchmark programs LinesDescriptionProgram 706 traveling salesman from ETH Tsp 708 Enhanced traveling salesman Our_tsp 3751 Multithreaded raytracer from specjvm98 mtrt Web Crawler Kernel from ETH Hedc 362 Parallel sort SortArray 129 Finds prime numbers in a given interval PrimeFinder 150 Elevator simulator Elevsim 166 Shared DB simulator DQueries
Experimental results 4 threads3 threads2 threadsProgram Memory (MB) Time (sec) Memory (MB) Time (sec) Memory (MB) Time (sec) Mem Out our_tsp Mem Out SortArray PrimeFinder ElevSim DQueries Hedc Mem OutOut Mem tsp
Conclusion Hybrid technique which combines dynamic datarace detector and a model checker Provide witnesses for dataraces which occur only in rare interleavings Helps the user in analyzing the datarace No spurious dataraces