Shape Analysis for Fine-Grained Concurrency using Thread Quantification Josh Berdine Microsoft Research Joint work with: Tal Lev-Ami, Roman Manevich, Mooly.

Presentation transcript:

Shape Analysis for Fine-Grained Concurrency using Thread Quantification Josh Berdine Microsoft Research Joint work with: Tal Lev-Ami, Roman Manevich, Mooly Sagiv (Tel Aviv), Ganesan Ramalingam (MSR India)

2 Non-blocking stack [Treiber, 86]

void push(Stack *S, data_type v) {
[1]    Node *x = alloc(sizeof(Node));
[2]    x->d = v;
[3]    do {
[4]        Node *t = S->Top;
[5]        x->n = t;
[6]    } while (!CAS(&S->Top,t,x));
[7] }

data_type pop(Stack *S) {
[8]    do {
[9]        Node *t = S->Top;
[10]       if (t == NULL)
[11]           return EMPTY;
[12]       Node *s = t->n;
[13]       data_type r = t->d;
[14]   } while (!CAS(&S->Top,t,s));
[15]   return r;
[16] }

CAS(&S->Top,t,x) atomically performs: if (S->Top == t) { S->Top = x; evaluate to true; } else evaluate to false;

Challenges
– benign data races
– unbounded number of threads

Questions
– t points to valid memory?
– list remains acyclic?
– Stack linearizable?
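The slide code is pseudo-C: alloc and CAS are not standard functions, and t is used in the loop condition outside the block that declares it. The following is a minimal compilable sketch of the same algorithm using C11 <stdatomic.h>. The element type int, the sentinel value of EMPTY, and the helpers stack_init and treiber_demo are assumptions made for illustration and do not come from the talk.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

#define EMPTY (-1)          /* assumed sentinel; the slide leaves EMPTY undefined */
typedef int data_type;      /* assumed element type */

typedef struct Node {
    data_type d;            /* payload */
    struct Node *n;         /* next node in the list */
} Node;

typedef struct Stack {
    _Atomic(Node *) Top;    /* global access point, updated only by CAS */
} Stack;

/* Hypothetical helper: start with an empty stack. */
void stack_init(Stack *S) {
    atomic_init(&S->Top, NULL);
}

void push(Stack *S, data_type v) {
    Node *x = malloc(sizeof *x);
    assert(x != NULL);
    x->d = v;
    Node *t = atomic_load(&S->Top);
    do {
        x->n = t;
        /* On failure the library call stores the current Top into t,
         * which plays the role of re-reading S->Top at label [4]. */
    } while (!atomic_compare_exchange_weak(&S->Top, &t, x));
}

data_type pop(Stack *S) {
    Node *t = atomic_load(&S->Top);
    Node *s;
    data_type r;
    do {
        if (t == NULL)
            return EMPTY;
        s = t->n;
        r = t->d;
    } while (!atomic_compare_exchange_weak(&S->Top, &t, s));
    /* The popped node is deliberately not freed: reclaiming it safely
     * under concurrency needs extra machinery that the slide omits. */
    return r;
}

/* Hypothetical single-threaded smoke test: returns 1 if LIFO order holds. */
int treiber_demo(void) {
    Stack S;
    stack_init(&S);
    push(&S, 4);
    push(&S, 7);
    return pop(&S) == 7 && pop(&S) == 4 && pop(&S) == EMPTY;
}
```

The sketch only exercises the stack from one thread. It says nothing about the questions on the slide (memory validity, acyclicity, linearizability), which are what the analysis in the talk is meant to establish.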

3 Linearizability [Herlihy and Wing, TOPLAS'90]

Linearizable data structure
– Concurrent operations allowed to be interleaved
– Operations appear to execute atomically
   External observer gets the illusion that each operation takes effect instantaneously at some point between its invocation and its response
   Order of operations of same thread preserved
– Sequential specification defines legal sequential executions

[Figure: timelines for a concurrent LIFO (last in, first out) stack with threads T1 and T2; operations shown: push(4), push(7), pop():4.]

4 Non-linearizable pairs stack

void push2(Stack *S, data_type v1, data_type v2) {
    push(S, v1);
    push(S, v2);
}

void pop2(Stack *S, data_type *v1, data_type *v2) {
    *v2 = pop(S);
    *v1 = pop(S);
}

[Figure: timeline with push2(4,5), push2(7,8), and pop2():8,5; the result 8,5 matches no legal sequential execution.]
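One way to see why the pairs stack is not linearizable is to replay a single interleaving by hand. The sketch below assumes each individual push and pop is atomic, so a small sequential array stack stands in for the concurrent one, and it assumes pop2 runs after both push2 calls have finished. The array-based Stack and the functions interleaved_history and sequential_history are hypothetical names introduced for this illustration.

```c
#include <assert.h>

#define CAPACITY 16
typedef int data_type;

/* Sequential array stack standing in for an atomic push/pop (assumption). */
typedef struct { data_type items[CAPACITY]; int size; } Stack;

void push(Stack *S, data_type v) {
    assert(S->size < CAPACITY);
    S->items[S->size++] = v;
}

data_type pop(Stack *S) {
    assert(S->size > 0);
    return S->items[--S->size];
}

void push2(Stack *S, data_type v1, data_type v2) { push(S, v1); push(S, v2); }
void pop2(Stack *S, data_type *v1, data_type *v2) { *v2 = pop(S); *v1 = pop(S); }

/* Replays one interleaving of T1: push2(4,5) and T2: push2(7,8),
 * followed by a single pop2. */
void interleaved_history(data_type *v1, data_type *v2) {
    Stack S = { {0}, 0 };
    push(&S, 4);   /* T1: first push of push2(4,5)  */
    push(&S, 7);   /* T2: first push of push2(7,8)  */
    push(&S, 8);   /* T2: second push of push2(7,8) */
    push(&S, 5);   /* T1: second push of push2(4,5) */
    pop2(&S, v1, v2);
}

/* The two legal sequential orders of the same two push2 calls. */
void sequential_history(int t1_first, data_type *v1, data_type *v2) {
    Stack S = { {0}, 0 };
    if (t1_first) { push2(&S, 4, 5); push2(&S, 7, 8); }
    else          { push2(&S, 7, 8); push2(&S, 4, 5); }
    pop2(&S, v1, v2);
}
```

In this replay the interleaved history yields the pair 8,5, while the two sequential orders yield 7,8 and 4,5. No sequential execution of push2 and pop2 explains 8,5, which is the point of the slide.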

6 Outline

Motivation + what is linearizability
Universally quantified shape abstractions
Checking linearizability
Case studies

7 Concurrent heaps [Yahav, POPL01]

Heaps contain both threads and objects

[Figure: a concurrent heap. Legend: thread objects with program counter (pc=6, pc=2), thread-local variables (x, t), global variable (Top), list objects, list field (n).]

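8 Concurrent heaps [Yahav, POPL01]

Heaps contain both threads and objects
– Logical structure, or
– Formula in subset of FO^TC [Yorsh et al., TOCL07]

[Figure: a concurrent heap with threads tr_1 (pc=6) and tr_2 (pc=2), objects v_1, v_2, v_3, local variables x and t, global variable Top, and list field n.]

$pc(tr_1)=6 \land pc(tr_2)=2 \land \exists v_1,v_2,v_3.\ Top(v_1) \land x(tr_1,v_2) \land t(tr_1,v_1) \land x(tr_2,v_3) \land n(v_2,v_1) \land \dots$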

9 Unbounded concurrent heaps

(push code as on slide 2)

Unbounded parallel composition: push(Top,?) || ... || push(Top,?)

[Figure: a heap with many thread objects (pc=1, 2, 5, 6), their local variables x and t, the global variable Top, and list objects linked by n.]

10 Thread-relative subheaps

Each subheap
– Presents a view of heap relative to one thread
– Can be instantiated zero or more times

[Figure: thread-relative subheaps for threads at pc=1, 2, 5, and 6, each with its local variables x and t and the list reachable from Top.]

11 Bounded thread-relative subheaps

Each subheap
– Presents a view of heap relative to one thread
– Can be instantiated zero or more times
– Bounded by finitary abstraction

[Figure: bounded thread-relative subheaps for threads at pc=1, 2, 4, and 6, each with its local variables x and t and the list reachable from Top.]

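12 Concurrent heap

[Figure: a concurrent heap with threads tr_1 (pc=6) and tr_2 (pc=2), objects v_1, v_2, v_3, local variables x and t, global variable Top, and list field n.]

$pc(tr_1)=6 \land pc(tr_2)=2 \land \exists v_1,v_2,v_3.\ Top(v_1) \land x(tr_1,v_2) \land t(tr_1,v_1) \land x(tr_2,v_3) \land n(v_2,v_1) \land \dots$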

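13 Universally quantified local heaps

[Figure: two local heaps over a symbolic thread t: one at pc=6 (x, t, Top, objects v_1 and v_2, field n) and one at pc=2 (x, Top, objects v_1 and v_3).]

$\forall t.\ \big(pc(t)=6 \land \exists v_1,v_2.\ Top(v_1) \land x(t,v_2) \land t(t,v_1) \land n(v_2,v_1) \land \dots\big) \lor \big(pc(t)=2 \land \exists v_1,v_3.\ Top(v_1) \land x(t,v_3) \land \dots\big)$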

14 Meaning of quantified invariant

$\forall t.\ \big(pc(t)=6 \land \exists v_1,v_2.\ Top(v_1) \land x(t,v_2) \land t(t,v_1) \land n(v_2,v_1) \land \dots\big) \lor \big(pc(t)=2 \land \exists v_1,v_3.\ Top(v_1) \land x(t,v_3) \land \dots\big)$

Information maintained
– (dis)equalities between local variables of each thread and global variables
– Objects reachable from global variables
Information lost
– (dis)equalities between local variables of different threads
– Number of threads

[Figure: concrete heaps represented by the invariant, with threads at pc=1, 2, 3, and 6 and local heaps repeated with multiplicities ×m and ×n.]
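To make the reading of the quantified invariant concrete, the sketch below checks it against a concrete snapshot: every thread must satisfy one of the two disjuncts, and each thread is checked on its own. Only the conjuncts visible on the slide are checked; the parts elided by "…" are left out. The Thread struct and the functions local_heap_ok, invariant_holds, and invariant_demo are hypothetical and do not come from the talk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct Node { int d; struct Node *n; } Node;

/* Hypothetical snapshot of one thread running push: its program counter
 * and its local variables x and t. */
typedef struct { int pc; Node *x; Node *t; } Thread;

/* Does this thread's local heap satisfy one of the visible disjuncts? */
bool local_heap_ok(const Thread *th, const Node *Top) {
    /* Both disjuncts need a node for Top and a node for x. */
    if (Top == NULL || th->x == NULL)
        return false;
    if (th->pc == 6)   /* t points to Top's node and x->n points to it too */
        return th->t == Top && th->x->n == Top;
    if (th->pc == 2)   /* no further visible constraints */
        return true;
    return false;      /* no disjunct for any other pc in this sketch */
}

/* Universal quantification over threads: each is checked in isolation. */
bool invariant_holds(const Thread *threads, size_t count, const Node *Top) {
    for (size_t i = 0; i < count; i++)
        if (!local_heap_ok(&threads[i], Top))
            return false;
    return true;       /* vacuously true when count == 0 */
}

/* Hypothetical scenario: returns 0 once a thread violates the invariant. */
int invariant_demo(void) {
    Node top = { 4, NULL };
    Node a = { 7, &top };
    Node b = { 8, NULL };
    Thread threads[] = {
        { 6, &a, &top },
        { 2, &b, NULL },
        { 6, &a, &top },   /* shares x with thread 0 and is still accepted */
    };
    assert(invariant_holds(threads, 3, &top));
    assert(invariant_holds(threads, 0, &top));
    threads[1].pc = 6;     /* neither its t nor its x->n points to Top's node */
    return invariant_holds(threads, 3, &top) ? 1 : 0;
}
```

Because each thread is checked in isolation, the check accepts any number of threads and also accepts two threads whose x variables alias. This mirrors the "information lost" items on the slide.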

15 Outline

Motivation + what is linearizability
Universally quantified shape abstractions
Checking linearizability
Case studies

17 Verification of fixed linearization points [Amit et al., CAV07]

Compare each concurrent execution to a specific sequential execution
Show that every (terminating) concurrent operation returns the same result as its sequential counterpart

[Figure: a concurrent execution and a sequential execution, each operation marked with its linearization point, with results compared; both are combined into a conjoined execution in which results are compared.]

18 Conjoined execution for push

(push code as on slide 2, with the linearization point marked "on CAS" at [6])

[Figure: the concurrent state (Top, thread at pc=1) and the sequential view (Top), connected by an isomorphism relation.]

19 Conjoined execution for push

(push code as on slide 2, with the linearization point marked "on CAS" at [6])

[Figure: the conjoined state (Top, thread at pc=1); labels: conjoined state, duo-object.]

20 Conjoined execution for push

(push code as on slide 2, with the linearization point marked "on CAS" at [6])

Delta object tracks differences between concurrent and sequential execution per thread

[Figure: conjoined states at pc=1 and at pc=2; at pc=2 the thread's x points to a delta object.]

21 Conjoined execution for push

(push code as on slide 2, with the linearization point marked "on CAS" at [6])

CAS(&S->Top,t,x) atomically performs: if (S->Top == t) { S->Top = x; evaluate to true; } else evaluate to false;

[Figure: conjoined states as the thread advances through pc=1, pc=2, pc=5, …, pc=6, pc=7.]

22 Run operation sequentially

(push code as on slide 2, with the linearization point marked "on CAS" at [6])

[Figure: successive conjoined states with the concurrent thread at pc=7 while the sequential push runs, showing x, t, and n being set.]

Check results: concurrent and sequential stacks are correlated
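The conjoined-execution idea can be illustrated at run time, although the talk applies it statically over all executions. In the sketch below a linked stack plays the concurrent implementation and an array stack plays the sequential specification. Whenever an operation passes its linearization point, the matching sequential operation runs and the results are compared. Several things are assumptions made for illustration: the Conjoined struct and its mismatch counter, the non-atomic CAS helper, the choice to linearize an empty pop at its read of Top, and the restriction to non-negative data so that EMPTY is unambiguous. The sketch is single-threaded; in a real concurrent run the sequential step would have to happen atomically with the CAS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define EMPTY (-1)          /* assumed sentinel; pushed data must be >= 0 */
#define CAPACITY 64
typedef int data_type;
typedef struct Node { data_type d; struct Node *n; } Node;

typedef struct {
    Node *Top;                  /* linked stack: the "concurrent" side   */
    data_type seq[CAPACITY];    /* array stack: the sequential reference */
    int seq_size;
    int mismatches;             /* results that disagreed                */
} Conjoined;

/* Single-threaded stand-in for CAS with the semantics given on the slide. */
static bool CAS(Node **addr, Node *expected, Node *desired) {
    if (*addr == expected) { *addr = desired; return true; }
    return false;
}

/* Sequential specification of pop. */
static data_type seq_pop(Conjoined *C) {
    return C->seq_size > 0 ? C->seq[--C->seq_size] : EMPTY;
}

void push(Conjoined *C, data_type v) {
    Node *x = malloc(sizeof *x);
    assert(x != NULL);
    x->d = v;
    Node *t;
    do {
        t = C->Top;
        x->n = t;
    } while (!CAS(&C->Top, t, x));
    /* Linearization point passed (successful CAS): run the sequential push. */
    assert(C->seq_size < CAPACITY);
    C->seq[C->seq_size++] = v;
}

data_type pop(Conjoined *C) {
    Node *t, *s;
    data_type r;
    do {
        t = C->Top;
        if (t == NULL) {
            /* Assumed linearization point of an empty pop: the read of Top. */
            if (seq_pop(C) != EMPTY)
                C->mismatches++;
            return EMPTY;
        }
        s = t->n;
        r = t->d;
    } while (!CAS(&C->Top, t, s));
    /* Linearization point passed (successful CAS): compare with the spec. */
    if (seq_pop(C) != r)
        C->mismatches++;
    free(t);    /* safe only because this sketch is single-threaded */
    return r;
}

/* Hypothetical driver: returns the number of disagreements found. */
int conjoined_demo(void) {
    Conjoined C = { NULL, {0}, 0, 0 };
    push(&C, 4);
    push(&C, 7);
    data_type a = pop(&C);
    data_type b = pop(&C);
    data_type c = pop(&C);
    return C.mismatches + (a != 7) + (b != 4) + (c != EMPTY);
}
```

A result of 0 from conjoined_demo means the linked stack and the reference stack agreed on every result in this one run. The analysis in the talk aims to show the same agreement for every concurrent execution and any number of threads.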

23 Observations used

Unbounded number of heap objects
– Number of delta objects created per thread is bounded
– Objects in recursive data structures bounded by existing shape abstractions
Delta objects always referenced by local or global variables
– Captured by a single thread's view of the heap
Threads mutate data structures near global access points
– Can precisely model success/failure of CAS without looking deep into heap
Losing most inter-thread correlations is ok
– Fine-grained programs must protect themselves from interference

24 Outline

Motivation + what is linearizability
Universally quantified shape abstractions
Checking linearizability
Case studies

25 Case studies

Columns: Verified programs, #states, time (sec.)
Non-blocking stack [Treiber 1986]
Two-lock queue [Michael & Scott, PODC 1996] 3,
Non-blocking queue [Doherty & Groves, FORTE 2004] 10,

26 Related work

[Gotsman et al., PLDI07]
– Thread-modular shape analysis for coarse-grained concurrency
[Vafeiadis et al., 06, 07, 08]
– Linearizability for an unbounded number of threads with rely-guarantee & separation logic

27 Conclusion

Strengths
– Parametric shape abstraction for an unbounded number of threads
– Verifies linearizability of fine-grained concurrent implementations
– Tunable scalability via thread-modular aspects
– Tunable precision via abstract semantics using multiple instantiations of invariants
Limitations / Future work
– Fixed, specified, linearization points
– Setting the framework's knobs optimally can be difficult, and requires understanding the program
– Only as good as underlying heap abstraction
– Does not prove encapsulation of data structure
– May want to prove more than linearizability

29 An unbounded state

(push code as on slide 2, with the linearization point marked "on CAS" at [6])

[Figure: a state with many threads (pc=1, 2, 4, 6), their local variables x and t, the global variable Top, and an unbounded number of delta objects.]

30 Bounded local states

Number of delta objects per local heap bounded

[Figure: local states for a thread at pc=1, pc=2, pc=4, and pc=6, each with Top, the list linked by n, and the thread's x and t where set.]

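31 Loss of non-aliasing information

$\forall t.\ pc(t)=6 \land \exists v_1,v_2.\ Top(v_1) \land x(t,v_2) \land t(t,v_1) \land n(v_2,v_1) \land \dots$

[Figure: local heaps at pc=6 (x, n, Top, t) and a concrete heap in which two threads' x variables point to the same object; label: unwanted aliasing.]

Consider x->n=t
Remedy: record non-aliasing information explicitly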

32 Adding non-aliasing information

$\forall t.\ pc(t)=6 \land \exists v_1,v_2.\ Top(v_1) \land x(t,v_2) \land t(t,v_1) \land n(v_2,v_1) \land Private(v_1) \land Private(v_2) \land \dots$

[Figure: local heaps at pc=6 with objects marked P (private): referenced by exactly one thread.]

33 Adding non-aliasing information

$\forall t.\ pc(t)=6 \land \exists v_1,v_2.\ Top(v_1) \land x(t,v_2) \land t(t,v_1) \land n(v_2,v_1) \land Private(v_1) \land Private(v_2) \land \dots$

[Figure: local heaps at pc=6 with objects marked P (private).]

Operation on private objects invisible to other threads

34 Recap

Add universal quantification on top of finitary heap abstractions
– Handle unbounded number of threads
Local heaps can overlap
– Handle fine-grained concurrency
Strengthen local heaps by Private predicate
– Private objects cannot be affected by actions of other threads
Missing: transformers (see paper)

35 Shape analysis with delta abstraction for unbounded threads

Tracks bounded differences between concurrent and sequential execution per thread
– Abstracts two heaps together
– Handles unbounded number of threads
Abstracts correlations between threads
– Thread-modular characteristics

36 Linearization points for Treiber's stack

(push and pop code as on slide 2)

Linearization points: the successful CAS at [6] in push and the successful CAS at [14] in pop

37 What's missing from the talk?

Generic technique for lifting abstract domains with universal quantifiers
Abstract transformers
– Thread instantiation
Combining universal quantification with heap decomposition

38 Can you handle mutex?

Yes with Canonical Abstraction, using nested quantification over threads: t_1. { … t_2. … }
Not with Boolean Heaps
– Only one level of quantification

39 Requirements from base domain

Support free variables (u,v,w)
Support join and meet operations

Constructing the correlation relation

Incrementally constructed during execution
Nodes allocated by matching push operations are correlated
Correlated nodes have equal data values
– Show that matching pops return data values of correlated nodes

41 Fixed linearization points

Every operation has a (user-specified) fixed linearization point
– Statement at which the operation appears to take effect
Show that these linearization points are correct for every concurrent execution
User may specify
– Several (alternative) linearization points
– Certain types of conditional linearization points, e.g., successful CAS operations

Stack's most-general client

void client(Stack *S) {
    do {
        if (?)                  /* nondeterministic choice */
            push(S, rand());
        else
            pop(S);
    } while (1);
}
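The client above is a verification device: `?` stands for a nondeterministic choice and the loop never ends, so it is not meant to be run. For testing rather than proof, a bounded randomized stand-in can be sketched as follows. Here rand() % 2 replaces `?`, a steps parameter replaces while (1), and a sequential array stack stands in for the concurrent one. These substitutions, the return value of client, and the EMPTY sentinel are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>

#define EMPTY (-1)          /* assumed sentinel; rand() never returns it */
#define CAPACITY 1024
typedef int data_type;

/* Sequential array stack used as a stand-in implementation (assumption). */
typedef struct { data_type items[CAPACITY]; int size; } Stack;

void push(Stack *S, data_type v) {
    assert(S->size < CAPACITY);
    S->items[S->size++] = v;
}

data_type pop(Stack *S) {
    return S->size > 0 ? S->items[--S->size] : EMPTY;
}

/* Bounded, randomized stand-in for the most-general client.
 * Returns pushes minus successful pops; steps must not exceed CAPACITY. */
int client(Stack *S, int steps) {
    int net = 0;
    for (int i = 0; i < steps; i++) {
        if (rand() % 2) {               /* stands in for the slide's `?` */
            push(S, rand());
            net++;
        } else if (pop(S) != EMPTY) {
            net--;
        }
    }
    return net;
}
```

A random driver explores only some call sequences, whereas the most-general client in the talk stands for all of them.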

43 Main results

New parametric shape analysis
– Universally quantified shape abstractions
   Extra level of quantification over shape abstraction
– Fine-grained concurrency
– Unbounded number of threads
– Thread-modular aspects
Sound transformers
Application
– Checking linearizability of concurrent data structures