Algorithmics for Software Transactional Memory Hagit Attiya Technion
2 Moore's Law is dead … says Gordon Moore (TechWorld, 4/2005) Intel announces a drastic change in its business strategy (San Francisco, 5/2004): «Multicore is the way to boost performance» Many applications will be concurrent Forking threads is easy, Handling the conflicts is hard!
3 Transactional Synchronization A transaction is a sequence of operations by a single process on a set of data items (its data set), to be executed atomically. We consider only read/write operations. Read set: the items read by T. Write set: the items written by T. Example: Read X, Write X, Read Z, Read Y.
4 Transactional Synchronization A transaction is a sequence of operations by a single process on a set of data items (its data set), to be executed atomically. A transaction ends either by committing (all its updates take effect) or by aborting (no update is effective). Example: Read X, Write X, Read Z, Read Y, Commit/Abort.
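A minimal single-threaded sketch of the definitions above: a transaction tracks its read set and write set, buffers its writes, and either commits (all updates take effect) or aborts (none do). The class `Transaction`, the dictionary `store`, and the buffering strategy are illustrative assumptions, not something the slides prescribe.

```python
class Transaction:
    """Toy transaction over a shared dict of data items (sequential sketch)."""

    def __init__(self, store):
        self.store = store        # shared data items: name -> value
        self.read_set = set()     # items read by this transaction
        self.write_buffer = {}    # pending updates; its keys form the write set
        self.status = "active"

    def read(self, item):
        self.read_set.add(item)
        # A transaction observes its own pending write first.
        if item in self.write_buffer:
            return self.write_buffer[item]
        return self.store[item]

    def write(self, item, value):
        self.write_buffer[item] = value

    @property
    def write_set(self):
        return set(self.write_buffer)

    def commit(self):
        self.store.update(self.write_buffer)  # all updates take effect
        self.status = "committed"

    def abort(self):
        self.write_buffer.clear()             # no update is effective
        self.status = "aborted"


store = {"X": 1, "Y": 2, "Z": 3}

# Read X, Write X, Read Z, Read Y, then abort.
t1 = Transaction(store)
t1.write("X", t1.read("X") + 10)
t1.read("Z")
t1.read("Y")
t1.abort()
after_abort = dict(store)

# The same operations, then commit.
t2 = Transaction(store)
t2.write("X", t2.read("X") + 10)
t2.read("Z")
t2.read("Y")
t2.commit()
after_commit = dict(store)
```

Buffering writes until commit is only one possible design; an implementation could also update in place and undo on abort.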
5 Software Implementation of TM Data representation for transactions & data Algorithms to execute transactions in terms of low-level primitives (read, write, CAS…) on base objects (memory locations) Must provide Opacity [Guerraoui & Kapalka] or strict serializability [Papadimitriou] Obstruction freedom
6 Typical Implementation Scheme To write data item O, a transaction acquires O, possibly aborting the transaction that owns O (killer write). To read an object, a transaction takes a snapshot to check that the system has not changed since its previous reads; otherwise T is aborted (careful read). Or: validate the reads when the transaction asks to commit (optimistic read).
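The optimistic-read variant can be sketched as follows, assuming each data item carries a version counter that is bumped on every committed write. `VersionedStore`, `OptimisticTx`, and `try_commit` are hypothetical names, and the sketch is sequential: a real STM would have to make the validate-and-write-back step atomic using primitives such as CAS.

```python
class VersionedStore:
    """Data items with a per-item version counter (illustrative assumption)."""

    def __init__(self, values):
        self.values = dict(values)
        self.versions = {name: 0 for name in values}


class OptimisticTx:
    def __init__(self, store):
        self.store = store
        self.read_versions = {}   # item -> version observed at first read
        self.writes = {}          # buffered writes

    def read(self, item):
        if item in self.writes:
            return self.writes[item]
        # Remember the version seen the first time the item is read.
        self.read_versions.setdefault(item, self.store.versions[item])
        return self.store.values[item]

    def write(self, item, value):
        self.writes[item] = value

    def try_commit(self):
        # Validate: every item read must still be at the version observed.
        for item, seen in self.read_versions.items():
            if self.store.versions[item] != seen:
                return False      # abort: the read set is no longer consistent
        # Write back and bump versions so later validations notice the change.
        for item, value in self.writes.items():
            self.store.values[item] = value
            self.store.versions[item] += 1
        return True


store = VersionedStore({"X": 0, "Y": 0})
a = OptimisticTx(store)
b = OptimisticTx(store)

a.write("Y", a.read("X") + 1)   # a reads X, plans to write Y
b.write("X", b.read("X") + 5)   # b reads X, plans to write X

b_ok = b.try_commit()           # succeeds and changes X
a_ok = a.try_commit()           # fails validation: X changed since a read it
```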
7 Algorithmics Acquiring several data items Validating that a set of values read is consistent Multi-location Synchronization Atomic Snapshot
8 (Full) Atomic Snapshot [Afek et al. 1990] n components. Update a single component. Scan all the components “at once” (atomically). Provides an instantaneous view of the whole memory. Interface: update returns ok; scan returns v_1,…,v_n.
9 Partial Snapshot [Attiya, Guerraoui, Ruppert, 2008] n components. Update a single component. Scan a subset of the components “at once” (atomically). Allows reading parts of the memory. Worthwhile if we can do it more efficiently than a (full) scan. Interface: update returns ok; scan returns v_{i1},…,v_{ir}.
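One simple way to realize a partial scan is the textbook double-collect technique: read the requested components twice and return only if nothing changed in between. This is a sketch of that technique, not the algorithm of the cited papers; it is invisible (the scan writes nothing) but not wait-free, since updates can force retries. The `between_collects` hook is a hypothetical parameter used only to simulate a concurrent update in this sequential demo.

```python
class Snapshot:
    """Double-collect snapshot sketch; one writer per component is assumed."""

    def __init__(self, n):
        self.cells = [(0, 0)] * n          # (value, sequence number) per component

    def update(self, i, value):
        _, seq = self.cells[i]
        self.cells[i] = (value, seq + 1)   # new sequence number marks the change
        return "ok"

    def _collect(self, indices):
        return [self.cells[i] for i in indices]

    def partial_scan(self, indices, between_collects=None):
        while True:
            first = self._collect(indices)
            if between_collects is not None:
                between_collects()         # demo hook: simulated interference
            second = self._collect(indices)
            if first == second:            # no scanned component changed
                return [value for value, _ in second]


snap = Snapshot(4)
snap.update(0, "a")
snap.update(2, "c")

# Simulate one update to a scanned component landing between two collects.
pending = [lambda: snap.update(2, "c2")]

def interfere():
    if pending:
        pending.pop()()

result = snap.partial_scan([0, 2], between_collects=interfere)
```

Only the scanned components are compared, so an update to component 1 or 3 would not force this scan to retry.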
10 Optimizing Partial Scans Transactions that only observe the data Empty write set Be invisible (not write to base objects) Avoid contention for the memory Always terminate successfully (wait-free)
11 DAP: Disjoint Access Parallelism Example: T1: Read(Y), Write(X1); T2: Write(X2); T3: Read(X2), Read(X1). Disjoint data sets: no contention. Connected data sets: may contend. [Figure: graph over the data items Y, X1, X2, with edges labeled T1 and T3.] Improves scalability for large data structures by reducing interference.
12 DAP: More Formally An STM implementation is disjoint access parallel if two transactions T1 and T2 access the same base object ONLY IF the data sets of T1 and T2 are connected, that is, the data sets of T1 & T2 either intersect or are connected via other transactions.
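The "connected" condition can be checked with a union-find over data items: items in the same data set are merged, and two transactions are connected if their items end up in the same component. The function `connected` and the data sets below are illustrative assumptions (one reading of the T1/T2/T3 example on the DAP slide); data sets are assumed non-empty.

```python
def connected(data_sets, a, b):
    """True if the data sets of transactions a and b intersect or are
    linked through the data sets of other transactions."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    def union(x, y):
        parent[find(x)] = find(y)

    # Merge all items that appear in the same data set.
    for items in data_sets.values():
        items = list(items)
        for other in items[1:]:
            union(items[0], other)

    # Any representative item of each data set will do.
    return find(next(iter(data_sets[a]))) == find(next(iter(data_sets[b])))


with_t3 = {"T1": {"Y", "X1"}, "T2": {"X2"}, "T3": {"X1", "X2"}}
without_t3 = {"T1": {"Y", "X1"}, "T2": {"X2"}}

linked = connected(with_t3, "T1", "T2")        # connected via T3
disjoint = connected(without_t3, "T1", "T2")   # nothing links them
```

Under DAP, T1 and T2 may touch a common base object in the first scenario but not in the second.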
13 State of the Art [Table comparing algorithms. Columns: Algorithm, DAP, Invisible read-only Tx, Read-only Tx termination. Rows: [Herlihy, Luchangco, Moir & Scherer], [Avni & Shavit], [Riegel, Felber & Fetzer], [Harris, Fraser & Pratt].]
14 Inherent Tradeoff Theorem. There is no TM implementation that is DAP and has invisible & wait-free read-only transactions [Attiya, Hillel & Milani] Proof utilizes the notion of a flippable execution, used to prove lower bounds for atomic snapshot objects [Israeli, Shirazi] [Attiya, Ellen & Fatourou]
15 Flippable Execution w/ 2 Updaters [Figure: execution E_k. Updaters p1 and p2 run transactions U_0, U_1, …, U_{l-1}, U_l, …, U_k while q takes steps s_1, …, s_{l-1}, s_l, …, s_k.] Figure labels: a complete transaction in which p1 writes l-1 to X1; a read-only transaction by q that reads X1, X2.
16 Flippable Execution w/ 2 Updaters [Figure: execution E_k, with updaters p1 and p2 running U_0, …, U_k and q taking steps s_1, …, s_k.] Indistinguishable from executions where the order of (each pair of) updates is flipped, in one of two ways (forward and backward).
17 Flippable Execution: Backward Flip [Figure: execution E_k (top) and its backward flip (bottom), over the same steps s_1, …, s_k of q and the same transactions U_0, …, U_k of p1 and p2.]
18 Why Flippable Executions? Lemma 1. The read-only transaction of q cannot terminate successfully. Relies on strict serializability (linearizability): any interleaving of the transactions yields a result that can be achieved in a sequential execution of the same set of transactions (a serialization). The serialization must preserve the real-time order of (non-overlapping) transactions.
19 Serialization of E_k [Figure: execution E_k above a serialization of its update transactions U_0, …, U_k, with a possible serialization point for q's read-only transaction marked.] The read-only transaction returns (l-1, l-2).
20 Nowhere to Serialize [Figure: execution E_k and its serialization, in which q's transaction returns (l-1, l-2); below, the backward flip (BW Flip) of E_k and its serialization, with the candidate serialization point crossed out. Values shown: X1 = l-2, X2 = l-3; X1 = l, X2 = l-3; X1 = l, X2 = l-1.] E_k is indistinguishable from some flip (say, backward), so the read-only transaction still returns (l-1, l-2).
21 Constructing a Flippable Execution Lemma 2. In a DAP TM, two consecutive transactions writing to different data items do not contend on the same base object. [Figure: consecutive transactions U1 and U2.]
22 Proof of Lemma 2 Towards a contradiction, assume U1 and U2 contend on a base object. Let o be the last base object accessed by U1 for which U2 has a contending access. [Figure: U1 followed by U2, marking the last contending access to o and the first contending access to o.]
23 Proof of Lemma 2 Towards a contradiction, assume U1 and U2 contend on a base object. [Figure: U1 and U2, marking the last contending access to o and the first contending access to o.] U1 and U2 have disjoint data sets & contend on a base object: not DAP.
24 Completing the Proof Show that a flippable execution exists: The steps of the read-only transaction can be removed (since it is invisible). Since their data sets are disjoint, transactions U_l & U_{l-1} do not “communicate” (by Lemma 2), so they can be flipped. By Lemma 1, the read-only transaction cannot terminate successfully. If it aborts, the same argument can be applied again…
25 Extensions… Also a Lower Bound: A transaction with a data set of size t must write to t-2 base objects The results hold also for weaker TM consistency conditions: Serializability Snapshot Isolation
26 Algorithmics Acquiring several locations Validating that a set of values read is consistent Multi-location Synchronization Atomic Snapshot
28 Multi-Location Synchronization Acquire nodes in the data set How to resolve conflicts with blocking operations (holding a required data item)? To reduce delays To avoid deadlock
29 Multi-Location Synchronization: Classical Solution E.g., [Touitou & Shavit] Acquire items in increasing order (e.g., of addresses). Avoids deadlocks, but not delays. Cannot be applied when data items are given piecemeal…
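A minimal sketch of the classical ordered-acquisition idea, using Python's `threading.Lock` and item names as a stand-in for addresses. The names `locks`, `balance`, and `run_transaction` are hypothetical. Two threads request the same items in opposite orders; sorting before acquiring removes the circular wait that would otherwise allow deadlock.

```python
import threading

locks = {name: threading.Lock() for name in ("X", "Y", "Z")}
balance = {"X": 0, "Y": 0, "Z": 0}

def run_transaction(items, delta):
    # Acquire in one global order (here: by name), whatever order was requested.
    ordered = sorted(set(items))
    for name in ordered:
        locks[name].acquire()
    try:
        for name in ordered:
            balance[name] += delta
    finally:
        for name in reversed(ordered):
            locks[name].release()

def worker(items):
    for _ in range(1000):
        run_transaction(items, 1)

threads = [
    threading.Thread(target=worker, args=(["X", "Y"],)),
    threading.Thread(target=worker, args=(["Y", "X"],)),  # opposite request order
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Note that the whole data set must be known before the first lock is taken, which is the limitation the slide points out.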
30 Multi-Location Synchronization: Practical STM Implementations Use a contention manager to decide which transaction gets the object; the other operation aborts & restarts. But this requires support from the operating system, and does not have provable guarantees.
31 Multi-Location Synchronization: More Concurrency [Attiya, Hillel] Acquire locks in arbitrary order. No need to know the data set in advance. When there is a conflict between two transactions contending for a data item, either wait for the other operation, or abort & reset the other operation.
32 Distributed Conflict Resolution Depends on the operations’ progress: the more advanced operation wins. 1. How to gauge progress? 2. What to do on a tie?
33 Who’s More Advanced? The operation that locked more data items. If a less advanced operation needs an item: wait (blocking in a limited radius) or help the conflicting operation. If a more advanced operation needs an item: abort the conflicting operation and claim the item.
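The rule above can be sketched sequentially as follows. `Op`, `request`, and the `owners` map are hypothetical names; the sketch only returns a decision for the waiting and tie cases, and it leaves out helping and the actual tie-breaking mechanism.

```python
from dataclasses import dataclass, field

@dataclass
class Op:
    name: str
    locked: set = field(default_factory=set)   # items this operation holds

def request(op, item, owners):
    """Try to lock `item` for `op`; `owners` maps item -> owning operation."""
    owner = owners.get(item)
    if owner is None or owner is op:
        owners[item] = op
        op.locked.add(item)
        return "acquired"
    if len(op.locked) > len(owner.locked):
        # Requester is more advanced: abort the owner and claim the item.
        for held in list(owner.locked):
            del owners[held]
        owner.locked.clear()
        owners[item] = op
        op.locked.add(item)
        return "aborted " + owner.name
    if len(op.locked) < len(owner.locked):
        return "wait"          # less advanced: wait for (or help) the owner
    return "tie"               # same progress: needs a separate tie-breaker

owners = {}
a, b, c, d = Op("a"), Op("b"), Op("c"), Op("d")
request(a, "X", owners)
request(a, "Y", owners)
request(b, "Z", owners)

r1 = request(b, "X", owners)   # b holds 1 item, a holds 2
r2 = request(a, "Z", owners)   # a holds 2 items, b holds 1

request(c, "W", owners)
request(d, "V", owners)
r3 = request(c, "V", owners)   # c and d hold 1 item each
```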
34 What about Ties? When two transactions locked the same number of items: each transaction has a descriptor with a lock; use DCAS to race for locking the two descriptors. Winner calls the shots…
35 Wrap-Up Can be shown to guarantee progress in small neighborhoods: if I take many steps, then a transaction within “short distance” completes, where distance is measured in a graph on the transactions defined by data conflicts. The distance depends on the size of the data set. Can be made non-blocking / wait-free by integrating a helping mechanism.
36 Measuring Concurrency: Spatial Relations between Operations A set of operations induces a conflict graph: nodes represent items; edges connect items of the same operation. Chains of operations correspond to paths. Distance is the length of the path (2 in the slide's figure).
37 Measuring Concurrency: Spatial Relations between Operations Disjoint access: non-adjacent edges, distance is infinite. Overlapping operations: adjacent edges, distance is 0. d-neighborhood of an operation: all operations at distance ≤ d.
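Distance and d-neighborhoods can be computed with a breadth-first search over operations. The convention used here is an assumption chosen to match the slide: distance counts the operations strictly between the two on a shortest chain, so overlapping operations are at distance 0 and unconnected ones at infinite distance. The function names and the sample operations are hypothetical.

```python
from collections import deque

def distance(ops, a, b):
    """Operations strictly between a and b on a shortest chain of
    overlapping operations; infinity if no chain exists."""
    if a == b:
        return 0
    seen = {a}
    queue = deque([(a, 0)])
    while queue:
        current, hops = queue.popleft()
        for other in ops:
            if other in seen or not (ops[current] & ops[other]):
                continue                  # already visited, or no shared item
            if other == b:
                return hops               # hops = operations between a and b
            seen.add(other)
            queue.append((other, hops + 1))
    return float("inf")

def neighborhood(ops, a, d):
    """All other operations at distance <= d from a."""
    return {o for o in ops if o != a and distance(ops, a, o) <= d}

# Each operation is described by the set of items it accesses.
ops = {"A": {1, 2}, "B": {2, 3}, "C": {3, 4}, "D": {7, 8}}

d_ab = distance(ops, "A", "B")   # overlapping
d_ac = distance(ops, "A", "C")   # chained through B
d_ad = distance(ops, "A", "D")   # disjoint access
near = neighborhood(ops, "A", 1)
```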
38 Formally: Interference between Operations Disjoint access: no interference. Overlapping operations: interference inevitable. Non-overlapping operations: should not interfere! Provides more concurrency & yields better throughput.
39 A Lot More to be Done Better algorithms for partial snapshots, incorporated into full-fledged STM Other lower bounds (w/ weaker notions of DAP or weaker consistency conditions) More precise definitions of concurrency (& their practical validation) Better and simpler algorithms Definitions of other STM properties (liveness, responsiveness)
Thank you!