Window-Based Greedy Contention Management for Transactional Memory
Gokarna Sharma (LSU), Brett Estrade (Univ. of Houston), Costas Busch (LSU)
DISC 2010
Transactional Memory - Background
The emergence of multi-core architectures
– Opportunities and challenges
How to handle access to shared data?
– Locks, monitors, …
Transactional memory (TM) is an alternative synchronization abstraction
– Simple, composable, …
Three types
– Hardware, software, and hybrid TMs
– Our focus is on STM systems
STM Systems
Progress is ensured through a contention management (CM) policy
If transactions modify different data, everything is OK
If transactions modify the same data, conflicts arise that must be resolved; this is the job of the contention management policy
Of particular interest are greedy contention managers
– Transactions restart immediately after every abort
Prior Work
Mostly empirical evaluation
Theoretical analysis
– [Guerraoui et al., PODC’05] Greedy contention manager: competitive ratio = O(s²) (s is the number of shared resources)
– [Attiya et al., PODC’06] Improved to O(s)
– [Schneider & Wattenhofer, ISAAC’09] RandomizedRounds contention manager: competitive ratio = O(C log n) (C is the maximum number of conflicting transactions and n is the number of transactions)
– [Attiya & Milani, OPODIS’09] Bimodal scheduler: competitive ratio = O(s) (for bimodal workloads with equi-length transactions)
Our Contributions
Execution window model for TM
Makespan bound of any CM algorithm based on the contention measure C within the window and the window parameters M and N
Two new randomized contention management algorithms that are very close to O(s)-competitive
An adaptive version that adapts to the amount of contention C
[Figure: execution window of M threads with N transactions per thread]
Roadmap
Previous TM models and problem complexity
Our TM model
Our algorithms and proof ideas
Previous TM Models
One-shot scheduling problem
– n transactions, a single transaction per thread
– Best bound proven to be achievable is O(s)
Problem complexity: directly related to vertex coloring
– Coloring problem -> one-shot scheduling problem -> one-shot scheduling solution -> coloring solution
It is NP-hard to approximate an optimal vertex coloring
Can we do better under the limitations of the coloring reduction?
Execution Window Model
An M × N window W
– M threads with a sequence of N transactions per thread, i.e., a collection of N one-shot transaction sets
[Figure: the M × N window, M threads by N transactions per thread]
Makespan Bounds
Let C denote the maximum number of conflicting transactions for any transaction inside the window
Trivial makespan bounds:
– Straightforward upper bound: τ · min(CN, MN), where τ is the execution time duration of a transaction
– One-shot analysis bound [Attiya et al., PODC’06]: O(sN)
– Using RandomizedRounds [Schneider & Wattenhofer, ISAAC’09] N times, makespan bound: O(τ · CN log M)
Our bounds:
– Offline-Greedy: makespan bound = O(τ · (C + N log(MN))) and competitive ratio = O(s + log(MN)) with high probability
– Online-Greedy: makespan bound = O(τ · (C log(MN) + N log²(MN))) and competitive ratio = O(s · log(MN) + log²(MN)) with high probability
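To get a rough feel for how these expressions compare, the sketch below plugs sample values into them. It is only illustrative: every constant hidden by the O-notation is set to 1, the logarithm is assumed to be natural, and the O(sN) bound is left out because it depends on s rather than C. The function name `makespan_bounds` and the sample parameters are hypothetical.

```python
import math

def makespan_bounds(M, N, C, tau=1.0):
    """Evaluate the bound expressions with every hidden constant set to 1."""
    log_mn = math.log(M * N)  # assumption: natural logarithm
    return {
        "trivial": tau * min(C * N, M * N),
        "randomized_rounds": tau * C * N * math.log(M),
        "offline_greedy": tau * (C + N * log_mn),
        "online_greedy": tau * (C * log_mn + N * log_mn ** 2),
    }

# Hypothetical window: 64 threads, 100 transactions each, contention C = 64.
bounds = makespan_bounds(M=64, N=100, C=64)
for name, value in bounds.items():
    print(f"{name:>18}: {value:10.1f}")
```

Because the real constants are unknown, the printed numbers only indicate how each expression scales with C, M, and N, not which algorithm is faster in practice.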
Intuition
The random delays help conflicting transactions shift inside the window so that their execution times may not coincide
The benefit is more apparent in scenarios where conflicts are more frequent between transactions in the same column and less frequent between transactions in different columns
[Figure: window rows shifted by random intervals; labels: Random interval, N, N’, M]
How it works?
Random intervals: assume each thread P_i knows C_i and each transaction has the same duration τ (this assumption can be removed)
Conflicts: divide time steps into frames [each time step is of size τ]
– Frame size depends on the conflict resolution strategy of the algorithm
Number of frames in random intervals: each thread chooses a number q_i independently and uniformly at random from the range [0, α_i - 1], where α_i = C_i / log(MN)
Handling conflicts: use priorities
How it works? (Contd…)
[Figure: threads 1 to M, each delayed by its random interval and followed by its N frames; F_11 is the first frame of Thread 1, where T_11 executes, and F_12 is the second frame of Thread 1, where T_12 executes]
q_1 ∈ [0, α_1 - 1], α_1 = C_1 / log(MN)
C = max_i C_i, 1 ≤ i ≤ M
Makespan = (C / log(MN) + number of frames) × frame size = (C / log(MN) + N) × frame size
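The random-interval step can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the natural logarithm, rounds α_i = C_i / log(MN) up to an integer of at least 1 so that the range [0, α_i - 1] is never empty, and requires MN ≥ 2. The names `choose_delays` and `window_length_in_frames` are hypothetical.

```python
import math
import random

def choose_delays(contention, N, rng):
    """Pick q_i uniformly from [0, alpha_i - 1] for each thread.

    contention[i] is the thread's contention measure C_i.
    """
    M = len(contention)
    log_mn = math.log(M * N)  # assumption: natural log, M * N >= 2
    delays = []
    for c_i in contention:
        # Assumption: round alpha_i up and keep it at least 1.
        alpha_i = max(1, math.ceil(c_i / log_mn))
        delays.append(rng.randrange(alpha_i))  # uniform over 0 .. alpha_i - 1
    return delays

def window_length_in_frames(delays, N):
    """Frames until the last thread finishes: largest delay plus N frames."""
    return max(delays) + N

rng = random.Random(2010)  # fixed seed only to make the run repeatable
delays = choose_delays([10, 50, 200], N=4, rng=rng)
print(delays, window_length_in_frames(delays, N=4))
```

Multiplying the returned frame count by the frame size gives the makespan expression on the slide, provided every transaction commits within its assigned frame.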
Offline-Greedy Algorithm
Initialization:
– Frames are of size Φ = Θ(τ · ln(MN)) time steps
– Each thread P_i is initially assigned a random period of q_i ∈ [0, α_i - 1] frames, α_i = C_i / log(MN)
– Each transaction T_ij is assigned to frame F_ij = q_i + (j - 1)
Priority assignment: each transaction has one of two priorities, low or high
– Transaction T_ij is initially in low priority
– T_ij switches to high priority in the first time step of frame F_ij and remains in high priority thereafter
Conflict resolution: uses the conflict graph explicitly to resolve conflicts
– The conflict graph is dynamic and evolves as the execution of the transactions progresses
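The frame assignment and the low-to-high priority switch can be written down directly. The sketch below assumes that frames and time steps are both numbered from 0 and that a frame spans `frame_len` consecutive time steps; the function names are hypothetical, and the conflict-graph-based resolution itself is not modeled.

```python
def assigned_frame(q_i, j):
    """Frame index F_ij = q_i + (j - 1) for the j-th transaction (j starts at 1)."""
    return q_i + (j - 1)

def priority(q_i, j, t, frame_len):
    """Priority of T_ij at time step t.

    The transaction is 'low' until the first time step of its assigned
    frame and 'high' from then on.
    """
    first_step = assigned_frame(q_i, j) * frame_len
    return "high" if t >= first_step else "low"

# A thread with delay q_i = 3 and frames of 10 time steps:
print(assigned_frame(3, 1))                # first transaction's frame
print(priority(3, 1, t=29, frame_len=10))  # one step before that frame starts
print(priority(3, 1, t=30, frame_len=10))  # first step of that frame
```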
Offline-Greedy Algorithm (Contd…)
Proof intuition: with high probability each transaction commits in its assigned frame
– Let A’ ⊆ A denote the subset of transactions conflicting with T_ij in frame F_ij
– If |A’| ≤ log(MN) - 1, then T_ij commits in frame F_ij
– |A’| ≥ log(MN) occurs with probability at most (1/MN)²
Makespan: O(τ · (C + N log(MN))) with high probability
– Pro: for C ≤ N log(MN) the makespan is within a log(MN) factor of optimal, since N is a trivial lower bound
– Con: need to know the dependency graph to resolve conflicts
Competitive ratio: O(s + log(MN)) with high probability
– Pro: independent of the choice of C
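One way to connect the per-transaction probability to the stated makespan is a union bound over the MN transactions in the window. The derivation below is a sketch under that reading of the slide; the event name "miss" is introduced only for illustration.

```latex
% Union bound over all MN transactions in the window
\Pr[\exists\, T_{ij} \text{ that misses } F_{ij}]
  \;\le\; \sum_{i=1}^{M} \sum_{j=1}^{N} \Pr[\,|A'| \ge \log(MN)\,]
  \;\le\; MN \cdot \frac{1}{(MN)^{2}}
  \;=\; \frac{1}{MN}.

% If every transaction commits in its frame, the last frame index is at most
% \max_i \alpha_i + N, and each frame lasts \Phi = \Theta(\tau \ln(MN)) time steps
\text{Makespan}
  \;\le\; \left( \frac{C}{\log(MN)} + N \right) \cdot \Theta\!\left(\tau \ln(MN)\right)
  \;=\; O\!\left(\tau \cdot \left(C + N \log(MN)\right)\right).
```

So the bound holds with probability at least 1 - 1/(MN).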
Online-Greedy Algorithm
Online in the sense that it does not depend on knowing the dependency graph to resolve conflicts
Similar to Offline-Greedy except for the conflict resolution strategy
Priority assignment
– Two different priorities are associated with each transaction as a vector ⟨π(1), π(2)⟩
– π(1) represents the Boolean priority as in Offline-Greedy
– π(2) ∈ [1, M] represents a random priority: a transaction chooses π(2) uniformly at random at the start of frame F_ij and after every abort [idea from Schneider & Wattenhofer, ISAAC’09]
Conflict resolution
– On conflict of T_ij with T_kl: if π_ij(2) < π_kl(2) then abort(T_ij, T_kl), otherwise abort(T_kl, T_ij)
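The conflict rule can be sketched as a comparison of priority vectors. Two points here are assumptions, not statements from the slide: abort(X, Y) is read as "X survives and Y is aborted", and the vectors are compared lexicographically so that a high-priority transaction always beats a low-priority one before π(2) is consulted. The class and function names are hypothetical.

```python
import random
from dataclasses import dataclass

@dataclass
class Txn:
    name: str
    high: bool = False  # pi(1): True once the assigned frame has started
    rank: int = 1       # pi(2): random value in [1, M]

def redraw_rank(txn, M, rng):
    """Choose pi(2) uniformly at random from [1, M] (at frame start or after an abort)."""
    txn.rank = rng.randint(1, M)

def resolve(a, b):
    """Return (winner, loser) for a conflict of a with b.

    Assumption: high priority beats low priority, then the smaller pi(2)
    wins; on a tie the second transaction wins, mirroring the
    'otherwise' branch of the rule.
    """
    key_a = (0 if a.high else 1, a.rank)
    key_b = (0 if b.high else 1, b.rank)
    return (a, b) if key_a < key_b else (b, a)

rng = random.Random(7)  # fixed seed only to make the run repeatable
t1, t2 = Txn("T_11", high=True), Txn("T_21", high=True)
redraw_rank(t1, M=8, rng=rng)
redraw_rank(t2, M=8, rng=rng)
winner, loser = resolve(t1, t2)
redraw_rank(loser, M=8, rng=rng)  # the aborted transaction picks a fresh pi(2)
print(winner.name, "continues;", loser.name, "restarts")
```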
Online-Greedy Algorithm (Contd…)
Proof intuition: the frame duration is now O(τ · log²(MN))
– Analysis is similar to Offline-Greedy
Makespan: O(τ · (C log(MN) + N log²(MN))) with high probability
– Pro: no need to know the dependency graph to resolve conflicts
– Con: makespan is worse than that of Offline-Greedy
Competitive ratio: O(s · log(MN) + log²(MN)) with high probability
– Pro: independent of the contention measure C
Adaptive-Greedy Algorithm
Limitations of the Offline-Greedy and Online-Greedy algorithms
– The values of C_i need to be known in advance
Adaptive-Greedy: each thread starts by guessing C_i = 1
– Similar to the exponential back-off strategy used by Polka
– Based on its current C_i estimate, the thread attempts to execute the Online-Greedy algorithm
– If a thread P_i is unable to commit transactions (a bad event), then P_i assumes its choice of C_i is incorrect and starts over with C_i’ = 2 · C_i for the remaining transactions
The correct choice of C_i is reached in log C_i iterations
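The doubling strategy can be illustrated with a small loop. This is a simplified sketch: the run of Online-Greedy under a given estimate is abstracted into a hypothetical callback `attempt(estimate)` that returns False when a bad event occurs, and the `max_doublings` cap is an added safety guard, not part of the algorithm as described.

```python
def adapt_contention_estimate(attempt, max_doublings=64):
    """Double the C_i estimate until `attempt` reports no bad event.

    Returns the final estimate and the number of doublings performed.
    """
    estimate = 1  # each thread starts by guessing C_i = 1
    doublings = 0
    while not attempt(estimate):
        if doublings == max_doublings:
            raise RuntimeError("estimate did not converge")
        estimate *= 2  # C_i' = 2 * C_i
        doublings += 1
    return estimate, doublings

# Toy stand-in: pretend a bad event occurs whenever the estimate is below
# a true contention of 37.
estimate, doublings = adapt_contention_estimate(lambda c: c >= 37)
print(estimate, doublings)
```

In this toy setting the number of doublings is ceil(log2(37)), matching the logarithmic iteration count stated on the slide.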
Discussions
For variable-length transactions
– τ in the makespan bounds is replaced with τ_max, the maximum duration of any transaction in the window
– The competitive ratio bounds gain a τ_max / τ_min factor, where τ_min is the minimum duration of any transaction in the window
Future extensions
– Instead of one randomization interval at the beginning of the window, use random periods of low priority between subsequent transactions
– Dynamic expansion and contraction of the execution window to preserve the contention measure C
Conclusions
Execution window model for TM
Two new randomized greedy CM algorithms that are very close to O(s)-competitive
Adaptive version of these algorithms for better performance, avoiding the requirement that the value of C be known in advance