1
Transactional Memory: How to Perform Load Adaption in a Simple and Distributed Manner
Johannes Schneider, David Hasenfratz, Roger Wattenhofer
2
Without easy and efficient parallel programming methods…
“computer science will become washing machine science.”
3
How to handle access to shared data?
- Locks, monitors, …
- Coarse-grained vs. fine-grained locking
  - Coarse-grained: easy, but slow programs
  - Fine-grained: demanding and time-consuming, but fast programs
- Problems
  - Difficult
  - Error-prone
  - Composability
  - …
Coarse-grained locking (only 1 thread can execute):
  lock all data
  modify/use data
  unlock all data
Fine-grained locking:
  Thread 1:
    lock A
    lock B
    modify/use A,B
    lock C
    modify/use A,B,C
    unlock A
    modify/use B,C
    unlock B,C
  Thread 2:
    lock B
    lock A
    modify/use A,B
    unlock A,B
  Deadlock!
4
Transactional memory (TM): a possible solution
- Simple for the programmer
- Composable
- Idea from the database community
- Many TM systems (internally) still use locks
- But the TM system (not the programmer) takes care of
  - Performance
  - Correctness (no deadlocks, ...)
Basic usage:
  Begin transaction
    modify/use data
  End transaction
Composition:
  Method A.x()
    Begin transaction
      B.y()
      …
    End transaction
  Method B.y()
    Begin transaction
      …
    End transaction
5
Transactional memory systems
- If transactions modify
  - different data, everything is OK
  - the same data, conflicts arise that must be resolved
- Transactions might get delayed or aborted
  - Job of a contention manager
- A transaction keeps track of all modified values
  - It restores all values if it is aborted
- A transaction finishes successfully with a commit
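The bookkeeping on this slide (track modified values, restore them on abort, keep them on commit) can be sketched with an undo log. This is a minimal single-threaded illustration, not the mechanism of any particular TM system; the class name `Transaction` and the dictionary-backed `memory` are assumptions made for the example.

```python
class Transaction:
    """Toy transaction over a shared dict, using an undo log (hypothetical helper)."""

    def __init__(self, memory):
        self.memory = memory
        self.undo_log = {}  # key -> value before this transaction's first write

    def write(self, key, value):
        # Remember the old value only on the first write to this key.
        if key not in self.undo_log:
            self.undo_log[key] = self.memory[key]
        self.memory[key] = value

    def abort(self):
        # Restore every modified value, then forget the log.
        for key, old_value in self.undo_log.items():
            self.memory[key] = old_value
        self.undo_log.clear()

    def commit(self):
        # Changes are already in place; only the log is dropped.
        self.undo_log.clear()


memory = {"A": 1, "B": 1}

t1 = Transaction(memory)
t1.write("A", 2)
t1.write("A", 3)
t1.abort()
after_abort = dict(memory)

t2 = Transaction(memory)
t2.write("B", 2)
t2.commit()
after_commit = dict(memory)
```

After the abort, `A` is back at its initial value; after the commit, the new value of `B` stays in place.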
6
Conflicts: a contention manager decides
- Abort or delay a transaction, i.e. adapt load
- Distributed
  - Each thread has its own manager
Example (initially: A=1, B=1; Manager 1 handles Trans. 1, Manager 2 handles Trans. 2):
  Trans. 1: … A:=2 …
  Trans. 2: B:=2 … A:=3 …
  Both write A: conflict
Possible decisions:
  - Abort Trans. 1 (undo all changes, i.e. set A:=1) and restart (after a while)
  - Abort Trans. 2 (set B:=1) and restart, OR wait and retry
Delay to adapt load!
7
Prior work
- Contention managers [PODC03, PODC05, ISAAC09, …]
  - System load was not (explicitly) considered
- Load adaption (based on contention)
  - Estimate contention intensity $CI$ [SPAA08]
    - If abort: $CI = a \cdot CI + (1 - a)$ with parameter $a \in [0,1]$
    - If commit: $CI = a \cdot CI$
    - If $CI > b$ (parameter $b$), then resort to a central scheduler
  - Keep a transaction queue per core [PODC08]
    - Central dispatcher assigns transactions to a core, i.e. its queue
    - Each core iteratively executes transactions from its queue
    - If transaction A on core 1 is aborted due to B on core 2, then A is appended to the queue of core 2
- Central scheduler will become a bottleneck
Figure: queues of Core 1 and Core 2 holding transactions A, B, C, D; B aborts A.
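The contention-intensity update quoted from [SPAA08] is an exponential moving average over abort/commit outcomes. A small sketch follows; the function names and the sample values a = 0.5 and b = 0.6 are assumptions chosen for illustration, not values from the cited work.

```python
def update_ci(ci, aborted, a=0.5):
    """Return the new contention intensity after one abort or commit.

    a is the smoothing parameter in [0, 1]; 0.5 is an arbitrary example value.
    """
    if not 0.0 <= a <= 1.0:
        raise ValueError("a must lie in [0, 1]")
    if aborted:
        return a * ci + (1 - a)  # an abort pulls CI towards 1
    return a * ci                # a commit pulls CI towards 0


def use_central_scheduler(ci, b):
    """Resort to the central scheduler once CI exceeds the threshold b."""
    return ci > b


ci = 0.0
ci = update_ci(ci, aborted=True)    # 0.5
ci = update_ci(ci, aborted=True)    # 0.75
high = use_central_scheduler(ci, b=0.6)
ci = update_ci(ci, aborted=False)   # 0.375
low = use_central_scheduler(ci, b=0.6)
```

Repeated aborts drive CI towards 1 and repeated commits drive it back towards 0, so the threshold b decides how much recent contention is tolerated before the central scheduler takes over.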
8
This paper
- Theoretical analysis
- Decentralized (simple) approaches to load adaption based on contention
9
Strategies
- Ignore: do not learn from conflicts
  - ImmediateRestart
- Stay real: remember faced conflicts
  - SerializeFacedConflicts
  - Do not schedule previously conflicting transactions concurrently
- Be cautious: assume additional conflicts
  - SerializeAll
  - All transactions in a subgraph are assumed to conflict
Figure: conflict graph on transactions A, B, C, D (A conflicted with C, D conflicted with B).
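The difference between the last two strategies can be sketched on a conflict graph. The code below assumes that "subgraph" means a connected component of the conflict graph; that reading, the function names, and the extra conflict C–B are assumptions for illustration only.

```python
def may_overlap_faced(conflicts, x, y):
    """SerializeFacedConflicts: x and y may overlap unless they conflicted before."""
    return frozenset((x, y)) not in conflicts


def component_ids(nodes, conflicts):
    """Map each node to a representative of its connected component."""
    neighbours = {v: set() for v in nodes}
    for edge in conflicts:
        u, v = tuple(edge)
        neighbours[u].add(v)
        neighbours[v].add(u)
    comp = {}
    for start in nodes:
        if start in comp:
            continue
        comp[start] = start
        stack = [start]
        while stack:  # depth-first search over one component
            u = stack.pop()
            for v in neighbours[u]:
                if v not in comp:
                    comp[v] = start
                    stack.append(v)
    return comp


def may_overlap_all(nodes, conflicts, x, y):
    """SerializeAll (as assumed here): serialize everything in one component."""
    comp = component_ids(nodes, conflicts)
    return comp[x] != comp[y]


nodes = ["A", "B", "C", "D"]
# Conflicts from the slide: A conflicted with C, D conflicted with B.
slide_conflicts = {frozenset(("A", "C")), frozenset(("D", "B"))}
# Hypothetical extra conflict C-B that joins the two components.
extra_conflicts = slide_conflicts | {frozenset(("C", "B"))}
```

With the slide's two conflicts, A and D may overlap under both strategies. Once C and B have also conflicted, SerializeFacedConflicts still lets A and D overlap, while the component-based SerializeAll serializes them.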
10
Load adaption strategies
- AbortBackoff
  - If aborted, wait for a random time in $[0, 2^{\#aborts}]$
  - Priority = number of aborts (#aborts)
- Who wins a conflict? 2 strategies
  - Estimate the work done
  - Unrelated to work done
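The AbortBackoff waiting rule can be sketched as a randomized exponential backoff. The function name, the uniform distribution, and the abstract "time units" are assumptions for this example; the slide only states the interval.

```python
import random


def backoff_delay(aborts, rng=random):
    """Draw a waiting time uniformly from [0, 2**aborts] (in abstract time units)."""
    if aborts < 0:
        raise ValueError("aborts must be non-negative")
    return rng.uniform(0, 2 ** aborts)


# Example: the upper bound of the interval doubles with every abort.
rng = random.Random(0)  # seeded generator, only to make the run repeatable
delays = [backoff_delay(k, rng) for k in range(6)]
bounds = [2 ** k for k in range(6)]
```

Because the interval doubles with each abort, a transaction that keeps losing conflicts waits longer on average before its next attempt, which lowers the load on the contended resource.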
11
Theory part: model
- n transactions (and threads)
  - Start concurrently on n cores
- Transaction
  - Sequence of operations
  - Each operation takes 1 time unit
  - Duration (number of operations) $t_T$ is fixed
- 2 types of operations
  - Write = modify a (shared) resource and lock it until commit
  - Compute/abort/commit
- Ignore overhead of load adaption
  - Remembering transactions, scheduling, …
Figure: Core 1, Core 2, …, Core n, each running one transaction (A, B, …, Z).
12
Moderate parallelism
- Shared counter
  - Conflicts directly after transaction start
- Linked list
  - Conflicts at arbitrary time
- Expected time span until all transactions committed
- Speed-up log n (at best)
[Table: columns Policy, Counter, List; rows ImmediateRestart, AbortBackoff, SerializeFacedConflicts, SerializeAll; the entries are not recoverable from the extracted text. Legend: transaction run time, #transactions.]
13
Substantial parallelism
- Worst case
  - Conflict graph is a d-ary tree of logarithmic height
- Exponential gap in the worst case (SerializeAll vs. the others)
[Table: columns Policy, Time until transactions committed; rows ImmediateRestart, AbortBackoff, SerializeFacedConflicts, SerializeAll; the entries are not recoverable from the extracted text.]
Figure: tree of transactions T1, T2, T3, T4, T5, …
14
Practical investigation
- Remembering conflicts causes too much overhead
  - Good for analysis but not for implementation
- Quickadapter
  - Serializes transactions
  - Each core has a “waiting” flag
  - If aborted, set flag and wait until flag is unset
  - If commit, unset some flag
- AbortBackoff
- (Also considered some variants)
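The flag protocol listed for Quickadapter can be sketched as follows. This is a single-threaded model of the flag bookkeeping only, with no real waiting or synchronization. The class name, the method names, and the choice of which flag to unset on commit (here: the lowest-numbered waiting core) are assumptions; the slide says only "unset some flag".

```python
class WaitingFlags:
    """Toy model of one "waiting" flag per core (hypothetical helper)."""

    def __init__(self, n_cores):
        self.waiting = [False] * n_cores

    def on_abort(self, core):
        # An aborted transaction sets its core's flag and must wait.
        self.waiting[core] = True

    def may_run(self, core):
        # A core may (re)start its transaction only while its flag is unset.
        return not self.waiting[core]

    def on_commit(self, core):
        # A committing core unsets "some" flag. Assumption: the first set one.
        # The committing core itself is not otherwise used in this toy model.
        for other, is_waiting in enumerate(self.waiting):
            if is_waiting:
                self.waiting[other] = False
                return other
        return None  # nobody was waiting


flags = WaitingFlags(n_cores=4)
flags.on_abort(2)
blocked = not flags.may_run(2)
released = flags.on_commit(0)
runnable_again = flags.may_run(2)
```

Each commit releases at most one waiting core, so the number of concurrently running transactions shrinks under aborts and grows again only as work completes.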
15
Practical investigation
- Evaluation on a 16-core machine
- DSTM2 system
  - Visible readers
- Six benchmarks
  - Little parallelism: Shared counter, Sorted List (accessed objects not released), Listcounter
  - Considerable parallelism: Red Black Tree, LFUCache, RandomAccessArray
- Compare new load adaption policies to existing contention managers
16
Discussion
- Hard to keep maximum throughput, also in [SPAA08, PODC08]
  - Even without conflicts
- Improvement for 1 benchmark worsens another
- On average better than schemes without load adaption
17
Conclusion
- Simple and distributed load adaption strategies
- Theory
  - (For now) constants and parameters matter a lot
- Practice
  - Hard to keep load at peak for all usage patterns
18
Thanks for your attention! Questions?
19
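Analysis: AbortBackoff for the counter
- Recall: if aborted, wait for a random time in $[0, 2^{\#aborts}]$
- Assume $\#aborts \approx \log(n t_T) + x$ (for some $x$)
- Define $a(x)$ := fraction of active nodes
  - $a(0) = 1$ (after time $\approx 2^{\log(n t_T)} = n t_T$, a constant fraction is still active)
- Chance of a conflict for the interval $[0, 2^{\#aborts}] = [0, 2^{\log(n t_T) + x}]$:
  $\approx \frac{a(x)\, n t_T}{2^{\log(n t_T) + x}} = \frac{a(x)}{2^{x}}$
- $a(x+1) = \frac{a(x)}{2^{x}} = \frac{1}{2^{\sum_{i=0}^{x} i}} \approx \frac{1}{2^{x^2}}$
- $a(\sqrt{\log n}) = \frac{1}{2^{(\sqrt{\log n})^2}} = \frac{1}{n}$
- Total length of all intervals:
  $\sum_{i=0}^{\log(n t_T) + \sqrt{\log n}} (\text{length of interval } i) = \sum_{i=0}^{\log(n t_T) + \sqrt{\log n}} 2^{i} \approx n t_T \, 2^{\sqrt{\log n} + 1}$
Figure: three active transactions T1, T2, T3, i.e. $a(x)\, n t_T = \frac{3}{n} \cdot n t_T = 3 t_T$.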