Download presentation
Presentation is loading. Please wait.
Published byDamon Covell Modified over 9 years ago
1
Scheduling-based TM Contention Management A survey talk 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, Rome Danny Hendler Ben-Gurion university Danny Hendler
2
TM: a research toy? TM “TM:Why it is only a research toy.” Cascaval et al., 2008 “Why STM can be more than a research toy.” Dragojevic et al., 2009 There is consensus that TM performance must be improved. Specifically: “In some workloads, performance degraded when we used too many concurrent threads. One possible alternative to improving performance in these cases would be to modify the thread scheduler so it avoids running more concurrent threads than is optimal for a given workload, based on the information provided by the STM runtime. “ [Dragojevic et al.] 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler
3
TM scheduling: rationale Transactional threads controlled by TM-aware scheduler o Kernel-level, user-level Richer “tool-box“ for reducing and/or preventing transaction conflicts Improve performance under high-contention 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler
4
TM system Non TM-scheduled threads Contention Manager Contention Detection arbitrate proceed Abort/retry, wait “Conventional” (non-scheduling) contention management greedy Aggressive Karma Polka Suicide Polite “ Polymorphic contention management”, [Guerraoui, Herlihy & Pochon DISC'05 ] 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler
5
Conventional Contention Management is often problematic Loser resumes execution after a waiting period May resume execution too early May resume execution too late Repeated collisions occur under high contention Livelocks Performance may become worse than single lock Scheduling-based CM to the rescue. 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler
6
Talk outline Preliminaries The first TM schedulers Later user-land work Kernel support 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler
7
“Adaptive Transaction Scheduling for transactional memory systems” [Yoo & Lee, SPAA'08] “CAR-STM: Scheduling-based collision avoidance and resolution for software transactional memory” [Dolev, Hendler & Suissa, PODC '08] “Steal-on-abort: dynamic transaction reordering to reduce conflicts in transactional memory” [Ansari, Jarvis, Kirkham, Kotsedilis, Lujan and Watson, HiPEAC'09] The first TM schedulers 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler
8
A single scheduling queue Per-thread Contention Intensity (CI) computed Adaptive mechanism CI below threshold transaction begins normally CI above threshold transaction serialized (queued) Adaptive Transaction Scheduling (ATS) (Yoo & Lee, SPAA'08) 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler
9
Timeline flows from top to bottom An average of all the CIs from running threads Transactions begin execution without resorting to the scheduler As contention starts to increase, some transactions call the scheduler As more transactions get serialized, contention intensity starts to decrease Contention intensity subsides below threshold More transactions start without the scheduler to exploit more parallelism ATS adaptively varies the number of concurrent transactions according to the dynamic contention feedback Behavior of a Queue-Based Scheduler ATS: adaptive parallelism control Yoo & Lee, Transaction Scheduling.
10
CAR-STM (Collision Avoidance and Resolution for STM) (Dolev, Hendler & Suissa, PODC'08) Per-core transaction queues Serialize conflicting transactions Contention avoidance: attempt to avoid even first collision 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler
11
CAR-STM high-level architecture Transaction queue #1 TQ thread Transaction thread T-Info Core #1 Serializing contention mgr. Dispatcher Collision Avoider Core #k Transaction queue #k 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler
12
Execution time: STMBench7 * R/W dominated workloads (*) “STMBench7: a benchmark for STM”, [Guerraoui, Kapalka & Viteck., Eurosys'07]
13
Shortcomings of first TM schedulers May restrict parallelism too much ATS: a single serialization queue CAR-STM: at most a single transactional thread per core o High overheads even in the lack of contention 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler
14
Talk outline Preliminaries The first TM schedulers Later user-land work Kernel support 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler
15
A low-overhead serializer Avoid repeated collisions while minimizing over-serialization No per-core queues Adaptive “On the impact of serializing contention management on STM performance”, [Heber, Hendler & Suissa., opodis'09] “Scheduling support for TM contention management”, [Maldonado, Felber, Fedorova, Hendler, Lawall, Marlier, Muller & Suissa PPoPP'10] 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler
16
Transactional threads Condition variables A low-overhead serializer (cont'd) 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler
17
1) t Identifies a collision 2) t calls contention manager: ABORT_OTHER 3) t changes status of t' to ABORT (writes that t is winner) tt' 4) t' identifies it was aborted A low-overhead serializer (cont'd) 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler
18
t t' 5) t' rolls back transaction and goes to sleep on the condition variable of t 6) Eventually t commits and broadcasts on its condition variable… A low-overhead serializer (cont'd) 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler
19
t' A low-overhead serializer (cont'd) 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler
20
A low-overhead serializer (cont'd) Stabilization mechanism Algorithm is adaptive o Serializing mode / “Conventional’’ mode Prevents “mode-oscillations”: o Shifting to serialization-mode reduces perceived contention o Should use two thresholds 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler
21
Throughput of CM conventional algorithms is low CAR-STM over-serializes compared with low-overhead serializer Stabilization mechanism helps A low-overhead serializer (cont'd) Experimental evaluation 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler
22
“Shrink” – collision prevention/avoidance Predicts future accesses based on past accesses o Read-set predicted based on past few committed/aborted TXs (temporal locality) o Write-set predicted based on immediately preceding aborted TX Serialize with a probability proportional to the number of threads currently serialized (serialization affinity), if thread's success rate is low and a collision is predicted “Preventing versus curing: avoiding conflicts in transactional memories”, [Dragojevic, Guerraoui, Singh and Singh, PODC'09] 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler
23
“Shrink” – collision prevention (cont'd) Don't serialize when contention is low Update statistics, release lock if you own it Serialize only if contention is high & a collision is “predicted” 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler
24
“Shrink” – experimental evaluation STMBench7, read-write workload 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler
25
Talk outline Preliminaries The first TM schedulers Later user-land work Kernel support 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler
26
Scheduling Support for Transactional Memory Contention Management Implement CM scheduling support in the kernel scheduler (Linux & OpenSolaris) (Strict) serialization Soft serialization Time-slice extension Different mechanisms for communication between user- level STM library and kernel scheduler “Scheduling support for TM contention management”, [Maldonado, Felber, Fedorova, Hendler, Lawall, Marlier, Muller & Suissa PPoPP'10] 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler
27
TM Library / Kernel Communication via Shared Memory Segment (Ser-k algorithm) User code notifies kernel on events such as: transaction start, commit and abort (in which case thread yields) Kernel code handles moving thread between ready and blocked queues 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler
28
Soft Serialization Instead of blocking, reduce loser thread priority and yield Efficient in scenarios where loser transactions may take a different execution path when retrying Priority should be restored upon commit or when conflicting transactions terminate 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler
29
Time-slice extention Preemption in the midst of a transaction increases conflict “window of vulnerability” Defer preemption of transactional threads avoid CPU monopolization by bounding number of extensions and yielding after commit May be combined with serialization/soft serialization 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler
30
Evaluation ( STMBench7, 16-core AMD Opterom ) Conventional CM deteriorates when threads>cores Serializing by local spinning is efficient as long as threads ≤ cores 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler
31
Evaluation - STMBench7 throughput Serializing by sleeping on condition var is best when threads>cores, since system call overhead is negligible (long transactions) All strict serialization schemes significantly reduce aborts 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler
32
“Transactional scheduling for read-dominated workloads” [Attiya & Milani, OPODIS'09] “Taking the heat off transactions: dynamic selection of pessimistic concurrency control [Sonmez, Harris, Cristal, Unsal & Valeo, IPDPS'09] “Proactive transaction scheduling for contention management” [Blake, Dreslinky & Mudge, MICRO'09] “Improving performance by reducing aborts in HTM” [Ansari, Khan, Lujan, Kotselidis, Kirkham and Watson, HIPEAC'10] “Window-based greedy contention management for TM” [Sharma, Estrade & Busch, DC'10] “On Transaction Scheduling in distributed TM systems] [Kim & Ravindran, 2010] “Kernel-assisted Scheduling and Deadline Support for STM” [Maldonado, Marlier, Felber, Lawall, Muller & Riviere, DSN'11] “Adaptive thread scheduling techniques for improving scalability of STM” [Chan, Lam & Wang, 2011] … Additional TM scheduling work 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler
33
Scheduling support for TM Conclusions & future work Scheduling-based CM results in improved throughput under high contention o Overhead is negligible when contention is low Lightweight kernel support can improve performance and efficiency for some workloads Dynamically selecting best CM algorithm for workload at hand is a challenging research direction o Machine learning? 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler
34
Thank you. 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.