Scheduling-based TM Contention Management A survey talk 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, Rome Danny Hendler Ben-Gurion.

Scheduling-based TM Contention Management A survey talk 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, Rome Danny Hendler Ben-Gurion university Danny Hendler

TM: a research toy? TM “TM:Why it is only a research toy.” Cascaval et al., 2008 “Why STM can be more than a research toy.” Dragojevic et al., 2009 There is consensus that TM performance must be improved. Specifically: “In some workloads, performance degraded when we used too many concurrent threads. One possible alternative to improving performance in these cases would be to modify the thread scheduler so it avoids running more concurrent threads than is optimal for a given workload, based on the information provided by the STM runtime. “ [Dragojevic et al.] 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

TM scheduling: rationale  Transactional threads controlled by TM-aware scheduler o Kernel-level, user-level  Richer “tool-box“ for reducing and/or preventing transaction conflicts Improve performance under high-contention 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

TM system Non TM-scheduled threads Contention Manager Contention Detection arbitrate proceed Abort/retry, wait “Conventional” (non-scheduling) contention management greedy Aggressive Karma Polka Suicide Polite “ Polymorphic contention management”, [Guerraoui, Herlihy & Pochon DISC'05 ] 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

Conventional Contention Management is often problematic  Loser resumes execution after a waiting period  May resume execution too early  May resume execution too late  Repeated collisions occur under high contention  Livelocks  Performance may become worse than single lock Scheduling-based CM to the rescue. 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

Talk outline Preliminaries The first TM schedulers Later user-land work Kernel support 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

“Adaptive Transaction Scheduling for transactional memory systems” [Yoo & Lee, SPAA'08] “CAR-STM: Scheduling-based collision avoidance and resolution for software transactional memory” [Dolev, Hendler & Suissa, PODC '08] “Steal-on-abort: dynamic transaction reordering to reduce conflicts in transactional memory” [Ansari, Jarvis, Kirkham, Kotsedilis, Lujan and Watson, HiPEAC'09] The first TM schedulers 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

A single scheduling queue Per-thread Contention Intensity (CI) computed Adaptive mechanism  CI below threshold  transaction begins normally  CI above threshold  transaction serialized (queued) Adaptive Transaction Scheduling (ATS) (Yoo & Lee, SPAA'08) 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

Timeline flows from top to bottom An average of all the CIs from running threads Transactions begin execution without resorting to the scheduler As contention starts to increase, some transactions call the scheduler As more transactions get serialized, contention intensity starts to decrease Contention intensity subsides below threshold More transactions start without the scheduler to exploit more parallelism ATS adaptively varies the number of concurrent transactions according to the dynamic contention feedback Behavior of a Queue-Based Scheduler ATS: adaptive parallelism control Yoo & Lee, Transaction Scheduling.

CAR-STM (Collision Avoidance and Resolution for STM) (Dolev, Hendler & Suissa, PODC'08)  Per-core transaction queues  Serialize conflicting transactions  Contention avoidance: attempt to avoid even first collision 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

CAR-STM high-level architecture Transaction queue #1 TQ thread Transaction thread T-Info Core #1 Serializing contention mgr. Dispatcher Collision Avoider Core #k Transaction queue #k 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

Execution time: STMBench7 * R/W dominated workloads (*) “STMBench7: a benchmark for STM”, [Guerraoui, Kapalka & Viteck., Eurosys'07]

Shortcomings of first TM schedulers  May restrict parallelism too much  ATS: a single serialization queue  CAR-STM: at most a single transactional thread per core o High overheads even in the lack of contention 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

A low-overhead serializer  Avoid repeated collisions while minimizing over-serialization  No per-core queues  Adaptive “On the impact of serializing contention management on STM performance”, [Heber, Hendler & Suissa., opodis'09] “Scheduling support for TM contention management”, [Maldonado, Felber, Fedorova, Hendler, Lawall, Marlier, Muller & Suissa PPoPP'10] 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

Transactional threads Condition variables A low-overhead serializer (cont'd) 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

1) t Identifies a collision 2) t calls contention manager: ABORT_OTHER 3) t changes status of t' to ABORT (writes that t is winner) tt' 4) t' identifies it was aborted A low-overhead serializer (cont'd) 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

t t' 5) t' rolls back transaction and goes to sleep on the condition variable of t 6) Eventually t commits and broadcasts on its condition variable… A low-overhead serializer (cont'd) 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

t' A low-overhead serializer (cont'd) 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

A low-overhead serializer (cont'd) Stabilization mechanism  Algorithm is adaptive o Serializing mode / “Conventional’’ mode  Prevents “mode-oscillations”: o Shifting to serialization-mode reduces perceived contention o Should use two thresholds 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

Throughput of CM conventional algorithms is low CAR-STM over-serializes compared with low-overhead serializer Stabilization mechanism helps A low-overhead serializer (cont'd) Experimental evaluation 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

“Shrink” – collision prevention/avoidance  Predicts future accesses based on past accesses o Read-set predicted based on past few committed/aborted TXs (temporal locality) o Write-set predicted based on immediately preceding aborted TX  Serialize with a probability proportional to the number of threads currently serialized (serialization affinity), if thread's success rate is low and a collision is predicted “Preventing versus curing: avoiding conflicts in transactional memories”, [Dragojevic, Guerraoui, Singh and Singh, PODC'09] 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

“Shrink” – collision prevention (cont'd) Don't serialize when contention is low Update statistics, release lock if you own it Serialize only if contention is high & a collision is “predicted” 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

“Shrink” – experimental evaluation STMBench7, read-write workload 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

Scheduling Support for Transactional Memory Contention Management  Implement CM scheduling support in the kernel scheduler (Linux & OpenSolaris)  (Strict) serialization  Soft serialization  Time-slice extension  Different mechanisms for communication between user- level STM library and kernel scheduler “Scheduling support for TM contention management”, [Maldonado, Felber, Fedorova, Hendler, Lawall, Marlier, Muller & Suissa PPoPP'10] 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

TM Library / Kernel Communication via Shared Memory Segment (Ser-k algorithm)  User code notifies kernel on events such as: transaction start, commit and abort (in which case thread yields)  Kernel code handles moving thread between ready and blocked queues 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

Soft Serialization  Instead of blocking, reduce loser thread priority and yield  Efficient in scenarios where loser transactions may take a different execution path when retrying  Priority should be restored upon commit or when conflicting transactions terminate 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

Time-slice extention  Preemption in the midst of a transaction increases conflict “window of vulnerability”  Defer preemption of transactional threads  avoid CPU monopolization by bounding number of extensions and yielding after commit  May be combined with serialization/soft serialization 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

Evaluation ( STMBench7, 16-core AMD Opterom ) Conventional CM deteriorates when threads>cores Serializing by local spinning is efficient as long as threads ≤ cores 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

Evaluation - STMBench7 throughput Serializing by sleeping on condition var is best when threads>cores, since system call overhead is negligible (long transactions) All strict serialization schemes significantly reduce aborts 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

“Transactional scheduling for read-dominated workloads” [Attiya & Milani, OPODIS'09] “Taking the heat off transactions: dynamic selection of pessimistic concurrency control [Sonmez, Harris, Cristal, Unsal & Valeo, IPDPS'09] “Proactive transaction scheduling for contention management” [Blake, Dreslinky & Mudge, MICRO'09] “Improving performance by reducing aborts in HTM” [Ansari, Khan, Lujan, Kotselidis, Kirkham and Watson, HIPEAC'10] “Window-based greedy contention management for TM” [Sharma, Estrade & Busch, DC'10] “On Transaction Scheduling in distributed TM systems] [Kim & Ravindran, 2010] “Kernel-assisted Scheduling and Deadline Support for STM” [Maldonado, Marlier, Felber, Lawall, Muller & Riviere, DSN'11] “Adaptive thread scheduling techniques for improving scalability of STM” [Chan, Lam & Wang, 2011] … Additional TM scheduling work 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

Scheduling support for TM Conclusions & future work  Scheduling-based CM results in improved throughput under high contention o Overhead is negligible when contention is low  Lightweight kernel support can improve performance and efficiency for some workloads  Dynamically selecting best CM algorithm for workload at hand is a challenging research direction o Machine learning? 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

Thank you. 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

Scheduling-based TM Contention Management A survey talk 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, Rome Danny Hendler Ben-Gurion.

Similar presentations

Presentation on theme: "Scheduling-based TM Contention Management A survey talk 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, Rome Danny Hendler Ben-Gurion."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Scheduling-based TM Contention Management A survey talk 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, Rome Danny Hendler Ben-Gurion.

Similar presentations

Presentation on theme: "Scheduling-based TM Contention Management A survey talk 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, Rome Danny Hendler Ben-Gurion."— Presentation transcript:

Similar presentations

About project

Feedback