Published by Florence Hill. Modified over 9 years ago.
1
CAR-STM: Scheduling-based Collision Avoidance and Reduction for Software Transactional Memory
Shlomi Dolev, Danny Hendler and Adi Suissa
PODC 2008
2
CAR-STM: rationale
“Transaction-ignorant” thread scheduling is problematic.
In CAR-STM, a TM scheduler handles transactional threads. This permits:
o Serializing contention management
o Proactive collision avoidance
3
“Conventional” STM system high-level structure
[Diagram: OS-scheduler-controlled application threads run on top of the TM system. Contention detection either lets a transaction proceed or asks the contention manager to arbitrate; the outcome is abort/retry/wait.]
4
CAR-STM's distinctive features
o Proactive collision avoidance: proactively assign a transaction thread to the core with the “most conflicting” transactions, based on application-provided information
o Serializing contention management: serialize the execution of colliding transactions
5
OS scheduling of transaction threads
Relying on (current) OS scheduling is problematic:
1) Introduces pseudo-parallelism
2) Hurts TM performance stability/predictability
3) Does not allow proactive collision avoidance and serializing CM
6
CAR-STM high-level architecture
[Diagram: transaction threads, each optionally supplying T-Info, call the dispatcher, which consults the collision avoider. Each core #1..#k has its own transaction queue and TQ thread. A serializing contention manager handles collisions.]
7
TQ-Entry Structure
[Diagram: the high-level architecture with one transaction-queue entry expanded.]
Each TQ entry holds:
o Wrapper method
o Transaction data
o T-Info
o Transaction thread
o Lock and condition variable
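As a rough illustration of the entry fields listed on this slide, the following Python sketch models a TQ entry as a dataclass. The field names, the `done` flag and the `result` slot are assumptions made for this example; the slide names the fields but not their types, and CAR-STM itself is built on RSTM rather than Python.

```python
import threading
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass
class TQEntry:
    """One transaction-queue entry, mirroring the fields listed on the slide."""
    wrapper: Callable[[Any], Any]              # wrapper method that runs the transaction body
    data: Any                                  # transaction data handed to the wrapper
    t_info: Optional[Any] = None               # optional application-provided T-Info
    thread: Optional[threading.Thread] = None  # transaction thread that dispatched the entry
    # A threading.Condition bundles a lock with a condition variable.
    cond: threading.Condition = field(default_factory=threading.Condition)
    done: bool = False                         # hypothetical completion flag (not on the slide)
    result: Any = None                         # hypothetical result slot (not on the slide)


# Build an entry and run its wrapper directly, as a TQ thread might.
entry = TQEntry(wrapper=lambda d: d * 2, data=21, thread=threading.current_thread())
entry.result = entry.wrapper(entry.data)
entry.done = True
```

Each entry gets its own condition variable here, so a dispatching thread could sleep on its own entry without contending with other waiters.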
8
Transaction dispatching process
1) Call Dispatcher with an optional T-Info pointer argument
2) Dispatcher calls Collision Avoider
3) Call app-specific conflict probability method
4) Enqueue transaction in most-conflicting queue. Put thread to sleep, notify TQ thread.
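The dispatching steps can be sketched in a few lines of Python. Everything specific here is an assumption for illustration: T-Info is modelled as a plain region id, `conflict_probability` is a stand-in for the application-specific method, queues are scored by summing probabilities, and a transaction with no T-Info (or no likely conflicts) falls back to the shortest queue. Putting the thread to sleep and notifying the TQ thread are left out.

```python
from collections import deque


def conflict_probability(t_info_a, t_info_b):
    """Hypothetical app-specific method: same region id means a likely conflict."""
    return 1.0 if t_info_a == t_info_b else 0.0


def collision_avoider(queues, t_info):
    """Return the index of the queue whose queued transactions conflict most with t_info."""
    shortest = min(range(len(queues)), key=lambda i: len(queues[i]))
    if t_info is None:
        return shortest  # assumption: no T-Info means plain load balancing
    scores = [
        sum(conflict_probability(t_info, other) for other in q if other is not None)
        for q in queues
    ]
    best = max(range(len(queues)), key=lambda i: scores[i])
    # Assumption: with no expected conflict anywhere, also fall back to the shortest queue.
    return best if scores[best] > 0 else shortest


def dispatch(queues, t_info=None):
    """Dispatcher: ask the collision avoider for a queue, then enqueue there."""
    target = collision_avoider(queues, t_info)
    queues[target].append(t_info)  # a real entry would carry more than the T-Info
    return target


queues = [deque(), deque()]
placements = [dispatch(queues, t) for t in (3, 5, 3, None)]
```

In this run the two transactions touching region 3 end up in the same queue, so they are serialized instead of colliding.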
9
Transaction execution
[Diagram: the TQ thread of core #i and transaction queue #i, with one entry expanded: wrapper method, transaction data, T-Info, transaction thread, lock and condition variable.]
1) TQ thread executes transaction
2) TQ thread wakes up transaction thread
3) TQ thread dequeues entry
10
Dispatcher / TQ-thread synchronization
[Diagram: the dispatcher, transaction queue #i and the TQ thread of core #i.]
1) When the TQ is emptied, the TQ thread goes to sleep
2) When the dispatcher adds a transaction, it wakes up the TQ thread
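The execution and synchronization steps on the last two slides can be sketched with a condition variable in Python. This is a minimal sketch under stated assumptions: the class and method names are invented, each entry uses a `threading.Event` in place of the per-entry lock and condition variable, and the `closed` flag exists only so the example can terminate.

```python
import threading
from collections import deque


class TransactionQueue:
    """One per-core queue served by a single TQ thread (illustrative names)."""

    def __init__(self):
        self.entries = deque()
        self.cond = threading.Condition()  # guards entries; the TQ thread sleeps on it
        self.closed = False                # hypothetical shutdown flag for the demo

    def dispatch(self, fn, arg):
        """Transaction thread: enqueue, wake the TQ thread, sleep until executed."""
        entry = {"fn": fn, "arg": arg, "done": threading.Event(), "result": None}
        with self.cond:
            self.entries.append(entry)
            self.cond.notify()             # dispatcher wakes up the TQ thread
        entry["done"].wait()               # transaction thread sleeps
        return entry["result"]

    def close(self):
        with self.cond:
            self.closed = True
            self.cond.notify()

    def run(self):
        """TQ thread body."""
        while True:
            with self.cond:
                while not self.entries and not self.closed:
                    self.cond.wait()       # queue emptied: TQ thread goes to sleep
                if not self.entries:
                    return                 # closed and drained
                entry = self.entries[0]    # entry stays queued while it executes
            entry["result"] = entry["fn"](entry["arg"])  # 1) execute the transaction
            entry["done"].set()                          # 2) wake up the transaction thread
            with self.cond:
                self.entries.popleft()                   # 3) dequeue the entry


tq = TransactionQueue()
worker = threading.Thread(target=tq.run)
worker.start()
results = [tq.dispatch(lambda x: x + 1, i) for i in range(3)]
tq.close()
worker.join()
```

The entry is dequeued only after it has executed, following the step order on the slide. That ordering may matter if a contention manager needs to see what a queue is currently running, though the slide does not say so.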
11
Serializing Contention Managers
When two transactions collide, fail the newer transaction and move it to the TQ of the older one.
Fast elimination of live-lock scenarios.
Two SCMs implemented:
o Basic (BSCM): move the failed transaction to the end of the other transaction's TQ
o Permanent (PSCM): make the failed transaction a subordinate transaction of the other transaction
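A minimal Python sketch of the BSCM rule, assuming each transaction carries a start timestamp (smaller means older) and remembers which queue it sits in. The `Tx` class, its fields and `bscm_on_collision` are hypothetical names, not taken from the CAR-STM code.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Tx:
    name: str
    timestamp: int  # assumption: smaller timestamp means older transaction
    queue_id: int   # index of the TQ currently holding this transaction


def bscm_on_collision(queues, a, b):
    """Fail the newer transaction and move it to the end of the older one's TQ."""
    winner, loser = (a, b) if a.timestamp < b.timestamp else (b, a)
    if loser.queue_id != winner.queue_id:
        queues[loser.queue_id].remove(loser)
        queues[winner.queue_id].append(loser)
        loser.queue_id = winner.queue_id
    return winner, loser


ta = Tx("Ta", timestamp=5, queue_id=0)
tb = Tx("Tb", timestamp=2, queue_id=1)
tc = Tx("Tc", timestamp=7, queue_id=1)
queues = [deque([ta]), deque([tb, tc])]

winner, loser = bscm_on_collision(queues, ta, tb)
order_on_queue_1 = [t.name for t in queues[1]]
```

Once the loser sits behind the winner in the same queue, the same TQ thread runs both, so the pair cannot collide again on retry.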
12
PSCM
[Diagram: transaction queues #1..#k, each with its own TQ thread and core, holding transactions Ta, Tb, Tc, Td and Te.]
1) Transactions a and b collide; b is older.
13
PSCM
[Diagram: the same queues after the collision; labels: Tb, Ta, Tc, Td, Te.]
The losing transaction and its subordinates are made subordinates of the winning transaction.
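The PSCM rule can be sketched as follows. The flat subordinate list, the timestamps and the choice of Tc as an existing subordinate of Ta are assumptions for this example. Running subordinates right after their master is likewise an assumption, not something the slides state.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(eq=False)
class Tx:
    name: str
    timestamp: int  # assumption: smaller timestamp means older transaction
    subordinates: List["Tx"] = field(default_factory=list)


def pscm_on_collision(a, b):
    """Make the loser and all of its subordinates subordinates of the winner."""
    winner, loser = (a, b) if a.timestamp < b.timestamp else (b, a)
    winner.subordinates.append(loser)
    winner.subordinates.extend(loser.subordinates)
    loser.subordinates = []
    return winner


def execution_order(master):
    """Assumed policy: the master runs first, then its subordinates, one after another."""
    return [master.name] + [s.name for s in master.subordinates]


tc = Tx("Tc", timestamp=7)
ta = Tx("Ta", timestamp=5, subordinates=[tc])
tb = Tx("Tb", timestamp=2)

winner = pscm_on_collision(ta, tb)
order = execution_order(winner)
```

Unlike the basic SCM, the relationship persists: Ta and Tc stay attached to Tb instead of being re-enqueued as independent entries.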
14
Experimental evaluation
o Incorporated CAR-STM within RSTM
o Tested on an 8-way 4 x Xeon 7110M server
o Serializing CM tests: workloads generated by STMBench7 [Guerraoui, Kapalka, Vitek, '07]
o Proactive collision avoidance tested on a synthetic app
15
STMBench7
A benchmark for STM implementations
Generates realistic workloads representative of complex, object-oriented applications
Workloads composed of 45 operation types on a shared data structure
Operation categories:
o Long / short traversals
o Short operations
o Structure modification operations
16
Metrics and workload types

Workload type     Reads   Writes
Read dominated    90%     10%
Read/Write        60%     40%
Write dominated   10%     90%

Metrics           Operation types              Comments
Execution time    All                          5 min + quiescence
Quiescence time   All
Throughput        All except long traversals
17
Execution time: R/W dominated workloads
Speed-up of between 1.7 and 36
Standard deviation reduced by a factor of up to 40
18
Execution time: read dominated workloads
19
Execution time: Write dominated workloads
20
Quiescence time: a measure of live-lock
Speed-up of between 11 and 118
21
Throughput: write dominated workloads
Throughput increase of up to 15.7
22
Experimental evaluation: proactive collision avoidance
RegionedArray (RA) synthetic app (read, write, delete)
Each thread runs for 20 seconds:
o Randomly select region
o Randomly select transaction length
o Randomly select operation
o Transaction repeatedly applies operation to randomly-selected region item
Transactional memory, Dagstuhl, June 08
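One way to picture a single RegionedArray transaction is the Python sketch below. The parameter names, the flat item indexing and the dictionary layout are assumptions; the slide gives only the random choices made per transaction. The 20-second per-thread loop and the STM calls are omitted.

```python
import random


def make_transaction(rng, num_regions, region_size, max_len):
    """Pick a region, a length and an operation, then the region items to touch."""
    region = rng.randrange(num_regions)          # randomly select region
    length = rng.randint(1, max_len)             # randomly select transaction length
    op = rng.choice(["read", "write", "delete"])  # randomly select operation
    # The operation is applied repeatedly to randomly selected items of that region.
    items = [region * region_size + rng.randrange(region_size) for _ in range(length)]
    return {"region": region, "op": op, "items": items}


tx = make_transaction(random.Random(0), num_regions=4, region_size=10, max_len=5)
```

Because every item a transaction touches lies in one region, the region id is a natural candidate for the T-Info passed to the dispatcher in this kind of workload.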
23
Experimental results
24
Most relevant prior art
o [Yoo, Lee, 2008]: Adaptive transaction scheduling for TM systems
o [Bai, Shen, Zhang, Scherer, Ding, Scott]: A key-based adaptive TM executor
25
Conclusions
o Transaction-ignorant scheduling is problematic
o Serializing contention management eliminates live-lock STM behavior
o The contribution of proactive collision avoidance is application-dependent
Some future work directions:
o Robust scheduling
o Transaction-aware OS scheduling
o Better handling of page faults, local data access,…