CAR-STM: Scheduling-based Collision Avoidance and Reduction for Software Transactional Memory
Shlomi Dolev, Danny Hendler and Adi Suissa
PODC 2008

Presentation transcript:

CAR-STM: rationale
- “Transaction ignorant” thread scheduling is problematic
- TM scheduler handles transactional threads
- This permits:
  o Serializing contention management
  o Proactive collision avoidance

“Conventional” STM system high-level structure
[Diagram: OS-scheduler-controlled application threads above a TM system that contains Contention Detection and a Contention Manager; arrows labeled “arbitrate”, “proceed”, and “abort/retry/wait”.]

CAR-STM's distinctive features
- Proactive collision avoidance
  o Proactively assign a transaction thread to the core with the “most conflicting” transactions, based on application-provided information
- Serializing contention management
  o Serialize the execution of colliding transactions

OS scheduling of transaction threads:
Relying on (current) OS scheduling is problematic!
1) Introduces pseudo-parallelism
2) Hurts TM performance stability/predictability
3) Does not allow proactive collision avoidance and serializing CM

CAR-STM high-level architecture
[Diagram components: transaction threads with T-Info, Dispatcher, Collision Avoider, serializing contention manager, and one transaction queue with a TQ thread per core (#1 to #k).]

TQ-Entry Structure
[Diagram: the high-level architecture figure with one transaction-queue entry expanded.]
A TQ entry holds:
- Wrapper method
- Transaction data
- T-Info
- Transaction thread
- Lock, condition variable
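The entry fields listed on this slide can be pictured as a small record. The following is a minimal sketch, not the authors' implementation (CAR-STM is built on RSTM, not Python): the class name `TQEntry`, the field names, and the `done`/`result` fields are assumptions added for illustration.

```python
import threading
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass
class TQEntry:
    """One transaction-queue entry (all names here are illustrative)."""
    wrapper: Callable[[Any], Any]               # wrapper method that runs the transaction body
    data: Any                                   # transaction data handed to the wrapper
    t_info: Optional[Any] = None                # optional application-provided T-Info
    thread: Optional[threading.Thread] = None   # the transaction thread that submitted the entry
    # threading.Condition bundles a lock with a condition variable.
    cond: threading.Condition = field(default_factory=threading.Condition)
    done: bool = False                          # assumption: completion flag guarded by cond
    result: Any = None                          # assumption: slot for the wrapper's return value


# Fill in an entry and run its wrapper the way a TQ thread might.
entry = TQEntry(wrapper=lambda d: d["x"] + 1, data={"x": 41},
                thread=threading.current_thread())
with entry.cond:
    entry.result = entry.wrapper(entry.data)
    entry.done = True
    entry.cond.notify()  # would wake a transaction thread waiting on this entry
```

Keeping the lock and condition variable inside the entry lets each transaction thread sleep on its own entry rather than on a shared queue-wide signal.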

Transaction dispatching process
1) Call Dispatcher with an optional T-Info pointer argument
2) Dispatcher calls Collision Avoider
3) Call app-specific conflict probability method
4) Enqueue transaction in most-conflicting queue. Put thread to sleep, notify TQ thread.
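The queue-selection part of these steps can be sketched as follows. This is a hedged illustration only: the function names, summing probabilities as the per-queue score, and falling back to a `default_queue` when no T-Info is given are all assumptions, and putting the thread to sleep and notifying the TQ thread are left out.

```python
from collections import deque


def pick_queue(queues, t_info, conflict_probability, default_queue=0):
    """Collision Avoider sketch: index of the most-conflicting queue."""
    if t_info is None:
        # Assumption: without T-Info there is nothing to compare, so fall back.
        return default_queue

    def score(i):
        # Assumption: a queue's score is the sum of pairwise conflict probabilities.
        return sum(conflict_probability(t_info, other) for other in queues[i])

    return max(range(len(queues)), key=score)


def dispatch(queues, t_info, conflict_probability, default_queue=0):
    """Dispatcher sketch: ask the Collision Avoider, then enqueue."""
    idx = pick_queue(queues, t_info, conflict_probability, default_queue)
    queues[idx].append(t_info)
    return idx


def same_region(a, b):
    """Hypothetical app-specific conflict probability: T-Info is a region label."""
    return 1.0 if a == b else 0.0


queues = [deque(["A", "A"]), deque(["B"])]
chosen = dispatch(queues, "B", same_region)     # lands next to the other "B" transaction
fallback = dispatch(queues, None, same_region)  # no T-Info: uses default_queue
```

Placing a transaction behind the ones it is most likely to conflict with means they run one after another on the same core instead of colliding across cores.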

Transaction execution
[Diagram: TQ thread on core #i serving transaction queue #i; a queue entry holds the wrapper method, transaction data, T-Info, transaction thread, and lock/condition variable.]
1) TQ thread executes transaction
2) TQ thread wakes up transaction thread
3) TQ thread dequeues entry

Dispatcher / TQ-thread synchronization
[Diagram: Dispatcher, transaction queue #i, and TQ thread on core #i.]
1) When the TQ is emptied, the TQ thread goes to sleep
2) When the dispatcher adds a transaction, it wakes up the TQ thread
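The sleep/wake protocol between the dispatcher and a TQ thread, together with the execute, wake, dequeue order from the transaction-execution slide, can be sketched with a condition variable. This is a minimal sketch under stated assumptions: the class and method names are hypothetical, a per-entry `threading.Event` stands in for the entry's lock and condition variable, and the `closed` flag exists only so the example terminates.

```python
import threading
from collections import deque


class TransactionQueue:
    """Hypothetical per-core transaction queue served by one TQ thread."""

    def __init__(self):
        self.entries = deque()
        self.cond = threading.Condition()
        self.closed = False  # assumption: shutdown flag so the sketch can exit

    def dispatch(self, fn):
        """Dispatcher side: enqueue, wake the TQ thread, sleep until executed."""
        entry = {"fn": fn, "done": threading.Event(), "result": None}
        with self.cond:
            self.entries.append(entry)
            self.cond.notify()   # wake the TQ thread if it is sleeping
        entry["done"].wait()     # the transaction thread sleeps here
        return entry["result"]

    def run(self):
        """TQ thread side: sleep while empty, otherwise serve the head entry."""
        while True:
            with self.cond:
                while not self.entries and not self.closed:
                    self.cond.wait()          # TQ is empty: go to sleep
                if not self.entries:
                    return                    # closed and drained
                entry = self.entries[0]       # entry stays queued while it runs
            entry["result"] = entry["fn"]()   # 1) execute the transaction
            entry["done"].set()               # 2) wake up the transaction thread
            with self.cond:
                self.entries.popleft()        # 3) dequeue the entry

    def close(self):
        with self.cond:
            self.closed = True
            self.cond.notify()


tq = TransactionQueue()
worker = threading.Thread(target=tq.run)
worker.start()
results = [tq.dispatch(lambda i=i: i * i) for i in range(3)]
tq.close()
worker.join()
```

Only the TQ thread removes entries, so the head entry cannot change between step 1 and step 3 even while dispatchers append new work.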

Serializing Contention Managers
- When two transactions collide, fail the newer transaction and move it to the TQ of the older
- Fast elimination of live-lock scenarios
- Two SCMs implemented
  o Basic (BSCM): move the failed transaction to the end of the other transaction's TQ
  o Permanent (PSCM): make the failed transaction a subordinate transaction of the other transaction
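The basic policy (BSCM) can be sketched as a single queue move. This is an illustration under assumptions, not the authors' code: the `Txn` record, using a numeric `start_time` to decide which transaction is older, and the function name `bscm_resolve` are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Txn:
    name: str
    start_time: int  # assumption: smaller value means older transaction
    queue_id: int    # index of the transaction queue currently holding it


def bscm_resolve(a, b, queues):
    """Fail the newer transaction and move it to the tail of the older one's queue."""
    winner, loser = (a, b) if a.start_time < b.start_time else (b, a)
    queues[loser.queue_id].remove(loser)
    loser.queue_id = winner.queue_id
    queues[winner.queue_id].append(loser)
    return winner, loser


ta = Txn("Ta", start_time=5, queue_id=0)
tb = Txn("Tb", start_time=2, queue_id=1)
tc = Txn("Tc", start_time=7, queue_id=1)
queues = {0: deque([ta]), 1: deque([tb, tc])}

winner, loser = bscm_resolve(ta, tb, queues)
order = [t.name for t in queues[1]]
```

After the move both transactions share one queue, so the same TQ thread runs them one after the other and they cannot collide again.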

PSCM
[Diagram: transactions Ta, Tb, Tc, Td, and Te in transaction queues #1 and #k, each queue served by a TQ thread on its own core, with PSCM as the contention manager.]
1) Transactions a and b collide; b is older

PSCM
[Diagram: the same queues after the collision, with Ta and Tc shown attached to Tb.]
Losing transaction and its subordinates are made subordinates of winning transaction
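The subordination rule stated on this slide can be sketched as a list merge. This is a hypothetical illustration: the `Txn` record, the timestamp-based notion of "older", and the example queue contents are assumptions and do not reconstruct the slide's figure.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import List


@dataclass(eq=False)  # compare transactions by identity, not by field values
class Txn:
    name: str
    start_time: int  # assumption: smaller value means older transaction
    subordinates: List["Txn"] = field(default_factory=list)


def pscm_resolve(a, b, queues):
    """Make the loser, plus the loser's subordinates, subordinates of the winner."""
    winner, loser = (a, b) if a.start_time < b.start_time else (b, a)
    for q in queues:
        if loser in q:
            q.remove(loser)  # the loser no longer occupies its own queue slot
    winner.subordinates += [loser] + loser.subordinates
    loser.subordinates = []
    return winner


tc = Txn("Tc", 7)
ta = Txn("Ta", 5, [tc])  # hypothetical: Tc already lost to Ta earlier
tb = Txn("Tb", 2)
td, te = Txn("Td", 8), Txn("Te", 9)
queues = [deque([ta]), deque([tb, td, te])]

winner = pscm_resolve(ta, tb, queues)
subs = [t.name for t in winner.subordinates]
```

Compared with moving the loser to the tail of a queue, attaching it to the winner keeps the two transactions together from then on.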

Experimental evaluation
- Incorporated CAR-STM within RSTM
- Tested on an 8-way 4 x XEON-7110M server
- Serializing CM tests: workloads generated by STMBench7 [Guerraoui, Kapalka, Vitek, '07]
- Proactive collision avoidance tested on a synthetic app

STMBench7
- A benchmark for STM implementations
- Generates realistic workloads representative of complex, object-oriented applications
- Workloads composed of 45 operation types on a shared data structure
- Operation categories
  o Long / short traversals
  o Short operations
  o Structure modification operations

Metrics and workload types

Workload type     Reads   Writes
Read dominated    90%     10%
Read/Write        60%     40%
Write dominated   10%     90%

Metrics           Operation types              Comments
Execution time    All                          5 min + quiescence
Quiescence time   All
Throughput        All except long traversals

Execution time: R/W dominated workloads
- Speed-up of between 1.7 and 36
- Reduction of standard deviation by a factor of up to 40

Execution time: read dominated workloads

Execution time: Write dominated workloads

Quiescence time: a measure of live-lock
- Speed-up of between 11 and 118

Throughput: write dominated workloads
- Throughput increase of up to 15.7

Experimental evaluation: proactive collision avoidance
- RegionedArray (RA) synthetic app (read, write, delete)
- Each thread runs for 20 seconds
  o Randomly select region
  o Randomly select transaction length
  o Randomly select operation
  o Transaction repeatedly applies operation to randomly-selected region item
Transactional memory Dagstuhl, June 08
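The per-thread loop described on this slide can be sketched as a workload generator. This is only an illustration: the region count, region size, maximum transaction length, and the dictionary layout are assumptions (the slide gives none of them), and the sketch builds and applies one transaction instead of running for 20 seconds.

```python
import random

OPS = ("read", "write", "delete")


def make_transaction(rng, num_regions, region_size, max_len):
    """Pick a region, a length, and an operation, then the items to touch."""
    region = rng.randrange(num_regions)
    length = rng.randint(1, max_len)
    op = rng.choice(OPS)
    base = region * region_size
    items = [base + rng.randrange(region_size) for _ in range(length)]
    return {"region": region, "op": op, "items": items}


def apply_transaction(array, txn):
    """Repeatedly apply the chosen operation to the selected region items."""
    seen = []
    for i in txn["items"]:
        if txn["op"] == "read":
            seen.append(array[i])
        elif txn["op"] == "write":
            array[i] = txn["region"]
        else:  # delete
            array[i] = None
    return seen


NUM_REGIONS, REGION_SIZE, MAX_LEN = 4, 8, 5  # hypothetical parameters
rng = random.Random(0)                       # seeded for repeatability
array = [0] * (NUM_REGIONS * REGION_SIZE)
txn = make_transaction(rng, NUM_REGIONS, REGION_SIZE, MAX_LEN)
seen = apply_transaction(array, txn)
```

In a setup like this the region index is a natural candidate for the T-Info passed to the dispatcher, since two transactions can only touch the same items when they pick the same region.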

Experimental results

Most relevant prior art
- [Yoo, Lee, 2008]: Adaptive transaction scheduling for TM systems
- [Bai, Shen, Zhang, Scherer, Ding, Scott]: A key-based adaptive TM executor

Conclusions
- Transaction-ignorant scheduling is problematic
- Serializing contention management eliminates live-lock STM behavior
- The contribution of proactive collision avoidance is application-dependent

Some future work directions
- Robust scheduling
- Transaction-aware OS scheduling
- Better handling of page faults, local data access, …