Scheduling-based TM Contention Management: a survey talk. 3rd Workshop on the Theory of Transactional Memory, Sep 22-23, 2011, Rome. Danny Hendler, Ben-Gurion University.

Scheduling-based TM Contention Management
A survey talk
3rd Workshop on the Theory of Transactional Memory, Sep 22-23, 2011, Rome
Danny Hendler, Ben-Gurion University

TM: a research toy?
- “TM: Why it is only a research toy.” Cascaval et al., 2008
- “Why STM can be more than a research toy.” Dragojevic et al., 2009
There is consensus that TM performance must be improved. Specifically:
“In some workloads, performance degraded when we used too many concurrent threads. One possible alternative to improving performance in these cases would be to modify the thread scheduler so it avoids running more concurrent threads than is optimal for a given workload, based on the information provided by the STM runtime.” [Dragojevic et al.]

TM scheduling: rationale
- Transactional threads controlled by a TM-aware scheduler
  o Kernel-level, user-level
- Richer “tool-box” for reducing and/or preventing transaction conflicts
Improve performance under high contention

“Conventional” (non-scheduling) contention management
[Figure: TM system with non-TM-scheduled threads; contention detection; contention manager (arbitrate: proceed, abort/retry, wait)]
Contention managers: Greedy, Aggressive, Karma, Polka, Suicide, Polite
“Polymorphic contention management” [Guerraoui, Herlihy & Pochon, DISC'05]

Conventional Contention Management is often problematic
- Loser resumes execution after a waiting period
  o May resume execution too early
  o May resume execution too late
- Repeated collisions occur under high contention
  o Livelocks
  o Performance may become worse than single lock
Scheduling-based CM to the rescue.

Talk outline
- Preliminaries
- The first TM schedulers
- Later user-land work
- Kernel support

The first TM schedulers
- “Adaptive Transaction Scheduling for transactional memory systems” [Yoo & Lee, SPAA'08]
- “CAR-STM: Scheduling-based collision avoidance and resolution for software transactional memory” [Dolev, Hendler & Suissa, PODC'08]
- “Steal-on-abort: dynamic transaction reordering to reduce conflicts in transactional memory” [Ansari, Jarvis, Kirkham, Kotselidis, Lujan & Watson, HiPEAC'09]

Adaptive Transaction Scheduling (ATS) (Yoo & Lee, SPAA'08)
- A single scheduling queue
- Per-thread Contention Intensity (CI) computed
- Adaptive mechanism
  o CI below threshold -> transaction begins normally
  o CI above threshold -> transaction serialized (queued)
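The threshold rule above can be sketched in a few lines of Python. The slide does not give the CI formula; this sketch assumes an exponential moving average over commit/abort outcomes, and the names `ContentionTracker`, `AtsScheduler`, `alpha` and `threshold`, as well as the value 0.5 for both parameters, are illustrative rather than taken from the talk.

```python
from collections import deque


class ContentionTracker:
    """Per-thread contention intensity (CI), kept as a moving average."""

    def __init__(self, alpha=0.5, threshold=0.5):
        self.alpha = alpha          # weight given to history (assumed value)
        self.threshold = threshold  # serialize above this CI (assumed value)
        self.ci = 0.0

    def record(self, aborted):
        # Assumed update rule: an abort pulls CI toward 1, a commit toward 0.
        outcome = 1.0 if aborted else 0.0
        self.ci = self.alpha * self.ci + (1.0 - self.alpha) * outcome


class AtsScheduler:
    """One global FIFO queue shared by all threads."""

    def __init__(self):
        self.queue = deque()

    def begin(self, tracker, tx_id):
        # CI above threshold: hand the transaction to the single queue.
        if tracker.ci > tracker.threshold:
            self.queue.append(tx_id)
            return "queued"
        # CI at or below threshold: start normally, bypassing the scheduler.
        return "run"

    def dispatch_next(self):
        # Queued transactions are released one at a time, in arrival order.
        return self.queue.popleft() if self.queue else None
```

A thread would call `record` after every commit or abort and `begin` before every transaction, so a burst of aborts moves it into the queue and a run of commits moves it back out.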

ATS: adaptive parallelism control
Behavior of a Queue-Based Scheduler (Yoo & Lee, Transaction Scheduling)
[Figure annotations: timeline flows from top to bottom; an average of all the CIs from running threads]
1. Transactions begin execution without resorting to the scheduler
2. As contention starts to increase, some transactions call the scheduler
3. As more transactions get serialized, contention intensity starts to decrease
4. Contention intensity subsides below threshold
5. More transactions start without the scheduler to exploit more parallelism
ATS adaptively varies the number of concurrent transactions according to the dynamic contention feedback.

CAR-STM (Collision Avoidance and Resolution for STM) (Dolev, Hendler & Suissa, PODC'08)
- Per-core transaction queues
- Serialize conflicting transactions
- Contention avoidance: attempt to avoid even the first collision
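A minimal sketch of per-core queues with serialization on conflict follows. The slide only says that conflicting transactions are serialized; moving the loser to the tail of the winner's queue is an assumption made for this example, and `CoreQueues`, `dispatch` and `on_collision` are hypothetical names.

```python
from collections import deque


class CoreQueues:
    """One transaction queue per core; each queue runs its entries in order."""

    def __init__(self, cores):
        self.queues = [deque() for _ in range(cores)]

    def dispatch(self, tx, core):
        # A transaction is handed to the queue of a chosen core.
        self.queues[core].append(tx)

    def on_collision(self, loser, loser_core, winner_core):
        # Assumed policy: the aborted transaction moves behind the winner,
        # so the two can no longer run concurrently and collide again.
        self.queues[loser_core].remove(loser)
        self.queues[winner_core].append(loser)
```

Because each queue is drained sequentially, two transactions that share a queue never overlap in time, which is what serializes them.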

CAR-STM high-level architecture
[Figure labels: Core #1 ... Core #k; Transaction queue #1 ... Transaction queue #k; TQ thread; Transaction thread; T-Info; Dispatcher; Collision Avoider; Serializing contention mgr.]

Execution time: STMBench7 (*), R/W dominated workloads
(*) “STMBench7: a benchmark for STM” [Guerraoui, Kapalka & Vitek, EuroSys'07]

Shortcomings of first TM schedulers
- May restrict parallelism too much
  o ATS: a single serialization queue
  o CAR-STM: at most a single transactional thread per core
    o High overhead even in the absence of contention

A low-overhead serializer
- Avoid repeated collisions while minimizing over-serialization
- No per-core queues
- Adaptive
“On the impact of serializing contention management on STM performance” [Heber, Hendler & Suissa, OPODIS'09]
“Scheduling support for TM contention management” [Maldonado, Felber, Fedorova, Hendler, Lawall, Marlier, Muller & Suissa, PPoPP'10]

A low-overhead serializer (cont'd)
[Figure labels: transactional threads; condition variables]

A low-overhead serializer (cont'd)
1) t identifies a collision
2) t calls the contention manager: ABORT_OTHER
3) t changes the status of t' to ABORT (writes that t is the winner)
4) t' identifies that it was aborted

A low-overhead serializer (cont'd)
5) t' rolls back its transaction and goes to sleep on the condition variable of t
6) Eventually t commits and broadcasts on its condition variable…
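Steps 5 and 6 can be sketched with Python's `threading.Condition`. This is an illustration, not the algorithm from the cited papers: `TxThread`, `sleep_until_commit` and the `commits` counter are hypothetical, and the counter is an assumption added so that a loser which arrives after the winner has already committed does not sleep forever.

```python
import threading


class TxThread:
    """Hypothetical per-thread descriptor with its own condition variable."""

    def __init__(self, name):
        self.name = name
        self.cond = threading.Condition()
        self.commits = 0  # assumed: number of transactions committed so far

    def commit(self):
        # Step 6: the winner commits and broadcasts on its condition variable.
        with self.cond:
            self.commits += 1
            self.cond.notify_all()


def sleep_until_commit(winner, seen_commits, timeout=None):
    # Step 5: after rolling back, the loser sleeps on the winner's condition
    # variable until the winner has committed past the point it observed.
    with winner.cond:
        return winner.cond.wait_for(
            lambda: winner.commits > seen_commits, timeout)


def run_demo():
    t = TxThread("t")
    seen = t.commits  # recorded when t aborts t' (steps 2 and 3)
    woke = []
    loser = threading.Thread(
        target=lambda: woke.append(sleep_until_commit(t, seen, timeout=5.0)))
    loser.start()
    t.commit()
    loser.join()
    return woke == [True]
```

The loser wakes only once the conflicting transaction is out of the way, which avoids both resuming too early and resuming too late.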

A low-overhead serializer (cont'd): stabilization mechanism
- Algorithm is adaptive
  o Serializing mode / “conventional” mode
- Prevents “mode oscillations”:
  o Shifting to serialization mode reduces perceived contention
  o Should use two thresholds
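The two-threshold idea is a standard hysteresis switch. The sketch below shows it under assumed values: `ModeSwitch`, `low` and `high` are illustrative names, and the thresholds 0.3 and 0.7 are not from the talk.

```python
class ModeSwitch:
    """Switches between conventional and serializing mode with hysteresis."""

    def __init__(self, low=0.3, high=0.7):
        if not low < high:
            raise ValueError("low threshold must be below high threshold")
        self.low = low      # leave serializing mode below this (assumed value)
        self.high = high    # enter serializing mode above this (assumed value)
        self.serializing = False

    def update(self, contention):
        if not self.serializing and contention > self.high:
            self.serializing = True
        elif self.serializing and contention < self.low:
            self.serializing = False
        # Between the two thresholds the current mode is kept, so the drop in
        # contention caused by serialization does not flip the mode back.
        return self.serializing
```

With a single threshold, entering serializing mode would lower the measured contention and immediately trigger a switch back; the gap between `low` and `high` absorbs that drop.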

A low-overhead serializer (cont'd): experimental evaluation
- Throughput of conventional CM algorithms is low
- CAR-STM over-serializes compared with the low-overhead serializer
- Stabilization mechanism helps

“Shrink” – collision prevention/avoidance
- Predicts future accesses based on past accesses
  o Read-set predicted based on past few committed/aborted TXs (temporal locality)
  o Write-set predicted based on immediately preceding aborted TX
- Serialize with a probability proportional to the number of threads currently serialized (serialization affinity), if thread's success rate is low and a collision is predicted
“Preventing versus curing: avoiding conflicts in transactional memories” [Dragojevic, Guerraoui, Singh & Singh, PODC'09]

“Shrink” – collision prevention (cont'd)
[Figure annotations:]
- Don't serialize when contention is low
- Update statistics, release lock if you own it
- Serialize only if contention is high and a collision is “predicted”
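The serialization decision described on the Shrink slides might look roughly like this. Several details are assumptions made for this sketch and are not stated in the talk: a collision counts as predicted when a predicted address is currently written by another thread, the probability is `(serialized_threads + 1) / total_threads` so that the first thread can serialize at all, and the success threshold is 0.5. All function and parameter names are hypothetical.

```python
import random


def predict_collision(predicted_reads, predicted_writes, written_by_others):
    # Assumed rule: a collision is predicted if any predicted access touches
    # an address that another running transaction is writing.
    predicted = set(predicted_reads) | set(predicted_writes)
    return bool(predicted & set(written_by_others))


def should_serialize(success_rate, collision_predicted, serialized_threads,
                     total_threads, success_threshold=0.5, rng=random.random):
    # Don't serialize when contention is low (the thread mostly commits).
    if success_rate >= success_threshold:
        return False
    # Serialize only if a collision is predicted as well.
    if not collision_predicted:
        return False
    # Serialization affinity: the more threads are already serialized,
    # the more likely this one is to join them (assumed formula).
    probability = (serialized_threads + 1) / total_threads
    return rng() < probability
```

The `rng` parameter exists only so the probabilistic step can be made deterministic when experimenting with the function.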

“Shrink” – experimental evaluation
[Figure: STMBench7, read-write workload]

Scheduling Support for Transactional Memory Contention Management
- Implement CM scheduling support in the kernel scheduler (Linux & OpenSolaris)
- (Strict) serialization
- Soft serialization
- Time-slice extension
- Different mechanisms for communication between user-level STM library and kernel scheduler
“Scheduling support for TM contention management” [Maldonado, Felber, Fedorova, Hendler, Lawall, Marlier, Muller & Suissa, PPoPP'10]

TM Library / Kernel Communication via Shared Memory Segment (Ser-k algorithm)
- User code notifies the kernel of events such as transaction start, commit and abort (in which case the thread yields)
- Kernel code handles moving the thread between the ready and blocked queues

Soft Serialization
- Instead of blocking, reduce the loser thread's priority and yield
- Efficient in scenarios where loser transactions may take a different execution path when retrying
- Priority should be restored upon commit or when conflicting transactions terminate
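A user-level toy model of the priority bookkeeping is shown below. It is not kernel code and not the implementation from the cited paper: `SoftSerializer`, its methods and the `penalty` value are hypothetical, and a larger number is taken to mean a lower priority.

```python
class SoftSerializer:
    """Toy priority table; a larger number means a lower priority."""

    def __init__(self, penalty=10):
        self.penalty = penalty   # assumed size of the priority reduction
        self.priority = {}       # thread id -> current priority
        self.waiting_on = {}     # loser thread id -> winner thread id

    def register(self, tid):
        self.priority[tid] = 0

    def on_abort(self, loser, winner):
        # Instead of blocking the loser, lower its priority; it stays runnable.
        self.priority[loser] = self.penalty
        self.waiting_on[loser] = winner

    def on_commit(self, tid):
        # Restore the committing thread's own priority, and that of every
        # loser whose conflicting transaction (tid's) has now terminated.
        self._restore(tid)
        for loser in [l for l, w in self.waiting_on.items() if w == tid]:
            self._restore(loser)

    def _restore(self, tid):
        self.priority[tid] = 0
        self.waiting_on.pop(tid, None)

    def pick_next(self, ready):
        # The scheduler prefers the ready thread with the best priority.
        return min(ready, key=lambda tid: self.priority[tid])
```

Unlike strict serialization, the loser can still be scheduled when nothing better is ready, so a retry that takes a different execution path is not held back unnecessarily.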

Time-slice extension
- Preemption in the midst of a transaction increases the conflict “window of vulnerability”
- Defer preemption of transactional threads
  o Avoid CPU monopolization by bounding the number of extensions and yielding after commit
- May be combined with serialization/soft serialization
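The bounded-extension rule can be modelled as a small counter. This is a sketch of the policy only, with hypothetical names (`TimeSliceExtender`, `on_timer_expiry`, `on_commit`) and an assumed limit of two extensions; a real implementation would live in the kernel scheduler.

```python
class TimeSliceExtender:
    """Per-thread bookkeeping for deferring preemption during a transaction."""

    def __init__(self, max_extensions=2):
        self.max_extensions = max_extensions  # assumed bound
        self.used = 0

    def on_timer_expiry(self, in_transaction):
        # Defer preemption while a transaction is running, but only a
        # bounded number of times so the thread cannot monopolize the CPU.
        if in_transaction and self.used < self.max_extensions:
            self.used += 1
            return "extend"
        self.used = 0
        return "preempt"

    def on_commit(self):
        # A thread that consumed extra time yields right after committing.
        must_yield = self.used > 0
        self.used = 0
        return must_yield
```

Resetting the counter on both preemption and commit means each new time slice starts with a fresh extension budget.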

Evaluation (STMBench7, 16-core AMD Opteron)
- Conventional CM deteriorates when threads > cores
- Serializing by local spinning is efficient as long as threads ≤ cores

Evaluation - STMBench7 throughput
- Serializing by sleeping on a condition variable is best when threads > cores, since system call overhead is negligible (long transactions)
- All strict serialization schemes significantly reduce aborts

Additional TM scheduling work
- “Transactional scheduling for read-dominated workloads” [Attiya & Milani, OPODIS'09]
- “Taking the heat off transactions: dynamic selection of pessimistic concurrency control” [Sonmez, Harris, Cristal, Unsal & Valero, IPDPS'09]
- “Proactive transaction scheduling for contention management” [Blake, Dreslinski & Mudge, MICRO'09]
- “Improving performance by reducing aborts in HTM” [Ansari, Khan, Lujan, Kotselidis, Kirkham & Watson, HiPEAC'10]
- “Window-based greedy contention management for TM” [Sharma, Estrade & Busch, DISC'10]
- “On Transaction Scheduling in distributed TM systems” [Kim & Ravindran, 2010]
- “Kernel-assisted Scheduling and Deadline Support for STM” [Maldonado, Marlier, Felber, Lawall, Muller & Riviere, DSN'11]
- “Adaptive thread scheduling techniques for improving scalability of STM” [Chan, Lam & Wang, 2011]
- …

Scheduling support for TM: conclusions & future work
- Scheduling-based CM results in improved throughput under high contention
  o Overhead is negligible when contention is low
- Lightweight kernel support can improve performance and efficiency for some workloads
- Dynamically selecting the best CM algorithm for the workload at hand is a challenging research direction
  o Machine learning?

Thank you.