Does Hardware Transactional Memory Change Everything?

Slides:



Advertisements
Similar presentations
Inferring Locks for Atomic Sections Cornell University (summer intern at Microsoft Research) Microsoft Research Sigmund CheremTrishul ChilimbiSumit Gulwani.
Advertisements

Transactional Memory Parag Dixit Bruno Vavala Computer Architecture Course, 2012.
Recovery Amol Deshpande CMSC424.
© 2005 P. Kouznetsov Computing with Reads and Writes in the Absence of Step Contention Hagit Attiya Rachid Guerraoui Petr Kouznetsov School of Computer.
CS492B Analysis of Concurrent Programs Lock Basics Jaehyuk Huh Computer Science, KAIST.
A KTEC Center of Excellence 1 Cooperative Caching for Chip Multiprocessors Jichuan Chang and Gurindar S. Sohi University of Wisconsin-Madison.
Reduction, abstraction, and atomicity: How much can we prove about concurrent programs using them? Serdar Tasiran Koç University Istanbul, Turkey Tayfun.
A Rely-Guarantee-Based Simulation for Verifying Concurrent Program Transformations Hongjin Liang, Xinyu Feng & Ming Fu Univ. of Science and Technology.
Transactional Locking Nir Shavit Tel Aviv University (Joint work with Dave Dice and Ori Shalev)
Wait-Free Reference Counting and Memory Management Håkan Sundell, Ph.D.
Pessimistic Software Lock-Elision Nir Shavit (Joint work with Yehuda Afek Alexander Matveev)
Thread-Level Transactional Memory Decoupling Interface and Implementation UW Computer Architecture Affiliates Conference Kevin Moore October 21, 2004.
CS444/CS544 Operating Systems Synchronization 2/16/2006 Prof. Searleman
Persistent State Service 1 Distributed Object Transactions  Transaction principles  Concurrency control  The two-phase commit protocol  Services for.
Language Support for Lightweight transactions Tim Harris & Keir Fraser Presented by Narayanan Sundaram 04/28/2008.
1 CSE SUNY New Paltz Chapter Nine Multiprocessors.
Synchronization Methods for Multicore Programming Brendan Lynch.
Processes Part I Processes & Threads* *Referred to slides by Dr. Sanjeev Setia at George Mason University Chapter 3.
Relativistic Red Black Trees. Relativistic Programming Concurrent reading and writing improves performance and scalability – concurrent readers may disagree.
Microkernels, virtualization, exokernels Tutorial 1 – CSC469.
Simple Wait-Free Snapshots for Real-Time Systems with Sporadic Tasks Håkan Sundell Philippas Tsigas.
File Access and Transfer. Issues 4 Access and transfer are different operations –with different requirements 4 Transfer –move the file from one place.
A Qualitative Survey of Modern Software Transactional Memory Systems Virendra J. Marathe Michael L. Scott.
Håkan Sundell, Chalmers University of Technology 1 Simple and Fast Wait-Free Snapshots for Real-Time Systems Håkan Sundell Philippas.
Challenges in Non-Blocking Synchronization Håkan Sundell, Ph.D. Guest seminar at Department of Computer Science, University of Tromsö, Norway, 8 Dec 2005.
Distributed Shared Memory Based on Reference paper: Distributed Shared Memory, Concepts and Systems.
A down-to-earth look at the cloud host OS Malte SchwarzkopfSteven Hand.
Cache Coherence Protocols 1 Cache Coherence Protocols in Shared Memory Multiprocessors Mehmet Şenvar.
DISTRIBUTED COMPUTING
CS510 Concurrent Systems Why the Grass May Not Be Greener on the Other Side: A Comparison of Locking and Transactional Memory.
Software Transactional Memory Should Not Be Obstruction-Free Robert Ennals Presented by Abdulai Sei.
CS510 Concurrent Systems Jonathan Walpole. RCU Usage in Linux.
Threads. Readings r Silberschatz et al : Chapter 4.
Architectural Features of Transactional Memory Designs for an Operating System Chris Rossbach, Hany Ramadan, Don Porter Advanced Computer Architecture.
Read-Copy-Update Synchronization in the Linux Kernel 1 David Ferry, Chris Gill CSE 522S - Advanced Operating Systems Washington University in St. Louis.
ECE 1747: Parallel Programming Short Introduction to Transactions and Transactional Memory (a.k.a. Speculative Synchronization)
Processes Chapter 3. Processes in Distributed Systems Processes and threads –Introduction to threads –Distinction between threads and processes Threads.
Adaptive Software Lock Elision
Chapter 4: Threads Modified by Dr. Neerja Mhaskar for CS 3SH3.
Maurice Herlihy and J. Eliot B. Moss,  ISCA '93
Chapter 4: Threads.
Presented by: Daniel Taylor
Distributed Shared Memory
The Mach System Sri Ramkrishna.
Minh, Trautmann, Chung, McDonald, Bronson, Casper, Kozyrakis, Olukotun
Virtualizing Transactional Memory
Lecture 21 Synchronization
Combining HTM and RCU to Implement Highly Efficient Balanced Binary Search Trees Dimitrios Siakavaras, Konstantinos Nikas, Georgios Goumas and Nectarios.
PHyTM: Persistent Hybrid Transactional Memory
Faster Data Structures in Transactional Memory using Three Paths
Chapter 4 Threads.
Task Scheduling for Multicore CPUs and NUMA Systems
Challenges in Concurrent Computing
Chapter 4: Threads.
CS140 – Operating Systems Midterm Review
CMPT 886: Computer Architecture Primer
Changing thread semantics
Kernel Synchronization II
Chapter 26 Concurrency and Thread
Christopher J. Rossbach, Owen S. Hofmann, Donald E. Porter, Hany E
CHAPTER 4:THreads Bashair Al-harthi OPERATING SYSTEM
Distributed Transactions
Concurrency: Mutual Exclusion and Process Synchronization
Kernel Synchronization II
Chapter 4: Threads & Concurrency
The Challenge of Cross - Language Interoperability
Mattan Erez The University of Texas at Austin
Transactions in Distributed Systems
The University of Adelaide, School of Computer Science
MV-RLU: Scaling Read-Log-Update with Multi-Versioning
Presentation transcript:

Does Hardware Transactional Memory Change Everything? © 2003 Herlihy and Shavit Does Hardware Transactional Memory Change Everything? Maurice Herlihy Computer Science Dept Brown University

Not important: The details. Important Irrevocable change © 2003 Herlihy and Shavit Not important: The details. Important Irrevocable change

How to think about Synchronization

Amdal’s Law Poor synchronization ruins everything

New synchronization architectures will have a pervasive effect … on the entire stack. Not just improving … STM, lock elision, etc.. But rethinking … algorithms models … theory of data structures & libraries

Hand-over-Hand locking b c

Hand-over-Hand locking b c

Hand-over-Hand locking b c

Hand-over-Hand locking b c

Hand-over-Hand locking b c Arguably better than a coarse-grained lock … but too many lock acquisitions … & no overtaking.

Transactional HoH locking read read a b c Transaction

Transactional HoH locking b c HW Transactions play well with locks …. Consistent snapshot …. Followed by lock acquisition … is something new!

(That is, transparently replacing locks with speculative transactions) Lock Elision …. (That is, transparently replacing locks with speculative transactions) Is a point solution … Because it affects only one lock/ code block. What about using speculation for multiple locks?

Legacy code …. Dangerous to modify locking policies … Poorly understood. Speculatively preacquire sequence of locks? Retrofit (more) deterministic execution … Reduce deadlocks, delays? Performance implications? Theorem? Such transformations preserve correctness

Modern locks themselves …. Very complex data structures Read-Write Abortable NUMA Biased … Conjecture: Perhaps we can drain the swamp …. Same goes for: Barriers Work-stealing Fork-join exchangers Anything done by java.util.concur

RCU (read-copy-update) Readers proceed without locks … But writers follow awkward protocol …. ``While RCU is widely adopted in the Linux kernel, it has not been applied to the kernel's address space structures because of two significant challenges: the address space structures obey complex invariants that make RCU's restrictions on readers and writers onerous, and fully applying RCU to the address space structures requires an RCU-compatible concurrent balanced tree, for which no simple solutions exist.'' [Clements et al. 2012] Support both light-weight reads & non-trivial updates?

Memory Management? Malloc/free vs lock-free data structures? Reference counts? Hazard pointers? RCU? Transactions can transform & improve these techniques?

Granularity?

Progress guarantees? (weasel words)

Synchronization structures Modern synchronization architectures require … A new stack …. Data structures Algorithms Synchronization structures OS, VM functions A new theory …. progress complexity transformations