A Qualitative Survey of Modern Software Transactional Memory Systems

A Qualitative Survey of Modern Software Transactional Memory Systems Virendra J. Marathe Michael L. Scott

Concepts and Background: Non-blocking Synchronization Algorithms
Wait-freedom: every process contending for a set of objects makes progress in a finite number of steps. This rules out deadlock and starvation.
Lock-freedom: at least one process makes progress in a finite number of steps. This rules out deadlock but not starvation.
Obstruction-freedom: a process is guaranteed to make progress in the absence of contention. This rules out deadlock, but livelock is possible.

Concepts and Background: Non-blocking Synchronization Algorithms
Blocking vs. non-blocking: the wait state disappears in non-blocking algorithms.
Non-blocking algorithms have no deadlock, priority inversion, or convoying.
Livelock can be addressed via contention management.
Tradeoffs between the *-freedom properties: flexibility, simplicity, and performance vs. desirability (the strongest property is the most desirable).
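
The lock-free case can be illustrated with a compare-and-swap (CAS) retry loop. The following is a minimal sketch, not taken from the slides: Python exposes no user-level CAS instruction, so the hypothetical `AtomicCell` class emulates one with an internal lock that only stands in for hardware atomicity. A failed CAS means some other thread succeeded, so at least one thread always makes progress, although an individual thread may retry many times.

```python
import threading


class AtomicCell:
    """Hypothetical stand-in for a machine word that supports CAS.

    The internal lock only emulates the atomicity that hardware CAS gives.
    """

    def __init__(self, value):
        self._value = value
        self._guard = threading.Lock()

    def load(self):
        with self._guard:
            return self._value

    def compare_and_swap(self, expected, new):
        with self._guard:
            if self._value == expected:
                self._value = new
                return True
            return False


def lock_free_increment(cell):
    # Retry until our CAS wins; a failure implies another thread's CAS won.
    while True:
        seen = cell.load()
        if cell.compare_and_swap(seen, seen + 1):
            return seen + 1


def worker(cell, times):
    for _ in range(times):
        lock_free_increment(cell)


counter = AtomicCell(0)
threads = [threading.Thread(target=worker, args=(counter, 1000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
final_count = counter.load()
```

No increment is lost, because each successful CAS advances the counter by exactly one.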

Transactions and STM
Transaction (Tx) = a sequence of instructions that atomically modifies a set of concurrent objects.
A transaction satisfies the linearizability and atomicity properties; compare ACID (atomicity, consistency, isolation, durability).
Software Transactional Memory (STM) = a generic non-blocking synchronization construct.

Original STM
A transaction updates a concurrent object only after declaring its intention system-wide (the transaction becomes the owner of the object).
Ownership is acquired and released atomically, using CAS or LL/SC.
At most one transaction at a time can own an object; this is tracked with ownership records.
An ownership record is either null or points to its owner's transaction record.

Original STM: Shared Data Structures

Original STM
A Tx commits only if it acquires all desired ownerships; otherwise, it aborts and releases all its ownerships.
On success: change state to COMMITTED, make the updates, and release the ownerships (mCAS update).
Avoiding livelock: a non-recursive helping mechanism based on a total global ordering.
Limitations: doubled memory space requirements, and all objects accessed must be known in advance to ensure the ordering.
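
The acquire-all-or-abort rule can be sketched as follows. This is a single-threaded illustration under stated assumptions, not the original algorithm: the names `TxRecord` and `run_static_tx` are hypothetical, a plain dictionary check stands in for the CAS on each ownership record, and the helping mechanism is omitted (a conflicting transaction simply aborts).

```python
class TxRecord:
    """Hypothetical transaction record; the write set is known up front."""

    def __init__(self, name, updates):
        self.name = name
        self.updates = dict(updates)  # address -> new value
        self.status = "ACTIVE"


def run_static_tx(tx, memory, ownerships):
    acquired = []
    # Acquire in a total global order (here: ascending address).
    for addr in sorted(tx.updates):
        if ownerships.get(addr) is None:  # stands in for CAS(null -> tx)
            ownerships[addr] = tx
            acquired.append(addr)
        else:
            tx.status = "ABORTED"  # the original design would help instead
            break
    else:
        tx.status = "COMMITTED"
        for addr, value in tx.updates.items():
            memory[addr] = value
    # Release every ownership we obtained, whether we committed or aborted.
    for addr in acquired:
        ownerships[addr] = None
    return tx.status


memory = {0: 10, 1: 20, 2: 30}
ownerships = {0: None, 1: None, 2: None}

blocker = TxRecord("blocker", {})
ownerships[2] = blocker  # pretend another Tx already owns address 2

loser = TxRecord("loser", {1: 21, 2: 31})
loser_status = run_static_tx(loser, memory, ownerships)
memory_after_loser = dict(memory)

ownerships[2] = None  # the blocker goes away
winner = TxRecord("winner", {0: 11, 2: 31})
winner_status = run_static_tx(winner, memory, ownerships)
```

Because every transaction must sort its full write set before it starts acquiring, the sketch also shows why the objects have to be known in advance.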

Hashtable-Based STM
A hashtable is used to store ownership records (orecs).
Operations: STMStart, STMRead, STMWrite, STMAbort, STMCommit, STMValidate, STMWait.
3 main data structures: the application heap, the hashtable of orecs, and the transaction descriptors.

Hash STM: STM Heap showing an Active Transaction

Hash STM
Orecs are acquired only during STMCommit (commit = multi-word CAS).
Commit: acquire all desired ownerships, then set status to COMMITTED.
Release phase: write back the new value to memory and the new version number to the orec.
Conflicts: arise when a Tx finds another Tx's descriptor in one of the orecs it reads (STMRead/STMWrite) or acquires (STMCommit).
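
The commit sequence above can be sketched in a few lines. This is a simplified, single-threaded illustration and not the published algorithm: `orec_index` is a hypothetical hash, each orec holds either a version number or the owning descriptor, an equality check stands in for each CAS, and conflicts simply abort (no stealing or helping).

```python
NUM_ORECS = 4


def orec_index(addr):
    return addr % NUM_ORECS  # hypothetical hash from heap address to orec


class Descriptor:
    def __init__(self):
        self.status = "ACTIVE"
        self.entries = {}  # addr -> (version seen, new value)


def stm_write(tx, addr, value, orecs):
    # Record the version observed now; nothing is acquired before commit.
    tx.entries[addr] = (orecs[orec_index(addr)], value)


def stm_commit(tx, heap, orecs):
    acquired = {}  # orec index -> version number we replaced
    for addr, (seen, _) in tx.entries.items():
        i = orec_index(addr)
        current = acquired.get(i, orecs[i])
        if current != seen:  # stands in for a failed CAS
            tx.status = "ABORTED"
            break
        if i not in acquired:
            acquired[i] = seen
            orecs[i] = tx  # orec now points at our descriptor
    else:
        tx.status = "COMMITTED"  # decision point
    committed = tx.status == "COMMITTED"
    if committed:
        for addr, (_, value) in tx.entries.items():
            heap[addr] = value  # write back new values
    for i, old_version in acquired.items():
        orecs[i] = old_version + 1 if committed else old_version
    return tx.status


heap = [0] * 8
orecs = [0] * NUM_ORECS

tx1, tx2 = Descriptor(), Descriptor()
stm_write(tx1, 1, 5, orecs)
stm_write(tx2, 5, 9, orecs)  # address 5 hashes to the same orec as address 1

status1 = stm_commit(tx1, heap, orecs)
status2 = stm_commit(tx2, heap, orecs)
```

Here `tx2` aborts even though it wrote a different address, because both addresses share one orec; this is the kind of false sharing that hashing introduces.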

Hash STM: Conflict Resolution
Read conflict: if the conflicting Tx is ACTIVE, we abort it; hence the design is obstruction-free.
Acquire conflict: if the conflicting Tx is ACTIVE, we abort it; otherwise, we could try to help the other Tx.
But helping causes a lot of contention, so we steal instead: we copy and merge the conflicting Tx's orecs into our descriptor.

Hash STM: Conflict Resolution
Stale updates: during release, a newer value may be replaced with an older one, when stealer Tx1 makes its updates before the (older) updates of victim Tx2.
Solution: redo. The current Tx redoes the updates from the stolen orec iff the stealer is no longer in the ACTIVE state.

Hash STM: Contention Management
Tx1 aborting the conflicting Tx2 outright is the aggressive policy.
Polite contention management is more efficient: do not abort the other Tx immediately; back off (exponentially), and abort it only after the maximum backoff limit is reached.
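
A polite policy of this kind might look like the sketch below. The function name `polite_resolve`, its parameters, and the doubling schedule are illustrative assumptions, not details from the slides; the `sleep` argument is injectable so the backoff delays can be observed without actually waiting.

```python
import time


def polite_resolve(enemy_is_active, abort_enemy,
                   max_attempts=5, base_delay=0.001, sleep=time.sleep):
    """Back off exponentially; abort the enemy only after the limit."""
    delay = base_delay
    for _ in range(max_attempts):
        if not enemy_is_active():
            return "enemy finished"  # conflict resolved itself
        sleep(delay)
        delay *= 2  # exponential backoff
    if enemy_is_active():
        abort_enemy()  # out of patience: fall back to the aggressive policy
        return "enemy aborted"
    return "enemy finished"


# Case 1: the enemy never finishes, so it is aborted after three backoffs.
delays = []
enemy = {"active": True}
outcome_stubborn = polite_resolve(
    lambda: enemy["active"],
    lambda: enemy.update(active=False),
    max_attempts=3, base_delay=1, sleep=delays.append,
)

# Case 2: the enemy completes on its own after two backoffs.
answers = iter([True, True, False])
aborted = []
outcome_patient = polite_resolve(
    lambda: next(answers),
    lambda: aborted.append(True),
    sleep=lambda _delay: None,
)
```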

Hash STM: Memory Blow-up
During stealing, a Tx merges all the orecs from the other Tx (including orecs that it doesn't need).
The merging step is a scalability issue.
Moreover, this false sharing leads to merging long chains into a transaction descriptor, which may become unacceptable under moderate/high contention.
More side effects: the release phase becomes longer, and long chains may thrash the cache.

Hash STM: LL/SC Approach
Replace merge-redo (which requires mCAS, usually unavailable) with helping.
Instead of merging, the stealer writes the updates to memory from the conflicting Tx's descriptor.
Writing takes place as follows:
1. LL on the target memory location.
2. Double-check the orec (was it stolen in the meantime?).
3. Do an SC to the memory location.

Hash STM: LL/SC Approach
Benefits:
Reduced and simplified data structures (reference counts are no longer needed).
Greatly reduced complexity in the stealing process.
Significantly diminished space overhead of the hashtable.
Reduced cache thrashing.
Eliminates the memory blow-up problem.

Object-based STMs
Object-level synchronization.
Better than word-based STMs, especially for dynamic data structures.
Word-based STMs are better when finer (word-level) granularity is needed.
Conventional approaches: use synchronization, but this is difficult and error-prone for complicated structures such as (red-black) trees.

Dynamic STM (DSTM) Transactional Memory Object (TM Object) Structure

DSTM Design – The Locator
The most recent valid version of a data object is determined by the state of the most recently modifying Tx.
Locator vs. orec:
A Locator is referenced by a TM Object; an orec is found through a hash function.
A Locator points to the old and new versions of the object; an orec points to a transaction descriptor which contains them.
A Locator does not require a version number; it stores a pointer to the most recent valid version of the object.

DSTM: Opening a TM Object Opening of a TM Object recently modified by a committed transaction

DSTM
Data access is only through TM Objects.
In case of conflict while opening a TM Object, one of the two transactions is aborted (early conflict resolution).
After updating the new version, a Tx tries to replace the old locator with the new one (CAS).
Contention management protocol: abort itself or the other Tx; aggressive or polite.
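
The Locator idea can be sketched as follows. This is a single-threaded illustration under stated assumptions rather than the DSTM implementation: the class and function names are hypothetical, plain assignments stand in for the CAS on the TM Object's locator pointer and on the transaction status, and the contention policy is hard-wired to aggressive.

```python
import copy

ACTIVE, COMMITTED, ABORTED = "ACTIVE", "COMMITTED", "ABORTED"


class Tx:
    def __init__(self, status=ACTIVE):
        self.status = status


class Locator:
    def __init__(self, owner, old, new):
        self.owner = owner  # Tx that most recently opened the object for writing
        self.old = old      # version valid if the owner did not commit
        self.new = new      # version valid once the owner commits


class TMObject:
    def __init__(self, initial):
        self.locator = Locator(Tx(COMMITTED), None, initial)

    def current_version(self):
        # The owner's status alone decides which copy is valid.
        loc = self.locator
        return loc.new if loc.owner.status == COMMITTED else loc.old


def open_for_write(tx, obj):
    loc = obj.locator
    if loc.owner is tx:
        return loc.new  # already opened by this Tx
    if loc.owner.status == ACTIVE:
        loc.owner.status = ABORTED  # aggressive policy: abort the other Tx
    valid = obj.current_version()
    # Stands in for CAS(obj.locator, loc, new_locator).
    obj.locator = Locator(tx, valid, copy.deepcopy(valid))
    return obj.locator.new


account = TMObject([1, 2])

t1 = Tx()
draft = open_for_write(t1, account)
draft.append(3)
before_commit = account.current_version()  # t1 still ACTIVE: old copy is valid
t1.status = COMMITTED                      # stands in for the commit CAS
after_commit = account.current_version()

t2 = Tx()
open_for_write(t2, account).append(4)
t3 = Tx()
open_for_write(t3, account)                # conflict: t2 is aborted
after_conflict = account.current_version()
```

Committing is a single status change: no data is copied back, because flipping the owner's status switches every reader from the old copy to the new one at once.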

DSTM – Early Release
Release an open object before committing to reduce contention; this is very helpful for tree-like structures.
Many transactions require only read access, and these would cause unnecessary contention.
So use separate semantics for read-only transactions.

DSTM – Early Release
DSTM uses a separate read-list of objects opened in read-only mode.
The read-list is not visible or accessible from the TM Object locators.
Other Tx’s are unaware of this Tx’s read-list, so they may change some of the objects.
To avoid inconsistencies, incremental validation is used: verify the consistency of all objects in the read-list before opening a TM Object.
Also validate before committing.
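
Incremental validation of an invisible read-list can be sketched as below. The names are hypothetical, and a per-object version counter is used as a simplifying assumption in place of comparing version pointers; the `checks` field is only there to make the cost visible.

```python
class Aborted(Exception):
    pass


class Shared:
    """Hypothetical shared object; the counter changes on every write."""

    def __init__(self, value):
        self.value = value
        self.version = 0

    def write(self, value):
        self.value = value
        self.version += 1


class ReadTx:
    def __init__(self):
        self.read_list = []  # (object, version seen); invisible to writers
        self.checks = 0      # number of entries re-validated so far

    def validate(self):
        for obj, seen in self.read_list:
            self.checks += 1
            if obj.version != seen:
                raise Aborted("a read-list entry changed")

    def open_for_read(self, obj):
        self.validate()  # re-check everything read so far
        self.read_list.append((obj, obj.version))
        return obj.value

    def commit(self):
        self.validate()  # final check before committing


a, b, c = Shared(1), Shared(2), Shared(3)

clean = ReadTx()
for obj in (a, b, c):
    clean.open_for_read(obj)
clean.commit()

doomed = ReadTx()
doomed.open_for_read(a)
doomed.open_for_read(b)
a.write(10)  # a writer changes `a` without knowing about the reader
try:
    doomed.open_for_read(c)
    doomed_aborted = False
except Aborted:
    doomed_aborted = True
```

The clean transaction performs 0 + 1 + 2 + 3 = 6 checks for three reads and a commit, which illustrates why the validation cost grows quadratically with the size of the read-list.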

FSTM The basic Transactional Memory Structure in FSTM

FSTM
Concurrent objects are wrapped in object headers.
Transactions access objects by opening object headers.
Transaction descriptors maintain lists of in-use concurrent objects.
The read-only list and the read-write list contain object handles.
An object handle contains a shadow copy (local to each Tx), upon which all updates are made.

FSTM – Accessing Objects
Tx states: UNDECIDED, ABORTED, COMMITTED, READ-CHECKING.
Open an object using its object header.
Create an object handle in the Tx’s descriptor and place it in the appropriate list (read-only/read-write).
A Tx becomes visible to others only during commit, so conflicts appear only with other Tx’s that are trying to commit.

FSTM – Commit Operation
The commit is a multi-word CAS with 3 phases:
Acquire phase: the Tx gains exclusive ownership of the opened objects.
Decision point: the Tx commits or aborts.
Release phase: the Tx releases ownership of all acquired objects.
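
The three phases can be sketched as follows. This is a single-threaded illustration with hypothetical names, not the FSTM code: the `cas` helper emulates a pointer-identity CAS and counts its invocations, objects are opened only while no commit is in progress, and helping and the global acquire order are left out.

```python
UNDECIDED, COMMITTED, ABORTED = "UNDECIDED", "COMMITTED", "ABORTED"


class Header:
    def __init__(self, data):
        self.ref = data  # current data, or the descriptor that owns the object


class Handle:
    def __init__(self, header):
        self.header = header
        self.old = header.ref         # version seen when the object was opened
        self.shadow = dict(self.old)  # private shadow copy; updates go here


class Descriptor:
    def __init__(self):
        self.status = UNDECIDED
        self.read_write = []
        self.cas_ops = 0


def cas(tx, holder, attr, expected, new):
    tx.cas_ops += 1
    if getattr(holder, attr) is expected:  # identity, like a pointer compare
        setattr(holder, attr, new)
        return True
    return False


def open_for_write(tx, header):
    handle = Handle(header)
    tx.read_write.append(handle)
    return handle.shadow


def commit(tx):
    acquired, ok = [], True
    for h in tx.read_write:                        # acquire phase
        if cas(tx, h.header, "ref", h.old, tx):
            acquired.append(h)
        else:
            ok = False
            break
    cas(tx, tx, "status", UNDECIDED, COMMITTED if ok else ABORTED)  # decision
    for h in acquired:                             # release phase
        cas(tx, h.header, "ref", tx, h.shadow if ok else h.old)
    return tx.status


x, y = Header({"v": 1}), Header({"v": 2})
t1, t2 = Descriptor(), Descriptor()

open_for_write(t1, x)["v"] = 10
open_for_write(t1, y)["v"] = 20
open_for_write(t2, y)["v"] = 99  # t2 works on a stale copy of y once t1 commits

status1 = commit(t1)
status2 = commit(t2)
```

In the uncontended commit of `t1`, two objects cost five CAS operations: one per object to acquire, one for the status change, and one per object to release.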

FSTM – Conflict Resolution
A total global ordering is used to acquire concurrent objects.
If we conflict with a committed Tx, we abort.
If we conflict with an uncommitted Tx, we help it (recursive helping).
Cyclical recursive helping is prevented by the total global ordering.
But we may still have livelocks!

FSTM – The Read Phase
To avoid contention, we do not acquire and release objects opened in read-only mode.
Instead, we simply check the read-only list for consistency upon committing (have they changed?); this is the read phase.
Upon conflict with an UNDECIDED/ABORTED Tx, we check its read-write list for consistency.
This may lead to non-serializability, which is avoided by the additional READ-CHECKING state.

FSTM – The Read Phase
Tx1 is in READ-CHECKING:
Tx2 UNDECIDED: in case of inconsistency, Tx1 aborts (it doesn’t help Tx2).
Tx2 READ-CHECKING: Tx1 helps or aborts Tx2, based on the global ordering of transactions. Lower global numbers abort higher numbers; higher global numbers help their predecessors.

Qualitative Comparison 1: Object Acquire Semantics
DSTM: a Tx acquires an object using a CAS, hence the Tx becomes visible early (eager acquire).
FSTM, HashSTM: acquiring is done at commit time (lazy acquire).
Eager acquire -> early conflict detection -> early conflict resolution (good).
Lazy acquire -> conflicts are detected late, so doomed transactions run long (bad).
But eager acquire may lead to unnecessary aborts: B aborts A, C aborts B, but C had no conflict with A, so A was aborted unnecessarily.

Qualitative Comparison 1: Object Acquire Semantics
HashSTM can be modified to use eager acquire semantics.
DSTM can be modified to use lazy acquire.
FSTM cannot be changed to use eager acquire, though, because its lock-freedom is guaranteed by the global ordering; to preserve that, we’d need pre-knowledge of all objects a Tx will access.
This shows that obstruction-freedom is flexible, whereas lock-freedom is not.

Qualitative Comparison 2: Indirection Overhead
To update N objects (with no contention): DSTM requires N+1 CAS ops; FSTM and HashSTM require 2N+1.
Cause: an additional level of indirection in DSTM.
This results in slower transactions in DSTM, but also in cheaper commit operations.

Qualitative Comparison 3: Space Usage
DSTM requires more than twice the space of FSTM for an object.
This may be bad for very large concurrent objects, because much space would be used up by invalid copies.

Qualitative Comparison 4: Search Overhead
FSTM and HashSTM maintain lists of acquired objects that have to be searched at commit to find the object on which there is a conflict.
This overhead does not exist in DSTM.
HashSTM could be improved by adding an extra pointer in the lists…

Qualitative Comparison 5: Contention Management
FSTM uses recursive helping to ensure lock-freedom.
However, obstruction-freedom allows for greater simplicity and flexibility.
Helping also produces high contention for cache blocks among processors.
An empirical comparison between contention management and helping is still needed, though.

Qualitative Comparison 6: Transaction Validation
DSTM: incremental validation performs validation for each STM operation.
This is great for safety/consistency and for making the programmer’s life easier, but it has some overhead.
FSTM provides the programmer with a separate function for Tx validation.
It also proposes a scheme to reduce the incremental validation cost.

Conclusions
Presented/discussed three modern STMs.
Need experiments to quantitatively evaluate the tradeoffs presented in our qualitative comparison.
Need to study which data structures are best suited to which STM system.
Need to compare performance against high-performance lock-based algorithms.