A Qualitative Survey of Modern Software Transactional Memory Systems

Virendra J. Marathe Michael L. Scott

Concepts and Background: Non-blocking Synchronization Algorithms
Wait-freedom: every process contending for a set of objects makes progress in a finite number of its own steps. This rules out deadlock and starvation. Lock-freedom: at least one process makes progress in a finite number of steps. This rules out deadlock and livelock, but not starvation. Obstruction-freedom: a process is guaranteed to make progress in the absence of contention. This rules out deadlock, but livelock is possible.

Concepts and Background: Non-blocking Synchronization Algorithms
Blocking vs. non-blocking: the wait state disappears in non-blocking algorithms. Non-blocking algorithms have no deadlock, priority inversion, or convoying. Livelock can be addressed via contention management. Tradeoffs between the *-freedom properties: flexibility, simplicity, and performance vs. desirability (strongest property).

Transactions and STM Transaction (Tx) = sequence of instructions that atomically modifies a set of concurrent objects Transaction – satisfies linearizability and atomicity properties; remember ACID (atomicity, consistency, isolation, durability) Software Transactional Memory (STM) = generic non-blocking synchronization construct

Original STM A transaction updates a concurrent object only after declaring its intention system-wide (transaction is owner of the object) Atomic acquiring/release of ownership: CAS, LL/SC At most one transaction at a time can own an object – ownership records An ownership record is null or points to its owner's transaction record
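The ownership rule above (an ownership record is null or points to its owner's transaction record, and is claimed atomically) can be sketched as follows. This is a minimal illustration, not the original system's code: `AtomicRef`, `TxRecord`, `acquire`, and `release` are hypothetical names, and the lock inside `AtomicRef` only emulates the hardware CAS or LL/SC that a real STM would use.

```python
import threading

class AtomicRef:
    """Emulated CAS word; the lock stands in for hardware CAS or LL/SC."""
    def __init__(self, value=None):
        self._value = value
        self._lock = threading.Lock()

    def load(self):
        with self._lock:
            return self._value

    def cas(self, expected, new):
        # Swap only if the word still holds `expected` (identity comparison).
        with self._lock:
            if self._value is expected:
                self._value = new
                return True
            return False

class TxRecord:
    """Hypothetical transaction record; only the fields the sketch needs."""
    def __init__(self, name):
        self.name = name
        self.status = "ACTIVE"

def acquire(orec, tx):
    # Ownership can be claimed only while the record is null (None).
    return orec.cas(None, tx)

def release(orec, tx):
    # Only the current owner can reset the record to null.
    return orec.cas(tx, None)

orec = AtomicRef()
t1, t2 = TxRecord("T1"), TxRecord("T2")
first = acquire(orec, t1)      # T1 claims the free record
second = acquire(orec, t2)     # T2 fails: at most one owner at a time
released = release(orec, t1)
third = acquire(orec, t2)      # succeeds once T1 has released
print(first, second, released, third)  # prints: True False True True
```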

Original STM: Shared Data Structures

Original STM Tx commits only if it acquires all desired ownerships
Otherwise, it aborts and releases all its ownerships. On success: change state to COMMITTED, make updates, and release ownerships (mCAS update). Avoiding livelock: non-recursive helping mechanism based on a total global ordering. Limitations: doubled memory space requirements, and advance knowledge of all objects accessed is required to ensure the ordering.

Hashtable-Based STM Hashtable used to store ownership records (orecs)
STMStart, STMRead, STMWrite, STMAbort, STMCommit, STMValidate, STMWait 3 main data structures: Application Heap Hashtable of orecs Transaction descriptors
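As a rough illustration of how a hashtable of orecs relates heap addresses to ownership records, the sketch below maps word addresses to table slots. The table size, word size, hash function, and the names `Orec`, `orec_index`, and `orec_for` are all assumptions made for the example; the actual system's hash function is not given in these slides.

```python
NUM_ORECS = 8    # assumed table size (real tables are far larger)
WORD_BYTES = 4   # assumed word size of the application heap

class Orec:
    """Holds either a version number or a reference to a transaction descriptor."""
    def __init__(self):
        self.word = 0  # starts as version 0, i.e. not owned

orecs = [Orec() for _ in range(NUM_ORECS)]

def orec_index(addr):
    # Hypothetical hash: drop the byte offset, then wrap around the table.
    return (addr // WORD_BYTES) % NUM_ORECS

def orec_for(addr):
    return orecs[orec_index(addr)]

# Neighbouring words land in different slots...
print(orec_index(0x1000), orec_index(0x1004))   # prints: 0 1
# ...but distinct addresses can share a slot, so unrelated locations
# may end up guarded by the same orec.
print(orec_for(0x1000) is orec_for(0x1020))     # prints: True
```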

Hash STM: STM Heap showing an Active Transaction

Hash STM Acquiring orecs only takes place during STMCommit (commit = multi-word CAS) Commit: Acquire all desired ownerships Set status to COMMITTED Release phase: write back new value/version number in memory/orec Conflicts: when a Tx finds another Tx's descriptor in one of the orecs it reads (STMRead/Write) or acquires (STMCommit)

Hash STM: Conflict Resolution
Read conflict: if the conflicting Tx is ACTIVE, we abort it; hence the obstruction-free design. Acquire conflict: if the conflicting Tx is ACTIVE, we abort it; otherwise, we could try to help the other Tx. But helping causes a lot of contention, so stealing is used instead: we copy and merge the conflicting Tx's orecs into our descriptor.

Hash STM: Conflict Resolution
Stale updates: during release, a newer value may be overwritten by an older one, when stealer Tx1 makes its updates before the (older) updates of victim Tx2 are written. Solution: redo. The current Tx redoes the updates from the stolen orec iff the stealer is no longer in the ACTIVE state.

Hash STM: Contention Management
Tx1 aborts conflicting Tx2 = aggressive policy. But polite contention management is more efficient: do not abort the other Tx immediately; back off (exponentially), and abort the other Tx only after the maximum backoff limit is reached.
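A polite policy of this kind might look like the sketch below. The function name `polite_resolve`, the round limit, the base delay, and the randomized jitter are illustrative assumptions, not parameters from the surveyed system; the plain assignment to `status` stands in for the CAS a real STM would use.

```python
import random
import time

class Tx:
    """Hypothetical transaction descriptor with just a status field."""
    def __init__(self):
        self.status = "ACTIVE"

def polite_resolve(enemy, max_rounds=6, base_delay=1e-5, sleep=time.sleep):
    """Return True if we had to abort `enemy`, False if it finished on its own."""
    for attempt in range(max_rounds):
        if enemy.status != "ACTIVE":
            return False  # the conflict went away while we waited
        # Exponential backoff: the wait bound doubles on every round.
        sleep(random.uniform(0, base_delay * (2 ** attempt)))
    if enemy.status == "ACTIVE":
        # Backoff limit reached: fall back to the aggressive policy.
        enemy.status = "ABORTED"  # a real STM would CAS ACTIVE -> ABORTED
        return True
    return False
```

Passing a no-op `sleep` (for example `lambda d: None`) makes the function easy to exercise without real delays.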

Hash STM: Memory Blow-up
During stealing, a Tx merges all the orecs from the other Tx (including orecs that it doesn't need). This is a scalability issue for the merging step. Moreover, this false sharing leads to merging long chains in a transaction descriptor, which may become unacceptable under moderate/high contention. More side effects: the release phase becomes longer, and long chains may thrash the cache.

Hash STM: LL/SC Approach
Replace merge-redo (which requires mCAS – usually unavailable) with helping Instead of merging, the stealer writes the updates to memory from the conflicting Tx's descriptor Writing takes place as follows: LL on the target memory location Double-check the orec (was it stolen in the meantime ?) Do an SC to the memory location

Hash STM: LL/SC Approach
Benefits: Reduced and simplified data structures (ref. counts not needed anymore) Greatly reduced complexity in the stealing process Significantly diminished space overhead of the hashtable Reduced cache thrashing Eliminates memory blow-up problem

Object-based STMs Object – level synchronization
Better than word-based STMs especially for dynamic data structures Word-based STMs better for higher levels of granularity Conventional approaches: use synchronization But this is difficult and error-prone for complicated structures like (red-black) trees

Dynamic STM (DSTM) Transactional Memory Object (TM Object) Structure

DSTM Design – The Locator
Most recent valid version of data object is determined by the state of the most recently modifying Tx Locator vs orec: Locator is referenced by a TM Object, orec is found through a hash function Locator points to old & new versions of object; orec points to a transaction descriptor which contains them Locator does not require a version number – it stores a pointer to the most recent valid version of the object
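The rule that the valid version depends on the state of the most recently modifying Tx can be sketched in a few lines. The class and function names (`Tx`, `Locator`, `current_version`) and the dictionary payloads are hypothetical, chosen only to illustrate the idea.

```python
class Tx:
    def __init__(self, status="ACTIVE"):
        self.status = status

class Locator:
    """Points to the last writer plus the old and new versions of the data."""
    def __init__(self, owner, old, new):
        self.owner = owner
        self.old = old
        self.new = new

def current_version(loc):
    # A committed writer makes its new version the valid one;
    # an ACTIVE or ABORTED writer leaves the old version valid.
    if loc.owner.status == "COMMITTED":
        return loc.new
    return loc.old

writer = Tx()
loc = Locator(writer, old={"balance": 10}, new={"balance": 25})
before = current_version(loc)   # writer still ACTIVE -> old version
writer.status = "COMMITTED"
after = current_version(loc)    # writer COMMITTED -> new version
print(before, after)  # prints: {'balance': 10} {'balance': 25}
```

No version number is needed here: committing the writer changes which pointer counts as valid in a single step.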

DSTM: Opening a TM Object
Opening of a TM Object recently modified by a committed transaction

DSTM Data access is only through TM Objects
In case of conflict while opening TM Objects – one of the two transactions is aborted (early conflict resolution) After updating the new version, a Tx tries to replace the old locator with the new one (CAS) Contention management protocol – abort itself or the other Tx, aggressive/polite
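A simplified sketch of opening a TM Object for writing is shown below, under several assumptions: the aggressive policy is hard-coded, the status change is a plain assignment rather than a CAS, and the new locator is installed before the caller modifies the copy. `AtomicRef` emulates CAS with a lock, and all names are hypothetical.

```python
import copy
import threading

class AtomicRef:
    """Emulated CAS word; the lock stands in for a hardware CAS."""
    def __init__(self, value=None):
        self._value = value
        self._lock = threading.Lock()

    def load(self):
        with self._lock:
            return self._value

    def cas(self, expected, new):
        with self._lock:
            if self._value is expected:
                self._value = new
                return True
            return False

class Tx:
    def __init__(self, status="ACTIVE"):
        self.status = status

class Locator:
    def __init__(self, owner, old, new):
        self.owner, self.old, self.new = owner, old, new

class TMObject:
    def __init__(self, data):
        # Bootstrap with a locator owned by an already-committed transaction.
        self.start = AtomicRef(Locator(Tx("COMMITTED"), None, data))

def valid_version(loc):
    return loc.new if loc.owner.status == "COMMITTED" else loc.old

def open_write(tm_obj, tx):
    while True:
        old_loc = tm_obj.start.load()
        if old_loc.owner is tx:
            return old_loc.new                 # already opened by this Tx
        if old_loc.owner.status == "ACTIVE":
            old_loc.owner.status = "ABORTED"   # early conflict resolution (aggressive)
        current = valid_version(old_loc)
        new_loc = Locator(tx, current, copy.copy(current))
        if tm_obj.start.cas(old_loc, new_loc):
            return new_loc.new                 # private copy, visible once tx commits
        # CAS failed: another Tx installed a locator first, so retry.
```

A polite policy would replace the unconditional abort with a backoff step before deciding which transaction loses.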

DSTM – Early Release Release an open object before committing to reduce contention Very helpful for tree-like structures Many transactions require only read access These would cause unnecessary contention So use separate semantics for read-only transactions

DSTM – Early Release DSTM uses a separate read-list of objects opened in read-only mode. It is not visible/accessible from the TM Object locators. Other Tx’s are unaware of this Tx’s read-list, so they may change some of the objects. To avoid inconsistencies, incremental validation is used: verify the consistency of all objects in the read-list before opening a TM Object. Also validate before committing.
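Incremental validation over a private read-list could be sketched as follows. This is a single-threaded illustration with hypothetical names (`ReadOnlyTx`, `open_read`, `validate`, `Aborted`); it models "the object changed" as the TM Object pointing at a different version object than the one recorded.

```python
class Aborted(Exception):
    pass

class TMObject:
    def __init__(self, data):
        self.current = data   # stands in for the version reachable via the locator

class ReadOnlyTx:
    def __init__(self):
        self.read_list = []   # private: other transactions cannot see it

    def validate(self):
        # Every object read so far must still point at the version we saw.
        return all(obj.current is seen for obj, seen in self.read_list)

    def open_read(self, obj):
        # Re-check the whole read-list before each new open.
        if not self.validate():
            raise Aborted("an object in the read-list was changed")
        seen = obj.current
        self.read_list.append((obj, seen))
        return seen

    def commit(self):
        if not self.validate():
            raise Aborted("validation failed at commit")
        return True
```

Because the whole list is re-checked on every open, the cost of validation grows with the number of objects already read, which is the overhead discussed in the comparison on transaction validation.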

FSTM The basic Transactional Memory Structure in FSTM

FSTM Concurrent objects wrapped in object headers
Transactions access objects by opening object headers Transaction descriptors maintain lists of in-use concurrent objects Read-only list and read-write list contain object handles Object handle – contain a shadow copy (local to each Tx), upon which all updates are made

FSTM – Accessing Objects
Tx states: UNDECIDED, ABORTED, COMMITTED, READ-CHECKING Open object using object header Create object handle in Tx’s descriptor and place in appropriate list (read-only/read-write) Tx becomes visible to others only during commit So conflicts appear only with other Tx’s that are trying to commit

FSTM – Commit Operation
Is a multi-word CAS, with 3 phases: Acquire phase: Tx gains exclusive ownership of opened objects Decision point: Tx commits or aborts Release phase: Tx releases ownership of all acquired objects
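The three phases can be sketched as below. This is a deliberately reduced model: it omits helping and the read phase, aborts on any acquire failure, and assumes objects are not owned when they are opened. `AtomicRef` emulates CAS with a lock, and `Header`, `Handle`, `Tx`, and `commit` are hypothetical names.

```python
import threading

class AtomicRef:
    """Emulated CAS word; the lock stands in for a hardware CAS."""
    def __init__(self, value=None):
        self._value = value
        self._lock = threading.Lock()

    def load(self):
        with self._lock:
            return self._value

    def cas(self, expected, new):
        with self._lock:
            if self._value is expected:
                self._value = new
                return True
            return False

UNDECIDED, COMMITTED, ABORTED = "UNDECIDED", "COMMITTED", "ABORTED"

class Header:
    def __init__(self, oid, data):
        self.oid = oid               # position in the assumed global ordering
        self.ref = AtomicRef(data)   # current data, or the owning Tx during commit

class Handle:
    def __init__(self, header):
        self.header = header
        self.old = header.ref.load()   # version seen when the object was opened
        self.shadow = dict(self.old)   # private copy that receives all updates

class Tx:
    def __init__(self):
        self.status = AtomicRef(UNDECIDED)
        self.rw = []                   # read-write list of handles

def commit(tx):
    acquired = []
    # Acquire phase: claim each header, in global order.
    for h in sorted(tx.rw, key=lambda h: h.header.oid):
        if not h.header.ref.cas(h.old, tx):
            break                      # object changed or is owned: give up
        acquired.append(h)
    # Decision point: a single CAS on the status word.
    ok = len(acquired) == len(tx.rw)
    tx.status.cas(UNDECIDED, COMMITTED if ok else ABORTED)
    # Release phase: install the shadow copies, or restore the old versions.
    for h in acquired:
        h.header.ref.cas(tx, h.shadow if ok else h.old)
    return ok
```

With N objects in the read-write list and no contention, this sketch performs N acquire CASes, one status CAS, and N release CASes.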

FSTM – Conflict Resolution
Conflict resolution: a total global ordering is used to acquire concurrent objects If we conflict with a committed Tx, we abort If we conflict with an uncommitted Tx, we help it (recursive helping) Cyclical recursive helping is prevented by the total global ordering But we may still have livelocks !

FSTM – The Read Phase To avoid contention, we do not acquire and release objects opened in read-only mode Instead, we simply check the read-only list for consistency upon committing (have they changed ?) = the read phase Upon conflict with UNDECIDED/ABORTED Tx, we check its read-write list for consistency This may lead to non-serializability, and is avoided by the additional READ-CHECKING state

FSTM – The Read Phase Tx1 in READ-CHECKING
Tx2 UNDECIDED – in case of inconsistency, Tx1 aborts (doesn’t help Tx2) Tx2 READ-CHECKING – Tx1 helps or aborts Tx2, based on global ordering of transactions: Lower global numbers abort higher numbers Higher global numbers help their predecessors
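The decision rule for a Tx1 that is in READ-CHECKING and meets a conflicting Tx2 can be written as a small function. The names `on_read_phase_conflict` and `order`, and the string results, are hypothetical; cases the slide does not describe are returned as "unspecified" rather than guessed.

```python
UNDECIDED, READ_CHECKING = "UNDECIDED", "READ-CHECKING"

class Tx:
    def __init__(self, order, status):
        self.order = order     # position in the global ordering of transactions
        self.status = status

def on_read_phase_conflict(me, other):
    """What a READ-CHECKING transaction `me` does about a conflicting `other`."""
    if other.status == UNDECIDED:
        return "abort_self"    # no helping in this case
    if other.status == READ_CHECKING:
        # Lower global numbers abort higher ones; higher ones help predecessors.
        return "abort_other" if me.order < other.order else "help_other"
    return "unspecified"       # not covered by the rule above

tx1 = Tx(order=3, status=READ_CHECKING)
print(on_read_phase_conflict(tx1, Tx(7, UNDECIDED)))      # prints: abort_self
print(on_read_phase_conflict(tx1, Tx(7, READ_CHECKING)))  # prints: abort_other
print(on_read_phase_conflict(tx1, Tx(1, READ_CHECKING)))  # prints: help_other
```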

Qualitative Comparison 1: Object Acquire Semantics
DSTM: Tx acquires an object using a CAS, hence Tx becomes visible early (eager acquire) FSTM, HashSTM: acquiring is done at commit time (lazy acquire) Eager semantics -> early conflict detection -> early conflict resolution (good) Lazy acquire -> long transactions (bad) But ! Eager acquire may lead to unnecessary aborts B aborts A, C aborts B, but C had no conflicts with A, so A was aborted unnecessarily

Qualitative Comparison 1: Object Acquire Semantics
HashSTM can be modified to use eager acquire semantics DSTM can be modified to use lazy acquire FSTM cannot be changed to use eager acquire though (because lock-freedom is guaranteed by the global ordering) To preserve that, we’d need pre-knowledge about all objects a Tx will access Proof that obstruction-freedom is flexible, whereas lock-freedom is not

Qualitative Comparison 2: Indirection Overhead
To update N objects (with no contention): DSTM requires N+1 CAS ops FSTM and HashSTM require 2N+1 Cause: an additional level of indirection in DSTM This will result in slower transactions in DSTM, but also in cheaper commit operations
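To make the counts concrete, the short sketch below tabulates both formulas. The breakdown in the comments (one CAS per object plus one status CAS for DSTM; acquire, decision, and release for FSTM and HashSTM) is one plausible accounting consistent with the phases described earlier, and the function names are hypothetical.

```python
def dstm_cas_count(n):
    # Assumed breakdown: n CASes to install locators + 1 CAS on the Tx status.
    return n + 1

def fstm_cas_count(n):
    # Assumed breakdown: n acquire CASes + 1 decision CAS + n release CASes.
    return 2 * n + 1

for n in (1, 4, 16):
    print(n, dstm_cas_count(n), fstm_cas_count(n))
# prints:
# 1 2 3
# 4 5 9
# 16 17 33
```

For large N the ratio approaches two, so the uncontended commit cost of FSTM and HashSTM is roughly double that of DSTM in CAS operations alone.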

Qualitative Comparison 3: Space Usage
DSTM requires more than twice the space of FSTM for an object This may be bad for very large concurrent objects because much space would be used up by invalid copies

Qualitative Comparison 4: Search Overhead
FSTM and HashSTM maintain lists of acquired objects that have to be searched at commit time to find the object on which there is a conflict. This overhead does not exist in DSTM. HashSTM could be improved by adding an extra pointer in the lists…

Qualitative Comparison 5: Contention Management
FSTM uses recursive helping to ensure lock-freedom However obstruction-freedom allows for greater simplicity and flexibility Helping also produces high contention for cache blocks among processors Need empirical comparison between contention management and helping, though

Qualitative Comparison 6: Transaction Validation
DSTM: incremental validation performs validation for each STM operation Great for safety/consistency and for making programmer’s life easier, but has some overhead FSTM provides a separate function for Tx validation to the programmer Also proposes a scheme to reduce incremental validation cost

Conclusions Presented/discussed three modern STMs
Need experiments to quantitatively evaluate the tradeoffs presented in our qualitative comparison. Need to study which data structures are best suited to which STM system. Need to compare performance against high-performance lock-based algorithms.
