1 Lecture 8: Eager Transactional Memory Topics: implementation details of eager TM, various TM pathologies.

Slides:



Advertisements
Similar presentations
CM20145 Concurrency Control
Advertisements

Optimistic Methods for Concurrency Control By : H.T. Kung & John T. Robinson Presenters: Munawer Saeed.
1 Concurrency Control Chapter Conflict Serializable Schedules  Two actions are in conflict if  they operate on the same DB item,  they belong.
1 Lecture 18: Transactional Memories II Papers: LogTM: Log-Based Transactional Memory, HPCA’06, Wisconsin LogTM-SE: Decoupling Hardware Transactional Memory.
1 Lecture 6: Directory Protocols Topics: directory-based cache coherence implementations (wrap-up of SGI Origin and Sequent NUMA case study)
Concurrency Control II. General Overview Relational model - SQL  Formal & commercial query languages Functional Dependencies Normalization Physical Design.
Cache Optimization Summary
1 Lecture 12: Hardware/Software Trade-Offs Topics: COMA, Software Virtual Memory.
1 Lecture 21: Transactional Memory Topics: consistency model recap, introduction to transactional memory.
1 Lecture 10: TM Pathologies Topics: scalable lazy implementation, paper on TM performance pathologies.
CS 7810 Lecture 19 Coherence Decoupling: Making Use of Incoherence J.Huh, J. Chang, D. Burger, G. Sohi Proceedings of ASPLOS-XI October 2004.
1 Lecture 7: Transactional Memory Intro Topics: introduction to transactional memory, “lazy” implementation.
1 Lecture 4: Directory-Based Coherence Details of memory-based (SGI Origin) and cache-based (Sequent NUMA-Q) directory protocols.
1 Lecture 23: Transactional Memory Topics: consistency model recap, introduction to transactional memory.
1 Lecture 8: Transactional Memory – TCC Topics: “lazy” implementation (TCC)
1 Lecture 22: Fault Tolerance Papers: Token Coherence: Decoupling Performance and Correctness, ISCA’03, Wisconsin A Low Overhead Fault Tolerant Coherence.
CS252/Patterson Lec /23/01 CS213 Parallel Processing Architecture Lecture 7: Multiprocessor Cache Coherency Problem.
1 Lecture 24: Transactional Memory Topics: transactional memory implementations.
Supporting Nested Transactional Memory in LogTM Authors Michelle J Moravan Mark Hill Jayaram Bobba Ben Liblit Kevin Moore Michael Swift Luke Yen David.
1 Lecture 6: TM – Eager Implementations Topics: Eager conflict detection (LogTM), TM pathologies.
1 Lecture 6: Lazy Transactional Memory Topics: TM semantics and implementation details of “lazy” TM.
1 Lecture 5: Directory Protocols Topics: directory-based cache coherence implementations.
1 Lecture 5: TM – Lazy Implementations Topics: TM design (TCC) with lazy conflict detection and lazy versioning, intro to eager conflict detection.
1 Lecture 4: Directory Protocols and TM Topics: corner cases in directory protocols, lazy TM.
NUMA coherence CSE 471 Aut 011 Cache Coherence in NUMA Machines Snooping is not possible on media other than bus/ring Broadcast / multicast is not that.
1 Lecture 9: TM Implementations Topics: wrap-up of “lazy” implementation (TCC), eager implementation (LogTM)
1 Lecture 7: Lazy & Eager Transactional Memory Topics: details of “lazy” TM, scalable lazy TM, implementation details of eager TM.
1 Lecture 3: Directory-Based Coherence Basic operations, memory-based and cache-based directories.
1 Lecture 10: TM Implementations Topics: wrap-up of eager implementation (LogTM), scalable lazy implementation.
LogTM: Log-Based Transactional Memory Kevin E. Moore, Jayaram Bobba, Michelle J. Moravan, Mark D. Hill, & David A. Wood Presented by Colleen Lewis.
Distributed Deadlocks and Transaction Recovery.
1 Lecture 24: Fault Tolerance Papers: Token Coherence: Decoupling Performance and Correctness, ISCA’03, Wisconsin A Low Overhead Fault Tolerant Coherence.
Kevin E. Moore, Jayaram Bobba, Michelle J. Moravan, Mark D. Hill & David A. Wood Presented by: Eduardo Cuervo.
© 2008 Multifacet ProjectUniversity of Wisconsin-Madison Pathological Interaction of Locks with Transactional Memory Haris Volos, Neelam Goyal, Michael.
1 Lecture 20: Speculation Papers: Is SC+ILP=RC?, Purdue, ISCA’99 Coherence Decoupling: Making Use of Incoherence, Wisconsin, ASPLOS’04.
1 Lecture 10: Transactional Memory Topics: lazy and eager TM implementations, TM pathologies.
1 Lecture 8: Snooping and Directory Protocols Topics: 4/5-state snooping protocols, split-transaction implementation details, directory implementations.
Lecture 8: Snooping and Directory Protocols
Lecture 20: Consistency Models, TM
Cache Coherence: Directory Protocol
Cache Coherence: Directory Protocol
Transactional Memory : Hardware Proposals Overview
Concurrency Control.
Multiprocessor Cache Coherency
Two Ideas of This Paper Using Permissions-only Cache to deduce the rate at which less-efficient overflow handling mechanisms are invoked. When the overflow.
Lecture 9: Directory-Based Examples II
Lecture 2: Snooping-Based Coherence
Lecture 11: Transactional Memory
Lecture 12: TM, Consistency Models
Lecture 5: Snooping Protocol Design Issues
Lecture: Transactional Memory, Networks
Lecture: Consistency Models, TM
Lecture 6: Transactions
Lecture 17: Transactional Memories I
Lecture 21: Transactional Memory
Lecture 8: Directory-Based Cache Coherence
Lecture 7: Directory-Based Cache Coherence
Chapter 15 : Concurrency Control
Lecture 22: Consistency Models, TM
Lecture 9: Directory Protocol Implementations
Lecture 9: Directory-Based Examples
Lecture: Interconnection Networks
Lecture 8: Directory-Based Examples
Performance Pathologies in Hardware Transactional Memory
Performance Pathologies in Hardware Transactional Memory
Lecture 23: Transactional Memory
Lecture 21: Transactional Memory
Lecture: Consistency Models, TM
Lecture: Transactional Memory
Lecture 10: Directory-Based Examples II
Presentation transcript:

1 Lecture 8: Eager Transactional Memory Topics: implementation details of eager TM, various TM pathologies

2 “Eager” Overview Topics: Logs Log optimization Conflict examples Handling deadlocks Sticky scenarios Aborts/commits/parallelism C Dir P R W C Dir P R W C Dir P R W C Dir P R W Scalable Non-broadcast Interconnect

3 “Eager” Implementation (Based Primarily on LogTM) A write is made permanent immediately (we do not wait until the end of the transaction) Can’t lose the old value (in case this transaction is aborted) – hence, before the write, we copy the old value into a log (the log is some space in virtual memory -- the log itself may be in cache, so not too expensive) This is eager versioning

4 Versioning Every overflowed write first requires a read and a write to log the old value – the log is maintained in virtual memory and will likely be found in cache Aborts are uncommon – typically only when the contention manager kicks in on a potential deadlock; the logs are walked through in reverse order If a block is already marked as being logged (wr-set), the next write by that transaction can avoid the re-log Log writes can be placed in a write buffer to reduce contention for L1 cache ports

5 Conflict Detection and Resolution Since Transaction-A’s writes are made permanent rightaway, it is possible that another Transaction-B’s rd/wr miss is re-directed to Tr-A At this point, we detect a conflict (neither transaction has reached its end, hence, eager conflict detection): two transactions handling the same cache line and at least one of them does a write One solution: requester stalls: Tr-A sends a NACK to Tr-B; Tr-B waits and re-tries again; hopefully, Tr-A has committed and can hand off the latest cache line to B  neither transaction needs to abort

6 Deadlocks Can lead to deadlocks: each transaction is waiting for the other to finish Need a separate (hw/sw) contention manager to detect such deadlocks and force one of them to abort Tr-A Tr-B write X write Y … … read Y read X Alternatively, every transaction maintains an “age” and a young transaction aborts and re-starts if it is keeping an older transaction waiting and itself receives a nack from an older transaction

7 Block Replacement If a block in a transaction’s rd/wr-set is evicted, the data is written back to memory if necessary, but the directory continues to maintain a “sticky” pointer to that node (subsequent requests have to confirm that the transaction has committed before proceeding) The sticky pointers are lazily removed over time (commits continue to be fast)

8 Paper on TM Pathologies LL: lazy versioning, lazy conflict detection, committing transaction wins conflicts EL: lazy versioning, eager conflict detection, requester succeeds and others abort EE: eager versioning, eager conflict detection, requester stalls

9 Two conflicting transactions that keep aborting each other Can do exponential back-off to handle livelock Fixable by doing requester stalls? VM: any CD: eager CR: requester wins Pathology 1: Friendly Fire

10 A writer has to wait for the reader to finish – but if more readers keep showing up, the writer is starved (note that the directory allows new readers to proceed by just adding them to the list of sharers) VM: any CD: eager CR: requester stalls Pathology 2: Starving Writer

11 If there’s a single commit token, transaction commit is serialized There are ways to alleviate this problem VM: lazy CD: lazy CR: any Pathology 3: Serialized Commit

12 A transaction is stalling on another transaction that ultimately aborts and takes a while to reinstate old values VM: any CD: eager CR: requester stalls Pathology 4: Futile Stall

13 Small successful transactions can keep aborting a large transaction The large transaction can eventually grab the token and not release it until after it commits VM: lazy CD: lazy CR: committer wins Pathology 5: Starving Elder

14 A number of similar (conflicting) transactions execute together – one wins, the others all abort – shortly, these transactions all return and repeat the process VM: lazy CD: lazy CR: committer wins Pathology 6: Restart Convoy

15 If two transactions both read the same object and then both decide to write it, a deadlock is created Exacerbated by the Futile Stall pathology Solution? VM: eager CD: eager CR: requester stalls Pathology 7: Dueling Upgrades

16 Four Extensions Predictor: predict if the read will soon be followed by a write and acquire write permissions aggressively Hybrid: if a transaction believes it is a Starving Writer, it can force other readers to abort; for everything else, use requester stalls Timestamp: In the EL case, requester wins only if it is the older transaction (handles Friendly Fire pathology) Backoff: in the LL case, aborting transactions invoke exponential back-off to prevent convoy formation

17 Title Bullet