1 Lecture 18: Transactional Memories II Papers: LogTM: Log-Based Transactional Memory, HPCA’06, Wisconsin LogTM-SE: Decoupling Hardware Transactional Memory.

Slides:



Advertisements
Similar presentations
1 Lecture 6: Directory Protocols Topics: directory-based cache coherence implementations (wrap-up of SGI Origin and Sequent NUMA case study)
Advertisements

1 Lecture 20: Speculation Papers: Is SC+ILP=RC?, Purdue, ISCA’99 Coherence Decoupling: Making Use of Incoherence, Wisconsin, ASPLOS’04 Selective, Accurate,
Transactional Memory (TM) Evan Jolley EE 6633 December 7, 2012.
1 Lecture 12: Hardware/Software Trade-Offs Topics: COMA, Software Virtual Memory.
1 Lecture 21: Transactional Memory Topics: consistency model recap, introduction to transactional memory.
1 Lecture 10: TM Pathologies Topics: scalable lazy implementation, paper on TM performance pathologies.
CS 7810 Lecture 19 Coherence Decoupling: Making Use of Incoherence J.Huh, J. Chang, D. Burger, G. Sohi Proceedings of ASPLOS-XI October 2004.
1 Lecture 7: Transactional Memory Intro Topics: introduction to transactional memory, “lazy” implementation.
1 Lecture 4: Directory-Based Coherence Details of memory-based (SGI Origin) and cache-based (Sequent NUMA-Q) directory protocols.
1 Lecture 8: Eager Transactional Memory Topics: implementation details of eager TM, various TM pathologies.
1 Lecture 8: Transactional Memory – TCC Topics: “lazy” implementation (TCC)
1 Lecture 22: Fault Tolerance Papers: Token Coherence: Decoupling Performance and Correctness, ISCA’03, Wisconsin A Low Overhead Fault Tolerant Coherence.
1 Lecture 2: Snooping and Directory Protocols Topics: Snooping wrap-up and directory implementations.
1 Lecture 24: Transactional Memory Topics: transactional memory implementations.
1 Lecture 3: Snooping Protocols Topics: snooping-based cache coherence implementations.
Supporting Nested Transactional Memory in LogTM Authors Michelle J Moravan Mark Hill Jayaram Bobba Ben Liblit Kevin Moore Michael Swift Luke Yen David.
1 Lecture 6: TM – Eager Implementations Topics: Eager conflict detection (LogTM), TM pathologies.
1 Lecture 5: Directory Protocols Topics: directory-based cache coherence implementations.
1 Lecture 15: Consistency Models Topics: sequential consistency, requirements to implement sequential consistency, relaxed consistency models.
Unbounded Transactional Memory Paper by Ananian et al. of MIT CSAIL Presented by Daniel.
1 Lecture 5: TM – Lazy Implementations Topics: TM design (TCC) with lazy conflict detection and lazy versioning, intro to eager conflict detection.
1 Lecture 4: Directory Protocols and TM Topics: corner cases in directory protocols, lazy TM.
NUMA coherence CSE 471 Aut 011 Cache Coherence in NUMA Machines Snooping is not possible on media other than bus/ring Broadcast / multicast is not that.
1 Lecture 9: TM Implementations Topics: wrap-up of “lazy” implementation (TCC), eager implementation (LogTM)
1 Lecture 7: Lazy & Eager Transactional Memory Topics: details of “lazy” TM, scalable lazy TM, implementation details of eager TM.
1 Lecture 3: Directory-Based Coherence Basic operations, memory-based and cache-based directories.
Architecture and Design of AlphaServer GS320 Kourosh Gharachorloo, Madhu Sharma, Simon Steely, and Stephen Van Doren ASPLOS’2000 Presented By: Alok Garg.
1 Lecture 10: TM Implementations Topics: wrap-up of eager implementation (LogTM), scalable lazy implementation.
LogTM: Log-Based Transactional Memory Kevin E. Moore, Jayaram Bobba, Michelle J. Moravan, Mark D. Hill, & David A. Wood Presented by Colleen Lewis.
A Qualitative Survey of Modern Software Transactional Memory Systems Virendra J. Marathe Michael L. Scott.
1 Lecture 12: Hardware/Software Trade-Offs Topics: COMA, Software Virtual Memory.
1 Lecture 24: Fault Tolerance Papers: Token Coherence: Decoupling Performance and Correctness, ISCA’03, Wisconsin A Low Overhead Fault Tolerant Coherence.
Kevin E. Moore, Jayaram Bobba, Michelle J. Moravan, Mark D. Hill & David A. Wood Presented by: Eduardo Cuervo.
Software Transactional Memory Should Not Be Obstruction-Free Robert Ennals Presented by Abdulai Sei.
1 Lecture 3: Coherence Protocols Topics: consistency models, coherence protocol examples.
© 2006 Mulitfacet ProjectUniversity of Wisconsin-Madison LogTM: Log-Based Transactional Memory Kevin E. Moore, Jayaram Bobba, Michelle J. Moravan, Mark.
1 Lecture 20: Speculation Papers: Is SC+ILP=RC?, Purdue, ISCA’99 Coherence Decoupling: Making Use of Incoherence, Wisconsin, ASPLOS’04.
1 Lecture 10: Transactional Memory Topics: lazy and eager TM implementations, TM pathologies.
1 Lecture 7: PCM Wrap-Up, Cache coherence Topics: handling PCM errors and writes, cache coherence intro.
1 Lecture 7: Implementing Cache Coherence Topics: implementation details.
1 Lecture 8: Snooping and Directory Protocols Topics: 4/5-state snooping protocols, split-transaction implementation details, directory implementations.
Lecture 8: Snooping and Directory Protocols
Lecture 20: Consistency Models, TM
Minh, Trautmann, Chung, McDonald, Bronson, Casper, Kozyrakis, Olukotun
Architecture and Design of AlphaServer GS320
Hardware Transactional Memory
Two Ideas of This Paper Using Permissions-only Cache to deduce the rate at which less-efficient overflow handling mechanisms are invoked. When the overflow.
Lecture 19: Transactional Memories III
Lecture 9: Directory-Based Examples II
Lecture 2: Snooping-Based Coherence
Lecture 11: Transactional Memory
Lecture 5: Snooping Protocol Design Issues
Lecture: Transactional Memory, Networks
Lecture: Consistency Models, TM
Lecture 6: Transactions
Lecture 17: Transactional Memories I
Lecture 21: Transactional Memory
Lecture 22: Consistency Models, TM
Lecture 9: Directory Protocol Implementations
Lecture 9: Directory-Based Examples
Lecture 10: Consistency Models
LogTM-SE: Decoupling Hardware Transactional Memory from Caches
Lecture: Interconnection Networks
Lecture 8: Directory-Based Examples
Lecture 23: Transactional Memory
Lecture 21: Transactional Memory
Lecture: Consistency Models, TM
Lecture: Transactional Memory
Lecture 10: Directory-Based Examples II
Lecture 11: Consistency Models
Presentation transcript:

1 Lecture 18: Transactional Memories II Papers: LogTM: Log-Based Transactional Memory, HPCA’06, Wisconsin LogTM-SE: Decoupling Hardware Transactional Memory from Caches, HPCA’07, Wisconsin

2 LogTM Underlying conventional directory-based protocol Eager Versioning: writes happen “in place”, old values are stored away in virtual memory (hence, can handle large transactions), old values will be reinstated in case of an abort (expensive abort), not much to be done on a commit (fast commit) Eager Conflict Detection: coherence permissions are acquired immediately, a conflict is signaled if the latest value is obtained from a transaction that has not yet committed the write

3 Handling Reads P2 detects that someone is trying to read a value in its uncommitted write set; P2 not ready to make A available (has other work that it wants to finish “atomically”); sends a “conflict” message to requestor P1 P2 Wr-set: A Rd A Dir A: P2 Rdreq Rqfwd nack Dealing with the conflict: Both can abort – livelock! P1 can abort… it is possible that P2 also tries to read an uncommitted write of P1 and also aborts – livelock! P2 can abort… as above, can lead to livelock P1 waits for a while and re-tries; if T2 has finished, A is available; can cause deadlock as P2 may also wait for P1; eventually a software contention manager kicks in

4 Handling Writes The same rules apply, whether A is in P2’s read or write set P1 P2 Rd-set: A Wr A Dir A: P2 Rdexc Inval nack Dealing with the conflict: Both can abort – livelock! P1 can abort… it is possible that P2 also tries to read an uncommitted write of P1 and also aborts – livelock! P2 can abort… as above, can lead to livelock P1 waits for a while and re-tries; if T2 has finished, A is available; can cause deadlock as P2 may also wait for P1; eventually a software contention manager kicks in

5 Block Replacement If a block in a transaction’s rd/wr-set is evicted, the data is written back to memory if necessary, but the directory continues to maintain a “sticky” pointer to that node (subsequent requests have to confirm that the transaction has committed before proceeding) The sticky pointers are lazily removed over time (commits continue to be fast)

6 Versioning Every write first requires a read and a write to log the old value – the log is maintained in virtual memory and will likely be found in cache Aborts are uncommon – typically only when the contention manager kicks in on a potential deadlock; the logs are walked through in reverse order If a block is already marked as being logged (wr-set), the next write by that transaction can avoid the re-log Log writes can be placed in a write buffer to reduce contention for L1 cache ports

7 Results

8 Few stalls and abortsSmall write log buffer will avoid L1 port contention Writes are preceded by reads, so a log imposes low overhead

9 LogTM-SE Motivation: decouple conflict detection from cache implementation: maintain a signature to capture read and write sets  eliminates R and W bits for every cache block  eliminates the need to flash-clear above bits on commit  easy to save transaction state (simplifies nesting, context switching, etc.) Conflict detection can now yield false positives: signature indicates there is a conflict, but in reality, none exists

10 Signatures

11 Signature Behavior P : perfect (rd and write sets) BS : bit select (2048 bit signature) CBS : coarse bit select DBS : double bit select BS_64 : bit select (64 bit signature)

12 Title Bullet