© 2006 Mulitfacet ProjectUniversity of Wisconsin-Madison Supporting Nested Transactional Memory in LogTM Michelle J. Moravan, Jayaram Bobba, Kevin E. Moore,

Slides:



Advertisements
Similar presentations
Virtual Hierarchies to Support Server Consolidation Michael Marty and Mark Hill University of Wisconsin - Madison.
Advertisements

Coherence Ordering for Ring-based Chip Multiprocessors Mike Marty and Mark D. Hill University of Wisconsin-Madison.
1 Lecture 18: Transactional Memories II Papers: LogTM: Log-Based Transactional Memory, HPCA’06, Wisconsin LogTM-SE: Decoupling Hardware Transactional Memory.
A KTEC Center of Excellence 1 Cooperative Caching for Chip Multiprocessors Jichuan Chang and Gurindar S. Sohi University of Wisconsin-Madison.
Multi-core systems System Architecture COMP25212 Daniel Goodman Advanced Processor Technologies Group.
ECE 454 Computer Systems Programming Parallel Architectures and Performance Implications (II) Ding Yuan ECE Dept., University of Toronto
CS252: Systems Programming Ninghui Li Program Interview Questions.
Hardware Transactional Memory for GPU Architectures Wilson W. L. Fung Inderpeet Singh Andrew Brownsword Tor M. Aamodt University of British Columbia In.
McRT-Malloc: A Scalable Non-Blocking Transaction Aware Memory Allocator Ali Adl-Tabatabai Ben Hertzberg Rick Hudson Bratin Saha.
Thread-Level Transactional Memory Decoupling Interface and Implementation UW Computer Architecture Affiliates Conference Kevin Moore October 21, 2004.
Transactional Memory (TM) Evan Jolley EE 6633 December 7, 2012.
Nested Transactional Memory: Model and Preliminary Architecture Sketches J. Eliot B. Moss Antony L. Hosking.
1 MetaTM/TxLinux: Transactional Memory For An Operating System Hany E. Ramadan, Christopher J. Rossbach, Donald E. Porter and Owen S. Hofmann Presenter:
1 Lecture 24: Transactional Memory Topics: transactional memory implementations.
Supporting Nested Transactional Memory in LogTM Authors Michelle J Moravan Mark Hill Jayaram Bobba Ben Liblit Kevin Moore Michael Swift Luke Yen David.
Unbounded Transactional Memory Paper by Ananian et al. of MIT CSAIL Presented by Daniel.
LogTM: Log-Based Transactional Memory Kevin E. Moore, Jayaram Bobba, Michelle J. Moravan, Mark D. Hill, & David A. Wood Presented by Colleen Lewis.
Transactions and Reliability. File system components Disk management Naming Reliability  What are the reliability issues in file systems? Security.
Lecture 3 Process Concepts. What is a Process? A process is the dynamic execution context of an executing program. Several processes may run concurrently,
A Qualitative Survey of Modern Software Transactional Memory Systems Virendra J. Marathe Michael L. Scott.
Computer Network Lab. Korea University Computer Networks Labs Se-Hee Whang.
Kevin E. Moore, Jayaram Bobba, Michelle J. Moravan, Mark D. Hill & David A. Wood Presented by: Eduardo Cuervo.
StealthTest: Low Overhead Online Software Testing Using Transactional Memory Jayaram Bobba, Weiwei Xiong*, Luke Yen †, Mark D. Hill, and David A. Wood.
Processes and Virtual Memory
© 2008 Multifacet ProjectUniversity of Wisconsin-Madison Pathological Interaction of Locks with Transactional Memory Haris Volos, Neelam Goyal, Michael.
CS 540 Database Management Systems
© 2006 Mulitfacet ProjectUniversity of Wisconsin-Madison LogTM: Log-Based Transactional Memory Kevin E. Moore, Jayaram Bobba, Michelle J. Moravan, Mark.
4 November 2005 CS 838 Presentation 1 Nested Transactional Memory: Model and Preliminary Sketches J. Eliot B. Moss and Antony L. Hosking Presented by:
Where Testing Fails …. Problem Areas Stack Overflow Race Conditions Deadlock Timing Reentrancy.
Running Commodity Operating Systems on Scalable Multiprocessors Edouard Bugnion, Scott Devine and Mendel Rosenblum Presentation by Mark Smith.
The University of Adelaide, School of Computer Science

Hathi: Durable Transactions for Memory using Flash
Transactions and Reliability
Lecture: Large Caches, Virtual Memory
Speculative Lock Elision
Minh, Trautmann, Chung, McDonald, Bronson, Casper, Kozyrakis, Olukotun
Transactional Memory : Hardware Proposals Overview
The University of Adelaide, School of Computer Science
The University of Adelaide, School of Computer Science
Hardware Transactional Memory
Multiprocessor Cache Coherency
Log-Based Transactional Memory
The University of Adelaide, School of Computer Science
CMSC 611: Advanced Computer Architecture
Two Ideas of This Paper Using Permissions-only Cache to deduce the rate at which less-efficient overflow handling mechanisms are invoked. When the overflow.
Lecture 12: Cache Innovations
The University of Adelaide, School of Computer Science
Lecture 11: Transactional Memory
Log-based Transactional Memory
Lecture 6: Transactions
Lecture: Cache Innovations, Virtual Memory
15-740/ Computer Architecture Lecture 5: Precise Exceptions
Lecture 21: Transactional Memory
Improving Multiple-CMP Systems with Token Coherence
Yiannis Nikolakopoulos
LogTM-SE: Decoupling Hardware Transactional Memory from Caches
Lecture: Cache Hierarchies
The University of Adelaide, School of Computer Science
Performance Pathologies in Hardware Transactional Memory
Lecture 17 Multiprocessors and Thread-Level Parallelism
Transactions in Distributed Systems
CS510 Operating System Foundations
Performance Pathologies in Hardware Transactional Memory
Lecture 17 Multiprocessors and Thread-Level Parallelism
Lecture 23: Transactional Memory
Lecture 21: Transactional Memory
Lecture: Transactional Memory
The University of Adelaide, School of Computer Science
Lecture 17 Multiprocessors and Thread-Level Parallelism
Presentation transcript:

© 2006 Mulitfacet ProjectUniversity of Wisconsin-Madison Supporting Nested Transactional Memory in LogTM Michelle J. Moravan, Jayaram Bobba, Kevin E. Moore, Luke Yen, Mark D. Hill, Ben Liblit, Michael M. Swift, David A. Wood Twelfth International Conference on Architectural Support for Programming Languages and Operating Systems

2 ASPLOS'06 10/25Wisconsin Multifacet Project Motivation Composability –Complex software built from existing modules Concurrency –“Intel has 10 projects in the works that contain four or more computing cores per chip” -- Paul Otellini, Intel CEO, Fall ’05 Communication –Operating system services –Run-time system interactions

3 ASPLOS'06 10/25Wisconsin Multifacet Project Nested LogTM Closed nesting: transactions remain isolated until parent commits Open nesting: transactions commit independent of parent Escape actions: non- transactional execution for system calls and I/O Log Frame Header Undo record Header Undo record Level 0 Level 1 Nested LogTM Splits log into “frames” Replicates R/W bits Software abort processing naturally executes compesating actions Nested LogTM Splits log into “frames” Replicates R/W bits Software abort processing naturally executes compesating actions Log Ptr

© 2006 Mulitfacet ProjectUniversity of Wisconsin-Madison Composability

5 ASPLOS'06 10/25Wisconsin Multifacet Project Composability: Closed Nested Transactions Modules expose interfaces, not implementations Must allow transactions within transactions Example –Insert() calls getID() from within a transaction –The getID() transaction is nested inside the insert() transaction int getID() { begin_transaction(); id = global_id++; commit_transaction(); return id; } void insert(object o){ begin_transaction(); n = find_node(); n.insert(getID(), o); commit_transaction(); } Parent Child

6 ASPLOS'06 10/25Wisconsin Multifacet Project Closed Nesting On Commit child transaction is merged with its parent Flat –Nested transactions “flattened” into a single transaction –Only outermost begins/commits are meaningful –Any conflict aborts to outermost transaction Partial rollback –Child transaction can be aborted independently –Can avoid costly re-execution of parent transaction –But child merges transaction state with parent on commit So most conflicts with child end up affecting the parent Child transactions remain isolated until parent commits

7 ASPLOS'06 10/25Wisconsin Multifacet Project M (RW) Conflict Detection in Nested LogTM Flat LogTM detects conflicts with directory coherence and Read (R ) and Write (W ) bits in caches Nested LogTM replicates R/W bits for each level Flash-Or circuit merges child and parent R/W bits RWTagData Data Caches RW Directory P1 I (-- ) P0 Conflict!

8 ASPLOS'06 10/25Wisconsin Multifacet Project 1 2 Version Management in Nested LogTM Flat LogTM’s log is a single frame (header + undo records) Nested LogTM’s log is a stack of frames A frame contains: –Header (including saved registers and pointer to parent’s frame) –Undo records (block address, old value pairs) –Garbage headers (headers of committed closed transactions) –Commit action records –Compensating action records LogBase LogPtr TM countHeader Undo record Header Undo record LogFrame

9 ASPLOS'06 10/25Wisconsin Multifacet Project Closed Nested Commit Merge child’s log frame with parent’s –Mark child’s header as a “garbage header” –Copy pointer from child’s header to LogFrame LogFrame LogPtr TM count Undo record Header Undo record 1 2 Header

© 2006 Mulitfacet ProjectUniversity of Wisconsin-Madison Concurrency

11 ASPLOS'06 10/25Wisconsin Multifacet Project Increased Concurrency: Open Nesting insert(int key, int value) { begin_transaction(); leaf = find_leaf(key); entry = insert_into_leaf(key, value); // lock entry to isolate node commit_transaction(); } insert_set(set S) { begin_transaction(); while ((key,value) = next(S)){ insert(key, value); } commit_transaction(); }

12 ASPLOS'06 10/25Wisconsin Multifacet Project Open Nesting Higher-level atomicity –Child’s memory updates not undone if parent aborts –Use compensating action to undo the child’s forward action at a higher-level of abstraction E.g., malloc() compensated by free() Higher-level isolation –Release memory-level isolation –Programmer enforce isolation at higher level (e.g., locks) –Use commit action to release isolation at parent commit Child transaction exposes state on commit (i.e., before the parent commits)

13 ASPLOS'06 10/25Wisconsin Multifacet Project Increased Concurrency: Open Nesting insert(int key, int value) { open_begin; leaf = find_leaf(key); entry = insert_into_leaf(key, value); // lock entry to isolate node entry->lock = 1; open_commit(abort_action(delete(key)), commit_action(unlock(key))); } insert_set(set S) { open_begin; while ((key,value) = next(S)) insert(key, value); open_commit(abort_action(delete_set(S))); }  Isolate entry at higher-level of abstraction  Release high-level isolation on ancestor commit  Delete entry if ancestor aborts  Replace compensating action with higher- level action on commit

14 ASPLOS'06 10/25Wisconsin Multifacet Project Commit and Compensating Actions Commit Actions –Execute in FIFO order when innermost open ancestor commits Outermost transaction is considered open Compensating Actions –Discard when innermost open ancestor commits –Execute in LIFO order when ancestor aborts –Execute “in the state that held when its forward action commited” [Moss, TRANSACT ‘06]

15 ASPLOS'06 10/25Wisconsin Multifacet Project Timing of Compensating Actions // initialize to 0 counter = 0; transaction_begin(); // top-level 1 counter++; // counter gets 1 open_begin(); // level 2 counter++; // counter gets 2 open_commit(abort_action(counter--));... // Abort and run compensating action // Expect counter to be restored to 0... transaction_commit(); // not executed

16 ASPLOS'06 10/25Wisconsin Multifacet Project Condition O1 If condition O1 holds programmers need not reason about the interaction between compensation and undo All implementations of nesting (so far) agree on semantics when O1 holds Condition O1: An open nested child transaction never modifies a memory location that has been modified by any ancestor.

17 ASPLOS'06 10/25Wisconsin Multifacet Project Open Nesting in LogTM Conflict Detection –R/W bits cleared on open commit –(no flash or) Version Management –Open commit pops the most recent frame off the log –(Optionally) add commit and compensating action records –Compensating actions are run by the software abort handler –Software handler interleaves restoration of memory state and compensating action execution

18 ASPLOS'06 10/25Wisconsin Multifacet Project Open Nested Commit Discard child’s log frame (Optionally) append commit and compensating actions to log LogFrame LogPtr TM countHeader Undo record Header Undo record 1 2 Commit Action Comp Action

19 ASPLOS'06 10/25Wisconsin Multifacet Project Timing of Compensating Actions // initialize to 0 counter = 0; transaction_begin(); // top-level 1 counter++; // counter gets 1 open_begin(); // level 2 counter++; // counter gets 2 open_commit(abort_action(counter--));... // Abort and run compensating action // Expect counter to be restored to 0... transaction_commit(); // not executed Condition O1: No writes to blocks written by an ancestor transaction. LogTM behaves correctly: Compensating action sees the state of the counter when the open transaction committed (2) Decrement restores the value to what it was before the open nest executed (1) Undo of the parent restores the value back to (0) LogTM behaves correctly: Compensating action sees the state of the counter when the open transaction committed (2) Decrement restores the value to what it was before the open nest executed (1) Undo of the parent restores the value back to (0)

© 2006 Mulitfacet ProjectUniversity of Wisconsin-Madison Communication

21 ASPLOS'06 10/25Wisconsin Multifacet Project Communication: Escape Actions “Real world” is not transactional Current OS’s are not transactional Systems should allow non-transactional escapes from a transaction Interact with OS, VM, devices, etc.

22 ASPLOS'06 10/25Wisconsin Multifacet Project Escape Actions Escape actions never: –Abort –Stall –Cause other transactions to abort –Cause other transactions to stall Commit and compensating actions –similar to open nested transactions Escape actions bypass transaction isolation and version management. Not recommended for the average programmer!

23 ASPLOS'06 10/25Wisconsin Multifacet Project Case Study: System Calls in Solaris Category#Examples Read-only57getpid, times, stat, access, mincore, sync, pread, gettimeofday Undoable (without global side effects) 40chdir, dup, umask, seteuid, nice, seek, mprotect Undoable (with global side effects) 27chmod, mkdir, link, mknod, stime Calls not handled by escape actions 89kill, fork, exec, umount

24 ASPLOS'06 10/25Wisconsin Multifacet Project Escape Actions in LogTM Loads and stores to non-transactional blocks behave as normal coherent accesses Loads return the latest value in coherent memory –Loads to a transactionally modified cache block triggers a writeback (sticky-M state) –Memory responds with an uncacheable copy of the block Stores modify coherent memory –Stores to transactionally modified blocks trigger writebacks (sticky-M) –Updates the value in memory (non-cacheable write through)

25 ASPLOS'06 10/25Wisconsin Multifacet Project Methods Simulated Machine: 32-way non-CMP 32 SPARC V9 processors running Solaris 9 OS 1 GHz in-order processors w/ ideal IPC=1 & private caches 16 kB 4-way split L1 cache, 1 cycle latency 4 MB 4-way unified L2 cache, 12 cycle latency 4 GB main memory, 80-cycle access latency Full-bit vector directory w/ directory cache Hierarchical switch interconnect, 14-cycle latency Simulation Infrastructure –Virtutech Simics for full-system function –Multifacet GEMS for memory system timing (Ruby only) GPL Release: –Magic no-ops instructions for begin_transaction() etc.

26 ASPLOS'06 10/25Wisconsin Multifacet Project B-Tree: Closed Nesting

27 ASPLOS'06 10/25Wisconsin Multifacet Project B-Tree: Closed Nesting

28 ASPLOS'06 10/25Wisconsin Multifacet Project B-Tree: Open Nesting

29 ASPLOS'06 10/25Wisconsin Multifacet Project Conclusions Closed Nesting (partial rollback) –Easy to implement--segment the transaction log (stack of log frames)/Replicate R & W bits –Small performance gains for LogTM Open Nesting –Easy to implement--software abort handling allows easy execution of commit actions and compensating actions –Big performance gains –Added software complexity Escape Actions –Provide non-transactional operations inside transactions –Sufficient for most Solaris system calls

30 ASPLOS'06 10/25Wisconsin Multifacet Project BACKUP SLIDES

31 ASPLOS'06 10/25Wisconsin Multifacet Project How Do Transactional Memory Systems Differ? (Data) Version Management –Eager: record old values “elsewhere”; update “in place” –Lazy: update “elsewhere”; keep old values “in place” (Data) Conflict Detection –Eager: detect conflict on every read/write –Lazy: detect conflict at end (commit/abort)  Fast commit  Less wasted work

32 ASPLOS'06 10/25Wisconsin Multifacet Project Microbenchmark Analysis Shared Counter –All threads update the same counter –High contention –Small Transactions LogTM v. Locks –EXP - Test-And-Test-And- Set Locks with Exponential Backoff –MCS - Software Queue- Based Locks BEGIN_TRANSACTION(); new_total = total.count + 1; private_data[id].count++; total.count = new_total; COMMIT_TRANSACTION();

33 ASPLOS'06 10/25Wisconsin Multifacet Project Locks are Hard // WITH LOCKS void move(T s, T d, Obj key){ LOCK(s); LOCK(d); tmp = s.remove(key); d.insert(key, tmp); UNLOCK(d); UNLOCK(s); } DEADLOCK! move(a, b, key1); move(b, a, key2); Thread 0 Thread 1 Moreover Coarse-grain locking limits concurrency Fine-grain locking difficult

34 ASPLOS'06 10/25Wisconsin Multifacet Project Eager Version Management Discussion Advantages: –No extra indirection (unlike STM) –Fast Commits No copying Common case Disadvantages –Slow/Complex Aborts Undo aborting transaction –Relies on Eager Conflict Detection/Prevention

35 ASPLOS'06 10/25Wisconsin Multifacet Project Log-Based Transactional Memory (LogTM) HPCA LogTM: Log-Based Transactional Memory, Kevin E. Moore, Jayaram Bobba, Michelle J. Moravan, Mark D. Hill and David A. Wood New values stored in place (even in main memory) Old values stored in a thread-private transaction log Aborts processed in software Virtual Memory New Values Old Values Transaction Logs

36 ASPLOS'06 10/25Wisconsin Multifacet Project Strided Array

37 ASPLOS'06 10/25Wisconsin Multifacet Project Sticky States MSI MM EE SS ISticky-MSticky-SI Directory State Cache State

38 ASPLOS'06 10/25Wisconsin Multifacet Project Flat LogTM (HPCA’06) C New values stored in place Old values stored in the transaction log –A per-thread linear (virtual) address space (like the stack) –Filled by hardware (during transactions) –Read by software (on abort) R/W bits Data BlockVA Log Base Log Ptr TM count R W c Transaction Log

39 ASPLOS'06 10/25Wisconsin Multifacet Project Conflict Detection in LogTM Conflict Detection –Nested LogTM replicates R/W bits for each level –Flash-Or circuit merges child and parent R/W bits Version Management –Nested LogTM segments the log into frames –(similar to a stack of activation records) RWTagData Data Caches Processor RegistersRegister Checkpoint LogBase LogPtr TMcount RW LogFrame

40 ASPLOS'06 10/25Wisconsin Multifacet Project Motivation: Transactional Memory Chip-multiprocessors/Multi-core/Many-core are here –“Intel has 10 projects in the works that contain four or more computing cores per chip” -- Paul Otellini, Intel CEO, Fall ’05 We must effectively program these systems –But programming with locks is challenging –“Blocking on a mutex is a surprisingly delicate dance” -- OpenSolaris, mutex.c

41 ASPLOS'06 10/25Wisconsin Multifacet Project Conclusions Nested LogTM supports: –Closed nesting facilitates software composition –Open nesting increases concurrency (but adds complexity) –Escape actions support non-transactional actions Nested LogTM Version Management –Segments the transaction log (stack of log frames) –Software abort handling allows easy execution of commit actions and compensating actions Nested LogTM Conflict Detection –Replicates R/W in caches –Flash-Or merges closed child with parent