1 Lecture 23: Transactional Memory Topics: consistency model recap, introduction to transactional memory.

Slides:



Advertisements
Similar presentations
Bounded Model Checking of Concurrent Data Types on Relaxed Memory Models: A Case Study Sebastian Burckhardt Rajeev Alur Milo M. K. Martin Department of.
Advertisements

Symmetric Multiprocessors: Synchronization and Sequential Consistency.
1 Lecture 20: Synchronization & Consistency Topics: synchronization, consistency models (Sections )
Optimistic Methods for Concurrency Control By : H.T. Kung & John T. Robinson Presenters: Munawer Saeed.
CS 162 Memory Consistency Models. Memory operations are reordered to improve performance Hardware (e.g., store buffer, reorder buffer) Compiler (e.g.,
1 Lecture 20: Speculation Papers: Is SC+ILP=RC?, Purdue, ISCA’99 Coherence Decoupling: Making Use of Incoherence, Wisconsin, ASPLOS’04 Selective, Accurate,
1 Lecture 21: Transactional Memory Topics: consistency model recap, introduction to transactional memory.
CS 7810 Lecture 19 Coherence Decoupling: Making Use of Incoherence J.Huh, J. Chang, D. Burger, G. Sohi Proceedings of ASPLOS-XI October 2004.
1 Lecture 7: Transactional Memory Intro Topics: introduction to transactional memory, “lazy” implementation.
1 Lecture 8: Transactional Memory – TCC Topics: “lazy” implementation (TCC)
1 Lecture 21: Synchronization Topics: lock implementations (Sections )
1 Lecture 7: Consistency Models Topics: sequential consistency, requirements to implement sequential consistency, relaxed consistency models.
1 Lecture 24: Transactional Memory Topics: transactional memory implementations.
Lecture 13: Consistency Models
1 Lecture 6: TM – Eager Implementations Topics: Eager conflict detection (LogTM), TM pathologies.
Scalable, Reliable, Power-Efficient Communication for Hardware Transactional Memory Seth Pugsley, Manu Awasthi, Niti Madan, Naveen Muralimanohar and Rajeev.
Computer Architecture II 1 Computer architecture II Lecture 9.
1 Lecture 6: Lazy Transactional Memory Topics: TM semantics and implementation details of “lazy” TM.
1 Lecture 15: Consistency Models Topics: sequential consistency, requirements to implement sequential consistency, relaxed consistency models.
Unbounded Transactional Memory Paper by Ananian et al. of MIT CSAIL Presented by Daniel.
1 Lecture 5: TM – Lazy Implementations Topics: TM design (TCC) with lazy conflict detection and lazy versioning, intro to eager conflict detection.
1 Lecture 4: Directory Protocols and TM Topics: corner cases in directory protocols, lazy TM.
1 Lecture 9: TM Implementations Topics: wrap-up of “lazy” implementation (TCC), eager implementation (LogTM)
Synchron. CSE 4711 The Need for Synchronization Multiprogramming –“logical” concurrency: processes appear to run concurrently although there is only one.
1 Lecture 10: TM Implementations Topics: wrap-up of eager implementation (LogTM), scalable lazy implementation.
1 Lecture 22: Synchronization & Consistency Topics: synchronization, consistency models (Sections )
1 Lecture 20: Protocols and Synchronization Topics: distributed shared-memory multiprocessors, synchronization (Sections )
1 Computer Architecture Research Overview Focus on: Transactional Memory Rajeev Balasubramonian School of Computing, University of Utah
Memory Consistency Models Alistair Rendell See “Shared Memory Consistency Models: A Tutorial”, S.V. Adve and K. Gharachorloo Chapter 8 pp of Wilkinson.
Shared Memory Consistency Models. SMP systems support shared memory abstraction: all processors see the whole memory and can perform memory operations.
1 Lecture 19: Scalable Protocols & Synch Topics: coherence protocols for distributed shared-memory multiprocessors and synchronization (Sections )
1 Lecture 3: Coherence Protocols Topics: consistency models, coherence protocol examples.
1 Lecture 20: Speculation Papers: Is SC+ILP=RC?, Purdue, ISCA’99 Coherence Decoupling: Making Use of Incoherence, Wisconsin, ASPLOS’04.
1 Lecture 25: Multiprocessors Today’s topics:  Synchronization  Consistency  Shared memory vs message-passing  Simultaneous multi-threading (SMT)
1 Lecture 9: Directory Protocol, TM Topics: corner cases in directory protocols, coherence vs. message-passing, TM intro.
Novel Paradigms of Parallel Programming Prof. Smruti R. Sarangi IIT Delhi.
Symmetric Multiprocessors: Synchronization and Sequential Consistency
Lecture 20: Consistency Models, TM
Lecture 19: Coherence and Synchronization
The University of Adelaide, School of Computer Science
Lecture 11: Consistency Models
Lecture 19: Transactional Memories III
Symmetric Multiprocessors: Synchronization and Sequential Consistency
Shared Memory Consistency Models: A Tutorial
Lecture 12: TM, Consistency Models
Lecture 5: Snooping Protocol Design Issues
Lecture: Consistency Models, TM
Symmetric Multiprocessors: Synchronization and Sequential Consistency
Lecture 6: Transactions
Lecture 21: Transactional Memory
Lecture 21: Synchronization and Consistency
Lecture 22: Consistency Models, TM
Lecture: Coherence and Synchronization
Lecture: Consistency Models, TM
Lecture 10: Consistency Models
The University of Adelaide, School of Computer Science
Lecture 17 Multiprocessors and Thread-Level Parallelism
Lecture 24: Multiprocessors
Lecture 10: Coherence, Transactional Memory
Lecture 17 Multiprocessors and Thread-Level Parallelism
Lecture 23: Transactional Memory
Lecture 21: Transactional Memory
Lecture 21: Synchronization & Consistency
Lecture: Coherence and Synchronization
Lecture: Consistency Models, TM
Lecture: Transactional Memory
Lecture 18: Coherence and Synchronization
The University of Adelaide, School of Computer Science
Lecture 17 Multiprocessors and Thread-Level Parallelism
Lecture 11: Consistency Models
Presentation transcript:

1 Lecture 23: Transactional Memory Topics: consistency model recap, introduction to transactional memory

2 Example Programs Initially, A = B = 0 P1 P2 A = 1 B = 1 if (B == 0) if (A == 0) critical section critical section Initially, A = B = 0 P1 P2 P3 A = 1 if (A == 1) B = 1 if (B == 1) register = A P1 P2 Data = 2000 while (Head == 0) Head = 1 { } … = Data

3 Sequential Consistency P1 P2 Instr-a Instr-A Instr-b Instr-B Instr-c Instr-C Instr-d Instr-D … … We assume: Within a program, program order is preserved Each instruction executes atomically Instructions from different threads can be interleaved arbitrarily Valid executions: abAcBCDdeE… or ABCDEFabGc… or abcAdBe… or aAbBcCdDeE… or …..

4 Sequential Consistency A multiprocessor is sequentially consistent if the result of the execution is achieveable by maintaining program order within a processor and interleaving accesses by different processors in an arbitrary fashion Can implement sequential consistency by requiring the following: program order, write serialization, everyone has seen an update before a value is read – very intuitive for the programmer, but extremely slow This is very slow… alternatives:  Add optimizations to the hardware  Offer a relaxed memory consistency model and fences

5 Fences P1 P2 { { Region of code Region of code with no races with no races } } Fence Acquire_lock Fence { { Racy code Racy code } } Fence Release_lock Fence

6 Transactions New paradigm to simplify programming  instead of lock-unlock, use transaction begin-end  locks are blocking, transactions execute speculative in the hope that there will be no conflicts Can yield better performance; Eliminates deadlocks Programmer can freely encapsulate code sections within transactions and not worry about the impact on performance and correctness (for the most part) Programmer specifies the code sections they’d like to see execute atomically – the hardware takes care of the rest (provides illusion of atomicity)

7 Transactions Transactional semantics:  when a transaction executes, it is as if the rest of the system is suspended and the transaction is in isolation  the reads and writes of a transaction happen as if they are all a single atomic operation  if the above conditions are not met, the transaction fails to commit (abort) and tries again transaction begin read shared variables arithmetic write shared variables transaction end

8 Example 1 lock (lock1) counter = counter + 1; unlock (lock1) transaction begin counter = counter + 1; transaction end No apparent advantage to using transactions (apart from fault resiliency)

9 Example 2 Producer-consumer relationships – producers place tasks at the tail of a work-queue and consumers pull tasks out of the head Enqueue Dequeue transaction begin transaction begin if (tail == NULL) if (head->next == NULL) update head and tail update head and tail else else update tail update head transaction end transaction end With locks, neither thread can proceed in parallel since head/tail may be updated – with transactions, enqueue and dequeue can proceed in parallel – transactions will be aborted only if the queue is nearly empty

10 Example 3 Hash table implementation transaction begin index = hash(key); head = bucket[index]; traverse linked list until key matches perform operations transaction end Most operations will likely not conflict  transactions proceed in parallel Coarse-grain lock  serialize all operations Fine-grained locks (one for each bucket)  more complexity, more storage, concurrent reads not allowed, concurrent writes to different elements not allowed

11 TM Implementation Core Cache Caches track read-sets and write-sets Writes are made visible only at the end of the transaction At transaction commit, make your writes visible; others may abort

12 Detecting Conflicts – Basic Implementation Writes can be cached (can’t be written to memory) – if the block needs to be evicted, flag an overflow (abort transaction for now) – on an abort, invalidate the written cache lines Keep track of read-set and write-set (bits in the cache) for each transaction When another transaction commits, compare its write set with your own read set – a match causes an abort At transaction end, express intent to commit, broadcast write-set (transactions can commit in parallel if their write-sets do not intersect)

13 Summary of TM Benefits As easy to program as coarse-grain locks Performance similar to fine-grain locks Speculative parallelization Avoids deadlock Resilient to faults

14 Design Space Data Versioning  Eager: based on an undo log  Lazy: based on a write buffer Conflict Detection  Optimistic detection: check for conflicts at commit time (proceed optimistically thru transaction)  Pessimistic detection: every read/write checks for conflicts (reduces work during commit)

15 Relation to LL-SC Transactions can be viewed as an extension of LL-SC LL-SC ensures that the read-modify-write for a single variable is atomic; a transaction ensures atomicity for all variables accessed between trans-begin and trans-end Vers-1 Vers-2 Vers-3 ll a ll a trans-begin ld b ll b ld a st b sc b ld b sc a sc a st b st a trans-end

16 Title Bullet