Transactional Memory
Lecturer: Danny Hendler

The Future of Computing
Speeding up uni-processors is getting harder and harder.
Intel, Sun (RIP), AMD, and IBM are now focusing on “multi-core” architectures.
Already, most computers are multiprocessors.
How can we write correct and efficient algorithms for multiprocessors?

A fundamental problem of thread-level parallelism
Thread A:
    Account[i] = Account[i]-X;
    Account[j] = Account[j]+X;
    ...
Thread B:
    Account[i] = Account[i]-X;
    Account[j] = Account[j]+X;
    ...
But what if execution is concurrent? We must avoid race conditions.

Inter-thread synchronization alternatives

What is a transaction?
A transaction is a sequence of memory reads and writes, executed by a single thread, that either commits or aborts.
If a transaction commits, all its reads and writes appear to have executed atomically.
If a transaction aborts, none of its stores take effect.
A transaction's operations are not visible until it commits (if it does).

Transaction properties
A transaction satisfies the following key properties:
Atomicity: each transaction either commits (its changes seem to take effect atomically) or aborts (its changes have no effect).
Serializability: all committed transactions issue the same operations and receive the same responses as in some sequential history consisting only of committed transactions. Some work considers weaker or stronger requirements.
Isolation: a transaction's writes are not visible outside the transaction until it commits.

Transactional Memory Goals
A new multiprocessor architecture.
The goal: implementing nonblocking synchronization that is
▫ efficient
▫ easy to use
compared with conventional techniques based on mutual exclusion.
Implemented by hardware support (such as straightforward extensions to multiprocessor cache-coherence protocols) and/or by software mechanisms.

A Usage Example
Locks:
    Lock(L[i]);
    Lock(L[j]);
    Account[i] = Account[i] - X;
    Account[j] = Account[j] + X;
    Unlock(L[j]);
    Unlock(L[i]);
Transactional Memory:
    atomic {
        Account[i] = Account[i] - X;
        Account[j] = Account[j] + X;
    };
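One hazard of the lock-based version is worth spelling out: if one thread transfers from i to j while another transfers from j to i, taking the locks in argument order can deadlock. The sketch below shows one common remedy, acquiring locks in increasing index order. It is an illustration only; the names Bank, transfer, total, and runDemo are hypothetical and do not come from the lecture.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: a lock-per-account bank with ordered lock acquisition.
class Bank {
    private final long[] account;
    private final ReentrantLock[] lock;

    Bank(int n, long initialBalance) {
        account = new long[n];
        lock = new ReentrantLock[n];
        for (int k = 0; k < n; k++) {
            account[k] = initialBalance;
            lock[k] = new ReentrantLock();
        }
    }

    // Moves x from account i to account j. Locks are always taken in
    // increasing index order, so opposite transfers cannot deadlock.
    void transfer(int i, int j, long x) {
        if (i == j) return;
        int first = Math.min(i, j);
        int second = Math.max(i, j);
        lock[first].lock();
        try {
            lock[second].lock();
            try {
                account[i] -= x;
                account[j] += x;
            } finally {
                lock[second].unlock();
            }
        } finally {
            lock[first].unlock();
        }
    }

    // Sums all balances while holding every lock (taken in index order).
    long total() {
        for (ReentrantLock l : lock) l.lock();
        try {
            long sum = 0;
            for (long a : account) sum += a;
            return sum;
        } finally {
            for (ReentrantLock l : lock) l.unlock();
        }
    }

    // Two threads transfer in opposite directions; the total must be preserved.
    static long runDemo() {
        Bank bank = new Bank(2, 1000);
        Thread a = new Thread(() -> { for (int k = 0; k < 10000; k++) bank.transfer(0, 1, 1); });
        Thread b = new Thread(() -> { for (int k = 0; k < 10000; k++) bank.transfer(1, 0, 1); });
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return bank.total();
    }

    public static void main(String[] args) {
        System.out.println("total = " + runDemo());
    }
}
```

With the atomic block, the programmer does not have to invent such an ordering discipline; that burden moves to the TM implementation.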

Transactions interaction
Transactions execute in commit order.
[Figure: timeline of three transactions. Transaction A: ld 0xdddd ... st 0xbeef, then Commit. Transaction B: ld 0xdddd ... ld 0xbbbb, then Commit. Transaction C: ld 0xbeef; when 0xbeef is committed this is a violation, so C re-executes with the new data (ld 0xbeef).]
Taken from a presentation by Royi Maimon & Merav Havuv, prepared for a seminar given by Prof. Yehuda Afek.

Software Transactional Memory for Dynamic-Sized Data Structures (DSTM – Dynamic STM)
Maurice Herlihy, Victor Luchangco, Mark Moir, William N. Scherer III
PODC 2003
Prepared by Adi Suissa

Motivation
Transactional Memory simplifies parallel programming.
STM: software-based TM
▫ Usually simpler than hardware-based TM
▫ Can handle situations where HTM fails
However:
▫ It is immature (supports only static data sets and static transactions)
▫ It is complicated

Overview
▫ Short recap and what’s new?
▫ How to use DSTM?
▫ Example
▫ Diving into DSTM
▫ Example 2
▫ Improving performance
▫ Obstruction freedom

Transactions
Transaction: a sequence of steps executed by a single thread.
Transactions are atomic: each transaction either commits (it takes effect) or aborts (its effects are discarded).
Transactions are linearizable: they appear to take effect in a one-at-a-time order.

The computation model
    Starting transaction
    Read-Transactional(o1)
    Write-Transactional(o2)
    Read(o3)
    Write(o4)
    Commit-Transaction

The computation model
Committing a transaction can have two outcomes:
▫ Success: the transaction’s operations take effect
▫ Failure: the operations are discarded
Implemented in Java and in C++.

Previous STM designs
Only static memory: the memory that can be accessed transactionally must be declared statically.
▫ We want the ability to create transactional objects dynamically.
Only static transactions: a transaction must declare which addresses it is going to access before it begins.
▫ We want to let a transaction decide which objects to access based on information from objects it has already read inside the transaction.
This is why it is called Dynamic Software Transactional Memory.

Threads
A thread that executes transactions must extend TMThread.
Each thread can run a single transaction at a time.
    // API summary; method bodies omitted
    class TMThread extends Thread {
        void beginTransaction();
        boolean commitTransaction();
        void abortTransaction();
    }
Don’t forget the run() method.

Objects (1)
All transactional objects must implement the TMCloneable interface:
    interface TMCloneable {
        Object clone();
    }
This method clones the object, but clone implementors don’t need to handle synchronization issues.

Objects (2)
To make an object transactional, it must be wrapped.
TMObject is a container for regular Java objects.
[Figure: a TMObject wrapping an Object]

Opening an object
Before a TMObject is used in a transaction, it must be opened.
An object can be opened either for READ or for WRITE (which also allows reading).
    class TMObject {
        TMObject(Object obj);
        enum Mode {READ, WRITE};
        Object open(Mode mode);
    }

An atomic counter (1)
The counter has a single data member and two operations:
    class Counter implements TMCloneable {
        int counterValue = 0;
        void inc();     // increments the value
        int value();    // returns the value
        Object clone();
    }
The object is shared by multiple threads.

An atomic counter (2)
When a thread wants to access the counter in a transaction, it must first open the object using the encapsulated version:
    // Setup: wrap the counter in a TMObject
    Counter counter = new Counter();
    TMObject tranCounter = new TMObject(counter);
    // Later, in a transactional thread
    ((TMThread)Thread.currentThread()).beginTransaction();
    …
    Counter counter = (Counter)tranCounter.open(WRITE);
    counter.inc();
    …
    ((TMThread)Thread.currentThread()).commitTransaction(); // returns true/false to indicate commit status

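DSTM implementation
Transactional object structure:
[Figure: a TMObject holds a start pointer to a Locator. The Locator has three fields: transaction (pointing to a transaction with a status), new object, and old object (pointing to Data).]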

Current object version
The current object version is determined by the status of the transaction that most recently opened the object in WRITE mode:
▫ committed: the new object is the current version
▫ aborted: the old object is the current version
▫ active: the old object is the current version, and the new object is tentative
The actual version changes only when a commit is successful.
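The rule above fits in a few lines of Java. This is a minimal sketch, not the DSTM source: the class and field names (LocatorSketch, Transaction, Locator, currentVersion) are assumptions chosen to mirror the labels on the slides.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch of the locator structure and the current-version rule.
class LocatorSketch {
    enum Status { ACTIVE, COMMITTED, ABORTED }

    // Transaction descriptor; only the status matters for this rule.
    static final class Transaction {
        final AtomicReference<Status> status = new AtomicReference<Status>(Status.ACTIVE);
    }

    // Immutable locator: the last writer plus the old and new versions.
    static final class Locator {
        final Transaction transaction; // most recent transaction to open for WRITE
        final Object oldObject;
        final Object newObject;

        Locator(Transaction transaction, Object oldObject, Object newObject) {
            this.transaction = transaction;
            this.oldObject = oldObject;
            this.newObject = newObject;
        }
    }

    // committed -> new object; aborted or active -> old object.
    static Object currentVersion(Locator loc) {
        switch (loc.transaction.status.get()) {
            case COMMITTED:
                return loc.newObject;
            default:
                return loc.oldObject;
        }
    }

    public static void main(String[] args) {
        Transaction t = new Transaction();
        Locator loc = new Locator(t, "old", "new");
        System.out.println(currentVersion(loc)); // writer still active
        t.status.set(Status.COMMITTED);
        System.out.println(currentVersion(loc)); // writer committed
    }
}
```

Because the version is derived from the writer's status, a single change of that status switches the current version of every object the transaction wrote.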

Opening an object (1)
Let's assume transaction A tries to open object o in WRITE mode.
Let transaction B be the transaction that most recently opened o in WRITE mode.
We need to distinguish between the following cases:
▫ B is committed
▫ B is aborted
▫ B is active

Opening an object (2) – B committed
[Figure: o’s start pointer refers to B’s Locator (transaction: committed, new object, old object, Data); A builds A’s Locator (transaction: active).]
1. A creates a new Locator.
2. A clones the previous new object, and sets its new object to the clone.
3. A sets its old object to the previous new object.
4. A uses CAS to replace the locator.
If the CAS fails, A restarts from the beginning.

Opening an object (3) – B aborted
[Figure: o’s start pointer refers to B’s Locator (transaction: aborted, new object, old object, Data); A builds A’s Locator (transaction: active).]
1. A creates a new Locator.
2. A clones the previous old object, and sets its new object to the clone.
3. A sets its old object to the previous old object.
4. A uses CAS to replace the locator.

Opening an object (4) – B active
Problem: B is active and can either commit or abort, so which version (old/new) should we use?
Answer: A and B are conflicting transactions that run at the same time.
Use the Contention Manager to decide which should continue and which should abort.
If B needs to abort, try to change its status to aborted (using CAS).
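The three cases for opening an object in WRITE mode can be sketched as a single retry loop. The code below is an illustration under stated assumptions, not the DSTM implementation: all names are hypothetical, a UnaryOperator stands in for TMCloneable.clone(), and the contention policy is hard-wired to "always abort B", which is only one of many possible policies.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

// Hypothetical sketch of open-for-WRITE with locators and CAS.
class OpenWriteSketch {
    enum Status { ACTIVE, COMMITTED, ABORTED }

    static final class Transaction {
        final AtomicReference<Status> status = new AtomicReference<Status>(Status.ACTIVE);
    }

    static final class Locator<T> {
        final Transaction transaction;
        final T oldObject;
        final T newObject;

        Locator(Transaction transaction, T oldObject, T newObject) {
            this.transaction = transaction;
            this.oldObject = oldObject;
            this.newObject = newObject;
        }
    }

    static final class TObject<T> {
        private final AtomicReference<Locator<T>> start;
        private final UnaryOperator<T> cloner; // stands in for TMCloneable.clone()

        TObject(T initial, UnaryOperator<T> cloner) {
            Transaction init = new Transaction();
            init.status.set(Status.COMMITTED); // the initial value counts as committed
            this.start = new AtomicReference<Locator<T>>(new Locator<T>(init, null, initial));
            this.cloner = cloner;
        }

        // Opens the object for WRITE on behalf of a; returns a's tentative copy.
        T openWrite(Transaction a) {
            while (true) {
                if (a.status.get() != Status.ACTIVE) {
                    throw new IllegalStateException("transaction is not active");
                }
                Locator<T> seen = start.get();
                Transaction b = seen.transaction;
                if (b == a) return seen.newObject; // a already opened this object
                Status s = b.status.get();
                if (s == Status.ACTIVE) {
                    // Conflict. Assumed policy: try to abort B, then re-check.
                    b.status.compareAndSet(Status.ACTIVE, Status.ABORTED);
                    continue;
                }
                // B committed: current is B's new object. B aborted: B's old object.
                T current = (s == Status.COMMITTED) ? seen.newObject : seen.oldObject;
                Locator<T> mine = new Locator<T>(a, current, cloner.apply(current));
                if (start.compareAndSet(seen, mine)) return mine.newObject;
                // CAS failed: another transaction replaced the locator; retry.
            }
        }

        // Current version, as determined by the last writer's status.
        T currentVersion() {
            Locator<T> loc = start.get();
            return loc.transaction.status.get() == Status.COMMITTED ? loc.newObject : loc.oldObject;
        }
    }

    public static void main(String[] args) {
        TObject<int[]> counter = new TObject<int[]>(new int[] {0}, v -> v.clone());
        Transaction a = new Transaction();
        int[] copy = counter.openWrite(a);
        copy[0]++;
        System.out.println(counter.currentVersion()[0]); // a is still active
        a.status.compareAndSet(Status.ACTIVE, Status.COMMITTED);
        System.out.println(counter.currentVersion()[0]); // a has committed
    }
}
```

Note that A never modifies the shared version in place: it works on a private clone, and the clone becomes current only when A's status changes to committed.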

Opening an object (5)
Let's assume transaction A opens object o in READ mode:
▫ Fetch the current version v just as before
▫ Add the pair (o, v) to the readers list (read-only table)

Committing a transaction
The commit needs to do the following:
1. Validate the transaction
2. Change the transaction’s status from active to committed (using CAS)

Validating transactions
What?
▫ Validate the objects (only) read by the transaction
Why?
▫ To make sure that the transaction observes a consistent state
How?
1. For each pair (o, v) in the read-only table, verify that v is still the most recently committed version of o
2. Check that (status == active)
If the validation fails, throw an exception so the user will restart the transaction from the beginning.
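Validation followed by commit can be sketched as follows. This is a simplified model, not the DSTM code: each shared object exposes only its most recently committed version through an AtomicReference (locators are left out), versions are compared by reference identity, and a failed validation returns false instead of throwing an exception. All names are hypothetical.

```java
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch of read-set validation and commit.
class ValidationSketch {
    enum Status { ACTIVE, COMMITTED, ABORTED }

    // Simplification (assumption): only the committed version is modeled.
    static final class Shared {
        final AtomicReference<Object> committed;

        Shared(Object initial) {
            committed = new AtomicReference<Object>(initial);
        }
    }

    static final class Transaction {
        final AtomicReference<Status> status = new AtomicReference<Status>(Status.ACTIVE);
        // Read-only table: object -> version seen when it was opened for READ.
        final Map<Shared, Object> readOnlyTable = new IdentityHashMap<Shared, Object>();

        Object openRead(Shared o) {
            Object v = o.committed.get();
            readOnlyTable.put(o, v);
            return v;
        }

        // Step 1: every recorded version is still current. Step 2: still active.
        boolean validate() {
            for (Map.Entry<Shared, Object> e : readOnlyTable.entrySet()) {
                if (e.getKey().committed.get() != e.getValue()) return false;
            }
            return status.get() == Status.ACTIVE;
        }

        // Validate, then move active -> committed with a CAS.
        boolean commit() {
            if (!validate()) {
                status.compareAndSet(Status.ACTIVE, Status.ABORTED);
                return false;
            }
            return status.compareAndSet(Status.ACTIVE, Status.COMMITTED);
        }
    }

    public static void main(String[] args) {
        Shared o1 = new Shared("v0");
        Transaction a = new Transaction();
        a.openRead(o1);
        o1.committed.set("v1");         // another transaction commits a new version
        System.out.println(a.commit()); // a read a stale version, so commit fails
    }
}
```

The final CAS matters: between validation and commit another transaction may have aborted this one, and the CAS from active to committed fails in that case.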

Validation inconsistency
Assume two threads A and B.
Initially: o1 = 0, o2 = 0
Thread A:
    1. x <- read(o1)
    2. w(o2, x + 1)
Thread B:
    1. y <- read(o2)
    2. w(o1, y + 1)
If B runs after A, then o1 = 2, o2 = 1.
If A runs after B, then o1 = 1, o2 = 2.
If they run concurrently without validation, we can end up with o1 = 1, o2 = 1, which is illegal because no sequential order produces it.

Conflicts
Conflicts are detected when:
▫ A transaction first opens an object and finds that it is open for modification by another transaction
▫ A transaction validates its read set (when opening an object or when committing)

Ordered Integer List – IntSet (1)
[Figure: an ordered linked list with nodes Min, 3, 4, 8, Max, and a node 6]

Ordered Integer List – IntSet (2)
    class List implements TMCloneable {
        int value;
        TMObject next;
        List(int v) { value = v; }
        // Copies the node; the clone shares the same next pointer
        public Object clone() {
            List newList = new List(value);
            newList.next = next;
            return newList;
        }
    }

Ordered Integer List – IntSet (3)
    class IntSet {
        TMObject first; // the list’s anchor
        // The empty set holds two sentinel nodes: MIN_VALUE and MAX_VALUE
        IntSet() {
            List firstList = new List(Integer.MIN_VALUE);
            first = new TMObject(firstList);
            firstList.next = new TMObject(new List(Integer.MAX_VALUE));
        }

Ordered Integer List – IntSet (4)
    class IntSet {
        boolean insert(int v) {
            List newList = new List(v);
            TMObject newNode = new TMObject(newList);
            TMThread thread = (TMThread) Thread.currentThread();
            // Retry until the transaction commits
            while (true) {
                thread.beginTransaction();
                boolean result = true;
                try {
                    …
                } catch (Denied d) {}
                if (thread.commitTransaction())
                    return result;
            }
        }
    }

Ordered Integer List – IntSet (5)
    try {
        List prevList = (List)this.first.open(WRITE);
        List currList = (List)prevList.next.open(WRITE);
        // Walk the list until the first node whose value is >= v
        while (currList.value < v) {
            prevList = currList;
            currList = (List)currList.next.open(WRITE);
        }
        if (currList.value == v) {
            result = false;            // v is already in the set
        } else {
            result = true;
            newList.next = prevList.next;
            prevList.next = newNode;   // link the new node after prevList
        }
    }

Single entrance
What is the problem with the previous example?
How can it be solved?
▫ Opening for READ on traversal
▫ Maybe something more sophisticated?

Releasing an object
An object that was opened for READ can be released.
What does it imply?
▫ Careful planning
▫ Can increase performance
▫ What happens if we open an object, release it and open it again in the same transaction?
▫ Can lead to validation problems

Non-Blocking Algorithms
A family of algorithms that operate on shared data.
Each sub-family satisfies a different progress guarantee.
Usually, the stronger the progress guarantee, the more complex the algorithm.

Wait-Free algorithms
An algorithm is wait-free if every operation has a bound on the number of steps it will take before completing.
No starvation.

Lock-Free algorithms
An algorithm is lock-free if, whenever threads keep taking steps, some operation completes within a finite number of steps (global progress, although an individual thread may starve).
Even if n-1 processes fail (while doing operations on the shared memory), the last process can still complete its operation.
Example: Shavit & Touitou’s STM implementation
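A classic small instance of a lock-free operation is a counter incremented with a CAS retry loop. The sketch below is an illustration, not part of the lecture; the names LockFreeCounter and runDemo are hypothetical. It uses java.util.concurrent.atomic.AtomicInteger and its compareAndSet method.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: a lock-free (but not wait-free) counter.
class LockFreeCounter {
    private final AtomicInteger value = new AtomicInteger(0);

    // A failed CAS means another thread's CAS succeeded in the meantime,
    // so some increment always completes. A single thread, however, may
    // retry an unbounded number of times, so this is not wait-free.
    int increment() {
        while (true) {
            int seen = value.get();
            if (value.compareAndSet(seen, seen + 1)) return seen + 1;
        }
    }

    int get() {
        return value.get();
    }

    // Runs the given number of threads, each incrementing perThread times.
    static int runDemo(int threads, int perThread) {
        LockFreeCounter counter = new LockFreeCounter();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int k = 0; k < perThread; k++) counter.increment();
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) w.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runDemo(4, 10000));
    }
}
```

No increment is lost, and a thread that stalls in the middle of increment() never prevents the others from finishing, which is exactly what a lock cannot promise.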

Obstruction-Free algorithms
An algorithm is obstruction-free if, at any point, a single thread executed in isolation for a bounded number of steps will complete its operation.
Doesn’t avoid livelocks.
Example: DSTM implementation
What is it good for?

Contention Manager (CM)
The contention manager arbitrates between two conflicting transactions: it defines the conflict policy.
Given two (conflicting) transactions T_A and T_B, CM(T_A, T_B):
1. Decides who wins
2. Decides what the loser should do (abort/wait/retry)
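One way to picture the contention manager is as a pluggable interface consulted whenever a transaction finds an object held by another active transaction. The sketch below is an assumption-laden illustration, not the DSTM interface: the names ContentionManager, Decision, Tx, Aggressive, and OldestFirst are hypothetical, and the logical start timestamp is an invented field used only to make the second policy concrete.

```java
// Hypothetical sketch of a pluggable contention manager and two policies.
class ContentionSketch {
    enum Decision { ABORT_OTHER, ABORT_SELF, WAIT }

    static final class Tx {
        final long startTime; // logical timestamp taken at begin (assumption)

        Tx(long startTime) {
            this.startTime = startTime;
        }
    }

    interface ContentionManager {
        // Called by 'me' when it finds 'other' holding the object it wants.
        Decision resolve(Tx me, Tx other);
    }

    // Policy 1: the requester always wins.
    static final class Aggressive implements ContentionManager {
        public Decision resolve(Tx me, Tx other) {
            return Decision.ABORT_OTHER;
        }
    }

    // Policy 2: the older transaction wins; a younger requester waits.
    static final class OldestFirst implements ContentionManager {
        public Decision resolve(Tx me, Tx other) {
            return me.startTime <= other.startTime ? Decision.ABORT_OTHER : Decision.WAIT;
        }
    }

    public static void main(String[] args) {
        Tx older = new Tx(1);
        Tx younger = new Tx(2);
        ContentionManager cm = new OldestFirst();
        System.out.println(cm.resolve(older, younger));
        System.out.println(cm.resolve(younger, older));
    }
}
```

Keeping the policy behind an interface separates correctness (handled by the CAS-based protocol) from progress in practice (handled by the policy), so policies can be swapped without touching the transactional machinery.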