CSE 230 Concurrency: STM Slides due to: Kathleen Fisher, Simon Peyton Jones, Satnam Singh, Don Stewart

The Grand Challenge How to properly use multi-cores? Need new programming models!

Parallelism vs Concurrency A parallel program exploits real parallel computing resources to run faster while computing the same answer. – Expectation of genuinely simultaneous execution – Deterministic A concurrent program models independent agents that can communicate and synchronize. – Meaningful on a machine with one processor – Non-deterministic

Concurrent Programming Essential For Multicore Performance

Concurrent Programming: the state of the art is 30 years old! Locks and condition variables (Java: synchronized, wait, notify). Locks etc. are fundamentally flawed: "building a skyscraper out of matchsticks".

What's Wrong With Locks?
– Races: forgotten locks lead to inconsistent views.
– Deadlock: locks acquired in the "wrong" order.
– Lost wakeups: forgotten notify on condition variables.
– Diabolical error recovery: must restore invariants and release locks in exception handlers.

Even Worse! Locks Don't Compose
A correct bank Account class; now write code to transfer funds between accounts.

    class Account {
      float balance;
      synchronized void deposit(float amt) {
        balance += amt;
      }
      synchronized void withdraw(float amt) {
        if (balance < amt) throw new OutOfMoneyError();
        balance -= amt;
      }
    }

Even Worse! Locks Don't Compose
1st attempt: transfer = withdraw then deposit

    class Account {
      float balance;
      synchronized void deposit(float amt) {
        balance += amt;
      }
      synchronized void withdraw(float amt) {
        if (balance < amt) throw new OutOfMoneyError();
        balance -= amt;
      }
      void transfer(Acct other, float amt) {
        other.withdraw(amt);
        this.deposit(amt);
      }
    }
Race condition: a concurrent observer can see the wrong sum of balances, because between the withdraw and the deposit the money is in neither account.

Even Worse! Locks Don't Compose
2nd attempt: synchronized transfer

    class Account {
      float balance;
      synchronized void deposit(float amt) {
        balance += amt;
      }
      synchronized void withdraw(float amt) {
        if (balance < amt) throw new OutOfMoneyError();
        balance -= amt;
      }
      synchronized void transfer(Acct other, float amt) {
        other.withdraw(amt);
        this.deposit(amt);
      }
    }
Deadlocks with a concurrent reverse transfer: each thread holds one account's lock and waits for the other's.

Locks are absurdly hard to get right
Scalable double-ended queue: one lock per cell.
– No interference if the two ends are "far" apart.
– But watch out if the queue is 0, 1, or 2 elements long!

Locks are absurdly hard to get right
Coding style vs. difficulty of the queue implementation:
– Sequential code: undergraduate
– Locks & condition variables: major publishable result*
*"Simple, fast, and practical non-blocking and blocking concurrent queue algorithms"

What we have: hardware at the bottom, locks and condition variables above it, and libraries built on top. Locks and conditions: hard to use & don't compose.

What we want: hardware at the bottom, better concurrency primitives above it, and libraries that build layered concurrency abstractions on top.

Idea: replace locks with atomic blocks (atomic, retry, orElse). Atomic blocks/STM: easy to use & do compose. Libraries are then built on top of the atomic blocks.

Locks are absurdly hard to get right
Coding style vs. difficulty of the queue implementation:
– Sequential code: undergraduate
– Locks & condition variables: major publishable result*
– Atomic blocks (STM): undergraduate
*"Simple, fast, and practical non-blocking and blocking concurrent queue algorithms"

Atomic Memory Transactions
    atomic {...sequential code...}
– Wrap atomic around sequential code; all-or-nothing semantics: atomic commit.
– The atomic block executes in isolation: no data races!
– There are no locks, hence no deadlocks!
cf. "ACID" database transactions

How it Works: Optimistic Concurrency
    atomic {...sequential code...}
– Execute the code without taking any locks.
– Record reads and writes in a thread-local transaction log (e.g. read y; read z; write x:=10; write z:=42; ...). Writes go to the log only, not to memory.
– At the end, the transaction validates the log.
– If valid, atomically commit the changes to memory.
– If invalid, re-run from the start, discarding the changes.
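To make validate-then-commit concrete, here is a minimal, hypothetical sketch of such a log over IORefs. The names LogEntry, validate, and commit are invented for illustration, all cells are assumed to hold the same type, and the real GHC runtime of course performs these steps atomically inside the RTS rather than in Haskell:

    import Data.IORef (IORef, readIORef, writeIORef)

    -- One entry per mutable cell the transaction touched.
    data LogEntry a = LogEntry
      { cell    :: IORef a    -- the cell itself
      , seenVal :: a          -- value observed when the transaction first read it
      , newVal  :: Maybe a    -- buffered write, if the transaction wrote to it
      }

    -- Valid iff every cell still holds the value the transaction read.
    validate :: Eq a => [LogEntry a] -> IO Bool
    validate entries = and <$> mapM stillCurrent entries
      where stillCurrent e = (== seenVal e) <$> readIORef (cell e)

    -- Commit pushes the buffered writes out to memory.
    commit :: [LogEntry a] -> IO ()
    commit = mapM_ push
      where push e = maybe (return ()) (writeIORef (cell e)) (newVal e)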

Why it Doesn't Work (in most languages)
    atomic {...sequential code...}
– Logging memory effects is expensive: a huge slowdown on every memory read and write.
– Cannot "re-run" arbitrary effects: how do you "retract" them? How do you "un-launch" the missile?

STM in Haskell

Haskell Fits the STM Shoe: Haskellers are brutally trained from birth to use memory/IO effects sparingly!

Issue: Logging Memory Is Expensive. Haskell already partitions the world into immutable values (zillions and zillions) and mutable locations (very few). Solution: only log the mutable locations!

Issue: Logging Memory Is Expensive. Haskell has already paid the bill: reading and writing mutable locations are already (expensive) function calls, so the relative logging overhead is lower than in imperative languages.

Issue: Undoing Arbitrary IO. Types control where IO effects can happen, so it is easy to keep them out of transactions. Monads are ideal for building transactions: they implicitly (invisibly) pass the logs around.

Tracking Effects with Types
    main = do { putStr (reverse "yes"); putStr "no" }

    (reverse "yes") :: String   -- No effects
    (putStr "no")   :: IO ()    -- Effects okay
    main            :: IO ()
The main program is a computation with effects.

Mutable State

Mutable State via the IO Monad
    newRef   :: a -> IO (IORef a)
    readRef  :: IORef a -> IO a
    writeRef :: IORef a -> a -> IO ()
Reads and writes are 100% explicit: (r+6) is rejected, because r :: IORef Int, not Int.
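In GHC these operations live in Data.IORef as newIORef, readIORef, and writeIORef; a minimal runnable example of the same point:

    import Data.IORef (newIORef, readIORef, writeIORef)

    main :: IO ()
    main = do
      r <- newIORef (0 :: Int)
      -- print (r + 6)         -- rejected: r :: IORef Int, not Int
      v <- readIORef r         -- reads must be explicit
      writeIORef r (v + 6)     -- writes must be explicit
      readIORef r >>= print    -- prints 6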

Mutable State via the IO Monad
    main = do r <- newIORef 0
              incR r
              ...

    incR :: IORef Int -> IO ()
    incR r = do v <- readIORef r
                writeIORef r (v+1)

Concurrency in Haskell
The forkIO function spawns a thread; it takes an IO action as its argument.
    forkIO :: IO a -> IO ThreadId
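A minimal usage sketch; threadDelay is there only so main stays alive long enough for the child thread to print:

    import Control.Concurrent (forkIO, threadDelay)

    main :: IO ()
    main = do
      _tid <- forkIO (putStrLn "hello from the child thread")
      putStrLn "hello from the main thread"
      threadDelay 100000    -- wait 0.1s so the child gets a chance to run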

Concurrency in Haskell
    forkIO :: IO a -> IO ThreadId

    main = do r <- newIORef 0
              forkIO (incR r)
              incR r
              ...

    incR :: IORef Int -> IO ()
    incR r = do v <- readIORef r
                writeIORef r (v+1)
Data race: both threads read and write r without synchronization, so an increment can be lost.
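A self-contained sketch of this race. Compiled with -threaded and run on multiple cores, the final count is usually well below 200000 because increments overlap and overwrite each other; the MVar is only used to wait for the child thread:

    import Control.Concurrent (forkIO)
    import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
    import Control.Monad (replicateM_)
    import Data.IORef (IORef, newIORef, readIORef, writeIORef)

    incR :: IORef Int -> IO ()
    incR r = do v <- readIORef r
                writeIORef r (v + 1)    -- read-modify-write is NOT atomic

    main :: IO ()
    main = do
      r    <- newIORef 0
      done <- newEmptyMVar
      _ <- forkIO (replicateM_ 100000 (incR r) >> putMVar done ())
      replicateM_ 100000 (incR r)
      takeMVar done                     -- wait for the child thread
      readIORef r >>= print             -- often < 200000: updates were lost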

Atomic Blocks in Haskell
    atomically :: IO a -> IO a
`atomically act` executes `act` atomically.

Atomic Blocks in Haskell
    atomically :: IO a -> IO a

    main = do r <- newRef 0
              forkIO $ atomically $ incR r
              atomically $ incR r
atomically ensures there are no data races!

Atomic Blocks in Haskell
What if we use incR outside the block?
    main = do r <- newRef 0
              forkIO $ incR r
              atomically $ incR r
Yikes, a data race: the code inside and outside the atomic block race with each other!

A Better Type for Atomic
STM = Trans-actions; TVar = imperative transaction variables.
    atomically :: STM a -> IO a
    newTVar    :: a -> STM (TVar a)
    readTVar   :: TVar a -> STM a
    writeTVar  :: TVar a -> a -> STM ()
The types ensure a TVar can only be touched inside an STM action.

Type System Guarantees
    incT :: TVar Int -> STM ()
    incT r = do v <- readTVar r
                writeTVar r (v+1)

    main = do r <- atomically $ newTVar 0
              forkIO $ atomically $ incT r
              atomically $ incT r
              ...
You cannot forget atomically: it is the only way to execute an STM action.
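The same example as a complete program, using the stm package (Control.Concurrent.STM) that ships with GHC; the MVar is only there so main waits for the forked thread before printing:

    import Control.Concurrent (forkIO)
    import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
    import Control.Concurrent.STM
      (STM, TVar, atomically, newTVarIO, readTVar, readTVarIO, writeTVar)
    import Control.Monad (replicateM_)

    incT :: TVar Int -> STM ()
    incT r = do v <- readTVar r
                writeTVar r (v + 1)

    main :: IO ()
    main = do
      r    <- newTVarIO 0               -- like atomically (newTVar 0)
      done <- newEmptyMVar
      _ <- forkIO (replicateM_ 100000 (atomically (incT r)) >> putMVar done ())
      replicateM_ 100000 (atomically (incT r))
      takeMVar done
      readTVarIO r >>= print            -- always 200000: no lost updates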

Outside an atomic block you can't fiddle with TVars; inside an atomic block you can't do IO and can't manipulate ordinary imperative variables. So the type checker rejects things like:
    atomic $ if x<y then launchMissiles

Type System Guarantees
Note: atomically is a function, not a special syntactic construct... and so, best of all...

(Unlike Locks) STM Actions Compose!
    incT :: TVar Int -> STM ()
    incT r = do {v <- readTVar r; writeTVar r (v+1)}

    incT2 :: TVar Int -> STM ()
    incT2 r = do {incT r; incT r}

    foo :: IO ()
    foo = ...atomically $ incT2 r...
Glue STM actions together arbitrarily, then wrap the result with atomically to get an IO action. The types ensure the whole composed STM action runs atomically.

STM Type Supports Exceptions
    throw :: Exception -> STM a
    catch :: STM a -> (Exception -> STM a) -> STM a
In `atomically act`, if `act` throws an exception: 1. the transaction is aborted with no effect, and 2. the exception is propagated to the enclosing IO code.* No need to restore invariants or release locks!
*Composable Memory Transactions
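In today's stm package these operations are called throwSTM and catchSTM (in Control.Monad.STM). A small sketch of the abort-with-no-effect behaviour, catching the exception in IO with try; the OutOfMoney exception type is made up for the example:

    import Control.Concurrent.STM
      (STM, TVar, atomically, newTVarIO, readTVar, readTVarIO, throwSTM, writeTVar)
    import Control.Exception (Exception, SomeException, try)

    data OutOfMoney = OutOfMoney deriving Show
    instance Exception OutOfMoney

    withdraw :: TVar Int -> Int -> STM ()
    withdraw acc n = do
      bal <- readTVar acc
      if bal < n then throwSTM OutOfMoney
                 else writeTVar acc (bal - n)

    main :: IO ()
    main = do
      acc <- newTVarIO 10
      -- The writeTVar below is rolled back: the exception aborts the whole
      -- transaction, so the balance stays 10 (not 20).
      r <- try (atomically (writeTVar acc 20 >> withdraw acc 50))
      print (r :: Either SomeException ())
      readTVarIO acc >>= print    -- prints 10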

Transaction Combinators

#1 retry: Compositional Blocking
    retry :: STM ()
"Abort the current transaction & re-execute it from the start."
    withdraw :: TVar Int -> Int -> STM ()
    withdraw acc n = do bal <- readTVar acc
                        if bal < n then retry
                                   else writeTVar acc (bal-n)
– The implementation avoids busy waiting: it uses the logged reads to block until a variable that was read (e.g. acc) changes.
– No condition variables! The retrying thread is woken by a write to such a variable, so there are no forgotten notifies.
– No danger of forgetting to re-test conditions on waking: the transaction simply runs again from the start.
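A runnable sketch of this blocking behaviour: the main thread's withdraw keeps retrying until the forked thread deposits enough funds (threadDelay just makes the ordering visible):

    import Control.Concurrent (forkIO, threadDelay)
    import Control.Concurrent.STM
      (STM, TVar, atomically, newTVarIO, readTVar, readTVarIO, retry, writeTVar)

    withdraw :: TVar Int -> Int -> STM ()
    withdraw acc n = do
      bal <- readTVar acc
      if bal < n then retry
                 else writeTVar acc (bal - n)

    deposit :: TVar Int -> Int -> STM ()
    deposit acc n = do
      bal <- readTVar acc
      writeTVar acc (bal + n)

    main :: IO ()
    main = do
      acc <- newTVarIO 0
      _ <- forkIO $ do
             threadDelay 500000              -- half a second later...
             atomically (deposit acc 100)
      atomically (withdraw acc 30)           -- blocks (retries) until funds arrive
      readTVarIO acc >>= print               -- prints 70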

Why is retry Compositional?
    atomic $ do withdraw a1 3
                withdraw a2 7
This waits until `a1>3` AND `a2>7`, without changing (or even knowing) the `withdraw` code. retry can appear anywhere in an STM transaction, nested arbitrarily deeply inside a call.

Hoisting Guards Is Not Compositional
    atomic (a1>3 && a2>7) {...stuff...}
This breaks the abstraction of "...stuff...": you need to know its code in order to expose the guards.

#2 orElse: Choice
    orElse :: STM a -> STM a -> STM a
How to transfer $3 from a1 or a2 to b:
    atomically $ do withdraw a1 3 `orElse` withdraw a2 3
                    deposit b 3
Try the first withdrawal; if it retries, try the second; and then do the deposit.

Choice Is Composable Too!
    transfer a1 a2 b = do withdraw a1 3 `orElse` withdraw a2 3
                          deposit b 3

    atomically $ transfer a1 a2 b `orElse` transfer a3 a4 b
transfer calls orElse, but calls to transfer can themselves be composed with orElse.
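A small runnable sketch of the fallback behaviour; the account set-up is invented so that the first withdraw retries and orElse falls through to the second:

    import Control.Concurrent.STM
      (STM, TVar, atomically, newTVarIO, orElse, readTVar, readTVarIO, retry, writeTVar)

    withdraw, deposit :: TVar Int -> Int -> STM ()
    withdraw acc n = do
      bal <- readTVar acc
      if bal < n then retry else writeTVar acc (bal - n)
    deposit acc n = readTVar acc >>= \bal -> writeTVar acc (bal + n)

    transfer :: TVar Int -> TVar Int -> TVar Int -> STM ()
    transfer a1 a2 b = do
      withdraw a1 3 `orElse` withdraw a2 3    -- take from a1, falling back to a2
      deposit b 3

    main :: IO ()
    main = do
      a1 <- newTVarIO 0     -- empty: the first withdraw retries
      a2 <- newTVarIO 10    -- ...so orElse falls through to a2
      b  <- newTVarIO 0
      atomically (transfer a1 a2 b)
      mapM_ (\v -> readTVarIO v >>= print) [a1, a2, b]    -- prints 0, 7, 3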

Ensuring Correctness of Concurrent Accesses? e.g. account should never go below 0

Transaction Invariants
Assumed on entry, verified on exit; an invariant is only re-tested if one of the TVars it reads has changed.

#3 always: Enforce Invariants
    always :: STM Bool -> STM ()

    checkBal :: TVar Int -> STM Bool
    checkBal v = do cts <- readTVar v
                    return (cts >= 0)

    newAccount :: STM (TVar Int)
    newAccount = do v <- newTVar 0
                    always (checkBal v)
                    return v
always takes an arbitrary boolean-valued STM action. Every transaction that touches the account will check the invariant; if the check fails, the transaction restarts.

#3 always: Enforce Invariants
    always :: STM Bool -> STM ()
always adds a new invariant to a global pool. Conceptually, all invariants are checked on every commit; the implementation only checks the relevant invariants, i.e. those that read TVars written by the transaction.
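The always combinator comes from the research design and older GHC releases; recent GHCs have removed the built-in invariant machinery. As a hedged alternative sketch (Account, guardedUpdate, and InvariantViolated are invented names), the same per-account invariant can be re-checked by hand on every update, aborting the transaction when it breaks:

    import Control.Concurrent.STM
      (STM, TVar, atomically, newTVarIO, readTVar, readTVarIO, throwSTM, writeTVar)
    import Control.Exception (Exception)

    data InvariantViolated = InvariantViolated deriving Show
    instance Exception InvariantViolated

    -- An account paired with the invariant its balance must always satisfy.
    data Account = Account { balanceVar :: TVar Int, invariant :: Int -> Bool }

    -- Apply an update, then abort the whole transaction if the invariant breaks.
    guardedUpdate :: Account -> (Int -> Int) -> STM ()
    guardedUpdate (Account var ok) f = do
      bal <- readTVar var
      let bal' = f bal
      if ok bal' then writeTVar var bal'
                 else throwSTM InvariantViolated

    main :: IO ()
    main = do
      v <- newTVarIO 0
      let acc = Account v (>= 0)        -- "the balance never goes below 0"
      atomically (guardedUpdate acc (+ 50))
      readTVarIO v >>= print            -- prints 50
      -- atomically (guardedUpdate acc (subtract 100))  -- would throw InvariantViolated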

Recap: Composing Transactions
– A transaction is a value of type STM a.
– Transactions are first-class values.
– Build big transactions by composing little transactions: sequencing, choice, blocking, ...
– To execute, seal the transaction with
    atomically :: STM a -> IO a

There is a complete implementation in GHC (since GHC 6). Performance is similar to the shared-variable (lock-based) version, and we need more experience using STM in practice, but you can play with it now.* The final will have some STM material.
*Beautiful Concurrency

STM in Mainstream Languages
There are proposals for adding STM to Java etc.

    class Account {
      float balance;
      void deposit(float amt) {
        atomic { balance += amt; }
      }
      void withdraw(float amt) {
        atomic {
          if (balance < amt) throw new OutOfMoneyError();
          balance -= amt;
        }
      }
      void transfer(Acct other, float amt) {
        atomic {  // Can compose withdraw and deposit.
          other.withdraw(amt);
          this.deposit(amt);
        }
      }
    }

Mainstream types don't control effects, so code inside a transaction can conflict with code outside it!
– Weak atomicity: outside code may see inconsistent memory; avoid this by placing all shared-memory accesses inside transactions.
– Strong atomicity: outside code is guaranteed a consistent view of memory, but this causes a big performance hit.

A Monadic Skin
– In C/Java, IO is everywhere: no special type is needed, because all code effectively lives in the "IO monad".
– Haskell gives you a choice: when to be in the IO monad vs. when to be purely functional.
– Haskell can be imperative, BUT C/Java cannot be pure! Mainstream languages lack a statically visible pure subset.
– This separation facilitates concurrent programming...

Conclusions
STM raises the level of abstraction for concurrent programming (think high-level language vs. assembly code): whole classes of low-level errors are eliminated. But it is not a silver bullet:
– You can still write buggy programs.
– Concurrent code is still harder than sequential code.
– It only helps with shared memory, not message passing.
– There is a performance hit, but it seems acceptable, and things can only get better...