Concurrent Objects Companion slides for

Slides:



Advertisements
Similar presentations
1 Chapter 5 Concurrency: Mutual Exclusion and Synchronization Principals of Concurrency Mutual Exclusion: Hardware Support Semaphores Readers/Writers Problem.
Advertisements

D u k e S y s t e m s Time, clocks, and consistency and the JMM Jeff Chase Duke University.
Ch. 7 Process Synchronization (1/2) I Background F Producer - Consumer process :  Compiler, Assembler, Loader, · · · · · · F Bounded buffer.
Chapter 6: Process Synchronization
Silberschatz, Galvin and Gagne ©2013 Operating System Concepts – 9 th Edition Chapter 5: Process Synchronization.
Silberschatz, Galvin and Gagne ©2009 Operating System Concepts – 8 th Edition, Chapter 6: Process Synchronization.
Process Synchronization. Module 6: Process Synchronization Background The Critical-Section Problem Peterson’s Solution Synchronization Hardware Semaphores.
CH7 discussion-review Mahmoud Alhabbash. Q1 What is a Race Condition? How could we prevent that? – Race condition is the situation where several processes.
Concurrent Objects and Consistency The Art of Multiprocessor Programming Spring 2007.
Chair of Software Engineering Concurrent Object-Oriented Programming Prof. Dr. Bertrand Meyer Lecture 5: Concurrent Objects.
1 Principles of Reliable Distributed Systems Lecture 10: Atomic Shared Memory Objects and Shared Memory Emulations Spring 2007 Prof. Idit Keidar.
Shared Counters and Parallelism Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.
Concurrent Objects Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit Modified by Rajeev Alur for CIS 640, University.
Chapter 6: Process Synchronization. Outline Background Critical-Section Problem Peterson’s Solution Synchronization Hardware Semaphores Classic Problems.
Computer Architecture II 1 Computer architecture II Lecture 9.
1 Sharing Objects – Ch. 3 Visibility What is the source of the issue? Volatile Dekker’s algorithm Publication and Escape Thread Confinement Immutability.
Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit Concurrent Skip Lists.
Instructor: Umar KalimNUST Institute of Information Technology Operating Systems Process Synchronization.
Operating Systems CSE 411 CPU Management Oct Lecture 13 Instructor: Bhuvan Urgaonkar.
Memory Consistency Models Some material borrowed from Sarita Adve’s (UIUC) tutorial on memory consistency models.
Concurrent Objects Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.
Concurrent Objects Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit Please read sections 3.7 and 3.8.
Linearizability By Mila Oren 1. Outline  Sequential and concurrent specifications.  Define linearizability (intuition and formal model).  Composability.
CS4231 Parallel and Distributed Algorithms AY 2006/2007 Semester 2 Lecture 3 (26/01/2006) Instructor: Haifeng YU.
11/18/20151 Operating Systems Design (CS 423) Elsa L Gunter 2112 SC, UIUC Based on slides by Roy Campbell, Sam.
Memory Consistency Models. Outline Review of multi-threaded program execution on uniprocessor Need for memory consistency models Sequential consistency.
Multiprocessor Cache Consistency (or, what does volatile mean?) Andrew Whitaker CSE451.
Fundamentals of Parallel Computer Architecture - Chapter 71 Chapter 7 Introduction to Shared Memory Multiprocessors Yan Solihin Copyright.
Concurrent Objects MIT 6.05s by Maurice Herlihy & Nir Shavit.
Monitors and Blocking Synchronization Dalia Cohn Alperovich Based on “The Art of Multiprocessor Programming” by Herlihy & Shavit, chapter 8.
Concurrent Queues Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.
Programming Language Basics Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.
Multiprocessor Architecture Basics Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.
Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit Concurrent Skip Lists.
The Relative Power of Synchronization Operations Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.
Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit Concurrent Skip Lists.
Symmetric Multiprocessors: Synchronization and Sequential Consistency
Lecture 20: Consistency Models, TM
CSE 486/586 Distributed Systems Consistency --- 2
CS703 – Advanced Operating Systems
Background on the need for Synchronization
Concurrent Objects Companion slides for
Memory Consistency Models
Atomic Operations in Hardware
Atomic Operations in Hardware
Memory Consistency Models
Lecture 25 More Synchronized Data and Producer/Consumer Relationship
Concurrent Objects Companion slides for
Chapter 5: Process Synchronization
Concurrent Objects.
Threads and Memory Models Hal Perkins Autumn 2011
Symmetric Multiprocessors: Synchronization and Sequential Consistency
Consistency Models.
Designing Parallel Algorithms (Synchronization)
Symmetric Multiprocessors: Synchronization and Sequential Consistency
Module 7a: Classic Synchronization
Grades.
Shared Memory Consistency Models: A Tutorial
Chapter 6: Process Synchronization
CSCI1600: Embedded and Real Time Software
Memory Consistency Models
CSE 153 Design of Operating Systems Winter 19
Chapter 6: Synchronization Tools
CSCI1600: Embedded and Real Time Software
Programming with Shared Memory Specifying parallelism
Problems with Locks Andrew Whitaker CSE451.
CSE 486/586 Distributed Systems Consistency --- 2
Parallel Data Structures
Presentation transcript:

Concurrent Objects Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit

Concurrent Computation memory object object Art of Multiprocessor Programming

Art of Multiprocessor Programming Objectivism What is a concurrent object? How do we describe one? How do we implement one? How do we tell if we’re right? In this lecture we will focus on the first and last items. Art of Multiprocessor Programming

Art of Multiprocessor Programming Objectivism What is a concurrent object? How do we describe one? How do we tell if we’re right? Art of Multiprocessor Programming

FIFO Queue: Enqueue Method q.enq( ) Consider a FIFO queue. Art of Multiprocessor Programming

FIFO Queue: Dequeue Method q.deq()/ Art of Multiprocessor Programming

Art of Multiprocessor Programming Lock-Based Queue 1 tail head x y 2 7 3 6 5 4 capacity = 8 Art of Multiprocessor Programming

Art of Multiprocessor Programming Lock-Based Queue 1 tail head x y 2 7 3 6 Fields protected by single shared lock 5 4 capacity = 8 Art of Multiprocessor Programming

Art of Multiprocessor Programming A Lock-Based Queue 1 capacity-1 2 head tail y z class LockBasedQueue<T> { int head, tail; T[] items; Lock lock; public LockBasedQueue(int capacity) { head = 0; tail = 0; lock = new ReentrantLock(); items = (T[]) new Object[capacity]; } Here is code for implementing a concurrent FIFO queue. Explain it IN DETAIL to the students. It’s the first algorithm they see which is not a mutual exclusion algorithm. The queue is implemented in an array. Initially the Head and Tail fields are equal and the queue is empty. All operations are synchronized Via the objects lock. If the Head and Tail differ by the queue size, then the queue is full. The Enq() method reads the Head() field, and if the queue is full, it waits until the queue is no longer full. It then stores the object in the array, and increments the Tail() field. Fields protected by single shared lock Art of Multiprocessor Programming

Art of Multiprocessor Programming Lock-Based Queue Initially: head = tail tail 1 head 2 7 3 6 5 4 Art of Multiprocessor Programming

Art of Multiprocessor Programming A Lock-Based Queue 1 capacity-1 2 head tail y z class LockBasedQueue<T> { int head, tail; T[] items; Lock lock; public LockBasedQueue(int capacity) { head = 0; tail = 0; lock = new ReentrantLock(); items = (T[]) new Object[capacity]; } Here is code for implementing a concurrent FIFO queue. Explain it IN DETAIL to the students. It’s the first algorithm they see which is not a mutual exclusion algorithm. The queue is implemented in an array. Initially the Head and Tail fields are equal and the queue is empty. All operations are synchronized Via the objects lock. If the Head and Tail differ by the queue size, then the queue is full. The Enq() method reads the Head() field, and if the queue is full, it waits until the queue is no longer full. It then stores the object in the array, and increments the Tail() field. Initially head = tail Art of Multiprocessor Programming

Art of Multiprocessor Programming Lock-Based deq() 1 tail head x y 7 2 6 3 5 4 Art of Multiprocessor Programming

Art of Multiprocessor Programming Acquire Lock 1 tail head x y My turn … x y Waiting to enqueue… 7 2 6 3 5 4 Art of Multiprocessor Programming

Implementation: deq() public T deq() throws EmptyException { lock.lock(); try { if (tail == head) throw new EmptyException(); T x = items[head % items.length]; head++; return x; } finally { lock.unlock(); } Acquire lock at method start 1 capacity-1 2 head tail y z Art of Multiprocessor Programming

Art of Multiprocessor Programming Check if Non-Empty tail 1 head Not equal? x y Waiting to enqueue… 7 2 6 3 5 4 Art of Multiprocessor Programming

Implementation: deq() public T deq() throws EmptyException { lock.lock(); try { if (tail == head) throw new EmptyException(); T x = items[head % items.length]; head++; return x; } finally { lock.unlock(); } 1 capacity-1 2 head tail y z If queue empty throw exception Art of Multiprocessor Programming

Art of Multiprocessor Programming Modify the Queue head 1 tail head x y Waiting to enqueue… 7 2 6 3 5 4 Art of Multiprocessor Programming

Implementation: deq() public T deq() throws EmptyException { lock.lock(); try { if (tail == head) throw new EmptyException(); T x = items[head % items.length]; head++; return x; } finally { lock.unlock(); } 1 capacity-1 2 head tail y z Queue not empty? Remove item and update head Art of Multiprocessor Programming

Implementation: deq() public T deq() throws EmptyException { lock.lock(); try { if (tail == head) throw new EmptyException(); T x = items[head % items.length]; head++; return x; } finally { lock.unlock(); } 1 capacity-1 2 head tail y z Return result Art of Multiprocessor Programming

Art of Multiprocessor Programming Release the Lock head 1 tail y Waiting… 7 2 6 3 x 5 4 Art of Multiprocessor Programming

Art of Multiprocessor Programming Release the Lock head 1 tail y My turn! 7 2 6 3 x 5 4 Art of Multiprocessor Programming

Implementation: deq() public T deq() throws EmptyException { lock.lock(); try { if (tail == head) throw new EmptyException(); T x = items[head % items.length]; head++; return x; } finally { lock.unlock(); } 1 capacity-1 2 head tail y z Release lock no matter what! Art of Multiprocessor Programming

Implementation: deq() public T deq() throws EmptyException { lock.lock(); try { if (tail == head) throw new EmptyException(); T x = items[head % items.length]; head++; return x; } finally { lock.unlock(); } modifications are mutually exclusive… Should be correct because Art of Multiprocessor Programming

Now consider the following implementation The same thing without mutual exclusion For simplicity, only two threads One thread enq only The other deq only This is sometimes called a producer-consumer channel or buffer. Art of Multiprocessor Programming

Wait-free 2-Thread Queue 1 tail head x y 7 2 6 3 5 4 capacity = 8 Art of Multiprocessor Programming

Wait-free 2-Thread Queue head 1 tail x y deq() enq(z) 7 2 6 3 5 4 z Art of Multiprocessor Programming

Wait-free 2-Thread Queue head 1 tail x y result = x queue[tail] = z 7 2 6 3 5 4 z Art of Multiprocessor Programming

Wait-free 2-Thread Queue head 1 tail y head++ tail-- 7 2 z 6 3 x 5 4 Art of Multiprocessor Programming

Wait-free 2-Thread Queue 1 capacity-1 2 head tail y z public class WaitFreeQueue { int head = 0, tail = 0; items = (T[]) new Object[capacity]; public void enq(Item x) { if (tail-head == capacity) throw new FullException(); items[tail % capacity] = x; tail++; } public Item deq() { if (tail == head) throw new EmptyException(); Item item = items[head % capacity]; head++; return item; }} No lock needed ! Art of Multiprocessor Programming

Wait-free 2-Thread Queue public T deq() throws EmptyException { lock.lock(); try { if (tail == head) throw new EmptyException(); T x = items[head % items.length]; head++; return x; } finally { lock.unlock(); } How do we define “correct” when modifications are not mutually exclusive? Art of Multiprocessor Programming

What is a Concurrent Queue? Need a way to specify a concurrent queue object Need a way to prove that an algorithm implements the object’s specification Lets talk about object specifications … In general, to make our intuitive understanding that the algorithm is correct we need a way to specify a concurrent queue object. Need a way to prove that the algorithm implements the object’s specification. We saw two types of object implementations, one that used locks and intuitively felt like a queue, and one that did not use locks, it was infact, wait-free. What properties are they actually providing, are they both implementations of a concurrent queue? In what ways are they similar and in what ways do they differ? Lets try and answer these questions. Art of Multiprocessor Programming

Correctness and Progress In a concurrent setting, we need to specify both the safety and the liveness properties of an object Need a way to define when an implementation is correct the conditions under which it guarantees progress There are two elements in which a concurrent specification imposes terms on the implementation: safety and liveness, or, in our case, correctness and progress. Lets begin with correctness Art of Multiprocessor Programming

Art of Multiprocessor Programming Sequential Objects Each object has a state Usually given by a set of fields Queue example: sequence of items Each object has a set of methods Only way to manipulate state Queue example: enq and deq methods Keep going back to the queue example throughout this lecture. Later, if you need to talk about multiple objects, give as an example a program that uses a stack and a queue. Art of Multiprocessor Programming

Sequential Specifications If (precondition) the object is in such-and-such a state before you call the method, Then (postcondition) the method will return a particular value or throw a particular exception. and (postcondition, con’t) the object will be in some other state when the method returns, Art of Multiprocessor Programming

Pre and PostConditions for Dequeue Precondition: Queue is non-empty Postcondition: Returns first item in queue Removes first item in queue Art of Multiprocessor Programming

Pre and PostConditions for Dequeue Precondition: Queue is empty Postcondition: Throws Empty exception Queue state unchanged Art of Multiprocessor Programming

Why Sequential Specifications Totally Rock Interactions among methods captured by side-effects on object state State meaningful between method calls Documentation size linear in number of methods Each method described in isolation Can add new methods Without changing descriptions of old methods Art of Multiprocessor Programming

What About Concurrent Specifications ? Methods? Documentation? Adding new methods? Art of Multiprocessor Programming

Art of Multiprocessor Programming Methods Take Time In the sequential world, we treat method calls as if they were instantaneous, but in fact, they take time. time time Art of Multiprocessor Programming

Art of Multiprocessor Programming Methods Take Time invocation 12:00 q.enq(...) In the sequential world, we treat method calls as if they were instantaneous, but in fact, they take time. time time Art of Multiprocessor Programming

Art of Multiprocessor Programming Methods Take Time invocation 12:00 q.enq(...) Method call In the sequential world, we treat method calls as if they were instantaneous, but in fact, they take time. time time Art of Multiprocessor Programming

Art of Multiprocessor Programming Methods Take Time invocation 12:00 q.enq(...) Method call In the sequential world, we treat method calls as if they were instantaneous, but in fact, they take time. time time Art of Multiprocessor Programming

Art of Multiprocessor Programming Methods Take Time invocation 12:00 void response 12:01 q.enq(...) Method call In the sequential world, we treat method calls as if they were instantaneous, but in fact, they take time. time time Art of Multiprocessor Programming

Sequential vs Concurrent Methods take time? Who knew? Concurrent Method call is not an event Method call is an interval. Art of Multiprocessor Programming

Concurrent Methods Take Overlapping Time Art of Multiprocessor Programming

Concurrent Methods Take Overlapping Time Method call time time Art of Multiprocessor Programming

Concurrent Methods Take Overlapping Time Method call Method call time time Art of Multiprocessor Programming

Concurrent Methods Take Overlapping Time Method call Method call Method call time time Art of Multiprocessor Programming

Sequential vs Concurrent Object needs meaningful state only between method calls Concurrent Because method calls overlap, object might never be between method calls Art of Multiprocessor Programming

Sequential vs Concurrent Each method described in isolation Concurrent Must characterize all possible interactions with concurrent calls What if two enq() calls overlap? Two deq() calls? enq() and deq()? … Art of Multiprocessor Programming

Sequential vs Concurrent Can add new methods without affecting older methods Concurrent: Everything can potentially interact with everything else Art of Multiprocessor Programming

Sequential vs Concurrent Can add new methods without affecting older methods Concurrent: Everything can potentially interact with everything else Panic! Art of Multiprocessor Programming

Art of Multiprocessor Programming The Big Question What does it mean for a concurrent object to be correct? What is a concurrent FIFO queue? FIFO means strict temporal order Concurrent means ambiguous temporal order Art of Multiprocessor Programming

Art of Multiprocessor Programming Intuitively… public T deq() throws EmptyException { lock.lock(); try { if (tail == head) throw new EmptyException(); T x = items[head % items.length]; head++; return x; } finally { lock.unlock(); } Art of Multiprocessor Programming

Intuitively… All queue modifications are mutually exclusive public T deq() throws EmptyException { lock.lock(); try { if (tail == head) throw new EmptyException(); T x = items[head % items.length]; head++; return x; } finally { lock.unlock(); } All queue modifications are mutually exclusive Art of Multiprocessor Programming

Art of Multiprocessor Programming Intuitively Lets capture the idea of describing the concurrent via the sequential q.deq lock() unlock() enq deq deq q.enq enq Behavior is “Sequential” On an intuitive level, since we use mutual exclusion locks, then each of the actual updates happens in a non-overlapping interval so the behavior as a whole looks sequential. This fits our intuition, and the question is if we can generalize this intuition. Stress the fact that we understand the concurrent execution because we have a way of mapping it to the sequential one, in which we know how a FIFO queue behaves. time Art of Multiprocessor Programming

Art of Multiprocessor Programming Linearizability Each method should “take effect” Instantaneously Between invocation and response events Object is correct if this “sequential” behavior is correct Any such concurrent object is Linearizable™ Linearizability is the generalization of this intuition to general objects, with our without mutual exclusion. Art of Multiprocessor Programming

Is it really about the object? Each method should “take effect” Instantaneously Between invocation and response events Sounds like a property of an execution… A linearizable object: one all of whose possible executions are linearizable Art of Multiprocessor Programming

Art of Multiprocessor Programming Example Take you time with this example…don’t let them guess this one, show them, and let them guess the linearizability of the subsequent ones. time time Art of Multiprocessor Programming

Art of Multiprocessor Programming Example q.enq(x) time time Art of Multiprocessor Programming

Art of Multiprocessor Programming Example q.enq(x) q.enq(y) time time Art of Multiprocessor Programming

Art of Multiprocessor Programming Example q.enq(x) q.enq(y) q.deq(x) time time Art of Multiprocessor Programming

Art of Multiprocessor Programming Example q.enq(x) q.deq(y) q.enq(y) q.deq(x) time time Art of Multiprocessor Programming

Art of Multiprocessor Programming Example linearizable q.enq(x) q.enq(x) q.enq(y) q.deq(x) q.deq(y) q.deq(y) q.enq(y) q.deq(x) time time Art of Multiprocessor Programming

Art of Multiprocessor Programming Example q.enq(x) q.enq(x) Valid? q.deq(y) q.deq(y) q.enq(y) q.enq(y) q.deq(x) q.deq(x) time time Art of Multiprocessor Programming

Art of Multiprocessor Programming Example time Art of Multiprocessor Programming

Art of Multiprocessor Programming Example q.enq(x) time Art of Multiprocessor Programming

Art of Multiprocessor Programming Example q.enq(x) q.deq(y) time Art of Multiprocessor Programming

Art of Multiprocessor Programming Example q.enq(x) q.deq(y) q.enq(y) time Art of Multiprocessor Programming

Art of Multiprocessor Programming Example q.enq(x) q.enq(y) q.enq(x) q.deq(y) q.enq(y) time Art of Multiprocessor Programming

Art of Multiprocessor Programming Example not linearizable q.enq(x) q.enq(x) q.deq(y) q.enq(y) q.enq(y) time Art of Multiprocessor Programming

Art of Multiprocessor Programming Example time time Art of Multiprocessor Programming

Art of Multiprocessor Programming Example q.enq(x) time time Art of Multiprocessor Programming

Art of Multiprocessor Programming Example q.enq(x) q.deq(x) time time Art of Multiprocessor Programming

Art of Multiprocessor Programming Example q.enq(x) q.deq(x) q.enq(x) q.deq(x) time time Art of Multiprocessor Programming

Art of Multiprocessor Programming Example linearizable q.enq(x) q.deq(x) q.enq(x) q.deq(x) time time Art of Multiprocessor Programming

Art of Multiprocessor Programming Example q.enq(x) time time Art of Multiprocessor Programming

Art of Multiprocessor Programming Example q.enq(x) q.enq(y) time time Art of Multiprocessor Programming

Art of Multiprocessor Programming Example q.enq(x) q.deq(y) q.enq(y) time time Art of Multiprocessor Programming

Art of Multiprocessor Programming Example q.enq(x) q.deq(y) q.enq(y) q.deq(x) time time Art of Multiprocessor Programming

Art of Multiprocessor Programming Example Comme ci Comme ça multiple orders OK q.enq(x) q.deq(y) linearizable q.enq(y) q.deq(x) time Art of Multiprocessor Programming

Read/Write Register Example time time Art of Multiprocessor Programming

Read/Write Register Example write(1) already happened time time Art of Multiprocessor Programming

Read/Write Register Example write(1) already happened time time Art of Multiprocessor Programming

Read/Write Register Example not linearizable write(0) read(1) write(2) write(1) write(1) read(0) write(1) already happened time time Art of Multiprocessor Programming

Read/Write Register Example write(1) already happened time time Art of Multiprocessor Programming

Read/Write Register Example write(1) already happened time time Art of Multiprocessor Programming

Read/Write Register Example not linearizable write(0) read(1) write(2) write(2) write(1) write(1) read(1) write(1) already happened time time Art of Multiprocessor Programming

Read/Write Register Example time time Art of Multiprocessor Programming

Read/Write Register Example time time Art of Multiprocessor Programming

Read/Write Register Example linearizable write(0) write(1) write(2) write(2) write(1) read(1) time time Art of Multiprocessor Programming

Talking About Executions Why? Can’t we specify the linearization point of each operation without describing an execution? Not Always In some cases, linearization point depends on the execution Art of Multiprocessor Programming

Formal Model of Executions Define precisely what we mean Ambiguity is bad when intuition is weak Allow reasoning Formal But mostly informal In the long run, actually more important Ask me why! We now define a formal model to allow us to precisely define linearizability and other consistency conditions. Art of Multiprocessor Programming

Split Method Calls into Two Events Invocation method name & args q.enq(x) Response result or exception q.enq(x) returns void q.deq() returns x q.deq() throws empty Art of Multiprocessor Programming

Art of Multiprocessor Programming Invocation Notation A q.enq(x) Art of Multiprocessor Programming

Art of Multiprocessor Programming Invocation Notation thread A q.enq(x) Art of Multiprocessor Programming

Art of Multiprocessor Programming Invocation Notation thread method A q.enq(x) Art of Multiprocessor Programming

Art of Multiprocessor Programming Invocation Notation thread object method A q.enq(x) Art of Multiprocessor Programming

Art of Multiprocessor Programming Invocation Notation thread object method arguments A q.enq(x) Art of Multiprocessor Programming

Art of Multiprocessor Programming Response Notation A q: void Art of Multiprocessor Programming

Art of Multiprocessor Programming Response Notation thread A q: void Art of Multiprocessor Programming

Art of Multiprocessor Programming Response Notation thread A q: void result Art of Multiprocessor Programming

Art of Multiprocessor Programming Response Notation thread object A q: void result Art of Multiprocessor Programming

Art of Multiprocessor Programming Response Notation thread object A q: void Method is implicit result Art of Multiprocessor Programming

Art of Multiprocessor Programming Response Notation thread object exception A q: empty() Method is implicit Art of Multiprocessor Programming

History - Describing an Execution A q.enq(3) A q:void A q.enq(5) B p.enq(4) B p:void B q.deq() B q:3 H = Sequence of invocations and responses Art of Multiprocessor Programming

Art of Multiprocessor Programming Definition Invocation & response match if Object names agree Thread names agree A q.enq(3) A q:void Method call Art of Multiprocessor Programming

Art of Multiprocessor Programming Object Projections A q.enq(3) A q:void B p.enq(4) B p:void B q.deq() B q:3 H = Art of Multiprocessor Programming

Art of Multiprocessor Programming Object Projections A q.enq(3) A q:void B p.enq(4) B p:void B q.deq() B q:3 H|q = Art of Multiprocessor Programming

Art of Multiprocessor Programming Thread Projections A q.enq(3) A q:void B p.enq(4) B p:void B q.deq() B q:3 H = Art of Multiprocessor Programming

Art of Multiprocessor Programming Thread Projections A q.enq(3) A q:void B p.enq(4) B p:void B q.deq() B q:3 H|B = Art of Multiprocessor Programming

An invocation is pending if it has no matching respnse Complete Subhistory A q.enq(3) A q:void A q.enq(5) B p.enq(4) B p:void B q.deq() B q:3 H = An invocation is pending if it has no matching respnse Art of Multiprocessor Programming

May or may not have taken effect Complete Subhistory A q.enq(3) A q:void A q.enq(5) B p.enq(4) B p:void B q.deq() B q:3 H = May or may not have taken effect Art of Multiprocessor Programming

discard pending invocations Complete Subhistory A q.enq(3) A q:void A q.enq(5) B p.enq(4) B p:void B q.deq() B q:3 H = discard pending invocations Art of Multiprocessor Programming

Art of Multiprocessor Programming Complete Subhistory A q.enq(3) A q:void B p.enq(4) B p:void B q.deq() B q:3 Complete(H) = Art of Multiprocessor Programming

Art of Multiprocessor Programming Sequential Histories A q.enq(3) A q:void B p.enq(4) B p:void B q.deq() B q:3 A q:enq(5) Art of Multiprocessor Programming

Art of Multiprocessor Programming Sequential Histories match A q.enq(3) A q:void B p.enq(4) B p:void B q.deq() B q:3 A q:enq(5) Art of Multiprocessor Programming

Art of Multiprocessor Programming Sequential Histories match A q.enq(3) A q:void B p.enq(4) B p:void B q.deq() B q:3 A q:enq(5) match Art of Multiprocessor Programming

Art of Multiprocessor Programming Sequential Histories match A q.enq(3) A q:void B p.enq(4) B p:void B q.deq() B q:3 A q:enq(5) match match Art of Multiprocessor Programming

Final pending invocation OK Sequential Histories match A q.enq(3) A q:void B p.enq(4) B p:void B q.deq() B q:3 A q:enq(5) match match Final pending invocation OK Art of Multiprocessor Programming

Method calls of different threads do not interleave Sequential Histories match A q.enq(3) A q:void B p.enq(4) B p:void B q.deq() B q:3 A q:enq(5) Method calls of different threads do not interleave match match Final pending invocation OK Art of Multiprocessor Programming

Well-Formed Histories A q.enq(3) B p.enq(4) B p:void B q.deq() A q:void B q:3 H= Art of Multiprocessor Programming

Well-Formed Histories Per-thread projections sequential B p.enq(4) B p:void B q.deq() B q:3 H|B= A q.enq(3) B p.enq(4) B p:void B q.deq() A q:void B q:3 H= Art of Multiprocessor Programming

Well-Formed Histories Per-thread projections sequential B p.enq(4) B p:void B q.deq() B q:3 H|B= A q.enq(3) B p.enq(4) B p:void B q.deq() A q:void B q:3 H= A q.enq(3) A q:void H|A= Art of Multiprocessor Programming

Threads see the same thing in both Equivalent Histories H|A = G|A H|B = G|B Threads see the same thing in both A q.enq(3) B p.enq(4) B p:void B q.deq() A q:void B q:3 A q.enq(3) A q:void B p.enq(4) B p:void B q.deq() B q:3 H= G= Art of Multiprocessor Programming

Sequential Specifications A sequential specification is some way of telling whether a Single-thread, single-object history Is legal For example: Pre and post-conditions But plenty of other techniques exist … Art of Multiprocessor Programming

Art of Multiprocessor Programming Legal Histories A sequential (multi-object) history H is legal if For every object x H|x is in the sequential spec for x Art of Multiprocessor Programming

Art of Multiprocessor Programming Precedence A q.enq(3) B p.enq(4) B p.void A q:void B q.deq() B q:3 A method call precedes another if response event precedes invocation event Method call Method call Art of Multiprocessor Programming

Some method calls overlap one another Non-Precedence A q.enq(3) B p.enq(4) B p.void B q.deq() A q:void B q:3 Some method calls overlap one another . Method call Method call Art of Multiprocessor Programming

Art of Multiprocessor Programming Notation Given History H method executions m0 and m1 in H We say m0 H m1, if m0 precedes m1 Relation m0 H m1 is a Partial order Total order if H is sequential m0 m1 Maybe the main point to notice is that in general, given a history, precedence only defines a partial order – because we have methods that overlaps. So in a sequential history, this is a total order, because we have no overlapping methods. The fact that this is only partial, is exactly the ambiguity about the order that we do not like, and that linearizability comes to fix. So what we really want, is to take the partial order, and somehow complete it to a total order – and that’s how we’ll define linearizability in the next slide. Art of Multiprocessor Programming

Art of Multiprocessor Programming Linearizability History H is linearizable if it can be extended to G by Appending zero or more responses to pending invocations Discarding other pending invocations So that G is equivalent to Legal sequential history S where G  S First part: takes care of pending invocations. If the operation took effect, we want to append a response, otherwise, let’s drop it. Second part, takes care of completing the missing part in the partial order – it tells us to find an equivalent legal sequential history (in which precedence relation defines total order), such that this total order respects the partial order in the original history. Of course, the correctness part hides in the fact that the sequential history is legal. Explain that equivalence captures, apart from the ordering, the fact that the input/output relation of the methods must be the same as in the sequential execution. The ordering part is also covered by the G \subset S requirement but the fac that the values must be the same is captured only by the equivalence requirement. When we later go to seq consistency the only ordering property will come from the equivalence property. Art of Multiprocessor Programming

Art of Multiprocessor Programming Remarks Some pending invocations Took effect, so keep them Discard the rest Condition G  S Means that S respects “real-time order” of G Art of Multiprocessor Programming

Art of Multiprocessor Programming Ensuring G  S G = {ac,bc} S = {ab,ac,bc} A limitation on the Choice of S! a G b c The G  S is a limitation on S NOT ON G! Its saying that you cannot just pick any S, you must pick an  S so that it contains G, that is, its saying that when we pick the linearization points, they have got to be within the intervals, and we can only pick an arbitrary order when the intervals are unordered. G time time S Art of Multiprocessor Programming

Art of Multiprocessor Programming Example A q.enq(3) B q.enq(4) B q:void B q.deq() B q:4 B q:enq(6) A q.enq(3) B q.enq(4) B q.deq(4) B q.enq(6) time Art of Multiprocessor Programming

Art of Multiprocessor Programming Example A q.enq(3) B q.enq(4) B q:void B q.deq() B q:4 B q:enq(6) Complete this pending invocation A q.enq(3) B q.enq(4) B q.deq(3) B q.enq(6) time Art of Multiprocessor Programming

Art of Multiprocessor Programming Example A q.enq(3) B q.enq(4) B q:void B q.deq() B q:4 B q:enq(6) A q:void Complete this pending invocation A q.enq(3) B q.enq(4) B q.deq(4) B q.enq(6) time Art of Multiprocessor Programming

Art of Multiprocessor Programming Example A q.enq(3) B q.enq(4) B q:void B q.deq() B q:4 B q:enq(6) A q:void discard this one A q.enq(3) B q.enq(4) B q.deq(4) B q.enq(6) time Art of Multiprocessor Programming

Art of Multiprocessor Programming Example A q.enq(3) B q.enq(4) B q:void B q.deq() B q:4 A q:void discard this one A q.enq(3) B q.enq(4) B q.deq(4) time Art of Multiprocessor Programming

Art of Multiprocessor Programming Example A q.enq(3) B q.enq(4) B q:void B q.deq() B q:4 A q:void A q.enq(3) B q.enq(4) B q.deq(4) time Art of Multiprocessor Programming

Art of Multiprocessor Programming Example A q.enq(3) B q.enq(4) B q:void B q.deq() B q:4 A q:void B q.enq(4) B q:void A q.enq(3) A q:void B q.deq() B q:4 A q.enq(3) B q.enq(4) B q.deq(4) time Art of Multiprocessor Programming

Art of Multiprocessor Programming Example Equivalent sequential history A q.enq(3) B q.enq(4) B q:void B q.deq() B q:4 A q:void B q.enq(4) B q:void A q.enq(3) A q:void B q.deq() B q:4 A q.enq(3) B q.enq(4) B q.deq(4) time Art of Multiprocessor Programming

Art of Multiprocessor Programming Concurrency How much concurrency does linearizability allow? When must a method invocation block? The following slides about non-blocking can be presented to graduate student sbut not clear its needed for undergradutes Art of Multiprocessor Programming

Art of Multiprocessor Programming Concurrency Focus on total methods Defined in every state Example: deq() that throws Empty exception Versus deq() that waits … Why? Otherwise, blocking unrelated to synchronization Art of Multiprocessor Programming

Art of Multiprocessor Programming Concurrency Question: When does linearizability require a method invocation to block? Answer: never. Linearizability is non-blocking Art of Multiprocessor Programming

Art of Multiprocessor Programming Non-Blocking Theorem If method invocation A q.inv(…) is pending in history H, then there exists a response A q:res(…) such that H + A q:res(…) is linearizable This is important because consistency conditions such as serializability do not provide it, transactions may have to be rolled back to work but linearizability means you don’t need to role back. Art of Multiprocessor Programming

Art of Multiprocessor Programming Proof Pick linearization S of H If S already contains Invocation A q.inv(…) and response, Then we are done. Otherwise, pick a response such that S + A q.inv(…) + A q:res(…) Possible because object is total. Since H is linearizable, let’s pick up the sequential history S that defines the linearization of H. If S contains the pending invocation and a response – then we’re done (I want to prove that we don’t need to throw it away in order to be linearizable, so if the linearization S did not through it away – there is nothing to do). If it did throw it away – then we need to show we can add it somewhere – so what we’ll do is adding it at the end of S. It is possible because all methods are total – so they are defined in the state at the end of S (meaning that we make the operation take effect last). I claim that what I ended up with is a valid linearization of H: It is still equivalent – I only added the invocation as it was in the original history, with the same arguments, and an appropriate response for both the original and the new sequential histories. These invocation and response are the last operation done by thread A in H, and it is the last operation done by thread A in the new sequential history we’ve created. So each thread’s view of the two histories is the same. It is legal: because we added it up at the end, it cannot effect the returned values of previous calls. Finally, it does not violate the partial order, because we can linearize that operation anywhere between it’s invocation and response, and in this case there was no response – so we can definitely linearize it last without violating any ordering. Art of Multiprocessor Programming

Composability Theorem History H is linearizable if and only if For every object x H|x is linearizable We care about objects only! (Materialism?) Art of Multiprocessor Programming

Why Does Composability Matter? Modularity Can prove linearizability of objects in isolation Can compose independently-implemented objects Art of Multiprocessor Programming

Reasoning About Linearizability: Locking public T deq() throws EmptyException { lock.lock(); try { if (tail == head) throw new EmptyException(); T x = items[head % items.length]; head++; return x; } finally { lock.unlock(); } 1 capacity-1 2 head tail y z Art of Multiprocessor Programming

Reasoning About Linearizability: Locking public T deq() throws EmptyException { lock.lock(); try { if (tail == head) throw new EmptyException(); T x = items[head % items.length]; head++; return x; } finally { lock.unlock(); } 1 capacity-1 2 head tail y z Linearization points are when locks are released Art of Multiprocessor Programming

More Reasoning: Wait-free public class WaitFreeQueue { int head = 0, tail = 0; items = (T[]) new Object[capacity]; public void enq(Item x) { if (tail-head == capacity) throw new FullException(); items[tail % capacity] = x; tail++; } public Item deq() { if (tail == head) throw new EmptyException(); Item item = items[head % capacity]; head++; return item; }} 1 capacity-1 2 head tail y z head tail 1 y capacity-1 z 2 Art of Multiprocessor Programming

More Reasoning: Wait-free public class WaitFreeQueue { int head = 0, tail = 0; items = (T[]) new Object[capacity]; public void enq(Item x) { if (tail-head == capacity) throw new FullException(); items[tail % capacity] = x; tail++; } public Item deq() { if (tail == head) throw new EmptyException(); Item item = items[head % capacity]; head++; return item; }} Linearization order is order head and tail fields modified Remember that there Is only one enqueuer and only one dequeuer Art of Multiprocessor Programming

Art of Multiprocessor Programming Strategy Identify one atomic step where method “happens” Critical section Machine instruction Doesn’t always work Might need to define several different steps for a given method We will see this phenomena that you might need different steps for different executions in lists chapter. Art of Multiprocessor Programming

Linearizability: Summary Powerful specification tool for shared objects Allows us to capture the notion of objects being “atomic” Don’t leave home without it We will see this phenomena that you might need different steps for different executions in lists chapter. Art of Multiprocessor Programming

Alternative: Sequential Consistency History H is Sequentially Consistent if it can be extended to G by Appending zero or more responses to pending invocations Discarding other pending invocations So that G is equivalent to a Legal sequential history S The difference from linearizability is in the second property, that it preserves the real time order…notice that the fact that G is EQUIVALENT to S implies that S must preserve the PROGRAM ORDER in each thread. So dropping Where G  S only allows reordering operations by different threads in order to establish a global total (sequential) order. Where G  S Differs from linearizability Art of Multiprocessor Programming

Sequential Consistency No need to preserve real-time order Cannot re-order operations done by the same thread Can re-order non-overlapping operations done by different threads Often used to describe multiprocessor memory architectures Art of Multiprocessor Programming

Art of Multiprocessor Programming Example time Art of Multiprocessor Programming

Art of Multiprocessor Programming Example q.enq(x) time Art of Multiprocessor Programming

Art of Multiprocessor Programming Example q.enq(x) q.deq(y) time Art of Multiprocessor Programming

Art of Multiprocessor Programming Example q.enq(x) q.deq(y) q.enq(y) time Art of Multiprocessor Programming

Art of Multiprocessor Programming Example q.enq(x) q.enq(y) q.enq(x) q.deq(y) q.enq(y) time Art of Multiprocessor Programming

Art of Multiprocessor Programming Example not linearizable q.enq(x) q.enq(x) q.deq(y) q.enq(y) q.enq(y) time Art of Multiprocessor Programming

Yet Sequentially Consistent Example Yet Sequentially Consistent q.enq(x) q.enq(x) q.deq(y) q.enq(y) q.enq(y) time Art of Multiprocessor Programming

Art of Multiprocessor Programming Theorem Sequential Consistency is not composable Art of Multiprocessor Programming

Art of Multiprocessor Programming FIFO Queue Example p.enq(x) q.enq(x) p.deq(y) time time Art of Multiprocessor Programming

Art of Multiprocessor Programming FIFO Queue Example p.enq(x) q.enq(x) p.deq(y) q.enq(y) p.enq(y) q.deq(x) time time Art of Multiprocessor Programming

Art of Multiprocessor Programming FIFO Queue Example p.enq(x) q.enq(x) p.deq(y) q.enq(y) p.enq(y) q.deq(x) History H time time Art of Multiprocessor Programming

H|p Sequentially Consistent p.enq(x) q.enq(x) p.deq(y) q.enq(y) p.enq(y) q.deq(x) time time Art of Multiprocessor Programming

H|q Sequentially Consistent p.enq(x) q.enq(x) p.deq(y) q.enq(y) p.enq(y) q.deq(x) time time Art of Multiprocessor Programming

Art of Multiprocessor Programming Ordering imposed by p p.enq(x) q.enq(x) p.deq(y) q.enq(y) p.enq(y) q.deq(x) time time Art of Multiprocessor Programming

Art of Multiprocessor Programming Ordering imposed by q p.enq(x) q.enq(x) p.deq(y) q.enq(y) p.enq(y) q.deq(x) time time Art of Multiprocessor Programming

Ordering imposed by both p.enq(x) q.enq(x) p.deq(y) q.enq(y) p.enq(y) q.deq(x) time time Art of Multiprocessor Programming

Art of Multiprocessor Programming Combining orders p.enq(x) q.enq(x) p.deq(y) q.enq(y) p.enq(y) q.deq(x) time time Art of Multiprocessor Programming

Art of Multiprocessor Programming Fact Most hardware architectures don’t support sequential consistency Because they think it’s too strong Here’s another story … Art of Multiprocessor Programming

Art of Multiprocessor Programming The Flag Example y.read(0) x.write(1) y.write(1) x.read(0) time time Art of Multiprocessor Programming

Art of Multiprocessor Programming The Flag Example y.read(0) x.write(1) y.write(1) x.read(0) Each thread’s view is sequentially consistent It went first time Art of Multiprocessor Programming

Art of Multiprocessor Programming The Flag Example y.read(0) x.write(1) y.write(1) x.read(0) Entire history isn’t sequentially consistent Can’t both go first time Art of Multiprocessor Programming

Art of Multiprocessor Programming The Flag Example y.read(0) x.write(1) y.write(1) x.read(0) Is this behavior really so wrong? We can argue either way … time Art of Multiprocessor Programming

Art of Multiprocessor Programming Opinion: It’s Wrong This pattern Write mine, read yours Is exactly the flag principle Beloved of Alice and Bob Heart of mutual exclusion Peterson Bakery, etc. It’s non-negotiable! Art of Multiprocessor Programming

Art of Multiprocessor Programming © 2003 Herlihy and Shavit Peterson's Algorithm public void lock() { flag[i] = true; victim = i; while (flag[j] && victim == i) {}; } public void unlock() { flag[i] = false; We now combine the two algorithms, one that does not deadlock when they are concurrent, and the other that does not deadlock when they are sequential, to derive an algorithm that never deadlocks. Art of Multiprocessor Programming

Art of Multiprocessor Programming © 2003 Herlihy and Shavit Crux of Peterson Proof (1) writeB(flag[B]=true)writeB(victim=B) (3) writeB(victim=B)writeA(victim=A) (2) writeA(victim=A)readA(flag[B])  readA(victim) writeB(victim=B) and writeA(victim=A) appear twice so remove them Art of Multiprocessor Programming 181 181

Art of Multiprocessor Programming © 2003 Herlihy and Shavit Crux of Peterson Proof (1) writeB(flag[B]=true)writeB(victim=B) (3) writeB(victim=B)writeA(victim=A) (2) writeA(victim=A)readA(flag[B])  readA(victim) writeB(victim=B) and writeA(victim=A) appear twice so remove them Observation: proof relied on fact that if a location is stored, a later load by some thread will return this or a later stored value. Art of Multiprocessor Programming 182 182

Opinion: But It Feels So Right … Many hardware architects think that sequential consistency is too strong Too expensive to implement in modern hardware OK if flag principle violated by default Honored by explicit request Art of Multiprocessor Programming

Slide used with permission of Charles E. Leiserson Hardware Consistancy Initially, a = b = 0. Processor 0 Processor 1 mov 1, a ;Store mov b, %ebx ;Load mov 1, b ;Store mov a, %eax ;Load What are the final possible values of %eax and %ebx after both processors have executed? Sequential consistency implies that no execution ends with %eax= %ebx = 0 Slide used with permission of Charles E. Leiserson

Art of Multiprocessor Programming Hardware Consistency No modern-day processor implements sequential consistency. Hardware actively reorders instructions. Compilers may reorder instructions, too. Why? Because most of performance is derived from a single thread’s unsynchronized execution of code. Art of Multiprocessor Programming

Instruction Reordering mov 1, a ;Store mov b, %ebx ;Load mov b, %ebx ;Load mov 1, a ;Store Program Order Execution Order Q. Why might the hardware or compiler decide to reorder these instructions? A. To obtain higher performance by covering load latency — instruction-level parallelism. Slide used with permission of Charles E. Leiserson

Instruction Reordering mov 1, a ;Store mov b, %ebx ;Load mov b, %ebx ;Load mov 1, a ;Store Program Order Execution Order Q. When is it safe for the hardware or compiler to perform this reordering? A. When a ≠ b. A′. And there’s no concurrency. Slide used with permission of Charles E. Leiserson

Slide used with permission of Charles E. Leiserson Hardware Reordering Load Bypass Memory System Store Buffer Processor Network Processor can issue stores faster than the network can handle them ⇒ store buffer. Loads take priority, bypassing the store buffer. Except if a load address matches an address in the store buffer, the store buffer returns the result. Slide used with permission of Charles E. Leiserson

Art of Multiprocessor Programming X86: Memory Consistency Thread’s Code Store1 Loads are not reordered with loads. Stores are not reordered with stores. Stores are not reordered with prior loads. A load may be reordered with a prior store to a different location but not with a prior store to the same location. Stores to the same location respect a global total order. Store2 Load1 Load2 Store3 Store4 Total store order Load3 Load4 Load5 Art of Multiprocessor Programming

Art of Multiprocessor Programming X86: Memory Consistency Thread’s Code Store1 Loads are not reordered with loads. Stores are not reordered with stores. Stores are not reordered with prior loads. A load may be reordered with a prior store to a different location but not with a prior store to the same location. Stores to the same location respect a global total order. Store2 Total Store Ordering (TSO)…weaker than sequential consistency Load1 Load2 Store3 Loads Store4 Total store order Load3 Load4 OK! Load5 Art of Multiprocessor Programming

Memory Barriers (Fences) A memory barrier (or memory fence) is a hardware action that enforces an ordering constraint between the instructions before and after the fence. A memory barrier can be issued explicitly as an instruction (x86: mfence) The typical cost of a memory fence is comparable to that of an L2-cache access. Art of Multiprocessor Programming

X86: Memory Consistency Thread’s Code Store1 Store2 Loads are not reordered with loads. Stores are not reordered with stores. Stores are not reordered with prior loads. A load may be reordered with a prior store to a different location but not with a prior store to the same location. Stores to the same location respect a global total order. Store2 Total Store Ordering + properly placed memory barriers = sequential consistency Load1 Load2 Store3 Store4 Total store order Barrier Load3 Load4 Load5

Art of Multiprocessor Programming Memory Barriers Explicit Synchronization Memory barrier will Flush write buffer Bring caches up to date Compilers often do this for you Entering and leaving critical sections Art of Multiprocessor Programming

Art of Multiprocessor Programming Volatile Variables In Java, can ask compiler to keep a variable up-to-date by declaring it volatile Adds a memory barrier after each store Inhibits reordering, removing from loops, & other “compiler optimizations” Will talk about it in detail in later lectures Art of Multiprocessor Programming

Art of Multiprocessor Programming Summary: Real-World Hardware weaker than sequential consistency Can get sequential consistency at a price Linearizability better fit for high-level software Art of Multiprocessor Programming

Art of Multiprocessor Programming Linearizability Linearizability Operation takes effect instantaneously between invocation and response Uses sequential specification, locality implies composablity Art of Multiprocessor Programming

Art of Multiprocessor Programming Summary: Correctness Sequential Consistency Not composable Harder to work with Good way to think about hardware models We will use linearizability as our consistency condition in the remainder of this course unless stated otherwise Art of Multiprocessor Programming

Art of Multiprocessor Programming Progress We saw an implementation whose methods were lock-based (deadlock-free) We saw an implementation whose methods did not use locks (lock-free) How do they relate? So far we have talked about correctness. Now its time to discuss progress. Art of Multiprocessor Programming

Art of Multiprocessor Programming Progress Conditions Deadlock-free: some thread trying to acquire the lock eventually succeeds. Starvation-free: every thread trying to acquire the lock eventually succeeds. Lock-free: some thread calling a method eventually returns. Wait-free: every thread calling a method eventually returns. The above are definitions of progress conditions we have used and will use in the coming lectures. We give here informal definitions of progress conditions…formal ones need to talk about fair histories which is beyond the scope of this lecture…for the above conditions: A method implementation is deadlock-free if it guarantees min- imal progress in every fair history, and maximal progress in some fair history. A method implementation is starvation-free if it guarantees maximal progress in every fair history. A method implementation is lock-free if it guarantees minimal progress in every history and maximal progress in some history. A method implementation is wait-free if it guarantees maximal progress in every history. Art of Multiprocessor Programming

Art of Multiprocessor Programming Progress Conditions Non-Blocking Blocking Everyone makes progress Wait-free Starvation-free Someone makes progress Lock-free Deadlock-free Art of Multiprocessor Programming

Art of Multiprocessor Programming Summary We will look at linearizable blocking and non-blocking implementations of objects. Art of Multiprocessor Programming

Art of Multiprocessor Programming           This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 License. You are free: to Share — to copy, distribute and transmit the work to Remix — to adapt the work Under the following conditions: Attribution. You must attribute the work to “The Art of Multiprocessor Programming” (but not in any way that suggests that the authors endorse you or your use of the work). Share Alike. If you alter, transform, or build upon this work, you may distribute the resulting work only under the same, similar or a compatible license. For any reuse or distribution, you must make clear to others the license terms of this work. The best way to do this is with a link to http://creativecommons.org/licenses/by-sa/3.0/. Any of the above conditions can be waived if you get permission from the copyright holder. Nothing in this license impairs or restricts the author's moral rights. Art of Multiprocessor Programming