Concurrent Computing Seminar
Introductory Lecture
Instructor: Danny Hendler
http://www.cs.bgu.ac.il/~satcc112
Requirements
- Select a paper and notify me by March 1st
- Study the paper and prepare a good presentation
- Give the seminar talk
- Participate in at least 80% of the seminar talks
Seminar Plan
- 21/2/11: Introductory lecture #1
- 28/2/11: Introductory lecture #2; papers list published, students send their 3 preferences
- 3/3/11: Paper assignment published
- 7/3/11: Student talks start
- Student talks continue until the semester ends
Talk outline: Motivation, Locks, Nonblocking synchronization, Shared objects and linearizability, Transactional memory, Consensus
From the New York Times…
Moore's law: exponential growth in computing power
The Future of Computing
- Speeding up uniprocessors is harder and harder
- Intel, Sun, AMD, and IBM are now focusing on multi-core architectures
- Already, most computers are multiprocessors
- How can we write correct and scalable algorithms for multiprocessors?
Race conditions: a fundamental problem of thread-level parallelism

Thread A:
  Account[i] = Account[i] - X;
  Account[j] = Account[j] + X;

Thread B:
  Account[i] = Account[i] - X;
  Account[j] = Account[j] + X;

But what if execution is concurrent? We must avoid race conditions.
Key synchronization alternatives
- Mutual exclusion locks: coarse-grained locks, fine-grained locks
- Nonblocking synchronization
- Transactional memory
Talk outline: Motivation, Locks, Nonblocking synchronization, Shared objects and linearizability, Transactional memory, Consensus
The mutual exclusion problem (Dijkstra, 1965)
We need to devise a protocol that guarantees mutually exclusive access by processes to a shared resource (such as a file, printer, etc.)
The problem model
- Shared-memory multiprocessor: multiple processes
- Processes can apply atomic reads, writes, or stronger read-modify-write operations to shared variables
- Completely asynchronous
Mutual exclusion: algorithm structure
loop forever
  Remainder code
  Entry code
  Critical section (CS)
  Exit code
end loop
Mutual exclusion: formal definitions
- Mutual exclusion: no two processes are in their CS at the same time.
- Deadlock-freedom: if a process is trying to enter its critical section, then some process eventually enters its critical section.
- Starvation-freedom (optional): if a process is trying to enter its critical section, then this process must eventually enter its critical section.
Assumption: processes do not fail-stop while performing the entry, CS, or exit code.
Candidate algorithm
initially: turn = 0

Program for process 0:
1. await turn = 0
2. CS of process 0
3. turn := 1

Program for process 1:
1. await turn = 1
2. CS of process 1
3. turn := 0

Does the algorithm satisfy mutual exclusion? Yes.
Does it satisfy deadlock-freedom? No.
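As an illustration only, here is a minimal Java sketch of this candidate algorithm (the class and method names are ours, not from the slides); the volatile field plays the role of the shared turn variable:

class AlternationLock {
    private volatile int turn = 0;        // initially process 0 may enter

    public void lock(int i) {             // i is 0 or 1
        while (turn != i) { /* spin: await turn = i */ }
    }

    public void unlock(int i) {
        turn = 1 - i;                     // hand the turn to the other process
    }
}

If one process stops asking for the lock, the other eventually blocks forever waiting for the turn, which is exactly why deadlock-freedom fails.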
Peterson's 2-process algorithm (Peterson, 1981)
initially: b[0] = false, b[1] = false, turn = 0 or 1

Program for process 0:
1. b[0] := true
2. turn := 0
3. await (b[1] = false or turn = 1)
4. CS
5. b[0] := false

Program for process 1:
1. b[1] := true
2. turn := 1
3. await (b[0] = false or turn = 0)
4. CS
5. b[1] := false
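A minimal Java sketch of Peterson's lock, following the slide's convention that process i sets turn := i and waits until the other process is uninterested or the turn has passed to it (class and field names are ours); AtomicBoolean and volatile are used so that the reads and writes are visible across threads:

import java.util.concurrent.atomic.AtomicBoolean;

class PetersonLock {
    private final AtomicBoolean[] b =
        { new AtomicBoolean(false), new AtomicBoolean(false) };  // b[i]: process i wants in
    private volatile int turn = 0;

    public void lock(int i) {             // i is 0 or 1
        int j = 1 - i;
        b[i].set(true);                   // announce interest
        turn = i;
        while (b[j].get() && turn == i) { /* spin: await b[j] = false or turn = j */ }
    }

    public void unlock(int i) {
        b[i].set(false);
    }
}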
Mutual exclusion for n processes: tournament trees
[Figure: a binary tree of 2-process locks; the root is level 0, its children are level 1, and so on, with the n processes at the leaves. A tree node is identified by [level, node#].]
(Synchronization Algorithms and Concurrent Programming, Gadi Taubenfeld, © 2006)
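A sketch of how such a tournament lock can be built from 2-process locks, reusing the PetersonLock sketch above; n is assumed to be a power of two, and the heap-style node numbering is our own illustration, not the slides':

class TournamentLock {
    private final int n;                    // number of processes, assumed a power of two
    private final PetersonLock[] node;      // heap-indexed: node 1 is the root, leaves are n/2..n-1

    TournamentLock(int n) {
        this.n = n;
        node = new PetersonLock[n];
        for (int k = 1; k < n; k++) node[k] = new PetersonLock();
    }

    public void lock(int id) {              // id in 0..n-1
        int k = n + id;
        while (k > 1) {                     // climb from the leaf towards the root
            node[k / 2].lock(k % 2);        // play side 0 or 1 of the 2-process lock at the parent
            k = k / 2;
        }
    }

    public void unlock(int id) {
        int levels = Integer.numberOfTrailingZeros(n);   // log2(n) locks on the path
        int[] path = new int[levels];
        int k = n + id;
        for (int d = 0; d < levels; d++) { path[d] = k; k = k / 2; }
        for (int d = levels - 1; d >= 0; d--)            // release from the root downwards
            node[path[d] / 2].unlock(path[d] % 2);
    }
}

Releasing from the root downwards matters: a process from the same subtree can only reach a node after the locks below it have been released, so it never races with our flag-clearing at that node.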
Talk outline: Motivation, Locks, Nonblocking synchronization, Shared objects and linearizability, Transactional memory, Consensus
Synchronization alternatives: nonblocking synchronization
- Various progress guarantees: wait-freedom, lock-freedom, obstruction-freedom
- Generally requires strong synchronization operations
- Pros: potentially scalable; avoids lock hazards
- Cons: typically complicated to program
Read-modify-write operations
The use of locks can sometimes be avoided if the hardware supports stronger read-modify-write operations, and not just read and write:
- Test-and-set
- Fetch-and-add
- Compare-and-swap

Test-and-set(w)
  atomically:
    v := read from w
    w := 1
    return v

Fetch-and-add(w, delta)
  atomically:
    v := read from w
    w := v + delta
    return v
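For concreteness, here is a hedged Java sketch (our own illustration) of how these operations surface in java.util.concurrent.atomic: getAndSet is a test-and-set on a boolean, and getAndAdd is fetch-and-add; the test-and-set spin lock built on top is the classic use:

import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

class RmwExamples {
    // Test-and-set spin lock: getAndSet(true) atomically writes true and returns the old value.
    static class TasLock {
        private final AtomicBoolean locked = new AtomicBoolean(false);
        void lock()   { while (locked.getAndSet(true)) { /* spin until the old value was false */ } }
        void unlock() { locked.set(false); }
    }

    // Fetch-and-add: getAndAdd atomically adds delta and returns the previous value.
    static int fetchAndAdd(AtomicInteger w, int delta) {
        return w.getAndAdd(delta);
    }
}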
The compare-and-swap (CAS) operation

Compare&swap(w, expected, new)
  atomically:
    v := read from w
    if (v = expected) {
      w := new
      return success
    } else
      return failure

Architectures with hardware support for such operations: MIPS, PowerPC, DEC Alpha, Motorola 680x0, IBM 370, Sun SPARC, 80x86/Pentium
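A tiny Java illustration (ours) of the success/failure semantics, via AtomicInteger.compareAndSet:

import java.util.concurrent.atomic.AtomicInteger;

public class CasDemo {
    public static void main(String[] args) {
        AtomicInteger w = new AtomicInteger(5);
        boolean ok1 = w.compareAndSet(5, 7);   // succeeds: w was 5, becomes 7
        boolean ok2 = w.compareAndSet(5, 9);   // fails: w is now 7, not 5
        System.out.println(ok1 + " " + ok2 + " " + w.get());   // prints: true false 7
    }
}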
An example CAS usage: Treiber's stack algorithm
[Figure: Top points to a linked list of nodes, each holding val and next.]

Push(int v, Stack S)
1. n := new NODE            ; create node for new stack item
2. n.val := v               ; write item value
3. do forever               ; repeat until success
4.   node top := S.top
5.   n.next := top          ; next points to current top (LIFO order)
6.   if compare&swap(S.top, top, n)   ; try to add new item
7.     return               ; return if succeeded
8. od
Treiber's stack algorithm (cont'd)

Pop(Stack S)
1. do forever
2.   top := S.top
3.   if top = null
4.     return empty
5.   if compare&swap(S.top, top, top.next)
6.     return-val := top.val
7.     free top
8.     return return-val
9. od
Nonblocking progress conditions
- Wait-freedom: each thread terminates its operation in a finite number of its own steps
- Lock-freedom: after a finite number of steps, some thread terminates its operation
- Obstruction-freedom: if a thread runs by itself long enough, it finishes its operation
Talk outline: Motivation, Locks, Nonblocking synchronization, Shared objects and linearizability, Transactional memory, Consensus
Shared objects
Shared objects and implementations
- Each object has a state, usually given by a set of shared memory fields
- Objects may be implemented from simpler base objects
- Each object supports a set of operations, the only way to manipulate its state (e.g., a shared stack supports the push and pop operations)
Executions induce only a partial order on operations
[Figure: a timeline with overlapping q.enq(x), q.enq(y), q.deq(x), q.deq(y) intervals; each operation spans the time from its invocation to its response.]
Correctness condition: linearizability
Linearizability (an intuitive definition): we can find a point within the time interval of each operation where the operation "took place", such that the resulting order of operations is legal.
Example: a queue
[Figure: a history with overlapping q.enq(x), q.enq(y), q.deq(x), q.deq(y) operations for which valid linearization points can be chosen — linearizable.]
Example
[Figure: a history with q.enq(x), q.enq(y), and q.deq(y) for which no valid choice of linearization points exists — not linearizable.]
Example
[Figure: a history with q.enq(x) and q.deq(x) — linearizable.]
Example
[Figure: a history with q.enq(x), q.enq(y), q.deq(y), q.deq(x) that admits more than one valid choice of linearization points — multiple orders are OK.]
Linearization points in Treiber's algorithm
In both Push and Pop, the successful compare&swap on S.top is the linearization point; a Pop that returns empty is linearized at the read of S.top that returned null.

Push(int v, Stack S)
1. n := new NODE            ; create node for new stack item
2. n.val := v               ; write item value
3. do forever               ; repeat until success
4.   node top := S.top
5.   n.next := top          ; next points to current top (LIFO order)
6.   if compare&swap(S.top, top, n)   ; try to add new item
7.     return               ; return if succeeded
8. od

Pop(Stack S)
1. do forever
2.   top := S.top
3.   if top = null
4.     return empty
5.   if compare&swap(S.top, top, top.next)
6.     return-val := top.val
7.     free top
8.     return return-val
9. od
Talk outline: Motivation, Locks, Nonblocking synchronization, Shared objects and linearizability, Transactional memory, Consensus
Transactional Memory
- A transaction is a sequence of memory reads and writes, executed by a single thread, that either commits or aborts
- If a transaction commits, all of its reads and writes appear to have executed atomically
- If a transaction aborts, none of its stores take effect
- A transaction's operations are not visible to other threads until it commits (if it does)
Transactional Memory: Goals
- A new multiprocessor architecture
- The goal: implementing lock-free synchronization that is efficient and easy to use, compared with conventional techniques based on mutual exclusion
- Implemented by hardware support (such as straightforward extensions to multiprocessor cache-coherence protocols) and/or by software mechanisms
A Usage Example

Sequential code:
  Account[i] = Account[i] - X;
  Account[j] = Account[j] + X;

Locks:
  Lock(L[i]);
  Lock(L[j]);
  Account[i] = Account[i] - X;
  Account[j] = Account[j] + X;
  Unlock(L[j]);
  Unlock(L[i]);

Transactional memory:
  atomic {
    Account[i] = Account[i] - X;
    Account[j] = Account[j] + X;
  }
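To make the comparison concrete, here is a hedged Java sketch (the array names and the transfer method are our own illustration, not the slides') of the lock-based version; note that the locks must be acquired in a consistent global order, otherwise two opposite transfers between i and j can deadlock — exactly the kind of hazard the atomic block hides:

import java.util.concurrent.locks.ReentrantLock;

class Transfers {
    static void transfer(long[] account, ReentrantLock[] lock, int i, int j, long x) {
        int first = Math.min(i, j), second = Math.max(i, j);   // global lock order: lower index first
        lock[first].lock();
        lock[second].lock();
        try {
            account[i] -= x;
            account[j] += x;
        } finally {
            lock[second].unlock();
            lock[first].unlock();
        }
    }
}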
Transactions interaction: transactions execute in commit order
[Figure: Transaction A loads 0xdddd, stores to 0xbeef, and commits; Transaction B loads 0xdddd and 0xbbbb and commits; Transaction C loads 0xbeef before A's commit, so a violation is detected and C re-executes with the new data.]
(Taken from a presentation by Royi Maimon & Merav Havuv, prepared for a seminar given by Prof. Yehuda Afek.)
Talk outline: Motivation, Locks, Nonblocking synchronization, Shared objects and linearizability, Transactional memory, Consensus
Formally: the consensus object
- Supports a single operation: decide
- Each process p_i calls decide with some input v_i from some domain; decide returns a value from the same domain
- The following requirements must be met:
  - Agreement: in any execution E, all decide operations must return the same value
  - Validity: the values returned by the operations must equal one of the inputs
Wait-free consensus can be solved easily by compare&swap

Compare&swap(b, old, new)
  atomically:
    v := read from b
    if (v = old) {
      b := new
      return success
    } else
      return failure

Architectures with hardware support for such operations: MIPS, PowerPC, DEC Alpha, Motorola 680x0, IBM 370, Sun SPARC, 80x86
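A minimal Java sketch (ours) of how a single compare-and-swap cell solves wait-free consensus for any number of processes, assuming inputs are non-null:

import java.util.concurrent.atomic.AtomicReference;

public class CasConsensus<T> {
    private final AtomicReference<T> decision = new AtomicReference<>(null);

    // Each process calls decide once with its (non-null) input.
    public T decide(T v) {
        decision.compareAndSet(null, v);   // only the first CAS succeeds
        return decision.get();             // every process returns the winner's input
    }
}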
Would this consensus algorithm, built from reads and writes, work?

Initially decision = null

Decide(v)                ; code for p_i, i = 0, 1
1. if (decision = null)
2.   decision := v
3.   return v
4. else
5.   return decision

(It would not: both processes can read null in line 1 and then decide their own, different, values.)
A proof that wait-free consensus for 2 or more processes cannot be solved using registers.
A FIFO queue
Supports 2 operations:
- q.enqueue(x): returns ack
- q.dequeue: returns the first item in the queue, or empty if the queue is empty
A FIFO queue plus registers can implement 2-process consensus

Initially Q = <0> and Prefer[i] = null, i = 0, 1

Decide(v)                ; code for p_i, i = 0, 1
1. Prefer[i] := v
2. qval := Q.deq()
3. if (qval = 0) then return v
4. else return Prefer[1 - i]

It follows that there is no wait-free implementation of a FIFO queue shared by 2 or more processes from registers.
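A hedged Java sketch of the same construction (the class and field names are ours; ConcurrentLinkedQueue merely stands in for the atomic FIFO queue assumed by the algorithm):

import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicReferenceArray;

public class QueueConsensus<T> {
    private static final Object WIN = new Object();       // the single item initially in Q
    private final Queue<Object> q = new ConcurrentLinkedQueue<>();
    private final AtomicReferenceArray<T> prefer = new AtomicReferenceArray<>(2);

    public QueueConsensus() { q.add(WIN); }

    public T decide(int i, T v) {          // i is 0 or 1
        prefer.set(i, v);                  // announce my input first
        Object qval = q.poll();            // dequeue; returns null when the queue is empty
        if (qval == WIN) return v;         // the first dequeuer decides its own value
        else return prefer.get(1 - i);     // the other process returns the winner's input
    }
}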
A proof that wait-free consensus for 3 or more processes cannot be solved using FIFO queues (and registers).
The wait-free hierarchy
- We say that object type X solves wait-free n-process consensus if there exists a wait-free consensus algorithm for n processes using only shared objects of type X and registers.
- The consensus number of object type X is n, denoted CN(X) = n, if n is the largest integer for which X solves wait-free n-process consensus. It is defined to be infinity if X solves consensus for every n.
- Lemma: if CN(X) = m and CN(Y) = n > m, then there is no wait-free implementation of Y from instances of X and registers in a system with more than m processes.
The wait-free hierarchy (cont'd)
- Consensus number infinity: compare-and-swap
- …
- Consensus number 2: FIFO queue, stack, test-and-set
- Consensus number 1: registers
The universality of consensus
- An object is universal if, together with registers, it can implement any other object in a wait-free manner.
- It can be shown that any object X with consensus number n is universal in a system with n or fewer processes.