Published by Marley Chum. Modified over 9 years ago.
1
Dependence Precedence
2
Precedence & Dependence
Can we execute a 1000-line program with 1000 processors in one step?
What are the issues to deal with in various parallelizing situations:
–Parallel programming?
–Instruction-level parallelism?
What type of analysis is used to study concurrent database operations?
3
Dependence
4
Making Use of Processors
In parallelizing algorithms, we want to use as many processors as possible in an effort to finish in as little time as possible.
Often, it is not possible to make complete use of all processors in all time units:
–Some instructions (or sections of instructions) depend upon others
–Others have a different, related problem called precedence (next section)
5
Input and Output
Input and output cannot be parallelized in the strict sense because we’re dealing with a user.
We assume multiple, parallel streams of input and output (modems, etc.).
6
Read and Print Statements
Read(x) is written as x <- keyboard
Print(x) is written as screen <- x
7
Dependency Relationships
Dependencies are relationships between the steps of an algorithm such that one step depends upon another.
(S1) read (a)
(S2) b <- a * 3
(S3) c <- b * a
8
Dependency Relationships
Dependencies are relationships between the steps of an algorithm such that one step depends upon another.
(S1) a <- keyboard
(S2) b <- a * 3
(S3) c <- b * a
Here, S2 is dependent on S1 to provide the appropriate value of a.
Similarly, S3 is dependent on both S1 (for a’s value) and S2 (for b’s value).
Because S2 already depends on S1, the dependence of S3 on S1 is implied, so we can simply say that S3 is dependent on S2; the arrow from S1 to S3 is not needed.
9
Dependence
Defined by a “read after write”* relationship.
This means a variable written on the left side of one assignment is later read on the right side of another:
a <- 5
b <- a + 2
*Note: “Read” and “Write” in this case refer to reading the value from a memory location and writing a value to a memory location, not input/output.
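The read-after-write rule can be checked mechanically. The following is a minimal Python sketch, not part of the original slides: it assumes each statement is described by a hypothetical tuple of (label, variable written, variables read), and it treats keyboard input as reading nothing from memory. It is applied to the S1 to S3 example from the Dependency Relationships slide.

```python
# Each statement: (label, variable written, set of variables read).
# Assumption: "keyboard" is an input stream, not a memory location.
statements = [
    ("S1", "a", set()),        # a <- keyboard
    ("S2", "b", {"a"}),        # b <- a * 3
    ("S3", "c", {"b", "a"}),   # c <- b * a
]

def raw_dependences(stmts):
    """Return (earlier, later) pairs where `later` reads the value `earlier` last wrote."""
    last_writer = {}   # variable -> label of the most recent statement that wrote it
    edges = []
    for label, written, reads in stmts:
        for var in sorted(reads):          # sorted only to make the output order stable
            if var in last_writer:
                edges.append((last_writer[var], label))
        last_writer[written] = label       # record the write after handling the reads
    return edges

print(raw_dependences(statements))
```

The sketch reports the S1 to S3 pair as well; as the slide notes, that arrow is implied by the other two and can be left off the graph.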
10
Graphing Dependence Relations
[Figure: statements S1 and S2 plotted on Processors and Time axes]
11
Dependency Graphs
(S1) read (a)
(S2) b <- a * 3
(S3) c <- b * a
12
Dependency Graphs
(S1) a <- keyboard
(S2) b <- a * 3
(S3) c <- b * a
[Figure: S1, S2, and S3 plotted on Processors and Time axes]
In this case, it does not matter how many processors we have; we can use only one processor to finish in 3 time units.
13
What If There Are No Dependencies?
(S1) read (a)
(S2) b <- b + 3
(S3) c <- c + 4
[Figure: S1, S2, and S3 plotted on Processors and Time axes]
We can use three processors to get it done in a single time chunk.
14
A Dependency Example
(S1) read (a)
(S2) read (b)
(S3) c <- a * 4
(S4) d <- b / 3
(S5) e <- c * d
(S6) f <- d + 8
15
A Dependency Example
(S1) a <- keyboard
(S2) b <- keyboard
(S3) c <- a * 4
(S4) d <- b / 3
(S5) e <- c * d
(S6) f <- d + 8
21
A Dependency Example
(S1) a <- keyboard
(S2) b <- keyboard
(S3) c <- a * 4
(S4) d <- b / 3
(S5) e <- c * d
(S6) f <- d + 8
[Figure: dependency graph with S1 and S2 in the first row, S3 and S4 in the second, S5 and S6 in the third]
Using 2 processors, we finish 6 instructions in 3 units of time.
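The 3-time-unit, 2-processor result can be reproduced by placing each statement one time unit after the latest statement it depends on. The following is a minimal Python sketch under that assumption; the `deps` table is read off the six statements by hand, and the function name is illustrative rather than taken from the slides.

```python
from collections import Counter

# statement -> statements it depends on (read after write), in program order
deps = {
    "S1": [],            # a <- keyboard
    "S2": [],            # b <- keyboard
    "S3": ["S1"],        # c <- a * 4
    "S4": ["S2"],        # d <- b / 3
    "S5": ["S3", "S4"],  # e <- c * d
    "S6": ["S4"],        # f <- d + 8
}

def earliest_times(deps):
    """Give each statement the earliest time unit that follows all of its predecessors."""
    times = {}
    for stmt, preds in deps.items():   # relies on program order: predecessors come first
        times[stmt] = 1 + max((times[p] for p in preds), default=0)
    return times

times = earliest_times(deps)
time_units = max(times.values())                    # length of the longest chain
processors = max(Counter(times.values()).values())  # most statements sharing one time unit
print(times, time_units, processors)
```

This assumes an unlimited supply of processors; `processors` is simply the widest row the graph ends up needing.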
22
Dependence and Iteration
Ignore steps that are not part of the loop (overhead costs similar to making parallelism work)
–Don’t worry about loop, exitif, counter variables, endloop, etc.
Use prime notation to indicate passes: ’ ” ”’
Unroll the loop, replacing the counter variable with a literal value.
23
An Iterative Example
I <- 1
loop
  exitif (I > MAX_ARRAY)
  (S1) read (A[I])
  (S2) B[I] <- A[I] + 4
  (S3) C[I] <- A[I] / 3
  (S4) D[I] <- B[I] / C[I]
  I <- I + 1
endloop
24
(S1) read (A[1])
(S2) B[1] <- A[1] + 4
(S3) C[1] <- A[1] / 3
(S4) D[1] <- B[1] / C[1]
(S1’) read (A[2])
(S2’) B[2] <- A[2] + 4
(S3’) C[2] <- A[2] / 3
(S4’) D[2] <- B[2] / C[2]
(S1”) read (A[3])
(S2”) B[3] <- A[3] + 4
(S3”) C[3] <- A[3] / 3
(S4”) D[3] <- B[3] / C[3]
[Figure: dependency graph for one iteration: S1, then S2 and S3, then S4]
25
An Iterative Example
[Figure: the one-iteration graph (S1, then S2 and S3, then S4) repeated for each of the three passes]
26
Limited Number of Processors
What if the number of processors is fixed?
Some processors may be in use by another program/user.
If the number of processors available is less than the number that could be utilized, shift instructions into later time units (lower on the graph).
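One simple way to shift instructions into later time units is a greedy pass that fills each time unit with at most the available number of ready statements. The following is a minimal Python sketch of that idea, applied to the three unrolled passes of the iterative example; the function names and the ASCII prime marks are illustrative assumptions, and the greedy order is not guaranteed to give the shortest possible schedule.

```python
def iteration(mark):
    """Dependences for one pass: S2 and S3 need S1, S4 needs S2 and S3."""
    s1, s2, s3, s4 = (f"S{k}{mark}" for k in range(1, 5))
    return {s1: [], s2: [s1], s3: [s1], s4: [s2, s3]}

deps = {}
for mark in ("", "'", "''"):          # three unrolled passes
    deps.update(iteration(mark))

def schedule(deps, processors):
    """Greedy schedule: each time unit runs up to `processors` ready statements."""
    done_at = {}                      # statement -> time unit in which it ran
    pending = list(deps)              # program order
    slots = []
    t = 0
    while pending:
        t += 1
        # A statement is ready when every predecessor ran in an earlier time unit.
        ready = [s for s in pending
                 if all(done_at.get(p, t) < t for p in deps[s])]
        chosen = ready[:processors]
        for s in chosen:
            done_at[s] = t
        pending = [s for s in pending if s not in chosen]
        slots.append(chosen)
    return slots

for p in (1, 2, 12):
    print(p, "processor(s):", len(schedule(deps, p)), "time units")
```

With enough processors the schedule collapses to the three time units of a single pass; with one processor it degenerates to running all twelve statements in sequence.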
27
A Limited Processor Example
[Figure: the three-pass dependency graph (S1 to S4, S1’ to S4’, S1” to S4”) rescheduled for a limited number of processors]
28
Questions?
29
Precedence
30
Precedence Relationships
A precedence relationship exists if a statement would contaminate the data needed by another, preceding instruction.
(S1) read (a)
(S2) print (a)
(S3) a <- a * 7
(S4) print (a)
35
Precedence Relationships
A precedence relationship exists if a statement would contaminate the data needed by another, preceding instruction.
(S1) a <- keyboard
(S2) screen <- a
(S3) a <- a * 7
(S4) screen <- a
S2 and S3 are dependent on S1 (for the initial value of a).
S4 is dependent on S3 (for updated a).
There is also a precedence relationship between S2 and S3: S3 must follow S2, else S3 will corrupt what S2 does.
36
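Precedence
Defined by a “write after write” or “write after read” relationship.
This means a variable is written on the left side of an assignment after it has appeared previously on the right side (read) or the left side (written):
b <- a + 2
a <- 7
a <- 5
Here a <- 7 writes a after b <- a + 2 has read it (write after read), and a <- 5 writes a after a <- 7 has written it (write after write).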
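The three kinds of relationship can be told apart by comparing what each pair of statements reads and writes. The following is a minimal Python sketch, not taken from the slides, applied to the S1 to S4 example from the Precedence Relationships slide. It assumes screen output is not a memory write (so print statements write `None`), and it compares every pair, so it also reports relations that an intervening write makes redundant, such as S1 to S4.

```python
# Each statement: (label, variable written or None, set of variables read).
statements = [
    ("S1", "a", set()),     # a <- keyboard
    ("S2", None, {"a"}),    # screen <- a   (assumption: no memory variable written)
    ("S3", "a", {"a"}),     # a <- a * 7
    ("S4", None, {"a"}),    # screen <- a
]

def relations(stmts):
    """List (earlier, later, kind) for every pair of statements that must stay ordered."""
    found = []
    for i, (early, w1, r1) in enumerate(stmts):
        for late, w2, r2 in stmts[i + 1:]:
            if w1 is not None and w1 in r2:
                found.append((early, late, "read after write"))    # dependence
            if w2 is not None and w2 in r1:
                found.append((early, late, "write after read"))    # precedence
            if w1 is not None and w1 == w2:
                found.append((early, late, "write after write"))   # precedence
    return found

for relation in relations(statements):
    print(relation)
```

The S2 to S3 pair shows up only as "write after read", which is the precedence relationship the slide describes.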
37
Showing Precedence Relations
[Figure: statements S1 and S2 plotted on Processors and Time axes]
38
Precedence Graphs
(S1) read (a)
(S2) print (a)
(S3) a <- a * 7
(S4) print (a)
40
Precedence Graphs
(S1) a <- keyboard
(S2) screen <- a
(S3) a <- a * 7
(S4) screen <- a
[Figure: precedence graph over S1, S2, S3, and S4]
Precedence arrow blocks S3 from executing until S2 is finished.
Dependency arrow between S1 and S3 is superfluous.
41
What if There Is No Precedence?
(S1) read (a)
(S2) b <- b + 3
(S3) c <- c + 4
[Figure: S1, S2, and S3 with no arrows between them]
We can use three processors to get it done in a single time chunk.
42
Precedence and Iteration
Ignore steps that are not part of the loop (overhead costs similar to making parallelism work)
–Don’t worry about loop, exitif, counter variables, endloop, etc.
Use prime notation to indicate passes: ’ ” ”’
Unroll the loop, replacing the counter variable with a literal value.
43
An Iterative Example
i <- 1
loop
  exitif (i > 3)
  (S1) read (a)
  (S2) print (a)
  (S3) a <- a * 7
  (S4) print (a)
  i <- i + 1
endloop
44
An Iterative Example
i <- 1
loop
  exitif (i > 3)
  (S1) a <- keyboard
  (S2) screen <- a
  (S3) a <- a * 7
  (S4) screen <- a
  i <- i + 1
endloop
45
(S1) a <- keyboard
(S2) screen <- a
(S3) a <- a * 7
(S4) screen <- a
(S1’) a <- keyboard
(S2’) screen <- a
(S3’) a <- a * 7
(S4’) screen <- a
(S1”) a <- keyboard
(S2”) screen <- a
(S3”) a <- a * 7
(S4”) screen <- a
[Figure: graph over S1, S2, S3, and S4, continuing into S1’]
46
Iteration and Precedence Graphs
[Figure: precedence graph over S1 to S4, S1’ to S4’, and S1” to S4”]
47
Space vs. Time
We can optimize time performance by changing the shared variable to an array of independent variables.
i <- 1
loop
  exitif (i > 3)
  (S1) read (a[i])
  (S2) print (a[i])
  (S3) a[i] <- a[i] * 7
  (S4) print (a[i])
  i <- i + 1
endloop
48
Precedence Graphs
[Figure: precedence graph over S1 to S4, S1’ to S4’, and S1” to S4”]
We can use 3 processors to finish in 4 time units.
Note that product complexity is unchanged.
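The effect of trading space for time can be checked by computing the earliest time unit for each statement under all three rules (read after write, write after read, write after write). The following is a minimal Python sketch under stated assumptions: statements are hypothetical (variable written, variables read) pairs, screen output is not modeled as a shared resource, and two related statements are never placed in the same time unit.

```python
def time_units(stmts):
    """Time units needed when every dependence and precedence relation is respected."""
    last_write = {}   # variable -> time unit of its most recent write
    last_read = {}    # variable -> latest time unit in which it was read
    finish = 0
    for written, reads in stmts:
        t = 1
        for var in reads:                                    # read after write
            t = max(t, last_write.get(var, 0) + 1)
        if written is not None:
            t = max(t, last_write.get(written, 0) + 1)       # write after write
            t = max(t, last_read.get(written, 0) + 1)        # write after read
        for var in reads:
            last_read[var] = max(last_read.get(var, 0), t)
        if written is not None:
            last_write[written] = t
        finish = max(finish, t)
    return finish

def loop_body(var):
    """The four loop statements, as (variable written, variables read)."""
    return [(var, set()),     # (S1) var <- keyboard
            (None, {var}),    # (S2) screen <- var
            (var, {var}),     # (S3) var <- var * 7
            (None, {var})]    # (S4) screen <- var

shared = [s for _ in range(3) for s in loop_body("a")]
renamed = [s for i in range(1, 4) for s in loop_body(f"a[{i}]")]

print(time_units(shared), time_units(renamed))
```

Under these assumptions the shared variable forces all twelve statements into one chain (1 processor for 12 time units), while the array version needs 3 processors for 4 time units, so the processors-times-time product is 12 in both cases.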
49
What if Both Precedence and Dependence?
If two instructions have both a precedence and a dependence relation,
(S1) a <- 5
(S2) a <- a + 2
showing only dependence is sufficient.
[Figure: S1 and S2 joined by a single arrow]
50
Another Iterative Example
i <- 1
loop
  exitif (i > N)
  (S1) read (a[i])
  (S2) a[i] <- a[i] * 7
  (S3) c <- a[i] / 3
  (S4) print (c)
  i <- i + 1
endloop
51
Another Iterative Example
i <- 1
loop
  exitif (i > N)
  (S1) a[i] <- keyboard
  (S2) a[i] <- a[i] * 7
  (S3) c <- a[i] / 3
  (S4) screen <- c
  i <- i + 1
endloop
52
(S1) a[1] <- keyboard
(S2) a[1] <- a[1] * 7
(S3) c <- a[1] / 3
(S4) screen <- c
(S1’) a[2] <- keyboard
(S2’) a[2] <- a[2] * 7
(S3’) c <- a[2] / 3
(S4’) screen <- c
(S1”) a[3] <- keyboard
(S2”) a[3] <- a[3] * 7
(S3”) c <- a[3] / 3
(S4”) screen <- c
53
[Figure: graph over S1 to S4, S1’ to S4’, and S1” to S4”]
We have precedence relationships between iterations because of the shared c variable.
54
Crossing Index Bounds Example
I <- 1
loop
  exitif( I > MAX ) // MAX is 3
  (S1) A[I] <- A[I] + B[I]
  (S2) read( B[I] )
  (S3) C[I] <- A[I] * 3
  (S4) D[I] <- B[I] * A[I+1]
  I <- I + 1
endloop
55
Crossing Index Bounds Example
I <- 1
loop
  exitif( I > MAX ) // MAX is 3
  (S1) A[I] <- A[I] + B[I]
  (S2) B[I] <- keyboard
  (S3) C[I] <- A[I] * 3
  (S4) D[I] <- B[I] * A[I+1]
  I <- I + 1
endloop
56
(S1) A[1] <- A[1] + B[1]
(S2) B[1] <- keyboard
(S3) C[1] <- A[1] * 3
(S4) D[1] <- B[1] * A[2]
(S1’) A[2] <- A[2] + B[2]
(S2’) B[2] <- keyboard
(S3’) C[2] <- A[2] * 3
(S4’) D[2] <- B[2] * A[3]
(S1”) A[3] <- A[3] + B[3]
(S2”) B[3] <- keyboard
(S3”) C[3] <- A[3] * 3
(S4”) D[3] <- B[3] * A[4]
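Because S4 reads A[I+1], which the next pass then overwrites in S1, the unrolled statements contain write-after-read pairs that cross iteration boundaries. The following is a minimal Python sketch that lists those pairs; the tuple layout, function names, and ASCII prime marks are illustrative assumptions, and only the write-after-read rule is checked.

```python
def unrolled(max_i):
    """(label, variable written, variables read) for each unrolled statement."""
    stmts = []
    for i in range(1, max_i + 1):
        mark = "'" * (i - 1)          # "", "'", "''" for passes 1, 2, 3
        stmts += [
            (f"S1{mark}", f"A[{i}]", {f"A[{i}]", f"B[{i}]"}),      # A[I] <- A[I] + B[I]
            (f"S2{mark}", f"B[{i}]", set()),                       # B[I] <- keyboard
            (f"S3{mark}", f"C[{i}]", {f"A[{i}]"}),                 # C[I] <- A[I] * 3
            (f"S4{mark}", f"D[{i}]", {f"B[{i}]", f"A[{i + 1}]"}),  # D[I] <- B[I] * A[I+1]
        ]
    return stmts

def write_after_read(stmts):
    """Pairs (earlier, later) where `later` overwrites a variable `earlier` reads."""
    pairs = []
    for i, (early, _, reads) in enumerate(stmts):
        for late, written, _ in stmts[i + 1:]:
            if written in reads:
                pairs.append((early, late))
    return pairs

for pair in write_after_read(unrolled(3)):
    print(pair)
```

The pairs that join S4 to S1' and S4' to S1'' are the ones that cross iterations; the remaining pairs (S1 before S2 in each pass) stay inside a single pass.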
57
Precedence between iterations
[Figure: graph over S1 to S4 and S1’ to S4’, showing precedence between the two passes]
58
Questions?
59
Practical Applications
We used the single assignments as easy illustrations of the principles.
There are additional real applications of this capability:
–Much bigger than one assignment
–Smaller than one assignment
60
http://setiathome.ssl.berkeley.edu/
61
Large Data Sets
Consider the SETI project.
What do you now know about the data that makes it practical to distribute across millions of processors?
62
Instruction Processing
Break computer’s processing into steps:
A - fetch instruction
B - fetch data
C - logical processing (math, test and branch)
D - store result
Independent for all sequential processing
Dependency occurs when branch “ruins” three instruction fetches
[Figure: pipeline chart of instructions 1 to 5 passing through stages A, B, C, D over time units 0 to 7]
I <- 0
loop
exitif( I > MAX)
endloop
I <- I + 1
blah...
63
Questions?