1
CS 584 Lecture 20
• Assignment
  – Glenda program
• Project Proposal is coming up! (March 13)
  » 2 pages text + 1 page plan of action
  » 3 references
• No class March 13
  – Put your project proposal in my box.
  – Paper presentations on March 11 (Tom Abbott)
2
Module Composition
3
Case Study: Matrix Multiply
• Goal: Data-distribution neutral
• Three basic ways to distribute
  – row
  – column
  – submatrix
• Question?
  – Does our library need different algorithms?
4
Analytical Model
• Compare the two algorithms
• Ignore the computation costs
• What are the communication costs?
5
One Dimensional Decomposition
• Each processor "owns" the black portion
• To compute the owned portion of the answer, each processor requires all of A.
• This affects data distribution.
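To make the "requires all of A" point concrete, here is a minimal sequential sketch of a 1-D decomposition. It assumes the owned portion is a block of consecutive columns (a row decomposition works the same way with the roles swapped) and that P divides N. The names `matmul` and `one_d_multiply` are illustrative and do not come from the lecture.

```python
def matmul(A, B):
    """Plain triple-loop product of two list-of-lists matrices."""
    n, m, p = len(A), len(B), len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]


def one_d_multiply(A, B, P):
    """Simulate P tasks, each owning N/P consecutive columns of B and C."""
    N = len(A)
    assert N % P == 0, "sketch assumes P divides N"
    width = N // P
    C = [[0] * N for _ in range(N)]
    for task in range(P):
        cols = range(task * width, (task + 1) * width)
        # The task uses only its own columns of B, but every column of A.
        # In a real parallel run the columns of A it does not own would
        # have to arrive as messages from the other P - 1 tasks.
        B_owned = [[B[k][j] for j in cols] for k in range(N)]
        C_owned = matmul(A, B_owned)
        for i in range(N):
            for local_j, j in enumerate(cols):
                C[i][j] = C_owned[i][local_j]
    return C


A = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
B = [[1, 0, 2, 0], [0, 1, 0, 2], [3, 0, 1, 0], [0, 3, 0, 1]]
assert one_d_multiply(A, B, 2) == matmul(A, B)
```

The call to `matmul(A, B_owned)` is where the dependence on the whole of A shows up: the owned columns of the result cannot be formed from the owned columns of A alone.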
6
1-D Decomp.
$$T = (P-1)\left(t_s + t_w \frac{N^2}{P}\right)$$
7
Two Dimensional Decomposition
• Requires less data per processor
• Algorithm can be performed stepwise.
8
• Broadcast an A submatrix to the other processors in the row.
• Compute.
• Rotate the B submatrix upwards.
9
Algorithm
  Set B' = B_local
  for j = 0 to sqrt(P) - 1
    in each row i, the [(i + j) mod sqrt(P)]th task broadcasts A' = A_local to the other tasks in the row
    accumulate A' * B'
    send B' to upward neighbor
  done
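The broadcast, compute, rotate cycle can be checked with a small sequential simulation. The sketch below is not a parallel implementation: it models the sqrt(P) x sqrt(P) task grid as nested lists, assumes sqrt(P) divides N, and runs sqrt(P) steps so that every term of the block inner product is accumulated. The helper names (`split_blocks`, `join_blocks`, `two_d_multiply`) are illustrative only.

```python
def matmul(A, B):
    """Plain triple-loop product of two list-of-lists matrices."""
    n, m, p = len(A), len(B), len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]


def split_blocks(M, q):
    """Cut an N x N matrix into a q x q grid of (N/q) x (N/q) blocks."""
    b = len(M) // q
    return [[[row[j * b:(j + 1) * b] for row in M[i * b:(i + 1) * b]]
             for j in range(q)] for i in range(q)]


def join_blocks(blocks):
    """Inverse of split_blocks: reassemble the grid into one matrix."""
    q, b = len(blocks), len(blocks[0][0])
    return [sum((blocks[i][j][r] for j in range(q)), [])
            for i in range(q) for r in range(b)]


def two_d_multiply(A, B, q):
    """Simulate the stepwise algorithm on a q x q task grid (q = sqrt(P))."""
    N = len(A)
    assert N % q == 0, "sketch assumes sqrt(P) divides N"
    b = N // q
    A_loc = split_blocks(A, q)
    B_cur = split_blocks(B, q)  # B' = B_local
    C_loc = [[[[0] * b for _ in range(b)] for _ in range(q)] for _ in range(q)]
    for step in range(q):
        for i in range(q):
            # Task (i, (i + step) mod q) broadcasts its A block along row i.
            A_bcast = A_loc[i][(i + step) % q]
            for j in range(q):
                prod = matmul(A_bcast, B_cur[i][j])  # accumulate A' * B'
                for r in range(b):
                    for c in range(b):
                        C_loc[i][j][r][c] += prod[r][c]
        # Every task sends B' to its upward neighbor (with wraparound).
        B_cur = [B_cur[(i + 1) % q] for i in range(q)]
    return join_blocks(C_loc)


A = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
B = [[1, 0, 2, 0], [0, 1, 0, 2], [3, 0, 1, 0], [0, 3, 0, 1]]
assert two_d_multiply(A, B, 2) == matmul(A, B)
```

At step s, task (i, j) holds the A block from grid column (i + s) mod q and, after s upward rotations, the B block that started in grid row (i + s) mod q, so the two blocks always share the same inner index.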
10
2-D Decomp.
$$T = \sqrt{P}\left(1 + \frac{\log P}{2}\right)\left(t_s + t_w \frac{N^2}{P}\right)$$
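The two communication costs can be compared numerically. This is a rough sketch that reads the 1-D cost as (P - 1)(t_s + t_w N^2/P) and the 2-D cost as sqrt(P)(1 + log(P)/2)(t_s + t_w N^2/P). It takes t_s as the per-message startup time and t_w as the per-word transfer time, and it assumes the logarithm is base 2 (a tree broadcast). The sample values for N, t_s and t_w are made up for illustration.

```python
import math


def cost_1d(P, N, ts, tw):
    """Communication cost of the 1-D decomposition under the slide model."""
    return (P - 1) * (ts + tw * N * N / P)


def cost_2d(P, N, ts, tw):
    """Communication cost of the 2-D decomposition (log taken base 2)."""
    return math.sqrt(P) * (1 + math.log2(P) / 2) * (ts + tw * N * N / P)


def first_square_where_2d_wins(N, ts, tw, max_q=64):
    """Smallest perfect-square P = q*q (q >= 2) where 2-D is cheaper."""
    for q in range(2, max_q + 1):
        P = q * q
        if cost_2d(P, N, ts, tw) < cost_1d(P, N, ts, tw):
            return P
    return None


# Hypothetical machine parameters, chosen only to have numbers to print.
N, ts, tw = 1024, 100.0, 1.0
for P in (4, 16, 64, 256):
    print(P, cost_1d(P, N, ts, tw), cost_2d(P, N, ts, tw))
```

Both formulas share the factor (t_s + t_w N^2/P), so the comparison reduces to P - 1 against sqrt(P)(1 + log2(P)/2): 3 against 4 at P = 4, but 15 against 12 at P = 16.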
11
Redistribution
• If we have only one algorithm, we may need to redistribute the data
• How much does this cost?
12
Redistribution
$$T = \left(\sqrt{P} - 1\right)\left(t_s + t_w \frac{N^2}{P\sqrt{P}}\right)$$
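A quick way to judge whether redistribution is affordable is to add it to the 2-D cost and compare the total with the 1-D cost. The sketch below reads the redistribution cost as (sqrt(P) - 1)(t_s + t_w N^2/(P sqrt(P))) per matrix. The count of three redistributed matrices (A and B converted on the way in, the result converted back) is an assumption rather than something the slide states, and the machine parameters are again made up.

```python
import math


def per_message(words, ts, tw):
    """Startup time plus per-word transfer time for one message."""
    return ts + tw * words


def cost_1d(P, N, ts, tw):
    return (P - 1) * per_message(N * N / P, ts, tw)


def cost_2d(P, N, ts, tw):
    # Logarithm assumed to be base 2 (tree broadcast).
    return math.sqrt(P) * (1 + math.log2(P) / 2) * per_message(N * N / P, ts, tw)


def cost_redistribution(P, N, ts, tw):
    """Cost of moving one matrix between 1-D and 2-D layouts."""
    root = math.sqrt(P)
    return (root - 1) * per_message(N * N / (P * root), ts, tw)


MATRICES_MOVED = 3  # assumption: A and B in, C back out

# Hypothetical machine parameters for illustration only.
P, N, ts, tw = 16, 1024, 100.0, 1.0
total_2d = cost_2d(P, N, ts, tw) + MATRICES_MOVED * cost_redistribution(P, N, ts, tw)
print("1-D:", cost_1d(P, N, ts, tw))
print("2-D plus redistribution:", total_2d)
```

Each redistribution message carries only N^2/(P sqrt(P)) words, which is why the added term stays small next to the multiply itself as P grows.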
13
Analysis
• Performance analysis reveals that the two-dimensional decomposition is almost always better; only for very small P does the 1-D algorithm communicate less.
• So our matrix multiply needs only one algorithm
  – Might need a redistribution algorithm to be totally data-distribution neutral
• However, this is not the best algorithm.
15
Systolic Algorithm
$$T = 2\left(\sqrt{P} - 1\right)\left(t_s + t_w \frac{N^2}{P}\right)$$
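Because the systolic cost and the 2-D broadcast cost share the per-message term (t_s + t_w N^2/P), only their leading factors need comparing. The sketch below reads the systolic factor as 2(sqrt(P) - 1) and the 2-D factor as sqrt(P)(1 + log(P)/2), with the logarithm assumed to be base 2. The function names are illustrative.

```python
import math


def factor_2d(P):
    """Leading factor of the 2-D broadcast-and-rotate cost."""
    return math.sqrt(P) * (1 + math.log2(P) / 2)


def factor_systolic(P):
    """Leading factor of the systolic cost."""
    return 2 * (math.sqrt(P) - 1)


for P in (4, 16, 64, 256):
    print(P, factor_2d(P), factor_systolic(P))
```

The systolic factor has no logarithmic term, so the gap between the two widens as P increases.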