Computer Science and Engineering Parallel and Distributed Processing CSE 8380 February 3, 2005 Session 7.


1 Computer Science and Engineering Parallel and Distributed Processing CSE 8380 February 3, 2005 Session 7

2 Contents
- Abstract Models
- PRAM Model
- Complexity Analysis
- Introduction to Parallel Algorithms
- Sorting

3 What is a Model?
According to Webster’s Dictionary, a model is “a description or analogy used to help visualize something that cannot be directly observed.”
According to The Oxford English Dictionary, a model is “a simplified or idealized description or conception of a particular system, situation or process.”

4 Why Models?
In general, the purpose of modeling is to capture the salient characteristics of phenomena with clarity and the right degree of accuracy to facilitate analysis and prediction. (Maggs, Matheson and Tarjan, 1995)

5 Models in Problem Solving
Computer scientists use models to help design problem-solving tools such as:
- Fast algorithms
- Effective programming environments
- Powerful execution engines

6 An Interface
A model is an interface separating high-level properties from low-level ones.
[Figure: the MODEL shown as an interface between Applications and Architectures, labeled "Provides operations" and "Requires implementation"]

7 Models in this class
- Shared Memory Model
- Distributed Memory Model

8 PRAM Model
- Synchronized Read, Compute, Write cycle
- EREW, ERCW, CREW, CRCW
- Complexity: T(n), P(n), C(n)
[Figure: Control; processors P1, P2, ..., Pp, each with its own Private Memory; Global Memory]

9 The PRAM model and its variations (cont.)
- There are different modes for read and write operations in a PRAM:
  - Exclusive read (ER)
  - Exclusive write (EW)
  - Concurrent read (CR)
  - Concurrent write (CW):
    - Common
    - Arbitrary
    - Minimum
    - Priority
- Based on the different modes described above, the PRAM can be further divided into the following four subclasses:
  - EREW-PRAM model
  - CREW-PRAM model
  - ERCW-PRAM model
  - CRCW-PRAM model
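
The four concurrent-write policies can be illustrated with a small Python sketch. The slide only names the policies, so the semantics below are assumptions: "Minimum" is taken to mean that the smallest written value is stored, and "Priority" that the lowest processor index wins (some texts define these differently). The function name `resolve_concurrent_write` is hypothetical.

```python
def resolve_concurrent_write(writes, mode):
    """Return the value stored when several processors write one cell.

    writes: list of (processor_index, value) pairs aimed at the same cell.
    mode:   'common', 'arbitrary', 'minimum', or 'priority'.
    """
    if not writes:
        raise ValueError("no write requests")
    values = [value for _, value in writes]
    if mode == "common":
        # All writers must agree on the value; otherwise the write is illegal.
        if len(set(values)) != 1:
            raise ValueError("common CW requires identical values")
        return values[0]
    if mode == "arbitrary":
        # Any single writer may succeed; this sketch simply picks the first.
        return values[0]
    if mode == "minimum":
        # Assumption: the smallest written value is stored.
        return min(values)
    if mode == "priority":
        # Assumption: the lowest processor index has the highest priority.
        return min(writes, key=lambda w: w[0])[1]
    raise ValueError(f"unknown mode: {mode}")


requests = [(3, 5), (1, 9), (2, 7)]
print(resolve_concurrent_write(requests, "minimum"))   # smallest value
print(resolve_concurrent_write(requests, "priority"))  # value from P1
```

Under exclusive write (EW), a cell would receive at most one request per step, so no resolution rule is needed.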

10 Analysis of Algorithms
- Sequential Algorithms
  - Time Complexity
  - Space Complexity
- An algorithm whose time complexity is bounded by a polynomial is called a polynomial-time algorithm. An algorithm is considered to be efficient if it runs in polynomial time.

11 Analysis of Sequential Algorithms
[Figure: The relationships among P, NP, NP-complete, and NP-hard]

12 Analysis of parallel algorithms
The performance of a parallel algorithm is expressed in terms of how fast it is and how many resources it uses when it runs:
- Run time: the time elapsed during the execution of the algorithm
- Number of processors: the processors the algorithm uses to solve a problem
- Cost: the product of the run time and the number of processors

13 Analysis of parallel algorithms: the NC-class and P-completeness
[Figure: The relationships among P, NP, NP-complete, NP-hard, NC, and P-complete]

14 Simulating multiple accesses on an EREW PRAM
Broadcasting mechanism:
- P1 reads x and makes it known to P2.
- P1 and P2 make x known to P3 and P4, respectively, in parallel.
- P1, P2, P3 and P4 make x known to P5, P6, P7 and P8, respectively, in parallel.
- These eight processors will make x known to another eight processors, and so on.

15 Simulating multiple accesses on an EREW PRAM (cont.)
[Figure: Simulating concurrent read on an EREW PRAM with eight processors using Algorithm Broadcast_EREW. Panels (a) to (d) show copies of x filling the array L as processors P1 through P8 receive the value.]

16 Parallel Algorithms
Constructs:
- Processor Pi
- Forall
- Where
- Do in Parallel
- Others

17 Simulating multiple accesses on an EREW PRAM (cont.)
Algorithm Broadcast_EREW
  Processor P_1
    y (in P_1's private memory) ← x
    L[1] ← y
  for i = 0 to log p - 1 do
    forall P_j, where 2^i + 1 ≤ j ≤ 2^(i+1) do in parallel
      y (in P_j's private memory) ← L[j - 2^i]
      L[j] ← y
    endfor
  endfor
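
A minimal sequential Python sketch of Broadcast_EREW may help check the index ranges. It is a simulation under stated assumptions, not a parallel implementation: the inner loop stands in for the "forall ... do in parallel" step, the function name `broadcast_erew` is hypothetical, and capping the upper index at p (so that p need not be a power of two) is an addition not found on the slide.

```python
def broadcast_erew(x, p):
    """Simulate Broadcast_EREW: spread x to p processors via shared array L."""
    L = [None] * (p + 1)   # shared array L[1..p]; index 0 is unused
    y = [None] * (p + 1)   # y[j] models the private copy held by P_j
    y[1] = x               # P_1 reads x into its private memory
    L[1] = y[1]
    rounds = (p - 1).bit_length()  # equals log2(p) when p is a power of two
    for i in range(rounds):
        lo = 2**i + 1
        hi = min(2**(i + 1), p)    # assumption: cap at p for non-powers of two
        # Each P_j reads a different cell L[j - 2^i], so reads are exclusive.
        sources = [j - 2**i for j in range(lo, hi + 1)]
        assert len(set(sources)) == len(sources)
        for j in range(lo, hi + 1):  # stands in for "do in parallel"
            y[j] = L[j - 2**i]
            L[j] = y[j]
    return L[1:], rounds


cells, rounds = broadcast_erew(42, 8)
print(cells, rounds)
```

With p = 8 the loop runs three rounds, which copy x to 1, 2, and then 4 new processors, matching the doubling pattern of the broadcasting mechanism.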

18 Enumeration Sort
- Given a list of n numbers a_1, a_2, …, a_n
- We find the position of each element a_i in the sorted list by computing the number of elements smaller than it
- If c_i elements are smaller than a_i, then it is the (c_i + 1)th element in the sorted list
- If two or more elements have the same value, the element with the largest index in the unsorted list is considered the largest in the sorted list

19 Sort-CRCW Assumptions
- To sort n elements, we use n^2 processors (n rows and n columns)
- P_{i,j}: processor in row i, column j
- Concurrent write: the sum of all values written is stored
- A[1..n]: array of elements in global memory
- C[1..n]: array to store the number of elements smaller than each element in A

20 Sort-CRCW
Two steps:
1. Each row of processors i computes C[i], the number of elements smaller than A[i]. Each processor P_{i,j} compares A[i] and A[j], then updates C[i] appropriately.
2. The first processor in each row, P_{i,1}, places A[i] in its proper position in the sorted list (C[i] + 1).

21 Algorithm Details
Detail of the two-step algorithm:
  /* step 1 */
  forall P_{i,j}, where 1 ≤ i, j ≤ n do in parallel
    if A[i] > A[j] or (A[i] = A[j] and i > j) then
      C[i] ← 1
    else
      C[i] ← 0
    endif
  endfor
  /* step 2 */
  forall P_{i,1}, where 1 ≤ i ≤ n do in parallel
    A[C[i] + 1] ← A[i]
  endfor
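
The two steps can be simulated sequentially in Python as a rough check of the logic. This is a sketch, not a parallel program: the nested loops stand in for the n^2 processors, the sum over j models the summing concurrent write into C[i], and a separate output list models the PRAM rule that all reads in a step happen before the writes. The function name `sort_crcw` is hypothetical, and indices are 0-based here.

```python
def sort_crcw(a):
    """Simulate Sort-CRCW; returns (sorted list, count array C)."""
    n = len(a)
    # Step 1: processor P(i,j) writes 1 to C[i] if A[j] ranks below A[i],
    # else 0. Concurrent writes to the same C[i] are combined by summation.
    c = [
        sum(1 if (a[i] > a[j] or (a[i] == a[j] and i > j)) else 0
            for j in range(n))
        for i in range(n)
    ]
    # Step 2: P(i,1) places A[i] at position C[i] + 1 (1-based), which is
    # index C[i] in a 0-based list.
    result = [None] * n
    for i in range(n):
        result[c[i]] = a[i]
    return result, c


print(sort_crcw([9, 4, 6]))
```

The tie-break `i > j` gives equal elements distinct counts, so no two elements are ever written to the same output position.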

22 Complexity
- Run time: T(n) = O(1)
- Number of processors: P(n) = n^2
- Cost: C(n) = O(n^2)
- Is it cost optimal? No! (Sequential sorting can be done in O(n log n).)
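
The cost-optimality argument can be written out as a short calculation. The symbol T_seq(n), for the run time of a sequential sort, is introduced here for illustration and does not appear on the slides; the bounds are treated as tight.

```latex
\begin{align*}
C(n) &= T(n) \cdot P(n) = O(1) \cdot n^2 = O(n^2), \\
T_{\mathrm{seq}}(n) &= O(n \log n), \\
\frac{C(n)}{T_{\mathrm{seq}}(n)} &= \frac{n^2}{n \log n} = \frac{n}{\log n}
  \longrightarrow \infty \quad \text{as } n \to \infty .
\end{align*}
```

Because the ratio grows without bound, the cost of Sort-CRCW exceeds the sequential running time by more than a constant factor, which is why the algorithm is not cost optimal.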

23 Example: sort (9, 4, 6)
A = (9, 4, 6)
Comparisons made by each processor (A[i] & A[j]):
  P_{1,1}: 9 & 9    P_{1,2}: 9 & 4    P_{1,3}: 9 & 6
  P_{2,1}: 4 & 9    P_{2,2}: 4 & 4    P_{2,3}: 4 & 6
  P_{3,1}: 6 & 9    P_{3,2}: 6 & 4    P_{3,3}: 6 & 6
Concurrent write (SUM): C = (2, 0, 1)
Sorted A = (4, 6, 9)
T(n) = O(1), P(n) = n^2, C(n) = T(n) * P(n) = O(n^2)

