Published by Sylvia Baldwin. Modified over 9 years ago.
Fall 2008, Simple Parallel Algorithms
Scalar Product of Two Vectors

Let $a = (a_1, a_2, \dots, a_n)$ and $b = (b_1, b_2, \dots, b_n)$ be two vectors. Their scalar product is

$a \cdot b = a_1 b_1 + a_2 b_2 + \dots + a_n b_n$.

It can be computed in O(log n) time with O(n) PEs in the EREW PRAM model. Using the divide-and-conquer approach, the same problem can be solved in O(log n) time using O(n/log n) processors.
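The O(n/log n) processor bound can be read as a blocking argument. The following is a sketch under assumptions not spelled out on the slide: q denotes the number of processors, each processor first handles a contiguous block of about log n products sequentially, and the tree reduction then runs only on the q block totals.

```latex
\begin{align*}
q &= \left\lceil \frac{n}{\log n} \right\rceil
  && \text{processors, each owning at most } \lceil \log n \rceil \text{ terms } a_i b_i \\
T_{\text{block}} &= O(\log n)
  && \text{sequential multiply-and-add inside each block} \\
T_{\text{tree}} &= O(\log q) = O(\log n)
  && \text{pairwise summation of the } q \text{ block totals} \\
T &= T_{\text{block}} + T_{\text{tree}} = O(\log n),
  && q \cdot T = O(n)
\end{align*}
```

Under these assumptions the processor-time product is O(n), the same order as the sequential running time.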
Algorithm of Scalar Product

Algorithm Scalar Product
Input: Arrays a[1:n] and b[1:n] (n is assumed to be a power of 2).
Output: The value of the scalar product, stored in c[1].
BEGIN
  For i = 1 to n do in parallel
    c[i] = a[i] * b[i];
  End-parallel
  p = n/2;
  While p > 0 do
    For i = 1 to p do in parallel
      c[i] = c[i] + c[i+p];
    End-parallel
    p = ⌊p/2⌋;
  End-While
END.
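The rounds of the algorithm can be simulated sequentially. The following is a minimal Python sketch, not a PRAM implementation: the function name `scalar_product` is hypothetical, indices are 0-based, and n is assumed to be a power of 2. Each inner `for` loop stands in for one parallel step.

```python
def scalar_product(a, b):
    """Simulate the PRAM scalar-product algorithm round by round."""
    n = len(a)
    # Assumption of this sketch: equal lengths, n a power of 2.
    assert n == len(b) and n > 0 and n & (n - 1) == 0

    # Parallel step: "processor" i forms one product.
    c = [x * y for x, y in zip(a, b)]

    p = n // 2
    while p > 0:
        # One parallel round: processor i adds c[i + p] into c[i].
        # Cells read (index >= p) and cells written (index < p) are
        # disjoint, so no cell is accessed by two processors at once.
        for i in range(p):
            c[i] += c[i + p]
        p //= 2
    return c[0]


print(scalar_product([1, 2, 3, 4], [5, 6, 7, 8]))  # 5 + 12 + 21 + 32 = 70
```

The `while` loop runs log n times, which is where the O(log n) bound comes from.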
Matrix Multiplication

If A is a matrix of order m × n and B is a matrix of order n × p, the product C = AB is defined and is of order m × p. The entry in the ith row and jth column of C, C(i, j), is the scalar product of the ith row of A and the jth column of B. That is,

$C(i, j) = \sum_{k=1}^{n} A(i, k)\, B(k, j)$.
Algorithm of Matrix Multiplication

Algorithm Matrix-Multiply
Input: The matrices A and B.
Output: The product matrix C.
BEGIN
  For i = 1 to m do in parallel
    For j = 1 to p do in parallel
      Evaluate C(i, j);
    End-parallel
  End-parallel
END.
Evaluate C(i, j)

Begin
  For k = 1 to n do in parallel
    T(k) = A(i, k) * B(k, j);
  End-parallel
  r = n/2;
  While r > 0 do
    For k = 1 to r do in parallel
      T(k) = T(2k-1) + T(2k);
    End-parallel
    r = ⌊r/2⌋;
  End-While
  C(i, j) = T(1);
End
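As a sequential illustration of Matrix-Multiply and Evaluate C(i, j), here is a hedged Python sketch. The names `evaluate_entry` and `matrix_multiply` are hypothetical, indices are 0-based, and the inner dimension n is assumed to be a power of 2. Because a PRAM step reads all operands before writing, each round builds a fresh list instead of updating T in place.

```python
def evaluate_entry(A, B, i, j):
    """Tree-sum the n products A[i][k] * B[k][j] (n a power of 2)."""
    n = len(B)
    assert n > 0 and n & (n - 1) == 0  # assumption of this sketch

    # Parallel step: one product per "processor" k.
    T = [A[i][k] * B[k][j] for k in range(n)]

    r = n // 2
    while r > 0:
        # 0-based form of T(k) = T(2k-1) + T(2k); the new list models
        # the synchronous read-then-write of one PRAM step.
        T = [T[2 * k] + T[2 * k + 1] for k in range(r)]
        r //= 2
    return T[0]


def matrix_multiply(A, B):
    """All m * p entries are independent, so they could run in parallel."""
    m, p = len(A), len(B[0])
    return [[evaluate_entry(A, B, i, j) for j in range(p)] for i in range(m)]


print(matrix_multiply([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

The two list comprehensions in `matrix_multiply` correspond to the two nested "do in parallel" loops of the slide.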
Complexity Analysis

The algorithm runs in O(log n) time using O(mpn) processors. In particular, when A and B are square matrices of order n, it runs in O(log n) time using O(n^3) processors. With the divide-and-conquer approach, the problem can be solved in O(log n) time using O(n^3/log n) processors. The algorithm requires the CREW PRAM model, because each entry of A and B is read by several processors at the same time.
Partial Sums

Let A(1:n) be an array of numbers. The partial sums of the array are defined by

$PS(i) = \sum_{j=1}^{i} A(j), \quad i = 1, 2, \dots, n$.

Algorithm Sequential-Partial-Sum
Input: Array A(1:n).
Output: Partial sums in PS(1:n).
BEGIN
  PS(1) = A(1);
  For i = 2 to n do
    PS(i) = PS(i-1) + A(i);
  End-For
END.
Parallel Processing of Partial Sums

1st Stage: Let S(i, j) denote the value of the jth node at level h - i, counted from left to right, in the binary tree, where h is the height of the binary tree. Initially, S(0, j) = A(j). All the values of S(i, j) can be computed by the parallel sum algorithm.

2nd Stage: The partial sums will be PS(0, 1), PS(0, 2), …, PS(0, n). These values are determined by a traversal from the root to the leaves of the binary tree.
Example of Partial Sum

1st Stage: each S(i, j) is determined bottom up.

i      1   2   3   4   5   6   7   8
A(i)  23  38  40  73  91  39  48  63
2nd Stage: PS(i, j) is evaluated top down.
Parallel Algorithm of Partial Sums

Algorithm Parallel-Partial-Sum
Input: Array A(1:n).
Output: Partial sums PS(0, 1:n).
BEGIN
  (1st Stage)
  For i = 1 to n do in parallel
    S(0, i) = A(i);
  End-parallel
  p = n;
  For i = 1 to (log n) do
    p = p/2;
    For j = 1 to p do in parallel
      S(i, j) = S(i-1, 2j) + S(i-1, 2j-1);
    End-parallel
  End-For
  (2nd Stage)
  PS(log n, 1) = S(log n, 1);
  p = 1;
  For i = (log n) - 1 down to 0 do
    p = 2p;
    For j = 1 to p do in parallel
      Case
        j = 1:     PS(i, j) = S(i, j);
        j is even: PS(i, j) = PS(i+1, j/2);
        Else:      PS(i, j) = PS(i+1, (j-1)/2) + S(i, j);
      End-Case
    End-parallel
  End-For
END.

O(log n) time, O(n) PEs in the CREW PRAM model.
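Both stages of Parallel-Partial-Sum can be traced with a short sequential simulation. This is a sketch only: the name `parallel_partial_sums` is hypothetical, n is assumed to be a power of 2, and lists are padded with an unused slot 0 so that the index j matches the 1-based pseudocode. For odd j > 1 the parent's left neighbour is taken as (j - 1) / 2, that is, j/2 rounded down.

```python
def parallel_partial_sums(A):
    """Simulate the two-stage tree algorithm; returns PS(0, 1..n)."""
    n = len(A)
    assert n > 0 and n & (n - 1) == 0  # assumption: n is a power of 2
    h = n.bit_length() - 1             # h = log2(n), the tree height

    # Stage 1 (bottom up): S[i][j] is the sum stored at node j of level i.
    S = [[0] + list(A)]
    p = n
    for i in range(1, h + 1):
        p //= 2
        prev = S[i - 1]
        S.append([0] + [prev[2 * j] + prev[2 * j - 1] for j in range(1, p + 1)])

    # Stage 2 (top down): PS[i][j] is the sum of all leaves up to and
    # including the rightmost leaf below node j of level i.
    PS = [None] * (h + 1)
    PS[h] = [0, S[h][1]]
    p = 1
    for i in range(h - 1, -1, -1):
        p *= 2
        row = [0] * (p + 1)
        for j in range(1, p + 1):      # one parallel step on a PRAM
            if j == 1:
                row[j] = S[i][1]
            elif j % 2 == 0:
                row[j] = PS[i + 1][j // 2]
            else:
                row[j] = PS[i + 1][(j - 1) // 2] + S[i][j]
        PS[i] = row
    return PS[0][1:]


print(parallel_partial_sums([1, 2, 3, 4, 5, 6, 7, 8]))
```

In the even case, node j and its right sibling j + 1 both read PS(i+1, j/2), which is the concurrent read behind the CREW requirement.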
Binomial Coefficients

The binomial coefficient is given by

$\binom{n}{r} = \frac{n!}{r!\,(n-r)!}$.

The problem here is to find all the binomial coefficients $\binom{n}{0}, \binom{n}{1}, \dots, \binom{n}{n}$. They can be arranged in the form of a triangle.
They can also be put in the form of a square table. If this two-dimensional array is represented by P, then we observe that

$P(i, j) = \binom{i + j - 1}{j}, \quad i \ge 1,\ j \ge 0$.
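Using the fact that

$\binom{n}{r} = \binom{n-1}{r} + \binom{n-1}{r-1}$,

we get P(i, j) = P(i-1, j) + P(i, j-1). Repeating on P(i-1, j), we get

$P(i, j) = P(i, j-1) + P(i-1, j-1) + \dots + P(2, j-1) + P(1, j)$.

Since P(1, j) = P(1, j-1) = 1, the value of P(i, j) is reached by adding all the cells of the previous column up to the ith row, so that

$P(i, j) = \sum_{k=1}^{i} P(k, j-1)$.

Also, P(i, 0) = 1 and P(1, j) = 1.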
Example of Binomial Coefficients

Suppose n = 6. Find all values of $\binom{6}{r}$, $r = 0, 1, \dots, 6$.

Initial values of P(i, 0):

i \ j   0   1   2   3   4   5   6
1       1
2       1
3       1
4       1
5       1
6       1
7       1
Values of P(i, 1):

i \ j   0   1   2   3   4   5   6
1       1   1
2       1   2
3       1   3
4       1   4
5       1   5
6       1   6
7       1
Final values of P(i, j):

i \ j   0   1   2   3   4   5   6
1       1   1   1   1   1   1   1
2       1   2   3   4   5   6
3       1   3   6  10  15
4       1   4  10  20
5       1   5  15
6       1   6
7       1

The lower-left to top-right diagonal entries give the values of $\binom{6}{r}$: 1, 6, 15, 20, 15, 6, 1.
Parallel Algorithm

Algorithm Parallel-Binomial-Coefficients
Input: A positive integer n.
Output: The binomial coefficients $\binom{n}{0}, \binom{n}{1}, \dots, \binom{n}{n}$.
BEGIN
  For i = 1 to n+1 do in parallel
    P(i, 0) = 1;
  End-parallel
  For j = 1 to n do
    Find the partial sums of the (j-1)th column entries using the
    Parallel-Partial-Sum algorithm and store them in the jth column.
    That is: P(i, j) = P(1, j-1) + P(2, j-1) + … + P(i, j-1);
  End-For
  Output results in P(n+1, 0), P(n, 1), P(n-1, 2), …, P(1, n);
END.
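The column-by-column construction can be checked with a small Python sketch. The function name `binomial_row` is hypothetical, and `itertools.accumulate` is used here as a sequential stand-in for Parallel-Partial-Sum; restricting column j to rows 1 to n+1-j is an assumption of this sketch, chosen because only those cells are needed for the output diagonal.

```python
from itertools import accumulate


def binomial_row(n):
    """Return [C(n,0), ..., C(n,n)] via repeated column partial sums."""
    # P[i][j] for i = 1..n+1 and j = 0..n; row 0 is unused padding so
    # that indices match the 1-based rows of the slides.
    P = [[0] * (n + 1) for _ in range(n + 2)]
    for i in range(1, n + 2):          # parallel initialisation step
        P[i][0] = 1

    for j in range(1, n + 1):          # n sequential phases
        rows = n + 1 - j               # cells needed in column j
        prev_col = [P[k][j - 1] for k in range(1, rows + 1)]
        # Stand-in for Parallel-Partial-Sum on the previous column.
        for i, s in enumerate(accumulate(prev_col), start=1):
            P[i][j] = s

    # Read the lower-left to top-right diagonal.
    return [P[n + 1 - j][j] for j in range(n + 1)]


print(binomial_row(6))  # [1, 6, 15, 20, 15, 6, 1]
```

Each of the n phases costs O(log n) when the partial sums are computed in parallel, which matches the O(n log n) total stated in the complexity analysis.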
Complexity Analysis

The algorithm runs in O(n log n) time using O(n) processors: it makes n calls to Parallel-Partial-Sum, each taking O(log n) time. It can be implemented in the CREW PRAM model, because it uses the Parallel-Partial-Sum algorithm. Can it be faster?
Approximate String Matching

This is the problem of string matching that allows errors.

Edit distance: the operations allowed are deleting, inserting, and substituting single characters (with a different one) in both strings.

Application areas: text retrieval, computational biology, signal processing, pattern recognition.
Example: pattern SURVEY (rows) against text SURGERY (columns)

       S  U  R  G  E  R  Y
    0  0  0  0  0  0  0  0
S   1  0  1  1  1  1  1  1
U   2  1  0  1  2  2  2  2
R   3  2  1  0  1  2  2  3
V   4  3  2  1  1  2  3  3
E   5  4  3  2  2  1  2  3
Y   6  5  4  3  3  2  2  2

Edit distance = 2. Time complexity: O(nm).
Algorithm

Here X[1:m] is the pattern (rows of the table) and Y[1:n] is the text (columns).

  C[0, j] = 0, for 0 ≤ j ≤ n;
  C[i, 0] = i, for 0 ≤ i ≤ m;
  otherwise:
    if (X_i = Y_j) then C[i, j] = C[i-1, j-1];
    else C[i, j] = 1 + min(C[i-1, j], C[i, j-1], C[i-1, j-1]);
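The recurrence translates directly into a sequential table fill. The following Python sketch uses a hypothetical function name, `approx_match_table`, and takes the pattern as X and the text as Y, which is how the SURVEY/SURGERY example table is laid out. It is the O(nm) sequential version, not the parallel one.

```python
def approx_match_table(pattern, text):
    """Fill C[0..m][0..n] for approximate matching of pattern in text."""
    m, n = len(pattern), len(text)
    C = [[0] * (n + 1) for _ in range(m + 1)]  # row 0: C[0][j] = 0

    for i in range(m + 1):
        C[i][0] = i                            # column 0: C[i][0] = i

    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if pattern[i - 1] == text[j - 1]:  # X_i = Y_j (0-based strings)
                C[i][j] = C[i - 1][j - 1]
            else:
                C[i][j] = 1 + min(C[i - 1][j], C[i][j - 1], C[i - 1][j - 1])
    return C


C = approx_match_table("SURVEY", "SURGERY")
print(C[-1])       # last row: [6, 5, 4, 3, 3, 2, 2, 2]
print(min(C[-1]))  # smallest value in the last row: 2
```

Because row 0 is all zeros, a match may start at any position of the text; the smallest entry in the last row is the value reported as the edit distance in the example.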
Parallel Processing

[Figure: the table for text SURGERY (columns) and pattern SURVEY (rows), with every cell of row i labeled P_i: P_0 for the initial row, P_1 for row S, and so on through P_6 for row Y.]

O(m) time; O(n) PEs in the CREW PRAM model.
Euler Circuit

The Euler circuit of a given tree can be represented as a list of directed arcs ⟨u, v⟩. In the Euler circuit, whenever we travel along the arc ⟨v, p(v)⟩, we have just traversed the vertex v, where p(v) is the parent of v.
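Example of Euler Circuit

The Euler circuit is { ⟨4,2⟩, ⟨2,1⟩, ⟨1,2⟩, ⟨2,4⟩, ⟨4,3⟩, ⟨3,4⟩, ⟨4,5⟩, ⟨5,6⟩, ⟨6,5⟩, ⟨5,7⟩, ⟨7,5⟩, ⟨5,8⟩, ⟨8,10⟩, ⟨10,11⟩, ⟨11,10⟩, ⟨10,8⟩, ⟨8,5⟩, ⟨5,9⟩, ⟨9,5⟩, ⟨5,4⟩ }.

Adjacency list of the tree:

v    adj(v)            v    adj(v)
1    2                 7    5
2    1, 4              8    5, 10
3    4                 9    5
4    2, 3, 5           10   8, 11
5    4, 6, 7, 8, 9     11   10
6    5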
Algorithm of Euler Circuit

Algorithm Parallel-Euler-Circuit
Input: A tree T represented by its adjacency list with some additional pointers.
Output: Successor(⟨u, v⟩) for every arc ⟨u, v⟩.
BEGIN
  For every arc ⟨u, v⟩ do in parallel
    Successor(⟨u, v⟩) = ⟨v, w⟩, where w occurs next to u in the ordered
    list of vertices adjacent to v. If u appears last in the list of
    vertices adjacent to v, then w is the first node in the list.
  End-parallel
END.

O(1) time, O(n) PEs in the EREW PRAM model.
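The successor rule can be tried out on the example tree. This Python sketch is illustrative only: `ADJ` is a reading of the adjacency list from the example slide, and `euler_successor` and `walk` are hypothetical helper names. The dictionary loop plays the role of the parallel step; on a PRAM each arc would be handled by its own processor, using the additional pointers the slide mentions to find the position of u in adj(v) in O(1).

```python
# Adjacency lists of the example tree (as read from the example slide).
ADJ = {
    1: [2], 2: [1, 4], 3: [4], 4: [2, 3, 5], 5: [4, 6, 7, 8, 9], 6: [5],
    7: [5], 8: [5, 10], 9: [5], 10: [8, 11], 11: [10],
}


def euler_successor(adj):
    """Successor(<u, v>) = <v, w>, w following u cyclically in adj[v]."""
    succ = {}
    for v, neighbours in adj.items():
        for idx, u in enumerate(neighbours):
            # Wrap around to the first neighbour when u is last.
            w = neighbours[(idx + 1) % len(neighbours)]
            succ[(u, v)] = (v, w)
    return succ


def walk(succ, start):
    """Follow successor pointers from `start` until the circuit closes."""
    tour = [start]
    arc = succ[start]
    while arc != start:
        tour.append(arc)
        arc = succ[arc]
    return tour


succ = euler_successor(ADJ)
tour = walk(succ, (4, 2))
print(len(tour))  # 20 arcs: two per tree edge
print(tour[:4])   # [(4, 2), (2, 1), (1, 2), (2, 4)]
```

Only `euler_successor` corresponds to the O(1) parallel algorithm; `walk` is a sequential check that the pointers really form a single circuit.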
Post Order Traversal Method

The post order traversal method is an order in which to visit the nodes of a tree. The post order traversal of a tree T with root r consists of the post order traversals of the subtrees of r from left to right, followed by the root r.

For the example tree, rooted at vertex 4, the post order traversal is: 1, 2, 3, 6, 7, 11, 10, 8, 9, 5, 4.
Post Order Numbering

The post order numbering is the function that gives the rank of each vertex in the post order traversal sequence. For the previous example, the post order numbering, post(v), is given by:

Vertex v   1   2   3   4   5   6   7   8   9  10  11
post(v)    1   2   3  11  10   4   5   8   9   7   6
Steps of Post Order Numbering

Step 1: For every arc ⟨u, v⟩, if u is the parent of v, then assign weight 0 to ⟨u, v⟩; otherwise, assign weight 1 to ⟨u, v⟩.
Step 2: Perform the prefix sum of the weights of the arcs, in the list order specified by the successor function of the Euler circuit.
Step 3: For every vertex v other than the root, post(v) is the prefix sum at the arc ⟨v, p(v)⟩.
Step 4: The post order number of the root is n, where n is the number of vertices in the tree.
Post Order Numbering as Prefix Sum
Parallel Algorithm

Algorithm Parallel-Post-Order-Numbering
Input: A tree T with root r represented by its Euler circuit.
Output: For every vertex v, the post order numbering post(v).
BEGIN
  For every arc ⟨u, v⟩ do in parallel
    If u = p(v), assign the weight 0 to ⟨u, v⟩,
    Else assign the weight 1 to ⟨u, v⟩;
  End-parallel
  Find the prefix sums of the list of weights specified by the successor function;
  For every vertex v ≠ r do in parallel
    post(v) = prefix sum at the arc ⟨v, p(v)⟩;
  End-parallel
  post(r) = n;
END.

O(log n) time, O(n) PEs in the CREW PRAM model.
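The four steps can be followed end to end on the example tree. This Python sketch is a sequential illustration under several assumptions: `ADJ` and `ROOT = 4` are a reading of the example slides, `post_order_numbering` is a hypothetical name, the tree has at least two vertices, the circuit is opened at the arc from the root to its first neighbour, and `itertools.accumulate` stands in for the parallel prefix sum over the successor list.

```python
from itertools import accumulate

ADJ = {
    1: [2], 2: [1, 4], 3: [4], 4: [2, 3, 5], 5: [4, 6, 7, 8, 9], 6: [5],
    7: [5], 8: [5, 10], 9: [5], 10: [8, 11], 11: [10],
}
ROOT = 4


def post_order_numbering(adj, root):
    # Parent pointers p(v); the slides take these as given.
    parent = {root: None}
    stack = [root]
    while stack:
        v = stack.pop()
        for w in adj[v]:
            if w not in parent:
                parent[w] = v
                stack.append(w)

    # Euler circuit successor: <u, v> -> <v, w>, w next after u in adj[v].
    succ = {}
    for v, nbrs in adj.items():
        for idx, u in enumerate(nbrs):
            succ[(u, v)] = (v, nbrs[(idx + 1) % len(nbrs)])

    # Linearise the circuit, starting at the root's first outgoing arc.
    start = (root, adj[root][0])
    tour = [start]
    arc = succ[start]
    while arc != start:
        tour.append(arc)
        arc = succ[arc]

    # Step 1: weight 0 for parent-to-child arcs, 1 for child-to-parent arcs.
    weights = [0 if parent[v] == u else 1 for (u, v) in tour]
    # Step 2: prefix sums along the list (sequential stand-in).
    prefix = list(accumulate(weights))
    # Step 3: post(v) is the prefix sum at the arc <v, p(v)>.
    post = {u: s for (u, v), s in zip(tour, prefix) if parent[u] == v}
    # Step 4: the root is numbered last.
    post[root] = len(adj)
    return post


post = post_order_numbering(ADJ, ROOT)
print([post[v] for v in range(1, 12)])  # [1, 2, 3, 11, 10, 4, 5, 8, 9, 7, 6]
```

Each child-to-parent arc is crossed exactly when the subtree below it has been completed, so the running count of such arcs is the post order rank.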