Published by Gerald Wilson. Modified over 9 years ago.
1
Numerical Algorithms: Matrix Multiplication, Gaussian Elimination, Jacobi Iteration, Gauss-Seidel Relaxation
2
Numerical Algorithms Matrix addition
3
Numerical Algorithms Matrix Multiplication
4
Numerical Algorithms Matrix-Vector Multiplication
5
Implementing Matrix Multiplication
Sequential code, O(n^3):
for (i = 0; i < n; i++)
    for (j = 0; j < n; j++) {
        c[i][j] = 0;
        for (k = 0; k < n; k++)
            c[i][j] = c[i][j] + a[i][k] * b[k][j];
    }
6
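Implementing Matrix Multiplication
Partitioning into Submatrices
for (p = 0; p < s; p++)
    for (q = 0; q < s; q++) {
        Cp,q = 0;                        /* clear the result submatrix */
        for (r = 0; r < s; r++)          /* r runs over the s submatrices in a row or column */
            Cp,q = Cp,q + Ap,r * Br,q;   /* submatrix multiply and add */
    }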
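The submatrix loop can be written out on ordinary arrays. The following is a minimal sketch, assuming row-major storage, m x m submatrices, and s = n/m submatrices per dimension with m dividing n. The function name `block_mat_mult` and its parameter list are illustrative and do not come from the slides.

```c
#include <assert.h>
#include <stddef.h>

/* Multiply the n x n row-major matrices a and b into c, working on
   m x m submatrices. Assumption: m divides n, so there are s = n/m
   submatrices in each dimension. */
void block_mat_mult(size_t n, size_t m,
                    const double *a, const double *b, double *c)
{
    assert(m > 0 && n % m == 0);
    size_t s = n / m;

    for (size_t i = 0; i < n * n; i++)
        c[i] = 0.0;                      /* every Cp,q starts at zero */

    for (size_t p = 0; p < s; p++)
        for (size_t q = 0; q < s; q++)
            for (size_t r = 0; r < s; r++)
                /* Cp,q = Cp,q + Ap,r * Br,q on m x m submatrices */
                for (size_t i = p * m; i < (p + 1) * m; i++)
                    for (size_t j = q * m; j < (q + 1) * m; j++) {
                        double sum = 0.0;
                        for (size_t k = r * m; k < (r + 1) * m; k++)
                            sum += a[i * n + k] * b[k * n + j];
                        c[i * n + j] += sum;
                    }
}
```

Calling `block_mat_mult(4, 2, a, b, c)` treats each 4 x 4 matrix as a 2 x 2 grid of 2 x 2 submatrices. In a parallel version, each (p, q) pair would be a natural unit of work for one process.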
7
Implementing Matrix Multiplication
10
Analysis communication computation
11
Implementing Matrix Multiplication
O(n^2) with n^2 processors
O(log n) with n^3 processors
12
Implementing Matrix Multiplication
Submatrices: s = n/m
Communication
Computation
13
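Recursive Implementation
mat_mult(App, Bpp, s)
{
    if (s == 1)                        /* submatrix has a single element */
        C = A * B;                     /* multiply the two elements */
    else {                             /* continue to make recursive calls */
        s = s/2;                       /* size of each quadrant */
        P0 = mat_mult(App, Bpp, s);
        P1 = mat_mult(Apq, Bqp, s);
        P2 = mat_mult(App, Bpq, s);
        P3 = mat_mult(Apq, Bqq, s);
        P4 = mat_mult(Aqp, Bpp, s);
        P5 = mat_mult(Aqq, Bqp, s);
        P6 = mat_mult(Aqp, Bpq, s);
        P7 = mat_mult(Aqq, Bqq, s);
        Cpp = P0 + P1;                 /* add submatrix products to form */
        Cpq = P2 + P3;                 /* the four quadrants of C */
        Cqp = P4 + P5;
        Cqq = P6 + P7;
    }
    return (C);                        /* return the final matrix */
}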
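The recursive scheme can be made concrete in C. This is a sketch under stated assumptions: matrices are row-major, n is a power of two, and each of the eight products is accumulated directly into its quadrant of C instead of being returned as P0 to P7 and then added. The helper `mat_mult_rec` and its stride parameter `ld` are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* C += A * B for s x s blocks that live inside row-major arrays whose
   rows are ld elements apart. Assumption: s is a power of two. */
static void mat_mult_rec(const double *a, const double *b, double *c,
                         size_t s, size_t ld)
{
    if (s == 1) {                      /* single element: multiply and add */
        c[0] += a[0] * b[0];
        return;
    }
    size_t h = s / 2;                  /* quadrant size */
    const double *app = a, *apq = a + h;
    const double *aqp = a + h * ld, *aqq = a + h * ld + h;
    const double *bpp = b, *bpq = b + h;
    const double *bqp = b + h * ld, *bqq = b + h * ld + h;
    double *cpp = c, *cpq = c + h;
    double *cqp = c + h * ld, *cqq = c + h * ld + h;

    mat_mult_rec(app, bpp, cpp, h, ld);   /* P0 */
    mat_mult_rec(apq, bqp, cpp, h, ld);   /* P1: Cpp = P0 + P1 */
    mat_mult_rec(app, bpq, cpq, h, ld);   /* P2 */
    mat_mult_rec(apq, bqq, cpq, h, ld);   /* P3: Cpq = P2 + P3 */
    mat_mult_rec(aqp, bpp, cqp, h, ld);   /* P4 */
    mat_mult_rec(aqq, bqp, cqp, h, ld);   /* P5: Cqp = P4 + P5 */
    mat_mult_rec(aqp, bpq, cqq, h, ld);   /* P6 */
    mat_mult_rec(aqq, bqq, cqq, h, ld);   /* P7: Cqq = P6 + P7 */
}

/* Multiply n x n row-major matrices; n must be a power of two. */
void mat_mult(size_t n, const double *a, const double *b, double *c)
{
    assert(n > 0 && (n & (n - 1)) == 0);
    for (size_t i = 0; i < n * n; i++)
        c[i] = 0.0;
    mat_mult_rec(a, b, c, n, n);
}
```

The eight recursive calls touch four disjoint quadrants of C, so calls that write to different quadrants are independent and could be handed to separate processes.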
14
Mesh Implementation
Cannon's Algorithm
1. Initially, processor Pi,j has elements ai,j and bi,j.
2. Elements are moved from their initial positions to an "aligned" position. The complete ith row of A is shifted i places left and the complete jth column of B is shifted j places upward, with wraparound. This places the elements ai,j+i and bi+j,j in processor Pi,j, as illustrated in Figure 10.10. These elements are a pair of those required in the accumulation of ci,j.
3. Each processor, Pi,j, multiplies its elements.
4. The ith row of A is shifted one place left, and the jth column of B is shifted one place upward. This brings together the adjacent elements of A and B, which will also be required in the accumulation, as illustrated in Figure 10.11.
5. Each processor, Pi,j, multiplies the elements brought to it and adds the result to the accumulating sum.
6. Steps 4 and 5 are repeated until the final result is obtained (n - 1 shifts for n rows and n columns of elements).
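The steps of Cannon's algorithm can be checked with a sequential simulation. The sketch below assumes one matrix element per processor, row-major storage, and cyclic (wraparound) shifts: rows of A move left and columns of B move upward. The arrays `pa` and `pb` stand in for the values currently held by each Pi,j, and all names are illustrative.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Sequentially simulate Cannon's algorithm on an n x n mesh,
   one element of A and B per processor. */
void cannon_mat_mult(size_t n, const double *a, const double *b, double *c)
{
    assert(n > 0);
    double *pa = malloc(n * n * sizeof *pa);   /* A element held by each P */
    double *pb = malloc(n * n * sizeof *pb);   /* B element held by each P */
    double *tmp = malloc(n * n * sizeof *tmp);
    assert(pa && pb && tmp);

    /* Step 2: alignment. Row i of A moves i places left, column j of B
       moves j places up, so P(i,j) holds a[i][(j+i)%n] and b[(i+j)%n][j]. */
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++) {
            pa[i * n + j] = a[i * n + (j + i) % n];
            pb[i * n + j] = b[((i + j) % n) * n + j];
            c[i * n + j] = 0.0;
        }

    for (size_t step = 0; step < n; step++) {
        /* Steps 3 and 5: every processor multiplies and accumulates. */
        for (size_t idx = 0; idx < n * n; idx++)
            c[idx] += pa[idx] * pb[idx];
        if (step + 1 == n)
            break;                              /* only n - 1 shifts needed */

        /* Step 4: shift each row of A one place left. */
        for (size_t i = 0; i < n; i++)
            for (size_t j = 0; j < n; j++)
                tmp[i * n + j] = pa[i * n + (j + 1) % n];
        memcpy(pa, tmp, n * n * sizeof *pa);

        /* Step 4: shift each column of B one place upward. */
        for (size_t i = 0; i < n; i++)
            for (size_t j = 0; j < n; j++)
                tmp[i * n + j] = pb[((i + 1) % n) * n + j];
        memcpy(pb, tmp, n * n * sizeof *pb);
    }

    free(pa);
    free(pb);
    free(tmp);
}
```

In a message-passing version, the two copy loops would become nearest-neighbor sends and receives on the mesh. The multiply-accumulate line is the only computation each processor performs.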
15
Mesh Implementation
16
Analysis communication computation O(sm^2)
17
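Two-dimensional pipeline: Systolic array
recv(&a, Pi,j-1);     /* receive a from the left neighbor */
recv(&b, Pi-1,j);     /* receive b from the neighbor above */
c = c + a * b;        /* accumulate the product */
send(&a, Pi,j+1);     /* pass a to the right neighbor */
send(&b, Pi+1,j);     /* pass b to the neighbor below */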
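The recv/compute/send cycle can be simulated step by step to see how the operands meet. This sketch assumes that row i of A enters at the left edge delayed by i steps, that column j of B enters at the top edge delayed by j steps, and that zeros are fed in when no operand is due. The input skew and every name here are assumptions for illustration; the slide shows only the per-processor code.

```c
#include <assert.h>
#include <stdlib.h>

/* Simulate an n x n systolic array computing C = A * B (row-major).
   ar and br hold the a and b values currently inside each cell. */
void systolic_mat_mult(size_t n, const double *a, const double *b, double *c)
{
    assert(n > 0);
    double *ar = calloc(n * n, sizeof *ar);
    double *br = calloc(n * n, sizeof *br);
    assert(ar && br);

    for (size_t idx = 0; idx < n * n; idx++)
        c[idx] = 0.0;

    /* The last cell, P(n-1,n-1), sees its final operands at t = 3n - 3. */
    for (size_t t = 0; t < 3 * n - 2; t++) {
        /* a values move one cell right; column 0 takes skewed input. */
        for (size_t i = 0; i < n; i++) {
            for (size_t j = n - 1; j > 0; j--)
                ar[i * n + j] = ar[i * n + j - 1];
            ar[i * n] = (t >= i && t - i < n) ? a[i * n + (t - i)] : 0.0;
        }
        /* b values move one cell down; row 0 takes skewed input. */
        for (size_t j = 0; j < n; j++) {
            for (size_t i = n - 1; i > 0; i--)
                br[i * n + j] = br[(i - 1) * n + j];
            br[j] = (t >= j && t - j < n) ? b[(t - j) * n + j] : 0.0;
        }
        /* Every cell executes c = c + a * b on what it now holds. */
        for (size_t idx = 0; idx < n * n; idx++)
            c[idx] += ar[idx] * br[idx];
    }

    free(ar);
    free(br);
}
```

At step t, cell (i, j) holds a[i][t-i-j] and b[t-i-j][j], so the two operands with matching inner index always arrive together. That is why each cell needs no index arithmetic of its own.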
18
Two-dimensional pipeline: Systolic array
19
Solving a System of Linear Equations
Ax = b
Dense matrix
Sparse matrix
20
Solving a System of Linear Equations Gaussian Elimination
21
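Solving a System of Linear Equations
Sequential code, O(n^3):
for (i = 0; i < n-1; i++)                  /* for each pivot row i */
    for (j = i+1; j < n; j++) {            /* for each row j below it */
        m = a[j][i] / a[i][i];             /* multiplier */
        for (k = i; k < n; k++)
            a[j][k] = a[j][k] - a[i][k] * m;
        b[j] = b[j] - b[i] * m;            /* update the right-hand side */
    }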
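The elimination loop reduces A to upper triangular form; obtaining x also needs a back-substitution pass, which the slide does not show. The sketch below adds that pass. It assumes row-major storage and performs no pivoting, so it reports failure on a zero pivot rather than swapping rows. The function name `gauss_solve` and its return convention are illustrative.

```c
#include <assert.h>
#include <stddef.h>

/* Solve A x = b by Gaussian elimination without pivoting.
   a (n x n, row-major) and b are overwritten. Returns 0 on success,
   -1 if a zero pivot is met (assumption: no row exchanges are done). */
int gauss_solve(size_t n, double *a, double *b, double *x)
{
    assert(n > 0);

    /* Forward elimination, same loop structure as the sequential code. */
    for (size_t i = 0; i + 1 < n; i++) {
        if (a[i * n + i] == 0.0)
            return -1;
        for (size_t j = i + 1; j < n; j++) {
            double m = a[j * n + i] / a[i * n + i];
            for (size_t k = i; k < n; k++)
                a[j * n + k] -= a[i * n + k] * m;
            b[j] -= b[i] * m;
        }
    }

    /* Back substitution (added step): solve from the last row upward. */
    for (size_t i = n; i-- > 0; ) {
        if (a[i * n + i] == 0.0)
            return -1;
        double sum = b[i];
        for (size_t k = i + 1; k < n; k++)
            sum -= a[i * n + k] * x[k];
        x[i] = sum / a[i * n + i];
    }
    return 0;
}
```

For the system 2x + y - z = 8, -3x - y + 2z = -11, -2x + y + 2z = -3, the routine yields x = 2, y = 3, z = -1 up to rounding. A production solver would normally add partial pivoting for numerical stability.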
22
Solving a System of Linear Equations communication O(n^2)
23
Solving a System of Linear Equations computation
24
Solving a System of Linear Equations Pipeline configuration
25
Solving a System of Linear Equations
26
Iterative Methods Jacobi Iteration
27
Iterative Methods
29
Relationship with a General System of Linear Equations Iterative Methods
32
Gauss-Seidel Relaxation
33
Iterative Methods
34
Red-Black Ordering
35
Iterative Methods
forall (i = 1; i < n; i++)                 /* red points: i + j even */
    forall (j = 1; j < n; j++)
        if ((i + j) % 2 == 0)
            f[i][j] = 0.25*(f[i-1][j] + f[i][j-1] + f[i+1][j] + f[i][j+1]);
forall (i = 1; i < n; i++)                 /* black points: i + j odd */
    forall (j = 1; j < n; j++)
        if ((i + j) % 2 != 0)
            f[i][j] = 0.25*(f[i-1][j] + f[i][j-1] + f[i+1][j] + f[i][j+1]);
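A sequential C version of one red-black sweep makes the index bounds explicit. This sketch assumes an (n+1) x (n+1) row-major grid whose boundary points (index 0 and index n) hold fixed values, so only the interior points 1 to n-1 are updated. The function name `red_black_sweep` is illustrative, and ordinary `for` loops stand in for `forall`.

```c
#include <assert.h>
#include <stddef.h>

/* One red-black sweep over the interior of an (n+1) x (n+1) grid f.
   Points with (i + j) even are updated first, then points with (i + j)
   odd. Boundary values are left unchanged. */
void red_black_sweep(size_t n, double *f)
{
    assert(n >= 2);
    size_t w = n + 1;                       /* row width of the grid */

    for (size_t color = 0; color < 2; color++)
        for (size_t i = 1; i < n; i++)
            for (size_t j = 1; j < n; j++)
                if ((i + j) % 2 == color)
                    f[i * w + j] = 0.25 * (f[(i - 1) * w + j] +
                                           f[i * w + j - 1] +
                                           f[(i + 1) * w + j] +
                                           f[i * w + j + 1]);
}
```

Within one color, every update reads only points of the other color, so all points of that color can be computed simultaneously. That independence is what allows the two passes to be written as `forall` loops.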
36
Iterative Methods High-Order Difference Methods
37
Iterative Methods Multigrid Method