1
A CONDENSATION-BASED LOW COMMUNICATION LINEAR SYSTEMS SOLVER UTILIZING CRAMER'S RULE
Ken Habgood, Itamar Arel
Department of Electrical Engineering & Computer Science
The University of Tennessee
Gabriel Cramer (1704-1752)
2
EECS Department / University of Tennessee
http://mil.engr.utk.edu
Outline
- Motivation & problem statement
- Algorithm review
- Numerical accuracy & stability
- Parallel Implementation
- Communication
- Results
Source: http://tridane.faculty.asu.edu
3
Introduction
- Mainstream approach: Gaussian elimination, e.g. LU decomposition
- Looking for an efficient parallel solver with lower communication overhead
- Targeting an unpopular approach: Cramer’s Rule
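Cramer's rule expresses each unknown of A x = b as a ratio of determinants: x_i = det(A_i) / det(A), where A_i is A with column i replaced by b. The following is a minimal sketch of the textbook rule only, not the condensation-based solver the slides propose. The function names `det` and `cramer_solve` are hypothetical, and the cofactor expansion is practical only for very small matrices.

```python
from fractions import Fraction

def det(m):
    """Determinant by cofactor expansion along the first row (tiny matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        # Minor: drop row 0 and column j.
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def cramer_solve(a, b):
    """Solve a x = b via x_i = det(A_i) / det(A)."""
    d = det(a)
    if d == 0:
        raise ValueError("matrix is singular")
    x = []
    for i in range(len(a)):
        # A_i: copy of A with column i replaced by the right-hand side b.
        a_i = [row[:i] + [b[r]] + row[i + 1:] for r, row in enumerate(a)]
        x.append(Fraction(det(a_i), d))
    return x

a = [[2, 1, -1], [-3, -1, 2], [-2, 1, 2]]
b = [8, -11, -3]
solution = cramer_solve(a, b)
```

Evaluated naively like this, the rule needs N + 1 separate determinants, which is why it is usually dismissed for large systems.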
4
LU Communication Pattern
Communication for distributed LU decomposition
Source: http://www.caam.rice.edu/~timwar/MA471F03/
Block layout:
L00 U00   U01   U02
L10       A11   A12
L20       A21   A22
Three sequential steps:
1. Top-left processor computes and sends
2. Row and column leads compute and send
3. Remaining processors factorize their blocks
- One-to-one communication
- Idle time while the leads are processing
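The three sequential steps can be sketched serially for one block step of a right-looking LU factorization. This is an illustrative sketch under assumptions, not the distributed code behind the slide: it runs on one process, skips pivoting (the example matrix is diagonally dominant), uses exact fractions, and all function names are hypothetical. The comments mark which processor group would do each piece of work.

```python
from fractions import Fraction

def matmul(x, y):
    """Plain triple-loop matrix product."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def lu_nopivot(a):
    """Doolittle LU without pivoting; returns (L, U) with unit-diagonal L."""
    n = len(a)
    u = [row[:] for row in a]
    l = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for k in range(n):
        for i in range(k + 1, n):
            l[i][k] = u[i][k] / u[k][k]
            for j in range(k, n):
                u[i][j] -= l[i][k] * u[k][j]
    return l, u

def solve_lower(l, b):
    """Solve L X = B for X, with L lower triangular."""
    n, m = len(l), len(b[0])
    x = [[Fraction(0)] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            s = sum(l[i][k] * x[k][j] for k in range(i))
            x[i][j] = (b[i][j] - s) / l[i][i]
    return x

def solve_upper_right(u, b):
    """Solve X U = B for X, with U upper triangular."""
    n, m = len(u), len(b)
    x = [[Fraction(0)] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            s = sum(x[i][k] * u[k][j] for k in range(j))
            x[i][j] = (b[i][j] - s) / u[j][j]
    return x

def block_lu_step(a, nb):
    """One block step on a matrix split into a leading nb x nb block and the rest."""
    a00 = [row[:nb] for row in a[:nb]]
    a01 = [row[nb:] for row in a[:nb]]
    a10 = [row[:nb] for row in a[nb:]]
    a11 = [row[nb:] for row in a[nb:]]
    l00, u00 = lu_nopivot(a00)           # step 1: top-left block is factorized
    u01 = solve_lower(l00, a01)          # step 2: row lead needs L00
    l10 = solve_upper_right(u00, a10)    # step 2: column lead needs U00
    prod = matmul(l10, u01)              # step 3: trailing blocks need L10 and U01
    trailing = [[a11[i][j] - prod[i][j] for j in range(len(a11[0]))]
                for i in range(len(a11))]
    return l00, u00, u01, l10, trailing

a = [[Fraction(v) for v in row]
     for row in [[4, 1, 2, 1], [1, 5, 1, 2], [2, 1, 6, 1], [1, 2, 1, 7]]]
l00, u00, u01, l10, trailing = block_lu_step(a, 2)
```

Each step consumes the output of the previous one, which is the dependency chain behind the idle time noted on the slide.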
6
Proposed Algorithm Flow
7
Matrix “Mirroring”
Mirroring example
Applying Chio’s condensation yields: [equation images not recoverable]
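Chio's condensation reduces an N x N determinant to an (N-1) x (N-1) one: with pivot a11 != 0, each new entry is a11 * a_ij - a_i1 * a_1j, and det(A) = det(B) / a11^(N-2). A minimal sketch of that identity follows. It does not include the mirroring step from the slide; the row swap for a zero pivot and the name `chio_det` are assumptions made for illustration, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def chio_det(matrix):
    """Determinant via repeated Chio condensation, using exact rational arithmetic."""
    a = [[Fraction(v) for v in row] for row in matrix]
    scale = Fraction(1)  # accumulates sign flips and the pivot^(n-2) factors
    while len(a) > 1:
        n = len(a)
        # Find a row whose leading entry can serve as the pivot.
        p = next((i for i in range(n) if a[i][0] != 0), None)
        if p is None:
            return Fraction(0)  # first column is all zeros, so det = 0
        if p != 0:
            a[0], a[p] = a[p], a[0]
            scale = -scale  # a row swap flips the determinant's sign
        pivot = a[0][0]
        scale *= pivot ** (n - 2)
        # Condense: every 2 x 2 minor anchored at the pivot becomes one entry.
        a = [[pivot * a[i][j] - a[i][0] * a[0][j] for j in range(1, n)]
             for i in range(1, n)]
    return a[0][0] / scale

example = chio_det([[2, 1, -1], [-3, -1, 2], [-2, 1, 2]])
```

Each condensation step costs O(n^2) multiplications, so a full determinant costs O(N^3), the same order as elimination.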
9
Accuracy and Numerical Stability
- Backward error estimation
- Theoretical estimate of rounding error
- The E matrix depends on two items:
  - The largest element in A or b
  - The growth factor of the algorithm
- Same growth factor as LU decomposition with partial pivoting
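In backward error analysis the computed solution is treated as the exact solution of a slightly perturbed system, and E is that perturbation. The block below shows the usual shape of such a bound, as a sketch only: the symbols c(N) (a low-degree polynomial in the matrix size), u (unit roundoff), alpha and rho are generic notation assumed here, not values taken from the slides.

```latex
(A + E)\,\hat{x} = b, \qquad
\|E\|_\infty \;\le\; c(N)\,\rho\,\alpha\,u,
\qquad
\alpha = \max\Bigl(\max_{i,j}|a_{ij}|,\ \max_i |b_i|\Bigr),
\qquad
\rho = \frac{\max_{i,j,k}\bigl|a_{ij}^{(k)}\bigr|}{\max_{i,j}|a_{ij}|}
```

Here a_ij^(k) denotes the entries after the k-th reduction step, so rho measures how much the intermediate values grow relative to the input.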
10
Forward Error Comparisons
Matrix Size  κ(A)      Max Matlab  Max GSL   Avg Matlab  Avg GSL
1000 x 1000  506930    2.39E-09    1.93E-10  1.03E-10    5.38E-12
2000 x 2000  790345    4.52E-09    5.36E-09  1.01E-10    7.27E-12
3000 x 3000  1540152   1.95E-08    1.84E-08  1.12E-10    2.09E-11
4000 x 4000  12760599  4.81E-08    5.62E-08  1.43E-10    7.91E-11
5000 x 5000  765786    2.92E-08    4.39E-08  1.18E-10    3.46E-11
6000 x 6000  1499430   8.67E-08    8.70E-08  1.37E-10    6.04E-11
7000 x 7000  3488010   9.92E-08    8.95E-08  1.27E-10    5.15E-11
8000 x 8000  8154020   9.09E-08    9.43E-08  1.86E-10    7.85E-11
11
Forward Error - Residual
Matrix Size  κ(A)      Max Residual  Avg Residual
1000 x 1000  506930    3.14E-08      4.46E-09
2000 x 2000  790345    6.72E-09      9.48E-10
3000 x 3000  1540152   2.79E-08      3.28E-09
4000 x 4000  12760599  1.06E-05      1.34E-06
5000 x 5000  765786    2.00E-08      2.65E-09
6000 x 6000  1499430   2.95E-08      3.86E-09
7000 x 7000  3488010   1.99E-08      2.44E-09
8000 x 8000  8154020   1.94E-08      2.32E-09
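The slides do not define how the Max and Avg columns are computed. One plausible reading, stated here as an assumption, is that the residual columns are the largest and mean entries of |A x - b|, and the comparison columns are the largest and mean entrywise differences against a reference solution (for example one from Matlab or GSL). A minimal sketch of those metrics, with hypothetical function names:

```python
def residuals(a, x, b):
    """Entrywise |A x - b| for a computed solution x."""
    return [abs(sum(aij * xj for aij, xj in zip(row, x)) - bi)
            for row, bi in zip(a, b)]

def differences(x, x_ref):
    """Entrywise |x - x_ref| against a reference solution."""
    return [abs(xi - ri) for xi, ri in zip(x, x_ref)]

def max_and_avg(values):
    """Largest and mean value of a list of non-negative errors."""
    return max(values), sum(values) / len(values)

# Tiny illustration with a deliberately wrong "computed" solution.
a = [[2, 1], [1, 3]]
b = [3, 5]
x = [1, 1]        # A x = [3, 4], so the second equation is off by 1
x_ref = [1, 2]    # stand-in for a reference solver's answer
res_max, res_avg = max_and_avg(residuals(a, x, b))
diff_max, diff_avg = max_and_avg(differences(x, x_ref))
```

A small residual does not by itself guarantee a small forward error when κ(A) is large, which is why the tables report both.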
12
MATLAB Matrix Gallery
Special Matrix                                              Avg Matlab  Residual   Matlab Residual
clement: Tridiagonal matrix with zero diagonal entries      1.40E-05    7.43E+133  7.85E+144
lehmer: Symmetric positive definite matrix                  2.49E-06    7.20E-09   3.89E-06
circul: Circulant matrix                                    3.23E-08    1.53E-13   1.04E-09
chebspec: Chebyshev spectral differentiation matrix         9.12E-02    3.74E+04   2.0E-01
lesp: Tridiagonal matrix with real, sensitive eigenvalues   9.56E-11    5.11E-16   7.30E-10
minij: Symmetric positive definite matrix                   5.14E-10    1.71E-08   6.59E-06
orthog: Orthogonal and nearly orthogonal matrices           1.03E-07    1.09E-14   2.80E-08
randjorth: Random J-orthogonal matrix                       1.55E-04    1.68E-00   1.13E-04
14
Serial Performance
Results support the theoretical ~2.5x complexity ratio
15
Algorithm Processing Flow
16
Overview of Parallel Implementation
17
Parallel Implementation (cont’d)
18
Communication Complexity
Two phases of parallel communication:
- Parallel Chio’s
- Gather Columns
Overall Bandwidth
N: original matrix size, P: number of processors, F: gather columns size
19
Communication Overhead
20
Where’s the Breakeven Point?
Point at which communication “dead time” matches computational workload
Assuming d_C = 0.05 and N = 1000, the breakeven point would be P ~ 142 processors
21
Closing Thoughts …
- Proposed O(N^3) Cramer’s Rule method
- Significantly lower communications overhead
  - Many more “broadcasts” than “unicasts”
  - Communication is a function of problem size, not number of processors
Next steps …
- Optimize parallel implementation
- Sparse matrix version
22
Thank you