Symmetric Minimum Priority Ordering for Sparse Unsymmetric Factorization
Patrick Amestoy, ENSEEIHT-IRIT (Toulouse)
Sherry Li, LBNL/NERSC (Berkeley)
Esmond Ng, LBNL/NERSC (Berkeley)
ERCIM-Rennes, Feb,
Contents
- Motivation
- Graph models for elimination
- Minimum priority metrics
- Preliminary results
- Summary
Motivation -- New LU Factorization Algorithms
- Inexpensive pre/post-processing
  - Equilibration (or scaling)
  - Pre-permute rows or columns of A to maximize its diagonal
    - Find a matching with maximum weight for bipartite graph of A
    - Example: MC64 [Duff/Koster ’99]
  - Iterative refinement
- GESP (static pivoting) [Li/Demmel ’98, SuperLU_DIST]
  - Pivots are chosen from the diagonal
  - Allow half-precision perturbation of small diagonals
- Unsymmetrized multifrontal [Amestoy/Puglisi ’00, MA41_NEW]
  - Prefer diagonal pivoting, but threshold pivoting is possible
  - Allow unsymmetric fronts, but dependency graph is still a tree
- Diagonal is (almost) good
  - Struct(L’) Struct(U)
Existing Ordering Strategies for Preserving Sparsity
- Symmetric ordering algorithms on A’+A
  - Minimum priority, e.g., minimum degree, minimum deficiency, etc.
  - Graph partitioning
  - Hybrid
- Problem: unsymmetric structure is not respected!
Ordering Algorithms Revisited
- Markowitz [1957] for unsymmetric matrices
  - At step k, pick a pivot in the trailing submatrix so that:
    - It has minimum Markowitz count (r_i - 1)(c_j - 1), where r_i and c_j are the nonzero counts of its row and column, and
    - It is bounded by a numerical threshold
  - Bound the size of the rank-1 update matrix
  - Expensive to implement because it is mixed with numerical concerns
  - Examples: MA48 (HSL), etc.
  - “Restricted” Markowitz -- only look ahead at a few candidate columns (rows) with the lowest degrees [Zlatev ’80]
- Minimum degree [Tinney/Walker ’67]
  - Special case of Markowitz for SPD systems
  - Efficient implementation, because:
    - Diagonal is good as numerical pivot
    - Use quotient graph as a compact representation, without regard to numerical values
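The Markowitz step described on this slide can be illustrated with a minimal sketch. It is not the MA48 implementation: the function name `markowitz_pivot`, the dictionary storage, the column-relative threshold test |a_ij| >= u * max_k |a_kj|, and the default u = 0.1 are all assumptions made for illustration. The cost used is the standard Markowitz count (r_i - 1)(c_j - 1).

```python
def markowitz_pivot(A, active_rows, active_cols, u=0.1):
    """Pick a pivot (i, j) with minimum Markowitz count among acceptable entries.

    A maps (row, col) -> nonzero value (hypothetical storage, for illustration).
    An entry is acceptable if |a_ij| >= u * max_k |a_kj| over active column j.
    Returns ((i, j), cost), or (None, None) if no entry qualifies.
    """
    row_count = {i: 0 for i in active_rows}
    col_count = {j: 0 for j in active_cols}
    col_max = {j: 0.0 for j in active_cols}
    # Count nonzeros and record the largest magnitude per column
    # in the trailing (active) submatrix.
    for (i, j), v in A.items():
        if i in row_count and j in col_count and v != 0:
            row_count[i] += 1
            col_count[j] += 1
            col_max[j] = max(col_max[j], abs(v))

    best, best_cost = None, None
    for (i, j), v in sorted(A.items()):
        if i not in row_count or j not in col_count or v == 0:
            continue
        if abs(v) < u * col_max[j]:
            continue  # rejected by the numerical threshold
        cost = (row_count[i] - 1) * (col_count[j] - 1)
        if best_cost is None or cost < best_cost:
            best, best_cost = (i, j), cost
    return best, best_cost


A = {(0, 0): 4.0, (0, 1): 1.0, (0, 2): 1.0,
     (1, 0): 1.0, (1, 1): 0.01,
     (2, 0): 1.0, (2, 2): 2.0}
print(markowitz_pivot(A, {0, 1, 2}, {0, 1, 2}, u=0.1))
print(markowitz_pivot(A, {0, 1, 2}, {0, 1, 2}, u=0.0))
```

With the threshold active, the tiny entry at (1, 1) is rejected even though its Markowitz count is minimal. This coupling between structure and numerical values is what makes the scheme expensive compared with a purely structural minimum degree ordering.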
Simulation Result
- Order(A) vs. Order(A’+A) (Markowitz vs. min degree)
- Diagonal pivoting
- 88 unsymmetric matrices
  - Mean fill ratio 0.90
  - Mean flops ratio 0.79
- 54 very unsymmetric (symmetry <= 0.5)
  - Mean fill ratio 0.85
  - Mean flops ratio 0.56
Elimination Rules
- Symmetric
  - Undirected graph
  - After vertex i is eliminated, all its neighbors become a clique
- Unsymmetric
  - Bipartite graph
  - After vertex i is eliminated, all the row and column vertices adjacent to i become fully connected -- a “clique” (assuming diagonal pivot)
[Figure: row vertices r1, r2 and column vertices c1, c2, c3 adjacent to i, before and after eliminating i]
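The two elimination rules can be written down directly on explicit graphs. The sketch below is illustrative only: the function names and the set-based storage are assumptions, and the unsymmetric case assumes the pivot is the diagonal entry (i, i), as on the slide.

```python
from itertools import combinations


def eliminate_symmetric(adj, i):
    """Eliminate vertex i from an undirected graph {vertex: set(neighbors)}.

    The neighbors of i become a clique. The graph is updated in place.
    """
    nbrs = adj.pop(i)
    for v in nbrs:
        adj[v].discard(i)
    for a, b in combinations(nbrs, 2):
        adj[a].add(b)
        adj[b].add(a)
    return adj


def eliminate_bipartite(row_adj, i):
    """Eliminate diagonal pivot (i, i) from a bipartite graph {row: set(columns)}.

    Every row adjacent to column i becomes adjacent to every column
    adjacent to row i. The graph is updated in place.
    """
    pivot_cols = row_adj.pop(i) - {i}                          # columns in pivot row
    pivot_rows = [r for r, cols in row_adj.items() if i in cols]  # rows in pivot column
    for r in pivot_rows:
        row_adj[r].discard(i)
        row_adj[r] |= pivot_cols
    return row_adj


sym = {0: {1, 2}, 1: {0}, 2: {0, 3}, 3: {2}}
print(eliminate_symmetric(sym, 0))

unsym = {0: {0, 2, 3}, 1: {0, 1}, 2: {0, 2}, 3: {3}}
print(eliminate_bipartite(unsym, 0))
```

In the unsymmetric example, rows 1 and 2 each gain the columns of the pivot row, while row 3, which has no entry in column 0, is untouched. Storing these explicit graphs is simple but the fill edges can make them grow, which is the motivation for the quotient graph representation.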
Cost of Implementation
- Elimination models can be implemented using standard graphs or quotient graphs, with different costs in time & space.
Quotient Graph -- Symmetric
- Elements -- representative nodes of the connected components in the eliminated subgraph
- Variables -- uneliminated nodes
- Current pivot p:
  - If variable v is adjacent to e1, it will be adjacent to p
  - e1 can be absorbed by p
  - p is the representative of the connected component {e1, e2, p}
[Figure: pivot p with element list {e1, e2} and a variable list containing v]
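Element absorption in the symmetric quotient graph can be sketched as follows. This is a simplified illustration, not the AMD code: the class name `QuotientGraph` and its attributes are hypothetical, and supervariables, mass elimination, and approximate degrees are omitted.

```python
class QuotientGraph:
    """Minimal symmetric quotient graph (illustrative sketch)."""

    def __init__(self, adj):
        # adj: {vertex: set(neighbors)} for the original undirected graph
        self.var_adj = {v: set(n) for v, n in adj.items()}  # variable -> variables
        self.var_elems = {v: set() for v in adj}            # variable -> elements
        self.elem_vars = {}                                 # element -> variable list

    def reach(self, v):
        """Neighbors of variable v in the implicit elimination graph."""
        out = set(self.var_adj[v])
        for e in self.var_elems[v]:
            out |= self.elem_vars[e]
        out.discard(v)
        return out

    def eliminate(self, p):
        """Turn variable p into an element and absorb its adjacent elements."""
        Lp = self.reach(p)                 # variable list of the new element p
        absorbed = self.var_elems.pop(p)   # elements adjacent to p
        del self.var_adj[p]
        for e in absorbed:
            del self.elem_vars[e]          # e is now represented by p
        self.elem_vars[p] = Lp
        for v in Lp:
            self.var_adj[v] -= Lp          # these edges are implied by element p
            self.var_adj[v].discard(p)
            self.var_elems[v] -= absorbed
            self.var_elems[v].add(p)
        return Lp


qg = QuotientGraph({0: {1, 2}, 1: {0, 3}, 2: {0, 3}, 3: {1, 2}})
qg.eliminate(0)            # vertex 0 becomes an element with variables {1, 2}
qg.eliminate(1)            # element 0 is absorbed by the new element 1
print(sorted(qg.elem_vars), qg.reach(2))
```

Because every variable reaches its neighbors through at most one element, the path length is at most 2, and the storage never exceeds that of the original graph.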
Quotient Graph -- Unsymmetric
- Current pivot p:
  - Difficulty: path length may be greater than 2!
[Figure: elements e1, e2, pivot p, and variable v]
Quotient Graph -- “Local Symmetrization”
- Current pivot p:
  - Advantage:
    - Path length bounded by 2!
  - Disadvantage:
    - Lose some asymmetry
    - More fill
[Figure: elements e1, e2, pivot p, and variable v; entries marked x and s]
Minimum Priority Metrics
- Metrics are based on “approximate degree” in the sense of AMD, so they can be implemented efficiently
- Almost the same cost using various metrics:
  - Based on row & column counts: PRODUCT (a.k.a. Markowitz), SUM, MIN, MAX, etc.
  - Minimum fill: areas associated with the existing cliques are deducted
  - ...
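The count-based metrics differ only in how the row and column counts of a candidate are combined, which is why their costs are nearly identical. The sketch below is an assumption-laden illustration: the names `METRICS` and `pick_pivot` are hypothetical, the counts are taken as given (in practice they would be approximate degrees), the tie-breaking rule is arbitrary, and the minimum fill metric is not covered.

```python
# Each metric maps (row_count, col_count) of a candidate pivot to a priority.
METRICS = {
    "PRODUCT": lambda r, c: r * c,   # a.k.a. Markowitz
    "SUM": lambda r, c: r + c,
    "MIN": lambda r, c: min(r, c),
    "MAX": lambda r, c: max(r, c),
}


def pick_pivot(counts, metric):
    """Return the candidate with the smallest priority under the given metric.

    counts: {vertex: (row_count, col_count)}.
    Ties are broken by the smallest vertex id (an arbitrary choice).
    """
    score = METRICS[metric]
    return min(sorted(counts), key=lambda v: score(*counts[v]))


counts = {0: (1, 6), 1: (3, 3), 2: (2, 5)}
for name in METRICS:
    print(name, pick_pivot(counts, name))
```

On this small example PRODUCT and MIN select vertex 0, while SUM and MAX select vertex 1, so the choice of metric can change the ordering even when the counts are identical.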
Preliminary Results with Local Symmetrization
- Matrices: 98 unsymmetric in structure
- Metrics: based on row/column counts or fill
- Solvers:
  - MA41_NEW: unsymmetrized multifrontal
    - Local symmetrization ordering is ideal for this solver
  - SuperLU_DIST: GESP
Compare Different Metrics
- Solver: MA41_NEW
- Average fill ratio using various metrics with respect to Markowitz (product of row & col counts)
Compare with AMD(A’+A) using Min Fill -- All Unsymmetric
[Figure: two panels, MA41_NEW and SuperLU_DIST]
Compare with AMD(A’+A) using Min Fill -- Very Unsymmetric
[Figure: two panels, MA41_NEW and SuperLU_DIST]
Summary
- First implementation based on BQG (bipartite quotient graph) model
  - Features: supervariable, element absorption, mass elimination
  - Using approximate degree (degree upper bound)
- Tried various metrics on a large collection of matrices
  - PRODUCT, SUM, MIN-FILL, etc.
  - No single one is universally best; MIN-FILL is often better
- Local symmetrization
  - Cheaper to implement, harder to understand behavior
  - Especially suitable for unsymmetrized multifrontal; also benefits GESP
  - Respectable gain for very unsymmetric matrices
Summary (cont’d)
- Results for very unsymmetric matrices
- Future work
  - Work underway for a fully unsymmetric version
  - Extend to graph partitioning strategy
The End
Example
[Figure: sparsity pattern of matrix A and its bipartite graph G(A), with row and column vertices]