TECHNIQUES ALGORITHMS AND SOFTWARE FOR THE DIRECT SOLUTION OF LARGE SPARSE LINEAR EQUATIONS.

Slides:

Advertisements

Similar presentations

Fill Reduction Algorithm Using Diagonal Markowitz Scheme with Local Symmetrization Patrick Amestoy ENSEEIHT-IRIT, France Xiaoye S. Li Esmond Ng Lawrence.

Advertisements

A Discrete Adjoint-Based Approach for Optimization Problems on 3D Unstructured Meshes Dimitri J. Mavriplis Department of Mechanical Engineering University.

Algebraic, transcendental (i.e., involving trigonometric and exponential functions), ordinary differential equations, or partial differential equations...

ECE 552 Numerical Circuit Analysis Chapter Four SPARSE MATRIX SOLUTION TECHNIQUES Copyright © I. Hajj 2012 All rights reserved.

MATH 685/ CSI 700/ OR 682 Lecture Notes

Solving Linear Systems (Numerical Recipes, Chap 2)

Lecture 7 Intersection of Hyperplanes and Matrix Inverse Shang-Hua Teng.

Systems of Linear Equations

SOLVING SYSTEMS OF LINEAR EQUATIONS. Overview A matrix consists of a rectangular array of elements represented by a single symbol (example: [A]). An individual.

CISE301_Topic3KFUPM1 SE301: Numerical Methods Topic 3: Solution of Systems of Linear Equations Lectures 12-17: KFUPM Read Chapter 9 of the textbook.

Numerical Algorithms Matrix multiplication

Solution of linear system of equations

Chapter 2, Linear Systems, Mainly LU Decomposition.

1 On The Use Of MA47 Solution Procedure In The Finite Element Steady State Heat Analysis Ana Žiberna and Dubravka Mijuca Faculty of Mathematics Department.

1cs542g-term Notes  Assignment 1 will be out later today (look on the web)

1cs542g-term Notes  Assignment 1 is out (questions?)

Linear Algebraic Equations

Special Matrices and Gauss-Siedel

The Landscape of Ax=b Solvers Direct A = LU Iterative y’ = Ay Non- symmetric Symmetric positive definite More RobustLess Storage (if sparse) More Robust.

CS 240A: Solving Ax = b in parallel °Dense A: Gaussian elimination with partial pivoting Same flavor as matrix * matrix, but more complicated °Sparse A:

ECIV 520 Structural Analysis II Review of Matrix Algebra.

Ordinary least squares regression (OLS)

ECIV 301 Programming & Graphics Numerical Methods for Engineers REVIEW II.

Monica Garika Chandana Guduru. METHODS TO SOLVE LINEAR SYSTEMS Direct methods Gaussian elimination method LU method for factorization Simplex method of.

ECE 530 – Analysis Techniques for Large-Scale Electrical Systems

Systems of Linear Equations

1 1.1 © 2012 Pearson Education, Inc. Linear Equations in Linear Algebra SYSTEMS OF LINEAR EQUATIONS.

Chapter 8 Objectives Understanding matrix notation.

CS 290H Lecture 12 Column intersection graphs, Ordering for sparsity in LU with partial pivoting Read “Computing the block triangular form of a sparse.

Scientific Computing Linear Systems – LU Factorization.

MUMPS A Multifrontal Massively Parallel Solver IMPLEMENTATION Distributed multifrontal.

Advanced Computer Graphics Spring 2014 K. H. Ko School of Mechatronics Gwangju Institute of Science and Technology.

MA2213 Lecture 5 Linear Equations (Direct Solvers)

Using LU Decomposition to Optimize the modconcen.m Routine Matt Tornowske April 1, 2002.

Scalabilities Issues in Sparse Factorization and Triangular Solution Sherry Li Lawrence Berkeley National Laboratory Sparse Days, CERFACS, June 23-24,

Fast Low-Frequency Impedance Extraction using a Volumetric 3D Integral Formulation A.MAFFUCCI, A. TAMBURRINO, S. VENTRE, F. VILLONE EURATOM/ENEA/CREATE.

Chapter 3 Solution of Algebraic Equations 1 ChE 401: Computational Techniques for Chemical Engineers Fall 2009/2010 DRAFT SLIDES.

EE616 Dr. Janusz Starzyk Computer Aided Analysis of Electronic Circuits Innovations in numerical techniques had profound import on CAD: –Sparse matrix.

Lecture 8 Matrix Inverse and LU Decomposition

JAVA AND MATRIX COMPUTATION

Solution of Sparse Linear Systems

Lecture 4 Sparse Factorization: Data-flow Organization

Parallel Solution of the Poisson Problem Using MPI

Direct Methods for Sparse Linear Systems Lecture 4 Alessandra Nardi Thanks to Prof. Jacob White, Suvranu De, Deepak Ramaswamy, Michal Rewienski, and Karen.

Direct Methods for Linear Systems Lecture 3 Alessandra Nardi Thanks to Prof. Jacob White, Suvranu De, Deepak Ramaswamy, Michal Rewienski, and Karen Veroy.

Chapter 9 Gauss Elimination The Islamic University of Gaza

Linear Algebra Libraries: BLAS, LAPACK, ScaLAPACK, PLASMA, MAGMA

ECE 530 – Analysis Techniques for Large-Scale Electrical Systems Prof. Hao Zhu Dept. of Electrical and Computer Engineering University of Illinois at Urbana-Champaign.

1 Chapter 7 Numerical Methods for the Solution of Systems of Equations.

CS 290H Lecture 15 GESP concluded Final presentations for survey projects next Tue and Thu 20-minute talk with at least 5 min for questions and discussion.

Linear Systems Dinesh A.

Programming Massively Parallel Graphics Multiprocessors using CUDA Final Project Amirhassan Asgari Kamiabad

CS 290H Lecture 9 Left-looking LU with partial pivoting Read “A supernodal approach to sparse partial pivoting” (course reader #4), sections 1 through.

ECE 530 – Analysis Techniques for Large-Scale Electrical Systems Prof. Hao Zhu Dept. of Electrical and Computer Engineering University of Illinois at Urbana-Champaign.

Linear Algebra Libraries: BLAS, LAPACK, ScaLAPACK, PLASMA, MAGMA Shirley Moore CPS5401 Fall 2013 svmoore.pbworks.com November 12, 2012.

Symmetric-pattern multifrontal factorization T(A) G(A)

Unit #1 Linear Systems Fall Dr. Jehad Al Dallal.

Conjugate gradient iteration One matrix-vector multiplication per iteration Two vector dot products per iteration Four n-vectors of working storage x 0.

Parallel Direct Methods for Sparse Linear Systems

PIVOTING The pivot or pivot element is the element of a matrix, or an array, which is selected first by an algorithm (e.g. Gaussian elimination, simplex.

Chapter 2, Linear Systems, Mainly LU Decomposition

Numerical Linear Algebra

Spring Dr. Jehad Al Dallal

CS 290N / 219: Sparse Matrix Algorithms

Linear Equations.

Chapter 10: Solving Linear Systems of Equations

Linear Systems Numerical Methods.

Lecture 8 Matrix Inverse and LU Decomposition

Ax = b Methods for Solution of the System of Equations (ReCap):

Presentation transcript:

TECHNIQUES ALGORITHMS AND SOFTWARE FOR THE DIRECT SOLUTION OF LARGE SPARSE LINEAR EQUATIONS

Wish to solve Ax = b A is LARGE and S P A R S E

A LARGE... ORDER n(t) t n SPARSE... NUMBER ENTRIES kn k ~ 2 - log n

NUMERICAL APPLICATIONS Stiff ODEs … BDF … Sparse Jacobian Linear Programming … Simplex Optimization/Nonlinear Equations Elliptic Partial Differential Equations Eigensystem Solution Two Point Boundary Value Problems Least Squares Calculations

APPLICATION AREAS PhysicsCFD Lattice gauge Atomic spectra ChemistryQuantum chemistry Chemical engineering Civil EngineeringStructural analysis Electrical EngineeringPower systems Circuit simulation Device simulation GeographyGeodesy DemographyMigration EconomicsEconomic modelling Behavioural sciencesIndustrial relations PoliticsTrading PsychologySocial dominance Business administrationBureaucracy Operations researchLinear Programming

THERE FOLLOWS PICTURES OF SPARSE MATRICES FROM VARIOUS APPLICATIONS This is done to illustrate different structures for sparse matrices

STANDARD SETS OF SPARSE MATRICES Original set Harwell-Boeing Sparse Matrix Collection Available by anonymous ftp from: ftp.numerical.rl.ac.uk or from the Web page: Extended set of test matrices available from: Tim Davis

RUTHERFORD-BOEING SPARSE MATRIX TEST COLLECTION Unassembled finite-element matrices Least-squares problems Unsymmetric eigenproblems Large unsymmetric matrices Very large symmetric matrices Singular Systems See Report: The Rutherford-Boeing Sparse Matrix Collection Iain S Duff, Roger G Grimes and John G Lewis Report RAL-TR , Rutherford Appleton Laboratory To be distributed via matrix market

RUTHERFORD - BOEING SPARSE MATRIX COLLECTION Anonymous FTP to FTP.CERFACS.FR cd pub/algo/matrices/rutherford_boeing OR

A x = b has solution x = A -1 b notation only. Do not use or even think of inverse of A. For Sparse A, A -1 is usually dense [egA tridiagonal arrowhead ]

DIRECT METHODS Gaussian Elimination PAQ  LU Permutations P and Q chosen to preserve sparsity and maintain stability L : Lower triangular (sparse) U : Upper triangular (sparse) SOLVE: Ax = b by Ly = Pb then UQx = y

PHASES OF SPARSE DIRECT SOLUTION Although the exact subdivision of tasks for sparse direct solution will depend on the algorithm and software being used, a common subdivision is given by: ANALYSE An analysis phase where the matrix is analysed to produce a suitable ordering and data structures for efficient factorization. FACTORIZE A factorization phase where the numerical factorization is performed. SOLVE A solve phase where the factors are used to solve the system using forward and backward substitution. We note the following: ANALYSE is sometimes preceded by a preordering phase to exploit structure. For general unsymmetric systems, the ANALYSE and FACTORIZE phases are sometimes combined to ensure the ordering does not compromise stability. Note that the concept of separate FACTORIZE and ANALYSE phases is not present for dense systems.

STEPS MATCH SOLUTION REQUIREMENTS 1. One-off solution Ax = b A/F/S 2. Sequence of solutions [Matrix changes but structure is invarient] A 1 x 1 = b 1 A 2 x 2 = b 2 A/F/S/F/S/F/S A 3 x 3 = b 3 3. Sequence of solutions [Same matrix] Ax 1 = b 1 Ax 2 = b 2 A/F/S/S/S Ax 3 = b 3 For example …. 2. Solution of nonlinear system A is Jacobian 3. Inverse iteration

TIMES (MA48) FOR ‘TYPICAL’ EXAMPLE [Seconds on CRAY YMP … matrix GRE 1107] ANALYSE/FACTORIZE.66 FACTORIZE.075 SOLVE.0068

DIRECT METHODS Easy to package High accuracy Method of choice in many applications Not dramatically affected by conditioning Reasonably independent of structure However High time and storage requirement Typically limited to n ~ So Use on subproblem Use as preconditioner

COMBINING DIRECT AND ITERATIVE METHODS … can be thought of as sophisticated preconditioning. Multigrid … using direct method as coarse grid solver. Domain Decomposition … using direct method on local subdomains and ‘direct’ preconditioner on interface. Block Iterative Methods … direct solver on sub-blocks Partial Factorization as Preconditioner ‘Full’ Factorization of Nearby Problem as a Preconditioner.

FOR EFFICIENT SOLUTION OF SPARSE EQUATIONS WE MUST Only store nonzeros (or exceptionally a few zeros also) Only perform arithmetic with nonzeros Preserve sparsity during computation

COMPLEXITY of LU factorization on dense matrices of order n 2/3 n 3 + 0(n 2 ) flops n 2 storage For band (order n, bandwidth k) 2 k 2 n work, nk storage Five-diagonal matrix (on k x k grid) 0 (k 3 ) work and 0 (k 2 log k) storage Tridiagonal + Arrowhead matrix 0 (n) work + storage Target 0 (n) + 0 (  ) for sparse matrix of order n with  entries

GRAPH THEORY We form a graph on n vertices associated with our matrix of order n. Edge (i, j) exists in the graph if and only if entry a ij (and, by symmetry a ji ) in the matrix is non-zero. Example x x x x x x x x x x x Symmetric Matrix and Associated Graph x 6 x x x x x

MAIN BENEFITS OF USING GRAPH THEORY IN STUDY OF SYMMETRIC GAUSSIAN ELIMINATION 1. Structure of graph is invariant under symmetric permutations of the matrix (corresponds to re- labelling of vertices). 2. For mesh problems, there is usually an equivalence between the mesh and the graph associated with the resulting matrix. We thus work directly with the underlying structure. 3. We can represent cliques in graphs by listing vertices in a clique without storing all the interconnecting edges.

COMPLEXITY FOR MODEL PROBLEM Model Problem with n variables Ordering Time0 (1).. 0 (n) Symbolic0 (n) Factorization Numerical0 (n 3/2 ) Solve0 (n log 2 n) and big 0 is not so big

Ordering for sparsity … unsymmetric matrices * x x x v v r i = 3, c j = 4 Minimize (r i - 1) * (c j - 1) MARKOWITZ ORDERING (Markowitz 1957) Choose nonzero satisfying numerical criterion with best Markowitz count

Benefits of Sparsity on Matrix of Order 2021 with 7353 entries Total Storage Flops Time (secs) Procedure (Kwords) (10 6 ) CRAY J90 Treating system as dense Storing and operating only on nonzero entries Using sparsity pivoting

An example of a code that uses Markowitz and Threshold Pivoting is the HSL Code MA48

Comparison between sparse (MA48 from HSL) and dense (SGESV from LAPACK). Times for factorization/solution in seconds on CRAY Y-MP.

HARWELL SUBROUTINE LIBRARY Started at Harwell in 1963 Most routines from research work of group Particular strengths in: - Sparse Matrices - Optimization - Large-Scale system solution Since 1990 Collaboration between RAL and Harwell HSL October 2000 URL:

Wide Range of Sparse Direct Software Markowitz + ThresholdMA48 Band - Variable Band (Skyline)MA55 Frontal - SymmetricMA62 - UnsymmetricMA42 Multifrontal - Symmetric MA57 and MA67 - UnsymmetricMA41 - Very UnsymmetricMA38 - Unsymmetric elementMA46 Supernodal SUPERLU

Features of MA48 Uses Markowitz with threshold privoting Factorizes rectangular and singular systems Block Triangularization - Done at ‘higher’ level - Internal routines work only on single block Switch to full code at all phases of solution Three factorize entries - Only pivot order is input - Fast factorize.. Uses structures generated in first factorize - Only factorizes later columns of matrix Iterative refinement and error estimator in Solve phase Can employ drop tolerance ‘Low’ storage in analysis phase

TWO PROBLEMS WITH MARKOWITZ & THRESHOLD 1. Code Complexity  Implementation Efficiency 2. Exploitation of Parallelism

TYPICAL INNERMOST LOOP DO 590 JJ = J1, J2 *J = ICN (JJ) IF (IQ(J). GT.0) GO TO 590 IOP = IOP + 1 *PIVROW = IJPOS - IQ (J) A(JJ) = A(JJ) + AU x A (PIVROW) IF (LBIG) BIG = DMAXI (DABS(A(JJ)), BIG) IF (DABS (A(JJ)). LT. TOL) IDROP = IDROP + 1 ICN (PIVROW) = ICN (PIVROW) 590 CONTINUE

INDIRECT ADDRESSING Increases memory traffic Results in bad data distribution and non- localization of data And non-regularity of access Suffers from short ‘vector’ lengths

SOLUTION Isolate indirect addressing W(IND(I)) from inner loops(s). Use full/dense matrix kernels (Level 2 & 3 BLAS) in innermost loops.

MAIN PROBLEM WITH PARALLEL IMPLEMENTATION IS THAT SPARSE DIRECT TECHNIQUES 1. REDUCE GRANULARITY 2. INCREASE SYNCHRONIZATION 3. INCREASE FRAGMENTATION OF DATA

PARALLEL IMPLEMENTATION X Obtain by choosing entries so that no two lie in same row or column. Then do rank k update.

TECHNIQUES FOR USING LEVEL 3 DENSE BLAS IN SPARSE FACTORIZATION Frontal methods (1971) Multifrontal methods (1982) Supernodal methods (1985) All pre-dated BLAS in 1990

BLAST BLAS TECHNICAL FORUM Web Page: has both: Dense and Sparse BLAS