Trilinos 102: Advanced Concepts November 7, 2007 8:30-9:30 a.m. Mike Heroux Jim Willenbring.



Overview  How to Create a Trilinos (Compatible) Package  Adding Files to the Build System and Tarball  Adding Configure Options  Using Makefile.export for Tests and Examples  2D Objects.  Parallel Data Redistribution.

Outline  Creating Objects.  2D Objects.  Teuchos tidbits.  Performance Optimizations.

How to Create a Trilinos (Compatible) Package  Two primary cases  Using Autotools with an existing package  Starting a new package using Autotools  Both cases are similar  In either case, the package might be  Stand alone  Used via Trilinos/packages/external  Added to Trilinos

How to Create a Trilinos (Compatible) Package  Look at the new_package package  Customize the following files for your package  configure.ac  Makefile.am  src/Makefile.am  test/Makefile.am  example/Makefile.am  Makefile.export..in  (Some of the necessary changes can be made using scripts supplied by new_package)  Additional instructions are supplied with new_package

Adding Files to the Build System and Tarball
 Library source files:
    CORE = \
      $(srcdir)/Epetra_BLAS.cpp \
      ...
      $(srcdir)/Epetra_Object.cpp
    CORE_H = \
      $(srcdir)/Epetra_BLAS.h \
      …
      $(srcdir)/Epetra_ConfigDefs.h
 Conditionally compiled files listed with ‘EXTRA_’ prefix
 Don’t forget to list header files!

 Makefile.am/.in:
   Add the directory the new files are in to ‘SUBDIRS’ in the Makefile.am one level up
      SUBDIRS = DIR1 DIR2
   Add the Makefile that will be generated to ‘AC_CONFIG_FILES’ in configure.ac
      AC_CONFIG_FILES([Makefile … src/Makefile …])
   Don’t forget to ‘cvs add’ both files
   ./bootstrap
 Other types of files (scripts, plain text, etc):
   Add the name of the file to EXTRA_DIST
      EXTRA_DIST = script1 README …
   ./bootstrap

Adding Configure Options  TAC_ARG_ENABLE_CAN_USE_PACKAGE(epetra, teuchos, …)  ‘#ifdef HAVE_EPETRA_TEUCHOS’ in source code  TAC_ARG_ENABLE_FEATURE_SUB( epetra, abc, …)  ‘#ifdef HAVE_EPETRA_ARRAY_BOUNDS_CHECK’ in source

Adding Configure Options  TAC_ARG_WITH_PACKAGE(zoltan, [Enable Zoltan interface support], ZOLTAN, no)  AM_CONDITIONAL(HAVE_ZOLTAN, [test "X$ac_cv_use_zoltan" != "Xno"]) ‘if HAVE_ZOLTAN’ in Makefile.am  AC_SEARCH_LIBS(pow,[m],,AC_MSG_ERROR(Cannot find math library))

Using Makefile.export for Tests / Examples
 Makefile.am:
    include $(top_builddir)/Makefile.export.epetra
    EXEEXT = .exe
    noinst_PROGRAMS = CrsMatrix_test
    CrsMatrix_test_SOURCES = $(srcdir)/cxx_main.cpp
    CrsMatrix_test_DEPENDENCIES = $(top_builddir)/src/libepetra.a
    CrsMatrix_test_CXXFLAGS = $(EPETRA_INCLUDES)
    CrsMatrix_test_LDADD = $(EPETRA_LIBS)

LAL Foundation: Petra  Petra provides a “common language” for distributed linear algebra objects (operator, matrix, vector)  Petra provides distributed matrix and vector services.  Has 3 implementations under development.

Petra Object Model
 Perform redistribution of distributed objects: Parallel permutations. “Ghosting” of values for local computations. Collection of partial results from remote processors.
 Abstract Interface to Parallel Machine: Shameless mimic of MPI interface. Keeps MPI dependence to a single class (through all of Trilinos!). Allows trivial serial implementation. Opens door to novel parallel libraries (shmem, UPC, etc.).
 Abstract Interface for Sparse All-to-All Communication: Supports construction of pre-recorded “plan” for data-driven communications. Examples: gathering/scattering of off-processor x/y values when computing y = Ax; gathering overlap rows for Overlapping Schwarz; redistribution of matrices, vectors, etc.
 Describes layout of distributed objects: Vectors: number of vector entries on each processor and global ID. Matrices/graphs: rows/columns managed by a processor. Called “Maps” in Epetra.
 Dense Distributed Vector and Matrices: Simple local data structure. BLAS-able, LAPACK-able. Ghostable, redistributable. RTOp-able.
 Base Class for All Distributed Objects: Performs all communication. Requires Check, Pack, Unpack methods from derived class.
 Graph class for structure-only computations: Reusable matrix structure. Pattern-based preconditioners. Pattern-based load balancing tools.
 Basic sparse matrix class: Flexible construction process. Arbitrary entry placement on parallel machine.

Petra Implementations  Three version under development:  Epetra (Essential Petra):  Current production version.  Restricted to real, double precision arithmetic.  Uses stable core subset of C++ (circa 2000).  Interfaces accessible to C and Fortran users.  Tpetra (Templated Petra):  Next generation C++ version.  Templated scalar and ordinal fields.  Uses namespaces, and STL: Improved usability/efficiency.  Jpetra (Java Petra):  Pure Java. Portable to any JVM.  Interfaces to Java versions of MPI, LAPACK and BLAS via interfaces.

Details about Epetra Maps  Note: Focus on Maps (not BlockMaps).  Getting beyond standard use case…

1-to-1 Maps  1-to-1 map (defn): A map is 1-to-1 if each GID appears only once in the map (and is therefore associated with only a single processor).  Certain operations in parallel data repartitioning require 1- to-1 maps. Specifically:  The source map of an import must be 1-to-1.  The target map of an export must be 1-to-1.  The domain map of a 2D object must be 1-to-1.  The range map of a 2D object must be 1-to-1.

2D Objects: Four Maps
 Epetra 2D objects:
   CrsMatrix, FECrsMatrix
   CrsGraph
   VbrMatrix, FEVbrMatrix
 Have four maps:
   RowMap: On each processor, the GIDs of the rows that processor will “manage”. Typically a 1-to-1 map.
   ColMap: On each processor, the GIDs of the columns that processor will “manage”. Typically NOT a 1-to-1 map.
   DomainMap: The layout of domain objects (the x vector/multivector in y=Ax). Must be a 1-to-1 map!
   RangeMap: The layout of range objects (the y vector/multivector in y=Ax). Must be a 1-to-1 map!

Sample Problem: y = Ax

Case 1: Standard Approach
 First 2 rows of A, first 2 elements of y and first 2 elements of x, kept on PE 0.
 Last row of A, last element of y and last element of x, kept on PE 1.
PE 0 Contents:
   RowMap = {0, 1}
   ColMap = {0, 1, 2}
   DomainMap = {0, 1}
   RangeMap = {0, 1}
PE 1 Contents:
   RowMap = {2}
   ColMap = {1, 2}
   DomainMap = {2}
   RangeMap = {2}
Notes:
   Rows are wholly owned.
   RowMap=DomainMap=RangeMap (all 1-to-1).
   ColMap is NOT 1-to-1.
   Call to FillComplete: A.FillComplete(); // Assumes DomainMap = RangeMap = RowMap
Original problem: y = Ax

Case 2: Twist 1
 First 2 rows of A, first element of y and last 2 elements of x, kept on PE 0.
 Last row of A, last 2 elements of y and first element of x, kept on PE 1.
PE 0 Contents:
   RowMap = {0, 1}
   ColMap = {0, 1, 2}
   DomainMap = {1, 2}
   RangeMap = {0}
PE 1 Contents:
   RowMap = {2}
   ColMap = {1, 2}
   DomainMap = {0}
   RangeMap = {1, 2}
Notes:
   Rows are wholly owned.
   RowMap ≠ DomainMap ≠ RangeMap (all 1-to-1).
   ColMap is NOT 1-to-1.
   Call to FillComplete: A.FillComplete(DomainMap, RangeMap);
Original problem: y = Ax

Case 2: Twist 2
 First row of A, part of second row of A, first element of y and last 2 elements of x, kept on PE 0.
 Last row, part of second row of A, last 2 elements of y and first element of x, kept on PE 1.
PE 0 Contents:
   RowMap = {0, 1}
   ColMap = {0, 1}
   DomainMap = {1, 2}
   RangeMap = {0}
PE 1 Contents:
   RowMap = {1, 2}
   ColMap = {1, 2}
   DomainMap = {0}
   RangeMap = {1, 2}
Notes:
   Rows are NOT wholly owned.
   RowMap ≠ DomainMap ≠ RangeMap (DomainMap and RangeMap are 1-to-1).
   RowMap and ColMap are NOT 1-to-1.
   Call to FillComplete: A.FillComplete(DomainMap, RangeMap);
Original problem: y = Ax

What does FillComplete Do?  A bunch of stuff.  One task is to create (if needed) import/export objects to support distributed matrix-vector multiplication:  If ColMap ≠ DomainMap, create Import object.  If RowMap ≠ RangeMap, create Export object.  A few rules:  Rectangular matrices will always require: A.FillComplete(DomainMap,RangeMap);  DomainMap and RangeMap must be 1-to-1.
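The two import/export rules can be written down as a small decision function. This is a serial sketch under stated assumptions, not what Epetra executes internally: maps are modeled as per-processor GID lists, map equality is plain list equality, and the names `ProcessGids`, `MatVecPlan` and `planMatVec` are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Toy model of a distributed map: one GID list per processor.
using ProcessGids = std::vector<std::vector<int>>;

// Which communication objects a distributed y = Ax would need.
struct MatVecPlan {
  bool needsImport;  // gather x entries: ColMap differs from DomainMap
  bool needsExport;  // combine y entries: RowMap differs from RangeMap
};

MatVecPlan planMatVec(const ProcessGids& rowMap, const ProcessGids& colMap,
                      const ProcessGids& domainMap, const ProcessGids& rangeMap) {
  // All four maps must describe the same set of processors.
  assert(rowMap.size() == colMap.size());
  assert(rowMap.size() == domainMap.size());
  assert(rowMap.size() == rangeMap.size());

  MatVecPlan plan;
  plan.needsImport = (colMap != domainMap);
  plan.needsExport = (rowMap != rangeMap);
  return plan;
}
```

Applied to the maps of Case 1, this model asks for an Import only, since the ColMap differs from the DomainMap while the RowMap equals the RangeMap. For Case 2, Twist 1 it asks for both an Import and an Export.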

Parallel Data Redistribution  Epetra vectors, multivectors, graphs and matrices are distributed via one of the map objects.  A map is basically a partitioning of a list of global IDs:  IDs are simply labels, no need to use contiguous values (Directory class handles details for general ID lists).  No a priori restriction on replicated IDs.  If we are given:  A source map and  A set of vectors, multivectors, graphs and matrices (or other distributable objects) based on source map.  Redistribution is performed by: 1.Specifying a target map with a new distribution of the global IDs. 2.Creating Import or Export object using the source and target maps. 3.Creating vectors, multivectors, graphs and matrices that are redistributed (to target map layout) using the Import/Export object.

Example: epetra/ex9.cpp
    #include <cstdlib>
    #include <iostream>
    #include "mpi.h"
    #include "Epetra_MpiComm.h"
    #include "Epetra_Map.h"
    #include "Epetra_Vector.h"
    #include "Epetra_IntSerialDenseVector.h"
    #include "Epetra_Export.h"

    int main(int argc, char *argv[]) {
      MPI_Init(&argc, &argv);
      Epetra_MpiComm Comm(MPI_COMM_WORLD);

      int NumGlobalElements = 4; // global dimension of the problem
      int NumMyElements;         // local nodes
      Epetra_IntSerialDenseVector MyGlobalElements;

      if (Comm.MyPID() == 0) {
        NumMyElements = 3;
        MyGlobalElements.Size(NumMyElements);
        MyGlobalElements[0] = 0;
        MyGlobalElements[1] = 1;
        MyGlobalElements[2] = 2;
      } else {
        NumMyElements = 3;
        MyGlobalElements.Size(NumMyElements);
        MyGlobalElements[0] = 1;
        MyGlobalElements[1] = 2;
        MyGlobalElements[2] = 3;
      }

      // create a map (GIDs 1 and 2 are listed on both processors)
      Epetra_Map Map(-1, MyGlobalElements.Length(),
                     MyGlobalElements.Values(), 0, Comm);

      // create a vector based on map
      Epetra_Vector xxx(Map);
      for (int i = 0; i < NumMyElements; ++i)
        xxx[i] = 10 * (Comm.MyPID() + 1);

      if (Comm.MyPID() == 0) {
        double val = 12;
        int pos = 3; // GID 3 is not in PE 0's part of Map
        xxx.SumIntoGlobalValues(1, 0, &val, &pos);
      }

      std::cout << xxx;

      // create a target map, in which all elements are on proc 0
      int NumMyElements_target;
      if (Comm.MyPID() == 0)
        NumMyElements_target = NumGlobalElements;
      else
        NumMyElements_target = 0;

      Epetra_Map TargetMap(-1, NumMyElements_target, 0, Comm);
      Epetra_Export Exporter(Map, TargetMap);

      // work on vectors
      Epetra_Vector yyy(TargetMap);
      yyy.Export(xxx, Exporter, Add);

      std::cout << yyy;

      MPI_Finalize();
      return EXIT_SUCCESS;
    }

Output: epetra/ex9.cpp
> mpirun -np 2 ./ex9.exe
Before Export:
  PE 0: xxx(0)=10 xxx(1)=10 xxx(2)=10
  PE 1: xxx(1)=20 xxx(2)=20 xxx(3)=20
After Export (Export/Add):
  PE 0: yyy(0)=10 yyy(1)=30 yyy(2)=30 yyy(3)=20
  PE 1: (no entries)
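The numbers in this output follow from the Add combine mode: entries that share a GID are summed on the processor that owns that GID in the target map. The sketch below is a serial, standard-library-only model of that behavior, not Epetra code. The names `LocalVector` and `exportAdd` are hypothetical, and the target map is assumed to be 1-to-1.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// One processor's piece of a distributed vector: GID -> value.
using LocalVector = std::map<int, double>;

// Toy export with Add: every source entry is sent to the processor that owns
// its GID in the target map; values arriving for the same GID are summed.
std::vector<LocalVector> exportAdd(const std::vector<LocalVector>& source,
                                   const std::vector<std::vector<int>>& targetGids) {
  assert(source.size() == targetGids.size());

  // Record the single owner of each GID and zero-initialize the target.
  std::map<int, std::size_t> owner;
  std::vector<LocalVector> target(targetGids.size());
  for (std::size_t pe = 0; pe < targetGids.size(); ++pe) {
    for (int gid : targetGids[pe]) {
      owner[gid] = pe;
      target[pe][gid] = 0.0;
    }
  }

  for (const LocalVector& piece : source) {
    for (const auto& entry : piece) {
      const auto it = owner.find(entry.first);
      if (it != owner.end()) {
        target[it->second][entry.first] += entry.second;
      }
    }
  }
  return target;
}
```

Feeding in the two pieces of xxx with a target map that places GIDs 0 through 3 on PE 0 gives 10, 30, 30, 20, matching the yyy values listed for ex9.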

Import vs. Export  Import (Export) means calling processor knows what it wants to receive (send).  Distinction between Import/Export is important to user, almost identical in implementation.  Import (Export) objects can be used to do an Export (Import) as a reverse operation.  When mapping is bijective (1-to-1 and onto), either Import or Export is appropriate.

Example: 1D Matrix Assembly
 Problem: -u_xx = f on (a, b), with u(a) = α_0 and u(b) = α_1.
 Grid points x_1, x_2, x_3 between a and b, divided between PE 0 and PE 1.
 3 Equations: Find u at x_1, x_2 and x_3.
 Equation for u at x_2 gets a contribution from PE 0 and PE 1.
 Would like to compute partial contributions independently.
 Then combine partial results.

Two Maps  We need two maps:  Assembly map: PE 0: { 1, 2 }. PE 1: { 2, 3 }.  Solver map: PE 0: { 1, 2 } (we arbitrate ownership of 2). PE 1: { 3 }.

End of Assembly Phase
 At the end of assembly phase we have AssemblyMatrix:
   On PE 0: Equation 1 and a partial Equation 2.
   On PE 1: a partial Equation 2 and Equation 3.
   Row 2 is shared.
 Want to assign all of Equation 2 to PE 0 for use with solver.
 NOTE: For a class of Neumann-Neumann preconditioners, the above layout is exactly what we want.

Export Assembly Matrix to Solver Matrix
    Epetra_Export Exporter(AssemblyMap, SolverMap);
    Epetra_CrsMatrix SolverMatrix(Copy, SolverMap, 0);
    SolverMatrix.Export(AssemblyMatrix, Exporter, Add);
    SolverMatrix.FillComplete();

Matrix Export (Export/Add)
 Before Export: PE 0 holds Equation 1 and Equation 2; PE 1 holds Equation 2 and Equation 3.
 After Export: PE 0 holds Equation 1 and Equation 2; PE 1 holds Equation 3.
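To make the shared row concrete, the sketch below assembles a 1D problem in a serial, standard-library-only model and then moves rows to their solver-map owners with Add. It is not Epetra code. It assumes unit mesh spacing, linear elements with element matrix [[1, -1], [-1, 1]], boundary nodes numbered 0 and 4, and two elements per processor. The names `addElement` and `exportRowsAdd` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

using SparseRow = std::map<int, double>;       // column GID -> coefficient
using LocalMatrix = std::map<int, SparseRow>;  // row GID -> row held by one PE

// Add the element matrix [[1,-1],[-1,1]] for the element joining nodes `left`
// and `right`. Nodes outside [firstGid, lastGid] are Dirichlet boundary nodes
// and contribute no rows or columns.
void addElement(LocalMatrix& A, int left, int right, int firstGid, int lastGid) {
  const int nodes[2] = {left, right};
  for (int r : nodes) {
    if (r < firstGid || r > lastGid) continue;
    for (int c : nodes) {
      if (c < firstGid || c > lastGid) continue;
      A[r][c] += (r == c) ? 1.0 : -1.0;
    }
  }
}

// Toy export with Add: each row goes to the PE that owns it in the solver map
// (assumed 1-to-1); coefficients arriving for the same (row, column) are summed.
std::vector<LocalMatrix> exportRowsAdd(const std::vector<LocalMatrix>& assembly,
                                       const std::vector<std::vector<int>>& solverRows) {
  assert(assembly.size() == solverRows.size());
  std::map<int, std::size_t> owner;
  for (std::size_t pe = 0; pe < solverRows.size(); ++pe) {
    for (int gid : solverRows[pe]) owner[gid] = pe;
  }

  std::vector<LocalMatrix> solver(solverRows.size());
  for (const LocalMatrix& local : assembly) {
    for (const auto& row : local) {
      const auto it = owner.find(row.first);
      if (it == owner.end()) continue;  // row not present in the solver map
      for (const auto& entry : row.second) {
        solver[it->second][row.first][entry.first] += entry.second;
      }
    }
  }
  return solver;
}
```

Under these assumptions PE 0 and PE 1 each hold a partial row 2 with diagonal 1 before the export. Afterwards PE 0 holds the complete row (-1, 2, -1) and PE 1 holds only row 3.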

Example: epetraext/ex2.cpp
    int main(int argc, char *argv[]) {
      MPI_Init(&argc, &argv);
      Epetra_MpiComm Comm(MPI_COMM_WORLD);
      int MyPID = Comm.MyPID();
      int n = 4;

      // Generate Laplacian2d gallery matrix
      Trilinos_Util::CrsMatrixGallery G("laplace_2d", Comm);
      G.Set("problem_size", n*n);
      G.Set("map_type", "linear"); // Linear map initially

      // Get the LinearProblem.
      Epetra_LinearProblem *Prob = G.GetLinearProblem();

      // Get the exact solution.
      Epetra_MultiVector *sol = G.GetExactSolution();

      // Get the rhs (b) and lhs (x)
      Epetra_MultiVector *b = Prob->GetRHS();
      Epetra_MultiVector *x = Prob->GetLHS();

      // Repartition graph using Zoltan
      EpetraExt::Zoltan_CrsGraph *ZoltanTrans = new EpetraExt::Zoltan_CrsGraph();
      EpetraExt::LinearProblem_GraphTrans *ZoltanLPTrans =
        new EpetraExt::LinearProblem_GraphTrans(
          *(dynamic_cast<EpetraExt::StructuralSameTypeTransform<Epetra_CrsGraph> *>(ZoltanTrans)));

      cout << "Creating Load Balanced Linear Problem\n";
      Epetra_LinearProblem &BalancedProb = (*ZoltanLPTrans)(*Prob);

      // Get the rhs (b) and lhs (x) of the balanced problem
      Epetra_MultiVector *Balancedb = BalancedProb.GetRHS();
      Epetra_MultiVector *Balancedx = BalancedProb.GetLHS();

      cout << "Balanced b: " << *Balancedb << endl;
      cout << "Balanced x: " << *Balancedx << endl;

      MPI_Finalize();
      return 0;
    }

Need for Import/Export  Solvers for complex engineering applications need expressive, easy-to-use parallel data redistribution:  Allows better scaling for non-uniform overlapping Schwarz.  Necessary for robust solution of multiphysics problems.  We have found import and export facilities to be a very natural and powerful technique to address these issues.

Extending Capabilities: Preconditioners, Operators, Matrices Illustrated using AztecOO as example

Epetra User Class Categories  Sparse Matrices: RowMatrix (CrsMatrix, VbrMatrix, FECrsMatrix, FEVbrMatrix)  Linear Operator: Operator (AztecOO, ML, Ifpack)  Dense Matrices: DenseMatrix, DenseVector, BLAS, LAPACK, SerialDenseSolver  Vectors: Vector, MultiVector  Graphs: CrsGraph  Data Layout: Map, BlockMap, LocalMap  Redistribution: Import, Export, LbGraph, LbMatrix  Aggregates: LinearProblem  Parallel Machine: Comm (SerialComm, MpiComm, MpiSmpComm)  Utilities: Time, Flops

LinearProblem Class  A linear problem is defined by:  Matrix A : An Epetra_RowMatrix or Epetra_Operator object. (often a CrsMatrix or VbrMatrix object.)  Vectors x, b : Vector objects.  To call AztecOO, first define a LinearProblem:  Constructed from A, x and b.  Once defined, can: Scale the problem (explicit preconditioning). Precondition it (implicitly). Change x and b.

AztecOO  Aztec is the previous workhorse solver at Sandia:  Extracted from the MPSalsa reacting flow code.  Installed in dozens of Sandia apps.  AztecOO leverages the investment in Aztec:  Uses Aztec iterative methods and preconditioners.  AztecOO improves on Aztec by:  Using Epetra objects for defining matrix and RHS.  Providing more preconditioners/scalings.  Using C++ class design to enable more sophisticated use.  AztecOO interfaces allows:  Continued use of Aztec for functionality.  Introduction of new solver capabilities outside of Aztec.  Belos is coming along as alternative.  AztecOO will not go away.  Will encourage new efforts and refactorings to use Belos.

A Simple Epetra/AztecOO Program
    // Header files omitted…
    int main(int argc, char *argv[]) {
      MPI_Init(&argc, &argv); // Initialize MPI, MpiComm
      Epetra_MpiComm Comm(MPI_COMM_WORLD);

      // ***** Map puts same number of equations on each pe *****
      int NumMyElements = 1000;
      Epetra_Map Map(-1, NumMyElements, 0, Comm);
      int NumGlobalElements = Map.NumGlobalElements();

      // ***** Create an Epetra_Matrix tridiag(-1,2,-1) *****
      Epetra_CrsMatrix A(Copy, Map, 3);
      double negOne = -1.0;
      double posTwo = 2.0;
      for (int i = 0; i < NumMyElements; i++) {
        int GlobalRow = A.GRID(i);
        int RowLess1 = GlobalRow - 1;
        int RowPlus1 = GlobalRow + 1;
        if (RowLess1 != -1)
          A.InsertGlobalValues(GlobalRow, 1, &negOne, &RowLess1);
        if (RowPlus1 != NumGlobalElements)
          A.InsertGlobalValues(GlobalRow, 1, &negOne, &RowPlus1);
        A.InsertGlobalValues(GlobalRow, 1, &posTwo, &GlobalRow);
      }
      A.FillComplete(); // Transform from GIDs to LIDs

      // ***** Create x and b vectors *****
      Epetra_Vector x(Map);
      Epetra_Vector b(Map);
      b.Random(); // Fill RHS with random #s

      // ***** Create Linear Problem *****
      Epetra_LinearProblem problem(&A, &x, &b);

      // ***** Create/define AztecOO instance, solve *****
      AztecOO solver(problem);
      solver.SetAztecOption(AZ_precond, AZ_Jacobi);
      solver.Iterate(1000, 1.0E-8);

      // ***** Report results, finish ***********************
      cout << "Solver performed " << solver.NumIters()
           << " iterations." << endl
           << "Norm of true residual = " << solver.TrueResidual() << endl;

      MPI_Finalize();
      return 0;
    }

AztecOO Extensibility  AztecOO is designed to accept externally defined:  Operators (both A and M): The linear operator A is accessed as an Epetra_Operator. Users can register a preconstructed preconditioner as an Epetra_Operator.  RowMatrix: If A is registered as a RowMatrix, Aztec’s preconditioners are accessible. Alternatively M can be registered separately as an Epetra_RowMatrix, and Aztec’s preconditioners are accessible.  StatusTests: Aztec’s standard stopping criteria are accessible. Can override these mechanisms by registering a StatusTest Object.

AztecOO Understands Epetra_Operator (figure: Epetra_Operator methods documentation)  AztecOO is designed to accept externally defined:  Operators (both A and M).  RowMatrix (Facilitates use of AztecOO preconditioners with external A).  StatusTests (externally defined stopping criteria).

AztecOO Understands Epetra_RowMatrix (figure: Epetra_RowMatrix methods documentation)

AztecOO UserOp/UserMat Recursive Call Example
Trilinos/packages/aztecoo/example/AztecOO_RecursiveCall
    Poisson2dOperator A(nx, ny, comm); // Generate nx by ny Poisson operator
    Epetra_CrsMatrix * precMatrix = A.GeneratePrecMatrix(); // Build tridiagonal approximate Poisson
    Epetra_Vector xx(A.OperatorDomainMap()); // Generate vectors (xx will be used to generate RHS b)
    Epetra_Vector x(A.OperatorDomainMap());
    Epetra_Vector b(A.OperatorRangeMap());
    xx.Random(); // Generate exact x and then rhs b
    A.Apply(xx, b);

    // Build AztecOO solver that will be used as a preconditioner
    Epetra_LinearProblem precProblem;
    precProblem.SetOperator(precMatrix);
    AztecOO precSolver(precProblem);
    precSolver.SetAztecOption(AZ_precond, AZ_ls);
    precSolver.SetAztecOption(AZ_output, AZ_none);
    precSolver.SetAztecOption(AZ_solver, AZ_cg);
    AztecOO_Operator precOperator(&precSolver, 20);

    Epetra_LinearProblem problem(&A, &x, &b); // Construct linear problem
    AztecOO solver(problem); // Construct solver
    solver.SetPrecOperator(&precOperator); // Register Preconditioner operator
    solver.SetAztecOption(AZ_solver, AZ_cg);
    solver.Iterate(Niters, 1.0E-12);

Ifpack/AztecOO Example
Trilinos/packages/aztecoo/example/IfpackAztecOO
    1. // Assume A, x, b are defined, LevelFill and Overlap are specified
    2. Ifpack_IlukGraph IlukGraph(A.Graph(), LevelFill, Overlap);
    3. IlukGraph.ConstructFilledGraph();
    4. Ifpack_CrsRiluk ILUK(IlukGraph);
    5. ILUK.InitValues(A);
    6. int ierr = ILUK.Factor(); assert(ierr == 0); // Note: All Epetra/Ifpack/AztecOO methods return int err codes
    7. double Condest;
    8. ILUK.Condest(false, Condest); // Get condition estimate
    9. if (Condest > tooBig) {
    10. ILUK.SetAbsoluteThreshold(Athresh);
    11. ILUK.SetRelativeThreshold(Rthresh);
    12. // Go back to line 4 and try again
    13. }
    14. Epetra_LinearProblem problem(&A, &x, &b); // Construct linear problem
    15. AztecOO solver(problem); // Construct solver
    16. solver.SetPrecOperator(&ILUK); // Register Preconditioner operator
    17. solver.SetAztecOption(AZ_solver, AZ_cg);
    18. solver.Iterate(Niters, 1.0E-12);
    19. // Once this linear solve completes and the next nonlinear step is advanced,
    20. // we will return to the solver, but only need to execute steps 5 on down…

Multiple Stopping Criteria  Possible scenario for stopping an iterative solver:  Test 1: Make sure residual is decreased by 6 orders of magnitude. And  Test 2: Make sure that the inf-norm of true residual is no more than 1.0E-8. But  Test 3: Do no more than 200 iterations.  Note: Test 1 is cheap. Do it before Test 2.

AztecOO StatusTest classes  AztecOO_StatusTest:  Abstract base class for defining stopping criteria.  Combo class: OR, AND, SEQ (figure: AztecOO_StatusTest methods documentation)

AztecOO/StatusTest Example
Trilinos/packages/aztecoo/example/AztecOO
    // Assume A, x, b are defined
    Epetra_LinearProblem problem(&A, &x, &b); // Construct linear problem
    AztecOO solver(problem); // Construct solver

    // Test 1: implicit residual reduced by 6 orders of magnitude
    AztecOO_StatusTestResNorm restest1(A, x, b, 1.0E-6);
    restest1.DefineResForm(AztecOO_StatusTestResNorm::Implicit, AztecOO_StatusTestResNorm::TwoNorm);
    restest1.DefineScaleForm(AztecOO_StatusTestResNorm::NormOfInitRes, AztecOO_StatusTestResNorm::TwoNorm);

    // Test 2: inf-norm of the true (explicit) residual
    AztecOO_StatusTestResNorm restest2(A, x, b, 1.0E-8);
    restest2.DefineResForm(AztecOO_StatusTestResNorm::Explicit, AztecOO_StatusTestResNorm::InfNorm);
    restest2.DefineScaleForm(AztecOO_StatusTestResNorm::NormOfRHS, AztecOO_StatusTestResNorm::InfNorm);

    // Run Test 1 first, then Test 2
    AztecOO_StatusTestCombo comboTest1(AztecOO_StatusTestCombo::SEQ, restest1, restest2);

    // Test 3: iteration limit, combined with the residual tests
    AztecOO_StatusTestMaxIters maxItersTest(200);
    AztecOO_StatusTestCombo comboTest2(AztecOO_StatusTestCombo::OR, maxItersTest, comboTest1);

    solver.SetStatusTest(&comboTest2);
    solver.SetAztecOption(AZ_solver, AZ_cg);
    solver.Iterate(Niters, 1.0E-12);
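The way OR and SEQ combine tests can be sketched without AztecOO. The model below is a simplification under stated assumptions: a test only answers "should the solver stop?", the solver state is a plain struct, SEQ is modeled as a sequential AND that skips the second test when the first has not passed, and OR evaluates both tests. All class names here are hypothetical.

```cpp
#include <cassert>

// Simplified solver state seen by the stopping tests.
struct IterState {
  int iteration;
  double residualNorm;
  double initialResidualNorm;
};

// Hypothetical abstract stopping test: true means "stop iterating".
class StatusTest {
 public:
  virtual ~StatusTest() = default;
  virtual bool satisfied(const IterState& s) = 0;
};

class MaxIters : public StatusTest {
 public:
  explicit MaxIters(int maxIters) : maxIters_(maxIters) { assert(maxIters > 0); }
  bool satisfied(const IterState& s) override { return s.iteration >= maxIters_; }
 private:
  int maxIters_;
};

// Cheap test: residual reduced by a factor relative to the initial residual.
class RelativeResidual : public StatusTest {
 public:
  explicit RelativeResidual(double tol) : tol_(tol) {}
  bool satisfied(const IterState& s) override {
    return s.residualNorm <= tol_ * s.initialResidualNorm;
  }
 private:
  double tol_;
};

// Stands in for an expensive test; counts how often it is actually evaluated.
class AbsoluteResidual : public StatusTest {
 public:
  explicit AbsoluteResidual(double tol) : evaluations(0), tol_(tol) {}
  bool satisfied(const IterState& s) override {
    ++evaluations;
    return s.residualNorm <= tol_;
  }
  int evaluations;
 private:
  double tol_;
};

class Combo : public StatusTest {
 public:
  enum Mode { OR, SEQ };
  Combo(Mode mode, StatusTest& first, StatusTest& second)
      : mode_(mode), first_(first), second_(second) {}
  bool satisfied(const IterState& s) override {
    if (mode_ == OR) {
      const bool a = first_.satisfied(s);
      const bool b = second_.satisfied(s);
      return a || b;
    }
    // SEQ: the second test runs only after the first one has passed.
    return first_.satisfied(s) && second_.satisfied(s);
  }
 private:
  Mode mode_;
  StatusTest& first_;
  StatusTest& second_;
};
```

Wiring these as `Combo(OR, maxIters, Combo(SEQ, cheap, expensive))` mirrors the three-test scenario: the expensive test is not evaluated until the cheap one passes, and the iteration limit stops the solver regardless.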

Summary: Extending Capabilities  Trilinos packages are designed to interoperate.  All packages (ML, IFPACK, AztecOO, …) that can provide linear operators:  Implement the Epetra_Operator interface.  Are available to any package that can use an linear operator.  All packages (ML, AztecOO, NOX, Belos, Anasazi, …) that can use linear operators:  Accept linear operator via Epetra_Operator interface.  Support easy user extensions.  All packages (ML, IFPACK, AztecOO, …) that need matrix coefficient data:  Can access that data from Epetra_RowMatrix interface.  Can use any concrete Epetra matrix class, or any user-provided adapter.

Summary: Extending Capabilities AztecOO is one example:  Flexibility comes from abstract base classes: Epetra_Operator: –All Epetra matrix classes implement. –Best way to define A and M when coefficient info not needed. Epetra_RowMatrix: –All Epetra matrix classes implement. –Best way to define A and M when coefficient info is needed. AztecOO_StatusTest: –A suite of parametrized status tests. –An abstract interface for users to define their own. –Ability to combine tests for sophisticated control of stopping.