CS 591x – Cluster Computing and Programming Parallel Computers Parallel Libraries

Recall that so far we have been:
- breaking up (decomposing) our “large” problems into smaller pieces
- distributing the pieces of the problem to multiple processes
- explicitly moving data among processes through message passing (see the sketch below)
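A minimal sketch of that pattern, with a made-up problem size and a trivial local computation chosen only for illustration:

/* Sketch: decompose the work, compute locally, move partial results with MPI. */
#include <stdio.h>
#include <mpi.h>

#define N 1000000                                /* hypothetical problem size */

int main(int argc, char* argv[]) {
    int p, my_rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &p);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

    int chunk = N / p;                           /* decompose: each process gets one piece */
    double local_sum = 0.0;
    for (int i = my_rank * chunk; i < (my_rank + 1) * chunk; i++)
        local_sum += 1.0 / (i + 1);              /* do the local work */

    if (my_rank != 0) {                          /* explicitly move data: send partial result */
        MPI_Send(&local_sum, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
    } else {
        double total = local_sum, piece;
        for (int src = 1; src < p; src++) {
            MPI_Recv(&piece, 1, MPI_DOUBLE, src, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            total += piece;
        }
        printf("sum = %f\n", total);
    }
    MPI_Finalize();
    return 0;
}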

Parallel Libraries
Note that:
- large scientific and engineering problems often represent data in matrices and vectors
- large scientific and engineering problems make heavy use of linear algebra: linear systems, non-linear systems

Parallel Libraries
MPI is designed to support the development of libraries. Consequently, there are a number of libraries, based on MPI, used to develop parallel software. Some libraries take care of much, or all, of the parallelization. That means…

Parallel Libraries … You don’t have to… … but you still can… … if you want … sometimes…

Parallel Libraries
- ScaLAPACK – Scalable Linear Algebra PACKage
- PETSc – Portable, Extensible Toolkit for Scientific Computation

ScaLAPACK
Built on LAPACK – the Linear Algebra PACKage:
- powerful and widely used in scientific and engineering computing
- but not scalable to distributed-memory parallel computers
LAPACK in turn is built on BLAS – the Basic Linear Algebra Subprograms library
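For a sense of what a BLAS call looks like, here is a small sketch using the CBLAS interface; it assumes a cblas.h header and a BLAS library are installed (link flags such as -lblas vary by system):

/* Sketch: y = A*x with the level-2 BLAS routine dgemv via CBLAS. */
#include <stdio.h>
#include <cblas.h>

int main(void) {
    double A[2*3] = { 1, 2, 3,                  /* 2x3 matrix, stored row-major */
                      4, 5, 6 };
    double x[3] = { 1, 1, 1 };
    double y[2] = { 0, 0 };

    /* y = 1.0 * A * x + 0.0 * y */
    cblas_dgemv(CblasRowMajor, CblasNoTrans, 2, 3, 1.0, A, 3, x, 1, 0.0, y, 1);

    printf("y = (%g, %g)\n", y[0], y[1]);       /* expected: (6, 15) */
    return 0;
}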

ScaLAPACK
uses PBLAS – the Parallel BLAS
- performs distributed matrix and vector operations; uses BLAS for the local computation on each process
uses BLACS – the Basic Linear Algebra Communication Subprograms library
- handles interprocess communication for ScaLAPACK
- usually implemented on top of MPI (other implementations also exist)

ScaLAPACK
Maps matrices and vectors to a process grid called a BLACS grid
- similar to an MPI Cartesian topology
Matrices and vectors are decomposed into rectangular blocks
- block-cyclically distributed over the BLACS grid
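To illustrate the 2D block-cyclic mapping, here is a small sketch that computes which process in the grid owns each block; the grid shape, block sizes, and matrix size are made-up values, not part of ScaLAPACK's API:

/* Sketch: owner of each block under a 2D block-cyclic distribution. */
#include <stdio.h>

int main(void) {
    int nprow = 2, npcol = 3;          /* hypothetical 2 x 3 process grid */
    int mb = 4, nb = 4;                /* hypothetical block sizes        */
    int m = 16, n = 16;                /* hypothetical global matrix size */

    for (int i = 0; i < m; i += mb)
        for (int j = 0; j < n; j += nb) {
            int prow = (i / mb) % nprow;    /* process row owning this block    */
            int pcol = (j / nb) % npcol;    /* process column owning this block */
            printf("block at (%2d,%2d) -> process (%d,%d)\n", i, j, prow, pcol);
        }
    return 0;
}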

ScaLAPACK – sample (based on Pacheco)
MPI_Init(&argc, &argv);
MPI_Comm_size(MPI_COMM_WORLD, &p);
MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
Get_input(p, my_rank, &n, &nproc_rows, &nproc_cols, &row_block_size, &col_block_size);
m = n;
Cblacs_get(0, 0, &blacs_grid);   /* build blacs grid */
/* "R": process grid will use row-major order */
Cblacs_gridinit(&blacs_grid, "R", nproc_rows, nproc_cols);
Cblacs_pcoord(blacs_grid, my_rank, &my_proc_row, &my_proc_col);

ScaLAPACK – sample cont.
local_mat_rows = get_dim(m, row_block_size, my_proc_row, nproc_rows);
local_mat_cols = get_dim(n, col_block_size, my_proc_col, nproc_cols);
Allocate(my_rank, "A", &A_local, local_mat_rows*local_mat_cols, 1);
b_local_size = get_dim(m, row_block_size, my_proc_row, nproc_rows);
Allocate(my_rank, "b", &b_local, b_local_size, 1);
exact_local_size = get_dim(m, col_block_size, my_proc_row, nproc_rows);
Allocate(my_rank, "Exact", &exact_local, exact_local_size, 1);

ScaLAPACK – sample cont.
Build_descript(my_rank, "A", A_descript, m, n, row_block_size, col_block_size, blacs_grid, local_mat_rows);
Build_descript(my_rank, "B", b_descript, m, 1, row_block_size, 1, blacs_grid, b_local_size);
Build_descript(my_rank, "Exact", exact_descript, n, 1, col_block_size, 1, blacs_grid, exact_local_size);
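Build_descript is a helper specific to this example; in ScaLAPACK itself an array descriptor is normally filled in by the descinit_ routine. A sketch of what such a helper might wrap, reusing the variable names from the sample above (descinit_ is a Fortran routine, so every argument is passed by address):

/* Sketch: initializing the ScaLAPACK descriptor for the local piece of A. */
int A_descript[9];                               /* a descriptor is an array of 9 integers */
int zero = 0, info;
descinit_(A_descript,
          &m, &n,                                /* global rows, cols                      */
          &row_block_size, &col_block_size,      /* block sizes                            */
          &zero, &zero,                          /* process (row, col) holding first block */
          &blacs_grid,                           /* BLACS context                          */
          &local_mat_rows,                       /* leading dimension of the local array   */
          &info);                                /* info == 0 means success                */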

ScaLAPACK – sample cont.
Initialize(p, my_rank, A_local, local_mat_rows, local_mat_cols, exact_local, exact_local_size);
Mat_vect_mult(m, n, A_local, A_descript, exact_local, exact_descript, b_local, b_descript);
Allocate(my_rank, "pivot_list", &pivot_list, local_mat_rows + row_block_size, 0);
MPI_Barrier(MPI_COMM_WORLD);

ScaLAPACK – sample cont.
/* psgesv solves Ax=b, returns the solution in b */
solve(my_rank, n, A_local, A_descript, pivot_list, b_local, b_descript);
…
Cblacs_exit(1);
MPI_Finalize();
…
}
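solve is again a helper from the example; the underlying ScaLAPACK driver it would call is psgesv, the single-precision distributed LU factorization and solve. A sketch of that call, reusing variables from the sample (the starting row/column indices are 1-based because the routine follows Fortran conventions, and every scalar is passed by address):

/* Sketch: distributed solve of A x = b with ScaLAPACK's psgesv. */
int nrhs = 1;                  /* one right-hand side                       */
int ia = 1, ja = 1;            /* solve starts at global element (1,1) of A */
int ib = 1, jb = 1;            /* and at global element (1,1) of b          */
int info;
psgesv_(&n, &nrhs,
        A_local, &ia, &ja, A_descript,
        pivot_list,
        b_local, &ib, &jb, b_descript,
        &info);                /* info == 0 means success                   */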

ScaLAPACK – sample cont.
void Mat_vect_mult(int m, int n, float* A_local, int* A_descript,
                   float* x_local, int* x_descript,
                   float* y_local, int* y_descript) {
    char transpose = 'N';
    …
    psgemv(&transpose, &m, &n, &alpha,
           A_local, &first_row_A, &first_col_A, A_descript,
           x_local, &first_row_x, &first_col_x, x_descript, &x_increment,
           &beta,
           y_local, &first_row_y, &first_col_y, y_descript, &y_increment);
}

Crossing Languages – Some Issues
- Calling routines from another language, e.g. calling a Fortran subroutine from C
- Using n-dimensional arrays: remember row-major (C) vs. column-major (Fortran) storage
- Passing arguments in routine/function calls: Fortran passes by address, C passes by value (see the sketch below)
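A sketch of those issues in practice, calling the Fortran BLAS routine sgemv from C. The trailing underscore in the symbol name is the most common Fortran name-mangling convention but is compiler-dependent, and you must link against a Fortran BLAS (e.g. -lblas):

#include <stdio.h>

/* Declaration of the Fortran routine: every argument is a pointer,
   because Fortran passes by address. */
extern void sgemv_(const char *trans, const int *m, const int *n,
                   const float *alpha, const float *A, const int *lda,
                   const float *x, const int *incx,
                   const float *beta, float *y, const int *incy);

int main(void) {
    /* A is stored column-major, as Fortran expects: columns are (1,4), (2,5), (3,6). */
    float A[6] = { 1, 4, 2, 5, 3, 6 };          /* 2x3 matrix, column-major */
    float x[3] = { 1, 1, 1 };
    float y[2] = { 0, 0 };
    int m = 2, n = 3, lda = 2, inc = 1;
    float alpha = 1.0f, beta = 0.0f;

    sgemv_("N", &m, &n, &alpha, A, &lda, x, &inc, &beta, y, &inc);

    printf("y = (%g, %g)\n", y[0], y[1]);       /* expected: (6, 15) */
    return 0;
}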

PETSc
Portable, Extensible Toolkit for Scientific Computation
- large and powerful
- solves partial differential equations, linear systems, and non-linear systems
- works with both dense and sparse matrices

PETSc
- PETSc routines return error codes
- PETSc provides error-checking macros to help troubleshoot problems, e.g. CHKERRQ(errorcode) (see the sketch below)
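A minimal sketch of that error-checking pattern in C, where each return code is captured and checked (the global size 100 is just an example value):

/* Sketch: checking PETSc error codes with CHKERRQ.
   If a call fails, CHKERRQ prints a traceback and returns the error. */
PetscErrorCode ierr;
Vec x;
ierr = VecCreate(PETSC_COMM_WORLD, &x); CHKERRQ(ierr);
ierr = VecSetSizes(x, PETSC_DECIDE, 100); CHKERRQ(ierr);
ierr = VecSetFromOptions(x); CHKERRQ(ierr);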

PETSc
- built on top of MPI
- developed primarily for C/C++ (unlike ScaLAPACK); a Fortran interface is also available
- dense and sparse matrices use the same interface

PETSc
- includes many non-blocking operations, i.e. any process can update any matrix cell as a non-blocking operation; other work can go on while the update is carried out
- many options available from the command line
- PETSc includes many solvers; solvers can be selected from the command line, so you can change solvers without recompiling (see the sketch below)
- PETSC_DECIDE lets you leave a choice (such as a local size) up to PETSc
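A sketch of the runtime-selectable-solver idea: the program calls KSPSetFromOptions, and the Krylov method and preconditioner are picked on the command line. The option names are standard PETSc options; the executable name is made up, and the exact KSPSetOperators argument list varies between PETSc versions:

/* Sketch: letting the command line choose the solver. */
KSP ksp;
ierr = KSPCreate(PETSC_COMM_WORLD, &ksp); CHKERRQ(ierr);
ierr = KSPSetOperators(ksp, A, A); CHKERRQ(ierr);    /* A assembled elsewhere; signature varies by version */
ierr = KSPSetFromOptions(ksp); CHKERRQ(ierr);        /* read -ksp_type, -pc_type, ... */
ierr = KSPSolve(ksp, b, x); CHKERRQ(ierr);

/* Then, without recompiling:
     mpirun -np 4 ./my_app -ksp_type gmres -pc_type jacobi
     mpirun -np 4 ./my_app -ksp_type cg    -pc_type bjacobi
*/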

PETSc – sample routines
PetscOptionsGetInt(PETSC_NULL, "-n", &n, &flg);
VecSetType(Vec x, VecType vec_type);
VecCreate(MPI_Comm comm, Vec *x);
VecSetSizes(Vec x, int m, int M);
VecDuplicate(Vec old, Vec *new);
MatCreate(MPI_Comm comm, int m, int n, int M, int N, Mat* A);
MatSetValues(Mat A, int m, int* im, int n, int* in, PetscScalar *values, INSERT_VALUES);

PETSc – sample routines
MatAssemblyBegin(Mat A, MAT_FINAL_ASSEMBLY);
MatAssemblyEnd(Mat A, MAT_FINAL_ASSEMBLY);
KSPCreate(MPI_Comm comm, KSP *ksp);
KSPSolve(KSP ksp, Vec b, Vec x);
PetscInitialize(&argc, &argv, PETSC_NULL, PETSC_NULL);
PetscFinalize();
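Putting those routines together, here is a hedged sketch of a complete PETSc program that assembles a small tridiagonal system and solves it with KSP. It is written against the current PETSc 3.x C API, which differs slightly from some of the older forms listed above (MatCreate now takes only a communicator and a Mat*, with sizes set by MatSetSizes, and PetscOptionsGetInt takes extra arguments):

static char help[] = "Sketch: assemble and solve a small tridiagonal system.\n";

#include <petscksp.h>

int main(int argc, char **argv)
{
    Vec         x, b;
    Mat         A;
    KSP         ksp;
    PetscInt    n = 10, i, Istart, Iend, col[3];
    PetscScalar value[3] = { -1.0, 2.0, -1.0 };

    PetscInitialize(&argc, &argv, NULL, help);
    PetscOptionsGetInt(NULL, NULL, "-n", &n, NULL);

    /* parallel vectors; PETSC_DECIDE lets PETSc pick the local sizes */
    VecCreate(PETSC_COMM_WORLD, &x);
    VecSetSizes(x, PETSC_DECIDE, n);
    VecSetFromOptions(x);
    VecDuplicate(x, &b);

    /* parallel matrix; each process inserts only the rows it owns */
    MatCreate(PETSC_COMM_WORLD, &A);
    MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n);
    MatSetFromOptions(A);
    MatSetUp(A);
    MatGetOwnershipRange(A, &Istart, &Iend);
    for (i = Istart; i < Iend; i++) {
        col[0] = i - 1; col[1] = i; col[2] = i + 1;
        if (i == 0)
            MatSetValues(A, 1, &i, 2, &col[1], &value[1], INSERT_VALUES);
        else if (i == n - 1)
            MatSetValues(A, 1, &i, 2, col, value, INSERT_VALUES);
        else
            MatSetValues(A, 1, &i, 3, col, value, INSERT_VALUES);
    }
    MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
    MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);

    VecSet(b, 1.0);                      /* right-hand side of all ones */

    KSPCreate(PETSC_COMM_WORLD, &ksp);
    KSPSetOperators(ksp, A, A);          /* A is both operator and preconditioning matrix */
    KSPSetFromOptions(ksp);              /* honor -ksp_type, -pc_type, ... */
    KSPSolve(ksp, b, x);

    KSPDestroy(&ksp);
    VecDestroy(&x);
    VecDestroy(&b);
    MatDestroy(&A);
    PetscFinalize();
    return 0;
}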

BLAS (Basic Linear Algebra Subprograms)
LAPACK (Linear Algebra PACKage)
ScaLAPACK

PETSc /docs/splitmanual/manual.html#Node0