Parallel Sparse Matrix Algorithms for Numerical Computing: Matrix-Vector Multiplication

Introduction Matrix computation is an important part of numerical computing, and sparse matrices are an important part of matrix computation: they arise in many kinds of applications.

Contents
- Importance of parallel sparse matrix computing
- Introduction to sparse matrices
- Introduction to matrix-vector multiplication
- How to solve it with parallel technology
- Conclusion

Sparse Matrix
- Concept
- Storage

Concept In the mathematical subfield of numerical analysis, a sparse matrix is a matrix populated primarily with zeros (Stoer & Bulirsch 2002, p. 619). Given an m×n matrix A with NZ nonzero elements: if NZ << m*n, then A is a sparse matrix.

Sparse Matrix Storage A dedicated data structure stores the sparse matrix. It should be easy to convert to from the traditional (dense) data structure, take less space to store, and allow fast addressing of the elements.

Example A 4×4 matrix A:

0 0 0 2
1 0 0 6
0 1 0 0
0 0 0 0

Dense storage: 4 × 4 × B = 16B (where B is the size of one element).

The new structure stores only the nonzero elements as (row, column, value) triples:

0 3 2
1 0 1
1 3 6
2 1 1

Space: 4 × 3 × B = 12B.
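
In C, this triplet (coordinate) form can live in a flat int array, three ints per nonzero; a minimal sketch of the example above (the names sparseA and nonzero are illustrative, not from the slides):

/* Triplet storage for the 4x4 example matrix A:
   each nonzero element takes three ints: row, column, value. */
int sparseA[] = {
    0, 3, 2,   /* A[0][3] = 2 */
    1, 0, 1,   /* A[1][0] = 1 */
    1, 3, 6,   /* A[1][3] = 6 */
    2, 1, 1    /* A[2][1] = 1 */
};
int nonzero = 4;   /* 4 triples = 12 ints, versus 16 ints dense */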

Matrix-Vector Multiplication The product y = Ax is defined by

y_i = sum_j A_ij * x_j

where A is an i×j matrix and x is a vector with j elements.
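
For reference, a sequential dense version of this formula (a sketch; the slides do not give one, and the function name and signature are assumptions):

/* y = A * x for a dense m x n matrix A: y_i = sum_j A_ij * x_j. */
void matvec(int **A, const int *x, int *y, int m, int n) {
    for (int i = 0; i < m; i++) {
        y[i] = 0;                      /* clear the accumulator   */
        for (int j = 0; j < n; j++)
            y[i] += A[i][j] * x[j];    /* add row i's dot product */
    }
}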

Parallel Method
1. Produce the matrix
2. Produce the vector
3. Transform the matrix into a sparse matrix
4. Broadcast the vector to each slave processor
5. Partition the sparse matrix
6. Send each buffer to a slave processor
7. Each slave does its matrix-vector multiplication
8. Send the results to the master
9. Done

Data structure and input parameters User input matrix: 2D array. User input vector: 1D array. Usage: exefilename

Produce Matrix

/* Fill the _row x _col matrix _mt with random values, leaving exactly
   _zero entries equal to zero (requires <stdlib.h> for rand). */
void producematrix(int **_mt, int _row, int _col, int _zero) {
    int residualzero = _zero;             /* zeros still to place */
    int residualelements = _row * _col;   /* cells still to fill  */
    for (int i = 0; i < _row; i++) {
        for (int j = 0; j < _col; j++) {
            /* Place a zero with probability residualzero/residualelements,
               which spreads exactly _zero zeros over the matrix. */
            if (rand() % residualelements < residualzero) {
                _mt[i][j] = 0;
                residualzero--;
            } else {
                _mt[i][j] = rand() % 49 + 1;   /* random nonzero in 1..49 */
            }
            residualelements--;
        }
    }
}

Transform Matrix to Sparse Matrix

/* Pack the nonzeros of _mt into a flat (row, column, value) triplet array. */
int *producesparse1d(int **_mt, int _mtrow, int _mtcol, int _nonzero) {
    int *tempsp = (int *)malloc(sizeof(int) * _nonzero * 3);
    int m = 0;                        /* write position in tempsp */
    for (int i = 0; i < _mtrow; i++) {
        for (int j = 0; j < _mtcol; j++) {
            if (_mt[i][j] != 0) {
                tempsp[m]     = i;            /* row index    */
                tempsp[m + 1] = j;            /* column index */
                tempsp[m + 2] = _mt[i][j];    /* value        */
                m += 3;
            }
        }
    }
    return tempsp;
}
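
A possible call sequence for the two routines above (a sketch; the allocation and the counts are assumptions, not part of the slides):

/* Build a 4x4 matrix with 12 zeros, then convert it to triplet form. */
int rows = 4, cols = 4, zeros = 12;
int **mt = (int **)malloc(rows * sizeof(int *));
for (int i = 0; i < rows; i++)
    mt[i] = (int *)malloc(cols * sizeof(int));

producematrix(mt, rows, cols, zeros);
int nonzero = rows * cols - zeros;                 /* 4 nonzero elements */
int *sparse = producesparse1d(mt, rows, cols, nonzero);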

Broadcast Vector to each slave processor
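
This step maps onto a single MPI collective call; a minimal sketch, assuming the vector is an int array vect of length vectlen (names are illustrative):

/* Every process, master and slaves alike, calls MPI_Bcast; rank 0's
   copy of vect is replicated into all other ranks' buffers. */
MPI_Bcast(vect, vectlen, MPI_INT, 0, MPI_COMM_WORLD);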

Partition Sparse Matrix
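
The slides do not show the partitioning code; one simple scheme, sketched below, splits the nonzero triples as evenly as possible across the slaves (sendcount, sendoffset, and the other names are assumptions):

/* Divide nonzero triples over (numprocs - 1) slaves. */
int nslaves = numprocs - 1;
int base  = nonzero / nslaves;    /* triples every slave gets    */
int extra = nonzero % nslaves;    /* first 'extra' slaves get +1 */
int offset = 0;
for (int s = 0; s < nslaves; s++) {
    sendcount[s]  = (base + (s < extra ? 1 : 0)) * 3;  /* in ints, 3 per triple */
    sendoffset[s] = offset;                            /* start of slave s's slice */
    offset += sendcount[s];
}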

Send each buffer to each slave processor
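
On the master this becomes one point-to-point send per slave; a sketch reusing the sendcount/sendoffset arrays from the partitioning step above:

/* Rank 0 sends each slave its slice of the flat triplet array. */
for (int s = 0; s < nslaves; s++)
    MPI_Send(sparse + sendoffset[s], sendcount[s], MPI_INT,
             s + 1 /* slave rank */, 0 /* tag */, MPI_COMM_WORLD);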

Parallel logic

MPI_Status stat;
MPI_Init(&argc, &argv);
MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
MPI_Comm_rank(MPI_COMM_WORLD, &myid);

/* Broadcast the vector to every process. */
MPI_Bcast(vect, vectlen, MPI_INT, 0, MPI_COMM_WORLD);

if (myid == 0) {   /* master */
    for (int s = 1; s < numprocs; s++)   /* send each slave its buffer */
        MPI_Send(sendbuffer[s], sendcount[s], MPI_INT, s, 0, MPI_COMM_WORLD);
    /* ... then collect the partial results from the slaves ... */
} else {           /* slave */
    MPI_Recv(recvbuffer, recvcount, MPI_INT, 0, 0, MPI_COMM_WORLD, &stat);
    SparseMult(recvbuffer, vect);   /* local multiplication */
    MPI_Send(slaveresult, resultcount, MPI_INT, 0, 1, MPI_COMM_WORLD);
}
MPI_Finalize();
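
The slides leave SparseMult undefined and call it with two arguments; a minimal sketch of the local multiplication over the triplet buffer, with the nonzero count and output vector made explicit (an assumed, fuller signature):

/* Multiply a triplet-format sparse partition by the dense vector.
   sp holds nnz (row, column, value) triples; y must be zero-initialized. */
void SparseMult(const int *sp, int nnz, const int *x, int *y) {
    for (int k = 0; k < nnz; k++) {
        int i = sp[3 * k];        /* row index    */
        int j = sp[3 * k + 1];    /* column index */
        int v = sp[3 * k + 2];    /* value        */
        y[i] += v * x[j];         /* accumulate this nonzero's contribution */
    }
}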

Results

Conclusions & Analysis Spend more time on communication with each processors. Unbalanced communication and computing is bottle-neck

Bibliography
[1] Blaise, B. (2009). Lawrence Livermore National Laboratory. Retrieved May 2009 from the World Wide Web:
[2] Bruce Hendrickson, Robert Leland, and Steve Plimpton, An Efficient Parallel Algorithm for Matrix-Vector Multiplication, Sandia National Laboratories, Albuquerque, NM 87185.
[3] Stoer, Josef; Bulirsch, Roland (2002), Introduction to Numerical Analysis (3rd ed.), Berlin, New York: Springer-Verlag.
[4] L.M. Romero and E.L. Zapata, Data Distributions for Sparse Matrix Vector Multiplication, University of Malaga, J. Parallel Computing, vol. 21, no. 4, April 1999.
[5] Martin Johnson, Numerical Algorithms, lecture notes in Parallel Computing, IIMS, Massey University at Albany, Auckland, New Zealand.
[6] Message Passing Interface.
[7] R. Raz, On the complexity of matrix product, SIAM Journal on Computing, 32, 2003.
[8] Sparse matrix.
[9] Sparse matrix.
[10] V. Pan, How to Multiply Matrices Faster, Lecture Notes in Computer Science, volume 179, Springer-Verlag, 1985.
[11] Wang Shun and Wang Xiao Ge, Parallel Algorithm for Matrix-Vector Multiplication, Tsinghua University Library, China.

Questions?