Chapter 6: Parallel Sorting Algorithms
Topics: Sorting, Parallel Sorting, Bubble Sort, Odd-Even (Transposition) Sort, Parallel Odd-Even Transposition Sort, Related Functions

Sorting
Arrange the elements of a list into a certain order.
Sorted data is easier to access and speeds up other operations such as searching.
Many sorting algorithms exist, with different time and space complexities.

Parallel Sorting
Design methodology
Based on an existing sequential sorting algorithm
–Try to utilize all available resources
–Possible to turn a poor sequential algorithm into a reasonable parallel algorithm (e.g., from O(n^2) sequentially to O(n) in parallel)
Completely new approach
–New algorithm designed from scratch
–Harder to develop
–Sometimes yields a better solution
Potential speedup
–O(n log n) is optimal for any sequential sorting algorithm that does not use special properties of the numbers
–With n processors, the best achievable parallel time complexity is therefore O(n log n / n) = O(log n)

Bubble Sort
One of the most straightforward sorting methods
–Cycles through the list
–Compares consecutive elements and swaps them if they are out of order
–Stops when no out-of-order pair remains
Slow and inefficient: average performance is O(n^2)

Bubble Sort

for (int i = 0; i < n; i++) {
    for (int j = 0; j < n - 1; j++) {
        if (array[j] > array[j+1]) {      // out of order: swap the pair
            int temp = array[j+1];
            array[j+1] = array[j];
            array[j] = temp;
        }
    }
}

Odd-Even (Transposition) Sort
A variation of bubble sort that operates in two alternating phases, an even phase and an odd phase.
Even phase: even-indexed elements are compared with their right neighbors and exchanged if out of order.
Odd phase: odd-indexed elements are compared with their right neighbors and exchanged if out of order.

Odd-Even (Transposition) Sort

for (int i = 0; i < n; i++) {
    if (i % 2 == 1) {                     // odd phase: pairs (1,2), (3,4), ...
        for (int j = 2; j < n; j += 2)
            if (a[j] < a[j-1]) swap(a[j-1], a[j]);   // swap() exchanges two elements
    } else {                              // even phase: pairs (0,1), (2,3), ...
        for (int j = 1; j < n; j += 2)
            if (a[j] < a[j-1]) swap(a[j-1], a[j]);
    }
}

Odd-Even (Transposition) Sort
Figure: sorting n = 8 elements using the odd-even transposition sort algorithm.
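
Since the trace figure for this slide does not survive in the transcript, the following is a minimal, self-contained C sketch of the sequential algorithm above that prints the array after every phase, so an n = 8 trace like the one in the figure can be reproduced. The input values and helper names are illustrative assumptions, not taken from the slides.

#include <stdio.h>

// Exchange two ints in place (the swap() helper assumed by the slide code).
static void swap(int *x, int *y) { int t = *x; *x = *y; *y = t; }

static void print_array(const int *a, int n)
{
    for (int i = 0; i < n; i++)
        printf("%d ", a[i]);
    printf("\n");
}

int main(void)
{
    int a[] = { 4, 2, 7, 8, 5, 1, 3, 6 };    // illustrative input, n = 8
    int n = (int)(sizeof a / sizeof a[0]);

    for (int i = 0; i < n; i++) {
        if (i % 2 == 1) {                    // odd phase: pairs (1,2), (3,4), ...
            for (int j = 2; j < n; j += 2)
                if (a[j] < a[j-1]) swap(&a[j-1], &a[j]);
        } else {                             // even phase: pairs (0,1), (2,3), ...
            for (int j = 1; j < n; j += 2)
                if (a[j] < a[j-1]) swap(&a[j-1], &a[j]);
        }
        printf("after phase %d: ", i);
        print_array(a, n);
    }
    return 0;
}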

Parallel Odd-Even Transposition Sort
Operates in two alternating phases, an even phase and an odd phase.
Even phase: even-ranked processes exchange numbers with their right neighbors.
Odd phase: odd-ranked processes exchange numbers with their right neighbors.

Parallel Odd-Even Transposition

Parallel Odd-Even Transposition Sort

MPI_Comm_rank(MPI_COMM_WORLD, &mypid);
MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
for (int i = 0; i < nprocs; i++) {
    if (i % 2 == 1) {                     // odd phase: pairs (1,2), (3,4), ...
        if (mypid % 2 == 1)
            compare_and_exchange_min(mypid + 1);
        else
            compare_and_exchange_max(mypid - 1);
    } else {                              // even phase: pairs (0,1), (2,3), ...
        if (mypid % 2 == 0)
            compare_and_exchange_min(mypid + 1);
        else
            compare_and_exchange_max(mypid - 1);
    }
}
// Boundary ranks must skip the exchange when the computed partner is -1 or nprocs.
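
compare_and_exchange_min() and compare_and_exchange_max() are not defined on the slide. A minimal sketch, assuming each process holds exactly one number in a variable my_value and that the partner rank has already been checked to be valid, could use MPI_Sendrecv as follows (only the two function names come from the slide; everything else is an assumption):

#include "mpi.h"

extern int my_value;   // the single number held by this process (defined elsewhere)

// Keep the smaller of the two values; the partner keeps the larger.
void compare_and_exchange_min(int partner)
{
    int other;
    MPI_Sendrecv(&my_value, 1, MPI_INT, partner, 0,
                 &other,    1, MPI_INT, partner, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    if (other < my_value)
        my_value = other;
}

// Keep the larger of the two values; the partner keeps the smaller.
void compare_and_exchange_max(int partner)
{
    int other;
    MPI_Sendrecv(&my_value, 1, MPI_INT, partner, 0,
                 &other,    1, MPI_INT, partner, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    if (other > my_value)
        my_value = other;
}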

Parallel Odd-Even Transposition (n>>p)
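
When n >> p, each process holds a sorted block of n/p numbers rather than a single number, and each compare-exchange step merges the two partner blocks and keeps either the lower or the upper half. The figure for this slide is not reproduced here; the sketch below shows one way such a block exchange could be written (the function name, buffer handling, and use of MPI_Sendrecv are assumptions, not from the slides):

#include "mpi.h"
#include <stdlib.h>

// Exchange sorted blocks with 'partner' and keep the lower half (keep_min != 0)
// or the upper half (keep_min == 0) of the merged result.
// 'local' holds this process's sorted block of 'blk' ints.
void compare_exchange_block(int *local, int blk, int partner, int keep_min)
{
    int *other  = malloc(blk * sizeof(int));
    int *merged = malloc(2 * blk * sizeof(int));

    MPI_Sendrecv(local, blk, MPI_INT, partner, 0,
                 other, blk, MPI_INT, partner, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    // Merge the two sorted blocks.
    int i = 0, j = 0, k = 0;
    while (i < blk && j < blk)
        merged[k++] = (local[i] <= other[j]) ? local[i++] : other[j++];
    while (i < blk) merged[k++] = local[i++];
    while (j < blk) merged[k++] = other[j++];

    // Keep the half that belongs to this process.
    for (k = 0; k < blk; k++)
        local[k] = keep_min ? merged[k] : merged[blk + k];

    free(other);
    free(merged);
}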

MPI_Scatter
MPI_Scatter is a collective routine that is very similar to MPI_Bcast: a root process sends data to all processes in a communicator.
MPI_Bcast sends the same piece of data to every process, whereas MPI_Scatter sends different chunks of an array to different processes.

MPI_Scatter
MPI_Bcast takes a single element at the root process and copies it to all other processes.
MPI_Scatter takes an array of elements and distributes its chunks to the processes in rank order.

MPI_Scatter
Its prototype:

int MPI_Scatter(void* send_data, int send_count, MPI_Datatype send_datatype,
                void* recv_data, int recv_count, MPI_Datatype recv_datatype,
                int root, MPI_Comm communicator)

send_data: the array of data on the root process
send_count and send_datatype: how many elements of which MPI datatype are sent to each process
recv_data: a receive buffer that can hold recv_count elements
recv_count and recv_datatype: how many elements of which MPI datatype are received by each process
root: the rank of the root process
communicator: the communicator in which the scatter takes place

MPI_Gather
The inverse of MPI_Scatter: takes elements from all processes and gathers them at one single process.
The elements are ordered by the rank of the process from which they were received.
Used in parallel sorting and searching.

MPI_Gather
Its prototype:

int MPI_Gather(void* send_data, int send_count, MPI_Datatype send_datatype,
               void* recv_data, int recv_count, MPI_Datatype recv_datatype,
               int root, MPI_Comm communicator)

Only the root process needs to have a valid receive buffer; all other calling processes can pass NULL for recv_data.
recv_count is the number of elements received per process, not the total sum of the counts from all processes.

Example 1

#include "mpi.h"
#include <stdio.h>

int main(int argc, char **argv)
{
    int size, rank;
    int recvbuf[4];
    int sendbuf[16] = {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16};

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    // Scatter 4 ints from the root (rank 0) to each of the 4 processes.
    MPI_Scatter(sendbuf, 4, MPI_INT, recvbuf, 4, MPI_INT, 0, MPI_COMM_WORLD);

    printf("Processor %d gets elements: %d %d %d %d\n",
           rank, recvbuf[0], recvbuf[1], recvbuf[2], recvbuf[3]);

    MPI_Finalize();
    return 0;
}

Example 1
Output when run with 4 processes (the order of the lines may vary between runs):

Processor 0 gets elements: 1 2 3 4
Processor 1 gets elements: 5 6 7 8
Processor 3 gets elements: 13 14 15 16
Processor 2 gets elements: 9 10 11 12
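
Note that Example 1 assumes exactly 4 processes: the receive buffer and the count of 4 are hard-coded for 16 elements over 4 ranks. A more defensive variant (a sketch under that assumption, not from the slides) derives the per-process count from the communicator size at run time:

#include "mpi.h"
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int size, rank;
    int sendbuf[16] = {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16};

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int per_proc = 16 / size;                      // assumes 16 % size == 0
    int *recvbuf = malloc(per_proc * sizeof(int));

    MPI_Scatter(sendbuf, per_proc, MPI_INT,
                recvbuf, per_proc, MPI_INT, 0, MPI_COMM_WORLD);

    printf("Processor %d gets %d elements starting with %d\n",
           rank, per_proc, recvbuf[0]);

    free(recvbuf);
    MPI_Finalize();
    return 0;
}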

Example 2

#include "mpi.h"
#include <stdio.h>

int main(int argc, char **argv)
{
    int size, rank;
    int sendbuf[4];
    int recvbuf[16];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    // Each process fills its send buffer: rank 0 holds 1..4, rank 1 holds 5..8, ...
    int i;
    for (i = 0; i < 4; i++) {
        sendbuf[i] = 4 * rank + i + 1;
    }
    // (continued on the next slide)

Example 2 (continued)

    // Gather 4 ints from every process into recvbuf on the root (rank 0).
    MPI_Gather(sendbuf, 4, MPI_INT, recvbuf, 4, MPI_INT, 0, MPI_COMM_WORLD);

    if (rank == 0) {
        int j;
        for (j = 0; j < 16; j++) {
            printf("The %d th element is %d\n", j, recvbuf[j]);
        }
    }

    // MPI_Finalize must be called by every process, not only the root.
    MPI_Finalize();
    return 0;
}

Example 2
Output on the root process (run with 4 processes):

The 0 th element is 1
The 1 th element is 2
The 2 th element is 3
The 3 th element is 4
The 4 th element is 5
The 5 th element is 6
The 6 th element is 7
The 7 th element is 8
The 8 th element is 9
The 9 th element is 10
The 10 th element is 11
The 11 th element is 12
The 12 th element is 13
The 13 th element is 14
The 14 th element is 15
The 15 th element is 16
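
Putting the pieces together, a complete parallel odd-even transposition sort typically scatters the input with MPI_Scatter, sorts each block locally, runs one compare-exchange phase per process, and gathers the result with MPI_Gather. The outline below is a hedged sketch of such a driver: it assumes the compare_exchange_block helper sketched after the (n >> p) slide, that n is divisible by the number of processes, and illustrative names throughout (none of this code appears on the slides). Together with that helper and a small main that calls it, it forms a complete program.

#include "mpi.h"
#include <stdlib.h>

// qsort comparator for ints, ascending.
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

// Sketched after the (n >> p) slide above.
void compare_exchange_block(int *local, int blk, int partner, int keep_min);

// Sort n ints stored in 'data' on the root (rank 0); the result is left on the root.
void parallel_odd_even_sort(int *data, int n, int rank, int nprocs)
{
    int blk = n / nprocs;                        // assumes n % nprocs == 0
    int *local = malloc(blk * sizeof(int));

    // Distribute one block to each process and sort it locally.
    MPI_Scatter(data, blk, MPI_INT, local, blk, MPI_INT, 0, MPI_COMM_WORLD);
    qsort(local, blk, sizeof(int), cmp_int);

    // nprocs alternating even/odd phases.
    for (int i = 0; i < nprocs; i++) {
        int partner;
        if (i % 2 == 1)                          // odd phase: pairs (1,2), (3,4), ...
            partner = (rank % 2 == 1) ? rank + 1 : rank - 1;
        else                                     // even phase: pairs (0,1), (2,3), ...
            partner = (rank % 2 == 0) ? rank + 1 : rank - 1;

        if (partner >= 0 && partner < nprocs)    // boundary ranks sit this phase out
            compare_exchange_block(local, blk, partner, rank < partner);
    }

    // Collect the sorted blocks back on the root in rank order.
    MPI_Gather(local, blk, MPI_INT, data, blk, MPI_INT, 0, MPI_COMM_WORLD);
    free(local);
}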