Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M. 2004.

Slides:



Advertisements
Similar presentations
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M
Advertisements

Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M
CSCI-455/552 Introduction to High Performance Computing Lecture 25.
Lecture 3: Parallel Algorithm Design
1 Parallel Parentheses Matching Plus Some Applications.
CSCI-455/552 Introduction to High Performance Computing Lecture 11.
Partitioning and Divide-and-Conquer Strategies ITCS 4/5145 Parallel Computing, UNC-Charlotte, B. Wilkinson, Jan 23, 2013.
4.1 Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M.
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M
Parallel Strategies Partitioning consists of the following steps –Divide the problem into parts –Compute each part separately –Merge the results Divide.
Partitioning and Divide-and-Conquer Strategies ITCS 4/5145 Cluster Computing, UNC-Charlotte, B. Wilkinson, 2007.
Lecture 7 : Parallel Algorithms (focus on sorting algorithms) Courtesy : SUNY-Stony Brook Prof. Chowdhury’s course note slides are used in this lecture.
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M
Parallel Programming: Techniques and Applications Using Networked Workstations and Parallel Computers Chapter 11: Numerical Algorithms Sec 11.2: Implementing.
Advanced Topics in Algorithms and Data Structures Lecture pg 1 Recursion.
Faster Sorting Methods Chapter 12 Slides by Steve Armstrong LeTourneau University Longview, TX  2007,  Prentice Hall.
Advanced Topics in Algorithms and Data Structures Page 1 Parallel merging through partitioning The partitioning strategy consists of: Breaking up the given.
Partitioning and Divide-and-Conquer Strategies ITCS 4/5145 Cluster Computing, UNC-Charlotte, B. Wilkinson, 2007.
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M
1 Tuesday, November 14, 2006 “UNIX was never designed to keep people from doing stupid things, because that policy would also keep them from doing clever.
Chapter 10 in textbook. Sorting Algorithms
Algorithms and Applications
1 Merge Sort Merge Sort Reading p A Sorting Pattern The most efficient sorting algorithms all seem to follow a divide-and-conquer strategy.
1 Tuesday, October 03, 2006 If I have seen further, it is by standing on the shoulders of giants. -Isaac Newton.
Parallel Merging Advanced Algorithms & Data Structures Lecture Theme 15 Prof. Dr. Th. Ottmann Summer Semester 2006.
CSCI 4440 / 8446 Parallel Computing Three Sorting Algorithms.
and Divide-and-Conquer Strategies
Design of parallel algorithms
Sorting Algorithms: Topic Overview
CS 584. Sorting n One of the most common operations n Definition: –Arrange an unordered collection of elements into a monotonically increasing or decreasing.
Topic Overview One-to-All Broadcast and All-to-One Reduction
CSCI-455/552 Introduction to High Performance Computing Lecture 22.
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M
Parallel Programming in C with MPI and OpenMP
CSCI-455/552 Introduction to High Performance Computing Lecture 18.
Basic Communication Operations Based on Chapter 4 of Introduction to Parallel Computing by Ananth Grama, Anshul Gupta, George Karypis and Vipin Kumar These.
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M
ITCS 4/5145 Cluster Computing, UNC-Charlotte, B. Wilkinson, 2006outline.1 ITCS 4145/5145 Parallel Programming (Cluster Computing) Fall 2006 Barry Wilkinson.
1 Parallel Sorting Algorithms. 2 Potential Speedup O(nlogn) optimal sequential sorting algorithm Best we can expect based upon a sequential sorting algorithm.
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M
Copyright © 2007 Pearson Addison-Wesley. All rights reserved. A. Levitin “ Introduction to the Design & Analysis of Algorithms, ” 2 nd ed., Ch. 1 Chapter.
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M
CSCI-455/552 Introduction to High Performance Computing Lecture 13.
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M
Embarrassingly Parallel Computations Partitioning and Divide-and-Conquer Strategies Pipelined Computations Synchronous Computations Asynchronous Computations.
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M
CSCI-455/552 Introduction to High Performance Computing Lecture 6.
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M
CSCI-455/552 Introduction to High Performance Computing Lecture 9.
CSCI-455/552 Introduction to High Performance Computing Lecture 23.
Data Structures and Algorithms in Parallel Computing Lecture 8.
Sorting Quick, Merge & Radix Divide-and-conquer Technique subproblem 2 of size n/2 subproblem 1 of size n/2 a solution to subproblem 1 a solution to.
Unit-8 Sorting Algorithms Prepared By:-H.M.PATEL.
CSCI-455/552 Introduction to High Performance Computing Lecture 21.
CSCI-455/552 Introduction to High Performance Computing Lecture 15.
1 Chapter4 Partitioning and Divide-and-Conquer Strategies 划分和分治的并行技术 Lecture 5.
Divide and Conquer Algorithms Sathish Vadhiyar. Introduction  One of the important parallel algorithm models  The idea is to decompose the problem into.
Partitioning & Divide and Conquer Strategies Prof. Wenwen Li School of Geographical Sciences and Urban Planning 5644 Coor Hall
Divide-and-Conquer Pattern ITCS 4/5145 Parallel Computing, UNC-Charlotte, B. Wilkinson, June 25, 2012 slides4.ppt. 4.1.
Parallel Sorting Algorithms
and Divide-and-Conquer Strategies
Parallel Sorting Algorithms
Introduction to High Performance Computing Lecture 16
Introduction to High Performance Computing Lecture 17
Presentation transcript:

Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M Pearson Education Inc. All rights reserved. Partitioning and Divide-and-Conquer Strategies Chapter 4

Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M Pearson Education Inc. All rights reserved. Divide and Conquer Divide the problem into sub-problems of same form as larger problem. Further divide into still smaller sub-problems (usually done by recursion) Recursive divide and conquer amenable to parallelization because separate processes can be used for divided parts.

Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M Pearson Education Inc. All rights reserved. Divide-and-Conquer Using a Hypercube

Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M Pearson Education Inc. All rights reserved. Partial summation

Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M Pearson Education Inc. All rights reserved. Bucket sort One “bucket” assigned to hold numbers that fall within each region. Numbers in each bucket sorted using a sequential sorting algorithm. Sequential sorting time complexity: O(nlog(n/m)). Works well if the original numbers uniformly distributed across a known interval, say 0 to a - 1.

Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M Pearson Education Inc. All rights reserved. Parallel version of bucket sort Simple approach Assign one processor for each bucket:

Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M Pearson Education Inc. All rights reserved. Parallel bucket sort - improved version Requires  all-to-all scatter

Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M Pearson Education Inc. All rights reserved “all-to-all” scatter routine actually transfers rows of an array to columns: Transposes a matrix.

Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M Pearson Education Inc. All rights reserved. All-to-All scatter on a 3-dim Hypercube T par = plogp

Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M Pearson Education Inc. All rights reserved Tim’s version of All-to-All scatter on a 3-dim Hypercube Step 1: P0: x00, x10, x20, x30, x40, x50, x60, x70 ____/ ___/ ___/ ___/ / / / / P1: x01, x11, x21, x31, x41, x51, x61, x71 Step 2: P0: x00, x01, x20, x21, x40, x41, x60, x61 _________/____/ _________/____/ / / / / P2: x02, x03, x22, x23, x42, x43, x62, x63 Step 3: P0: x00, x01, x02, x03, x40, x41, x42, x43 ___________________/____/____/____/ / / / / P4: x05, x06, x07, x08, x45, x46, x47, x48 T par = plogp

Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M Pearson Education Inc. All rights reserved Time Complexity Analysis for Bucket Sort Sequential Algorithm: 1.Each element is put into one of the m buckets  O(n) 2.Sort elements in each bucket:  (m* (n/m * log(n/m))) = nlog(n/m) If m=O(n) then TIME = O(n) Overall: T seq = O(n) (* How is this possible? A: uniform distribution assumption *) Parallel Algorithm (m=p): 1.Each PE partitions its n/p elements into p=m sub-buckets  O(n/p) 2.Using all-to-all personalized broadcast algorithm, each PE sends sub- buckets to the appropriate Pes. All-to-All Scatter of 1 unit messages can be done in O(plogp) time on a p processor Hypercube. Therefore, this step takes  (n/p 2 )*plogp= (n/p)logp time 3.Each PE sorts its own bucket using the seq. bucket sort alg. If there is uniform distribution and (n/p) elements can be put in (n/p) buckets then sorting can be done in  O(n/p) Time. (Otherwise, it is [n/p*log(n/p)]) Total Time  T par = T (1) + T (2) + T (3) = n/p + (n/p)logp + n/p

Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M Pearson Education Inc. All rights reserved. Next: Pipelining + Prime Number Sieve