CS 284a Lecture, Tuesday, 29 October, 1997. Copyright (c) 1997-98, John Thornley.


Multithreaded Sorting: The Problem with Quicksort and Mergesort
Sequential partition or merge limits speedup.
Speedup(n, p) = (p log2(n)) / (2p - 2 + log2(n/p)).
Speedup(n, ∞) = log2(n)/2.
Example:
  Speedup(100 million, 2) = 1.9 (96%)
  Speedup(100 million, 4) = 3.5 (88%)
  Speedup(100 million, 8) = 5.7 (71%)
  Speedup(100 million, 16) = 8.1 (51%)
  Speedup(100 million, 32) = 10.2 (32%)
  Speedup(100 million, 64) = 11.6 (18%)

The PSRS Algorithm (Parallel Sorting by Regular Sampling)
Basic idea:
  – Split data into k equal-sized segments.
  – Sort segments concurrently (e.g., using quicksort).
  – Parallel k-way merge of sorted segments.
Another name: "one-deep parallel mergesort".
Key algorithm is parallel k-way merge.
Complexity: O(n/p log(n)) for n > p^3.
Fastest general-purpose parallel sorting algorithm.

PSRS Algorithm
    void PSRS(int n, item data[], item result[], int k);
    /* Precondition: */
    /*     k >= 1 and n >= 2*k*k and */
    /*     data[0 .. n - 1] allocated and result[0 .. n - 1] allocated. */
    /* Postcondition: */
    /*     ascending(result[0 .. n - 1]) and */
    /*     permutation(result[0 .. n - 1], in data[0 .. n - 1]). */

Step 1: Divide the Data into Segments
[Figure: the data array divided into segments.]
Sequential complexity: O(k)

Step 2: In Parallel, Sort the Data Segments
[Figure: each data segment sorted by its own sequential quicksort.]
Sequential complexity: O(n log2(n/k))

Step 3: Take Evenly-Spaced Sample Points From the Sorted Data Segments
[Figure: 2k^2 sample points copied from the data into the sample.]
Sequential complexity: O(2k^2)

Step 4: Sort the Data Sample
[Figure: the sample sorted by a sequential quicksort.]
Sequential complexity: O(2k^2 log2(2k^2))

Step 5: Choose Evenly-Spaced Pivots From the Sorted Data Sample
[Figure: pivots 3, 6, 9 chosen from the sorted sample.]
Sequential complexity: O(k)

Step 6: Partition the Sorted Data Segments Using the Pivots
[Figure: each sorted data segment split at pivots 3, 6, 9.]
Sequential complexity: O(k^2 log2(n/k))

Step 7: Compute the Sizes of the Result Partitions
[Figure: partition sizes from each data segment summed to give the size of each result partition.]
Sequential complexity: O(k^2)

Step 8: In Parallel, Merge the Partitioned Data Segments into the Result Partitions
[Figure: each result partition produced from the data by its own sequential k-way merge.]
Sequential complexity: O(n log2(k))

Overall Sequential Complexity
Step 1: O(k) - Divide data into segments.
Step 2: O(n log2(n/k)) - Sort data segments.
Step 3: O(2k^2) - Sample sorted data segments.
Step 4: O(2k^2 log2(2k^2)) - Sort data sample.
Step 5: O(k) - Choose pivots from sorted data sample.
Step 6: O(k^2 log2(n/k)) - Partition sorted data segments.
Step 7: O(k^2) - Compute result partition sizes.
Step 8: O(n log2(k)) - Merge data into result partitions.
Dominant terms (2k^2 << n): O(n log2(n/k)) + O(n log2(k)).
For fixed k, as n → ∞, complexity → O(n log2(n)).

Multithreaded PSRS
    void PSRS(int n, item data[], item result[], int k, int t);
    /* Precondition: */
    /*     k >= 1 and n >= 2*k*k and */
    /*     data[0 .. n - 1] allocated and result[0 .. n - 1] allocated and */
    /*     t >= 1. */
    /* Postcondition: */
    /*     ascending(result[0 .. n - 1]) and */
    /*     permutation(result[0 .. n - 1], in data[0 .. n - 1]). */
Extra argument: t is the number of threads used. If t > k, k threads are used.

Multithreaded PSRS Algorithm
Every step can be t-way multithreaded.
Important steps to multithread:
  – Step 2: Sort data segments.
  – Step 8: Merge data into result partitions.
Complexity: O(n/t log2(n/k)) + O(n/t log2(k)).
For fixed k, as n → ∞, complexity → O(n/t log2(n)).

Multithreaded Performance Issues
Load Balance:
  – How evenly-sized will the partitions be?
  – What if the data is not uniformly distributed?
  – What if there are lots of duplicates in the data?
  – Can we solve load balancing by having k > t?
Algorithm Overhead:
  – How does sequential performance compare with quicksort?
  – How does sequential performance depend on k?
Multithreading:
  – What is the cost of thread creation? Should we use barriers?
  – What are the cache/memory access issues?

Costs of Partitioning
Two dominant performance terms:
  – Step 2: O(n log2(n/k)) - Sort data segments.
  – Step 8: O(n log2(k)) - Merge data into result partitions.
As k increases, step 2 cost decreases.
As k increases, step 8 cost increases.
Both extremes (k = 1, k = n) are O(n log2(n)).
Overall effect depends on constant multipliers.
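To see why the constant multipliers decide the outcome, attach hypothetical constants a (segment sort) and b (k-way merge) to the two dominant terms. These constants are illustrative assumptions and do not appear in the lecture.

```latex
\begin{align*}
T(n,k) &\approx a\,n\log_2\!\left(\frac{n}{k}\right) + b\,n\log_2(k) \\
       &= a\,n\log_2(n) - a\,n\log_2(k) + b\,n\log_2(k) \\
       &= a\,n\log_2(n) + (b - a)\,n\log_2(k)
\end{align*}
```

Under this model the cost is a n log2(n) at k = 1 and b n log2(n) at k = n. With a = b the total is the same for every k; if merging costs more per element than sorting (b > a) the total grows with k, and if it costs less the total shrinks.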