1
Sorting Algorithms CS 524 – High-Performance Computing
2
CS 524 (Au 2004/05) - Asim Karim @ LUMS
Sorting
- Sorting is the task of arranging an unordered collection (sequence) of elements into monotonically increasing (or decreasing) order.
- Sorting transforms an unordered sequence S = (a_1, a_2, a_3, …, a_n) into the sequence S' = (a'_1, a'_2, a'_3, …, a'_n), where a'_i ≤ a'_j for 1 ≤ i ≤ j ≤ n and S' is a permutation of S.
- Sorting algorithms can be categorized as internal (S fits in main memory) or external (S does not fit in main memory). We study internal algorithms only.
- Sorting algorithms can also be categorized as comparison-based or noncomparison-based.
3
Data Storage on Parallel Computers
- Storage of input and output sequences
  - Where? On one processor, or distributed among the processors?
  - How? What is the order of the data distribution with respect to the order of the processors?
4
Compare-Exchange on Parallel Computers
- One element per processor: a_i on P_i and a_j on P_j.
- A compare-exchange between two processors P_i and P_j requires a communication step and a comparison operation.
- A parallel system with as many processors as elements would deliver poor performance. Why?
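A minimal sketch of the compare-exchange step, simulated on a single machine. The list `values` stands in for the processors (index k plays the role of P_k), and the function name `compare_exchange` is an illustrative assumption, not something taken from the slides.

```python
def compare_exchange(values, i, j):
    """Simulate a compare-exchange between P_i and P_j, with i < j.

    Both processors see both elements (the communication step);
    P_i keeps the smaller element and P_j keeps the larger one.
    """
    a_i, a_j = values[i], values[j]
    values[i] = min(a_i, a_j)
    values[j] = max(a_i, a_j)


values = [7, 3]
compare_exchange(values, 0, 1)
print(values)
```

In a real message-passing system the two reads of `values` would be a send and a receive, so each comparison pays a communication cost.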
5
Compare-Split on Parallel Computers (1)
6
Compare-Split on Parallel Computers (2)
- Each processor has n/p elements of the sequence.
- Initially, processor P_i holds block A_i.
- After sorting, the blocks are ordered such that A'_i ≤ A'_j for i ≤ j, and the union of the A_i equals the union of the A'_i.
- Compare-split
  - Each processor sends its block to the other (each block is already sorted locally).
  - Each processor merges the two blocks of elements.
  - Each processor splits the merged elements and retains the appropriate half: the lower-numbered processor keeps the smaller half and the higher-numbered processor keeps the larger half.
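The compare-split operation can be sketched as follows. This is a single-machine illustration under the assumption that both blocks are already sorted; the function name `compare_split` is hypothetical, and the return value models what the lower- and higher-numbered processor each retain.

```python
from heapq import merge


def compare_split(block_i, block_j):
    """Return the blocks kept by P_i and P_j (i < j) after a compare-split.

    Both input blocks must already be sorted. The merged sequence is split
    so that P_i keeps as many elements as it started with (the smallest
    ones) and P_j keeps the rest.
    """
    merged = list(merge(block_i, block_j))  # linear-time merge of sorted blocks
    k = len(block_i)
    return merged[:k], merged[k:]


low, high = compare_split([1, 6, 8, 11, 13], [2, 7, 9, 10, 12])
print(low, high)
```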
7
Sorting Network (1)
- A sorting network is a specialized interconnection network that can perform many comparisons simultaneously, which improves sorting performance significantly.
- Key component of the sorting network: the comparator
  - Increasing comparator: outputs the smaller input first and the larger input second.
  - Decreasing comparator: outputs the larger input first and the smaller input second.
8
Sorting Network (2)
9
Bubble Sort
- Complexity: O(n^2)
- Bubble sort is difficult to parallelize. Why?
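For reference, a minimal sequential bubble sort is sketched below (the function name is illustrative). Note that within one pass, each compare-exchange works on the element left in place by the previous one, which is a useful hint when thinking about the question above.

```python
def bubble_sort(seq):
    """Return a sorted copy of seq using bubble sort, O(n^2) comparisons."""
    a = list(seq)
    n = len(a)
    for last in range(n - 1, 0, -1):
        # One pass: the largest remaining element "bubbles" to index `last`.
        for j in range(last):
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
    return a


print(bubble_sort([3, 2, 3, 8, 5, 6, 4, 1]))
```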
10
Odd-Even Transposition Sort (1)
11
Odd-Even Transposition Sort (2)
12
Parallel Implementation: p = n
- Data partitioning: each processor P_i has one element a_i.
- Computation and communication: during each phase, the odd- or even-numbered processors perform a compare-exchange with their right neighbors.
- Performance
  - On a linear array
  - On a crossbar
  - On a bus
- Not cost optimal
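A sequential simulation of the p = n formulation is sketched below. Each list index plays the role of one processor, indices are 0-based (so the first phase pairs indices 0-1, 2-3, and so on), and the function name is an assumption made for illustration. All compare-exchanges inside one phase touch disjoint pairs, so on a parallel machine they could run at the same time.

```python
def odd_even_transposition_sort(seq):
    """Return a sorted copy of seq using n phases of odd-even transposition."""
    a = list(seq)
    n = len(a)
    for phase in range(n):
        start = phase % 2  # alternate between pairs (0,1),(2,3),... and (1,2),(3,4),...
        for i in range(start, n - 1, 2):
            # Compare-exchange with the right neighbor; pairs are independent.
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
    return a


print(odd_even_transposition_sort([3, 2, 3, 8, 5, 6, 4, 1]))
```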
13
Parallel Implementation: p < n
- Data partitioning: each processor P_i has n/p elements in the block A_i.
- Computation and communication: sort A_i locally (using merge sort or quicksort). Then execute p phases (p/2 odd and p/2 even), performing compare-split operations with the right neighboring processor.
- Performance
  - On a linear array
  - On a crossbar
  - On a bus
- Cost optimal on linear array and crossbar when p = O(log n). Not cost optimal on bus.
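The block-based formulation can be simulated in the same way. The sketch below assumes p divides n evenly; `blocks[k]` stands in for the local memory of P_k, and the compare-split is done with a plain `sorted` call for brevity rather than a linear-time merge.

```python
def block_odd_even_sort(seq, p):
    """Simulate odd-even transposition sort with p processors and n/p elements each.

    Assumes len(seq) is a multiple of p.
    """
    size = len(seq) // p
    # Local sort of each block A_i.
    blocks = [sorted(seq[k * size:(k + 1) * size]) for k in range(p)]
    for phase in range(p):
        for i in range(phase % 2, p - 1, 2):
            # Compare-split with the right neighbor: P_i keeps the smaller half.
            merged = sorted(blocks[i] + blocks[i + 1])
            blocks[i], blocks[i + 1] = merged[:size], merged[size:]
    return [x for block in blocks for x in block]


print(block_odd_even_sort([9, 1, 8, 2, 7, 3, 6, 4, 5, 0, 11, 10], 4))
```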
14
Shellsort (1)
- Odd-even transposition sort moves elements one position at a time.
- If a sequence has only a few unordered elements, and they are far from their correct positions, then odd-even (OE) sort will take a long time to sort the sequence.
- Shellsort can move elements longer distances. It has two phases:
  - In the first phase, blocks that are far apart are compare-split.
  - In the second phase, an odd-even transposition sort is conducted. This continues as long as blocks are changing positions.
15
Shellsort (2)
16
Shellsort (3)
- Initially, each processor sorts its block of elements locally.
- First phase
  1. Compare-split P_i (i < p/2) with P_(p-i-1) (reverse-order compare-split).
  2. Partition the processors into two groups: one group has the first p/2 processors and the other has the remaining p/2 processors. Compare-split (in reverse order) within each group.
  3. Go to 1 and repeat within each group, for log p steps in total.
- Second phase
  - Perform OE sort until no changes occur.
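The two phases above can be sketched as a single-machine simulation. This assumes p is a power of two that divides n; the function names, and the rule of stopping once one even and one odd phase in a row change nothing, are illustrative choices rather than details taken from the slides.

```python
def compare_split(low_block, high_block):
    """Lower-numbered processor keeps the smaller half, the other the larger half."""
    merged = sorted(low_block + high_block)
    return merged[:len(low_block)], merged[len(low_block):]


def parallel_shellsort(seq, p):
    """Simulate parallel shellsort; assumes p is a power of two dividing len(seq)."""
    size = len(seq) // p
    blocks = [sorted(seq[k * size:(k + 1) * size]) for k in range(p)]

    # First phase: reverse-order compare-splits inside groups of p, p/2, ..., 2.
    group = p
    while group > 1:
        for start in range(0, p, group):
            for i in range(group // 2):
                lo, hi = start + i, start + group - 1 - i
                blocks[lo], blocks[hi] = compare_split(blocks[lo], blocks[hi])
        group //= 2

    # Second phase: odd-even phases until two consecutive phases change nothing.
    phase, quiet = 0, 0
    while quiet < 2:
        changed = False
        for i in range(phase % 2, p - 1, 2):
            new_lo, new_hi = compare_split(blocks[i], blocks[i + 1])
            if (new_lo, new_hi) != (blocks[i], blocks[i + 1]):
                changed = True
            blocks[i], blocks[i + 1] = new_lo, new_hi
        quiet = 0 if changed else quiet + 1
        phase += 1
    return [x for block in blocks for x in block]


data = [9, 1, 8, 2, 7, 3, 6, 4, 5, 0, 11, 10, 15, 13, 14, 12]
print(parallel_shellsort(data, 4))
```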
17
Shellsort (4)
- Performance
  - On a linear array
  - On a crossbar
  - On a bus
18
Quicksort (1)
19
Quicksort (2)
- Recursive divide-and-conquer algorithm with an average complexity of O(n log n)
20
Quicksort (3)
- Partitioning a sequence of length n has complexity O(n).
- The choice of pivot significantly affects the overall complexity of quicksort.
- In the worst case, where a sequence of length n is partitioned into subsequences of lengths 1 and n-1, the overall complexity becomes O(n^2).
- On average, the complexity is O(n log n).
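A minimal sequential quicksort is sketched below to make the O(n) partitioning step concrete. It uses the last element as the pivot (the Lomuto scheme), which is an assumption chosen for brevity; the slides do not say which partitioning scheme or pivot rule is intended. With this pivot rule, an already sorted input triggers the unbalanced worst case.

```python
def partition(a, lo, hi):
    """Partition a[lo..hi] around the pivot a[hi]; return the pivot's final index.

    A single left-to-right scan, so the cost is linear in hi - lo.
    """
    pivot = a[hi]
    i = lo
    for j in range(lo, hi):
        if a[j] <= pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]
    return i


def quicksort(a, lo=0, hi=None):
    """Sort the list a in place."""
    if hi is None:
        hi = len(a) - 1
    if lo < hi:
        s = partition(a, lo, hi)
        quicksort(a, lo, s - 1)   # elements <= pivot
        quicksort(a, s + 1, hi)   # elements > pivot


data = [3, 2, 1, 5, 8, 4, 3, 7]
quicksort(data)
print(data)
```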
21
Parallelizing Quicksort
- A naïve formulation
  - Start with one process, which does the initial partitioning. Then assign one of the subproblems (the recursion) to another process. Repeat for each subsequence until no further partitioning is possible.
  - Not cost-optimal (Why?)
  - Analysis
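One common way to carry out the analysis is sketched here. It assumes up to n processes are available and that every partition splits its subsequence evenly; the symbols T_P (parallel run time) and W (serial work) are notation introduced for this sketch. The first partitioning step is done by a single process and already costs Θ(n), and the later levels work on subsequences of half the length each time:

```latex
T_P = \Theta(n) + \Theta\!\left(\tfrac{n}{2}\right) + \Theta\!\left(\tfrac{n}{4}\right) + \cdots = \Theta(n),
\qquad
p \, T_P = n \cdot \Theta(n) = \Theta(n^2),
\qquad
W = \Theta(n \log n).
```

Under these assumptions the processor-time product Θ(n^2) grows faster than the serial work Θ(n log n), which is why the formulation is not cost-optimal.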
22
Message-Passing Parallel Formulation
- Data partitioning: each processor P_i has a block A_i of n/p elements.
- Computation and communication
  - Select a pivot.
  - Broadcast the pivot to all processors.
  - Locally rearrange the block A_i into sub-blocks S_i (elements smaller than the pivot) and L_i (elements larger than the pivot).
  - Combine S_i and L_i from all processors into S and L.
  - Assign S to one group of processors and L to the other.
  - Recursively perform these operations until a sub-block is assigned to only one processor. Then each processor sorts its sub-block locally.
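The steps above can be simulated on one machine as follows. Several details are assumptions made for this sketch and are not specified in the slides: the pivot is simply the first element held by the group, elements equal to the pivot go into S, and processors are divided between S and L roughly in proportion to their sizes (with at least one processor per side). The helper name `split_evenly` is hypothetical.

```python
def split_evenly(items, k):
    """Distribute items over k processors as evenly as possible."""
    q, r = divmod(len(items), k)
    out, start = [], 0
    for i in range(k):
        end = start + q + (1 if i < r else 0)
        out.append(items[start:end])
        start = end
    return out


def parallel_quicksort(blocks):
    """blocks[k] is the block held by P_k; returns the final sorted blocks."""
    p = len(blocks)
    if p == 1:
        return [sorted(blocks[0])]           # local sort on a single processor
    elements = [x for block in blocks for x in block]
    if not elements:
        return [[] for _ in range(p)]
    pivot = elements[0]                       # select and "broadcast" the pivot
    # Local rearrangement into S_i and L_i, then combination into S and L.
    small = [x for block in blocks for x in block if x <= pivot]
    large = [x for block in blocks for x in block if x > pivot]
    # Give each side at least one processor, otherwise proportional to its size.
    p_small = min(p - 1, max(1, round(p * len(small) / len(elements))))
    return (parallel_quicksort(split_evenly(small, p_small))
            + parallel_quicksort(split_evenly(large, p - p_small)))


blocks = [[7, 13, 18, 2], [17, 1, 14, 20], [6, 10, 15, 9], [3, 16, 19, 4]]
result = parallel_quicksort(blocks)
print(result)
```

Because the pivot is picked without regard to the data distribution, the final blocks can be quite unequal in size, which illustrates why pivot selection matters for load balance.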