Bitonic Sorting and Its Circuit Design

Kenneth E. Batcher, Professor, Kent State University
http://www.cs.kent.edu/~batcher
“Sorting networks and their applications”, AFIPS Proc. of 1968 Spring Joint Computer Conference, Vol. 32, pp. 307-314.

Background
Sorting is fundamental.
The lower bound for any comparison-based sequential sorting algorithm is Ω(n log n).
Can we improve the time complexity further?
Parallel algorithms; circuit/network design; parallel computing models.

① Bitonic Sequence
A bitonic sequence is a sequence of elements {a_0, a_1, …, a_{n-1}} where either
(1) there exists an index i, 0 ≤ i ≤ n-1, such that {a_0, …, a_i} is monotonically increasing and {a_{i+1}, …, a_{n-1}} is monotonically decreasing, e.g. {1, 2, 4, 7, 6, 0}; or
(2) there exists a cyclic shift of indices so that (1) is satisfied, e.g. {8, 9, 2, 1, 0, 4} → {0, 4, 8, 9, 2, 1}.
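The definition above can be checked mechanically. The following is a minimal sketch, not part of the original slides: the function names `is_up_then_down` and `is_bitonic` are hypothetical, and the check uses non-strict comparisons (equal neighbours are allowed), which is an assumption. It simply tries every cyclic shift, so it runs in O(n²) time and is meant for illustration only.

```python
def is_up_then_down(seq):
    """True if seq rises (non-strictly) to some index and then falls.

    This is case (1) of the definition; either part may be empty.
    """
    n = len(seq)
    i = 0
    while i + 1 < n and seq[i] <= seq[i + 1]:   # walk up the increasing part
        i += 1
    while i + 1 < n and seq[i] >= seq[i + 1]:   # walk down the decreasing part
        i += 1
    return i >= n - 1                            # reached the end: no second rise


def is_bitonic(seq):
    """True if some cyclic shift of seq satisfies case (1) (this is case (2))."""
    seq = list(seq)
    n = len(seq)
    return any(is_up_then_down(seq[k:] + seq[:k]) for k in range(max(n, 1)))


if __name__ == "__main__":
    print(is_bitonic([1, 2, 4, 7, 6, 0]))   # case (1) example from the slide
    print(is_bitonic([8, 9, 2, 1, 0, 4]))   # case (2) example from the slide
    print(is_bitonic([1, 3, 2, 4]))         # two rises and two falls
```

The first two calls use the slide's own examples; the third sequence is an illustrative counterexample that no cyclic shift can repair.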

① Bitonic Sequence: Examples
{3, 5, 7, 9, 8, 6, 4, 2}
{8, 6, 4, 2, 3, 5, 7, 9}
[Figures: value of element a_i plotted against a_0 … a_7 for each sequence.]

① Bitonic Sequence: Examples
{3, 5, 7, 9, 11, 13, 15, 17} (the decreasing part is empty)
{5, 3, 1, 2, 4, 6, 8, 7} (a cyclic shift gives {1, 2, 4, 6, 8, 7, 5, 3})
[Figures: value of element a_i plotted against a_0 … a_7 for each sequence.]

Bitonic Sort: Basic Idea
Consider a bitonic sequence S of size n where the first half {a_0, a_1, …, a_{n/2-1}} is increasing and the second half {a_{n/2}, a_{n/2+1}, …, a_{n-1}} is decreasing.
[Figure: value of element a_i plotted against a_0 … a_{n-1}.]

② Bitonic Split
Pair-wise min-max comparison (compare and exchange):
s1 = {min(a_0, a_{n/2}), min(a_1, a_{n/2+1}), …, min(a_{n/2-1}, a_{n-1})}
s2 = {max(a_0, a_{n/2}), max(a_1, a_{n/2+1}), …, max(a_{n/2-1}, a_{n-1})}
[Figure: the sequence before the split, and the subsequences S1 and S2 after it.]

S1 and S2
There exists an element b in S1 such that all elements before b are increasing and all elements after b are decreasing.
There exists an element c in S2 such that all elements before c are decreasing and all elements after c are increasing.
Both S1 and S2 are bitonic sequences.
Every element of S1 < every element of S2, because b < c, b is the maximum value in S1, and c is the minimum value in S2.
[Figure: S2, with minimum c, lies above S1, with maximum b.]

Example: pair-wise min-max comparison (compare and exchange)
{2, 4, 6, 8, 7, 5, 3, 1} = {2, 4, 6, 8 | 7, 5, 3, 1} => S1 = {2, 4, 3, 1}, S2 = {7, 5, 6, 8}
One bitonic sequence of size 8 => two bitonic sequences of size 4.
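The split can be written in a few lines. This is a minimal sketch assuming an even-length input; the function name `bitonic_split` is hypothetical and does not come from the slides.

```python
def bitonic_split(seq):
    """Pair element i with element i + n/2; the minima go to s1, the maxima to s2."""
    half = len(seq) // 2
    s1 = [min(seq[i], seq[i + half]) for i in range(half)]
    s2 = [max(seq[i], seq[i + half]) for i in range(half)]
    return s1, s2


if __name__ == "__main__":
    s1, s2 = bitonic_split([2, 4, 6, 8, 7, 5, 3, 1])
    print(s1, s2)             # the slide's example: [2, 4, 3, 1] [7, 5, 6, 8]
    print(max(s1) < min(s2))  # every element of s1 is below every element of s2
```

On a parallel machine the n/2 comparisons are independent, so one split costs a single compare-exchange step.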

② Bitonic Split
Bitonic split: Bitonic(n) => 2 Bitonic(n/2).
The split is applicable to any bitonic sequence. The first half need not be increasing (or decreasing), nor the second half decreasing (or increasing).

Sorting a bitonic sequence
By using bitonic split recursively:
INPUT: a bitonic sequence of size n
Phase 1: 2 bitonic sequences of size n/2
Phase 2: 4 bitonic sequences of size n/4
…
Phase (log n): n bitonic sequences of size 1
A sorted sequence is obtained by concatenating the n bitonic sequences of size 1.
Put recursively: sorting a bitonic sequence of size n means sorting two smaller bitonic sequences of size n/2 and concatenating the results, …, down to sorting n bitonic sequences of size 1 and concatenating the results, which yields a sorted sequence of size n.
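The recursive scheme above can be sketched as follows. This is an illustrative sequential version, assuming the input is already bitonic and its length is a power of two; the name `bitonic_merge` and the `ascending` flag are hypothetical choices, not taken from the slides.

```python
def bitonic_merge(seq, ascending=True):
    """Sort a bitonic sequence (length a power of two) by recursive bitonic splits."""
    n = len(seq)
    if n <= 1:
        return list(seq)
    half = n // 2
    # One bitonic split: pair-wise minima and maxima.
    lo = [min(seq[i], seq[i + half]) for i in range(half)]
    hi = [max(seq[i], seq[i + half]) for i in range(half)]
    # Both halves are bitonic again, and every element of lo is <= every element of hi.
    if ascending:
        return bitonic_merge(lo, True) + bitonic_merge(hi, True)
    return bitonic_merge(hi, False) + bitonic_merge(lo, False)


if __name__ == "__main__":
    print(bitonic_merge([2, 4, 6, 8, 7, 5, 3, 1]))
    print(bitonic_merge([2, 4, 6, 8, 7, 5, 3, 1], ascending=False))
```

The recursion depth is log n, matching the log n phases listed above; in a circuit, the two recursive calls at each level run side by side.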

③ Bitonic Merge
Sort a bitonic sequence using bitonic splits.
[Figure: 16-input network, inputs 1 to 16, with split lengths 16, 8, 4, 2.]
Some arrow directions are wrong. Anything wrong with this slide?

Bitonic Merge Circuit: BM[16]
What do you think of? → A hypercube.

Questions?
How can we convert an unsorted sequence into a bitonic sequence? (Then, by using bitonic split recursively, a sorted sequence can be formed.)

Turn an unsorted sequence into a bitonic sequence: ③ Bitonic Merge (BM) Operation
[Figure: 16-input network, inputs 1 to 16, with merge lengths 4, 8, 16.]
At every phase, sort bitonic sequences of size 2, 4, 8, 16 into monotonically increasing or decreasing sequences.

Turn an unsorted sequence into a bitonic sequence

④ Bitonic Sort
[Figure: 16-input bitonic sorting network, inputs 1 to 16, with merge lengths 4, 8, 16.]

Sorting a sequence of any order
Use bitonic merge repeatedly.
Definitions:
Increasing BM[n]: increasing bitonic merge of size n, a bitonic merge that sorts a bitonic sequence of size n into a monotonically increasing sequence.
Decreasing BM[n]: decreasing bitonic merge of size n, a bitonic merge that sorts a bitonic sequence of size n into a monotonically decreasing sequence.

Steps:
Divide the sequence into groups of 2. Any sequence of size 2 is a bitonic sequence: either the increasing part is of size 2 and the decreasing part is of size 0, or vice versa.
Use an increasing BM[2] on one group to form an increasing sequence, and a decreasing BM[2] on the adjacent group to form a decreasing sequence.
Concatenate the two groups to form a bitonic sequence of size 4.

Steps (continued):
Repeat the above steps on the other groups.
Repeat the above steps recursively until a bitonic sequence of size n is formed.
Use bitonic merge once more to turn the bitonic sequence into a sorted sequence.

Bitonic Sorting Circuit: BS(16)
[Figure legend: increasing BM[n] = increasing bitonic merge of size n; decreasing BM[n] = decreasing bitonic merge of size n.]

Sorting a sequence of any order
Hence: n unsorted numbers => n/2 groups of 2-number bitonic sequences => n/4 groups of 4-number bitonic sequences => … => 1 group of one n-number bitonic sequence => a sorted sequence.
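The full procedure (build ever larger bitonic sequences with alternating increasing and decreasing merges, then merge once more) can be sketched sequentially as below. This is a minimal illustration, assuming the input length is a power of two; the names `bitonic_merge` and `bitonic_sort` are hypothetical.

```python
def bitonic_merge(seq, ascending):
    """Sort a bitonic sequence by recursive bitonic splits (the BM[n] operation)."""
    n = len(seq)
    if n <= 1:
        return list(seq)
    half = n // 2
    lo = [min(seq[i], seq[i + half]) for i in range(half)]
    hi = [max(seq[i], seq[i + half]) for i in range(half)]
    if ascending:
        return bitonic_merge(lo, True) + bitonic_merge(hi, True)
    return bitonic_merge(hi, False) + bitonic_merge(lo, False)


def bitonic_sort(seq, ascending=True):
    """Sort any sequence whose length is a power of two."""
    n = len(seq)
    if n <= 1:
        return list(seq)
    if n & (n - 1):
        raise ValueError("length must be a power of two")
    half = n // 2
    # Sort one half increasing and the other decreasing: together they are bitonic.
    first = bitonic_sort(seq[:half], True)
    second = bitonic_sort(seq[half:], False)
    # A final bitonic merge turns the bitonic sequence into a sorted one.
    return bitonic_merge(first + second, ascending)


if __name__ == "__main__":
    print(bitonic_sort([3, 7, 4, 8, 6, 2, 1, 5]))
```

The order of comparisons never depends on the data, which is what allows the same procedure to be laid out as a fixed circuit.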

⑤ Complexity of Bitonic Sort
Parallel bitonic sort with n processors:
The last stage of an n-element bitonic sort merges n elements and has a depth of log n.
The earlier stages perform a complete sort of n/2 elements on each half, in parallel.
Depth: d(n) = d(n/2) + log n, so d(n) = 1 + 2 + 3 + … + log n = (log n)(log n + 1)/2 = Θ(log² n).
Complexity: T(n) = Θ(log² n).
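Unrolling the recurrence d(n) = d(n/2) + log n gives the sum 1 + 2 + 3 + … + log n = (log n)(log n + 1)/2. The short check below evaluates the recurrence directly and compares it with that closed form. It is a sketch under two assumptions that the slide does not state: n is a power of two and d(1) = 0. The function names are hypothetical.

```python
def merge_depth(n):
    """Depth of a bitonic merge on n = 2**k inputs: k compare-exchange steps."""
    return n.bit_length() - 1   # exact log2 for powers of two


def sort_depth(n):
    """Depth from the recurrence d(n) = d(n/2) + log2(n), with d(1) = 0 assumed."""
    if n == 1:
        return 0
    return sort_depth(n // 2) + merge_depth(n)


if __name__ == "__main__":
    for n in (2, 4, 16, 1024):
        k = merge_depth(n)
        assert sort_depth(n) == k * (k + 1) // 2   # closed form of 1 + 2 + ... + k
        print(n, sort_depth(n))
```

For the 16-input network this gives a depth of 1 + 2 + 3 + 4 = 10 compare-exchange steps.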

⑤ Complexity of Bitonic Sort
Parallel sorting with a block of elements per processor:
Sort the local block of elements first (using any sorting algorithm, such as quicksort or bitonic sort).
Sort the elements among processors using parallel bitonic sort.
T(n) = T(local_sort) + T(comparisons) + T(communication)
Only computation time is considered here; you need to consider all communication time as well.

⑥ Concluding Remarks
Bitonic Sorting: Common Sense Regression to Computer Science
One of 10 Most Important Papers
Parallel Algorithm: Ascend/Descend
Another example: Prefix sum
Network Model:

Bitonic Sorting Network
Hypercube connections!
Try to write the bitonic sorting algorithm on a hypercube.

Bitonic Sort on Butterfly

PRAM Model
[Figure: processors P1, P2, P3, …, Pn connected to a shared memory.]
Access time from any processor to any memory unit is equal.
This is impossible in practice, so PRAM is an idealized model for parallel computing.

PRAM Model
Program for Sum = a(1) + a(2) + … + a(N):
for i = 1 to log N
    for j = 1 to N/2^i parallel do
        a(j) = a(j) + a(N/2^i + j)
    endpar
endfor
Finally, a(1) is the sum.
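The PRAM summation can be simulated sequentially to see which cells are combined in each step. This is a minimal sketch, assuming N is a power of two and using 0-based indices (so the result ends up in a[0] rather than a(1)); the function name `pram_sum` is hypothetical.

```python
def pram_sum(values):
    """Simulate the log N parallel steps of the PRAM summation on a shared array."""
    a = list(values)
    n = len(a)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    half = n // 2                      # N / 2^i for i = 1, 2, ..., log N
    while half >= 1:
        # One parallel step. Reads come from a[half:2*half] and writes go to
        # a[0:half], so running the iterations one after another is equivalent.
        for j in range(half):
            a[j] = a[j] + a[half + j]
        half //= 2
    return a[0]


if __name__ == "__main__":
    print(pram_sum([1, 2, 3, 4, 5, 6, 7, 8]))
```

Each pass of the `while` loop corresponds to one value of i in the pseudocode, so N values are summed in log N parallel steps.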

Hypercube Model
Suppose node N(i) holds element a(i), where i is the value of the node index x_1x_2…x_n.
for i = 1 to n
    for all nodes 00…0 x_i x_{i+1}…x_n with x_i = 0 parallel do
        N(00…0 (x_i=0) x_{i+1}…x_n) ← N(00…0 (x_i=1) x_{i+1}…x_n);
        a(00…0 (x_i=0) x_{i+1}…x_n) = a(00…0 (x_i=0) x_{i+1}…x_n) + a(00…0 (x_i=1) x_{i+1}…x_n)
    endpar
endfor
Finally, node 00…0 holds the sum.
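The hypercube summation can be simulated by giving every node its own local value and exchanging messages only between neighbours that differ in one index bit. This is a sketch under stated assumptions: x_1 is taken to be the most significant bit of the node index, step i combines across dimension i, and the number of nodes is a power of two. The function name `hypercube_reduce` is hypothetical.

```python
def hypercube_reduce(values):
    """Simulate summation on a hypercube; returns the final local value of every node."""
    n_nodes = len(values)
    if n_nodes == 0 or n_nodes & (n_nodes - 1):
        raise ValueError("number of nodes must be a power of two")
    local = list(values)                     # local[i] is the value held by node N(i)
    dims = n_nodes.bit_length() - 1          # hypercube dimension n
    for i in range(1, dims + 1):
        bit = 1 << (dims - i)                # mask for x_i (x_1 = most significant bit)
        # Senders have x_1 .. x_{i-1} = 0 and x_i = 1, i.e. indices bit .. 2*bit - 1.
        # Each sends to the neighbour that differs only in x_i.
        messages = {node ^ bit: local[node] for node in range(bit, 2 * bit)}
        for receiver, value in messages.items():
            local[receiver] += value         # the receiver adds the incoming value
    return local


if __name__ == "__main__":
    result = hypercube_reduce([1, 2, 3, 4, 5, 6, 7, 8])
    print(result[0])                         # node 00...0 holds the total
```

With 8 nodes, the first step has nodes 4 to 7 sending to nodes 0 to 3, the second has nodes 2 and 3 sending to nodes 0 and 1, and the last has node 1 sending to node 0, so only hypercube edges are ever used.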

Hypercube Model
Suppose node 000 holds element a(0) and node 111 holds element a(7).
[Figure: an 8-node hypercube holding a(0) … a(7). After the first step, four nodes hold a(0)+a(4), a(1)+a(5), a(2)+a(6), and a(3)+a(7). After the second step, two nodes hold a(0)+a(4)+a(2)+a(6) and a(1)+a(5)+a(3)+a(7). After the third step, one node holds a(0)+a(4)+a(2)+a(6)+a(1)+a(5)+a(3)+a(7).]
Prove that a graph of degree d has a diameter of at least log_d N, where N is the number of nodes.
Write a program for bitonic sorting on a hypercube. Suppose node N(i) initially holds the element a(i), where i (0 to 2^n - 1) is the decimal value of the node index x_1x_2…x_n. Finally, N(0) holds the smallest element and N(2^n - 1) holds the largest element.