Chapter 11 Sets, and Selection

Slides:



Advertisements
Similar presentations
Introduction to Algorithms
Advertisements

Merge Sort 4/15/2017 4:30 PM Merge Sort 7 2   7  2  2 7
Quicksort Quicksort     29  9.
© 2004 Goodrich, Tamassia Quick-Sort     29  9.
© 2004 Goodrich, Tamassia QuickSort1 Quick-Sort     29  9.
Quick-Sort     29  9.
© 2004 Goodrich, Tamassia Quick-Sort     29  9.
© 2004 Goodrich, Tamassia Sets1. © 2004 Goodrich, Tamassia Sets2 Set Operations (§ 10.6) We represent a set by the sorted sequence of its elements By.
© 2004 Goodrich, Tamassia Selection1. © 2004 Goodrich, Tamassia Selection2 The Selection Problem Given an integer k and n elements x 1, x 2, …, x n, taken.
TTIT33 Algorithms and Optimization – Dalg Lecture 2 HT TTIT33 Algorithms and optimization Lecture 2 Algorithms Sorting [GT] 3.1.2, 11 [LD] ,
© 2004 Goodrich, Tamassia Merge Sort1 Quick-Sort     29  9.
1 CSC401 – Analysis of Algorithms Lecture Notes 9 Radix Sort and Selection Objectives  Introduce no-comparison-based sorting algorithms: Bucket-sort and.
Selection1. 2 The Selection Problem Given an integer k and n elements x 1, x 2, …, x n, taken from a total order, find the k-th smallest element in this.
CSC 213 Lecture 15: Sets, Union/Find, and the Selection Problem.
Sorting.
Order Statistics The ith order statistic in a set of n elements is the ith smallest element The minimum is thus the 1st order statistic The maximum is.
Order Statistics. Order statistics Given an input of n values and an integer i, we wish to find the i’th largest value. There are i-1 elements smaller.
The Selection Problem. 2 Median and Order Statistics In this section, we will study algorithms for finding the i th smallest element in a set of n elements.
Sorting Fun1 Chapter 4: Sorting     29  9.
LECTURE 40: SELECTION CSC 212 – Data Structures.  Sequence of Comparable elements available  Only care implementation has O(1) access time  Elements.
© 2004 Goodrich, Tamassia Quick-Sort     29  9.
1 Merge Sort 7 2  9 4   2  2 79  4   72  29  94  4.
Towers of Hanoi Move n (4) disks from pole A to pole B such that a larger disk is never put on a smaller disk A BC ABC.
QuickSort by Dr. Bun Yue Professor of Computer Science CSCI 3333 Data.
CHAPTER 10.1 BINARY SEARCH TREES    1 ACKNOWLEDGEMENT: THESE SLIDES ARE ADAPTED FROM SLIDES PROVIDED WITH DATA STRUCTURES AND ALGORITHMS.
Sets and Multimaps 9/9/2017 Presentation for use with the textbook Data Structures and Algorithms in Java, 6th edition, by M. T. Goodrich, R. Tamassia,
Algorithm Design Techniques, Greedy Method – Knapsack Problem, Job Sequencing, Divide and Conquer Method – Quick Sort, Finding Maximum and Minimum, Dynamic.
Merge Sort 1/12/2018 5:48 AM Merge Sort 7 2   7  2  2 7
Advanced Sorting 7 2  9 4   2   4   7
Chapter 11 Sorting Acknowledgement: These slides are adapted from slides provided with Data Structures and Algorithms in C++, Goodrich, Tamassia and Mount.
Merge Sort 1/12/2018 9:39 AM Presentation for use with the textbook Data Structures and Algorithms in Java, 6th edition, by M. T. Goodrich, R. Tamassia,
Merge Sort 1/12/2018 9:44 AM Presentation for use with the textbook Data Structures and Algorithms in Java, 6th edition, by M. T. Goodrich, R. Tamassia,
Quick-Sort 2/18/2018 3:56 AM Selection Selection.
Sets and Multimaps 5/9/2018 Presentation for use with the textbook Data Structures and Algorithms in Java, 6th edition, by M. T. Goodrich, R. Tamassia,
Randomized Algorithms
Chapter 10.1 Binary Search Trees
Quick-Sort 9/12/2018 3:26 PM Presentation for use with the textbook Data Structures and Algorithms in Java, 6th edition, by M. T. Goodrich, R. Tamassia,
Quick-Sort 9/13/2018 1:15 AM Quick-Sort     2
Selection Selection 1 Quick-Sort Quick-Sort 10/30/16 13:52
Radish-Sort 11/11/ :01 AM Quick-Sort     2 9  9
Objectives Introduce different known sorting algorithms
Sequences 11/12/2018 5:23 AM Sets Sets.
Quick-Sort 11/14/2018 2:17 PM Chapter 4: Sorting    7 9
Quick-Sort 11/19/ :46 AM Chapter 4: Sorting    7 9
Randomized Algorithms
Merge Sort 12/4/ :32 AM Merge Sort 7 2   7  2   4  4 9
Ch. 8 Priority Queues And Heaps
Medians and Order Statistics
1 Lecture 13 CS2013.
CS 3343: Analysis of Algorithms
Sorting, Sets and Selection
Quick-Sort 2/23/2019 1:48 AM Chapter 4: Sorting    7 9
Merge Sort 2/23/ :15 PM Merge Sort 7 2   7  2   4  4 9
Merge Sort 2/24/2019 6:07 PM Merge Sort 7 2   7  2  2 7
Quick-Sort 2/25/2019 2:22 AM Quick-Sort     2
Order Statistics Def: Let A be an ordered set containing n elements. The i-th order statistic is the i-th smallest element. Minimum: 1st order statistic.
Copyright © Aiman Hanna All rights reserved
Quick-Sort 4/8/ :20 AM Quick-Sort     2 9  9
Sequences 4/10/2019 4:55 AM Sets Sets.
Merge Sort 4/10/ :25 AM Merge Sort 7 2   7  2   4  4 9
Quick-Sort 4/25/2019 8:10 AM Quick-Sort     2
Divide & Conquer Sorting
Quick-Sort 5/7/2019 6:43 PM Selection Selection.
The Selection Problem.
Quick-Sort 5/25/2019 6:16 PM Selection Selection.
Merge Sort 5/30/2019 7:52 AM Merge Sort 7 2   7  2  2 7
CS203 Lecture 15.
Data Structures and Algorithms CS 244
Divide-and-Conquer 7 2  9 4   2   4   7
Presentation transcript:

Chapter 11 Sets, and Selection Acknowledgement: These slides are adapted from slides provided with Data Structures and Algorithms in C++, Goodrich, Tamassia and Mount (Wiley 2004) and slides from Jory Denny and Mukulika Ghosh 1 1

Sequences 11/28/2019 10:49 AM Sets

Set Operations A set is an ordered data structure similar to an ordered map, except only elements are stored (and yes elements must be unique) We represent a set by the sorted sequence of its elements By specializing the auxiliary methods the generic merge algorithm can be used to perform basic set operations: Union - 𝐴∪𝐵 – Return all elements which appear in 𝐴 or 𝐵 (unique only) Intersection - 𝐴∩𝐵 – Return only elements which appear in both 𝐴 and 𝐵 Subtraction - 𝐴∖𝐵 – Return elements in 𝐴 which are not in 𝐵 The running time of an operation on sets 𝐴 and 𝐵 should be at most 𝑂 𝑛 𝐴 + 𝑛 𝐵 Set union: if 𝑎<𝑏 𝑆.insertFront(𝑎) if 𝑏<𝑎 𝑆.insertFront(𝑏) else 𝑎=𝑏 𝑆.insertFront(𝑎) Set intersection: if 𝑎<𝑏 {𝑑𝑜 𝑛𝑜𝑡ℎ𝑖𝑛𝑔} if 𝑏<𝑎 {𝑑𝑜 𝑛𝑜𝑡ℎ𝑖𝑛𝑔} else 𝑎=𝑏 𝑆.insertBack(𝑎)

Generic Merging Algorithm genericMerge(𝐴, 𝐵) Input: Sets 𝐴,𝐵 (implemented as sequences) Output: Set 𝑆 𝑆←∅ while ¬𝐴.empty ∧¬𝐵.empty do 𝑎←𝐴.front ; 𝑏←𝐵.front if 𝑎<𝑏 aIsLess(𝑎, 𝑆) //generic action 𝐴.eraseFront ; else if 𝑏<𝑎 bIsLess(𝑏, 𝑆) //generic action 𝐵.eraseFront else //𝑎=𝑏 bothAreEqual(𝑎, 𝑏, 𝑆) //generic action 𝐴.eraseFront ; 𝐵.eraseFront while ¬𝐴.empty() do aIsLess(𝐴.front , 𝑆); 𝐴.eraseFront while ¬𝐵.empty() do bIsLess(𝐵.front , 𝑆); 𝐵.eraseFront return 𝑆 Generalized merge of two sorted sets 𝐴 and 𝐵 Auxiliary methods (generic functions) aIsLess 𝑎, 𝑆 bIsLess(𝑏, 𝑆) bothAreEqual 𝑎, 𝑏, 𝑆 Runs in 𝑂 𝑛 𝐴 + 𝑛 𝐵 time provided the auxiliary methods run in 𝑂 1 time

Using Generic Merge for Set Operations Any of the set operations can be implemented using a generic merge For example: For intersection: only copy elements that are duplicated in both list For union: copy every element from both lists except for the duplicates All methods run in linear time

Better/Typical Set Implementation Can use search trees such that the key is equivalent to the element to implement a set, allows fast ordering of data 6 9 2 4 1 8 < > =

Merge Sort 二○一九年十一月二十八日二○一九年十一月二十八日 Selection

The Selection Problem Given an integer 𝑘 and 𝑛 elements 𝑥 1 , 𝑥 2 , …, 𝑥 𝑛 , taken from a total order, find the 𝑘-th smallest element in this set. Also called order statistics, the 𝑖th order statistic is the 𝑖th smallest element Minimum - 𝑘=1 - 1st order statistic Maximum - 𝑘=𝑛 - 𝑛th order statistic Median - 𝑘= 𝑛 2 etc

We can sort the set in 𝑂 𝑛 log 𝑛 time and then index the 𝑘-th element. The Selection Problem Naïve solution - SORT! We can sort the set in 𝑂 𝑛 log 𝑛 time and then index the 𝑘-th element. Can we solve the selection problem faster? 7 4 9 6 2  2 4 6 7 9 k=3

The Minimum (or Maximum) Algorithm minimum(𝐴) Input: Array 𝐴 Output: minimum element in 𝐴 𝑚←𝐴 1 for 𝑖←2…𝑛 do 𝑚← min 𝑚,𝐴 𝑖 return 𝑚 Running Time 𝑂(𝑛) Is this the best possible?

Quick-Select x x L G E 𝑘≤|𝐿| 𝑘> 𝐿 + 𝐸 𝑘’=𝑘− 𝐿 − 𝐸 𝐿 <𝑘≤ 𝐿 + 𝐸 Quick-select is a randomized selection algorithm based on the prune-and-search paradigm: Prune: pick a random element 𝑥 (called pivot) and partition 𝑆 into 𝐿 elements <𝑥 𝐸 elements =𝑥 𝐺 elements >𝑥 Search: depending on 𝑘, either answer is in 𝐸, or we need to recur on either 𝐿 or 𝐺 Note: Partition same as Quicksort x L G E 𝑘≤|𝐿| 𝑘> 𝐿 + 𝐸 𝑘’=𝑘− 𝐿 − 𝐸 𝐿 <𝑘≤ 𝐿 + 𝐸 (𝑑𝑜𝑛𝑒)

Quick-Select Visualization An execution of quick-select can be visualized by a recursion path Each node represents a recursive call of quick-select, and stores 𝑘 and the remaining sequence 𝑘=5, 𝑆=(7, 4,9, 3 , 2, 6, 5, 1, 8) 𝑘=2, 𝑆=(7, 4, 9, 6, 5, 8 ) 𝑘=2, 𝑆=(7, 4 , 6, 5) 𝑘=1, 𝑆=(7, 6, 5 ) 5

Exercise Good call Bad call Best Case - even splits (n/2 and n/2) Worst Case - bad splits (1 and n-1) Derive and solve the recurrence relation corresponding to the best case performance of randomized quick-select. Derive and solve the recurrence relation corresponding to the worst case performance of randomized quick-select. 7 2 9 4 3 7 6 1 9 7 2 9 4 3 7 6 1 2 4 3 1 7 2 9 4 3 7 6 Good call Bad call

Expected Running Time Consider a recursive call of quick-select on a sequence of size 𝑠 Good call: the size of 𝐿 and 𝐺 is at most 3𝑠 4 Bad call: the size of 𝐿 and 𝐺 is greater than 3𝑠 4 A call is good with probability 1/2 1/2 of the possible pivots cause good calls: 7 2 9 4 3 7 6 1 9 7 2 9 4 3 7 6 1 2 4 3 1 7 9 7 1  1 1 7 2 9 4 3 7 6 Good call Bad call 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 Bad pivots Good pivots Bad pivots

Expected Running TimE Probabilistic Fact #1: The expected number of coin tosses required in order to get one head is two Probabilistic Fact #2: Expectation is a linear function: 𝐸 𝑋+𝑌 =𝐸 𝑋 +𝐸(𝑌) 𝐸 𝑐𝑋 =𝑐𝐸(𝑋) Let 𝑇 𝑛 denote the expected running time of quick-select. By Fact #2, 𝑇 𝑛 <𝑇 3𝑛 4 +𝑏𝑛∗ 𝑒𝑥𝑝𝑒𝑐𝑡𝑒𝑑 # 𝑜𝑓 𝑐𝑎𝑙𝑙𝑠 𝑏𝑒𝑓𝑜𝑟𝑒 𝑎 𝑔𝑜𝑜𝑑 𝑐𝑎𝑙𝑙 By Fact #1, 𝑇 𝑛 <𝑇 3𝑛 4 +2𝑏𝑛 That is, 𝑇 𝑛 is a geometric series: 𝑇 𝑛 <2𝑏𝑛+2𝑏 3 4 𝑛+2𝑏 3 4 2 𝑛+2𝑏 3 4 3 𝑛+… So 𝑇 𝑛 is 𝑂 𝑛 . We can solve the selection problem in 𝑂 𝑛 expected time.

Deterministic Selection We can do selection in 𝑂 𝑛 worst-case time. Main idea: recursively use the selection algorithm itself to find a good pivot for quick-select: Divide 𝑆 into 𝑛 5 sets of 5 each Find a median in each set Recursively find the median of the “baby” medians. See Exercise C-11.22 for details of analysis. 1 2 3 4 5 Min size for 𝐿 for 𝐺