Adapted from instructor resource slides

Presentation transcript:

Sorting, Chapter 13 (6/11/15)
Adapted from instructor resource slides: Nyhoff, ADTs, Data Structures and Problem Solving with C++, Second Edition, © 2005 Pearson Education, Inc. All rights reserved. 0-13-140909-3

Today
Exams are still being corrected; grades will be posted over the weekend and exams returned Tuesday.
Any questions on project #2?
Review Thursday's material (hashing/sorting)
Heaps, priority queue
Code exercises in class

Hash Tables
In some situations a faster search is needed.
The solution is to use a hash function:
  The value of the key field is given to the hash function
  The location in a hash table is calculated

Hash Functions
A simple function is to mod the value of the key by the size of the table: H(x) = x % tableSize
Note that we have traded wasted space for speed:
  The table must be considerably larger than the number of items anticipated
  Suggested size is 1.5 to 2 times the number of items

Hash Functions
Observe the problem: h(x) can return the same value for different values of x. These are called collisions.
A simple solution is linear probing:
  Empty slots are marked with -1
  A linear search begins at the collision location
  It continues until an empty slot is found for insertion

Hash Functions
When retrieving a value, linear probe until it is found
  If an empty slot is encountered, the value is not in the table
If deletions are permitted
  A vacated slot must be marked so that it does not look empty and cut a later linear probe short
  Ex. -1 for unused slots, -2 for slots which used to contain data
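The probing and deletion rules above can be sketched in C++ as follows. This is a minimal illustration, not the textbook's code: the class name `ProbingTable` and the constants `kEmpty` and `kDeleted` are hypothetical, keys are assumed to be non-negative ints, and the capacity is assumed to be greater than zero.

```cpp
#include <cstddef>
#include <vector>

// Marker values following the slide's convention (names are illustrative).
constexpr int kEmpty = -1;    // slot has never held data
constexpr int kDeleted = -2;  // slot used to hold data

class ProbingTable {
public:
    explicit ProbingTable(std::size_t capacity) : slots_(capacity, kEmpty) {}

    // Inserts a non-negative key; returns false only if the table is full.
    bool insert(int key) {
        if (contains(key)) return true;  // already stored
        std::size_t start = hash(key);
        for (std::size_t k = 0; k < slots_.size(); ++k) {
            std::size_t i = (start + k) % slots_.size();
            if (slots_[i] == kEmpty || slots_[i] == kDeleted) {
                slots_[i] = key;  // first free slot on the probe path
                return true;
            }
        }
        return false;
    }

    bool contains(int key) const { return find(key) != slots_.size(); }

    // Marks the slot as deleted so later probes do not stop early.
    bool remove(int key) {
        std::size_t i = find(key);
        if (i == slots_.size()) return false;
        slots_[i] = kDeleted;
        return true;
    }

private:
    std::size_t hash(int key) const {
        return static_cast<std::size_t>(key) % slots_.size();
    }

    // Returns the index holding key, or slots_.size() if it is absent.
    std::size_t find(int key) const {
        std::size_t start = hash(key);
        for (std::size_t k = 0; k < slots_.size(); ++k) {
            std::size_t i = (start + k) % slots_.size();
            if (slots_[i] == kEmpty) break;  // never-used slot ends the probe
            if (slots_[i] == key) return i;
        }
        return slots_.size();
    }

    std::vector<int> slots_;
};
```

With a table of size 11, keys 5 and 16 collide; after 5 is removed, a lookup of 16 still succeeds because the probe passes over the -2 marker instead of stopping.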

Collision Reduction Strategies
Hash table capacity
  The table should be 1.5 to 2 times the number of items to be stored
  Otherwise the probability of collisions is too high

Collision Reduction Strategies
Linear probing can result in primary clustering
Consider quadratic probing
  Probe sequence from location i is i + 1, i - 1, i + 4, i - 4, i + 9, i - 9, ...
  Secondary clusters can still form
Double hashing
  Use a second hash function to determine the probe sequence
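As a rough illustration of the quadratic probe sequence, the sketch below lists the first few slots visited from a home location. The function name `quadraticProbes` is hypothetical, and wrapping the offsets around the table with modulo arithmetic is an assumption the slide does not spell out; `home` is assumed to be less than `tableSize`.

```cpp
#include <cstddef>
#include <vector>

// Returns the first `count` probe locations after `home`:
// home + 1, home - 1, home + 4, home - 4, home + 9, home - 9, ...
// Each location is wrapped into the range [0, tableSize).
std::vector<std::size_t> quadraticProbes(std::size_t home,
                                         std::size_t tableSize,
                                         std::size_t count) {
    std::vector<std::size_t> probes;
    for (std::size_t k = 0; k < count; ++k) {
        std::size_t step = k / 2 + 1;                    // 1, 1, 2, 2, 3, 3, ...
        std::size_t offset = (step * step) % tableSize;  // 1, 1, 4, 4, 9, 9, ...
        if (k % 2 == 0) {
            probes.push_back((home + offset) % tableSize);              // forward
        } else {
            probes.push_back((home + tableSize - offset) % tableSize);  // backward
        }
    }
    return probes;
}
```

For example, from home location 5 in a table of size 11, the first six probes are 6, 4, 9, 1, 3, 7.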

Collision Reduction Strategies
Chaining
  The table is a list or vector of head nodes to linked lists
  When an item hashes to a location, it is added to that linked list
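A minimal sketch of chaining, assuming non-negative int keys and using `std::list` for each chain. The class name `ChainedTable` and its members are illustrative, not taken from the textbook.

```cpp
#include <algorithm>
#include <cstddef>
#include <list>
#include <vector>

class ChainedTable {
public:
    explicit ChainedTable(std::size_t buckets) : table_(buckets) {}

    // Appends the key to the chain at its hash location (no duplicates).
    void insert(int key) {
        std::list<int>& chain = table_[hash(key)];
        if (std::find(chain.begin(), chain.end(), key) == chain.end()) {
            chain.push_back(key);
        }
    }

    // Searches only the one chain the key can be in.
    bool contains(int key) const {
        const std::list<int>& chain = table_[hash(key)];
        return std::find(chain.begin(), chain.end(), key) != chain.end();
    }

    std::size_t chainLength(std::size_t bucket) const {
        return table_[bucket].size();
    }

private:
    std::size_t hash(int key) const {
        return static_cast<std::size_t>(key) % table_.size();
    }

    std::vector<std::list<int>> table_;  // one linked list per table slot
};
```

With 7 buckets, the keys 3, 10, and 17 all hash to location 3 and end up on the same chain, so no probing of other slots is needed.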

Improving the Hash Function
Ideal hash function
  Simple to evaluate
  Scatters items uniformly throughout the table
Modulo arithmetic alone is not so good for strings
  Possible to manipulate the numeric (ASCII) values of the first and last characters of a name
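One possible reading of the first-and-last-character idea is sketched below. Adding the two character codes and reducing modulo the table size is an assumption made for illustration, and `hashName` is a hypothetical name; real string hashes usually mix in every character.

```cpp
#include <cstddef>
#include <string>

// Hashes a name from the character codes of its first and last characters.
// An empty name is sent to location 0.
std::size_t hashName(const std::string& name, std::size_t tableSize) {
    if (name.empty()) return 0;
    std::size_t first = static_cast<unsigned char>(name.front());
    std::size_t last = static_cast<unsigned char>(name.back());
    return (first + last) % tableSize;
}
```

In ASCII, "Ann" gives ('A' = 65) + ('n' = 110) = 175, which maps to location 15 in a table of size 40.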

Importance of Sorting
Once a set of items is sorted, many other problems become easy
Using O(n lg n) sorting algorithms leads to sub-quadratic runtimes
Large-scale data processing would be impossible if sorting took Ω(n^2) time

Comparison Functions
Comparing pairs of elements is the most common (and natural) way to sort them
Is "Jones" the same as "jones"?
What about "Jones, John" and "Jones - John"?
The comparison function determines the results of the sort

Equal Elements
Michael Jordan the basketball player vs. Michael Jordan the statistician
Elements with equal keys will bunch up together
Sometimes the relative order matters
A sorting algorithm which maintains this order is said to be stable
Many fast sorting algorithms (quicksort and heapsort, for example) are not stable
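The effect of the comparison function and of stability can be seen with the standard library's `std::stable_sort`. In this sketch the `Person` struct, `lessIgnoringCase`, and `sortByLastName` are hypothetical names; the comparison ignores letter case, so "jordan" and "Jordan" count as equal keys and keep their original relative order.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

struct Person {
    std::string lastName;
    std::string job;
};

// Case-insensitive "less than" on two strings.
bool lessIgnoringCase(const std::string& a, const std::string& b) {
    return std::lexicographical_compare(
        a.begin(), a.end(), b.begin(), b.end(),
        [](unsigned char x, unsigned char y) {
            return std::tolower(x) < std::tolower(y);
        });
}

// Stable sort: people with equal last names stay in their input order.
void sortByLastName(std::vector<Person>& people) {
    std::stable_sort(people.begin(), people.end(),
                     [](const Person& p, const Person& q) {
                         return lessIgnoringCase(p.lastName, q.lastName);
                     });
}
```

Replacing `std::stable_sort` with `std::sort` would still sort by last name, but the order of the two Jordans would no longer be guaranteed.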

Categories of Sorting Algorithms
Selection sort
  Make passes through a list
  On each pass, reposition correctly some element (largest or smallest)
  Scan the list to find the smallest element and put it in the first location; then loop again to fill position 2, and so on

Categories of Sorting Algorithms
Exchange sort
  Systematically interchange pairs of elements which are out of order
  Bubble sort does this
    The largest element sinks to the back; smaller elements "bubble up"
    Out of order: exchange
    In order: do not exchange

Bubble Sort Algorithm
1. Initialize numPairs to n - 1
2. While numPairs != 0, do the following:
   a. Set last = 1   // location of last element in a swap
   b. For i = 1 to numPairs:
        if x[i] > x[i + 1], swap x[i] and x[i + 1] and set last = i
   c. Set numPairs = last - 1
   End while
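The bubble sort steps can be sketched in C++ as follows. This version is an adaptation to a 0-based `std::vector<int>`, so `last` holds the 0-based index of the last pair swapped rather than the slide's 1-based location; the function name `bubbleSort` is illustrative.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Bubble sort that remembers where the last swap happened, so each pass
// stops before the part of the list that is already in its final order.
void bubbleSort(std::vector<int>& x) {
    if (x.size() < 2) return;
    std::size_t numPairs = x.size() - 1;  // pairs to compare on this pass
    while (numPairs != 0) {
        std::size_t last = 0;             // 0-based index of last pair swapped
        for (std::size_t i = 0; i < numPairs; ++i) {
            if (x[i] > x[i + 1]) {
                std::swap(x[i], x[i + 1]);
                last = i;
            }
        }
        numPairs = last;  // everything from x[last + 1] onward is in place
    }
}
```

If a pass makes no swaps, `last` stays 0 and the loop ends, which is why an already sorted list takes only one pass.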

Categories of Sorting Algorithms
Insertion sort
  Repeatedly insert a new element into an already sorted list
  Note this works well with a linked list implementation
Selection sort, bubble sort, and insertion sort all have computing time O(n^2)

Algorithm for Linear Insertion Sort
For i = 2 to n do the following:
  a. Set nextElement = x[i] and x[0] = nextElement   // x[0] acts as a sentinel
  b. Set j = i
  c. While nextElement < x[j - 1] do the following:
       Set x[j] equal to x[j - 1]
       Decrement j by 1
     End while
  d. Set x[j] equal to nextElement
End for
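A C++ sketch of linear insertion sort on a 0-based vector. Instead of storing a sentinel copy in x[0] as the slide does, this version checks `j > 0` in the loop condition; that substitution is a choice made for the sketch, and `insertionSort` is an illustrative name.

```cpp
#include <cstddef>
#include <vector>

// Linear insertion sort: x[0..i-1] is sorted before each outer iteration,
// and x[i] is inserted into its proper place among those elements.
void insertionSort(std::vector<int>& x) {
    for (std::size_t i = 1; i < x.size(); ++i) {
        int nextElement = x[i];
        std::size_t j = i;
        // Shift larger elements one position right to open a gap.
        while (j > 0 && nextElement < x[j - 1]) {
            x[j] = x[j - 1];
            --j;
        }
        x[j] = nextElement;
    }
}
```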

Example of Insertion Sort
Given list to be sorted: 67, 33, 21, 84, 49, 50, 75
Note the sequence of steps carried out

Quicksort
A more efficient exchange sorting scheme than bubble sort
  A typical exchange involves elements that are far apart
  Fewer interchanges are required to correctly position an element
Quicksort uses a divide-and-conquer strategy
  A recursive approach
  The original problem is partitioned into simpler sub-problems
  Each sub-problem is considered independently
Subdivision continues until the sub-problems obtained are simple enough to be solved directly

Quicksort
Choose some element called a pivot
Perform a sequence of exchanges so that
  All elements that are less than this pivot are to its left and
  All elements that are greater than the pivot are to its right
This divides the (sub)list into two smaller sublists, each of which may then be sorted independently in the same way

Quicksort
If the list has 0 or 1 elements, return.   // the list is sorted
Else do:
  Pick an element in the list to use as the pivot.
  Split the remaining elements into two disjoint groups:
    SmallerThanPivot = {all elements < pivot}
    LargerThanPivot = {all elements > pivot}
  Return the list rearranged as: Quicksort(SmallerThanPivot), pivot, Quicksort(LargerThanPivot).

Quicksort Example
Given to sort: 75, 70, 65, 84, 98, 78, 100, 93, 55, 61, 81, 68
Select, arbitrarily, the first element, 75, as pivot.
Search from the right for an element <= 75 and stop at the first one found
Search from the left for an element > 75 and stop at the first one found
Swap these two elements, and then repeat this process

Quicksort Example
After the exchanges: 75, 70, 65, 68, 61, 55, 100, 93, 78, 98, 81, 84
When done, swap the pivot with the last element found that is <= 75 (here, 55)
This SPLIT operation places pivot 75 so that all elements to the left are <= 75 and all elements to the right are > 75.
View code for split() template
75 is now placed appropriately
Need to sort sublists on either side of 75

Quicksort Example
Need to sort (independently): 55, 70, 65, 68, 61 and 100, 93, 78, 98, 81, 84
Let pivot be 55, look from each end for values larger/smaller than 55, swap
Same for 2nd list, pivot is 100
Sort the resulting sublists in the same manner until a sublist is trivial (size 0 or 1)
View quicksort() recursive function
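The split and recursive steps of the example can be sketched as below. This is a non-template illustration on `std::vector<int>` written for these notes, not the textbook's split() or quicksort() code; it uses the first element as the pivot, as the example does, and the signatures are assumptions.

```cpp
#include <utility>
#include <vector>

// Rearranges x[first..last] around the pivot x[first] and returns the
// pivot's final position. Elements left of it are <= pivot, right are > pivot.
int split(std::vector<int>& x, int first, int last) {
    int pivot = x[first];
    int left = first;
    int right = last;
    while (left < right) {
        // From the right: stop at the first element <= pivot.
        while (x[right] > pivot) --right;
        // From the left: stop at the first element > pivot.
        while (left < right && x[left] <= pivot) ++left;
        if (left < right) std::swap(x[left], x[right]);
    }
    std::swap(x[first], x[right]);  // put the pivot in its final place
    return right;
}

// Sorts x[first..last] by splitting and recursing on both sublists.
void quicksort(std::vector<int>& x, int first, int last) {
    if (first >= last) return;  // 0 or 1 elements: already sorted
    int pos = split(x, first, last);
    quicksort(x, first, pos - 1);
    quicksort(x, pos + 1, last);
}
```

Calling `split` on the 12-element example list with `first = 0` and `last = 11` returns position 5 and leaves 55, 70, 65, 68, 61 to the left of 75 and 100, 93, 78, 98, 81, 84 to its right, matching the sublists in the example.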

Quicksort
Note visual example of a quicksort on an array etc. …

Improvements to Quicksort
Quicksort is a recursive function
  A stack of activation records must be maintained by the system to manage recursion
  The deeper the recursion is, the larger this stack will become
The depth of the recursion and the corresponding overhead can be reduced
  Sort the smaller sublist at each stage first

Improvements to Quicksort
A fixed pivot choice such as the first element gives a poor partition for nearly sorted lists (or lists in reverse order)
Virtually all the elements go into either SmallerThanPivot or LargerThanPivot all through the recursive calls
Quicksort takes quadratic time to do essentially nothing at all

Improvements to Quicksort
A better method for selecting the pivot is the median-of-three rule
  Select the median of the first, middle, and last elements in each sublist as the pivot
Often the list to be sorted is already partially ordered
  The median-of-three rule will select a pivot closer to the middle of the sublist than will the "first-element" rule
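A small sketch of the median-of-three rule. The function name `medianOfThree` and the choice to return an index (rather than a value) are assumptions made for illustration; a quicksort would typically swap the returned element into the pivot position before splitting.

```cpp
#include <cstddef>
#include <vector>

// Returns the index (first, middle, or last) whose element is the median
// of x[first], x[middle], and x[last].
std::size_t medianOfThree(const std::vector<int>& x,
                          std::size_t first, std::size_t last) {
    std::size_t mid = first + (last - first) / 2;
    int a = x[first];
    int b = x[mid];
    int c = x[last];
    if ((a <= b && b <= c) || (c <= b && b <= a)) return mid;
    if ((b <= a && a <= c) || (c <= a && a <= b)) return first;
    return last;
}
```

On an already sorted list the rule picks the middle element, so the split is balanced instead of putting nearly everything on one side.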

Comparisons of Sorts
Sort of a randomly generated list of 500 items (note: times are on 1970s hardware)
Algorithm          Type of Sort  Time (sec)
Simple selection   Selection     69
Heapsort           Selection     18
Bubble sort        Exchange      165
2 way bubble sort  Exchange      141
Quicksort          Exchange      6
Linear insertion   Insertion     66
Binary insertion   Insertion     37
Shell sort         Insertion     11

Heaps
A heap is a binary tree with properties:
  It is complete
    Each level of the tree is completely filled
    Except possibly the bottom level (nodes in leftmost positions)
  The key in any node dominates the keys of its children
    Min-heap: a node dominates by containing a smaller key than its children
    Max-heap: a node dominates by containing a larger key than its children

Heaps
Which of the following are heaps? (Figure: three trees labeled A, B, and C)
A is a heap; B is not; C has the shape of a heap but does not satisfy heap order

Implementing a Heap
Use an array or vector
Number the nodes from top to bottom
  Number the nodes on each row from left to right
Store the data in the ith node in the ith location of the array (vector)
Why is an array or vector a good choice? Why not a linked list?

Implementing a Heap
Note the placement of the nodes in the array
(Correction to the figure: 42 should be 41)

Implementing a Heap
In an array implementation
  The children of the ith node are at myArray[2*i] and myArray[2*i+1]
  The parent of the ith node is at myArray[i/2]
Note: the array is indexed from 1

Basic Heap Operations
Construct an empty heap
Check if the heap is empty
Insert an item
Retrieve the largest/smallest element
Remove the largest/smallest element

Basic Heap Operations
Constructor
  Set size to 0, allocate array
Empty
  Check value of size
Retrieve max/min item
  Return root of the binary tree, myArray[1]

Basic Heap Operations
Insert an item
  Place the new item at the end of the array
  "Bubble" it up to the correct place
  Interchange it with its parent so long as it is greater than its parent (max-heap) or less than its parent (min-heap)

Basic Heap Operations
Delete max/min item
  The max/min item is the root; swap it with the last node in the tree
  Delete the last element
  Bubble the top element down until the heap property is satisfied
  In a max-heap, interchange with the larger of the two children (the smaller in a min-heap)

Percolate Down Algorithm
1. Set c = 2 * r   // c is the location of the left child
2. While c <= n do the following:
   a. If c < n and myArray[c] < myArray[c + 1]
        Increment c by 1
   b. If myArray[r] < myArray[c]
        i. Swap myArray[r] and myArray[c]
        ii. Set r = c
        iii. Set c = 2 * c
      else
        Terminate repetition
   End while
Also called bubble down.
Exercise: make sure that you understand this algorithm.
The max number of repetitions is the height of the tree.
myArray[r], ..., myArray[n] stores a semi-heap: only the value in myArray[r] may fail the heap-order condition.
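A direct C++ rendering of percolate down for a max-heap, sketched under the assumption that the heap lives in a `std::vector<int>` with index 0 unused so that the 1-based formulas carry over unchanged. The function name `percolateDown` follows the slides; the signature is an assumption.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Restores max-heap order in myArray[r..n], assuming only myArray[r] may
// violate it. The array is 1-based: myArray[0] is unused padding.
void percolateDown(std::vector<int>& myArray, std::size_t r, std::size_t n) {
    std::size_t c = 2 * r;  // left child of r
    while (c <= n) {
        // If there is a right child and it is larger, use it instead.
        if (c < n && myArray[c] < myArray[c + 1]) ++c;
        if (myArray[r] < myArray[c]) {
            std::swap(myArray[r], myArray[c]);
            r = c;      // follow the value down the tree
            c = 2 * c;  // left child of the new position
        } else {
            break;      // heap order holds; stop
        }
    }
}
```

For the semi-heap 2, 9, 7, 4, 3 (stored at indices 1 through 5), percolating down from the root moves 2 below 9 and then below 4, giving 9, 4, 7, 2, 3.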

Basic Heap Operations
Insert an item
  Amounts to a percolate up routine
  Place the new item at the end of the array
  Interchange it with its parent so long as it is greater than its parent
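The percolate up routine for a max-heap might look like the sketch below. It assumes a `std::vector<int>` with index 0 unused so that the parent of node i is at i / 2; `heapInsert` is a hypothetical name.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Inserts item into a 1-based max-heap stored in myArray
// (myArray[0] is unused padding and is created on first use).
void heapInsert(std::vector<int>& myArray, int item) {
    if (myArray.empty()) myArray.push_back(0);  // reserve unused slot 0
    myArray.push_back(item);                    // place at end of array
    std::size_t i = myArray.size() - 1;
    // Percolate up: swap with the parent while the item is larger.
    while (i > 1 && myArray[i / 2] < myArray[i]) {
        std::swap(myArray[i / 2], myArray[i]);
        i /= 2;
    }
}
```

Inserting 4, 7, 2, 11, 3 in that order yields the heap 11, 7, 2, 4, 3 in positions 1 through 5.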

Heapsort
Given a list of numbers in an array
  Stored in a complete binary tree
Convert to a heap
  Begin at the last node that is not a leaf
  Apply percolate down to this subtree
  Continue with the preceding nodes, back to the root
  Called heapify

Heapsort
Algorithm for converting a complete binary tree to a heap, called "heapify":
For r = n/2 down to 1:
  Apply percolate_down to the subtree in myArray[r], ..., myArray[n]
End for
Puts the largest element at the root

Heapsort
Now swap element 1 (root of tree) with the last element
  This puts the largest element in its correct location
Use percolate down on the remaining sublist
  Converts from semi-heap to heap

Heapsort
Again swap the root with the rightmost leaf
Continue this process with the shrinking sublist

Heapsort Algorithm
1. Consider x as a complete binary tree; use heapify to convert this tree to a heap
2. For i = n down to 2:
   a. Interchange x[1] and x[i] (puts the largest element at the end)
   b. Apply percolate_down to convert the binary tree corresponding to the sublist in x[1] .. x[i-1] back to a heap
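The same two phases can be expressed with the C++ standard library, shown here as a sketch rather than the textbook's implementation: `std::make_heap` plays the role of heapify, and each `std::pop_heap` call moves the current largest element to the end of the shrinking range and restores heap order on the rest. The function name `heapsort` is illustrative.

```cpp
#include <algorithm>
#include <vector>

// Heapsort in ascending order using the standard heap algorithms.
void heapsort(std::vector<int>& x) {
    std::make_heap(x.begin(), x.end());  // step 1: build a max-heap
    // Step 2: repeatedly move the root to the end of the shrinking range.
    for (auto end = x.end(); end - x.begin() > 1; --end) {
        std::pop_heap(x.begin(), end);   // largest of [begin, end) goes to end - 1
    }
}
```

The loop is equivalent to a single call to `std::sort_heap`; it is written out here only to mirror the steps of the algorithm.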

Priority Queue
A collection of data elements
  Items stored in order by priority
  Higher priority items removed ahead of lower
Can be implemented with a heap
  Max-heap == highest priority first
Basic Operations
  Constructor
  Insert
  Find, remove smallest/largest (priority) element
  Change priority
  Delete an item

Priority Queue
Useful for many applications
  Process scheduling (operating systems)
  Simulation of airports and computer networks
More efficient to insert into a priority queue than to re-sort everything on each new arrival
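As a small illustration of the scheduling use case, the sketch below uses the standard library's `std::priority_queue`, which is a max-heap by default. The function name `runInPriorityOrder` and the representation of a job as a (priority, name) pair are assumptions made for the example.

```cpp
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Returns job names in the order a highest-priority-first scheduler
// would run them. Each job is a (priority, name) pair.
std::vector<std::string> runInPriorityOrder(
        const std::vector<std::pair<int, std::string>>& jobs) {
    std::priority_queue<std::pair<int, std::string>> pq;  // max-heap
    for (const auto& job : jobs) {
        pq.push(job);  // insert without re-sorting everything
    }
    std::vector<std::string> order;
    while (!pq.empty()) {
        order.push_back(pq.top().second);  // find the largest priority
        pq.pop();                          // and remove it
    }
    return order;
}
```

Each `push` and `pop` takes logarithmic time, which is the advantage over re-sorting the whole list whenever a new job arrives.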

Exercises: Selection Sort
Given the following array, show the output of selection sort after each iteration:
i     1  2  3  4   5
x[i]  4  7  2  11  3

Exercises: Selection Sort
Pseudocode (array):
For i = 1 to n-1
  Set smallPos = i and smallest = x[smallPos]
  For j = i+1 to n
    If x[j] < smallest   // smaller element found
      Set smallPos = j and smallest = x[smallPos]
  Set x[smallPos] = x[i] and set x[i] = smallest
Array: 4, 7, 2, 11, 3
Apply this to the previous example
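One way to check a hand trace of the exercise is the sketch below, which runs selection sort on a 0-based copy of the array and records the list after each outer iteration. The name `selectionSortTrace` and the idea of returning snapshots are additions for illustration.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Runs selection sort on a copy of x and returns the state of the list
// after each pass of the outer loop.
std::vector<std::vector<int>> selectionSortTrace(std::vector<int> x) {
    std::vector<std::vector<int>> snapshots;
    for (std::size_t i = 0; i + 1 < x.size(); ++i) {
        std::size_t smallPos = i;
        // Scan the unsorted part for the smallest element.
        for (std::size_t j = i + 1; j < x.size(); ++j) {
            if (x[j] < x[smallPos]) smallPos = j;
        }
        std::swap(x[i], x[smallPos]);  // put it in position i
        snapshots.push_back(x);
    }
    return snapshots;
}
```

For the array 4, 7, 2, 11, 3 the four passes produce 2, 7, 4, 11, 3; then 2, 3, 4, 11, 7; then 2, 3, 4, 11, 7 again (4 is already in place); and finally 2, 3, 4, 7, 11.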

Exercises: Heap
Given the same array: 4, 7, 2, 11, 3
Apply HeapSort
  First create the heap
    Apply heapify
  Swap the largest element with the last element
  Percolate down

Insertion Sort? https://www.youtube.com/watch?v=DFG-XuyPYUQ