Heaps and basic data structures David Kauchak cs161 Summer 2009.


1 Heaps and basic data structures David Kauchak cs161 Summer 2009

2 Administrative Homework 2 due date extended to Fri. 7/10 at 5pm Midterm 7/20 in class. Closed book, etc. Review sessions SCPD students Discussion board – thanks

3 Quicksort partitions – the good vs. the bad

4 Quicksort average case: take 2. [Figure: a “bad” split followed by a “good” 50/50 split, with level cost cn.] We absorb the “bad” partition. In general, we can absorb any constant number of “bad” partitions.

5 Quicksort partitions – the good vs. the bad: For Quicksort to “absorb” the cost of bad partitions, as n grows, the proportion of bad to good partitions cannot grow. Why? If, as we increase n, we proportionately increase the number of good and bad partitions, then there is still a constant number of “bad” partitions to be absorbed by a given “good” partition. If, however, the proportion of “bad” partitions increases as we increase n, then we can no longer absorb the cost of the “bad” partitions, since it depends on n.

6 Decision-tree model: Full binary tree representing the comparisons between elements by a sorting algorithm. Internal nodes contain indices to be compared (e.g. 1:3, with branches ≤ and >). Leaves contain a complete permutation of the input. Tracing a path from root to leaf gives the correct reordering/permutation of the input for an input. For example, leaf |1,3,2| reorders [3, 12, 7] to [3, 7, 12], and leaf |2,1,3| reorders [7, 3, 12] to [3, 7, 12].

7 Comparison-based sorting: Sorted order is determined based only on a comparison between input elements: A[i] < A[j], A[i] > A[j], A[i] = A[j], A[i] ≤ A[j], A[i] ≥ A[j]. This is why most built-in sorting approaches only require you to define the comparison operator (e.g. compareTo in Java). Can we do better than O(n log n)?

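8 A decision tree model: root 1:2. If ≤, go to 2:3: if ≤, leaf |1,2,3|; if >, go to 1:3: if ≤, leaf |1,3,2|; if >, leaf |3,1,2|. If > at the root, go to 1:3: if ≤, leaf |2,1,3|; if >, go to 2:3: if ≤, leaf |2,3,1|; if >, leaf |3,2,1|.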

9–17 A decision tree model, tracing input [12, 7, 3]: At 1:2, is 12 ≤ 7 or is 12 > 7? Take the > branch. At 1:3, is 12 ≤ 3 or is 12 > 3? Take the > branch. At 2:3, is 7 ≤ 3 or is 7 > 3? Take the > branch. The leaf is 3, 2, 1, which reorders the input to [3, 7, 12].

18–23 A decision tree model, tracing input [7, 12, 3]: 7 ≤ 12 at 1:2, then 12 > 3 at 2:3, then 7 > 3 at 1:3. The leaf is |3,1,2|, which reorders the input to [3, 7, 12].
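The three-element decision tree traced above can be written directly as nested comparisons. This is a minimal sketch; the function name `sort3_decision_tree` and the tuple representation of the leaves are illustrative assumptions, not part of the slides.

```python
def sort3_decision_tree(a):
    """Sort a 3-element list by walking the decision tree from the slides.

    Returns (leaf, result): leaf is the 1-based permutation stored at the
    leaf that was reached, result is the input reordered by that leaf.
    """
    x, y, z = a  # input positions 1, 2, 3
    if x <= y:                  # node 1:2, branch <=
        if y <= z:              # node 2:3
            leaf = (1, 2, 3)
        elif x <= z:            # node 1:3
            leaf = (1, 3, 2)
        else:
            leaf = (3, 1, 2)
    else:                       # node 1:2, branch >
        if x <= z:              # node 1:3
            leaf = (2, 1, 3)
        elif y <= z:            # node 2:3
            leaf = (2, 3, 1)
        else:
            leaf = (3, 2, 1)
    return leaf, [a[i - 1] for i in leaf]


print(sort3_decision_tree([12, 7, 3]))   # leaf (3, 2, 1), result [3, 7, 12]
print(sort3_decision_tree([7, 12, 3]))   # leaf (3, 1, 2), result [3, 7, 12]
```

Every input reaches a leaf after at most three comparisons, which is the height of this tree.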

24 How many leaves are in a decision tree? Leaves must have all possible permutations of the input Input of size n, n! leaves What if decision tree model didn’t? Some input would exist that didn’t have a correct reordering

25 A lower bound: What is the worst-case number of comparisons for a tree?

26 A lower bound: The longest path in the tree, i.e. the height

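27 A lower bound: What is the maximum number of leaves a binary tree of height h can have? A complete binary tree has 2^h leaves. The decision tree needs n! leaves, so 2^h ≥ n!, and h ≥ log(n!) because log is monotonically increasing. From hw1, log(n!) = Θ(n log n), so h = Ω(n log n).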
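The lower-bound argument can be written out as one chain of inequalities. This is a sketch: the intermediate bound on log(n!) is a standard argument and an assumption about what hw1 established, not something shown on the slide.

```latex
\begin{align*}
n! &\le 2^{h}
  && \text{(a binary tree of height $h$ has at most $2^{h}$ leaves)} \\
h &\ge \log_2 (n!)
  && \text{($\log$ is monotonically increasing)} \\
\log_2 (n!) &\ge \log_2 \left( \left(\tfrac{n}{2}\right)^{n/2} \right)
  = \tfrac{n}{2} \log_2 \tfrac{n}{2}
  && \text{(the largest $n/2$ factors of $n!$ are each at least $n/2$)} \\
h &= \Omega(n \log n)
\end{align*}
```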

28 Can we do better than O(n log n) for sorting? What if I told you the maximum value k that any number could take and k = O(n)? In some situations (like the one above) we can sort in Θ(n): counting sort, radix sort, bucket sort. These leverage additional knowledge about the data besides comparisons.
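As an illustration of sorting with a known maximum value k, here is a minimal counting sort sketch. It assumes the inputs are non-negative integers in the range 0..k; the function name and signature are illustrative and not taken from the slides.

```python
def counting_sort(a, k):
    """Sort integers from the range 0..k in Theta(n + k) time, without comparisons between elements."""
    counts = [0] * (k + 1)
    for v in a:
        counts[v] += 1
    # Prefix sums: counts[v] becomes the number of elements <= v.
    for v in range(1, k + 1):
        counts[v] += counts[v - 1]
    out = [None] * len(a)
    # Walking the input backwards keeps equal keys in their original order.
    for v in reversed(a):
        counts[v] -= 1
        out[counts[v]] = v
    return out


print(counting_sort([3, 1, 4, 1, 5, 0, 2], 5))  # [0, 1, 1, 2, 3, 4, 5]
```

When k = O(n), the Θ(n + k) running time is Θ(n).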

29 Why don’t we hear about these more? Constants can be large and running times therefore may be larger for modest input sizes Cache friendliness Memory (Quicksort sorts in place) Hardware considerations

30 Data Structures: What is a data structure? A way of storing data that facilitates particular operations. Dynamic set operations, for a set S: Search(S,k) – Does k exist in S? Insert(S,k) – Add k to S. Delete(S,x) – Given a pointer/reference, x, to an element, delete it from S. Min(S) – Return the smallest element of S. Max(S) – Return the largest element of S.

31 Array Sequential locations in memory in linear order Elements are accessed via index Cost of operations: Search(S,k) – Insert(S,k) – InsertIndex(S,k) – Delete(S,x) – Min(S) – Max(S) – O(n) Θ(n) Θ(1) Θ(n)

32 Array Uses? constant time access of particular indices

33 Linked list Elements are arranged linearly. An element in list points to the next element in the list Cost of operations: Search(S,k) – Insert(S,k) – InsertIndex(S,k) – Delete(S,x) – Min(S) – Max(S) – O(n) Θ(1) O(n) Θ(n)

34 Linked list Uses? constant time insertion at the cost of linear time access

35 Double linked list Elements are arranged linearly. An element in list points to the next element and previous element in the list What does the back link get us? Θ(1) deletion
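The Θ(1) deletion that the back link buys can be sketched as follows. The class and method names are illustrative assumptions; the point is that `delete` only touches the node's two neighbours and never walks the list.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.prev = None
        self.next = None


class DoublyLinkedList:
    def __init__(self):
        self.head = None

    def insert(self, key):
        """Insert at the front in Theta(1) and return the new node."""
        node = Node(key)
        node.next = self.head
        if self.head is not None:
            self.head.prev = node
        self.head = node
        return node

    def delete(self, node):
        """Unlink a node, given a reference to it, in Theta(1)."""
        if node.prev is not None:
            node.prev.next = node.next
        else:
            self.head = node.next      # node was the head
        if node.next is not None:
            node.next.prev = node.prev
        node.prev = node.next = None

    def to_list(self):
        out, cur = [], self.head
        while cur is not None:
            out.append(cur.key)
            cur = cur.next
        return out
```

With a singly linked list, the same operation would first need a linear scan to find the predecessor of the node.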

36 Stack LIFO Picture the stack of plates at a buffet Can implement with an array or a linked list

37 Stack LIFO Picture the stack of plates at a buffet Can implement with an array or a linked list push(1) push(2) push(3) pop() 3 2 1 top

38 Stack Empty – check if stack is empty Array: check if “top” is at index 0 Linked list: check if “top” pointer is null Runtime: Θ(1)

39 Stack Pop – removes the top element from the list. Check if empty; if so, “underflow”. Array: return element at “top” and decrement “top”. Linked list: return and remove at front of linked list. Runtime: Θ(1). Push – add an element to the list. Array: increment “top” and insert element. Must check for overflow! Linked list: insert element at front of linked list. Runtime: Θ(1).
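A minimal array-backed stack along these lines might look like the sketch below. The fixed `capacity` parameter and the class name are assumptions for illustration. Here `top` counts the stored elements, so push writes first and then increments, which is equivalent to the slide's "increment and insert" with a shifted index.

```python
class ArrayStack:
    def __init__(self, capacity):
        self.items = [None] * capacity
        self.top = 0  # number of stored elements; 0 means the stack is empty

    def empty(self):
        return self.top == 0

    def push(self, x):
        if self.top == len(self.items):
            raise OverflowError("stack overflow")
        self.items[self.top] = x
        self.top += 1

    def pop(self):
        if self.empty():
            raise IndexError("stack underflow")
        self.top -= 1
        return self.items[self.top]


s = ArrayStack(3)
s.push(1)
s.push(2)
s.push(3)
print(s.pop())  # 3
```

All three operations do a constant amount of work, matching the Θ(1) runtimes on the slide.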

40 Stack Array or linked list? Array: more memory efficient Linked list: don’t have to worry about “overflow” Uses? runtime “stack” graph search algorithms (depth first search) syntactic parsing (i.e. compilers)

41 Queue: FIFO. Picture a line at the grocery store. Can implement with an array or a doubly linked list. Example: Enqueue(1), Enqueue(2), Enqueue(3), Dequeue(). [Figure: elements 1 2 3 stored between head and tail.]

42 Queue Operations Empty – Θ(1) Enqueue – add element to end of queue - Θ(1) Dequeue – remove element from the front of the queue - Θ(1) Uses? scheduling graph traversal (breadth first search)
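One way to get Θ(1) Enqueue and Dequeue from an array is to treat it as circular. The sketch below assumes a fixed capacity and tracks a head index plus a size; these names are illustrative choices rather than the slides' implementation.

```python
class ArrayQueue:
    def __init__(self, capacity):
        self.items = [None] * capacity
        self.head = 0   # index of the front element
        self.size = 0   # number of stored elements

    def empty(self):
        return self.size == 0

    def enqueue(self, x):
        if self.size == len(self.items):
            raise OverflowError("queue overflow")
        tail = (self.head + self.size) % len(self.items)  # wraps around
        self.items[tail] = x
        self.size += 1

    def dequeue(self):
        if self.empty():
            raise IndexError("queue underflow")
        x = self.items[self.head]
        self.head = (self.head + 1) % len(self.items)
        self.size -= 1
        return x


q = ArrayQueue(3)
q.enqueue(1)
q.enqueue(2)
q.enqueue(3)
print(q.dequeue())  # 1
```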

43 Binary heap: A binary tree where the value of a parent is greater than or equal to the value of its children. Additional restriction: all levels of the tree are complete except the last. Max heap vs. min heap.

44 Binary heap - operations Maximum(S) - return the largest element in the set ExtractMax(S) – Return and remove the largest element in the set Insert(S, val) – insert val into the set IncreaseElement(S, x, val) – increase the value of element x to val BuildHeap(A) – build a heap from an array of elements

45 Binary heap - pointers [tree figure, by level: 16 / 14 10 / 8 7 9 3 / 2 4 1]: parent ≥ child. Complete tree. Level does not indicate size. All nodes in a heap are themselves heaps.

46 Binary heap - array

47 A = [16, 14, 10, 8, 7, 9, 3, 2, 4, 1], indices 1 to 10

48–49 Binary heap - array: Left child of A[3]? 2*3 = 6

50–51 Binary heap - array: Parent of A[8]? floor(8/2) = 4

52 Binary heap - array: A = [16, 14, 10, 8, 7, 9, 3, 2, 4, 1] corresponds to the tree 16 / 14 10 / 8 7 9 3 / 2 4 1 (by level)
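The index arithmetic for the array representation can be sketched as below. To keep the slides' 1-based indices, this example pads position 0 of a Python list with `None`; that padding is an assumption made for illustration.

```python
def parent(i):
    return i // 2


def left(i):
    return 2 * i


def right(i):
    return 2 * i + 1


# A[0] is unused padding so that indices match the 1-based slides.
A = [None, 16, 14, 10, 8, 7, 9, 3, 2, 4, 1]

print(left(3), A[left(3)])      # 6 9
print(parent(8), A[parent(8)])  # 4 8
```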

53 Identify the valid heaps 8 [15, 12, 3, 11, 10, 2, 1, 7, 8] [20, 18, 10, 17, 16, 15, 9, 14, 13] 16101593
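The two array candidates on this slide can be checked mechanically. The sketch below uses 0-based Python lists, so the parent of index i is (i - 1) // 2. The function name is an illustrative assumption.

```python
def is_max_heap(a):
    """Return True if every element is <= its parent (0-based indexing)."""
    for i in range(1, len(a)):
        if a[(i - 1) // 2] < a[i]:
            return False
    return True


print(is_max_heap([15, 12, 3, 11, 10, 2, 1, 7, 8]))       # True
print(is_max_heap([20, 18, 10, 17, 16, 15, 9, 14, 13]))   # False: 15 sits below 10
```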

54–58 Heapify: Assume left and right children are heaps, turn current set into a valid heap. Find out which is largest: current, left or right. If a child is larger, swap and recurse.

59–60 Heapify at A[2]: A = [16, 3, 10, 8, 7, 9, 5, 2, 4, 1], indices 1 to 10

61–62 Heapify: 3 swaps with its larger child 8, giving A = [16, 8, 10, 3, 7, 9, 5, 2, 4, 1]

63–65 Heapify: 3 swaps with its larger child 4, giving A = [16, 8, 10, 4, 7, 9, 5, 2, 3, 1]
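A minimal recursive sketch of Heapify, run on the example array from these slides. It uses 0-based Python indices (children of i are 2i+1 and 2i+2), so the slides' A[2] is index 1 here. The optional `n` parameter, the heap size, is an added assumption.

```python
def heapify(a, i, n=None):
    """Sift a[i] down; assumes both subtrees of i are already max heaps."""
    if n is None:
        n = len(a)
    left, right = 2 * i + 1, 2 * i + 2
    largest = i
    # Find out which is largest: current, left or right.
    if left < n and a[left] > a[largest]:
        largest = left
    if right < n and a[right] > a[largest]:
        largest = right
    # If a child is larger, swap and recurse.
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        heapify(a, largest, n)


a = [16, 3, 10, 8, 7, 9, 5, 2, 4, 1]
heapify(a, 1)
print(a)  # [16, 8, 10, 4, 7, 9, 5, 2, 3, 1]
```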

66 Correctness of Heapify Remember both the children are valid heaps Three cases: Case 1: A[i] (current node) is the largest parent is greater than both children both children are heaps current node is a valid heap

67 Correctness of heapify Case 2: left child is the largest When Heapify returns: Left child is a valid heap Right child is unchanged and therefore a valid heap Current node is larger than both children since we selected the largest node of current, left and right current node is a valid heap Case 3: right child is largest similar to above

68 Running time of Heapify What is the cost of each call to Heapify? Θ(1) How many calls are made to Heapify? O(height of the tree) What is the height of the tree? Complete binary tree, except for the last level O(log n)

69 Binary heap - operations Maximum(S) - return the largest element in the set ExtractMax(S) – Return and remove the largest element in the set Insert(S, val) – insert val into the set IncreaseElement(S, x, val) – increase the value of element x to val BuildHeap(A) – build a heap from an array of elements

70 Maximum Return the largest element from the set Return A[1] 16 14 10 8 7 9 3 2 4 1 1 2 3 4 5 6 7 8 9 10

71–78 ExtractMax: Return and remove the largest element in the set. [Tree figures: the root 16 is removed from the example heap, leaving 14 10 / 8 7 9 3 / 2 4 1 below a vacant root (“?”); the final step calls Heapify.]

79 ExtractMax running time Constant amount of work plus one call to Heapify – O(log n)
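A sketch of ExtractMax consistent with "constant work plus one call to Heapify". Filling the vacant root with the last leaf is the standard approach and is assumed here, since the slide figures did not survive extraction. Indices are 0-based and the helper `heapify` is written iteratively.

```python
def heapify(a, i, n):
    """Sift a[i] down within a[0:n]; subtrees of i must already be heaps."""
    while True:
        left, right, largest = 2 * i + 1, 2 * i + 2, i
        if left < n and a[left] > a[largest]:
            largest = left
        if right < n and a[right] > a[largest]:
            largest = right
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest


def extract_max(a):
    """Remove and return the largest element of the max heap a."""
    if not a:
        raise IndexError("heap underflow")
    top = a[0]
    last = a.pop()       # remove the last leaf
    if a:
        a[0] = last      # assumed step: the last leaf fills the root
        heapify(a, 0, len(a))
    return top


h = [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
print(extract_max(h))  # 16
print(h)               # [14, 8, 10, 4, 7, 9, 3, 2, 1]
```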

80–85 IncreaseElement: Increase the value of element x to val. Example: in A = [16, 14, 10, 8, 7, 9, 3, 2, 4, 1], increase the 4 at A[9] to 15. The 15 swaps with its parent 8, then with 14, and stops below 16, giving A = [16, 15, 10, 14, 7, 9, 3, 2, 8, 1].

86 Correctness of IncreaseElement Why is it ok to swap values with parent?

87 Correctness of IncreaseElement Stop when heap property is satisfied

88 Running time of IncreaseElement Follows a path from a node to the root Worst case O(height of the tree) O(log n)
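A minimal sketch of IncreaseElement that swaps with the parent until the heap property holds. It uses 0-based indices, so the parent of i is (i - 1) // 2. The `ValueError` guard for a smaller value is an added assumption.

```python
def increase_element(a, i, val):
    """Raise a[i] to val and restore the max-heap property along the path to the root."""
    if val < a[i]:
        raise ValueError("new value is smaller than the current value")
    a[i] = val
    # Swap upward while the parent is smaller; stop when the heap property is satisfied.
    while i > 0 and a[(i - 1) // 2] < a[i]:
        p = (i - 1) // 2
        a[i], a[p] = a[p], a[i]
        i = p


a = [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
increase_element(a, 8, 15)   # the 4 at slide index 9
print(a)  # [16, 15, 10, 14, 7, 9, 3, 2, 8, 1]
```

The loop follows a single path from the node to the root, which gives the O(log n) bound.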

89–92 Insert: Insert val into the set. Example: insert 6 into the heap 16 / 14 10 / 8 7 9 3 / 2 4 1 (by level), then propagate the value up.

93 Running time of Insert Constant amount of work plus one call to IncreaseElement – O(log n)
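Insert can be sketched as appending the value as a new last leaf and then propagating it up with the same loop IncreaseElement uses. Appending to a Python list stands in for growing the heap array, which is an assumption of this sketch. Indices are 0-based.

```python
def insert(a, val):
    """Add val to the max heap a in O(log n)."""
    a.append(val)            # new last leaf
    i = len(a) - 1
    # Propagate the value up while its parent is smaller.
    while i > 0 and a[(i - 1) // 2] < a[i]:
        p = (i - 1) // 2
        a[i], a[p] = a[p], a[i]
        i = p


h = [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
insert(h, 6)
print(h)  # [16, 14, 10, 8, 7, 9, 3, 2, 4, 1, 6]
```

In this example 6 stays where it was appended because its parent, 7, is already larger.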

94 Building a heap: Can we build a heap using the functions we have so far? Maximum(S), ExtractMax(S), Insert(S, val), IncreaseElement(S, x, val)

95 Building a heap

96 Running time of BuildHeap1 n calls to Insert – O(n log n) Can we get a better bound? …
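BuildHeap1 as described, n calls to Insert into an initially empty heap, might be sketched like this. The function names are illustrative and the indices are 0-based.

```python
def insert(heap, val):
    """Append val and propagate it up (O(log n))."""
    heap.append(val)
    i = len(heap) - 1
    while i > 0 and heap[(i - 1) // 2] < heap[i]:
        p = (i - 1) // 2
        heap[i], heap[p] = heap[p], heap[i]
        i = p


def build_heap1(values):
    """Build a max heap with n calls to insert: O(n log n) in the worst case."""
    heap = []                # separate output array
    for v in values:
        insert(heap, v)
    return heap


h = build_heap1([4, 1, 3, 2, 16, 9, 10, 14, 8, 7])
print(h[0])  # 16
```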

97 Building a heap: take 2. Start with n/2 “simple” heaps (the leaves). Call Heapify on elements floor(n/2), floor(n/2)-1, floor(n/2)-2, …, 1. All children have larger indices, so building from the bottom up makes sure that all the children are heaps.

98 Input array A = [4, 1, 3, 2, 16, 9, 10, 14, 8, 7], indices 1 to 10

99–100 heapify at index 5 (value 16): no change

101–102 heapify at index 4: A = [4, 1, 3, 14, 16, 9, 10, 2, 8, 7]

103–104 heapify at index 3: A = [4, 1, 10, 14, 16, 9, 3, 2, 8, 7]

105–106 heapify at index 2: A = [4, 16, 10, 14, 7, 9, 3, 2, 8, 1]

107 heapify at index 1: A = [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]

108 Correctness of BuildHeap2 Invariant:

109 Correctness of BuildHeap2 Invariant: elements A[i+1…n] are all heaps Base case: i = floor(n/2). All elements i+1, i+2, …, n are “simple” heaps Inductive case: We know i+1, i+2,.., n are all heaps, therefore the call to Heapify(A,i) generates a heap at node i Termination?
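A minimal sketch of BuildHeap2: call Heapify bottom-up from the last internal node to the root, in place. With 0-based Python indices the last internal node is n // 2 - 1, which corresponds to floor(n/2) in the slides' 1-based numbering.

```python
def heapify(a, i, n):
    """Sift a[i] down within a[0:n]; subtrees of i must already be heaps."""
    while True:
        left, right, largest = 2 * i + 1, 2 * i + 2, i
        if left < n and a[left] > a[largest]:
            largest = left
        if right < n and a[right] > a[largest]:
            largest = right
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest


def build_heap2(a):
    """Turn the list a into a max heap in place."""
    n = len(a)
    # Invariant: before each call, every node with index > i roots a heap.
    for i in range(n // 2 - 1, -1, -1):
        heapify(a, i, n)
    return a


print(build_heap2([4, 1, 3, 2, 16, 9, 10, 14, 8, 7]))
# [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
```

The loop ends after the call at the root, at which point the invariant says the whole array is a heap.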

110 Running time of BuildHeap2 n/2 calls to Heapify – O(n log n) Can we get a tighter bound?

111 Running time of BuildHeap2: all nodes at the same level will have the same cost. How many nodes are at level d? 2^d

112 Running time of BuildHeap2 ?

113 Nodes at height h: h=0: ≤ ceil(n/2) nodes; h=1: ≤ ceil(n/4) nodes; h=2: ≤ ceil(n/8) nodes; height h: ≤ ceil(n/2^(h+1)) nodes

114 Running time of BuildHeap2
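The tighter bound follows by summing the Heapify cost over heights. This is a sketch of the standard derivation, assuming a call to Heapify at a node of height h costs O(h) and using the bound of at most ceil(n/2^(h+1)) nodes at height h; the equation on the original slide was not recoverable.

```latex
\begin{align*}
T(n) &= \sum_{h=0}^{\lfloor \log_2 n \rfloor}
        \left\lceil \frac{n}{2^{h+1}} \right\rceil \cdot O(h) \\
     &= O\!\left( n \sum_{h=0}^{\lfloor \log_2 n \rfloor} \frac{h}{2^{h}} \right) \\
     &= O\!\left( n \sum_{h=0}^{\infty} \frac{h}{2^{h}} \right)
      = O(2n) = O(n),
\qquad \text{since } \sum_{h=0}^{\infty} \frac{h}{2^{h}} = 2 .
\end{align*}
```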

115 BuildHeap1 vs. BuildHeap2. Runtime: BuildHeap2 is O(n); BuildHeap1 is O(n log n) by the earlier bound (n calls to Insert). BuildHeap2 may also have smaller constants (only n/2 calls). Memory: both O(n), but BuildHeap1 requires an additional array, i.e. 2n memory. Complexity/ease of implementation.

116 Heap uses Heapsort Build a heap Call ExtractMax for all the elements O(n log n) running time Priority queues scheduling tasks: jobs, processes, network traffic A* search algorithm
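Heapsort as outlined (build a heap, then extract the maximum for every element) can be sketched in place. Each extracted maximum is swapped to the end of the shrinking heap region. Doing it in place, and the 0-based indexing, are implementation choices assumed here rather than details from the slide.

```python
def heapify(a, i, n):
    """Sift a[i] down within a[0:n]; subtrees of i must already be heaps."""
    while True:
        left, right, largest = 2 * i + 1, 2 * i + 2, i
        if left < n and a[left] > a[largest]:
            largest = left
        if right < n and a[right] > a[largest]:
            largest = right
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest


def heapsort(a):
    """Sort the list a in place in O(n log n) and return it."""
    n = len(a)
    # Build a max heap bottom-up.
    for i in range(n // 2 - 1, -1, -1):
        heapify(a, i, n)
    # Repeatedly move the current maximum to the end of the heap region.
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        heapify(a, 0, end)
    return a


print(heapsort([4, 1, 3, 2, 16, 9, 10, 14, 8, 7]))
# [1, 2, 3, 4, 7, 8, 9, 10, 14, 16]
```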

117 Other heaps


