1
CSE 326: Data Structures: Sorting
Lecture 13: Wednesday, Feb 5, 2003
2
Today: finish extensible hash tables; start sorting (will take several lectures).
Read Chapter 7! Except Shellsort (7.4).
3
Hash Tables on Secondary Storage (Disks)
Main differences: one bucket = one block, hence it may hold multiple keys. Open chaining: use overflow blocks when needed. Closed chaining is never used.
4
Hash Table Example Assume 1 bucket (block) stores 2 keys + pointers
h(e)=0, h(b)=h(f)=1, h(g)=2, h(a)=h(c)=3. [Diagram: buckets 0-3 holding e | b, f | g | a, c]
5
Searching in a Hash Table
Search for a: compute h(a)=3, read bucket 3. One disk access. [Diagram: same table as before; bucket 3 holds a, c]
6
Insertion in Hash Table
Place in the right bucket, if there is space. E.g. h(d)=2. [Diagram: d added to bucket 2, next to g]
7
Insertion in Hash Table
Create an overflow block, if there is no space. E.g. h(k)=1. More overflow blocks may be needed. [Diagram: k placed in an overflow block chained to bucket 1]
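For concreteness, here is a minimal in-memory C++ sketch of this scheme; it is illustrative only, not the lecture's code. The names Block, DiskHashTable, and the toy hash function are assumptions; a real implementation would read and write actual disk blocks, but the structure (one primary block per bucket, plus a chain of overflow blocks) is the same.

#include <memory>
#include <vector>

const int BLOCK_CAPACITY = 2;              // keys per block, as in the example

struct Block {                             // stands in for one disk block
    std::vector<char> keys;                // at most BLOCK_CAPACITY keys
    std::unique_ptr<Block> overflow;       // chain of overflow blocks
};

struct DiskHashTable {
    std::vector<Block> buckets;            // one primary block per bucket
    explicit DiskHashTable(int n) : buckets(n) {}

    // Toy hash function; the slides simply postulate values such as h(e)=0.
    int h(char key) const { return key % (int)buckets.size(); }

    // Insertion: place in the right bucket if there is space,
    // otherwise walk the overflow chain, extending it if needed.
    void insert(char key) {
        Block* b = &buckets[h(key)];
        while ((int)b->keys.size() == BLOCK_CAPACITY) {
            if (!b->overflow) b->overflow = std::make_unique<Block>();
            b = b->overflow.get();
        }
        b->keys.push_back(key);
    }

    // Search: one "disk read" for the primary block, plus one per overflow block.
    bool find(char key) const {
        for (const Block* b = &buckets[h(key)]; b; b = b->overflow.get())
            for (char k : b->keys)
                if (k == key) return true;
        return false;
    }
};

Searching reads the primary block plus every overflow block on its chain, which is exactly why many overflow blocks degrade performance.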
8
Hash Table Performance
Excellent, if there are no overflow blocks. Degrades considerably when the number of keys exceeds the number of buckets (i.e., many overflow blocks).
9
Extensible Hash Table Allows the hash table to grow, to avoid performance degradation. Assume a hash function h that returns numbers in {0, …, 2^k – 1}. Start with n = 2^i << 2^k buckets; only look at the first i most significant bits.
10
Extensible Hash Table E.g. i=1, n=2^i=2, k=4
Note: we only look at the first bit (0 or 1). [Diagram: directory with i=1; entry 0 → bucket holding 0(010); entry 1 → bucket holding 1(011); each bucket is labelled 1, the number of bits it uses]
11
Insertion in Extensible Hash Table
[Diagram: after inserting 1(110): entry 0 → bucket holding 0(010); entry 1 → bucket holding 1(011), 1(110); both buckets labelled 1]
12
Insertion in Extensible Hash Table
Now insert 1010. Need to extend the table and split blocks; i becomes 2. [Diagram: with i=1, the bucket for prefix 1 would have to hold 1(011), 1(110), 1(010), which does not fit]
13
Insertion in Extensible Hash Table
[Diagram: after the extension, i=2; entries 00 and 01 both point to the bucket holding 0(010) (labelled 1); entry 10 → bucket holding 10(11), 10(10) (labelled 2); entry 11 → bucket holding 11(10) (labelled 2)]
14
Insertion in Extensible Hash Table
Now insert 0000, then 0101. Need to split the block. [Diagram: i=2; the bucket shared by 00 and 01 would have to hold 0(010), 0(000), 0(101), which does not fit]
15
Insertion in Extensible Hash Table
After splitting the block: [Diagram: i=2; entry 00 → bucket holding 00(10), 00(00); entry 01 → bucket holding 01(01); entry 10 → bucket holding 10(11), 10(10); entry 11 → bucket holding 11(10); all buckets labelled 2]
16
Extensible Hash Table How many buckets (blocks) do we need to touch after an insertion? Only one block: the one that overflowed. How many entries in the hash table do we need to touch after an insertion? When the table is extended, we need to copy all hash table entries from the old table to the new table.
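A compact C++ sketch of the whole insertion procedure, including directory doubling and block splitting, is below. It is illustrative only (Bucket, ExtensibleHash, the fixed k = 4, and the recursive redistribution are assumptions, not the course code), and it assumes no more than two keys ever share the same full k-bit hash. Inserting 0010, 1011, 1110, 1010, 0000, 0101 in that order should reproduce the splits shown on the previous slides.

#include <cstdint>
#include <memory>
#include <vector>

const int K = 4;                  // hash values lie in {0, ..., 2^K - 1}
const int CAPACITY = 2;           // keys per bucket (block)

struct Bucket {
    int localDepth = 1;                           // bits this bucket uses (its label in the diagrams)
    std::vector<uint32_t> hashes;                 // stored hash values, e.g. 0b0010
};

struct ExtensibleHash {
    int i = 1;                                    // global depth
    std::vector<std::shared_ptr<Bucket>> dir;     // 2^i directory entries

    ExtensibleHash() : dir(2) {
        dir[0] = std::make_shared<Bucket>();
        dir[1] = std::make_shared<Bucket>();
    }

    int index(uint32_t h) const { return h >> (K - i); }   // first i bits of the hash

    // Assumes no more than CAPACITY keys ever share the same full K-bit hash.
    void insert(uint32_t h) {
        std::shared_ptr<Bucket> b = dir[index(h)];
        if ((int)b->hashes.size() < CAPACITY) { b->hashes.push_back(h); return; }

        if (b->localDepth == i) {                 // directory must double: i becomes i+1
            i++;
            std::vector<std::shared_ptr<Bucket>> bigger(dir.size() * 2);
            for (std::size_t j = 0; j < bigger.size(); j++) bigger[j] = dir[j / 2];
            dir = bigger;                         // every old entry now appears twice
        }
        // Split the overflowing bucket into two buckets that are one bit deeper.
        auto b0 = std::make_shared<Bucket>(), b1 = std::make_shared<Bucket>();
        b0->localDepth = b1->localDepth = b->localDepth + 1;
        for (std::size_t j = 0; j < dir.size(); j++)          // re-point affected entries
            if (dir[j] == b)
                dir[j] = ((j >> (i - b0->localDepth)) & 1) ? b1 : b0;

        std::vector<uint32_t> pending = b->hashes;             // redistribute old keys
        pending.push_back(h);                                  // plus the new one
        for (uint32_t x : pending) insert(x);
    }
};

Note how this matches the answers above: splitting touches only the overflowing block (plus its new sibling), but doubling the directory copies every directory entry.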
17
Performance Extensible Hash Table
No overflow blocks: access is always O(1). More precisely: exactly one disk I/O. BUT: extensions can be costly and disruptive. After an extension, the table may no longer fit in memory.
18
Sorting Perhaps the most common operation in programs
The authoritative text: D. Knuth, The Art of Computer Programming, Vol. 3
19
Material to be Covered
Sorting by comparison: Bubble Sort, Selection Sort, Merge Sort, QuickSort; efficient list-based implementations; formal analysis; theoretical limitations on sorting by comparison. Sorting without comparing elements. Sorting and the memory hierarchy.
20
Bubble Sort Idea We want A[1] ≤ A[2] ≤ … ≤ A[N]. Bubble sort idea:
If A[i-1] > A[i] then swap A[i-1] and A[i]. Do this for i = 1, …, n-1. Repeat until the array is sorted.
21
Bubble Sort
procedure BubbleSort(Array A, int N)
  repeat {
    isSorted = true;
    for (i = 1 to N-1) {
      if (A[i-1] > A[i]) {
        swap(A[i-1], A[i]);
        isSorted = false;
      }
    }
  } until isSorted
22
Bubble Sort Improvements
After the 1st iteration: the largest element is in A[n-1]. After the 2nd iteration: the second largest element is in A[n-2]. Question: what is the max number of iterations, and hence the worst-case running time? Improvement: stop the iterations earlier: for (i=1 to N-1), for (i=1 to N-2), ..., for (i=1 to 1). In fact we may be lucky and be able to decrease i more aggressively.
23
Bubble Sort
procedure BubbleSort(Array A, int N)
  m = N;
  repeat {
    newM = 1;
    for (i = 1 to m-1) {
      if (A[i-1] > A[i]) {
        swap(A[i-1], A[i]);
        newM = i;    /* last swap position: A[i..] is already in its final place */
      }
    }
    m = newM;
  } while m > 1
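As a sanity check, here is the improved version in runnable C++ (the name bubbleSort is ours, not the course code): newM records the position of the last swap, and everything from that position onward is already in its final place, so the next pass can stop there.

#include <utility>
#include <vector>

// Bubble sort with the last-swap improvement.
void bubbleSort(std::vector<int>& A) {
    int m = (int)A.size();             // only A[0..m-1] still needs work
    while (m > 1) {
        int newM = 1;                  // if no swap happens, we are done
        for (int i = 1; i < m; i++) {
            if (A[i-1] > A[i]) {
                std::swap(A[i-1], A[i]);
                newM = i;              // elements at i and beyond are final
            }
        }
        m = newM;
    }
}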
24
Bubble Sort So the worst-case running time is T(n) = O(n^2).
Is the worst-case running time also Ω(n^2)? You need to find a worst-case input of size n for which the running time is Ω(n^2).
25
Selection Sort
procedure SelectSort(Array A, int N)
  for (i = 0 to N-2) {
    /* find the minimum among A[i], ..., A[N-1] and place it in A[i] */
    m = i;
    for (j = i+1 to N-1)
      if (A[m] > A[j]) m = j;
    swap(A[i], A[m]);
  }
[Diagram: A[0..i-1] is finished; find the minimum of A[i..N-1] and move it to A[i]]
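In idiomatic C++ the same idea is very short: std::min_element finds the minimum of the unsorted suffix and std::iter_swap moves it into place. A sketch, not the course code:

#include <algorithm>
#include <vector>

// Selection sort: repeatedly move the minimum of the unsorted suffix to the front of it.
void selectSort(std::vector<int>& A) {
    for (auto it = A.begin(); it != A.end(); ++it)
        std::iter_swap(it, std::min_element(it, A.end()));
}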
26
Selection Sort Worst case running time: T(n) = O( ?? ), T(n) = Ω( ?? )
27
Insertion Sort
procedure InsertSort(Array A, int N)
  for (i = 1 to N-1) {
    /* A[0], A[1], ..., A[i-1] is sorted */
    /* now insert A[i] in the right place */
    x = A[i];
    for (j = i-1; j >= 0 && A[j] > x; j--)
      A[j+1] = A[j];
    A[j+1] = x;
  }
[Diagram: A[0..i-1] is sorted, but not necessarily finished; insert A[i] to the left]
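A runnable C++ rendering of the corrected loop (the name insertSort is ours); the key detail is that the saved element goes into A[j+1] once the shifting stops.

#include <vector>

// Insertion sort: A[0..i-1] is sorted; shift larger elements right
// and drop A[i] into the gap.
void insertSort(std::vector<int>& A) {
    for (int i = 1; i < (int)A.size(); i++) {
        int x = A[i];
        int j = i - 1;
        while (j >= 0 && A[j] > x) {
            A[j+1] = A[j];             // shift right
            j--;
        }
        A[j+1] = x;                    // insert into the right place
    }
}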
28
Insertion Sort Worst case running time: T(n) = O( ?? ), T(n) = Ω( ?? )
29
Merge Sort The Merge Operation: given two sorted sequences
A[0] ≤ A[1] ≤ ... ≤ A[m-1] and B[0] ≤ B[1] ≤ ... ≤ B[n-1],
construct another sorted sequence that is their union.
Merge(A[0..m-1], B[0..n-1])
  i1 = 0, i2 = 0
  While i1 < m and i2 < n
    If A[i1] < B[i2]
      Next is A[i1]; i1++
    Else
      Next is B[i2]; i2++
    End If
  End While
  /* then copy the remaining elements of A or B */
Merging cars by key [aggressiveness of driver]: the most aggressive goes first.
30
Merge Sort Function MergeSort(Array A[0..n-1])
  if n ≤ 1 return A
  return Merge(MergeSort(A[0..n/2-1]), MergeSort(A[n/2..n-1]))
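An array-based C++ sketch (the names mergeSort and scratch are ours, not the course code). It allocates the scratch array once and reuses it, since the merge step cannot easily be done in place.

#include <vector>

// Merge two sorted halves A[lo..mid-1] and A[mid..hi-1] using a scratch buffer.
static void merge(std::vector<int>& A, int lo, int mid, int hi,
                  std::vector<int>& scratch) {
    int i1 = lo, i2 = mid, out = lo;
    while (i1 < mid && i2 < hi)
        scratch[out++] = (A[i1] < A[i2]) ? A[i1++] : A[i2++];
    while (i1 < mid) scratch[out++] = A[i1++];   // copy whatever is left of the first half
    while (i2 < hi)  scratch[out++] = A[i2++];   // ... or of the second half
    for (int k = lo; k < hi; k++) A[k] = scratch[k];
}

static void mergeSortRec(std::vector<int>& A, int lo, int hi,
                         std::vector<int>& scratch) {
    if (hi - lo <= 1) return;                    // 0 or 1 elements: already sorted
    int mid = lo + (hi - lo) / 2;
    mergeSortRec(A, lo, mid, scratch);
    mergeSortRec(A, mid, hi, scratch);
    merge(A, lo, mid, hi, scratch);
}

void mergeSort(std::vector<int>& A) {
    std::vector<int> scratch(A.size());          // the extra array the slides mention
    mergeSortRec(A, 0, (int)A.size(), scratch);
}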
31
Merge Sort Running Time
Any difference best / worst case?
T(1) = b
T(n) = 2T(n/2) + cn for n > 1
T(n) = 2T(n/2) + cn
     = 4T(n/4) + cn + cn        (substitute)
     = 8T(n/8) + cn + cn + cn   (substitute)
     = 2^k T(n/2^k) + kcn       (inductive leap)
     = nT(1) + cn log n         (where k = log n; select a value for k)
     = Θ(n log n)               (simplify)
This is the same sort of analysis as seen before: here is a function defined in terms of itself; work through it. Answer: O(n log n). Generally, the strategy is to keep expanding these things out until you see a pattern, then write the general form, and finally substitute a value of k that makes T(·) come out to a known value and solve the resulting sums. Tip: look for powers/multiples of the numbers that appear in the original equation.
32
Merge Sort Works great with lists, or files Problems with arrays:
We need a scratch array, cannot sort ‘in situ’
33
Heap Sort Recall: a heap is a tree where the min is at the root
A heap is stored in an array A[1], ..., A[n]
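With this 1-based array layout the tree structure is implicit; a minimal sketch of the (standard, assumed) index arithmetic:

// 1-based heap layout: A[1] is the root.
inline int parent(int i)     { return i / 2; }
inline int leftChild(int i)  { return 2 * i; }
inline int rightChild(int i) { return 2 * i + 1; }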
34
Heap Sort Start with an unsorted array A[1], ..., A[n] Build a heap
How much time does it take? Get the minimum, store it in the output array; repeat n times. [Diagram: input array A[0..n-1] shrinking, output array B[0..i] growing]
35
Heap Sort But then we need an extra array!
How can we do it 'in situ'?
36
Heap Sort Input: unordered array A[1..N]
Build a max heap (largest element is A[1]). For i = 1 to N-1: A[N-i+1] = Delete_Max(). [Example: 7 50 22 15 4 40 20 10 35 25 → after Build_Heap: 50 40 20 25 35 15 10 22 4 7 → after one Delete_Max: 40 35 20 25 7 15 10 22 4 | 50 → after another: 35 25 20 22 7 15 10 4 | 40 50]
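The same procedure as a C++ sketch, written with 0-based indexing rather than the slides' A[1..N] (the names heapSort and percolateDown are ours): build a max heap bottom-up, then repeatedly swap the maximum into the shrinking tail of the array.

#include <utility>
#include <vector>

// Sift A[i] down within the max heap A[0..n-1] (0-based indexing).
static void percolateDown(std::vector<int>& A, int n, int i) {
    while (true) {
        int left = 2*i + 1, right = 2*i + 2, largest = i;
        if (left  < n && A[left]  > A[largest]) largest = left;
        if (right < n && A[right] > A[largest]) largest = right;
        if (largest == i) return;
        std::swap(A[i], A[largest]);
        i = largest;
    }
}

void heapSort(std::vector<int>& A) {
    int n = (int)A.size();
    // Build_Heap: O(n) bottom-up construction.
    for (int i = n/2 - 1; i >= 0; i--) percolateDown(A, n, i);
    // n-1 Delete_Max steps: move the max to the end, shrink the heap.
    for (int last = n - 1; last > 0; last--) {
        std::swap(A[0], A[last]);      // A[last] now holds its final value
        percolateDown(A, last, 0);
    }
}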
37
Properties of Heap Sort
Worst case time complexity O(n log n): Build_heap is O(n), the n Delete_Max's are O(n log n). In-place sort: only constant storage beyond the array is needed.
38
QuickSort Pick a "pivot". Divide the list into two lists:
one with the elements less than or equal to the pivot value, and one with the elements greater than the pivot. Sort each sub-problem recursively. The answer is the concatenation of the two solutions.
39
QuickSort: Array-Based Version
Pick the pivot: 7. Array: 7 2 8 3 5 9 6. Partition with cursors < and > moving inward from the ends: 2 goes to the less-than side.
40
QuickSort Partition (cont’d)
6 and 8 swap across the less-than / greater-than boundary: 7 2 6 3 5 9 8. Then 3 and 5 join the less-than side and 9 the greater-than side. Partition done: 7 2 6 3 5 9 8.
41
QuickSort Partition (cont’d)
Put the pivot into its final position: 5 2 6 3 7 9 8. Recursively sort each side: 2 3 5 6 7 8 9.
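A C++ sketch consistent with this walkthrough: the first element is the pivot and two cursors move inward, swapping out-of-place pairs. The course's actual partition code may differ (for example in how the pivot is chosen), so treat this as one possible realization.

#include <utility>
#include <vector>

// Partition A[lo..hi] around the pivot A[lo]; returns the pivot's final index.
static int partition(std::vector<int>& A, int lo, int hi) {
    int pivot = A[lo];
    int i = lo + 1, j = hi;                    // the two cursors from the slides
    while (true) {
        while (i <= j && A[i] <= pivot) i++;   // grow the less-than-or-equal side
        while (i <= j && A[j] >  pivot) j--;   // grow the greater-than side
        if (i > j) break;
        std::swap(A[i], A[j]);                 // out-of-place pair: swap them
    }
    std::swap(A[lo], A[j]);                    // put the pivot into its final position
    return j;
}

void quickSort(std::vector<int>& A, int lo, int hi) {
    if (lo >= hi) return;                      // 0 or 1 elements: done
    int p = partition(A, lo, hi);
    quickSort(A, lo, p - 1);                   // sort each side recursively
    quickSort(A, p + 1, hi);
}

void quickSort(std::vector<int>& A) { quickSort(A, 0, (int)A.size() - 1); }

On 7 2 8 3 5 9 6 this partition should perform the single 6/8 swap shown above and leave 5 2 6 3 7 9 8, with the pivot 7 in its final position before the two recursive calls.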
42
QuickSort Complexity QuickSort is fast in practice, but has Ω(N^2) worst-case complexity. Friday we will see why.