Published by Barrie Allen. Modified over 9 years ago.
1
Oct 29, 2001, CSE 373, Autumn 2001

External Storage
For large data sets, the computer will have to access the disk. A disk access can take 200,000 times longer than a machine instruction. The RAM model does not account for disk I/O.
memory: 128 MB, fast, expensive
disk: 60 GB, slow, cheap
2
Disks, continued
The difference between memory speed and disk speed is increasing.
Example: State of Florida driving records (256 bytes each), 10,000,000 items, 6 disk accesses per second on a time-sharing system.
Unbalanced binary search tree: possibly 10,000,000 accesses.
BST: on average 32 accesses (5 sec.).
AVL tree: worst case 1.44 log n accesses; typical case about log n, 25 accesses (4 sec.).
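The access counts in the Florida example can be checked with a few lines of arithmetic. This is only a sketch: the average depth of a randomly built BST is taken as 2 ln N (roughly 1.39 log2 N), a standard result that the slide does not state, and the typical AVL figure of 25 accesses is copied from the slide rather than derived.

```python
import math

N = 10_000_000            # number of driving records (from the slide)
ACCESSES_PER_SECOND = 6   # disk accesses per second (from the slide)


def seconds(accesses):
    """Time spent on disk for the given number of accesses."""
    return accesses / ACCESSES_PER_SECOND


log2_n = math.log2(N)             # about 23.25
bst_average = 2 * math.log(N)     # assumed average depth of a random BST
avl_worst = 1.44 * log2_n         # worst-case bound quoted on the slide
avl_typical = 25                  # figure quoted on the slide, not derived

for label, accesses in [
    ("unbalanced BST, worst", N),
    ("BST, average", bst_average),
    ("AVL, worst", avl_worst),
    ("AVL, typical", avl_typical),
]:
    print(f"{label:22s} {accesses:12.1f} accesses {seconds(accesses):12.1f} s")
```

At 6 accesses per second the degenerate tree would need on the order of weeks for a single lookup, which is why the height of the tree dominates everything else here.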
3
Disk accesses
Goal: reduce the number of disk accesses. We are willing to do more complicated computations in memory in order to save disk time.
Idea: increase the branching of the tree so that the height is decreased.
Defn: An M-ary search tree allows up to M children per node.
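To see how much the branching factor helps, the sketch below computes the smallest height at which a complete M-ary tree has room for N items at its bottom level. The function name `min_height` and the sample values of M are illustrative choices, not part of the slides.

```python
def min_height(n_items, m):
    """Smallest h such that m**h >= n_items (integer arithmetic only)."""
    height, capacity = 0, 1
    while capacity < n_items:
        capacity *= m
        height += 1
    return height


N = 10_000_000
for m in (2, 10, 100):
    print(f"M = {m:3d}: height {min_height(N, m)}")
```

Each level costs roughly one disk access, so going from a binary tree to M = 100 cuts the number of levels from 24 to 4 for this data set.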
4
B-Trees
1. All the data items are stored at the leaves.
2. The non-leaf nodes store up to M-1 keys. The ith key represents the smallest key in subtree i+1.
3. The root is either a leaf or has between 2 and M children.
4. All non-leaf nodes (except the root) have between ⌈M/2⌉ and M children.
5. All leaves are at the same depth and have between ⌈L/2⌉ and L data items.
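Properties 1 and 2 are enough to sketch a lookup. The classes `Leaf` and `Internal` and the function `find` below are hypothetical names for illustration; the sketch handles searching only, not insertion or the occupancy rules in properties 3 to 5. Because key i is the smallest key in subtree i+1, the number of keys less than or equal to the search key is exactly the index of the child to descend into.

```python
from bisect import bisect_right


class Leaf:
    """Leaf node: holds the actual data items (property 1)."""

    def __init__(self, items):
        self.items = sorted(items)


class Internal:
    """Non-leaf node: keys[i] is the smallest key in children[i + 1]."""

    def __init__(self, keys, children):
        assert len(children) == len(keys) + 1
        self.keys = keys
        self.children = children


def find(node, x):
    """Return True if x is stored in the tree rooted at node."""
    while isinstance(node, Internal):
        # Count of keys <= x selects the subtree that could contain x.
        node = node.children[bisect_right(node.keys, x)]
    return x in node.items


# A tiny hand-built tree: one internal root over three leaves.
root = Internal([20, 40], [Leaf([5, 12]), Leaf([20, 31]), Leaf([40, 47, 58])])
print(find(root, 31), find(root, 13))
```

In a disk-based tree each step of the `while` loop would correspond to reading one block, so the loop count is the number of disk accesses.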
5
B-Trees: Choices
Choose M and L based on the size of the keys K and on the size of the record R. Suppose a disk block is of size B (bytes).
Choose M so that a non-leaf node fits into one block: $B \ge (M-1) \cdot K + M \cdot 4$
Choose L so that a leaf node fits into one block: $B \ge L \cdot R$
Accesses: $\log_2 N$ vs. $\log_{M/2} N$
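The two block-size constraints can be solved for the largest M and L directly. In the sketch below the block size of 8192 bytes and the key size of 32 bytes are assumed example values, the record size of 256 bytes is borrowed from the Florida example, and the constant 4 is read as the size of a child pointer in bytes. The function names are illustrative.

```python
def choose_m(block_size, key_size, pointer_size=4):
    """Largest M with (M - 1) * key_size + M * pointer_size <= block_size."""
    return (block_size + key_size) // (key_size + pointer_size)


def choose_l(block_size, record_size):
    """Largest L with L * record_size <= block_size."""
    return block_size // record_size


B, K, R = 8192, 32, 256   # assumed example sizes in bytes
M = choose_m(B, K)
L = choose_l(B, R)
print(f"M = {M}, L = {L}")
```

With these numbers every non-root internal node has at least M/2 = 114 children, so the height grows with the logarithm of N to base 114 instead of base 2.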
6
Hash Tables
Constant time accesses!
A hash table is an array of some fixed size; the size is usually a prime number.
General idea: a hash function h(K) maps each key in the key space (e.g., strings) to an index in the range 0 … TableSize - 1 of the hash table.
7
Desirable Properties
We want a hash function to:
1. be simple/fast to compute,
2. map different keys to different cells (impossible – why?),
3. have keys distributed evenly among cells.
Idea: If #1 and #3 are true and the hash table is not very full, then it should be fast to do a find.
8
Example
key space = integers
h(K) = K mod 10
0:
1: 41
2:
3:
4: 34
5:
6:
7: 7
8: 18
9:
We lose all ordering information: findMin, findMax, inorder traversal, printing items in sorted order.
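The table in this example can be reproduced in a few lines. This is a minimal sketch that ignores collisions (the four keys happen not to collide); `find_min` is a hypothetical helper added only to show what losing the ordering costs.

```python
TABLE_SIZE = 10


def h(key):
    """Hash function from the slide: K mod 10."""
    return key % TABLE_SIZE


table = [None] * TABLE_SIZE
for key in (41, 34, 7, 18):
    table[h(key)] = key   # no collision handling in this sketch


def find_min(cells):
    """Without ordering, the minimum requires scanning every cell."""
    return min(k for k in cells if k is not None)


print(table)
print(find_min(table))
```

A find for a single key touches one cell, but `find_min` has to visit all TableSize cells, because the position of a key says nothing about its rank.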
9
Example 2
key space = strings
s = s_0 s_1 s_2 … s_{k-1}
h(s) = s_0 mod TableSize (BAD HASH FUNCTION)
h(s) = [expression not recoverable from the slide] mod TableSize (BETTER HASH FUNCTION)
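The contrast between the two functions can be illustrated as follows. The first function is the one given on the slide. The second is an assumption: a polynomial in the character codes with base 37 is a common textbook choice, but the expression on the slide did not survive extraction, so it may differ. The table size of 101 and the function names are also illustrative.

```python
TABLE_SIZE = 101   # assumed prime table size


def bad_hash(s):
    """h(s) = s_0 mod TableSize: depends only on the first character.

    Assumes s is non-empty.
    """
    return ord(s[0]) % TABLE_SIZE


def polynomial_hash(s, base=37):
    """Assumed alternative: (sum of s_i * base**i) mod TableSize.

    Evaluated with Horner's rule from the last character to the first,
    reducing mod TableSize at each step to keep the numbers small.
    """
    value = 0
    for ch in reversed(s):
        value = (value * base + ord(ch)) % TABLE_SIZE
    return value


words = ["stack", "stop", "star", "string"]
print([bad_hash(w) for w in words])
print([polynomial_hash(w) for w in words])
```

Every word starting with the same letter lands in the same cell under `bad_hash`, whereas a function that uses all the characters can separate such words.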
10
Collision Resolution
Separate chaining: All keys that map to the same hash value are kept in a list.
0: 10
1:
2: 22 → 12 → 42
3:
4:
5:
6:
7: 107
8:
9:
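Separate chaining can be sketched with one Python list per cell. The class name `ChainedHashTable` is hypothetical, the hash function K mod 10 is assumed to carry over from the earlier integer example, and new keys are appended to the end of a chain here, although inserting at the front is an equally valid choice.

```python
class ChainedHashTable:
    """Hash table for integers using separate chaining."""

    def __init__(self, table_size=10):
        self.table_size = table_size
        self.cells = [[] for _ in range(table_size)]   # one chain per cell

    def _h(self, key):
        return key % self.table_size   # assumed hash function: K mod 10

    def insert(self, key):
        chain = self.cells[self._h(key)]
        if key not in chain:           # ignore duplicates
            chain.append(key)

    def find(self, key):
        return key in self.cells[self._h(key)]


table = ChainedHashTable()
for key in (10, 107, 22, 12, 42):
    table.insert(key)

for index, chain in enumerate(table.cells):
    print(index, chain)
```

A find only searches the one chain selected by the hash value, so its cost depends on the length of that chain rather than on the total number of keys.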