Download presentation
Presentation is loading. Please wait.
Published byMarian Robertson Modified over 9 years ago
1
1 CPS216: Advanced Database Systems Notes 04: Operators for Data Access Shivnath Babu
2
Problem Relation: Employee (ID, Name, Dept, …) Relation: Employee (ID, Name, Dept, …) 10 M tuples 10 M tuples (Filter) Query: SELECT * FROM Employee WHERE Name = “Bob” (Filter) Query: SELECT * FROM Employee WHERE Name = “Bob”
3
3 Solution #1: Full Table Scan Storage: Storage: Employee relation stored in contiguous blocks Employee relation stored in contiguous blocks Query plan: Query plan: Scan the entire relation, output tuples with Name = “Bob” Scan the entire relation, output tuples with Name = “Bob” Cost: Cost: Size of each record = 100 bytes Size of each record = 100 bytes Size of relation = 10 M x 100 = 1 GB Size of relation = 10 M x 100 = 1 GB Time @ 20 MB/s ≈ 1 Minute Time @ 20 MB/s ≈ 1 Minute
4
4 Solution #2 Storage: Storage: Employee relation sorted on Name attribute Employee relation sorted on Name attribute Query plan: Query plan: Binary search Binary search
5
5 Solution #2 Cost: Cost: Size of a block: 1024 bytes Size of a block: 1024 bytes Number of records per block: 1024 / 100 = 10 Number of records per block: 1024 / 100 = 10 Total number of blocks: 10 M / 10 = 1 M Total number of blocks: 10 M / 10 = 1 M Blocks accessed by binary search: 20 Blocks accessed by binary search: 20 Total time: 20 ms x 20 = 400 ms Total time: 20 ms x 20 = 400 ms
6
6 Solution #2: Issues Filters on different attributes: SELECT * FROM Employee WHERE Dept = “Sales” Filters on different attributes: SELECT * FROM Employee WHERE Dept = “Sales” Inserts and Deletes Inserts and Deletes
7
7 Indexes Data structures that efficiently evaluate a class of filter predicates over a relation Data structures that efficiently evaluate a class of filter predicates over a relation Class of filter predicates: Class of filter predicates: Single or multi-attributes (index-key attributes) Single or multi-attributes (index-key attributes) Range and/or equality predicates Range and/or equality predicates (Usually) independent of physical storage of relation: (Usually) independent of physical storage of relation: Multiple indexes per relation Multiple indexes per relation
8
8 Indexes Disk resident Disk resident Large to fit in memory Large to fit in memory Persistent Persistent Updated when indexed relation updated Updated when indexed relation updated Relation updates costlier Relation updates costlier Query cheaper Query cheaper
9
Problem Relation: Employee (ID, Name, Dept, …) Relation: Employee (ID, Name, Dept, …) (Filter) Query: SELECT * FROM Employee WHERE Name = “Bob” (Filter) Query: SELECT * FROM Employee WHERE Name = “Bob” Name Single-Attribute Index on Name that supports equality predicates
10
10 Roadmap Motivation Motivation Single-Attribute Indexes: Overview Single-Attribute Indexes: Overview Order-based Indexes Order-based Indexes B-Trees B-Trees Hash-based Indexes Hash-based Indexes Extensible Hashing Extensible Hashing Linear Hashing Linear Hashing Multi-Attribute Indexes (Chapter 14 GMUW, May cover in future) Multi-Attribute Indexes (Chapter 14 GMUW, May cover in future)
11
Single Attribute Index: General Construction b 1 2 b i b n b a 1 2 a i a n a AB
12
b 1 2 b i b n b a 1 2 a i a n a a 1 2 a i a n a AB A = val A > low A < high
13
13 Exceptions Sparse Indexes Sparse Indexes Require specific physical layout of relation Require specific physical layout of relation Example: Relation sorted on indexed attribute Example: Relation sorted on indexed attribute More efficient More efficient
14
14 Single Attribute Index: General Construction b 1 2 b i b n b a 1 2 a i a n a a 1 2 a i a n a AB A = val A > low A < high Textbook: Dense Index
15
Single Attribute Index: General Construction a 1 2 a i a n a A = val A > low A < high How do we organize (attribute, pointer) pairs? Idea: Use dictionary data structures Issue: Disk resident?
16
16 Roadmap Motivation Motivation Single-Attribute Indexes: Overview Single-Attribute Indexes: Overview Order-based Indexes Order-based Indexes B-Trees B-Trees Hash-based Indexes Hash-based Indexes Extensible Hashing Extensible Hashing Linear Hashing Linear Hashing Multi-Attribute Indexes (Next class) Multi-Attribute Indexes (Next class)
17
17 B-Trees Adaptation of search tree data structure Adaptation of search tree data structure 2-3 trees 2-3 trees Supports range predicates (and equality) Supports range predicates (and equality)
18
Use Binary Search Tree Directly? 16325471748392 16 74 71 54 32 92 83
19
19 Use Binary Search Tree Directly? Store records of type Store records of type Remember position of root Remember position of root Question: will this work? Question: will this work? Yes Yes But we can do better! But we can do better!
20
20 Use Binary Search Tree Directly? Number of keys: 1 M Number of keys: 1 M Number of levels: log (2^20) = 20 Number of levels: log (2^20) = 20 Total cost index lookup: 20 random disk I/O Total cost index lookup: 20 random disk I/O 20 x 20 ms = 400 ms 20 x 20 ms = 400 ms B-Tree: less than 3 random disk I/O
21
21 B-Tree vs. Binary Search Tree k k1 k2k3k40 1 Random I/O prunes tree by half 1 Random I/O prunes tree by 40
22
22 B-Tree Example 15365763768792100
23
23 B-Tree Example null 63 36 153657 84 63 7687 91 92100
24
24 Meaning of Internal Node 8491 key < 84 84 ≤ key < 91 91 ≤ key
25
25 B-Tree Example null 63 36 153657 84 63 7687 91 92100
26
26 Meaning of Leaf Nodes 6376 pointer to record 63pointer to record 76 Next leaf
27
27 Equality Predicates null 63 36 153657 84 63 7687 91 92100 key = 87
28
28 Equality Predicates null 63 36 153657 84 63 7687 91 92100 key = 87
29
29 Equality Predicates null 63 36 153657 84 63 7687 91 92100 key = 87
30
30 Equality Predicates null 63 36 153657 84 63 7687 91 92100 key = 87
31
31 Range Predicates null 63 36 153657 84 63 7687 91 92100 57 ≤ key < 95
32
32 Range Predicates null 63 36 153657 84 63 7687 91 92100 57 ≤ key < 95
33
33 Range Predicates null 63 36 153657 84 63 7687 91 92100 57 ≤ key < 95
34
34 Range Predicates null 63 36 153657 84 63 7687 91 92100 57 ≤ key < 95
35
35 Range Predicates null 63 36 153657 84 63 7687 91 92100 57 ≤ key < 95
36
36 Range Predicates null 63 36 153657 84 63 7687 91 92100 57 ≤ key < 95
37
37 General B-Trees Fixed parameter: n Fixed parameter: n Number of keys: n Number of keys: n Number of pointers: n + 1 Number of pointers: n + 1
38
38 B-Tree Example null 63 36 153657 84 63 7687 91 92100 n = 2
39
39 General B-Trees Fixed parameter: n Fixed parameter: n Number of keys: n Number of keys: n Number of pointers: n + 1 Number of pointers: n + 1 All leaves at same depth All leaves at same depth All (key, record pointer) in leaves All (key, record pointer) in leaves
40
40 B-Tree Example null 63 36 153657 84 63 7687 91 92100 n = 2
41
41 General B-Trees: Space related constraints Use at least Root: 2 pointers Internal: (n+1)/2 pointers Leaf: (n+1)/2 pointers to data Use at least Root: 2 pointers Internal: (n+1)/2 pointers Leaf: (n+1)/2 pointers to data
42
42 n=3 51521 15 31 4256 3142 Internal Leaf Max Min
43
43 Leaf Nodes n key slots (n+1) pointer slots
44
44 Leaf Nodes n key slots (n+1) pointer slots kk k 1 2m k 3 …… unused record of k 1 2 …… m … next leaf
45
45 Leaf Nodes n key slots (n+1) pointer slots kk k 1 2m k 3 …… unused record of k 1 2 …… m … m ≥ (n+1) 2 next leaf
46
46 Internal Nodes n key slots (n+1) pointer slots
47
47 Internal Nodes n key slots (n+1) pointer slots kk k 1 2m k 3 key < k 1 k ≤ key < k 12 k ≤ key m unused
48
48 Internal Nodes n key slots (n+1) pointer slots kk k 1 2m k 3 key < k 1 k ≤ key < k 12 k ≤ key m unused (m+1) ≥ (n+1) 2
49
49 Root Node n key slots (n+1) pointer slots kk k 1 2m k 3 key < k 1 k ≤ key < k 12 k ≤ key m unused (m+1) ≥ 2
50
50 Limits Why the specific limits (n+1)/2 and (n+1)/2 ? Why the specific limits (n+1)/2 and (n+1)/2 ? Why different limits for leaf and internal nodes? Why different limits for leaf and internal nodes? Can we reduce each limit? Can we reduce each limit? Can we increase each limit? Can we increase each limit? What are the implications? What are the implications?
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.