Published by Stephen Knight. Modified over 9 years ago.
1
2015-12-2 Index Tuning
Conventional index
2
Secondary index
To speed up queries on attributes not within the primary key
Primary index
– Determines the placement of records in the data file
– Each table has only one primary index
Secondary index
– Only gives the location of the records
– One table may have multiple secondary indexes
– Always dense (at the lowest level)
3
Secondary indexes
Sequence field
Data file (not sorted on the indexed key): 50 30 70 20 40 80 10 100 60 90
4
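Secondary indexes
Sequence field
Data file (not sorted on the indexed key): 50 30 70 20 40 80 10 100 60 90
Sparse index: 30 20 80 100 90 ...
A sparse index does not make sense here! The file is not sorted on the indexed key, so an index entry says nothing about where the keys between entries are stored.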
5
Secondary indexes
Sequence field
Data file (not sorted on the indexed key): 50 30 70 20 40 80 10 100 60 90
Dense index (lowest level): 10 20 30 40 50 60 70 ...
Sparse high level: 10 50 90 ...
6
With secondary indexes:
– Lowest level is dense
– Other levels are sparse
Also: pointers are record pointers (not block pointers; not computed)
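The two-level layout above can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the slides: the grouping of records into two-record blocks, the `(block, slot)` form of a record pointer, and the `FANOUT` of four entries per index block are all hypothetical.

```python
from bisect import bisect_right

# Hypothetical data file: blocks of two records, NOT sorted on the indexed key.
blocks = [[50, 30], [70, 20], [40, 80], [10, 100], [60, 90]]

# Dense lowest level: one (key, record pointer) entry per record, sorted by key.
# A record pointer is modeled as a (block number, slot) pair.
dense = sorted(
    (key, (b, s))
    for b, block in enumerate(blocks)
    for s, key in enumerate(block)
)

FANOUT = 4  # assumed number of dense entries per index block

# Sparse upper level: the first key of each dense index block.
sparse = [dense[i][0] for i in range(0, len(dense), FANOUT)]


def lookup(key):
    """Return the record pointer for key, or None if the key is absent."""
    i = bisect_right(sparse, key) - 1  # last sparse entry with key <= search key
    if i < 0:
        return None
    start = i * FANOUT
    for k, ptr in dense[start:start + FANOUT]:
        if k == key:
            return ptr
    return None
```

With these assumptions the sparse level comes out as 10, 50, 90, and `lookup(80)` follows the sparse entry 50 into the second dense block before returning a pointer to the record itself rather than to a block.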
7
Application of secondary indexes in a clustered file
Given relations
– Movie(title, year, length, incolor, studioName, producerC#)
– Studio(name, address, presC#)
Suppose the following query is typical
– SELECT title, year FROM Movie, Studio WHERE presC# = zzz AND Movie.studioName = Studio.name;
With a clustered file structure, a secondary index on presC# can minimize disk I/Os!
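A rough sketch of why this helps, assuming the clustered file stores each Studio record together with that studio's Movie records. The studio names, titles, certificate numbers, and the "one read per cluster" cost model are invented for illustration only.

```python
# Hypothetical clustered file: each entry is one studio stored with its movies.
clusters = [
    {"name": "Studio A", "presC#": 111,
     "movies": [("Title 1", 1990), ("Title 2", 1994)]},
    {"name": "Studio B", "presC#": 222,
     "movies": [("Title 3", 1985)]},
    {"name": "Studio C", "presC#": 111,
     "movies": [("Title 4", 2001)]},
]

# Secondary index on presC#: maps a president's certificate number to the
# positions of the matching clusters.
presc_index = {}
for pos, cluster in enumerate(clusters):
    presc_index.setdefault(cluster["presC#"], []).append(pos)


def titles_for_president(presc):
    """Return ((title, year) list, number of clusters read) for one presC#."""
    reads = 0
    result = []
    for pos in presc_index.get(presc, []):
        reads += 1  # one read fetches the studio and its movies together
        result.extend(clusters[pos]["movies"])
    return result, reads
```

Here `titles_for_president(111)` touches only the two clusters the index points to, instead of scanning all of Movie and Studio to evaluate the join.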
8
Duplicate values & secondary indexes
Data file: 10 20 40 20 40 10 40 10 40 30
9
Duplicate values & secondary indexes
Data file: 10 20 40 20 40 10 40 10 40 30
One option ...
Index: 10 20 30 40 ...
Problem: excess overhead!
– disk space
– search time
10
Duplicate values & secondary indexes
Data file: 10 20 40 20 40 10 40 10 40 30
Another option ...
Index: 10 20 30 40
Problem: variable-size records in index!
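The trade-off between the two options can be made concrete with a small sketch. It assumes that the first option means one index entry per record (so duplicate keys are repeated) and that the second means one entry per distinct key carrying a list of pointers; record pointers are modeled as plain positions in the file.

```python
# Key values in file order, as on the slide.
keys_in_file = [10, 20, 40, 20, 40, 10, 40, 10, 40, 30]

# Assumed option 1: one (key, record pointer) entry per record.
# Every duplicate key is stored again, which costs space and search time.
option1 = sorted((key, rid) for rid, key in enumerate(keys_in_file))

# Assumed option 2: one entry per distinct key with a list of record pointers.
# Keys are stored once, but entries no longer have a fixed size.
option2 = {}
for key, rid in option1:
    option2.setdefault(key, []).append(rid)

# Size of each option-2 entry, counted as one key plus its pointers.
entry_sizes = {key: 1 + len(rids) for key, rids in option2.items()}
```

Option 1 ends up with ten entries for four distinct keys, while option 2 has four entries whose sizes range from 2 to 5 fields.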
11
Duplicate values & secondary indexes
Data file: 10 20 40 20 40 10 40 10 40 30
Index: 10 20 30 40 50 60 ...
Another idea (suggested in class): chain records with the same key?
Problems:
– Need to add fields to records
– Need to follow the chain to know the records
12
Duplicate values & secondary indexes
Data file: 10 20 40 20 40 10 40 10 40 30
Index: 10 20 30 40 50 60 ...
Buckets
Using indirection!
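A minimal sketch of the bucket idea, assuming the buckets live in a separate file of record pointers and each index entry is a fixed-size `(key, offset)` pair pointing at the start of its bucket. The flat-list layout and the names `bucket_file` and `records_with_key` are illustrative choices, not taken from the slides.

```python
# Key values in file order, as on the slide.
keys_in_file = [10, 20, 40, 20, 40, 10, 40, 10, 40, 30]

# Group record pointers (positions in the file) by key.
groups = {}
for rid, key in enumerate(keys_in_file):
    groups.setdefault(key, []).append(rid)

# Bucket file: all record pointers, grouped by key, stored contiguously.
# Index: one fixed-size (key, offset into bucket file) entry per distinct key.
bucket_file = []
index = []
for key in sorted(groups):
    index.append((key, len(bucket_file)))
    bucket_file.extend(groups[key])


def records_with_key(key):
    """Follow the index entry to its bucket and return the record pointers."""
    for i, (k, start) in enumerate(index):
        if k == key:
            # The bucket ends where the next key's bucket begins.
            end = index[i + 1][1] if i + 1 < len(index) else len(bucket_file)
            return bucket_file[start:end]
    return []
```

Each key appears once in the index and every index entry has the same size; the variable-length part is pushed into the bucket file.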
13
Why the “bucket” idea is useful
Indexes:
– Name: primary
– Dept: secondary
– Floor: secondary
Records: EMP (name, dept, floor, ...)
We can use the pointers in the buckets to help answer queries without looking at most of the records in the data file!
14
Query: Get employees in (Toy Dept) ∧ (2nd floor)
Dept. index: Toy
Floor index: 2nd
EMP
Intersect the Toy bucket and the 2nd floor bucket to get the set of matching EMPs
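The intersection step might look like the following sketch. The employee rows are made up, and each bucket is modeled as a Python set of record pointers (row positions).

```python
# Hypothetical EMP records: (name, dept, floor).
emp = [
    ("Ann", "Toy", 2),
    ("Bob", "Toy", 1),
    ("Cho", "Shoe", 2),
    ("Dee", "Toy", 2),
]

# Build one bucket of record pointers per value for each secondary index.
dept_buckets, floor_buckets = {}, {}
for rid, (_, dept, floor) in enumerate(emp):
    dept_buckets.setdefault(dept, set()).add(rid)
    floor_buckets.setdefault(floor, set()).add(rid)

# Intersect the two buckets; no EMP record has been read up to this point.
matching = dept_buckets.get("Toy", set()) & floor_buckets.get(2, set())

# Only the matching records are fetched from the data file.
names = [emp[rid][0] for rid in sorted(matching)]
```

Only the pointers are compared; the data file is touched just for the two records that satisfy both conditions.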
15
This idea is used in text information retrieval
Documents:
– ...the cat is fat...
– ...was raining cats and dogs...
– ...Fido the dog...
Inverted lists: cat, dog
16
IR queries
– Find articles with “cat” and “dog”
– Find articles with “cat” or “dog”
– Find articles with “cat” and not “dog”
– Find articles with “cat” in title
– Find articles with “cat” and “dog” within 5 words
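The first three queries reduce to set operations on inverted lists. The sketch below uses the three document fragments from the earlier slide; the document ids `d1` to `d3` and the crude plural-stripping `normalize` function are assumptions made for the example.

```python
# Document fragments from the slides, with assumed ids.
docs = {
    "d1": "the cat is fat",
    "d2": "was raining cats and dogs",
    "d3": "Fido the dog",
}


def normalize(word):
    """Very crude normalization: lowercase and strip a plural 's'."""
    word = word.lower()
    return word[:-1] if word.endswith("s") and len(word) > 3 else word


# Inverted lists: word -> set of ids of the documents that contain it.
inverted = {}
for doc_id, text in docs.items():
    for word in text.split():
        inverted.setdefault(normalize(word), set()).add(doc_id)

cat = inverted.get("cat", set())
dog = inverted.get("dog", set())

both = cat & dog      # "cat" and "dog"
either = cat | dog    # "cat" or "dog"
cat_only = cat - dog  # "cat" and not "dog"
```

The last two queries on the slide (word in title, words within a given distance) cannot be answered from document ids alone; they need extra information stored in each posting.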
17
Common technique: more info in inverted list
Inverted lists: cat, dog
Posting fields: type, position, location
Postings shown (type, position): Title 5; 100; Author 10; Abstract 57; Title 12
Documents: d1, d2, d3
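With a type and a position in each posting, the title and proximity queries become simple filters. In this sketch a posting is a `(document id, field type, word position)` tuple; the two documents and their contents are invented, and the proximity test only compares positions inside the same field.

```python
# Hypothetical documents with typed fields.
docs = {
    "d1": {"title": "cat care",
           "abstract": "a dog and a cat share one home"},
    "d2": {"title": "dog training",
           "abstract": "no felines here"},
}

# Inverted lists with richer postings: word -> [(doc id, type, position)].
postings = {}
for doc_id, fields in docs.items():
    for ftype, text in fields.items():
        for pos, word in enumerate(text.split()):
            postings.setdefault(word, []).append((doc_id, ftype, pos))


def in_field(word, ftype):
    """Documents where word occurs in the given field type."""
    return {d for d, t, _ in postings.get(word, []) if t == ftype}


def within(w1, w2, k):
    """Documents where w1 and w2 occur in the same field at most k words apart."""
    return {
        d1
        for d1, t1, p1 in postings.get(w1, [])
        for d2, t2, p2 in postings.get(w2, [])
        if d1 == d2 and t1 == t2 and abs(p1 - p2) <= k
    }
```

For example, `in_field("cat", "title")` answers the "cat in title" query and `within("cat", "dog", 5)` answers the proximity query, both without opening any document.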
18
Summary so far
Conventional index
– Basic ideas: sparse, dense, multi-level …
– Duplicate keys
– Deletion/insertion
– Secondary indexes
– Buckets of postings list
19
Conventional indexes
Advantage:
- Simple
- Index is a sequential file, good for scans
Disadvantage:
- Inserts expensive, and/or
- Lose sequentiality & balance
20
Example
Index (sequential): 10 20 30 40 50 60 70 80 90
Continuous free space
Overflow area (not sequential): 39 31 35 36 32 38 34 33
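The loss of sequentiality in this example can be sketched as follows. The sketch makes a simplifying assumption that is not on the slide: no free slot is available in the sequential area, so every new key is appended to the overflow area in arrival order.

```python
from bisect import bisect_left

# Sequential (sorted) part of the index, as on the slide.
sequential = [10, 20, 30, 40, 50, 60, 70, 80, 90]

# Overflow area: keys in arrival order, not sorted.
overflow = []


def insert(key):
    """Assumed policy: the sequential area is full, so append to overflow."""
    overflow.append(key)


def lookup(key):
    """Binary search in the sequential area, then a linear scan of overflow."""
    i = bisect_left(sequential, key)
    if i < len(sequential) and sequential[i] == key:
        return ("sequential", i)
    for j, k in enumerate(overflow):
        if k == key:
            return ("overflow", j)
    return None


def ordered_scan():
    """A scan in key order now has to sort or merge in the overflow keys."""
    return sorted(sequential + overflow)


# Insert the keys in the order shown in the overflow area on the slide.
for key in [39, 31, 35, 36, 32, 38, 34, 33]:
    insert(key)
```

After these inserts, finding key 35 costs a linear scan of the overflow area, and a scan in key order can no longer simply read the index front to back.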
21
summary