Chapter 12: Indexing and Hashing

1 Chapter 12: Indexing and Hashing
- Indexing
  - Basic Concepts
  - Ordered Indices
  - B+-Tree Index Files
- Hashing
  - Static Hashing
  - Dynamic Hashing

2 Basic Concepts
- Search key: set of attributes used to look up records in a file.
[Figure: an index entry holds a search-key value and a pointer to the record with that value.]

3 Index Evaluation Metrics
- Access types supported efficiently, e.g.:
  - Point query: find “Tom”
  - Range query: find students whose age is between 20 and 40
- Access time
- Update time
- Space overhead

4 Ordered Indices
In an ordered index, index entries are stored sorted on the search-key value, e.g., the author catalog in a library.
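Keeping entries sorted is what lets an ordered index answer both point and range queries by binary search. The sketch below is a minimal illustration, not a storage engine: the entries, the record ids (`r1`, `r2`, ...) and the function names are hypothetical.

```python
from bisect import bisect_left, bisect_right

# Hypothetical ordered index: (search-key value, record id) pairs sorted by key.
index_entries = [(18, "r7"), (20, "r2"), (25, "r9"), (33, "r1"), (40, "r4"), (52, "r3")]
keys = [k for k, _ in index_entries]

def point_query(key):
    """Return the record ids whose search key equals `key`."""
    lo, hi = bisect_left(keys, key), bisect_right(keys, key)
    return [rid for _, rid in index_entries[lo:hi]]

def range_query(low, high):
    """Return the record ids with low <= search key <= high, in key order."""
    lo, hi = bisect_left(keys, low), bisect_right(keys, high)
    return [rid for _, rid in index_entries[lo:hi]]
```

For example, `range_query(20, 40)` locates both ends of the range by binary search and then reads the entries in between sequentially.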

5 Primary index
- Also called a clustering index.
- The search key of a primary index is usually, but not necessarily, the primary key.
[Figure: a file with keys 10, 20, ..., 100 stored in search-key order; the index entries (10, 30, 50, ..., 230) are in the same order as the file.]

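6 Secondary index
- Also called a non-clustering index.
[Figure: a file with keys 50, 30, 70, 20, 40, 80, 10, 100, 60, 90, not stored in search-key order; the index entries (10, 20, 30, 40, 50, 60, 70, ...) are in a different order from the file.]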

7 Dense Index
A dense index contains an index record for every search-key value.
[Figure: sequential file with keys 10, 20, ..., 100; dense index with entries 10, 20, 30, ..., 120.]

8 Sparse Index
A sparse index contains index records for only some search-key values. It is applicable when records are sequentially ordered on the search key.
[Figure: sequential file with keys 10, 20, ..., 100; sparse index with entries 10, 30, 50, ..., 230.]
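The difference between the two lookups can be sketched in Python. This is a minimal illustration that uses only the five data blocks visible on the slides. The in-memory lists stand in for disk blocks, and the name `sparse_lookup` is an assumption of the example.

```python
from bisect import bisect_right

# Data file from the slides: five blocks of two records, ordered on the search key.
blocks = [[10, 20], [30, 40], [50, 60], [70, 80], [90, 100]]

# Dense index: one entry per search-key value -> (block number, slot).
dense = {key: (b, s) for b, blk in enumerate(blocks) for s, key in enumerate(blk)}

# Sparse index: one entry per block, holding the first key of that block.
sparse_keys = [blk[0] for blk in blocks]

def sparse_lookup(key):
    """Return (block, slot) for `key`, or None if the key is absent."""
    b = bisect_right(sparse_keys, key) - 1   # last index entry with key <= target
    if b < 0:
        return None
    for s, k in enumerate(blocks[b]):        # sequential scan inside the block
        if k == key:
            return (b, s)
    return None
```

The sparse index needs half as many entries here, at the cost of a short scan inside the block. That scan only works because the file is ordered on the search key.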

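9 Secondary indexes
A sparse index on a field other than the sequence field does not make sense!
[Figure: file ordered on its sequence field, with secondary-key values 50, 30, 70, 20, 40, 80, 10, 100, 60, 90; sparse index with entries 30, 20, 80, 100, 90, ...]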

10 Multilevel Index
[Figure: sequential file with keys 10, 20, ..., 100; sparse first-level index with entries 10, 30, 50, ..., 230; sparse second-level index with entries 10, 90, 170, 250, 330, 410, 490, 570.]
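A two-level lookup over these entries might look like the following sketch. It assumes four first-level entries per index block, which is inferred from the spacing of the second-level entries (10, 90, 170) rather than stated on the slide. The function name is hypothetical.

```python
from bisect import bisect_right

# First-level sparse index from the slide, assumed to be split into index blocks of four.
level1_blocks = [[10, 30, 50, 70], [90, 110, 130, 150], [170, 190, 210, 230]]

# Second-level sparse index: the first key of each first-level index block.
level2 = [blk[0] for blk in level1_blocks]

def find_data_block_key(key):
    """Return the first-level entry (first key of a data block) that covers `key`."""
    i = bisect_right(level2, key) - 1        # choose the first-level index block
    if i < 0:
        return None                          # key is smaller than every indexed key
    blk = level1_blocks[i]
    j = bisect_right(blk, key) - 1           # choose the entry inside that block
    return blk[j]
```

Each level narrows the search to one block of the level below, so only one index block per level has to be read.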

11 Multilevel Index (secondary indexes)
- Lowest level is dense
- Other levels are sparse
[Figure: file ordered on its sequence field, with secondary-key values 50, 30, 70, 20, 40, 80, 10, 100, 60, 90; dense lowest-level index with entries 10, 20, 30, 40, 50, 60, 70, ...; sparse high-level index with entries 10, 50, 90, ...]

12 Conventional indexes
Advantages:
- Simple
- Index is a sequential file, good for scans
Disadvantage:
- Inserts are expensive

13 Outline
- Conventional indexes
- B+-Tree (NEXT)

14 NEXT: Another type of index
- Give up on sequentiality of index
- Try to get “balance”

15 B+Tree Example (n=4)
[Figure: root [100]; non-leaf nodes [30] and [120, 150, 180]; leaves [3, 5, 11], [30, 35], [100, 101, 110], [120, 130], [150, 156, 179], [180, 200].]

16 Sample non-leaf node
Keys: 57, 81, 95
Pointers, left to right: to keys k < 57; to keys 57 ≤ k < 81; to keys 81 ≤ k < 95; to keys k ≥ 95
A key is moved (not copied) from a lower-level non-leaf node to an upper-level non-leaf node.
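Choosing which pointer to follow in a non-leaf node is a search over its sorted keys. The sketch below uses the keys 57, 81, 95 from this slide. The strings stand in for child pointers, and the name `child_for` is hypothetical.

```python
from bisect import bisect_right

keys = [57, 81, 95]   # keys of the sample non-leaf node
# Stand-ins for the four child pointers, labelled by the key range each one covers.
children = ["k < 57", "57 <= k < 81", "81 <= k < 95", "k >= 95"]

def child_for(k):
    """Return the child to follow when searching for key k."""
    # bisect_right counts the node keys that are <= k, which is the child's position.
    return children[bisect_right(keys, k)]
```

A key equal to a node key, such as 57, is routed to the right of that key, which matches the range 57 ≤ k < 81.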

17 Sample leaf node
Keys: 57, 81, 95
Pointers: to record with key 57; to record with key 81; to record with key 95; to next leaf in sequence
The node is reached from a non-leaf node.
A key is copied (not moved) from a leaf node to a non-leaf node.

18 n=4
[Figure: leaf with keys 30, 35; non-leaf node with key 30.]

19 Size of nodes:
- n pointers
- n-1 keys

20 Don’t want nodes to be too empty
Use at least:
- Root: 2 pointers
- Non-leaf: ⌈n/2⌉ pointers
- Leaf: ⌈(n-1)/2⌉ keys

21 Full node vs. min. node (n=4)
- Non-leaf: full [120, 150, 180]; min. [30]
- Leaf: full [3, 5, 11]; min. [30, 35]
Note on the figure: a pointer counts even if null.

22 B+tree rules (tree of order n)
(1) All leaves are at the same lowest level (balanced tree).
(2) Pointers in leaves point to records, except for the “sequence pointer”.

23 (3) Number of pointers/keys for B+tree
                      Max ptrs   Max keys   Min ptrs -> data   Min keys
Non-leaf (non-root)   n          n-1        ⌈n/2⌉              ⌈n/2⌉ - 1
Leaf (non-root)       n          n-1        ⌈(n-1)/2⌉          ⌈(n-1)/2⌉
Root                  n          n-1        2                  1
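The bounds can be evaluated for a concrete order n. The sketch below assumes the minimums are ceilings, which agrees with the n=4 minimum nodes shown earlier (a non-leaf node with one key and a leaf with two keys). The function name `occupancy` is hypothetical.

```python
from math import ceil

def occupancy(n):
    """Return {node type: (max ptrs, max keys, min ptrs, min keys)} for order n."""
    return {
        "non-leaf": (n, n - 1, ceil(n / 2), ceil(n / 2) - 1),
        # For leaves, the minimum counts pointers to data, not the sequence pointer.
        "leaf": (n, n - 1, ceil((n - 1) / 2), ceil((n - 1) / 2)),
        "root": (n, n - 1, 2, 1),
    }
```

With n = 4 a non-root leaf must keep at least 2 keys. With n = 5 the minimum is still 2 keys for a leaf, while a non-leaf node must keep at least 3 pointers.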

24 Insert into B+tree
(a) simple case: space available in leaf
(b) leaf overflow
(c) non-leaf overflow
(d) new root

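25 (a) Insert key = 32 (n=4)
[Figure: non-leaf node [30, 100]; leaves [3, 5, 11] and [30, 31]. Key 32 goes into the free slot of leaf [30, 31], giving [30, 31, 32].]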

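26 (b) Insert key = 7 (n=4)
[Figure: non-leaf node [30, 100]; leaves [3, 5, 11] and [30, 31]. Leaf [3, 5, 11] is full, so it splits into [3, 5] and [7, 11], and key 7 is copied into the parent.]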
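Cases (a) and (b) can be sketched with a leaf modelled as a sorted list of keys. This is a simplified illustration: record pointers and the sequence pointer are left out, and the split point (the left half keeps the extra key when the count is odd) is an assumption of the example, not something the slides specify.

```python
from bisect import insort

def insert_into_leaf(leaf, key, n=4):
    """Insert `key` into a sorted leaf and return (left, right, separator).

    `right` and `separator` are None when the leaf still fits in n - 1 keys.
    """
    keys = list(leaf)
    insort(keys, key)
    if len(keys) <= n - 1:
        return keys, None, None          # case (a): space available in the leaf
    mid = (len(keys) + 1) // 2           # case (b): overflow, split into two leaves
    left, right = keys[:mid], keys[mid:]
    return left, right, right[0]         # first key of the new leaf is copied up
```

Inserting 7 into the leaf holding 3, 5, 11 yields two leaves and the separator 7, which still appears in the new leaf because leaf keys are copied, not moved.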

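27 (c) Insert key = 160 (n=4)
[Figure: root key 100; non-leaf node [120, 150, 180]; leaves [150, 156, 179] and [180, 200]. Leaf [150, 156, 179] splits into [150, 156] and [160, 179]; the full non-leaf node then splits into [120, 150] and [180], and key 160 moves up to the root.]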

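28 (d) New root, insert 45 (n=4)
[Figure: root [10, 20, 30]; leaves [1, 2, 3], [10, 12], [20, 25], [30, 32, 40]. Leaf [30, 32, 40] splits into [30, 32] and [40, 45]; the full root then splits into [10, 20] and [40], and key 30 moves up into a new root.]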
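Cases (c) and (d) differ from a leaf split in that the middle key is moved up, not copied. The sketch below handles only the key list of a full non-leaf node. Child pointers are omitted, and choosing the key at position `len // 2` is an assumption picked to match the two slide examples.

```python
from bisect import insort

def split_nonleaf(keys, new_key):
    """Add new_key to a full non-leaf key list and split it.

    Returns (left_keys, key_moved_up, right_keys); the moved key stays in neither half.
    """
    ks = list(keys)
    insort(ks, new_key)
    mid = len(ks) // 2
    return ks[:mid], ks[mid], ks[mid + 1:]
```

When the node being split is the root, the key moved up becomes the only key of a new root, and the tree grows by one level.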

29 Deletion from B+tree
(a) Simple case - no example
(b) Coalesce with neighbor (sibling)
(c) Re-distribute keys
(d) Cases (b) or (c) at non-leaf

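30 (b) Coalesce with sibling: delete 50 (n=5)
[Figure: non-leaf node [10, 40, 100]; leaves [10, 20, 30] and [40, 50]. After 50 is deleted, the remaining key 40 joins the sibling leaf, giving [10, 20, 30, 40].]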

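31 (c) Redistribute keys: delete 50 (n=5)
[Figure: non-leaf node [10, 40, 100]; leaves [10, 20, 30, 35] and [40, 50]. After 50 is deleted, key 35 moves from the sibling into the underfull leaf, giving [10, 20, 30] and [35, 40]; the parent key 40 becomes 35.]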
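The choice between coalescing and redistributing at a leaf can be sketched as follows. The example assumes the underfull leaf has a left sibling, models leaves as sorted key lists, and leaves the parent update out. The function name and the returned labels are hypothetical.

```python
from math import ceil

def delete_from_leaf(leaf, sibling, key, n=5):
    """Delete `key` from `leaf`, whose left neighbour is `sibling`.

    Returns (action, left_leaf, right_leaf) after the deletion.
    """
    min_keys = ceil((n - 1) / 2)
    leaf = [k for k in leaf if k != key]
    if len(leaf) >= min_keys:
        return "simple", sibling, leaf                 # case (a): no underflow
    if len(sibling) + len(leaf) <= n - 1:
        return "coalesce", sibling + leaf, []          # case (b): merge into one leaf
    # case (c): borrow the largest key from the left sibling
    return "redistribute", sibling[:-1], [sibling[-1]] + leaf
```

After a redistribution the first key of the right leaf (35 in the slide example) is the new separator in the parent. After a coalesce the parent loses one key and one pointer, which is how the underflow can propagate to case (d).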

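32 (d) Non-leaf coalesce: delete 37 (n=5)
[Figure: root [25]; non-leaf nodes [10, 20] and [30, 40]; leaves [1, 3], [10, 14], [20, 22], [25, 26], [30, 37], [40, 45]. Deleting 37 leaves a leaf with one key, which is coalesced with a sibling; the parent non-leaf node then underflows and is coalesced with its sibling, pulling key 25 down from the old root to form the new root.]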

33 B+tree deletions in practice
- Often, coalescing is not implemented
- Too hard and not worth it!

34 Index Definition in SQL
- Create an index:
  create index <index-name> on <relation-name> (<attribute-list>)
  E.g.: create index gindex on country(gdp);
- To drop an index:
  drop index <index-name>
  E.g.: drop index gindex;
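The two example statements can be tried against an in-memory SQLite database from Python. This is only a sketch: the columns of the `country` table other than `gdp` are assumptions, and `sqlite_master` is SQLite's own catalog table, so the catalog query would differ on other database systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema; only the table name and the gdp attribute come from the slide.
conn.execute("create table country (name text, gdp real)")
conn.execute("create index gindex on country(gdp)")

def index_names():
    """List the indexes SQLite currently records for the country table."""
    rows = conn.execute(
        "select name from sqlite_master where type = 'index' and tbl_name = 'country'"
    ).fetchall()
    return sorted(r[0] for r in rows)

before = index_names()             # taken while gindex exists
conn.execute("drop index gindex")
after = index_names()              # taken after the index is dropped
```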
