PART 4 DATA STORAGE AND QUERY. Chapter 12 Indexing and Hashing.



April 2008, Database System Concepts - Chapter 12 Indexing and Hashing

Contents in This Chapter
- Basic concepts and classification of indexing
  - ordered indices, hash indices
- Properties/types of ordered indices
  - primary/clustering indices, secondary/non-clustering indices
  - dense indices, sparse indices
  - single-level indices, multi-level indices (e.g. B+-tree, B-tree)
- Hash indices
  - hash functions
  - static hash, dynamic hash

§12.1 Basic Concepts
- How can records in a DB file be located quickly?
- Indexing
  - mechanisms used to speed up access to desired data
  - e.g. for the relation account(account-number, branch-name, balance) shown in Fig.12.1, the index maps branch-name → physical address of the record (i.e. tuple) in the DB file account
- Search key
  - an attribute or a set of attributes used to look up the records in a file
  - e.g. branch-name

[Fig. DB indexed file account and its index file. Labels: logical file account, physical file account, index file, indexed file (account-number, branch-name, balance)]
Note: the file account is logically a sequential file, but its records may be stored non-contiguously or out of order on the disk.

§12.1 Basic Concepts (cont.)
- Indexing
  - a mapping from search keys to the storage locations of the file records, i.e. search-key → storage locations of the records on disk

[Fig. Indexing and its classification: ordered indices and hash indices. Labels: search key; index file; index entry; data file (main file, indexed file) R(A, B, S, ..., D); hash function H with h(s_1), ..., h(s_i), ..., h(s_n)]

§12.1 Basic Concepts (cont.)
Two basic kinds of indices:
- ordered indices: the index file stores the index entries, each holding the search key of a record and the address of that record, in sorted order
- hash indices: a "hash function" maps the search key of a record to the address of the record
  - the records are stored in "buckets"; the bucket number serves as the address of the records and is determined by the hash function

§12.2 Ordered Indices
- A DBS file with an indexing mechanism includes two parts:
  - the indexed file, in which data records are stored
  - the index file, in which index entries are stored
  - e.g. Fig.12.1
- The indexed file can be organized as a
  - sequential file
  - heap file
  - hash file
  - clustering file

§12.1 Basic Concepts (cont.)
- Index file
  - a set of index entries of the form (search-key, location)
  - index files are typically much smaller than the original file

§12.2 Ordered Indices (cont.)
- Ordered index (in index file)
  - the index entries in the index file are stored in some sorted order, for instance in the order of the search key
    /* the order of the index entries matches the order of the search key */
  - e.g. in Fig.12.1, the index file is sorted by branch-name

12.2-I Primary and Secondary Indices
- Primary/clustering index (in index file)
  - consider an index file and its corresponding indexed file
  - the index is a primary index if the indexed file is a sequential file and the search key of the index file also specifies the sequential order of the indexed file
    /* the order defined by the index file's search key matches the order of the records in the indexed sequential file */
  - e.g. in Fig.12.1, (branch-name, address of records) defines the same order of records as that in the sequential indexed file account
  - note: also called clustering index

12.2-I Primary and Secondary Indices (cont.)
- The search key of a primary index is usually, but not necessarily, the primary key
  - e.g. in Fig.12.1, branch-name is not the primary key of account

12.2-I Primary and Secondary Indices (cont.)
- Secondary index
  - an index whose search key specifies an order different from the sequential order of the file
  - also called non-clustering index
  - e.g. Fig.12.5, secondary index on the balance field of account
  - secondary indices have to be dense indices
- Index-sequential file
  - an ordered sequential file with a primary index on the search key
  - e.g. Fig.

12.2-I Primary and Secondary Indices (cont.)
[Fig.12.5 Secondary index on the balance field of account]

12.2-II Dense and Sparse Indices
- Dense index
  - an index entry appears in the index file for every search-key value in the indexed file, i.e. each search-key value in the indexed file corresponds to an index entry in the index file
  - e.g. Fig.12.1
- In a dense primary index, the index entry contains the search-key value and a pointer to the first record with that search-key value
  - the rest of the records with the same search-key value are stored sequentially after the first record
  - e.g. in Fig.12.1, the search-key value "Perryridge" corresponds to three records in the file
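The dense primary index lookup described above can be sketched in Python as follows. This is only an illustration: the record values in `ACCOUNT` and the helper names `build_dense_index` and `dense_lookup` are assumptions, not taken from the slides, and list positions stand in for disk addresses.

```python
from bisect import bisect_left

# Illustrative stand-in for the sequential file account, sorted by branch-name.
# Each record is (account_number, branch_name, balance); the values are assumed.
ACCOUNT = [
    ("A-217", "Brighton", 750),
    ("A-101", "Downtown", 500),
    ("A-110", "Downtown", 600),
    ("A-215", "Mianus", 700),
    ("A-102", "Perryridge", 400),
    ("A-201", "Perryridge", 900),
    ("A-218", "Perryridge", 700),
    ("A-222", "Redwood", 700),
    ("A-305", "Round Hill", 350),
]


def build_dense_index(records):
    """Return sorted (search_key, position) entries, one per distinct key."""
    index = []
    for pos, record in enumerate(records):
        key = record[1]
        if not index or index[-1][0] != key:
            # Point at the FIRST record carrying this search-key value.
            index.append((key, pos))
    return index


def dense_lookup(records, index, key):
    """Find the index entry for key, then read the run of equal-key records."""
    keys = [k for k, _ in index]
    i = bisect_left(keys, key)
    if i == len(keys) or keys[i] != key:
        return []  # a dense index has an entry for every key, so no entry means no record
    pos = index[i][1]
    matches = []
    while pos < len(records) and records[pos][1] == key:
        matches.append(records[pos])
        pos += 1
    return matches


DENSE_INDEX = build_dense_index(ACCOUNT)
perryridge = dense_lookup(ACCOUNT, DENSE_INDEX, "Perryridge")
```

Because the file is ordered on the search key, one pointer per key value is enough: the remaining records with the same key are reached by reading forward from the first one.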

12.2-II Dense and Sparse Indices (cont.)
- Sparse index
  - the index file contains index entries for only some search-key values in the indexed file
  - e.g. Fig.12.3
- With a sparse index, to locate a file record with search-key value K:
  - find the index entry with the largest search-key value ≤ K
  - search the file sequentially, starting at the record to which this index entry points
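The two lookup steps for a sparse index can be sketched in Python as below. The record values, the choice of which keys appear in `SPARSE_INDEX`, and the function name `sparse_lookup` are assumptions made for illustration; list positions stand in for disk addresses, and the file is assumed to be sorted on the search key.

```python
from bisect import bisect_right

# Illustrative stand-in for the sequential file account, sorted by branch-name.
# Each record is (account_number, branch_name, balance); the values are assumed.
ACCOUNT = [
    ("A-217", "Brighton", 750),
    ("A-101", "Downtown", 500),
    ("A-110", "Downtown", 600),
    ("A-215", "Mianus", 700),
    ("A-102", "Perryridge", 400),
    ("A-201", "Perryridge", 900),
    ("A-218", "Perryridge", 700),
    ("A-222", "Redwood", 700),
    ("A-305", "Round Hill", 350),
]

# Sparse index: entries for only some search-key values (assumed selection).
SPARSE_INDEX = [("Brighton", 0), ("Mianus", 3), ("Redwood", 7)]


def sparse_lookup(records, index, key):
    """Locate all records with the given search-key value K."""
    keys = [k for k, _ in index]
    # Step 1: index entry with the largest search-key value <= K.
    i = bisect_right(keys, key) - 1
    if i < 0:
        return []  # K sorts before every indexed key
    pos = index[i][1]
    # Step 2: scan the file sequentially from the record that entry points to.
    matches = []
    while pos < len(records) and records[pos][1] <= key:
        if records[pos][1] == key:
            matches.append(records[pos])
        pos += 1
    return matches


perryridge = sparse_lookup(ACCOUNT, SPARSE_INDEX, "Perryridge")
downtown = sparse_lookup(ACCOUNT, SPARSE_INDEX, "Downtown")
```

The scan can stop as soon as it sees a key larger than K, which is why the sketch relies on the file being ordered on the search key.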

12.2-II Dense and Sparse Indices
[Fig.12.3 Sparse index for the file account. Labels: index file, indexed file]

12.2-III Single-level and Multi-level Indices
- The indices in Fig.12.1, Fig.12.3, and Fig.12.5 are single-level indices
- The index file may be very large and may not fit entirely in memory
- To access the index file quickly, store it on disk as a sequential file and construct a sparse index on this sequential index file
  - outer index: a sparse index of the primary index file
  - inner index: the primary index file
- If even the outer index is too large to fit in main memory, yet another level of index can be created, and so on
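A two-level lookup can be sketched in Python as follows. The inner index entries, the block size `ENTRIES_PER_BLOCK`, and the function names are hypothetical choices for illustration; each inner block stands in for one disk block of the index file.

```python
from bisect import bisect_right

# Inner index: sorted (search_key, record_position) entries (assumed values).
INNER_INDEX = [
    ("Brighton", 0),
    ("Downtown", 1),
    ("Mianus", 3),
    ("Perryridge", 4),
    ("Redwood", 7),
    ("Round Hill", 8),
]
ENTRIES_PER_BLOCK = 2  # hypothetical number of index entries per disk block


def build_blocks(inner, size):
    """Split the inner index into fixed-size blocks, as it would sit on disk."""
    return [inner[i:i + size] for i in range(0, len(inner), size)]


def build_outer(blocks):
    """Outer index: one sparse entry (first key, block number) per inner block."""
    return [(block[0][0], block_no) for block_no, block in enumerate(blocks)]


def two_level_lookup(outer, blocks, key):
    """Return the record position for key, or None if the key is not indexed."""
    outer_keys = [k for k, _ in outer]
    # Outer level: entry with the largest key <= the search key.
    i = bisect_right(outer_keys, key) - 1
    if i < 0:
        return None
    # Inner level: read only the one block the outer entry points to.
    for k, pos in blocks[outer[i][1]]:
        if k == key:
            return pos
    return None


BLOCKS = build_blocks(INNER_INDEX, ENTRIES_PER_BLOCK)
OUTER_INDEX = build_outer(BLOCKS)
```

Only the small outer index needs to stay in memory; each lookup then touches a single inner block instead of the whole index file.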

[Fig. Two-level sparse index. Labels: indexed file, index file]

12.2-III Single-level and Multi-level Indices
- B+-tree indices and B-tree indices are two types of efficient multi-level indices and are widely used in DBS
[Fig. An example of a B+-tree index]

§12.5 Hashing Index Files
- The file records are stored in a set of buckets
  - a bucket is a unit of storage containing one or more records (a bucket is typically a disk block)
- Hash function h
  - a function from the set of all search-key values K in a file to the set of all bucket addresses, i.e. the set of addresses of the file records
  - typical hash functions perform computation on the internal binary representation of the search key
- Hash file organization
  - the bucket of a record is obtained directly from its search-key value using a hash function

[Fig. Hash file organization of the account file, with branch-name as key]
Hash function: returns the sum of the binary representations of the characters modulo 10
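A hash function of this kind can be sketched in Python as below. Reading "binary representation of a character" as its ASCII code is an assumption, so the bucket numbers produced here need not match those in the figure; the records and the name `build_hash_file` are likewise illustrative.

```python
NUM_BUCKETS = 10  # the slide's hash function works modulo 10


def h(branch_name: str) -> int:
    """Sum the character codes of the key and reduce modulo the bucket count.

    Assumption: each character's "binary representation" is its ASCII code.
    """
    return sum(branch_name.encode("ascii")) % NUM_BUCKETS


def build_hash_file(records):
    """Place each (account_number, branch_name, balance) record in bucket h(branch_name)."""
    buckets = [[] for _ in range(NUM_BUCKETS)]
    for record in records:
        buckets[h(record[1])].append(record)
    return buckets


# Assumed sample records for illustration.
SAMPLE = [
    ("A-217", "Brighton", 750),
    ("A-102", "Perryridge", 400),
    ("A-201", "Perryridge", 900),
]
BUCKETS = build_hash_file(SAMPLE)
```

Under the ASCII assumption, `h("Perryridge")` is 3 and `h("Brighton")` is 9, so both Perryridge records land in the same bucket and can be fetched without consulting a separate index.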

Handling of Bucket Overflows
- Bucket overflow can occur because of
  - insufficient buckets
  - skew in the distribution of records, which can occur for two reasons:
    - multiple records have the same search-key value
    - the chosen hash function produces a non-uniform distribution of key values
- Although the probability of bucket overflow can be reduced, it cannot be eliminated; it is handled by using overflow buckets

Handling of Bucket Overflows (cont.)
- Overflow chaining: the overflow buckets of a given bucket are chained together in a linked list
- The above scheme is called closed hashing
[Fig. Overflow chaining in a hash structure]
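Overflow chaining can be sketched in Python as follows. The class names `Bucket` and `HashFile`, the bucket capacity of two records, and the hash function (sum of byte values modulo the bucket count) are all assumptions chosen to keep the example small.

```python
class Bucket:
    """A fixed-capacity bucket with a link to its next overflow bucket."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.records = []
        self.overflow = None  # next bucket in the chain, if any


class HashFile:
    def __init__(self, num_buckets=10, capacity=2):
        self.capacity = capacity
        self.buckets = [Bucket(capacity) for _ in range(num_buckets)]

    def _h(self, key):
        # Assumed hash function: sum of the key's byte values modulo bucket count.
        return sum(key.encode("utf-8")) % len(self.buckets)

    def insert(self, key, record):
        bucket = self.buckets[self._h(key)]
        # Walk the chain to the first bucket with free space,
        # appending a new overflow bucket when the chain is full.
        while len(bucket.records) >= bucket.capacity:
            if bucket.overflow is None:
                bucket.overflow = Bucket(self.capacity)
            bucket = bucket.overflow
        bucket.records.append((key, record))

    def lookup(self, key):
        # A lookup must examine the primary bucket and every overflow bucket.
        bucket = self.buckets[self._h(key)]
        found = []
        while bucket is not None:
            found.extend(rec for k, rec in bucket.records if k == key)
            bucket = bucket.overflow
        return found

    def chain_length(self, key):
        bucket, length = self.buckets[self._h(key)], 0
        while bucket is not None:
            length += 1
            bucket = bucket.overflow
        return length


hash_file = HashFile()
for account in ("A-102", "A-201", "A-218"):
    hash_file.insert("Perryridge", account)
```

With a capacity of two, the third record sharing the key "Perryridge" spills into an overflow bucket, which illustrates overflow caused by multiple records with the same search-key value.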

Hash Indices
- Hashing can be used not only for file organization, but also for index-structure creation
- A hash index organizes the search keys, with their associated record pointers, into a hash file structure
- Strictly speaking, hash indices are always secondary indices
  - if the file itself is organized using hashing, a separate primary hash index on it using the same search key is unnecessary
  - however, we use the term hash index to refer to both secondary index structures and hash-organized files

[Fig. Hash index on search key account_number of the account file]

Static Hashing vs Dynamic Hashing
- Static hashing
  - the hash function h cannot be modified while it is in use
- Dynamic hashing
  - the hash function h can be modified dynamically

§12.8 Index Definition in SQL
- Create an index
  - create index
  - e.g. create index b-index on branch(branch-name)
- To drop an index
  - drop index
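The statements above can be tried out with Python's built-in sqlite3 module, as in the sketch below. SQLite is only a stand-in for whatever DBS the course uses; the columns of `branch` are assumed, and the hyphens in b-index and branch-name are replaced by underscores because unquoted SQL identifiers cannot contain hyphens.

```python
import sqlite3

# In-memory database used purely for illustration.
conn = sqlite3.connect(":memory:")

# Assumed schema for the branch relation.
conn.execute("CREATE TABLE branch (branch_name TEXT, branch_city TEXT, assets INTEGER)")

# Create an index on the search key branch_name.
conn.execute("CREATE INDEX b_index ON branch(branch_name)")


def index_names(connection):
    """List the indices recorded in SQLite's schema catalog."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index'"
    ).fetchall()
    return [name for (name,) in rows]


names_after_create = index_names(conn)

# Drop the index again.
conn.execute("DROP INDEX b_index")
names_after_drop = index_names(conn)
```

Index-definition syntax is not fully uniform across systems (for example, some require the table name when dropping an index), so the exact statements should be checked against the target system's documentation.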