Questions and Answers (Q&A)


Chapter 1

> In the section 'introduction', when you defined what a database is, you
> wrote that "REPOSITORY" implies "persistence". This is a little bit
> confusing to me.
> Could you kindly throw some light on this matter?

Persistence means that the data created or changed by a program survives beyond the termination of that program. That is what databases are for: they are repositories for data that persists after the user query or user program that uses the database has terminated.

Chapter 2

> 1) While using the MOD function as a hashing function, you have to
> choose a prime number. Is it because one gets all the remainders,
> or is there any other reason?

Any divisor will do; however, studies have shown that prime numbers give a better distribution (a more even distribution of values to buckets). That is the only reason for choosing a prime. It is certainly not mandatory either, if there is a reason to prefer another divisor (e.g., if you have been allocated, say, 16 pages, then 16 would be a good divisor).

> 2) How do you choose the prime number?

The divisor is chosen based on the number of buckets (for a MOD hash function, it is always the number of buckets available).
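As a rough illustration of the answer to question 1, the sketch below compares the divisors 16 and 17 on a deliberately patterned key set. The key set (multiples of 4) and the helper name `bucket_counts` are assumptions made up for this example; real key distributions vary, so this only shows one way a non-prime divisor can bunch keys together.

```python
from collections import Counter

def bucket_counts(keys, divisor):
    """Count how many keys land in each bucket under h(k) = k mod divisor."""
    return Counter(k % divisor for k in keys)

# Hypothetical key set: 100 keys that are all multiples of 4.
keys = range(0, 400, 4)

counts_16 = bucket_counts(keys, 16)  # divisor shares the factor 4 with every key
counts_17 = bucket_counts(keys, 17)  # prime divisor, no factor in common with the keys

print(len(counts_16), "of 16 buckets used")
print(len(counts_17), "of 17 buckets used")
```

With these keys, the divisor 16 only ever produces the remainders 0, 4, 8 and 12, while the prime divisor 17 spreads the same keys over every bucket.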

> 3) In the extendible hashing section, you mentioned local depths. It is a little bit vague to me.

The local depth is the number of bits used to get to that page. The global depth is, essentially, the maximum of the local depths.

> 4) On the same page, you used the concept of page
> splitting. How did you choose page# 17 and then page# 32?

The idea is that a request is made to the OS for a page and a page is allocated. It could have any page number whatsoever, so I just picked the numbers at random.

> Some other faculty in the department do not have a very high opinion of it as a database.
> Faults cited include a lack of current development and lack of
> features. From my limited knowledge of this I gather that these criticisms are unfounded.

Probably the criticisms come mainly from those looking for a production DBMS to use, rather than one to do research with. Postgres is not a commercial product; if someone is interested in a full-featured, supported, commercial product, Informix is the commercial product that came out of the research Postgres prototype. For our purposes, however, Postgres is a good choice because we have the source code (we can change things to do our research), whereas finished commercial products are never available at a source level.
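Returning to the local and global depth answer under question 3, here is a minimal sketch of an extendible hash directory. It is not the structure from the course notes: the class names, the page capacity of 2, the use of the low-order bits of an integer key as the hash, and the assumption that keys are distinct are all choices made for this illustration.

```python
class Bucket:
    def __init__(self, depth):
        self.local_depth = depth  # number of hash bits that lead to this page
        self.keys = []

class ExtendibleHash:
    def __init__(self, capacity=2):
        self.capacity = capacity      # keys per page (tiny, for the demo)
        self.global_depth = 0         # bits used to index the directory
        self.directory = [Bucket(0)]  # always 2**global_depth slots

    def _slot(self, key):
        # Directory index = low-order global_depth bits of the key.
        return key & ((1 << self.global_depth) - 1)

    def insert(self, key):
        bucket = self.directory[self._slot(key)]
        if len(bucket.keys) < self.capacity:
            bucket.keys.append(key)
            return
        # The page is full. If it already uses every directory bit,
        # double the directory so one more bit becomes available.
        if bucket.local_depth == self.global_depth:
            self.directory = self.directory + self.directory
            self.global_depth += 1
        # Split the page on its next bit.
        bucket.local_depth += 1
        sibling = Bucket(bucket.local_depth)
        new_bit = 1 << (bucket.local_depth - 1)
        for i, b in enumerate(self.directory):
            if b is bucket and i & new_bit:
                self.directory[i] = sibling
        # Redistribute the old keys plus the new one (assumes distinct keys).
        pending = bucket.keys + [key]
        bucket.keys = []
        for k in pending:
            self.insert(k)

h = ExtendibleHash(capacity=2)
for k in [0, 1, 2, 3, 4]:
    h.insert(k)

print(h.global_depth, [b.local_depth for b in h.directory])  # 2 [2, 1, 2, 1]
```

In the final state two directory slots share one page with local depth 1, and the global depth equals the largest local depth, which matches the description in the answer.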

> The DBMS uniVerse (by Informix, I believe, but I have heard rumors of
> a merger so I'm not sure of the vendor) uses dynamically
> allocated space for different numbers of attributes in a single table,
> so that a table could be defined as:
>
>   ID  FIRST_NAME  LAST_NAME  ADDR  HOBBY
>   3   JACK        FROST      3rd   reading
>
> Then when a person is added with two hobbies the table would contain
> records of both the above type and this:
>
>   ID  FIRST_NAME  LAST_NAME  ADDR  HOBBY   HOBBY1
>   4   JOHN        DOE        4th   biking  singing
>
> allocating only enough space for the additional hobby in the record
> which contains two hobbies, not wasting space in the first record with
> only one hobby, ad infinitum as new hobbies are added.
> This would usually be done as multiple tables such as
>
>   ID  FIRSTNAME  LASTNAME  ADDR
>   3   JACK       FROST     3rd
>   4   JOHN       DOE       4th
>
>   ID  HOBBY
>   3   READING
>   4   BIKING
>   4   SINGING
>
> I believe. Different anyway. I might guess that perhaps the dynamic memory allocation is actually allocated in a separate
> table and that it is merely hidden from the programmer. It does seem to bring in some rather annoying questions if it is not.

You may be right. Another possibility is that those systems that show users a view which has the separate table (and is "normalized" in the sense we will talk about when we get to normalization) may actually store the data by just allocating space the way uniVerse does. Many times there is a substantial difference between the way in which data is stored and the way in which it is "presented" for view to users.
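The gap between stored form and presented form can be shown with a small sketch. It keeps the data as the two flat tables from the question and derives the variable-width rows on demand. Nothing here reflects how uniVerse actually works internally; the function name `present` and the tuple layout are assumptions made for this illustration.

```python
from collections import defaultdict

# Stored form: two flat tables, using the rows from the example above.
people = [(3, "JACK", "FROST", "3rd"), (4, "JOHN", "DOE", "4th")]
hobbies = [(3, "READING"), (4, "BIKING"), (4, "SINGING")]

def present(people, hobbies):
    """Build the presented view: one row per person with all hobbies appended."""
    by_id = defaultdict(list)
    for person_id, hobby in hobbies:
        by_id[person_id].append(hobby)
    return [row + tuple(by_id[row[0]]) for row in people]

view = present(people, hobbies)
for row in view:
    print(row)
```

The same reasoning runs in the other direction: a system could store the wide rows and present the two flat tables, so the presented shape alone does not reveal the storage layout.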

> Any thoughts on this system? It just seemed like a rather curious way of looking at the system.

It would be quite interesting if uniVerse actually presents the data with non-flat (repeating) groups and still claims to be relational. That would be a pretty serious abuse of the term "relational". If they claim only "object-relational" then, as in most object-relational systems, complex datatypes are accommodated through the "Large Object" construct (which basically allows a pointer to a complex object in any field instead of a single value).

B-trees:

> Consider the following B-tree.
>
>            32|50
>          /   |   \
>         /    |    \
>        /     |     \
>    15|17   35|40   52|60
>
> where each node can hold a maximum of 2 entries. In this case, how do you insert a data value, say 55, into the tree?

As in the notes, when the appropriate node is full, split it into 2 nodes and promote the middle value to the next higher level (and if that node is full, split it and promote the middle value, etc.). Thus, we split 52|60 and promote 55:

           32|50
         /   |        55
        /    |        ^
       /     |        :
   15|17   35|40   52|__  60|__

32|50 is full, so we split it into 32|__ and 55|__ and promote 50:

                50|_
              /      \
             /        \
         32|_          55|_
        /    \        /    \
       /      \      /      \
   15|17   35|40  52|_     60|_
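The split-and-promote steps above can be sketched in code. This is a minimal illustration and not the algorithm text from the course notes: the `Node` class, the `insert` and `_insert` function names, and the choice to ignore duplicate keys and deletion are assumptions made for this example.

```python
import bisect

MAX_KEYS = 2  # each node holds at most 2 entries, as in the example

class Node:
    def __init__(self, keys, children=None):
        self.keys = list(keys)
        self.children = children or []  # an empty list means this is a leaf

def _insert(node, key):
    """Insert key below node; return (middle, right_node) if node split, else None."""
    i = bisect.bisect_left(node.keys, key)
    if node.children:
        promoted = _insert(node.children[i], key)
        if promoted is None:
            return None
        middle, right = promoted
        node.keys.insert(i, middle)          # take in the promoted value
        node.children.insert(i + 1, right)   # and the new right-hand node
    else:
        node.keys.insert(i, key)
    if len(node.keys) <= MAX_KEYS:
        return None
    # Overflow: split into two nodes and promote the middle value.
    m = len(node.keys) // 2
    middle = node.keys[m]
    right = Node(node.keys[m + 1:], node.children[m + 1:])
    node.keys = node.keys[:m]
    node.children = node.children[:m + 1]
    return middle, right

def insert(root, key):
    """Insert key and return the (possibly new) root."""
    promoted = _insert(root, key)
    if promoted is None:
        return root
    middle, right = promoted
    return Node([middle], [root, right])  # the root split, so the tree grows a level

root = Node([32, 50], [Node([15, 17]), Node([35, 40]), Node([52, 60])])
root = insert(root, 55)
print(root.keys, [child.keys for child in root.children])  # [50] [[32], [55]]
```

Running it on the tree from the question reproduces the final picture: 50 alone in the root, 32 and 55 on the middle level, and the leaves 15|17, 35|40, 52 and 60.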