COP4540 Database Management System Final Review Reviewed by Ramakrishna. Parts of this are taken from Fernando Farfan’s presentation.



AGENDA
- Exercises to do
- Ch6. JDBC
- Ch7. 3-Tier Architecture
- Ch8. Storage and Indexing
- Ch10. Tree-Structured Indexing
- Ch7. XML Documents
- Ch27. XQuery: Querying XML Data

Exercises to do
- Remember to practice the even-numbered exercises of the book.
- Solutions are available online at rdEdition/supporting_material.htm
- Important recommended exercises: 8.3, 8.4, 8.5, 8.7, 8.10, 8.11

JDBC
What is JDBC? Explain its purpose.
- JDBC (Java Database Connectivity) is used to integrate SQL with a general-purpose programming language (Java).
Explain the JDBC architecture.
- Four components: Application, Driver Manager, data-source-specific Drivers, and Data Sources (stored in MySQL, Oracle, DB2, MS SQL Server, Access and so on).

JDBC
Explain the individual steps required to submit a query to a data source and to retrieve results in JDBC.
- JDBC driver management: load the driver (Class.forName(..)).
- Establishing a connection (Connection object).
- Executing SQL statements: Statement, PreparedStatement, CallableStatement.
- Retrieving results (examining ResultSets).

Stored Procedures
Explain the term stored procedure, and give examples of why stored procedures are useful.
- Stored procedures are programs that run on the database server and can be called with a single SQL statement.
- They run inside the process space of the database server.
- They reduce network communication: they execute locally on the server, and the results are packaged into one result that is sent back.
- Different users can reuse the same stored procedure.

3-Tier Application Architecture
What is a 2-tier architecture? Explain the different types of 2-tier architectures.
- It is a client-server architecture.
- Architecture 1: Thin clients (web browsers)
[Diagram: Client, connected over the Network to the Application Logic and DBMS]

3-Tier Application Architecture
- Architecture 2: Thick clients
[Diagram: Client with Application Logic, connected over the Network to the DBMS]
Disadvantages:
1) No central place to update
2) Need to trust the client

3-tier architecture
What are the advantages of a 3-tier architecture?
- Heterogeneous systems
- Thin clients
- Integrated data access
- Scalable
[Diagram: Client (user interface, web browser), Network, Application Logic (in C++ or Java..), Network, DBMS (database)]

CH8. OVERVIEW OF STORAGE AND INDEXING
- A DBMS stores vast quantities of data.
- Data must persist across program executions, so it is stored on external storage devices.
- The unit of information read from or written to disk is a page.
- The cost of page I/O dominates the cost of typical database operations.
  - Input: from disk to main memory
  - Output: from main memory to disk

CH8. OVERVIEW OF STORAGE AND INDEXING
- The simplest file structure is an unordered file, or heap file.
- An index is a data structure that optimizes certain kinds of retrieval operations.
  - CLUSTERED: the file is organized so that the ordering of data records is the same as the ordering of data entries in the index.
  - UNCLUSTERED: otherwise.
- A table can have at most one clustered index and several unclustered ones.

Example
Employee (Eid: integer, name: String, age: integer…..)
- Assume data records are sorted by name.
- Assume an index on name: the index on name is CLUSTERED.
- Assume an index on age: the index on age is UNCLUSTERED.

Example: Using Indexes
Employee data records, sorted by name:
- Page 1: (1, Alex, 30) (2, Amy, 21)
- Page 2: (3, Bob, 31) (4, Brenda, 21)
Select * from Employee where name like 'A%';
- Use the index on name: retrieve Page 1.
Select * from Employee where age = 21;
- Use the index on age: retrieve Page 1 and Page 2.
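The page counts in this example can be checked with a small Python sketch. The two pages and their records are taken from the slide; the helper name pages_touched is only illustrative, and the sketch counts data pages that hold a match rather than modelling a real index.

```python
# Data pages exactly as in the slide example: records are sorted by name.
PAGES = [
    [(1, "Alex", 30), (2, "Amy", 21)],     # Page 1
    [(3, "Bob", 31), (4, "Brenda", 21)],   # Page 2
]


def pages_touched(predicate):
    """Return the 1-based numbers of the pages holding at least one match."""
    return [
        page_no
        for page_no, page in enumerate(PAGES, start=1)
        if any(predicate(record) for record in page)
    ]


# name like 'A%': the matches are adjacent because the file is sorted by name.
name_pages = pages_touched(lambda record: record[1].startswith("A"))

# age = 21: the matches are scattered because the file is not sorted by age.
age_pages = pages_touched(lambda record: record[2] == 21)

print(name_pages)  # [1]
print(age_pages)   # [1, 2]
```

Both queries return two records, but the query on the clustered attribute reads one data page while the query on the unclustered attribute reads two.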

Clustered and Unclustered Indexes
Retrieval using a clustered index
- Data is retrieved using the minimum number of data page I/Os.
Retrieval using an unclustered index
- In the worst case, one page I/O for every qualifying tuple.

Another Example
Consider "Select E.dno from Employee E where E.age > 40;"
- Assume a B+ tree index on age.
- Is it better to use this index, or just do a segment scan?
Answer: It depends on several factors:
- The selectivity of the condition.
- Whether the index is clustered or unclustered.

CH8. OVERVIEW OF STORAGE AND INDEXING
8.3 Consider a relation stored as a randomly ordered file for which the only index is an unclustered index on a field called sal. If you want to retrieve all records with sal > 20, is using the index always the best alternative? Explain.
Answer: No. Because the index is unclustered, each qualifying data entry could contain an rid that points to a distinct data page, leading to as many data page I/Os as there are data entries matching the range query. When many records qualify, using the index can therefore cost more than a file scan, which reads each data page only once.
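A rough cost comparison makes the answer to 8.3 concrete. This is a minimal Python sketch, not a real optimizer: the file size (1,000 pages), the number of index pages read (3), and the two match counts are hypothetical values chosen only for illustration, and the index cost uses the worst case of one data page I/O per qualifying record.

```python
DATA_PAGES = 1_000        # hypothetical file size in pages
INDEX_PAGES_READ = 3      # hypothetical index pages read to locate the entries


def file_scan_cost(data_pages):
    """A file scan reads every data page exactly once."""
    return data_pages


def unclustered_index_cost(matching_records, index_pages_read):
    """Worst case: one data page I/O per qualifying record, plus index pages."""
    return index_pages_read + matching_records


def cheaper_plan(matching_records):
    """Name the plan with the lower estimated page I/O cost."""
    scan = file_scan_cost(DATA_PAGES)
    index = unclustered_index_cost(matching_records, INDEX_PAGES_READ)
    return "index" if index < scan else "file scan"


print(cheaper_plan(10))      # few matches: index
print(cheaper_plan(5_000))   # many matches: file scan
```

With 10 matching records the index costs 13 page I/Os against 1,000 for the scan; with 5,000 matching records the index costs 5,003, so the scan wins.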

CH8. OVERVIEW OF STORAGE AND INDEXING
8.5 Explain the difference between Hash indexes and B+-tree indexes. In particular, discuss how equality and range searches work.
Hash Index:
- A hashing function quickly maps a search key value to a bucket.
- Inserts and deletes are simple.
- Linked list for collisions.
- Good at equality searches.
- Terrible for range queries.
B+-tree Index:
- Hierarchical search data structure.
- Maintenance is costly.
- Costly for individual record lookup.
- Efficient for range queries.
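The contrast between the two index types can be sketched in Python with in-memory stand-ins. A dict plays the role of the hash index and a sorted list searched with bisect plays the role of the ordered leaf level of a B+-tree; neither is a disk-based index, and the salary values and function names are hypothetical.

```python
import bisect

# Hypothetical search-key values; the list position serves as the record id.
salaries = [55, 20, 75, 40, 90, 35]

# Stand-in for a hash index: key -> bucket of record ids.
hash_index = {}
for rid, sal in enumerate(salaries):
    hash_index.setdefault(sal, []).append(rid)

# Stand-in for the sorted leaf level of a B+-tree: (key, rid) in key order.
sorted_entries = sorted((sal, rid) for rid, sal in enumerate(salaries))
sorted_keys = [sal for sal, _ in sorted_entries]


def equality_hash(key):
    """Equality search: go straight to the one bucket for this key."""
    return hash_index.get(key, [])


def range_sorted(low, high):
    """Range search: locate the first key, then read entries in order."""
    start = bisect.bisect_left(sorted_keys, low)
    end = bisect.bisect_right(sorted_keys, high)
    return [rid for _, rid in sorted_entries[start:end]]


def range_hash(low, high):
    """Range search on a hash index: no key order, so scan every bucket."""
    return sorted(
        rid
        for key, rids in hash_index.items()
        if low <= key <= high
        for rid in rids
    )


print(equality_hash(40))     # [3]
print(range_sorted(35, 60))  # [5, 3, 0]
print(range_hash(35, 60))    # [0, 3, 5]
```

The ordered structure answers the range query by touching only the matching entries, while the hash structure has to look at every bucket to produce the same set of record ids.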

CH8. OVERVIEW OF STORAGE AND INDEXING Consider the following relation Answer: Update the salary of an employee using the employee id.

CH8. OVERVIEW OF STORAGE AND INDEXING Update the age or employee id for some department id Update the salaries of all employees in some department

Constraints

Assume the tables are already created.
ALTER TABLE ENROLLED ADD CONSTRAINT cons1 CHECK (NOT EXISTS (SELECT cname FROM ENROLLED GROUP BY cname HAVING COUNT(*) < 5))
ALTER TABLE ENROLLED ADD CONSTRAINT cons2 CHECK (NOT EXISTS (SELECT cname FROM ENROLLED GROUP BY cname HAVING COUNT(*) > 30))

Constraints

Views

Tree-Structured Indexing
Explain ISAM and B+ trees. What are the differences, advantages and disadvantages?
- ISAM (Indexed Sequential Access Method)
  - Effective for static files
  - Unsuitable for dynamically changing files
- B+ tree
  - Dynamic index structure
  - Adjusts well to changes
  - Does well for both range and equality selections

Tree-Structured Indexing
Insertions and deletions in ISAM
- Happen only in leaf pages
- Insertions: records are added to the overflow chain
- Search: becomes inefficient as the chain grows
In a B+ tree, even after several insertions and deletions,
- The tree is kept balanced
- Height of the tree = length of the path from root to leaf
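The growth of ISAM overflow chains can be illustrated with a toy Python model. This is a simplified sketch under stated assumptions: the classes ToyIsam and ToyIsamLeaf, the page capacity of 2, and the keys are all hypothetical, the index level is reduced to a list of separator keys, and deletions are left out.

```python
import bisect


class ToyIsamLeaf:
    """One primary leaf page plus its overflow chain (modelled as a list)."""

    def __init__(self, keys):
        self.primary = list(keys)
        self.overflow = []


class ToyIsam:
    """Static leaf pages built once; later inserts never add or split leaves."""

    def __init__(self, sorted_keys, capacity=2):
        self.capacity = capacity
        self.leaves = [
            ToyIsamLeaf(sorted_keys[i:i + capacity])
            for i in range(0, len(sorted_keys), capacity)
        ]
        # Fixed separators: the first key of every leaf after the first one.
        self.separators = [leaf.primary[0] for leaf in self.leaves[1:]]

    def _leaf_for(self, key):
        return self.leaves[bisect.bisect_right(self.separators, key)]

    def insert(self, key):
        leaf = self._leaf_for(key)
        if len(leaf.primary) < self.capacity:
            bisect.insort(leaf.primary, key)
        else:
            leaf.overflow.append(key)   # the chain grows, the tree does not

    def search(self, key):
        leaf = self._leaf_for(key)
        return key in leaf.primary or key in leaf.overflow


index = ToyIsam([10, 20, 30, 40])
for new_key in (21, 22, 23, 35):
    index.insert(new_key)

print(len(index.leaves))         # 2
print(index.leaves[0].overflow)  # [21, 22, 23]
print(index.search(22))          # True
```

The number of leaf pages stays at two, so every search for a key near 20 has to walk a chain that keeps growing. A B+ tree would split the full leaf instead and stay balanced.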

Query Evaluation
Consider a relation R(a,b,c,d,e) containing 5,000,000 records, where each data page of the relation holds 10 records. R is organized as a sorted file with secondary indexes. Assume that R.a is a candidate key for R, with values lying in the range 0 to 4,999,999, and that R is stored in R.a order. For each of the following relational algebra queries, state which of the following three approaches is most likely to be the cheapest:
- Access the sorted file for R directly.
- Use a (clustered) B+ tree index on attribute R.a.
- Use a linear hashed index on attribute R.a.
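The sizes that drive this comparison follow from the numbers in the exercise and can be worked out in a few lines of Python. The record count and page capacity come from the slide. The assumptions that pages are completely filled in R.a order (so key k sits on page k // 10) and the two sample key ranges are illustrative only.

```python
RECORDS = 5_000_000
RECORDS_PER_PAGE = 10
DATA_PAGES = RECORDS // RECORDS_PER_PAGE   # 500,000 data pages

# Worst-case page reads for a binary search over the sorted file,
# ceil(log2(DATA_PAGES)), computed with integer arithmetic.
BINARY_SEARCH_READS = (DATA_PAGES - 1).bit_length()


def pages_for_key_range(low, high):
    """Data pages spanned by keys low..high (inclusive).

    Assumes R.a takes every value 0..4,999,999 exactly once and that pages
    are filled in R.a order, so key k is stored on page k // 10.
    """
    return high // RECORDS_PER_PAGE - low // RECORDS_PER_PAGE + 1


print(DATA_PAGES)                      # 500000
print(BINARY_SEARCH_READS)             # 19
print(pages_for_key_range(0, 49_999))  # 5000
print(pages_for_key_range(50_000, 50_000))  # 1
```

A full scan costs 500,000 page reads, a binary search on the sorted file needs about 19, and a range over consecutive keys touches only the contiguous pages that hold those keys.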

Query Evaluation

CH7. XML DOCUMENTS
7.1 When is an XML document well-formed? When is an XML document valid?
An XML document is well-formed if it follows three guidelines:
1. It starts with an XML declaration.
2. It contains a root element that contains all other elements.
3. All elements are properly nested.
An XML document is valid if it has an associated DTD and the document follows the rules of the DTD.
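Guidelines 2 and 3 can be checked with Python's standard XML parser, as in the minimal sketch below. The helper name is_well_formed and the sample strings are hypothetical. Note two limits of this sketch: the parser accepts documents without an XML declaration, and xml.etree.ElementTree does not validate against a DTD, so it says nothing about validity.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text as a single XML document."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


print(is_well_formed("<a><b></b></a>"))    # single root, proper nesting
print(is_well_formed("<a><b></a></b>"))    # elements overlap
print(is_well_formed("<a></a><b></b>"))    # two root elements
```

The first call returns True; the other two return False because the parser raises ParseError for overlapping elements and for content after the root element.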

CH27. XQUERY
books.xml (titles and authors):
- Everyday Italian: Giada De Laurentiis
- Harry Potter: J K. Rowling
- XQuery Kick Start: James McGovern, Per Bothner, Kurt Cagle, James Linn, Vaidyanathan Nagarajan
- Learning XML: Erik T. Ray

CH27. XQUERY
Query: doc("books.xml")/bookstore/book/title
Result:
- Everyday Italian
- Harry Potter
- XQuery Kick Start
- Learning XML

CH27. XQUERY
Query: doc("books.xml")/bookstore/book[price<30]
Result:
- Harry Potter, J K. Rowling

CH27. XQUERY
Query: for $x in doc("books.xml")/bookstore/book where $x/price>30 order by $x/title return $x/title
Result:
- Learning XML
- XQuery Kick Start

CH27. XQUERY Query: { for $x in doc("books.xml")/bookstore/book/title order by $x return {$x} } Result: Everyday Italian Harry Potter Learning XML XQuery Kick Start
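The four queries can be mimicked in Python with xml.etree.ElementTree. This is plain Python over a parsed tree, not XQuery, and it is offered only as a way to check the results listed on the slides. The titles and authors come from the slides; the element layout and the price values are assumptions, with prices chosen to be consistent with the results shown above.

```python
import xml.etree.ElementTree as ET

# Assumed document: titles and authors are from the slides, prices are not.
BOOKS_XML = """
<bookstore>
  <book><title>Everyday Italian</title>
        <author>Giada De Laurentiis</author><price>30.00</price></book>
  <book><title>Harry Potter</title>
        <author>J K. Rowling</author><price>29.99</price></book>
  <book><title>XQuery Kick Start</title>
        <author>James McGovern</author><author>Per Bothner</author>
        <author>Kurt Cagle</author><author>James Linn</author>
        <author>Vaidyanathan Nagarajan</author><price>49.99</price></book>
  <book><title>Learning XML</title>
        <author>Erik T. Ray</author><price>39.95</price></book>
</bookstore>
"""

books = ET.fromstring(BOOKS_XML).findall("book")


def price(book):
    """Read the book's price element as a number."""
    return float(book.findtext("price"))


# /bookstore/book/title: every title, in document order.
all_titles = [book.findtext("title") for book in books]

# /bookstore/book[price<30]: books cheaper than 30.
cheap_titles = [book.findtext("title") for book in books if price(book) < 30]

# for ... where price>30 order by title return title
expensive_sorted = sorted(
    book.findtext("title") for book in books if price(book) > 30
)

# for ... in .../title order by $x
titles_sorted = sorted(all_titles)

print(all_titles)
print(cheap_titles)
print(expensive_sorted)
print(titles_sorted)
```

Under the assumed prices, the four lists match the results on the slides: all titles in document order, only Harry Potter below 30, and Learning XML followed by XQuery Kick Start above 30.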