More Optimization Exercises

Block Nested Loops Join
Suppose there are B buffer pages.
Cost: M + ceil(M/(B-2)) * N, where
– M is the number of pages of R
– N is the number of pages of S

foreach block of B-2 pages of R do
    foreach page of S do {
        for all matching in-memory pairs r, s:
            add <r, s> to result
    }
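The cost formula and loop structure above can be checked with a small in-memory sketch. It assumes each relation is a list of pages (each page a list of tuples) and counts a "read" each time a page is touched. The function names and the `match` predicate are illustrative and not part of the slides.

```python
from math import ceil

def bnl_cost(m_pages, n_pages, buffer_pages):
    """Page I/Os for block nested loops with R (m_pages) as the outer relation."""
    if buffer_pages < 3:
        raise ValueError("need at least 3 buffer pages")
    block = buffer_pages - 2  # pages of R held in memory at once
    return m_pages + ceil(m_pages / block) * n_pages

def block_nested_loops_join(r_pages, s_pages, buffer_pages, match):
    """Join two relations stored as lists of pages; return (result, page_reads)."""
    block = buffer_pages - 2
    result, reads = [], 0
    for start in range(0, len(r_pages), block):
        r_block = r_pages[start:start + block]  # next B-2 pages of R
        reads += len(r_block)
        for s_page in s_pages:                  # one full scan of S per block
            reads += 1
            for r_page in r_block:
                for r in r_page:
                    for s in s_page:
                        if match(r, s):
                            result.append((r, s))
    return result, reads

# Toy data: 7 pages of R, 3 pages of S, one tuple per page, 5 buffer pages.
r_pages = [[(i,)] for i in range(7)]
s_pages = [[(i,)] for i in (0, 2, 4)]
pairs, reads = block_nested_loops_join(r_pages, s_pages, 5, lambda r, s: r[0] == s[0])
print(len(pairs), reads, bnl_cost(7, 3, 5))
```

Because `match` is an arbitrary predicate, the same loop also handles non-equality conditions.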

Index Nested Loops Join
Suppose there is an index on the join attribute of S.
We find the matching inner tuples using the index!
Cost: read R once + for each tuple in R, find the appropriate tuples of S.

foreach tuple r of R
    foreach tuple s of S where r_i = s_j    (found via the index)
        add <r, s> to result
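A minimal sketch of the idea, assuming a Python dictionary stands in for the index on the join attribute of S (a real system would use a B+ tree or hash index on disk). The helper names and column positions are hypothetical.

```python
from collections import defaultdict

def build_index(s_tuples, s_col):
    """Map each join-attribute value of S to the list of S tuples holding it."""
    index = defaultdict(list)
    for s in s_tuples:
        index[s[s_col]].append(s)
    return index

def index_nested_loops_join(r_tuples, r_col, s_index):
    """Scan R once; for each r, look up the matching S tuples in the index."""
    result = []
    for r in r_tuples:
        for s in s_index.get(r[r_col], []):  # index probe replaces a scan of S
            result.append((r, s))
    return result

R = [(1, "a"), (2, "b"), (3, "c")]
S = [(2, "x"), (3, "y"), (3, "z")]
joined = index_nested_loops_join(R, 0, build_index(S, 0))
print(joined)
```

The inner loop never scans S; its cost per outer tuple is one probe plus the matching tuples, which is what the cost line on the slide describes.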

Sort-Merge Join
Sort both relations on the join attribute. This creates “partitions” according to the join attribute values.
Join the relations while merging them: tuples in corresponding partitions are joined.
Cost depends on whether partitions are large and therefore scanned multiple times.
In the best case: O(M + N + M log M + N log N)
Note that the log is not base 2; for an external sort the base depends on the number of buffer pages available for merging.
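The merge step can be sketched in memory as follows. This assumes both relations fit in memory and uses Python's built-in sort in place of an external sort; the function name and column arguments are illustrative.

```python
def sort_merge_join(r_tuples, s_tuples, r_col, s_col):
    """Equality join by sorting both inputs and merging them."""
    r = sorted(r_tuples, key=lambda t: t[r_col])
    s = sorted(s_tuples, key=lambda t: t[s_col])
    result = []
    i = j = 0
    while i < len(r) and j < len(s):
        rk, sk = r[i][r_col], s[j][s_col]
        if rk < sk:
            i += 1
        elif rk > sk:
            j += 1
        else:
            # Find the end of the current partition in each relation.
            i_end = i
            while i_end < len(r) and r[i_end][r_col] == rk:
                i_end += 1
            j_end = j
            while j_end < len(s) and s[j_end][s_col] == rk:
                j_end += 1
            # Join every tuple of the R partition with every tuple of the S partition.
            for a in r[i:i_end]:
                for b in s[j:j_end]:
                    result.append((a, b))
            i, j = i_end, j_end
    return result

R = [(3, "c"), (1, "a"), (3, "d")]
S = [(3, "x"), (2, "w"), (3, "y")]
merged = sort_merge_join(R, S, 0, 0)
print(merged)
```

The nested loop over the two partitions is where large partitions become expensive: on disk, the S partition would be rescanned once per matching R tuple.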

Hash Join

// Partition R into k partitions
foreach tuple r in R do                 // flush when fills
    read r and add it to buffer page h(r_i)
// Partition S into k partitions
foreach tuple s in S do                 // flush when fills
    read s and add it to buffer page h(s_j)
for l = 1..k
    // Build in-memory hash table for R_l using h2
    foreach tuple r in R_l do
        read r and insert into hash table with h2
    foreach tuple s in S_l do
        read s and probe table using h2
        output matching pairs

Cost: 3(M + N), assuming there is enough buffer space
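A minimal in-memory sketch of the two phases, assuming Python lists stand in for the on-disk partitions and a dictionary plays the role of the second hash function h2. The names `hash_join`, `hash_join_cost`, and the default `k=4` are illustrative choices.

```python
from collections import defaultdict

def hash_join(r_tuples, s_tuples, r_col, s_col, k=4):
    """Partition both relations with h, then build and probe per partition."""
    def h(key):
        return hash(key) % k

    # Partitioning phase: tuples with equal join values land in the same partition.
    r_parts = [[] for _ in range(k)]
    s_parts = [[] for _ in range(k)]
    for r in r_tuples:
        r_parts[h(r[r_col])].append(r)
    for s in s_tuples:
        s_parts[h(s[s_col])].append(s)

    # Probing phase: only partition l of R can match partition l of S.
    result = []
    for r_part, s_part in zip(r_parts, s_parts):
        table = defaultdict(list)  # in-memory hash table for R_l
        for r in r_part:
            table[r[r_col]].append(r)
        for s in s_part:
            for r in table.get(s[s_col], []):
                result.append((r, s))
    return result

def hash_join_cost(m_pages, n_pages):
    """Read and write both relations once to partition, then read both once."""
    return 3 * (m_pages + n_pages)

R = [(1, "a"), (2, "b"), (3, "c"), (6, "d")]
S = [(2, "x"), (3, "y"), (3, "z"), (7, "w")]
hashed = hash_join(R, S, 0, 0)
print(sorted(hashed), hash_join_cost(1000, 200))
```

The output order depends on how keys fall into partitions, so the example sorts the result before printing.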

Question 1
Consider the query:
– select * from R, S where R.a < S.b
Can you use a variation on sort-merge join to compute this query? What about hash join? Index nested loops join? Block nested loops join?

Question 2
Consider the query:
– select * from R, S where R.a = S.b
Suppose that b is a primary key in S.
R contains 10,000 tuples, with 10 tuples per page.
S contains 2,000 tuples, with 10 tuples per page.
There are 52 buffer pages.

Question 2 (cont)
Suppose that there are unclustered BTree indexes on R.a and S.b.
Is it cheaper to do an index nested loops join or a block nested loops join?
Would the answer change if there were only 5 buffer pages?
Would your answer change if S contained only 10 tuples?

Question 2 (cont)
Suppose that there are clustered BTree indexes on R.a and S.b.
Is it cheaper to do an index nested loops join or a block nested loops join?
Would the answer change if there were only 5 buffer pages?
Would your answer change if S contained only 10 tuples?
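As a starting point for Question 2, the block nested loops side can be computed directly from the cost formula M + ceil(M/(B-2)) * N. The sketch below covers only that side. The index nested loops cost is left out because it depends on assumptions about index height and probe cost that the slides do not state. The helper names are illustrative.

```python
def pages(tuples, tuples_per_page):
    """Number of pages needed to hold the given number of tuples."""
    return -(-tuples // tuples_per_page)  # integer ceiling division

def bnl_cost(outer_pages, inner_pages, buffer_pages):
    """Block nested loops cost: outer + ceil(outer / (B - 2)) * inner."""
    blocks = -(-outer_pages // (buffer_pages - 2))
    return outer_pages + blocks * inner_pages

M = pages(10_000, 10)   # pages of R
N = pages(2_000, 10)    # pages of S
N_small = pages(10, 10) # pages of S if it held only 10 tuples

scenarios = {
    "B=52": (bnl_cost(M, N, 52), bnl_cost(N, M, 52)),
    "B=5": (bnl_cost(M, N, 5), bnl_cost(N, M, 5)),
    "B=52, S has 10 tuples": (bnl_cost(M, N_small, 52), bnl_cost(N_small, M, 52)),
}
for name, (r_outer, s_outer) in scenarios.items():
    print(f"{name}: R outer = {r_outer} I/Os, S outer = {s_outer} I/Os")
```

In each scenario, using the smaller relation S as the outer gives the lower block nested loops cost. These figures are what the index nested loops estimates would need to be compared against.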

Question 3
Consider the query:
– select E.eid from Employees E where E.age = 25 and E.sal >= 3000 and E.sal <= 5000
Which index would you build in order to be able to evaluate the query quickly?
Hint: A multicolumn index
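To see why a multicolumn index helps here, consider one possible choice: an index keyed on (age, sal), modelled below as a sorted Python list searched with `bisect`. This is a sketch under assumptions, not a definitive answer to the question. The key order, the inclusion of eid in each entry, and integer eids are all assumptions made for the illustration.

```python
from bisect import bisect_left, bisect_right

def build_age_sal_index(employees):
    """employees: iterable of (eid, age, sal); returns sorted (age, sal, eid) entries."""
    return sorted((age, sal, eid) for eid, age, sal in employees)

def eids_for(index, age, sal_lo, sal_hi):
    """Return eids with the given age and sal_lo <= sal <= sal_hi, via one range scan."""
    lo = bisect_left(index, (age, sal_lo))
    # float("inf") sorts after any integer eid, so entries with sal == sal_hi are included.
    hi = bisect_right(index, (age, sal_hi, float("inf")))
    return [eid for _, _, eid in index[lo:hi]]

employees = [
    (1, 25, 2500), (2, 25, 3000), (3, 25, 4000), (4, 25, 5000),
    (5, 25, 5001), (6, 30, 4000), (7, 24, 4000),
]
index = build_age_sal_index(employees)
print(eids_for(index, 25, 3000, 5000))
```

With the equality column first, all qualifying entries are contiguous in the index, so the query reduces to locating the start of the range and scanning forward.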

Question 4
Consider the query:
– select E.dno, COUNT(*) from Employees E group by E.dno
Which index would you build in order to be able to evaluate the query quickly?
Hint: Create an index that allows avoiding access to the actual Employees table

Question 5
Consider the query:
– select E.dno, MIN(E.sal) from Employees E group by E.dno
Which index would you build in order to be able to evaluate the query quickly?
Hint: Create an index that allows avoiding access to the actual Employees table
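The hint in Questions 4 and 5 points at index-only evaluation. As a sketch under assumptions, the code below models an index on (dno, sal) as a sorted list of entries and answers both aggregates by scanning the entries alone, without touching the base table. The key choice and function names are illustrative, and for Question 4 an index on dno alone would already suffice.

```python
from itertools import groupby

def build_dno_sal_index(employees):
    """employees: iterable of (eid, dno, sal); returns sorted (dno, sal) index entries."""
    return sorted((dno, sal) for _, dno, sal in employees)

def count_per_dno(index):
    """COUNT(*) per dno, computed from index entries only."""
    return {dno: sum(1 for _ in grp) for dno, grp in groupby(index, key=lambda e: e[0])}

def min_sal_per_dno(index):
    """MIN(sal) per dno: the first entry of each dno group in sorted order."""
    return {dno: next(grp)[1] for dno, grp in groupby(index, key=lambda e: e[0])}

employees = [(1, 10, 500), (2, 20, 300), (3, 10, 200), (4, 10, 900)]
index = build_dno_sal_index(employees)
print(count_per_dno(index), min_sal_per_dno(index))
```

Because the entries are sorted by dno and then sal, each group is contiguous and its minimum salary is simply the first entry of the group.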