Quick Review of Apr 17 material Multiple-Key Access –There are good and bad ways to run queries on multiple single keys Indices on Multiple Attributes.

Presentation transcript:

Quick Review of Apr 17 material
Multiple-Key Access
–There are good and bad ways to run queries on multiple single keys
Indices on Multiple Attributes
–Combining two keys into a single concatenated attribute
–Grid files
  crude array with one dimension and a linear scale for each attribute
  more than one cell may point to a given bucket of values
  the grid array may be dynamically resized during use
–Other alternatives come from spatial databases: R-trees, quad-trees, k-d trees
Bitmap Indices
–linear array of bits: bit j is set if tuple j has the attribute that this bitmap tracks (e.g., “Moonroof”: bit j is 1 if record j is a car with a moonroof)
–Queries are answered by combining several bitmaps using and, or, not
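A minimal Python sketch of the bitmap idea, assuming each bitmap is stored as an integer whose bit j stands for record j. The car records and the "red" attribute are made up for illustration; only "Moonroof" comes from the slide.

```python
# Hypothetical car records; only the "moonroof" attribute appears in the slide.
cars = [
    {"moonroof": True,  "red": False},
    {"moonroof": True,  "red": True},
    {"moonroof": False, "red": True},
    {"moonroof": False, "red": False},
]

def build_bitmap(records, attribute):
    """Return an int whose bit j is 1 if record j has the attribute."""
    bitmap = 0
    for j, record in enumerate(records):
        if record[attribute]:
            bitmap |= 1 << j
    return bitmap

def matching_records(bitmap, n):
    """List the record numbers whose bit is set in the bitmap."""
    return [j for j in range(n) if (bitmap >> j) & 1]

n = len(cars)
all_ones = (1 << n) - 1  # mask that keeps NOT within the n record bits

moonroof = build_bitmap(cars, "moonroof")
red = build_bitmap(cars, "red")

red_with_moonroof = matching_records(moonroof & red, n)   # and
red_or_moonroof = matching_records(moonroof | red, n)     # or
no_moonroof = matching_records(~moonroof & all_ones, n)   # not
```

Each query touches only the bitmaps, never the records themselves, which is why combining conditions this way is cheap.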

Today
HW #4: due Thursday April 24 (next class)
–Questions: 12.11, 12.12, 12.13,
No HW for next week
Today:
–Start Chapter 13: Query Processing

Query Processing
SQL is good for humans, but not as an internal (machine) representation of how to calculate a result
Processing an SQL (or other) query requires these steps:
–parsing and translation
  turning the query into a useful internal representation in the extended relational algebra
–optimization
  manipulating the relational algebra query into the most efficient form (one that gets results the fastest)
–evaluation
  actually computing the results of the query

Query Processing Diagram

Query Processing Steps
1. parsing and translation
–details of parsing are covered in other places (texts and courses on compilers). We’ve already covered SQL and relational algebra; translating between the two should be relatively familiar ground
2. optimization
–This is the meat of chapter 13: how to figure out which plan, among many, is the best way to execute a query
3. evaluation
–actually computing the results of the query is mostly mechanical (doesn’t require much cleverness) once a good plan is in place

Query Processing Example
Initial query:
  select balance
  from account
  where balance < 2500
Two different relational algebra expressions could represent this query:
–sel balance<2500 (Pro balance (account))
–Pro balance (sel balance<2500 (account))
Which choice is better? It depends upon metadata (data about the data) and what indices are available for use on these operations.
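To see that the two expressions really are interchangeable, here is a small Python sketch that evaluates both orders on a made-up account relation. The helper names select and project, the sample tuples, and the duplicate-eliminating (set) semantics of project are assumptions for illustration, not anything the slides define.

```python
# Made-up account relation; each tuple is a dict.
account = [
    {"account_number": "A-101", "balance": 500},
    {"account_number": "A-102", "balance": 3000},
    {"account_number": "A-103", "balance": 700},
    {"account_number": "A-104", "balance": 500},
]

def select(relation, predicate):
    """Relational selection: keep the tuples that satisfy the predicate."""
    return [t for t in relation if predicate(t)]

def project(relation, attributes):
    """Relational projection with duplicate elimination (set semantics)."""
    result = []
    for t in relation:
        row = {a: t[a] for a in attributes}
        if row not in result:
            result.append(row)
    return result

def low_balance(t):
    return t["balance"] < 2500

# sel balance<2500 (Pro balance (account))
plan_a = select(project(account, ["balance"]), low_balance)

# Pro balance (sel balance<2500 (account))
plan_b = project(select(account, low_balance), ["balance"])
```

Both plans return the same balances. They differ only in how much data each intermediate step has to handle, which is exactly what the optimizer tries to estimate.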

Query Processing Metadata
Cost parameters (some are easy to maintain; some are very hard; this is statistical info maintained in the system’s catalog)
–n(r): number of tuples in relation r
–b(r): number of disk blocks containing tuples of relation r
–s(r): average size of a tuple of relation r
–f(r): blocking factor of r: how many tuples fit in a disk block
–V(A,r): number of distinct values of attribute A in r. (V(A,r) = n(r) if A is a candidate key)
–SC(A,r): average selectivity cardinality factor for attribute A of r, i.e. the average number of tuples that share one value of A. Equivalent to n(r)/V(A,r). (1 if A is a key)
–min(A,r): minimum value of attribute A in r
–max(A,r): maximum value of attribute A in r
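As a rough illustration, the Python below derives these parameters for one attribute of a toy relation. The function name catalog_stats and the sample balances are hypothetical, and b(r) is computed as ceil(n(r)/f(r)), which assumes the tuples are packed densely into blocks.

```python
import math

def catalog_stats(tuples, attribute, tuples_per_block):
    """Compute the slide's cost parameters for one attribute of a relation.

    Assumption: tuples are packed densely, so b(r) = ceil(n(r) / f(r)).
    """
    values = [t[attribute] for t in tuples]
    n = len(tuples)
    distinct = len(set(values))
    return {
        "n": n,                                  # n(r)
        "f": tuples_per_block,                   # f(r)
        "b": math.ceil(n / tuples_per_block),    # b(r)
        "V": distinct,                           # V(A,r)
        "SC": n / distinct,                      # SC(A,r) = n(r)/V(A,r)
        "min": min(values),                      # min(A,r)
        "max": max(values),                      # max(A,r)
    }

# Toy relation: ten accounts with five distinct balances.
balances = [100, 200, 200, 300, 300, 300, 400, 400, 500, 500]
accounts = [{"balance": b} for b in balances]
stats = catalog_stats(accounts, "balance", tuples_per_block=4)
```

A real system keeps these numbers in its catalog and refreshes them only occasionally, so they are estimates rather than exact counts.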

Query Processing Metadata (2)
Cost parameters are used in two important computations:
–I/O cost of an operation
–the size of the result
In the following examination we’ll find it useful to differentiate three important operations:
–Selection (search) for equality (R.A1=c)
–Selection (search) for inequality (R.A1>c) (range queries)
–Projection on attribute A1

Selection for Equality (no indices)
Selection (search) for equality (R.A1=c)
–cost (sequential search on a sorted relation) =
  b(r)/2  average unsuccessful
  b(r)/2 + SC(A1,r) - 1  average successful
–cost (binary search on a sorted relation) =
  log2 b(r)  average unsuccessful
  log2 b(r) + SC(A1,r) - 1  average successful
–size of the result: n(select(R.A1=c)) = SC(A1,r) = n(r)/V(A1,r)
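A worked example of these formulas in Python, using made-up statistics (n(r) = 10240, f(r) = 20, V(A1,r) = 1024). The function names are hypothetical, and the code follows the slide's formulas as stated, adding SC(A1,r) - 1 directly to the block count.

```python
import math

def sequential_search_cost(b_r, sc, found):
    """Average cost of a sequential search on a sorted relation."""
    return b_r / 2 + (sc - 1 if found else 0)

def binary_search_cost(b_r, sc, found):
    """Average cost of a binary search on a sorted relation."""
    return math.log2(b_r) + (sc - 1 if found else 0)

# Made-up statistics for illustration.
n_r = 10240          # tuples
f_r = 20             # tuples per block
v_a1 = 1024          # distinct values of A1
b_r = n_r // f_r     # 512 blocks
sc = n_r / v_a1      # 10 tuples per value, also the estimated result size

seq_hit = sequential_search_cost(b_r, sc, found=True)
seq_miss = sequential_search_cost(b_r, sc, found=False)
bin_hit = binary_search_cost(b_r, sc, found=True)
bin_miss = binary_search_cost(b_r, sc, found=False)
```

With these numbers the binary search costs 18 block accesses against 265 for the sequential search, so the logarithmic term dominates the saving.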

Selection for Inequality (no indices)
Selection (search) for inequality (R.A1>c)
–cost (file unsorted) = b(r)
–cost (file sorted on A1) =
  b(r)/2 + b(r)/2 (if we assume that half the tuples qualify)
  b(r) in general (regardless of the number of tuples that qualify. Why?)
–size of the result = depends upon the query; unpredictable

Projection on A1
Projection on attribute A1
–cost = b(r)
–size of the result: n(Pro(R,A1)) = V(A1,r)

Selection (Indexed Scan) for Equality
Primary Index on key:
–cost = (height+1)  unsuccessful
–cost = (height+1) + 1  successful
Primary (clustering) Index on non-key:
–cost = (height+1) + SC(A1,r)/f(r)
–all tuples with the same value are clustered
Secondary Index:
–cost = (height+1) + SC(A1,r)
–tuples with the same value are scattered
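The three cases can be compared side by side with a short Python sketch. The function names and the sample numbers (height = 3, SC = 40, f = 20) are assumptions chosen only to make the arithmetic easy to follow.

```python
def primary_key_cost(height, found):
    """Primary index on a key: walk the index, then read one block if found."""
    return (height + 1) + (1 if found else 0)

def clustering_nonkey_cost(height, sc, f_r):
    """Clustering index on a non-key: matches sit in SC/f consecutive blocks."""
    return (height + 1) + sc / f_r

def secondary_cost(height, sc):
    """Secondary index: each of the SC matches may be in a different block."""
    return (height + 1) + sc

# Made-up numbers for illustration.
height, sc, f_r = 3, 40, 20

key_hit = primary_key_cost(height, found=True)
clustered = clustering_nonkey_cost(height, sc, f_r)
scattered = secondary_cost(height, sc)
```

The gap between the clustered and scattered cases (6 versus 44 block accesses here) comes entirely from whether matching tuples share blocks.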

Selection (Indexed Scan) for Inequality
Primary Index on key: search for the first value and then pick tuples >= value
–cost = (height+1) + 1 + size of the result (in disk pages)
       = height+2 + n(r) * (max(A,r)-c)/(max(A,r)-min(A,r))/f(r)
Primary (clustering) Index on non-key:
–cost as above (all tuples with the same value are clustered)
Secondary (non-clustering) Index:
–cost = (height+1) + B-treeLeaves/2 + size of the result (in tuples)
       = height+1 + B-treeLeaves/2 + n(r) * (max(A,r)-c)/(max(A,r)-min(A,r))
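The sketch below plugs made-up numbers into these two formulas. The function names are hypothetical, and the fraction (max - c)/(max - min) implicitly assumes that values of A are spread uniformly between min(A,r) and max(A,r).

```python
def qualifying_fraction(c, min_a, max_a):
    """Estimated fraction of tuples with A > c, assuming uniform values."""
    return (max_a - c) / (max_a - min_a)

def primary_range_cost(height, n_r, f_r, c, min_a, max_a):
    """Primary index: find the first match, then read the result pages in order."""
    return height + 2 + n_r * qualifying_fraction(c, min_a, max_a) / f_r

def secondary_range_cost(height, leaves, n_r, c, min_a, max_a):
    """Secondary index: scan half the leaves, then one access per result tuple."""
    return height + 1 + leaves / 2 + n_r * qualifying_fraction(c, min_a, max_a)

# Made-up statistics: 10000 tuples, 20 per block, so a full scan costs 500.
n_r, f_r = 10000, 20
full_scan = n_r // f_r

primary = primary_range_cost(3, n_r, f_r, c=2500, min_a=0, max_a=5000)
secondary = secondary_range_cost(3, 100, n_r, c=2500, min_a=0, max_a=5000)
```

Under these assumptions the secondary index (5054 accesses) is far worse than simply scanning the file (500), while the primary index (255) roughly halves the scan cost.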

Complex Selections
Conjunction (select where theta1 and theta2)
–(s1 = # of tuples satisfying selection condition theta1; s2 likewise for theta2)
–combined SC = (s1/n(r)) * (s2/n(r)) = s1*s2/n(r)^2, assuming independence of predicates
Disjunction (select where theta1 or theta2)
–combined SC = 1 - (1 - s1/n(r)) * (1 - s2/n(r)) = s1/n(r) + s2/n(r) - s1*s2/n(r)^2
Negation (select where not theta1)
–n(!theta1) = n(r) - n(theta1)
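A small numeric check of these estimates in Python. The function names and the sample counts (n(r) = 1000, s1 = 500, s2 = 250) are made up; the conjunction and disjunction results are fractions of the relation, so multiplying by n(r) gives an estimated tuple count.

```python
def conjunction_selectivity(s1, s2, n_r):
    """Fraction of tuples satisfying both predicates, assuming independence."""
    return (s1 / n_r) * (s2 / n_r)

def disjunction_selectivity(s1, s2, n_r):
    """Fraction of tuples satisfying at least one predicate, assuming independence."""
    return 1 - (1 - s1 / n_r) * (1 - s2 / n_r)

def negation_size(s1, n_r):
    """Number of tuples that do not satisfy the predicate."""
    return n_r - s1

# Made-up counts for illustration.
n_r, s1, s2 = 1000, 500, 250

both = conjunction_selectivity(s1, s2, n_r)
either = disjunction_selectivity(s1, s2, n_r)
neither_first = negation_size(s1, n_r)

# The expanded disjunction formula from the slide gives the same value.
either_expanded = s1 / n_r + s2 / n_r - s1 * s2 / n_r ** 2
```

Here the conjunction is estimated at 125 tuples and the disjunction at 625, and the two forms of the disjunction formula agree.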

Complex Selections with Indices
GOAL: apply the most restrictive condition first, and combine the use of multiple indices, to reduce the intermediate results as early as possible
Why? No index will be available on intermediate results!
Conjunctive selection using one index B:
–select using B and then apply the remaining predicates to the intermediate results
Conjunctive selection using a composite key index (R.A1, R.A2):
–create a composite key or range from the query values and search directly (a range search is possible only on the first attribute, the MSB of the composite key)
Conjunctive selection using two indices B1 and B2:
–search each separately and intersect the tuple identifiers (TIDs)
Disjunctive selection using two indices B1 and B2:
–search each separately and union the tuple identifiers (TIDs)
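The two-index cases can be sketched in Python by modelling each index as a dictionary from attribute value to a set of TIDs. The index contents, attribute names, and the lookup helper are all hypothetical; a real index would be a B-tree or hash structure rather than a dict.

```python
# Hypothetical indices: attribute value -> set of tuple identifiers (TIDs).
index_b1 = {"Perryridge": {1, 4, 7}, "Downtown": {2, 3}}   # on branch name
index_b2 = {500: {1, 2}, 700: {4}, 900: {3, 7}}            # on balance

def lookup(index, value):
    """Return the TIDs for a value, or an empty set if the value is absent."""
    return index.get(value, set())

# Conjunctive selection: search each index separately, then intersect the TIDs.
conjunctive_tids = lookup(index_b1, "Perryridge") & lookup(index_b2, 500)

# Disjunctive selection: search each index separately, then union the TIDs.
disjunctive_tids = lookup(index_b1, "Perryridge") | lookup(index_b2, 500)

# Only now are the actual tuples fetched, one access per surviving TID.
fetch_order = sorted(conjunctive_tids)
```

Working on TIDs first means the tuples themselves are read only after the candidate set has been narrowed, which is the point of the GOAL stated on the slide.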