Access Path Selection in a Relational Database Management System (summarized in section 2)



Processing an SQL statement
Four phases: parsing, optimization, code generation, execution
An SQL statement may have many query blocks (nesting)

Optimizer
Validates the parsed query
Collects statistics on referenced relations and columns
Discovers available access paths for each relation
Checks for type errors in expressions
Access path selection:
  – determines the order of evaluation of query blocks
  – creates a tree of alternative path choices for each query block with more than one relation
  – chooses the minimum-cost access path from the tree
The optimizer's result is passed to the code generation and execution components

RSS (Research Storage System)
Storage manager for System R
Maintains physical storage, access paths, locking, logging, and recovery
Relations are stored as collections of tuples
Tuples are stored on 4K pages; pages are organized into segments
Segments completely contain one or more relations
Tuples are accessed via a scan: sequential scan or index scan
Indexes are B-trees with linked leaves
A sequential scan touches every page of the segment that contains the relation exactly once
An index scan touches every leaf page of the index once, and each relation page >= 1 times
If the index and the data tuples are in the same order, the data is “clustered”
Scans may take a set of predicates to apply to a tuple before returning it
  – predicates are of the form (column op value)

Cost computation
cost = page fetches + W * (RSI calls)
  – that is, cost = I/O cost + W * CPU cost, where W weights CPU cost against I/O cost
An index that matches a boolean factor of the query is an efficient access path

Statistics
NCARD(T): cardinality of the relation T
TCARD(T): number of pages used for T
P(T): fraction of pages in a segment used for T
ICARD(I): number of distinct keys in index I
NINDX(I): number of pages in index I

Selectivity
column = value: F = 1/ICARD(column) if there is an index; F = 1/10 otherwise
column1 = column2: F = 1/MAX(ICARD(column1), ICARD(column2)) if both columns have an index; F = 1/ICARD(column i) if only column i has an index; F = 1/10 otherwise
column > value: F = (high key value - value) / (high key value - low key value)
column BETWEEN value1 AND value2: F = (value2 - value1) / (high key value - low key value)
column IN (list of values): F = (# of items in list) * (selectivity for column = value), capped at 1/2
columnA IN subquery: F = (card. of subquery) / (Π card. of subquery relations)
(pred1) OR (pred2): F = F(pred1) + F(pred2) - F(pred1) * F(pred2)
(pred1) AND (pred2): F = F(pred1) * F(pred2)
NOT pred: F = 1 - F(pred)
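The selectivity rules on this slide can be sketched as small Python functions. This is only an illustration: the function names are made up here, key values are assumed to be integers with low key <= value <= high key, and `icard=None` is used to mean "no index on the column".

```python
from fractions import Fraction

DEFAULT_EQ = Fraction(1, 10)  # fallback from the slide when no index exists


def sel_eq_value(icard=None):
    """column = value: 1/ICARD if an index exists, else 1/10."""
    return Fraction(1, icard) if icard else DEFAULT_EQ


def sel_eq_columns(icard1=None, icard2=None):
    """column1 = column2: 1/MAX(ICARD) over the indexed columns, else 1/10."""
    known = [c for c in (icard1, icard2) if c]
    return Fraction(1, max(known)) if known else DEFAULT_EQ


def sel_greater(value, low_key, high_key):
    """column > value: linear interpolation over the key range."""
    return Fraction(high_key - value, high_key - low_key)


def sel_between(value1, value2, low_key, high_key):
    """column BETWEEN value1 AND value2."""
    return Fraction(value2 - value1, high_key - low_key)


def sel_in_list(n_items, icard=None):
    """column IN (list): n_items * F(column = value), capped at 1/2."""
    return min(n_items * sel_eq_value(icard), Fraction(1, 2))


def sel_or(f1, f2):
    return f1 + f2 - f1 * f2


def sel_and(f1, f2):
    return f1 * f2


def sel_not(f):
    return 1 - f
```

For example, `sel_in_list(30, icard=50)` hits the 1/2 cap, because 30 * 1/50 would otherwise be 3/5.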

QCARD
QCARD = (Π card. of all relations) * (Π F(pred i))
RSICARD, the expected number of calls to the RSI, = (Π card. of all relations) * (Π F(sargable pred i))
An “interesting order” is an order specified by the GROUP BY or ORDER BY clause
Single relation cost: the cheaper of (a) the cheapest access path that produces the “interesting order” and (b) the cheapest access path plus the cost of sorting the result
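As a rough illustration of the two cardinality estimates, the sketch below multiplies relation cardinalities by selectivity factors. The function name and the `(selectivity, is_sargable)` pair representation are assumptions made for this example, not part of the slides.

```python
from fractions import Fraction
from math import prod


def estimate_cards(relation_cards, predicates):
    """Return (QCARD, RSICARD).

    relation_cards: cardinalities of the relations in the query block.
    predicates: (selectivity, is_sargable) pairs, one per boolean factor
                (a hypothetical representation chosen for this sketch).
    """
    base = prod(relation_cards)
    # QCARD applies every predicate's selectivity factor.
    qcard = base * prod(f for f, _ in predicates)
    # RSICARD applies only the sargable ones, since only those
    # are evaluated below the RSI interface.
    rsicard = base * prod(f for f, sargable in predicates if sargable)
    return qcard, rsicard


# Two relations (1000 and 200 tuples), one sargable and one non-sargable predicate.
q, r = estimate_cards(
    [1000, 200],
    [(Fraction(1, 10), True), (Fraction(1, 2), False)],
)
```

Here `q` is 10000 and `r` is 20000: the non-sargable predicate reduces the query result but not the number of RSI calls.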

Cost Table (p. 515)
Each entry is index pages fetched plus data pages fetched plus W times RSI tuple retrieval calls
Unique index matching an equal predicate: 1 + 1 + W
Clustered index I matching one or more boolean factors: F(preds) * (NINDX(I) + TCARD) + W * RSICARD
etc…
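The two cost-table entries shown on this slide can be written out directly. The sketch below covers only those two cases; the function names and the sample value W = 0.5 are assumptions for illustration.

```python
def cost(page_fetches, rsi_calls, w):
    """Generic formula: page fetches + W * (RSI calls)."""
    return page_fetches + w * rsi_calls


def cost_unique_index_eq(w):
    """Unique index matching an equal predicate: 1 + 1 + W
    (one index page, one data page, one RSI call)."""
    return cost(1 + 1, 1, w)


def cost_clustered_index(f_preds, nindx, tcard, rsicard, w):
    """Clustered index matching one or more boolean factors:
    F(preds) * (NINDX(I) + TCARD) + W * RSICARD."""
    return cost(f_preds * (nindx + tcard), rsicard, w)
```

With a hypothetical W = 0.5, `cost_unique_index_eq(0.5)` gives 2.5, and `cost_clustered_index(0.25, 50, 1000, 2000, 0.5)` gives 262.5 page fetches plus 1000 weighted RSI calls, for a total of 1262.5.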

Joins
Two methods: nested loops and merging scans
Merging scans require sorts on the join column, which is another “interesting order”
n-way joins can be done as a succession of 2-way joins, not necessarily using the same technique
Results may be pipelined if a sort is not required
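To make the two join methods concrete, here is a minimal in-memory sketch. It assumes tuples are Python tuples, that the join column is given by position, and that both inputs to the merging-scans join are already sorted on that column; a real system would operate on pages and scans rather than lists.

```python
def nested_loop_join(outer, inner, key_outer, key_inner):
    """For each outer tuple, scan the whole inner relation."""
    return [
        (o, i)
        for o in outer
        for i in inner
        if o[key_outer] == i[key_inner]
    ]


def merge_scan_join(outer, inner, key_outer, key_inner):
    """Join two inputs that are both sorted on the join column."""
    result = []
    j = 0  # start of the current group of inner tuples
    n = len(inner)
    for o in outer:
        k = o[key_outer]
        # Skip inner tuples whose key is smaller than the outer key.
        while j < n and inner[j][key_inner] < k:
            j += 1
        # Emit the matching group without moving j, so the next
        # outer tuple with the same key can match the group again.
        m = j
        while m < n and inner[m][key_inner] == k:
            result.append((o, inner[m]))
            m += 1
    return result


outer = [(1, "a"), (2, "b"), (2, "c"), (4, "d")]
inner = [(2, "x"), (2, "y"), (3, "z"), (4, "w")]
pairs = merge_scan_join(outer, inner, 0, 0)
```

On this sample input both methods return the same five pairs; the merge version reads each input in order and never restarts the inner scan from the beginning.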

Join ordering
n! permutations of relation join orders
The join of the (k+1)-st relation with the previous k relations is independent of the order in which the first k were joined
Avoid Cartesian products when possible; perform them as late as possible

Construct a tree
Construct a tree of possible join orderings: keep the cheapest order that produces an interesting ordering
First, find the best way to access each single relation, for each interesting ordering and for the unordered case
Next, find the best way of joining any relation to each of these
Repeat until all relations have been added to each branch
Choose the cheapest strategy that has an interesting ordering, or the cheapest strategy plus a sort
Total number of solutions to store: 2^n
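The search described above can be sketched as dynamic programming over subsets of relations. This is a heavily simplified illustration, not the System R implementation: it ignores interesting orders and access paths, considers only left-deep orders, and uses a toy cost model (the sum of intermediate result cardinalities) that is an assumption of this example. The function name and input dictionaries are also hypothetical.

```python
from itertools import combinations


def best_join_order(cards, join_sel):
    """Return (cost, order) for the cheapest left-deep join order.

    cards:    relation name -> cardinality
    join_sel: frozenset({a, b}) -> selectivity of the join predicate
              between a and b (pairs without an entry have none)
    """
    names = sorted(cards)
    # best[subset] = (cost so far, result cardinality, join order)
    best = {frozenset([r]): (0, cards[r], (r,)) for r in names}
    for size in range(2, len(names) + 1):
        for combo in combinations(names, size):
            subset = frozenset(combo)
            candidates = []
            for r in combo:
                # Extend the best plan for the other relations by r; how
                # those relations were joined no longer matters.
                cost, card, order = best[subset - {r}]
                sels = [
                    join_sel[frozenset((r, s))]
                    for s in order
                    if frozenset((r, s)) in join_sel
                ]
                out_card = card * cards[r]
                for f in sels:
                    out_card *= f
                # A step with no join predicate is a Cartesian product;
                # sorting on this flag first defers such steps.
                is_cartesian = not sels
                candidates.append(
                    (is_cartesian, cost + out_card, out_card, order + (r,))
                )
            _, cost, card, order = min(candidates)
            best[subset] = (cost, card, order)
    cost, _, order = best[frozenset(names)]
    return cost, order


cards = {"A": 1000, "B": 100, "C": 10}
join_sel = {frozenset(("A", "B")): 0.01, frozenset(("B", "C")): 0.1}
total_cost, order = best_join_order(cards, join_sel)
```

For this sample input the chosen order is `("B", "C", "A")`: joining the two small relations first keeps the intermediate result small. The `best` table holds one entry per non-empty subset of relations, which is where the 2^n bound on stored solutions comes from.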