Optimizing Queries Using Materialized Views Qiang Wang CS848.

Slides:



Advertisements
Similar presentations
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe Slide
Advertisements

Tuning: overview Rewrite SQL (Leccotech)Leccotech Create Index Redefine Main memory structures (SGA in Oracle) Change the Block Size Materialized Views,
February 18, 2012 Lesson 3 Standard SQL. Lesson 3 Standard SQL.
Copyright © 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 5 More SQL: Complex Queries, Triggers, Views, and Schema Modification.
1 Copyright © 2011, Oracle and/or its affiliates. All rights reserved.
Database Management Systems, R. Ramakrishnan and J. Gehrke1 The Relational Model Chapter 3.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 SQL: Queries, Programming, Triggers Chapter 5 Modified by Donghui Zhang.
EXECUTION PLANS By Nimesh Shah, Amit Bhawnani. Outline  What is execution plan  How are execution plans created  How to get an execution plan  Graphical.
Answering Queries Using Views Advanced DB Class Presented by David Fuhry March 9, 2006.
Introduction to Structured Query Language (SQL)
XP Chapter 3 Succeeding in Business with Microsoft Office Access 2003: A Problem-Solving Approach 1 Analyzing Data For Effective Decision Making.
FALL 2004CENG 351 File Structures and Data Management1 SQL: Structured Query Language Chapter 5.
Introduction to Structured Query Language (SQL)
1 Relational Model. 2 Relational Database: Definitions  Relational database: a set of relations  Relation: made up of 2 parts: – Instance : a table,
The Relational Database Model. 2 Objectives How relational database model takes a logical view of data Understand how the relational model’s basic components.
Database Systems More SQL Database Design -- More SQL1.
Introduction to Structured Query Language (SQL)
Database Systems: Design, Implementation, and Management Eighth Edition Chapter 7 Introduction to Structured Query Language (SQL)
Optimizing queries using materialized views J. Goldstein, P.-A. Larson SIGMOD 2001.
Architecting a Large-Scale Data Warehouse with SQL Server 2005 Mark Morton Senior Technical Consultant IT Training Solutions DAT313.
Relational Database Performance CSCI 6442 Copyright 2013, David C. Roberts, all rights reserved.
Query Optimization, part 2 CS634 Lecture 13, Mar Slides based on “Database Management Systems” 3 rd ed, Ramakrishnan and Gehrke.
Ontology Alignment/Matching Prafulla Palwe. Agenda ► Introduction  Being serious about the semantic web  Living with heterogeneity  Heterogeneity problem.
1 IT420: Database Management and Organization SQL - Data Manipulation Language 27 January 2006 Adina Crăiniceanu
Chapter 3 Single-Table Queries
CSE314 Database Systems More SQL: Complex Queries, Triggers, Views, and Schema Modification Doç. Dr. Mehmet Göktürk src: Elmasri & Navanthe 6E Pearson.
Physical Database Design Chapter 6. Physical Design and implementation 1.Translate global logical data model for target DBMS  1.1Design base relations.
Database Management 9. course. Execution of queries.
Analyzing Data For Effective Decision Making Chapter 3.
Ashwani Roy Understanding Graphical Execution Plans Level 200.
The Relational Database Model
1 Single Table Queries. 2 Objectives  SELECT, WHERE  AND / OR / NOT conditions  Computed columns  LIKE, IN, BETWEEN operators  ORDER BY, GROUP BY,
Instructor: Jinze Liu Fall Basic Components (2) Relational Database Web-Interface Done before mid-term Must-Have Components (2) Security: access.
Data Warehouse Design Xintao Wu University of North Carolina at Charlotte Nov 10, 2008.
DATABASE TRANSACTION. Transaction It is a logical unit of work that must succeed or fail in its entirety. A transaction is an atomic operation which may.
JOI/1 Data Manipulation - Joins Objectives –To learn how to join several tables together to produce output Contents –Extending a Select to retrieve data.
6 1 Lecture 8: Introduction to Structured Query Language (SQL) J. S. Chou, P.E., Ph.D.
IST 220 Introduction to Databases Course Wrap-up.
1 Chapter 10 Joins and Subqueries. 2 Joins & Subqueries Joins – Methods to combine data from multiple tables – Optimizer information can be limited based.
Database Systems Design, Implementation, and Management Coronel | Morris 11e ©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or.
IS 230Lecture 6Slide 1 Lecture 7 Advanced SQL Introduction to Database Systems IS 230 This is the instructor’s notes and student has to read the textbook.
Oracle 11g: SQL Chapter 4 Constraints.
Chapter 4 Constraints Oracle 10g: SQL. Oracle 10g: SQL 2 Objectives Explain the purpose of constraints in a table Distinguish among PRIMARY KEY, FOREIGN.
Database Fundamental & Design by A.Surasit Samaisut Copyrights : All Rights Reserved.
AL-MAAREFA COLLEGE FOR SCIENCE AND TECHNOLOGY INFO 232: DATABASE SYSTEMS CHAPTER 7 (Part II) INTRODUCTION TO STRUCTURED QUERY LANGUAGE (SQL) Instructor.
Indexes and Views Unit 7.
Physical Database Design Purpose- translate the logical description of data into the technical specifications for storing and retrieving data Goal - create.
CS 405G: Introduction to Database Systems Instructor: Jinze Liu Fall 2009.
Session 1 Module 1: Introduction to Data Integrity
CS34311 The Relational Model. cs34312 Why Relational Model? Currently the most widely used Vendors: Oracle, Microsoft, IBM Older models still used IBM’s.
Query Processing – Implementing Set Operations and Joins Chap. 19.
Lecture 15: Query Optimization. Very Big Picture Usually, there are many possible query execution plans. The optimizer is trying to chose a good one.
7 1 Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 7.6 Advanced Select Queries SQL provides useful functions that.
Quiz Which of the following is not a mandatory characteristic of a relation? Rows are not ordered (Not required) Each row is a unique There is a.
LM 5 Introduction to SQL MISM 4135 Instructor: Dr. Lei Li.
1 CS122A: Introduction to Data Management Lecture #4 (E-R  Relational Translation) Instructor: Chen Li.
1 Agenda TMA02 M876 Block 4. 2 Model of database development data requirements conceptual data model logical schema schema and database establishing requirements.
Concepts of Database Management, Fifth Edition Chapter 3: The Relational Model 2: SQL.
IT 5433 LM4 Physical Design. Learning Objectives: Describe the physical database design process Explain how attributes transpose from the logical to physical.
More SQL: Complex Queries, Triggers, Views, and Schema Modification
Practical Database Design and Tuning
More SQL: Complex Queries,
Optimizing Queries Using Materialized Views
CS 405G: Introduction to Database Systems
Chapter # 7 Introduction to Structured Query Language (SQL) Part II.
More SQL: Complex Queries, Triggers, Views, and Schema Modification
SQL: Structured Query Language
Contents Preface I Introduction Lesson Objectives I-2
All about Indexes Gail Shaw.
Presentation transcript:

Optimizing Queries Using Materialized Views Qiang Wang CS848

Answering queries using views Cost-based rewriting (query optimization and physical data independence) Logical rewriting (data integration and ontology integration) System-R styleTransformational approaches Rewriting algorithms Query answering algorithms

Transformational approach Transformation-based optimizer Related work  Oracle 8i DBMS  IBM DB2  Microsoft SQL Server

A practical and scalable solution Microsoft SQL Server 2000 Limitations on the materialized view  A single-level SQL statement containing selections, inner joins and an optional group-by  It must reference base tables  Aggregation view must output all grouping columns and a count column  Aggregation functions are limited to SUM and COUNT A solution by Goldstein and Larson  An efficient view-matching algorithm  A novel index structure on view definitions

Problem definition View matching with single-view substitution: Given a relational expression in SPJG form, find all materialized views from which the expression can be computed and, for each view found, construct a substitute expression equivalent to the given expression.

Create view v1 as select p_partkey,p_name,p_retailprice, count_big(*) as cnt sum(l_extendedprice*l_quantity) as gross_revenue from dbo.lineitem, dbo.part where p_partkey < 1000 and p_partkey = l_partkey and p_name like ‘%steel%’ group by p_partkey,p_name,p_retailprice Create unique clustered index v1_cidx on v1(p_partkey) select p_partkey,p_name, sum(l_extendedprice*l_quantity), from dbo.lineitem, dbo.part where p_partkey < 500 and p_partkey = l_partkey and p_name like ‘%steel%’ group by p_partkey,p_retailprice Materialized View Query

substitutability requirements 1. The view contains all rows needed by the query expression 2. All required rows can be selected from the view 3. All output expressions can be computed from the output of the view 4. All output rows occur with the correct duplication factor

Do all required rows exist in the view(1) Check whether the source tables in the view are superset of the source tables in the query Check whether the query selection predicates can imply the view selection predicates Equijoin subsumption test Range subsumption test Residual subsumption test

Equijoin subsumption test Test  Step1: equivalence class construction {p_partkey,l_partkey} as Vc1 and Qc1  Step2: check whether every nontrivial view equivalence class is a subset of some query equivalence class Compensating equality constraints Missing cases

Range subsumption test Test  Associate (view/query) equivalence classes with lower and upper bound according to the range predicates; uninitialized bounds are assigned infinitum Upperbound(Vc1) = 1000 Upperbound(Qc1) = 500  Check whether the view is more tightly constrained than the query on ranges Compensating range constraints Missing cases

Residual subsumption test Test (a shallow matching algorithm)  An expression is represented by a text string and a list of equivalence class referenced Text string: like ‘%steel%’ Attached equivalence class: Vc1  Check whether every conjunct in the view residual predicate matches a conjunct in the query residual predicate Compensating residual constraints Missing cases

Can the required rows be selected (2) Check whether the view covers output columns for compensating equality, range, and residual predicates

Can output expressions be computed (3) Check whether all output expressions of the query can be computed from the view  “l_expendedprice * l_price”

Do rows occur with correct duplication factor(4) Cardinality-preserving joins (table extension joins) A stronger condition is employed: an equijoin between all columns in a non-null foreign key in T and a unique key in S

Test  Given the query references tables T 1,…T n and the view reference tables T 1,…T n,T n+1,…T n+m  Construct foreign key join graph  Build cardinality-preserving relationships among tables  Check whether the extra tables can be eliminated Compensation (not needed)

Substitution with aggregation views Special requirements  The view contains no aggregation or is less aggregated than the query Exact subsumption Functionally determination  All columns required to perform further grouping are available in the view output COUNT,SUM,AVG Compensation  Add extra grouping and aggregation expression computation

A clever index ---- filter tree A1An ….. G1Gm ….. G1’Gm’ ….. View definition list

Internal structure: lattice index

Partitioning conditions Hub condition Source table condition Output expression Output column Residual constraint Range constraint Grouping expression Grouping column

Experiment & extension work Experiment  Concerning 1000 views, candidate set is reduced to less than 0.4% of the views on average; Optimization time is 0.15 second per query where around 10 different substitutions are generated Extension  Constraint test refinement  Relaxation on single-table substitutes

Consideration How about a consolidated lattice structure over views?  Construction && maintenance  Introducing DL for reasoning, but how about the aggregation support?