Answering Queries Using Views Advanced DB Class Presented by David Fuhry March 9, 2006.


Answering Queries Using Views Advanced DB Class Presented by David Fuhry March 9, 2006

Presentation Outline 1) Introduction to views 2) Where views are used 3) How a database processes views 4) Query equivalence and containment 5) Using views to solve queries 6) Means of optimizing the above i. System-R, Transformational

What is a View? ● A named query [Hal0?] ● Virtual or logical table composed of the result set of a query [Wik06] ● Any relation that is not a part of the logical model, but is made visible to a user as a virtual relation [SKS02]

An Example View CREATE VIEW CHEAP_HOTELS AS SELECT Hotel_name, Distance FROM HOTELS WHERE Price < 250; [Figure: the HOTELS table, as returned by SELECT * FROM HOTELS, next to the CHEAP_HOTELS view, as returned by SELECT * FROM CHEAP_HOTELS]

Where are views used? ● Query Optimization & DB Design – Significant performance gain (if materialized) – Logical perspective of physical data ● Data Integration – Provide common query interface to non-uniform data sources – Query -> Mediated Schema -> Source Descriptor -> Source Data

When might I use a view? ● Organize the data to be presented by a screen or page of an application ● Secure a protected global table by making only parts of it visible to users ● Reduce size of query statement – As do stored procedures and prepared statements – Views integrate into SQL expressions more easily though

When else might I use a view? ● Result set is too large to exist on disk – Frequent itemsets when the number of items is realistically large ● I can only access chunks of the data at a time – Web “screen scraping” of detail pages

How does the database process views? ● The view name is replaced by its definition ● SELECT * FROM CHEAP_HOTELS becomes SELECT Hotel_name, Distance FROM HOTELS WHERE Price < 250 ● SELECT Hotel_name FROM CHEAP_HOTELS WHERE Distance > 0.3 becomes SELECT Hotel_name FROM HOTELS WHERE Distance > 0.3 AND Price < 250
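The expansion of a view into its defining query can be checked on a small instance. The following is a minimal sketch using Python's built-in sqlite3 module; the sample rows are invented for illustration, and SQLite stands in for whatever RDBMS the slides had in mind.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE HOTELS (Hotel_name TEXT, Price REAL, Distance REAL);
    -- Sample rows are made up for this sketch.
    INSERT INTO HOTELS VALUES
        ('Seaside Inn', 120, 0.2),
        ('Harbor Lodge', 180, 0.5),
        ('Grand Palace', 420, 0.4),
        ('Budget Stay', 90, 1.1);
    CREATE VIEW CHEAP_HOTELS AS
        SELECT Hotel_name, Distance FROM HOTELS WHERE Price < 250;
""")

# Query written against the view.
via_view = conn.execute(
    "SELECT Hotel_name FROM CHEAP_HOTELS "
    "WHERE Distance > 0.3 ORDER BY Hotel_name"
).fetchall()

# The same query after unfolding the view definition by hand.
unfolded = conn.execute(
    "SELECT Hotel_name FROM HOTELS "
    "WHERE Distance > 0.3 AND Price < 250 ORDER BY Hotel_name"
).fetchall()

print(via_view)
print(unfolded)
```

Both statements return the same rows on this instance, which is what the unfolding predicts. Agreement on one instance is only a sanity check, not a proof of equivalence.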

Data Integration ● Searching a website ● SELECT page_title, url FROM site_index WHERE MATCH (title, body) AGAINST ('+hotels -small') – But no need to use a view in the above; materialize it into a table and update it at a regular interval

Data Integration ● SELECT page_title, url FROM internet WHERE MATCH (title, body) AGAINST ('+hotels -small') LIMIT 10 – Pretty much impossible to materialize 'internet' – Not too difficult if you employ a search engine's API and make 'internet' a view of their cache

Answering queries with views ● Physical data independence – Normal RDBMS views ● Data integration – Ex: search tools that parse multiple formats

Query Containment ● Q1 ⊆ Q2 if the tuples (rows) returned by Q1 are a subset of those returned by Q2 – Q1 is contained in Q2 ● Q1: SELECT Hotel_name, Price, Distance FROM hotels WHERE Price < 400; ● Q2: SELECT Hotel_name, Price, Distance FROM hotels WHERE Price < 240; ● In the above case Q1 ⊇ Q2: every hotel cheaper than 240 is also cheaper than 400, so Q2 is contained in Q1

Query Equivalence ● Q1 and Q2 are equivalent if Q1 ⊆ Q2 and Q2 ⊆ Q1 ● Q1: SELECT Hotel_name, Price, Distance FROM hotels WHERE Distance >= 0.3 ● Q2: SELECT Hotel_name, Price, Distance FROM hotels WHERE Distance BETWEEN 0.3 AND MAX_FLOAT
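For queries that differ only in a range predicate on a single column, as in the hotel examples, containment reduces to interval inclusion and equivalence to inclusion in both directions. The sketch below assumes the queries share the same SELECT list and FROM clause, that the column has no NULLs, and that the ranges are non-empty. The `Range` class and function names are hypothetical, and MAX_FLOAT is modeled as positive infinity.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """Selection predicate on one column: lo <= column <= hi (bounds may be open)."""
    column: str
    lo: float = -math.inf
    hi: float = math.inf
    lo_open: bool = False
    hi_open: bool = False


def range_contained(inner: Range, outer: Range) -> bool:
    """True if every value accepted by `inner` is also accepted by `outer`."""
    if inner.column != outer.column:
        return False
    lo_ok = inner.lo > outer.lo or (
        inner.lo == outer.lo and (inner.lo_open or not outer.lo_open))
    hi_ok = inner.hi < outer.hi or (
        inner.hi == outer.hi and (inner.hi_open or not outer.hi_open))
    return lo_ok and hi_ok


def equivalent(a: Range, b: Range) -> bool:
    """Equivalence is containment in both directions."""
    return range_contained(a, b) and range_contained(b, a)


price_lt_400 = Range("Price", hi=400, hi_open=True)      # Price < 400
price_lt_240 = Range("Price", hi=240, hi_open=True)      # Price < 240
dist_ge = Range("Distance", lo=0.3)                       # Distance >= 0.3
dist_between = Range("Distance", lo=0.3, hi=math.inf)     # BETWEEN 0.3 AND MAX_FLOAT

print(range_contained(price_lt_240, price_lt_400))
print(range_contained(price_lt_400, price_lt_240))
print(equivalent(dist_ge, dist_between))
```

The Price < 240 query is contained in the Price < 400 query but not the other way around, and the two Distance predicates accept exactly the same values. General containment for conjunctive queries needs containment mappings, which this sketch does not attempt.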

When can a view be useful for solving part of a query? ● If it has relation(s) in common with the query and selects some attributes selected by the query ● Example relations shown on the slide: US_Hotels, Hawaii_Buildings, Norwegian_Beagles, Jordanian_Hotels
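The usefulness condition on this slide can be written as a two-part set test. This is a minimal sketch: the `Shape` class, the attribute names, and the assignment of attributes to the relations named on the slide are all assumptions made for illustration. The test is only a first filter, since a view that passes it may still be unusable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shape:
    """The relations a query or view mentions and the attributes it selects."""
    relations: frozenset
    attributes: frozenset


def potentially_useful(view: Shape, query: Shape) -> bool:
    """The view shares a relation with the query and selects some of its attributes."""
    return bool(view.relations & query.relations) and bool(
        view.attributes & query.attributes)


# Hypothetical query and views over relation names taken from the slide.
query = Shape(frozenset({"US_Hotels"}), frozenset({"Hotel_name", "Price"}))
hotel_view = Shape(frozenset({"US_Hotels"}), frozenset({"Hotel_name", "Distance"}))
beagle_view = Shape(frozenset({"Norwegian_Beagles"}), frozenset({"Name"}))

print(potentially_useful(hotel_view, query))
print(potentially_useful(beagle_view, query))
```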

Grouping and aggregation ● How useful can views with grouping or aggregation be in solving the query? – If the view uses weaker predicates than the query, very useful – If the view uses stronger predicates, then perhaps as a subset of the results
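One way to see why a view with weaker predicates is useful: if the aggregated view keeps the filtered column as a grouping column, the query's stronger predicate can be applied to the view and the partial aggregates rolled up. The sketch below uses sqlite3. The `hotel_stats` table is a stand-in for a materialized view (SQLite has none), and the `Region` and `Stars` columns and all rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE hotels (Hotel_name TEXT, Region TEXT, Stars INTEGER, Price REAL);
    INSERT INTO hotels VALUES
        ('A', 'North', 3, 100), ('B', 'North', 3, 140),
        ('C', 'North', 4, 300), ('D', 'South', 2, 80),
        ('E', 'South', 4, 260);
    -- Stand-in for a materialized view: no predicate, finer grouping.
    CREATE TABLE hotel_stats AS
        SELECT Region, Stars, COUNT(*) AS n, SUM(Price) AS total_price
        FROM hotels GROUP BY Region, Stars;
""")

# The query has a stronger predicate (Stars >= 3) and coarser grouping.
direct = conn.execute(
    "SELECT Region, COUNT(*), SUM(Price) FROM hotels "
    "WHERE Stars >= 3 GROUP BY Region ORDER BY Region"
).fetchall()

# Same answer from the view: filter on the kept column, then re-aggregate.
from_view = conn.execute(
    "SELECT Region, SUM(n), SUM(total_price) FROM hotel_stats "
    "WHERE Stars >= 3 GROUP BY Region ORDER BY Region"
).fetchall()

print(direct)
print(from_view)
```

COUNT and SUM can be re-aggregated this way because they are distributive. An average would have to be derived from the rolled-up sum and count rather than by averaging per-group averages.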

Grouping and aggregation [Figure: data cube with dimensions Price, Distance, and Rooms] Adapted from: Essbase Database Administrator's Guide – Understanding Multidimensional Databases

Query Optimization ● For optimization purposes, views = tables to abstract logic ● Normally we want an equivalent rewriting ● For data integration, we may only want a contained rewriting – Although millions match, just get top 10 search results

Problem Statement ● How can we more efficiently answer queries using a predefined set of materialized views?

Efficiently answering a query ● Suppose a query like the following is being run very often: – SELECT attr1, attr2,..., attrN FROM t1 INNER JOIN t2 ON t1.some_attr = t2.id... OUTER JOIN tM ON t1.other_attr = tM.id – Lots of JOINs. – M tables must be joined. The operation will be expensive. – Can we do better? (Hint: yes)

Efficiently answering a query M Source Tables Result Set How can database systems determine which (if any) materialized views to use to solve the query? Materialized Views

Query Optimization Techniques ● Here are a few techniques: – Bottom-up (System-R style) – Transformational – Other

System-R style optimization 1. Identify potentially useful views 2. Termination testing 3. Pruning of plans 4. Combining partial plans

System-R style optimization 1. Identify potentially useful views – Here is where we use the concepts of query containment and equivalence discussed earlier – But to recap: “A view can be useful for a query if the set of relations it mentions overlaps with that of the query, and it selects some of the attributes selected by the query”

System-R style optimization 2. Termination testing – Differentiate partial query plans from complete query plans – Enumerate possible join orders and explore all partial paths [Figure: Source Tables, Materialized Views, Result Set]

System-R style optimization 3. Pruning of plans – A plan is pruned if a cheaper plan exists which contains it [Figure: Plan 0 (Cost: 30), Plan 1 (Cost: 25), Plan 2 (Cost: 18)]
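The pruning rule can be sketched as a dominance test. The code below reads "contains it" as "covers at least the same query relations", which is an interpretation rather than something the slide spells out. The plan costs come from the slide, while the coverage sets and the `PartialPlan` class are hypothetical. Physical properties such as sort order are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PartialPlan:
    name: str
    covers: frozenset  # query relations this plan already accounts for
    cost: float


def prune(plans):
    """Drop each plan that a strictly cheaper plan covers at least as well."""
    kept = []
    for p in plans:
        dominated = any(
            q.cost < p.cost and q.covers >= p.covers
            for q in plans
            if q is not p
        )
        if not dominated:
            kept.append(p)
    return kept


plans = [
    PartialPlan("Plan 0", frozenset({"A", "B"}), 30),       # coverage is assumed
    PartialPlan("Plan 1", frozenset({"A", "B", "C"}), 25),  # coverage is assumed
    PartialPlan("Plan 2", frozenset({"A", "B", "C"}), 18),  # coverage is assumed
]

survivors = [p.name for p in prune(plans)]
print(survivors)
```

With these assumed coverage sets only Plan 2 survives, because it is cheaper than both other plans and covers everything they cover.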

System-R style optimization 4. Combining partial plans – Consider different possible ways of joining the views – Use dynamic programming ● To find the optimal plan for Join({A, B, C, D}), find the optimal (cheapest) plan among – Join({A, B, C}, D) – Join({A, B, D}, C) – Join({A, C, D}, B) – Join({B, C, D}, A) ● Use recursion to solve each three-way subproblem ● Keep the cheapest of the four and discard the other three
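The recursion over subsets can be sketched with memoization. This is a toy version: the cardinalities, the uniform join selectivity, and the cost formula are invented for illustration, and only left-deep plans of the form Join(subplan, relation) are considered, matching the four alternatives listed on the slide.

```python
from functools import lru_cache

# Hypothetical cardinalities and a uniform join selectivity (assumptions).
CARD = {"A": 1000, "B": 100, "C": 10, "D": 500}
SELECTIVITY = 0.01


def result_size(relations):
    """Estimated row count after joining all of `relations`."""
    size = 1.0
    for r in relations:
        size *= CARD[r]
    return size * SELECTIVITY ** (len(relations) - 1)


@lru_cache(maxsize=None)
def best_plan(relations: frozenset):
    """Return (cost, plan) for the cheapest left-deep join over `relations`."""
    if len(relations) == 1:
        (only,) = relations
        return 0.0, only
    best = None
    for last in sorted(relations):
        rest = relations - {last}
        sub_cost, sub_plan = best_plan(rest)  # solved once, then cached
        # Toy cost: rows of the intermediate result times rows of `last`.
        cost = sub_cost + result_size(rest) * CARD[last]
        if best is None or cost < best[0]:
            best = (cost, f"Join({sub_plan}, {last})")
    return best  # the other candidates are discarded


cost, plan = best_plan(frozenset("ABCD"))
print(plan, cost)
```

Because each subset is solved once and cached, the number of subproblems grows with the number of subsets rather than with the number of join orders.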

System-R style optimization Source: An overview of Query Optimization in Relational Systems Chaudhuri, Surajit

Transformational query rewriting ● Top-down approach ● Cache materialized view metadata – Relations the view is composed of – Columns the view covers – Groupings the view applies – etc. ● Build a multiway search tree out of all views' metadata – It partitions the views by the above attributes – Idea is to reject irrelevant views quickly

A Filter Tree ● One tree level per condition: – Source table condition – Hub condition – Output column condition – Grouping columns – Range constrained columns – Residual predicate condition – Output expression condition – Grouping expression condition ● Leaf nodes point to sets of relevant views, e.g. {V1, V3}, {V7, V9, V10}, ...
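The idea of rejecting irrelevant views level by level can be sketched with a two-level index. This is a heavily simplified illustration that assumes only two of the listed conditions: the view must read at least the query's source tables, and it must output at least the query's output columns. The `FilterTree` class, the view names, and the schemas are hypothetical. Each level is scanned linearly here, whereas a real implementation would index the keys of each level.

```python
from collections import defaultdict


class FilterTree:
    """Two-level index over view metadata: source tables, then output columns."""

    def __init__(self):
        self._levels = defaultdict(lambda: defaultdict(set))

    def add(self, name, tables, columns):
        self._levels[frozenset(tables)][frozenset(columns)].add(name)

    def candidates(self, query_tables, query_columns):
        qt, qc = frozenset(query_tables), frozenset(query_columns)
        found = set()
        for tables, by_columns in self._levels.items():
            if not tables >= qt:
                continue  # whole subtree rejected without looking at its views
            for columns, names in by_columns.items():
                if columns >= qc:
                    found |= names
        return found


tree = FilterTree()
tree.add("V1", {"hotels"}, {"Hotel_name", "Price", "Distance"})
tree.add("V3", {"hotels", "cities"}, {"Hotel_name", "Price"})
tree.add("V7", {"beagles"}, {"Name"})

print(sorted(tree.candidates({"hotels"}, {"Hotel_name", "Price"})))
print(sorted(tree.candidates({"hotels"}, {"Hotel_name", "Distance"})))
```

Views that survive the tree are only candidates. A full matching test on predicates and expressions still has to be run on each of them.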

Other types of view rewriting ● Query Graph Model (QGM) – Split query into multiple boxes, and try to match the view's boxes with the query's

References ● [Hal0?] A. Y. Halevy. Answering Queries Using Views: A Survey. VLDB Journal, 10(4). ● [Ull97] Jeffrey D. Ullman. Information Integration Using Logical Views. ICDT. ● [Wik06] Wikipedia contributors (2006). View (database). Wikipedia, The Free Encyclopedia. ● [SKS02] Silberschatz, Korth, Sudarshan. Database System Concepts, 4th Ed. (100) ● [JL01] Jonathan Goldstein and Per-Ake Larson. Optimizing Queries Using Materialized Views: A Practical, Scalable Solution. In Proc. of SIGMOD, pages , ● [Ess06] IBM Corp. Essbase Analytic Services Database Administrator's Guide. Understanding Multidimensional Databases.

Recap 1) Introduction to views 2) Where views are used 3) How a database processes views 4) Query equivalence and containment 5) Using views to solve queries 6) Means of optimizing the above i. System-R, Transformational

Appendix (Misc. Slides)

Are tables views? ● Maybe yes: ● Maybe no: – Views aren't supposed to contain concrete data or take up space. Physical Representation: Logical Representation: