DOLAP'04 - Washington DC. Constructing Search Space for Materialized View Selection. Dimitri Theodoratos, Wugang Xu, New Jersey Institute of Technology.


Problem (1)
Many problems in databases require the selection of views to materialize. A general form of these problems is the following: given a set of queries, select a set of views to materialize such that a cost function is optimized and a number of constraints are satisfied.

Problem (2)
Examples of view selection problems in data warehousing:
- Given a set of queries to be satisfied by the data warehouse, select a set of views to materialize such that the combined query evaluation and view maintenance cost is minimized and the total size of the materialized views does not exceed the space allocated for materialization.
- Find the best global evaluation plan for multiple incremental maintenance expressions for materialized views.
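The first problem above can be illustrated with a small exhaustive search. This is only a sketch under stated assumptions, not the method of the talk: the candidate views `V1`, `V2`, `V3`, their sizes, maintenance costs, the additive `savings` model, and the function name `select_views` are all hypothetical, and a real cost model would come from the query optimizer.

```python
from itertools import combinations

def select_views(candidates, query_cost, maintenance_cost, size, space_limit):
    """Try every subset of candidate views and keep the cheapest one that fits."""
    best_set = frozenset()
    best_cost = query_cost(best_set)  # cost with nothing materialized
    names = sorted(candidates)
    for k in range(1, len(names) + 1):
        for combo in combinations(names, k):
            views = frozenset(combo)
            # Constraint: total size must not exceed the allocated space.
            if sum(size[v] for v in views) > space_limit:
                continue
            # Objective: query evaluation cost plus view maintenance cost.
            cost = query_cost(views) + sum(maintenance_cost[v] for v in views)
            if cost < best_cost:
                best_set, best_cost = views, cost
    return best_set, best_cost

# Hypothetical toy numbers (assumptions, not from the slides).
size = {"V1": 40, "V2": 70, "V3": 30}
maintenance = {"V1": 5, "V2": 8, "V3": 4}
savings = {"V1": 30, "V2": 50, "V3": 20}

def toy_query_cost(views):
    # Assumed additive model: each materialized view saves a fixed amount.
    return 100 - sum(savings[v] for v in views)

best_views, best_total = select_views(size.keys(), toy_query_cost,
                                      maintenance, size, space_limit=100)
```

The search is exponential in the number of candidate views, which is why the quality of the candidate set matters so much for this class of problems.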

Problem (3)
Solving view selection problems requires the identification of common subexpressions between queries. Usually, this is done by identifying equivalent (or subsumed) view nodes in the query evaluation plans of two queries in a bottom-up way. However, for this approach to be successful, all the alternative query evaluation plans of the queries need to be considered, which is infeasible.

Example - Query Evaluation Plans and common subexpressions (slides 5-9, figures only)

Our approach

Goals
- Formalize the concept of 'closeness' of a common subexpression to two queries.
- Design algorithms for computing common subexpressions that are as close to the queries as possible (these common subexpressions are called Closest Common Derivators).
We address these problems starting with select-project-join (SPJ) queries that involve self-joins.

Example
Q1: SELECT R1.A, R2.B, R3.C
    FROM U, R AS R1, R AS R2, R AS R3, S AS S1
    WHERE U.A = R1.A AND R1.B = 4 AND R3.A = 3
Q2: SELECT R4.C, R5.A, S3.C
    FROM S AS S2, R AS R4, R AS R5, S AS S3, T
    WHERE S2.C = 5 AND R5.A = 3

Query Graph Representation

Query rewritings
- A rewriting Q' of a query Q using a view V is a query that references V and possibly base relations such that replacing V by its definition results in a query equivalent to Q. Notation: Q |-- V.
- If there is a rewriting of Q that references only V (no base relations), we call it a complete rewriting. Notation: Q ||-- V. Otherwise, we call it a partial rewriting.
- A rewriting Q' of a query Q using a view V is called a simple rewriting if view V has a single occurrence in Q'.
- A rewriting Q' of a query Q using a view V is minimal if every relation R that has n, n > 0, occurrences in Q has k, 0 ≤ k ≤ n, occurrences in V and n - k occurrences in Q'. Notation: Q |--_m V.
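The counting condition in the definition of a minimal rewriting can be checked mechanically. The sketch below tests only the occurrence counts, not query equivalence; the function name `satisfies_minimal_counts` and the particular split of Q1's relation occurrences are illustrative assumptions.

```python
from collections import Counter

def satisfies_minimal_counts(query_occ, view_occ, rewriting_occ):
    """Check the occurrence-count condition of a minimal rewriting.

    Each argument lists base relation names, one entry per occurrence.
    For every relation with n occurrences in Q, the view must hold k of
    them (0 <= k <= n) and the rewriting Q' the remaining n - k.
    """
    q, v, r = Counter(query_occ), Counter(view_occ), Counter(rewriting_occ)
    # V and Q' may not mention relations that do not occur in Q.
    if any(rel not in q for rel in list(v) + list(r)):
        return False
    return all(v[rel] <= n and r[rel] == n - v[rel] for rel, n in q.items())

# Q1 of the running example references U, R (three times) and S.
q1 = ["U", "R", "R", "R", "S"]
ok = satisfies_minimal_counts(q1, ["R", "R", "S"], ["U", "R"])
too_many = satisfies_minimal_counts(q1, ["R", "R", "S"], ["U", "R", "R"])
```

Here `ok` is true because the view covers two occurrences of R and the rewriting keeps the third, while `too_many` is false because R would appear four times in total.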

Common Derivator (CD) of two queries
Let Q1 and Q2 be two queries and R1, R2 be two sets of relation occurrences from Q1 and Q2, respectively, that have the same number of relation occurrences of each relation. A common derivator (CD) of Q1 and Q2 over the respective sets R1 and R2 is a view V such that there is a minimal rewriting of Q1 (resp. Q2) using V that involves V and only those relation occurrences of Q1 (resp. Q2) that do not appear in R1 (resp. R2).
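The requirement that R1 and R2 contain the same number of occurrences of each relation can be illustrated on the example queries Q1 and Q2. This is a minimal sketch: the dictionaries map each alias to its base relation as given in the example, while the helper names `balanced` and `max_common_counts` are assumptions introduced here.

```python
from collections import Counter

# Alias -> base relation, taken from the FROM clauses of Q1 and Q2.
Q1_OCC = {"U": "U", "R1": "R", "R2": "R", "R3": "R", "S1": "S"}
Q2_OCC = {"S2": "S", "R4": "R", "R5": "R", "S3": "S", "T": "T"}

def balanced(set1, set2, occ1=Q1_OCC, occ2=Q2_OCC):
    """True if both occurrence sets hold each relation equally often."""
    return Counter(occ1[a] for a in set1) == Counter(occ2[a] for a in set2)

def max_common_counts(occ1=Q1_OCC, occ2=Q2_OCC):
    """Largest number of occurrences per relation that both queries share."""
    c1, c2 = Counter(occ1.values()), Counter(occ2.values())
    return dict(c1 & c2)  # multiset intersection: per-relation minimum

pair_ok = balanced({"R1", "R2", "S1"}, {"R4", "R5", "S2"})
pair_bad = balanced({"R1", "R2", "R3"}, {"R4", "R5"})
common = max_common_counts()
```

For these queries no CD can be defined over more than two occurrences of R and one occurrence of S, since U and T each appear in only one of the two queries.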

Example - Common Derivator (slides 16-17, figures only)

Closeness relationship between CDs
Let Q1 and Q2 be two queries, let V = π_X(σ_C(R)) be a CD of Q1 and Q2 over R1 and R2, and let V' = π_X(σ_C'(R')) be a CD of Q1 and Q2 over R1' and R2', where R1 ⊆ R1' and R2 ⊆ R2'. CD V' is closer to Q1 and Q2 than CD V if the following conditions are satisfied:
(a) V' |-- V
(b) if σ_C'(R') ||-- σ_C(R), then V ||-- V'

Example - Closeness relationship: V2 is closer to Q1 and Q2 than V1

Example - Closeness relationship: V3 is closer to Q1 and Q2 than V2

Example - Closeness relationship: V4 is closer to Q1 and Q2 than V3

Closest Common Derivator (CCD)
Let Q1 and Q2 be two queries. A Closest Common Derivator (CCD) of Q1 and Q2 over R1 and R2 is a CD V of Q1 and Q2 over R1 and R2 such that there exists no CD of Q1 and Q2 that is closer to Q1 and Q2 than V.

Example

How to compute a CCD
- Query graph in full form
- Condition merging
- Candidate CCDs
- Comparison of candidate CCDs over the same occurrence set

Full Form Condition and Query
A condition C is in full form if:
1. For every atomic condition A_i such that C |= A_i, there is an atomic condition A_j in C such that A_j |= A_i (|= denotes logical implication).
2. Condition C does not include strongly redundant atomic conditions.
A query π_X(σ_C(R)) is in full form if its condition C is in full form.
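Item 1 of the definition implies, for instance, that a condition containing A = B and B = C must also contain A = C, since no single atom of the original condition implies it. The sketch below covers only that special case, equality atoms between attributes, and does not address strong redundancy. The function name `equality_closure` and the second atom `R1.A = R2.A` are hypothetical (the slide's Q1 contains only `U.A = R1.A`).

```python
from itertools import combinations

def equality_closure(atoms):
    """Return every attribute equality implied by the given equality atoms.

    atoms: iterable of (attribute, attribute) pairs.
    The result is a set of two-element frozensets, one per implied equality.
    """
    classes = []  # disjoint sets of attributes known to be equal
    for a, b in atoms:
        merged = {a, b}
        rest = []
        for cls in classes:
            if cls & merged:
                merged |= cls  # the new atom links this class to the others
            else:
                rest.append(cls)
        classes = rest + [merged]
    return {frozenset(pair)
            for cls in classes
            for pair in combinations(sorted(cls), 2)}

# U.A = R1.A is from Q1; R1.A = R2.A is an added, hypothetical atom.
given = [("U.A", "R1.A"), ("R1.A", "R2.A")]
closure = equality_closure(given)
missing = closure - {frozenset(pair) for pair in given}
```

In this example `missing` contains the single equality between `U.A` and `R2.A`, which would have to be added to bring the equality part of the condition into full form.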

Example - Query graph full form (slides 26-27, figures only)

Condition Merging
Two conditions C1 and C2 are mergeable if there is a non-valid condition C such that C1 |= C and C2 |= C, and there exists no condition C', C' ≢ C, such that C1 |= C', C2 |= C' and C' |= C. Condition C is called a merge of C1 and C2. We show how the merge of two conditions can be computed.
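The slides do not reproduce the merge computation, so the following is only an illustration of the definition for one restricted case: satisfiable range conditions on a single attribute, each written as a closed interval `(low, high)`. Under that assumption, the strongest interval implied by both conditions is their convex hull, and the conditions are not mergeable when the hull is the whole domain (a valid, always-true condition). The names `implies` and `merge` are hypothetical.

```python
import math

def implies(c1, c2):
    """c1 |= c2 for interval conditions: c1 lies inside c2."""
    return c2[0] <= c1[0] and c1[1] <= c2[1]

def merge(c1, c2):
    """Tightest interval implied by both conditions, or None if it is valid."""
    hull = (min(c1[0], c2[0]), max(c1[1], c2[1]))
    if hull == (-math.inf, math.inf):
        return None  # only the always-true condition is implied by both
    return hull

# R3.A = 3 (from Q1) and R5.A = 3 (from Q2) merge to A = 3.
same = merge((3, 3), (3, 3))
# R1.B = 4 (from Q1) and a hypothetical B <= 6 merge to B <= 6.
wider = merge((4, 4), (-math.inf, 6))
# A <= 3 and A >= 5 share no non-valid interval consequence.
none = merge((-math.inf, 3), (5, math.inf))
```

A merge returned by this sketch is always implied by both inputs, which can be confirmed with `implies`.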

CCD Computation
- We introduce the concept of a candidate CCD: a graph representation of a CCD resulting from 'merging' common subparts of two query graphs.
- We show that a CCD of two queries is a candidate CCD.
- We express the CCD closeness relationship on candidate CCDs.

CCD Computation (2)
In order to compute all the CCDs of two queries:
- We compute all the candidate CCDs of two query graphs in full form.
- We discard a candidate CCD V if there is another CCD V' that is closer to the queries than V.
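The discard step above amounts to keeping the candidates that no other candidate dominates. The sketch below shows only that filtering pattern. The closeness test is passed in as a function, and the proper-superset test on occurrence sets used in the toy example is a stand-in assumption, not the closeness relationship defined in the talk, which also compares rewritings.

```python
def keep_closest(candidates, is_closer):
    """Keep each candidate V for which no other candidate is closer than V.

    is_closer(w, v) must return True when w is closer to the queries than v.
    """
    return [v for v in candidates
            if not any(is_closer(w, v) for w in candidates if w is not v)]

# Toy candidates, identified by the Q1 occurrences they cover (hypothetical).
v1 = frozenset({"R1"})
v2 = frozenset({"R1", "S1"})
v3 = frozenset({"R3"})

# Stand-in closeness: w covers strictly more occurrences than v.
survivors = keep_closest([v1, v2, v3], lambda w, v: w > v)
```

In the toy run `v1` is discarded because `v2` extends it, while `v2` and `v3` are incomparable and both survive.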

Future work
- Extend the concept of a CCD so that it applies to a more general class of queries.
- Use the concept of a CCD to identify common subexpressions within one query.
- Use the concept of a CCD to design algorithms for different materialized view selection problems.

Thanks