CSE 636 Data Integration Answering Queries Using Views Bucket Algorithm.


2 The Bucket Algorithm
Each subgoal g of Q must be “covered” by some view
Make a list of candidates (buckets) per query subgoal
Consider combinations of candidates from different buckets
Not all combos are “compatible”
Keep the compatible ones and minimize them
Discard the ones contained in another
Take their union

3 q(X,Y,R) :- ForSale(X,Y,C,"auto"), Review(X,R,"auto"), Y > 1985
Step 1: For each subgoal, put the relevant sources into a bucket:
V1(name, year) :- ForSale(name, year, "France", "auto"), year > 1990 would be relevant
V3(name, year) :- ForSale(name, year, "France", "cheese") would be irrelevant
Step 2: Take the Cartesian product of the buckets
Algorithm produces maximally contained rewriting
Ignores interactions between subgoals in Step 1
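The relevance test of Step 1 can be sketched in Python. This is a simplified illustration, not the full test: it only compares a query subgoal with a view atom position by position and rejects the pair when two different constants meet. The function names and the convention that constants are written in double quotes are assumptions of this sketch; repeated variables and arithmetic predicates are not handled.

```python
def is_constant(term):
    # Assumed convention for this sketch: constants are written in double quotes.
    return term.startswith('"')


def may_unify(query_atom, view_atom):
    """Return True if the two atoms agree on predicate, arity, and constants."""
    q_pred, q_args = query_atom
    v_pred, v_args = view_atom
    if q_pred != v_pred or len(q_args) != len(v_args):
        return False
    # A position clashes only when both sides are constants and they differ.
    return all(
        q == v or not (is_constant(q) and is_constant(v))
        for q, v in zip(q_args, v_args)
    )


subgoal = ("ForSale", ["X", "Y", "C", '"auto"'])
v1_atom = ("ForSale", ["name", "year", '"France"', '"auto"'])
v3_atom = ("ForSale", ["name", "year", '"France"', '"cheese"'])

print(may_unify(subgoal, v1_atom))  # V1 is a candidate for the ForSale bucket
print(may_unify(subgoal, v3_atom))  # V3 is rejected: "auto" clashes with "cheese"
```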

4 The Bucket Algorithm: Example
V1(Std,Crs,Qtr,Title) :- reg(Std,Crs,Qtr), course(Crs,Title), Crs ≥ 500, Qtr ≥ Aut98
V2(Std,Prof,Crs,Qtr) :- reg(Std,Crs,Qtr), teaches(Prof,Crs,Qtr)
V3(Std,Crs) :- reg(Std,Crs,Qtr), Qtr ≤ Aut94
V4(Prof,Crs,Title,Qtr) :- reg(Std,Crs,Qtr), course(Crs,Title), teaches(Prof,Crs,Qtr), Qtr ≤ Aut97
q(S,C,P) :- teaches(P,C,Q), reg(S,C,Q), course(C,T), C ≥ 300, Q ≥ Aut95
Step 1: For each query subgoal, put the relevant sources into a bucket

5 The Bucket Algorithm: Example
V1(Std,Crs,Qtr,Title) :- reg(Std,Crs,Qtr), course(Crs,Title), Crs ≥ 500, Qtr ≥ Aut98
V2(Std,Prof,Crs,Qtr) :- reg(Std,Crs,Qtr), teaches(Prof,Crs,Qtr)
V3(Std,Crs) :- reg(Std,Crs,Qtr), Qtr ≤ Aut94
V4(Prof,Crs,Title,Qtr) :- reg(Std,Crs,Qtr), course(Crs,Title), teaches(Prof,Crs,Qtr), Qtr ≤ Aut97
q(S,C,P) :- teaches(P,C,Q), reg(S,C,Q), course(C,T), C ≥ 300, Q ≥ Aut95
Subgoal teaches(P,C,Q): P → Prof, C → Crs, Q → Qtr
Note: Arithmetic predicates don’t pose a problem
Buckets:
teaches: V2, V4
reg: (empty)
course: (empty)

6 The Bucket Algorithm: Example
V1(Std,Crs,Qtr,Title) :- reg(Std,Crs,Qtr), course(Crs,Title), Crs ≥ 500, Qtr ≥ Aut98
V2(Std,Prof,Crs,Qtr) :- reg(Std,Crs,Qtr), teaches(Prof,Crs,Qtr)
V3(Std,Crs) :- reg(Std,Crs,Qtr), Qtr ≤ Aut94
V4(Prof,Crs,Title,Qtr) :- reg(Std,Crs,Qtr), course(Crs,Title), teaches(Prof,Crs,Qtr), Qtr ≤ Aut97
q(S,C,P) :- teaches(P,C,Q), reg(S,C,Q), course(C,T), C ≥ 300, Q ≥ Aut95
Subgoal reg(S,C,Q): S → Std, C → Crs, Q → Qtr
Note:
V3 doesn’t work: arithmetic predicates not consistent
V4 doesn’t work: S not in the output of V4
Buckets:
teaches: V2, V4
reg: V1, V2
course: (empty)

7 The Bucket Algorithm: Example
V1(Std,Crs,Qtr,Title) :- reg(Std,Crs,Qtr), course(Crs,Title), Crs ≥ 500, Qtr ≥ Aut98
V2(Std,Prof,Crs,Qtr) :- reg(Std,Crs,Qtr), teaches(Prof,Crs,Qtr)
V3(Std,Crs) :- reg(Std,Crs,Qtr), Qtr ≤ Aut94
V4(Prof,Crs,Title,Qtr) :- reg(Std,Crs,Qtr), course(Crs,Title), teaches(Prof,Crs,Qtr), Qtr ≤ Aut97
q(S,C,P) :- teaches(P,C,Q), reg(S,C,Q), course(C,T), C ≥ 300, Q ≥ Aut95
Subgoal course(C,T): C → Crs, T → Title
Buckets:
teaches: V2, V4
reg: V1, V2
course: V1, V4
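The bucket construction for this example can be reproduced with a short Python sketch. It makes several assumptions that are not in the slides: quarters such as Aut98 are encoded as the integer 98, each arithmetic predicate is stored as a (low, high) interval on one variable, and buckets are keyed by predicate name (which works here only because the query uses each predicate once). A view enters a bucket when it has an atom with the same predicate, every query head variable in the subgoal maps to a view head variable, and the intervals of mapped variables overlap.

```python
INF = float("inf")

QUERY = {
    "head": ["S", "C", "P"],
    "body": [("teaches", ["P", "C", "Q"]), ("reg", ["S", "C", "Q"]), ("course", ["C", "T"])],
    "bounds": {"C": (300, INF), "Q": (95, INF)},  # C >= 300, Q >= Aut95
}

VIEWS = {
    "V1": {"head": ["Std", "Crs", "Qtr", "Title"],
           "body": [("reg", ["Std", "Crs", "Qtr"]), ("course", ["Crs", "Title"])],
           "bounds": {"Crs": (500, INF), "Qtr": (98, INF)}},
    "V2": {"head": ["Std", "Prof", "Crs", "Qtr"],
           "body": [("reg", ["Std", "Crs", "Qtr"]), ("teaches", ["Prof", "Crs", "Qtr"])],
           "bounds": {}},
    "V3": {"head": ["Std", "Crs"],
           "body": [("reg", ["Std", "Crs", "Qtr"])],
           "bounds": {"Qtr": (-INF, 94)}},
    "V4": {"head": ["Prof", "Crs", "Title", "Qtr"],
           "body": [("reg", ["Std", "Crs", "Qtr"]), ("course", ["Crs", "Title"]),
                    ("teaches", ["Prof", "Crs", "Qtr"])],
           "bounds": {"Qtr": (-INF, 97)}},
}


def overlap(a, b):
    # Two closed intervals intersect when the larger low is at most the smaller high.
    return max(a[0], b[0]) <= min(a[1], b[1])


def covers(query, view, q_args, v_args):
    for q, v in zip(q_args, v_args):
        if q in query["head"] and v not in view["head"]:
            return False  # a needed output variable is projected away by the view
        q_bound = query["bounds"].get(q, (-INF, INF))
        v_bound = view["bounds"].get(v, (-INF, INF))
        if not overlap(q_bound, v_bound):
            return False  # arithmetic predicates are inconsistent
    return True


def build_buckets(query, views):
    buckets = {}
    for pred, q_args in query["body"]:
        buckets[pred] = [
            name for name, view in views.items()
            if any(v_pred == pred and covers(query, view, q_args, v_args)
                   for v_pred, v_args in view["body"])
        ]
    return buckets


print(build_buckets(QUERY, VIEWS))
```

Under these assumptions the sketch excludes V3 from the reg bucket because of its Qtr bound and excludes V4 because Std is not in its head, matching the notes on the slides.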

8 The Bucket Algorithm: Example
Step 2: Try all combos of views, one from each bucket
Test satisfaction of arithmetic predicates in each case
– e.g., two views may not overlap, i.e., they may be inconsistent
Desired rewriting = union of surviving ones
Query rewriting 1: q1(S,C,P) :- V2(S',P,C,Q), V1(S,C,Q,T'), V1(S'',C,Q',T)
– no problem from arithmetic predicates (none in V2)
– may or may not be minimal (why?)
Buckets:
teaches: V2, V4
reg: V1, V2
course: V1, V4
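Enumerating the candidate combinations of Step 2 amounts to a Cartesian product over the buckets. The following Python sketch uses the bucket contents from the example; the dictionary layout and the function name are assumptions made for illustration. It only lists candidates and does not perform the consistency or containment checks.

```python
from itertools import product

# Bucket contents taken from the example; one entry per query subgoal.
BUCKETS = {
    "teaches": ["V2", "V4"],
    "reg": ["V1", "V2"],
    "course": ["V1", "V4"],
}


def candidate_combinations(buckets):
    """Return every way of picking one view per bucket, in bucket order."""
    return list(product(*buckets.values()))


combos = candidate_combinations(BUCKETS)
for combo in combos:
    print(dict(zip(BUCKETS, combo)))
print(len(combos), "candidates to check")
```

The number of candidates is the product of the bucket sizes, so it grows quickly with the number of subgoals and views, and each candidate still needs its own check.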

9 The Bucket Algorithm: Example
Unfolding of rewriting 1:
q1'(S,C,P) :- r(S',C,Q), t(P,C,Q), r(S,C,Q), c(C,T'), r(S'',C,Q'), c(C,T), C ≥ 500, Q ≥ Aut98, C ≥ 500, Q' ≥ Aut98
The subgoals r(S',C,Q) and r(S'',C,Q') (black on the slide) can be mapped to r(S,C,Q) (green on the slide): S' → S, S'' → S, Q' → Q
The subgoal c(C,T) (black) can be mapped to c(C,T') (green): just extend the above mapping with T → T'
Minimized unfolding of rewriting 1:
q1m'(S,C,P) :- t(P,C,Q), r(S,C,Q), c(C,T'), C ≥ 500, Q ≥ Aut98
Minimized rewriting 1:
q1m(S,C,P) :- V2(S',P,C,Q), V1(S,C,Q,T')
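The mapping used in this minimization can be found mechanically. The Python sketch below is a brute-force search for a variable mapping that sends every relational subgoal of the full unfolding onto a subgoal of the minimized unfolding while keeping the head variables S, C, P fixed. The function name and data layout are assumptions of this sketch, and arithmetic predicates are left out of the search (here they stay satisfied because Q' is mapped to Q).

```python
from itertools import product

# Relational subgoals of the unfolding of rewriting 1 (arithmetic predicates omitted).
FULL = [
    ("r", ("S'", "C", "Q")), ("t", ("P", "C", "Q")), ("r", ("S", "C", "Q")),
    ("c", ("C", "T'")), ("r", ("S''", "C", "Q'")), ("c", ("C", "T")),
]
# Relational subgoals of the minimized unfolding.
MINIMIZED = [("t", ("P", "C", "Q")), ("r", ("S", "C", "Q")), ("c", ("C", "T'"))]
HEAD_VARS = {"S", "C", "P"}


def find_mapping(source_atoms, target_atoms, fixed):
    """Try every assignment of non-fixed source variables to target variables."""
    source_vars = sorted({a for _, args in source_atoms for a in args})
    free = [v for v in source_vars if v not in fixed]
    target_vars = sorted({a for _, args in target_atoms for a in args})
    target_set = set(target_atoms)
    for image in product(target_vars, repeat=len(free)):
        h = {v: v for v in fixed}
        h.update(zip(free, image))
        if all((pred, tuple(h[a] for a in args)) in target_set
               for pred, args in source_atoms):
            return h
    return None


mapping = find_mapping(FULL, MINIMIZED, HEAD_VARS)
print(mapping)
```

A mapping in the opposite direction exists trivially, since every subgoal of the minimized unfolding already appears in the full one, so the two bodies are equivalent as far as the relational subgoals are concerned. The exhaustive search is exponential in the number of variables and is only meant for small examples like this one.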

10 The Bucket Algorithm: Example
Query rewriting 2: q2(S,C,P) :- V2(S',P,C,Q), V1(S,C,Q,T'), V4(P',C,T,Q')
q2'(S,C,P) :- r(S',C,Q), t(P,C,Q), r(S,C,Q), c(C,T'), C ≥ 500, Q ≥ Aut98, r(S'',C,Q'), c(C,T), t(P',C,Q'), Q' ≤ Aut97
This combo is infeasible: consider the conjunction of arithmetic predicates in V1 and V4
Query rewriting 3: q3(S,C,P) :- V2(S',P,C,Q), V2(S,P',C,Q), V4(P'',C,T,Q')
Buckets:
teaches: V2, V4
reg: V1, V2
course: V1, V4

11 The Bucket Algorithm: Example
Unfolding of rewriting 3:
q3'(S,C,P) :- r(S',C,Q), t(P,C,Q), r(S,C,Q), t(P',C,Q), r(S'',C,Q'), c(C,T), t(P'',C,Q'), Q' ≤ Aut97
The green subgoals can cover the black ones under the mapping: S' → S, S'' → S, P' → P, P'' → P, Q' → Q
Minimized rewriting 3:
q3m(S,C,P) :- V2(S,P,C,Q), V4(P,C,T,Q)
Verify that there are only two rewritings that are not covered by others
Maximally contained rewriting: q' = q1m ∪ q3m

12 The Bucket Algorithm: Example 2
Query: q(X) :- cites(X,Y), cites(Y,X), sameTopic(X,Y)
Views:
V4(A) :- cites(A,B), cites(B,A)
V5(C,D) :- sameTopic(C,D)
V6(F,H) :- cites(F,G), cites(G,H), sameTopic(F,G)
Note: Should we list V4(X) twice in the buckets?
Buckets:
cites(X,Y): V4, V6
cites(Y,X): V4, V6
sameTopic(X,Y): V5, V6

13 The Bucket Algorithm: Example 2
Consider all combos and check for containment of the unfolded rewriting in Q
V4(X) cannot be combined with anything (why?)
Try q1(X) :- V4(X), V4(X), V5(X,Y)
Try q2(X) :- V4(X), V6(X,Y), V5(X,Y)
Do any of these work?
When can we discard a view from consideration?

14 The Bucket Algorithm: Example 2
Here is a successful rewriting: q3(X) :- V6(X,Y), V6(X,Y), V6(X,Y)
By itself, it is not contained in Q
But with the subgoal X = Y added, it is!
By minimizing the rewriting, we get: q3m(X) :- V6(X,X)
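Why V6(X,X) works can be seen by unfolding it, that is, replacing the view atom by the view's body with the head variables substituted. The Python sketch below is a minimal unfolding routine under assumed conventions: views are stored as (head variables, body atoms), and existential variables of a view are made fresh by appending the position of the view atom in the rewriting, which presumes no other variable already uses such a name.

```python
VIEWS = {
    "V4": (("A",), [("cites", ("A", "B")), ("cites", ("B", "A"))]),
    "V5": (("C", "D"), [("sameTopic", ("C", "D"))]),
    "V6": (("F", "H"), [("cites", ("F", "G")), ("cites", ("G", "H")),
                        ("sameTopic", ("F", "G"))]),
}

QUERY_BODY = [("cites", ("X", "Y")), ("cites", ("Y", "X")), ("sameTopic", ("X", "Y"))]


def unfold(rewriting, views):
    """Replace each view atom by the view body, renaming existential variables."""
    body = []
    for i, (name, args) in enumerate(rewriting):
        head, atoms = views[name]
        substitution = dict(zip(head, args))
        for pred, terms in atoms:
            # Head variables take the caller's arguments; others become fresh names.
            body.append((pred, tuple(substitution.get(t, f"{t}_{i}") for t in terms)))
    return body


unfolded = unfold([("V6", ("X", "X"))], VIEWS)
print(unfolded)

# Renaming the fresh variable G_0 to Y yields exactly the body of the query.
renamed = [(p, tuple("Y" if t == "G_0" else t for t in ts)) for p, ts in unfolded]
print(renamed == QUERY_BODY)
```

Unfolding V4(X), V5(X,Y) with the same routine gives cites(X,B_0), cites(B_0,X), sameTopic(X,Y), where B_0 and Y are unrelated variables, which illustrates why V4 cannot be combined with the other views to cover the query.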

15 The Bucket Algorithm: Example 2
Remarks:
V4 didn’t contribute to any rewriting, but the bucket algorithm doesn’t recognize this ahead of time
Consider: q2(X,Y) :- cites(X,Y), cites(Y,X)
Then both cites predicates can be folded into V4
– Not recognized by the bucket algorithm

16 The State of Affairs
Bucket algorithm:
– deals well with predicates
– Cartesian product can be large (containment check required for every candidate rewriting)
Inverse rules:
– modular (extensible to binding patterns, FDs)
– no treatment of predicates
– resulting rewritings need significant further optimization
Neither scales up
The MiniCon algorithm:
– change perspective: look at query variables

17 References
Querying Heterogeneous Information Sources Using Source Descriptions
– By Alon Y. Levy, Anand Rajaraman and Joann J. Ordille
– VLDB, 1996
Laks V.S. Lakshmanan
– Lecture Slides
Alon Halevy
– Answering Queries Using Views: A Survey
– VLDB Journal, 2001