Presentation is loading. Please wait.

Presentation is loading. Please wait.

Answering Queries Using Views Presented by: Mahmoud ELIAS.

Similar presentations


Presentation on theme: "Answering Queries Using Views Presented by: Mahmoud ELIAS."— Presentation transcript:

1

2 Answering Queries Using Views Presented by: Mahmoud ELIAS

3 Plan Introduction Motivations Views Problem definition When is a view usable for a query? Answering queries using views for Data Integration Conclusions Bibliography

4 Introduction Relevance to a wide variety of data management problems Example: University schema: –Prof(name, area) –Course(c-number, title) –Teaches(prof, c-number, quarter, evaluation) –Registered(student, c-number, quarter) –Major(student, dept) –WorksIn(prof, dept) –Advises(prof, student)

5 Motivations Query optimization: +Speed up the computation of the query. –Be attention of indexes. Maintaining physical Data independence Vs Vs Data Integration: –uniform query interface to a multitude of autonomous heterogeneous data sources

6 Motivation (cont.) Data warehouse design: – we must be able to answer all the required queries over the warehouse using only these views Semantic data caching –Check whether the cached results of a previously computed query can be used for a new query

7 Choosing and creating views System analysis feedback statistics Building views The user will never be satisfied

8 Problem definition A view is a derived relation defined in terms of stored base relations. A query Q 1 is said to be contained in a query Q 2 (Q 1  Q 2 ), if for all databases D, the set of tuples computed for Q 1 is a subset of those computed for Q 2 Two queries are said to be equivalent if Q 1  Q 2 and Q 2  Q 1 Given a query Q and a set of view definitions V 1,V 2,…,V m, a rewriting of the query using the views is a query expression Q’ that refers only to the given views. Equivalent rewriting vs. maximally-contained rewriting

9 When is a View usable for a Query? A view can be useful for a query if the set of relations it mentions overlaps with that of the query, and it selects some of the attributes selected by the query. When the view contains grouping and aggregation but the query does not, then unless the query removes duplicates in the select clause, the view cannot be used to answer a query.

10 Example1: selectAdvises.prof, Advises.student, Registered.quarter fromRegistered, Teaches, Advises whereRegistered.c-number=Teaches.c-number and Registered.quarter=Teaches. quarter and Advises.prof=Teaches.prof and Advises.student= Registered.student and Registered.quarter ≥ « winter98 » create view V1 as selectRegistered.student, Teaches.prof, Registered.quarter fromRegistered, Teaches whereRegistered.c-number=Teaches.c-number and Registered.quarter=Teaches. quarter and Registered.quarter ≥ « winter97 »

11 Example2: create view V2 as selectc-number, year, max(evaluation) as maxeval, count(*) as offerings fromTeaches wherec-number ≥ 400 groupByc-number, year selectyear, count(*), max(evaluation) fromTeaches wherec-number ≥ 500 groupByyear selectyear, sum(offerings), Max(maxeval) fromV2 wherec-number ≥ 500 groupByyear

12 Conjunctive queries Safe - each variable in the head appears in the body. No arithmetic comparisons in predicates. headbody subgoal

13 The Bucket algorithm Nb of query rewritings that need to be considered can be drastically reduced if we first consider each subgoal in the query in isolation, and determine which views may be relevant to each subgoal. 2 steps: –Create a bucket for each subgoal in Q –Query rewritings that are conjunctive queries, each consisting of one conjunct from every bucket.

14 Example V1(st,c-n,qu,ti):- Registered(st,c-n,qu), Course(c-n,ti), c-n ≥ 500, qu ≥ Aur98 V2(st,pr,c-n,qu):- Registered(st,c-n,qu), Teaches(pr,c-n,qu) V3(st,c-n):- Registered(st,c-n,qu), year ≤ Aut94 V4(pr,c-n,ti,qu):- Registered(st,c-n,qu), Course(c-n,ti), Teaches(pr,c-n,qu), qu ≤ Aut97 Q(S,C,P) :- Teaches(P,C,Q), Registered(S,C,Q), Course(C,T), C ≥300, Q ≥ Aut95 Teaches(P,C,Q)Registered(S,C,Q)Course(C,T) V2(S’,P,C,Q) V4(P,C,T’,Q) V1(S,C,Q,T’) V2(S,P’,C,Q) V1(S’,C,Q’,T) V4(P’,C,T,Q’)

15 Example (cont.) All combinations: q’(S,C,P) :- V2(S’,P,C,Q), V1(S,C,Q,T’), V1(S’,C,Q’,T)  q’(S,C,P) :- V2(S’,P,C,Q), V1(S,C,Q,T') q’(S,C,P) :- V4(P,C,T’,Q), V1(S,C,Q,T’), V4(P’,C,T,Q’) q’(S,C,P) :- V2(S,P,C,Q), V4(P,C,T’,Q) The algorithm produces a maximally-contained rewriting Teaches(P,C,Q)Registered(S,C,Q)Course(C,T) V2(S’,P,C,Q) V4(P,C,T’,Q) V1(S,C,Q,T’) V2(S,P’,C,Q) V1(S’,C,Q’,T) V4(P’,C,T,Q’)

16 The Inverse-rules algorithm Construct a set of rules that invert the view definitions V3(dept, c-number) :- Major(student, dept), Registered(student,c-number) Major(f 1 (dept,X), dept) :- V3(dept,X) Registered(f 1 (Y,c-number),c-number) :- V3(Y,c-number) Q(dept) :- Major(student,dept), Registered(student,444) V3 = {(CS, 444), (EE, 444), (CS,333)}  Registered = {(f 1 (CS,444), CS), (f 1 (EE,444), EE), (f 1 (CS,333), CS)} Major = {(f 1 (CS,444), 444), (f 1 (EE,444), 444), (f 1 (CS,333), 333)}  Q = {CS, EE} more efficient rewriting  unfold the inverse rules and remove redundant subgoals from the unfolded rules.

17 Bucket vs. Inverse-rules Both algorithms produce a maximally-contained rewriting Computing buckets is similar in spirit to that of computing the inverse rules: compute the views that are relevant to single atoms of the DB relations. The Bucket algo. Computes the relevant views by taking into consideration the context in which th atom appears in the query. The inverse rules can be computed once, and be applicable to any query.

18 The MiniCon algorithm It addresses the limitations of previous algorithms instead of building rewritings by combining rewritings for each of the query subgoal or the DB relation, we consider how each of the variables in the query can interact with the available views (MiniCon Description MCD) Q(D) :- Major(S,D), Registered(S,444,Q), Advises(P,S) V1(dept) :- Major(student,dept), Registered(student,444,quarter) V2(prof,dept,area) :- Advises(prof,student), Prof(name,area) V3(dept,c-number) :- Major(student,dept), Registered(student,c- number,quarter), Advises(prof,student)

19 Results The key advantage of the MiniCon algorithm is that the second phase of the algorithm considers much fewer combinations of MCDs compared to the Cartesian product of the buckets or compared to the number of unfoldings of inverse rules.

20 Conclusions Using views to answer queries is an important problem. Especially for information integration on the web. Query containment and containment mappings provide the key for solving the problem.

21 Conclusions (cont.) The variants of the problem are NP- complete. This is not too bad, since queries are usually short. In many practical cases, there is an algorithm for solving the problem.

22 Bibliography [Lev00]Alon Y. Levy. Answering Queries Using Views: A Survey. Department of Computer Science and Engineering, University of Washington, pages 1-43, 2000 [Mit99]Prasenjit Mitra. An Algorithm for Answering Queries Efficiently Using Views. Infolab, Stanford University, pages 1-13, September,1999. [DG97b]Oliver M. Duschka and Michael R. Genesereth. Query Planning in InfoMaster. In Proceedings of the ACM Symposium on Applied Computing, San Jose, CA, 1997.


Download ppt "Answering Queries Using Views Presented by: Mahmoud ELIAS."

Similar presentations


Ads by Google