ICDT'2001, London, UK. On Answering Queries in the Presence of Limited Access Patterns. Chen Li, Stanford University. Joint work with Edward Chang, UC Santa Barbara.

1 ICDT'2001, London, UK. On Answering Queries in the Presence of Limited Access Patterns. Chen Li, Stanford University. Joint work with Edward Chang, UC Santa Barbara.

2 A movie database
r(Star, Movie):
  Harrison Ford | Air Force One
  Henry Fonda | On Golden Pond
  Kevin Spacey | American Beauty
  …
s(Movie, Award):
  On Golden Pond | Oscar, Best Actor
  On Golden Pond | Oscar, Best Actress
  American Beauty | Oscar, Best Picture
  …
Q(Award) :- r(henry fonda, Movie), s(Movie, Award)

3 Limited access patterns (relations r and s as on slide 2)
r(Star, Movie): should provide a star.
s(Movie, Award): should provide a movie.

4 Answering Q given the restrictions (relations r and s as on slide 2)
Q(Award) :- r(henry fonda, Movie), s(Movie, Award)

5 The answer is complete (relations r and s as on slide 2)
Q(Award) :- r(henry fonda, Movie), s(Movie, Award)
We did not retrieve all the tuples from the relations. Still, we computed all tuples in the answer to the query.
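The walk-through on slides 3 to 5 can be illustrated with a minimal Python sketch. It assumes the binding patterns r(Star^b, Movie^f) and s(Movie^b, Award^f) and uses only the tuples shown on slide 2. The function names `access_r`, `access_s`, and `answer_q` are hypothetical stand-ins for sources that answer only lookups with a bound attribute; they are not part of the talk.

```python
# Toy data: only the tuples visible on the slide (the elided rows are omitted).
R = [("Harrison Ford", "Air Force One"),
     ("Henry Fonda", "On Golden Pond"),
     ("Kevin Spacey", "American Beauty")]
S = [("On Golden Pond", "Oscar, Best Actor"),
     ("On Golden Pond", "Oscar, Best Actress"),
     ("American Beauty", "Oscar, Best Picture")]

def access_r(star):
    """Hypothetical source for r(Star^b, Movie^f): a star must be supplied."""
    return [movie for (s, movie) in R if s == star]

def access_s(movie):
    """Hypothetical source for s(Movie^b, Award^f): a movie must be supplied."""
    return [award for (m, award) in S if m == movie]

def answer_q(star):
    """Q(Award) :- r(star, Movie), s(Movie, Award), using only legal accesses."""
    awards = set()
    for movie in access_r(star):         # bind Movie from the constant star
        awards.update(access_s(movie))   # Movie is now bound, so s is accessible
    return awards

result = answer_q("Henry Fonda")
print(result)
```

Only the tuples reachable from the constant in the query are retrieved, yet every award in the answer is found, which is the point of the slide.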

6 Change the restriction (relations r and s as on slide 2)
Q(Award) :- r(henry fonda, Movie), s(Movie, Award)
We cannot compute the complete answer to Q. There can always be some tuples that are not retrievable.

7 General questions
Given a query on relations with limited access patterns, can we compute its complete answer by accessing the relations with legal patterns?
– Stable queries
– Different classes of queries
Another problem studied: testing query containment in the presence of binding patterns.

8 Rest of the talk
Binding patterns, query stability
Testing stability of queries:
– Conjunctive queries
– Unions of conjunctive queries
– Conjunctive queries with arithmetic comparisons
– Datalog queries
Dynamic computability of complete answer to conjunctive queries
Conclusion and related work

9 (I) Binding patterns
Attributes with adornments:
– b: bound
– f: free
Example: r(Star^b, Movie^f), s(Movie^b, Award^f)
A relation can have multiple binding patterns.

10 Reasons for the restrictions:
– Web search forms
– Legacy databases
– Security concerns
Observation: if a relation does not have an “all-free” binding pattern, then after certain queries are sent to this relation, there can always be some tuples that have not been retrieved.

11 Query stability
A query Q on relations with binding patterns is stable if, for any database, we can compute Q’s complete answer by accessing the relations with legal patterns.
The complete answer is the answer we could compute if we could retrieve all the tuples from the relations.
Only part of the tuples can be retrieved, so deriving the complete answer from them requires reasoning.

12 Assumptions about bindings
Use values from Q and results from the relations as bindings:
– The definition says “for any database”
– Relations not in the query can be assumed to be empty
Not allowed: trying arbitrary strings as bindings to access the relations
– Does not terminate
– Impractical

13 (II) Testing stability of queries
Conjunctive query: q(X) :- g_1(X_1), …, g_n(X_n)
Feasible order of some subgoals of a CQ Q:
– Each subgoal in the order is executable
– That is, we have enough bound variables to satisfy one binding pattern of the relation
Example: Q(Award) :- r(henry fonda, Movie), s(Movie, Award)

14 Feasible CQs
A CQ is feasible if it has a feasible order of all its subgoals.
Lemma: A feasible CQ is stable.
Testing feasibility of a CQ:
– A greedy algorithm: Inflationary
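The slide names the greedy algorithm Inflationary but does not spell it out. What follows is a minimal Python sketch of one plausible reading, not the authors' exact procedure: repeatedly pick any subgoal that satisfies one of its relation's binding patterns, and mark its variables as bound. The query encoding (tuples of relation name and arguments, with variables starting in uppercase) and the function names are assumptions made for illustration.

```python
def is_var(term):
    # Assumed convention: variables start with an uppercase letter,
    # constants (such as "henry fonda") do not.
    return term[:1].isupper()

def executable(goal, bound, patterns):
    """True if some binding pattern of the relation is satisfied: every
    'b' position holds a constant or an already bound variable."""
    rel, args = goal
    return any(
        all(ad == "f" or not is_var(a) or a in bound for a, ad in zip(args, p))
        for p in patterns[rel]
    )

def inflationary(body, patterns):
    """Greedy pass: returns (feasible order found, subgoals never executable)."""
    bound, order, remaining = set(), [], list(body)
    progress = True
    while remaining and progress:
        progress = False
        for goal in list(remaining):
            if executable(goal, bound, patterns):
                order.append(goal)
                remaining.remove(goal)
                bound.update(a for a in goal[1] if is_var(a))
                progress = True
    return order, remaining

patterns = {"r": ["bf"], "s": ["bf"]}
q = [("s", ("Movie", "Award")), ("r", ("henry fonda", "Movie"))]
order, blocked = inflationary(q, patterns)
print(order, blocked)
```

The query is feasible exactly when the second component is empty. Because executing a subgoal can only add bound variables, the greedy choice never rules out a subgoal that some other order could have executed.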

15 What if Q is not feasible?
Q’(Award) :- r(henry fonda, Movie), s(Movie, Award), r(Star, Movie)
Not feasible: variable Star cannot be bound.
Equivalent to the old query: Q(Award) :- r(henry fonda, Movie), s(Movie, Award)
The new query Q’ is stable!

16 Testing stability of a CQ
Theorem: A CQ Q is stable iff its minimal equivalent Q_m is feasible.
Minimal equivalent query Q_m: Q_m is unique.

17 Main idea of the proof
Construct two databases of the relations.
They have the same observable tuples, but yield different answers to the query.
Thus, we cannot tell whether the computed answer is complete or not.
(Diagram: Database D1 and Database D2 share the same observable tuples but give different answers to Q.)

18 Two algorithms for CQs
Algorithm CQStable
– Minimize Q to get its minimal equivalent Q_m
– Test feasibility of Q_m by calling Inflationary
Algorithm CQStable*
– Compute all executable subgoals of Q
– If all subgoals become executable, then Q is stable
– Otherwise, test equivalence between Q and the new query with the executable subgoals
CQStable* is more efficient than CQStable.
Testing stability of a CQ is NP-complete.
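The two steps of CQStable can be sketched in Python as follows. The slide does not say how minimization is carried out; this sketch assumes the standard containment-mapping approach, which drops a subgoal whenever the full body maps into the reduced body with the head variables fixed. The search is a brute-force backtracking one, consistent with the NP-completeness noted on the slide. All function names and the query encoding are hypothetical.

```python
def is_var(term):
    # Assumed convention: variables start with an uppercase letter.
    return term[:1].isupper()

def find_mapping(src, dst, mapping):
    """Backtracking search for a containment mapping that sends every
    subgoal in src to some subgoal in dst, extending `mapping`."""
    if not src:
        return mapping
    (rel, args), rest = src[0], src[1:]
    for rel2, args2 in dst:
        if rel2 != rel or len(args2) != len(args):
            continue
        m, ok = dict(mapping), True
        for a, b in zip(args, args2):
            if is_var(a):
                if m.setdefault(a, b) != b:   # variable already mapped elsewhere
                    ok = False
                    break
            elif a != b:                      # constants must map to themselves
                ok = False
                break
        if ok:
            found = find_mapping(rest, dst, m)
            if found is not None:
                return found
    return None

def minimize(head, body):
    """Drop redundant subgoals one at a time until none can be removed."""
    body, changed = list(body), True
    while changed:
        changed = False
        for i in range(len(body)):
            reduced = body[:i] + body[i + 1:]
            if find_mapping(body, reduced, {v: v for v in head}) is not None:
                body, changed = reduced, True
                break
    return body

def is_feasible(body, patterns):
    """Greedy feasibility test (a sketch of Inflationary)."""
    bound, remaining, progress = set(), list(body), True
    while remaining and progress:
        progress = False
        for goal in list(remaining):
            rel, args = goal
            if any(all(ad == "f" or not is_var(a) or a in bound
                       for a, ad in zip(args, p)) for p in patterns[rel]):
                bound.update(a for a in args if is_var(a))
                remaining.remove(goal)
                progress = True
    return not remaining

def cq_stable(head, body, patterns):
    # Theorem from the talk: Q is stable iff its minimal equivalent is feasible.
    return is_feasible(minimize(head, body), patterns)

q_prime = [("r", ("henry fonda", "Movie")), ("s", ("Movie", "Award")),
           ("r", ("Star", "Movie"))]
print(cq_stable(["Award"], q_prime, {"r": ["bf"], "s": ["bf"]}))
```

On the query Q’ from slide 15, minimization removes the subgoal r(Star, Movie), and the remaining two subgoals are feasible, so the sketch reports the query as stable.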

19 Other classes of queries
Unions of CQs: two algorithms
CQs with arithmetic comparisons:
– An algorithm for testing stability
Datalog queries:
– Undecidable
– We give a sufficient condition for stability of Datalog queries

20 (III) Dynamic computability of complete answer to CQs
For a nonstable CQ Q, the complete answer might still be computable for certain databases.

21 An example
Q1: ans(B) :- r(a,B,C), s(C,D)
Not stable.
For the following database, we can still compute Q1’s complete answer: {b1, b2}.
r(A^b, B^f, C^f):
  a | b1 | c1
  a | b2 | c2
  a | b2 | c3
  …
s(C^f, D^b):
  c1 | d1
  c2 | d2
  …
p(D^f):
  d1
  d2
  …

22 Change the head argument
Q2: ans(D) :- r(a,B,C), s(C,D)
Still not stable.
For the database on slide 21, with relations r(A^b, B^f, C^f), s(C^f, D^b), and p(D^f), we cannot compute Q2’s complete answer.

23 Difference between Q1 and Q2
Q1: ans(B) :- r(a,B,C), s(C,D)
Q2: ans(D) :- r(a,B,C), s(C,D)
Adornments: r is bff, s is fb.
Q1’s head argument B is bound by the executable subgoal r(a,B,C).
Q2’s head argument D is not bound by the executable subgoal r(a,B,C).

24 Generalization
q(X) :- g_1(X_1), …, g_k(X_k), g_{k+1}(X_{k+1}), …, g_n(X_n)
Executable subgoals: E = g_1(X_1), …, g_k(X_k)
If all arguments in X are bound in E:
– We might compute its complete answer.
– The computability is database dependent.
If some arguments in X are not bound in E:
– We can never compute its complete answer,
– unless the relation after the subgoals in E is empty.
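The case split on this slide can be sketched as a small Python classifier. It is meant for a query already known to be nonstable (for instance, a minimal CQ that is not feasible). It computes the executable subgoals E greedily and then checks whether every head variable is bound in E. The labels, function names, and query encoding are hypothetical, and head arguments are assumed to be variables.

```python
def is_var(term):
    # Assumed convention: variables start with an uppercase letter.
    return term[:1].isupper()

def executable_subgoals(body, patterns):
    """Greedily collect the subgoals that can be executed; returns (E, blocked)."""
    bound, done, remaining, progress = set(), [], list(body), True
    while remaining and progress:
        progress = False
        for goal in list(remaining):
            rel, args = goal
            if any(all(ad == "f" or not is_var(a) or a in bound
                       for a, ad in zip(args, p)) for p in patterns[rel]):
                done.append(goal)
                remaining.remove(goal)
                bound.update(a for a in args if is_var(a))
                progress = True
    return done, remaining

def classify(head, body, patterns):
    e, blocked = executable_subgoals(body, patterns)
    if not blocked:
        return "feasible"            # feasible, hence stable (lemma on slide 14)
    bound_in_e = {a for _, args in e for a in args if is_var(a)}
    if all(v in bound_in_e for v in head):
        return "database_dependent"  # the complete answer might be computable
    return "not_computable"          # apart from the exception noted on the slide

patterns = {"r": ["bff"], "s": ["fb"]}
body = [("r", ("a", "B", "C")), ("s", ("C", "D"))]
print(classify(["B"], body, patterns))  # Q1 from slide 21
print(classify(["D"], body, patterns))  # Q2 from slide 22
```

For Q1 the head variable B is bound by r(a,B,C), so the outcome depends on the database. For Q2 the head variable D is bound only by the non-executable subgoal, so the sketch reports it as not computable.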

25 A decision tree
It guides the planning process of computing the complete answer to a query.
Two approaches while traversing the tree:
– optimistic
– pessimistic

26 Conclusion
Stability of queries with binding patterns
Various classes of queries:
– CQs (two algorithms)
– Unions of CQs (two algorithms)
– CQs with arithmetic comparisons (one algorithm)
– Datalog (undecidable)
Dynamic computability of a CQ’s complete answer
Another contribution: decidability result of testing relative query containment with binding restrictions

27 Related work
Answering queries using views with binding patterns [RSU95]
Query optimization [YLUGM99, FLMS99]
Computing maximal answer to queries [DL97, LC00]
Our work considers whether the complete answer to a query is computable.

