Query Folding Xiaolei Qian Presented by Ram Kumar Vangala.

Presentation transcript:

Query Folding Xiaolei Qian Presented by Ram Kumar Vangala

Query Folding
Query folding refers to the activity of determining if and how a query can be answered using a given set of resources.
Resources can be views or cached results of previous queries.

Why Query Folding
The base relations referred to in a query might be stored remotely, and accessing them might be expensive.
The database might be unreachable because of network problems (disconnected operation).
The database might be conceptual and not physically available.

Query Folding Is Used For
Query optimization in centralized databases.
Query processing in distributed databases.
Query answering in federated databases.

Example
Patients (patient_id, clinic, dob, insurance)
Physician (physician_id, clinic, pager_no)
Drugs (drug_name, generic)
Notes (note_id, patient_id, physician_id, note_text)
Allergy (note_id, drug_name, allergy_text)
Prescription (note_id, drug_name, prescription_text)

Suppose that the database maintains a materialized view defined as
    CREATE VIEW Drug_Allergy (patient_id, drug_name, text) AS
    SELECT patient_id, drug_name, allergy_text
    FROM Notes, Allergy
    WHERE Notes.note_id = Allergy.note_id

General Query
A user might use the following query to get the ids and allergy text of patients of the Palo Alto clinic who are allergic to drug xd_2001.
    SELECT Patients.patient_id, allergy_text
    FROM Patients, Notes, Allergy
    WHERE Patients.patient_id = Notes.patient_id
      AND Notes.note_id = Allergy.note_id
      AND clinic = 'palo_alto'
      AND drug_name = 'xd_2001'

Folded Query Using View
    SELECT Patients.patient_id, text
    FROM Patients, Drug_Allergy
    WHERE Patients.patient_id = Drug_Allergy.patient_id
      AND clinic = 'palo_alto'
      AND drug_name = 'xd_2001'
This query can be more efficient than the original query, because the join of Notes and Allergy is already materialized in the view.
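The equivalence of the original and the folded query on this example can be checked on a toy database. The following is a minimal sketch using Python's built-in sqlite3 module. The sample rows are invented for illustration, and an ordinary view stands in for the materialized view because SQLite has no materialized views.

```python
import sqlite3

# In-memory database with invented sample rows (assumption: not from the slides).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Patients (patient_id, clinic, dob, insurance);
CREATE TABLE Notes (note_id, patient_id, physician_id, note_text);
CREATE TABLE Allergy (note_id, drug_name, allergy_text);
CREATE VIEW Drug_Allergy AS
  SELECT Notes.patient_id AS patient_id,
         Allergy.drug_name AS drug_name,
         Allergy.allergy_text AS text
  FROM Notes, Allergy
  WHERE Notes.note_id = Allergy.note_id;
INSERT INTO Patients VALUES (1, 'palo_alto', '1970-01-01', 'acme'),
                            (2, 'palo_alto', '1980-02-02', 'acme'),
                            (3, 'menlo_park', '1990-03-03', 'acme');
INSERT INTO Notes VALUES (10, 1, 100, 'visit'),
                         (11, 2, 100, 'visit'),
                         (12, 3, 101, 'visit');
INSERT INTO Allergy VALUES (10, 'xd_2001', 'rash'),
                           (11, 'aspirin', 'hives'),
                           (12, 'xd_2001', 'nausea');
""")

# Query over the base relations.
original = """
SELECT Patients.patient_id, allergy_text
FROM Patients, Notes, Allergy
WHERE Patients.patient_id = Notes.patient_id
  AND Notes.note_id = Allergy.note_id
  AND clinic = 'palo_alto' AND drug_name = 'xd_2001'
"""

# Folded query: the Notes-Allergy join is replaced by the view.
folded = """
SELECT Patients.patient_id, text
FROM Patients, Drug_Allergy
WHERE Patients.patient_id = Drug_Allergy.patient_id
  AND clinic = 'palo_alto' AND drug_name = 'xd_2001'
"""

original_rows = sorted(conn.execute(original).fetchall())
folded_rows = sorted(conn.execute(folded).fetchall())
print(original_rows, folded_rows)
```

Agreement on one database does not prove equivalence; it only illustrates what the folding is meant to preserve.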

Query containment is a special case of query folding.
The containment problem for conjunctive queries is known to be NP-complete.
NP-complete problems are the hardest problems in NP; no polynomial-time algorithm is known for any of them.

Conjunctive Queries
Conjunctive queries are project-select-join queries whose selection conditions are restricted to equality.
A conjunctive query has the form
    h :- p1, ..., pn
where h, p1, ..., pn are atomic formulas whose arguments are variables or constants; h is the head, and p1, ..., pn is the body.

Variables in the head are called distinguished and must also appear in the body.
X, Y: distinguished variables
W, U: other variables
A, B: constants
Example of a conjunctive query:
    q(X,Y) :- patients(X,palo_alto,W1,W2), notes(W3,X,W4,W5), allergy(W3,xd_2001,Y)
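To make the semantics of such a rule concrete, here is a small sketch of a naive evaluator for conjunctive queries. The representation (atoms as a predicate name plus a tuple of terms, capitalized strings as variables, everything else as constants) and the sample facts are assumptions made for this illustration, not something the slides prescribe.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    pred: str
    args: tuple  # capitalized strings are variables, other strings are constants


def is_var(term):
    return term[:1].isupper()


def evaluate(head, body, db):
    """Naive evaluation: extend variable bindings one body atom at a time."""
    bindings = [{}]
    for atom in body:
        extended = []
        for binding in bindings:
            for row in db.get(atom.pred, ()):
                candidate = dict(binding)
                ok = True
                for term, value in zip(atom.args, row):
                    if is_var(term):
                        # Bind the variable, or check it against its earlier value.
                        if candidate.setdefault(term, value) != value:
                            ok = False
                            break
                    elif term != value:
                        # A constant in the query must match the stored value.
                        ok = False
                        break
                if ok:
                    extended.append(candidate)
        bindings = extended
    # Project every surviving binding onto the head variables.
    return {tuple(b[v] for v in head.args) for b in bindings}


# Invented sample facts (assumption).
db = {
    "patients": {("p1", "palo_alto", "1970", "acme"),
                 ("p2", "menlo_park", "1980", "acme")},
    "notes": {("n1", "p1", "d1", "text"), ("n2", "p2", "d1", "text")},
    "allergy": {("n1", "xd_2001", "rash"), ("n2", "xd_2001", "hives")},
}

head = Atom("q", ("X", "Y"))
body = [
    Atom("patients", ("X", "palo_alto", "W1", "W2")),
    Atom("notes", ("W3", "X", "W4", "W5")),
    Atom("allergy", ("W3", "xd_2001", "Y")),
]
answers = evaluate(head, body, db)
print(answers)
```

Shared variables such as X and W3 play the role of equi-join conditions, and constants such as palo_alto play the role of equality selections.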

Hypergraph Representation
A hypergraph is a set of nodes together with a set of hyperedges.
Unlike an edge of an ordinary graph, a hyperedge can connect any number of nodes.
A conjunctive query can be represented by a hypergraph: its nodes are the variables, and each conjunct contributes one hyperedge.
A conjunctive query is said to be acyclic if its hypergraph is acyclic.
Example:
    q(X,Y) :- notes(W1,X,W2,W3), allergy(W1,Y,W4), notes(W5,X,W6,W7), prescription(W5,Y,W8)

The example computes patients X and drugs Y such that X has been prescribed Y and has a recorded allergy to Y.
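Whether a query hypergraph is acyclic can be tested with the classical GYO reduction: repeatedly delete nodes that occur in only one hyperedge and hyperedges contained in another hyperedge; the hypergraph is acyclic exactly when at most one hyperedge survives. The sketch below is an illustration of that standard test, not code from the presentation, and it treats each conjunct's set of variables as one hyperedge.

```python
def is_acyclic(hyperedges):
    """GYO reduction: True if the hypergraph reduces to at most one hyperedge."""
    edges = [set(e) for e in hyperedges]
    changed = True
    while changed:
        changed = False
        # Rule 1: drop nodes that occur in exactly one hyperedge.
        for e in edges:
            lonely = {v for v in e if sum(v in f for f in edges) == 1}
            if lonely:
                e -= lonely
                changed = True
        # Rule 2: drop one hyperedge that is contained in another hyperedge.
        for i, e in enumerate(edges):
            if any(i != j and e <= f for j, f in enumerate(edges)):
                del edges[i]
                changed = True
                break
    return len(edges) <= 1


# Hyperedges of the example query q(X,Y) above, one per conjunct.
example = [
    {"W1", "X", "W2", "W3"},   # notes
    {"W1", "Y", "W4"},         # allergy
    {"W5", "X", "W6", "W7"},   # notes
    {"W5", "Y", "W8"},         # prescription
]

# A chain-shaped query for comparison (variable names are illustrative).
chain = [
    {"X", "W1", "W2"},
    {"W1", "W3"},
    {"X", "W4", "W5", "W6"},
    {"W4", "Y", "W7", "W8"},
]

example_acyclic = is_acyclic(example)
chain_acyclic = is_acyclic(chain)
print(example_acyclic, chain_acyclic)
```

Under this test the example query is cyclic: after the lonely nodes are removed, X, W1, Y and W5 form a cycle that no rule can break.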

Query-Folding Problem
Folding Rules
Let Q be a query, and R = {R_1, ..., R_n} be a set of resources.
We assume that no two resources have the same resource predicate, and that there are no variables in common between Q and R_i, or between R_i and R_j, for 1 ≤ i, j ≤ n with i ≠ j.

Folding Types
Partial folding
Strong folding
Partial Folding:
A partial folding of Q using R is a conjunctive query Q' such that Q' ⊆ Q and the body of Q' contains one or more resource predicates defined in R.

Strong Folding
A strong folding of Q using R is a partial folding Q' of Q using R such that Q ⊆ Q'.
In other words, a strong folding of a query is a partial folding that also contains the original query.

Example:
    r1(X1,X2,X3) :- notes(U1,X1,U2,U3), allergy(U1,X2,X3)
    r2(Y1,Y2,Y3,Y4) :- notes(V1,Y1,Y2,V2), prescription(V1,Y3,V3), drugs(Y3,Y4)
where
X, Y: distinguished variables
U, V: other variables
A complete folding of the above example is:
    q(X,Y) :- r1(X,Y,W), r2(X,W1,Y,W2)
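A candidate folding can be checked by unfolding it (replacing each resource predicate by the body of its definition) and testing containment against the original query with a containment mapping, following the classical result that Q1 is contained in Q2 exactly when Q2 maps homomorphically onto Q1. The sketch below assumes the query being folded is the earlier prescription-and-allergy example, uses capitalized strings as variables, and introduces the helper names `unfold` and `contained_in` purely for illustration.

```python
def is_var(term):
    return term[:1].isupper()


def unfold(body, resources):
    """Replace every resource atom by its defining body, renaming local variables."""
    out = []
    for n, (pred, args) in enumerate(body):
        if pred not in resources:
            out.append((pred, args))
            continue
        head_args, res_body = resources[pred]
        sub = dict(zip(head_args, args))  # head variables take the caller's terms
        for p, a in res_body:
            # Non-head variables get a fresh, occurrence-specific name.
            out.append((p, tuple(sub.get(t, f"{t}_{n}") if is_var(t) else t
                                 for t in a)))
    return out


def contained_in(q1, q2):
    """True if q1 is contained in q2: a containment mapping from q2 to q1 exists."""
    (head1, body1), (head2, body2) = q1, q2

    def extend(h, src, dst):
        if src[0] != dst[0] or len(src[1]) != len(dst[1]):
            return None
        h = dict(h)
        for s, d in zip(src[1], dst[1]):
            if is_var(s):
                if h.setdefault(s, d) != d:
                    return None
            elif s != d:
                return None
        return h

    def search(h, remaining):
        if not remaining:
            return True
        first, rest = remaining[0], remaining[1:]
        return any(search(h2, rest) for target in body1
                   if (h2 := extend(h, first, target)) is not None)

    h0 = extend({}, head2, head1)
    return h0 is not None and search(h0, list(body2))


resources = {
    "r1": (("X1", "X2", "X3"),
           [("notes", ("U1", "X1", "U2", "U3")),
            ("allergy", ("U1", "X2", "X3"))]),
    "r2": (("Y1", "Y2", "Y3", "Y4"),
           [("notes", ("V1", "Y1", "Y2", "V2")),
            ("prescription", ("V1", "Y3", "V3")),
            ("drugs", ("Y3", "Y4"))]),
}
head = ("q", ("X", "Y"))
query = (head, [("notes", ("W1", "X", "W2", "W3")),
                ("allergy", ("W1", "Y", "W4")),
                ("notes", ("W5", "X", "W6", "W7")),
                ("prescription", ("W5", "Y", "W8"))])
folding_body = [("r1", ("X", "Y", "W")), ("r2", ("X", "W1", "Y", "W2"))]
unfolded = (head, unfold(folding_body, resources))

folding_in_query = contained_in(unfolded, query)
query_in_folding = contained_in(query, unfolded)
print(folding_in_query, query_in_folding)
```

With no integrity constraints assumed, the unfolded folding is contained in q, but q is not contained in it, because nothing in q can match the extra drugs atom contributed by r2. The folding therefore returns only correct answers, while it would take an additional assumption (for instance, that every prescribed drug is listed in Drugs) for it to return all of them.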

Query Folding Algorithm
Let Q be a query, G_Q be the hypergraph representing Q, and F be a set of folding rules. Then the query folding algorithm computes complete or partial foldings of Q using F.
Two steps:
Initialization
Folding generation

Initialization:
Compute labels for every hyperedge in G_Q.
Given a hyperedge e in G_Q and the conjunct p associated with e, its label L_e is a relation with attributes var(p). For every folding rule f in F such that p unifies with head(f) with a most general unifier, there is a tuple in L_e consisting of two parts: a tuple over var(p) and the expression body(f), where the second part is used to store the folding of p.
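The unification step used during initialization can be sketched for function-free atoms as follows. This is only an illustration of computing a most general unifier; the shape of the actual folding rules is not shown in the transcript, so the atoms below are hypothetical, and capitalized strings are assumed to be variables.

```python
def is_var(term):
    return term[:1].isupper()


def mgu(a, b):
    """Most general unifier of two function-free atoms, or None if none exists."""
    (pred_a, xs), (pred_b, ys) = a, b
    if pred_a != pred_b or len(xs) != len(ys):
        return None
    sub = {}

    def walk(term):
        # Follow variable bindings until reaching an unbound variable or a constant.
        while is_var(term) and term in sub:
            term = sub[term]
        return term

    for x, y in zip(xs, ys):
        x, y = walk(x), walk(y)
        if x == y:
            continue
        if is_var(x):
            sub[x] = y
        elif is_var(y):
            sub[y] = x
        else:
            return None  # two different constants cannot be unified
    return {v: walk(v) for v in sub}


conjunct = ("allergy", ("W3", "xd_2001", "Y"))
rule_head = ("allergy", ("N", "D", "T"))       # hypothetical rule head
other_head = ("allergy", ("N", "aspirin", "T"))  # hypothetical rule head

unifier = mgu(conjunct, rule_head)
no_unifier = mgu(conjunct, other_head)
print(unifier, no_unifier)
```

Each successful unification would contribute one tuple to the label of the conjunct's hyperedge; a conjunct that unifies with no rule head ends up with an empty label.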

Folding Generation
Construct the set of foldings by u-joining the labels of all the hyperedges in an arbitrary order.

Query Folding for Acyclic Queries
Existence of Folding
Pairwise consistency is necessary but not sufficient for the existence of foldings of cyclic queries.
Example:
    q(X,Y) :- patients(W1,W2,W3,W4), notes(X,W1,W5,Y), physician(W5,W2,W6)
with resources
    r1(X1,X2) :- patients(B1,A1,U1,U2), notes(X1,B1,C1,X2), physician(C1,A2,U3)

    r2(Y1,Y2) :- patients(B2,A2,V1,V2), notes(Y1,B2,C2,Y2), physician(C2,A1,V3)

Label for hyperedges

Theorem: There exists a complete folding of an acyclic query Q using folding rules F iff no hyperedge in reduction(G_Q) has an empty label.
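The transcript does not define reduction(G_Q). One plausible reading, stated here as an assumption, is a semijoin reduction of the hyperedge labels until they are pairwise consistent, after which the theorem's test amounts to checking for empty labels. The sketch below implements that reading on labels stored as (attribute tuple, set of rows) pairs; it ignores the expression part of each label tuple, and the label contents are invented.

```python
def semijoin(r, s):
    """Keep the rows of r that agree with some row of s on the shared attributes."""
    (r_attrs, r_rows), (s_attrs, s_rows) = r, s
    shared = [a for a in r_attrs if a in s_attrs]

    def key(attrs, row):
        return tuple(row[attrs.index(a)] for a in shared)

    s_keys = {key(s_attrs, row) for row in s_rows}
    return r_attrs, {row for row in r_rows if key(r_attrs, row) in s_keys}


def reduce_labels(labels):
    """Apply semijoins between all pairs of labels until nothing changes."""
    labels = dict(labels)
    changed = True
    while changed:
        changed = False
        for e in labels:
            for f in labels:
                if e == f:
                    continue
                reduced = semijoin(labels[e], labels[f])
                if reduced[1] != labels[e][1]:
                    labels[e] = reduced
                    changed = True
    return labels


# Invented labels for three hyperedges (assumption).
labels = {
    "e1": (("X", "W1"), {(1, "a"), (2, "b")}),
    "e2": (("W1",), {("a",)}),
    "e3": (("X", "W4"), {(1, "p"), (3, "q")}),
}
reduced = reduce_labels(labels)
no_empty_label = all(rows for _, rows in reduced.values())
print(reduced, no_empty_label)
```

For acyclic hypergraphs, pairwise consistency is known to coincide with global consistency, which is what makes a purely local check of this kind sufficient in the acyclic case.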

Example: consider an acyclic query that computes the notes X recording allergic reactions, together with the clinic Y of each note's patient.
    q(X,Y) :- allergy(X,W1,W2), drugs(W1,W3), notes(X,W4,W5,W6), patients(W4,Y,W7,W8)
Resources:
    r1(X1,X2) :- allergy(X1,U1,U2), drugs(U1,X2), notes(X1,U3,U4,U5)
    r2(Y1,Y2) :- notes(Y1,V1,V2,V3), patients(V1,Y2,V4,V5), drugs(V6,V7)

Folding rules

Labels for hypergraph

Theorem: There does not exist a partial folding of an acyclic query Q using folding rules F iff every hyperedge in reduction(G_Q) has a singleton label.

Resources:
Folding Rules:

Labels to Hypergraph

Conclusion
Query folding can be used in centralized databases.
Queries can be answered using views instead of base relations.
Across multiple queries, the result of one query can be used to partially answer another.
In client-server applications, views can be cached.

Questions?

Thank you