

1 Computing Full Disjunctions
Yaron Kanza, Yehoshua Sagiv
The Selim and Rachel Benin School of Engineering and Computer Science, The Hebrew University of Jerusalem

2 A Formal Definition of Full Disjunction

3 Preliminary Notations
Given
– a set of relations $r_1, \dots, r_n$
– with schemes $R_1, \dots, R_n$, respectively
We denote by $t_{ij}$ the $j$-th tuple of $r_i$.
For $X \subseteq R_i$, we denote by $t_{ij}[X]$ the projection of $t_{ij}$ on $X$.
Next, we give some preliminary definitions.

4 Scheme Graph
Two distinct schemes $R_i$ and $R_j$ are connected if $R_i \cap R_j$ is non-empty.
The scheme graph of $R_1, \dots, R_n$ consists of
– a node for each scheme $R_i$
– an edge between $R_i$ and $R_j$ if $R_i$ and $R_j$ are connected
[Figure: example scheme graph over the schemes Movies, Actors, Actors-that-Directed and Acted-in]

5 Connected Relation Schemes
Relation schemes $R_{i_1}, \dots, R_{i_m}$ are connected if their scheme graph is connected.
Tuples $t_{i_1 j_1}, \dots, t_{i_m j_m}$, from $m$ distinct relations, are connected if the relation schemes of these relations are connected.
[Figure: Movies, Actors and Acted-in shown as connected relation schemes; Movies and Actors shown as unconnected relation schemes]
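The two definitions above can be sketched in a few lines of Python. This is only an illustration: the attribute sets given to Movies, Actors and Acted-in are hypothetical (the slides name the schemes but not their attributes), and the function names are made up for this example.

```python
from itertools import combinations

def scheme_graph(schemes):
    """Adjacency sets: two distinct schemes are adjacent if they share an attribute."""
    adj = {name: set() for name in schemes}
    for a, b in combinations(schemes, 2):
        if schemes[a] & schemes[b]:
            adj[a].add(b)
            adj[b].add(a)
    return adj

def are_connected(schemes, names):
    """True if the scheme graph restricted to `names` is connected."""
    names = set(names)
    if not names:
        return True
    adj = scheme_graph({n: schemes[n] for n in names})
    start = next(iter(names))
    seen, stack = {start}, [start]
    while stack:
        for nxt in adj[stack.pop()] - seen:
            seen.add(nxt)
            stack.append(nxt)
    return seen == names

# Hypothetical attribute sets for the schemes named on the slides.
schemes = {
    "Movies": {"movie_id", "title"},
    "Actors": {"actor_id", "name"},
    "Acted-in": {"movie_id", "actor_id"},
}

print(are_connected(schemes, ["Movies", "Actors", "Acted-in"]))
print(are_connected(schemes, ["Movies", "Actors"]))
```

With these assumed attributes, Movies and Actors are connected only through Acted-in, which matches the connected and unconnected examples in the figure.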

6 Join Consistent Tuples
Two tuples $t_{i_1 j_1}$ and $t_{i_2 j_2}$ are join consistent if $t_{i_1 j_1}[R_{i_1} \cap R_{i_2}] = t_{i_2 j_2}[R_{i_1} \cap R_{i_2}]$.
$m$ tuples, from $m$ distinct relations, are join consistent if every pair of connected tuples among them is join consistent.
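A minimal sketch of the join-consistency test, assuming each tuple is represented as a Python dict from attribute name to a non-null value (a representation chosen for this example, not taken from the slides). Pairs with no common attribute pass the test vacuously, so checking all pairs gives the same answer as checking only the connected pairs.

```python
from itertools import combinations

def pair_join_consistent(t1, t2):
    """Two tuples agree on every attribute their schemes have in common."""
    common = t1.keys() & t2.keys()
    return all(t1[a] == t2[a] for a in common)

def join_consistent(tuples):
    """Every pair of tuples is join consistent."""
    return all(pair_join_consistent(a, b) for a, b in combinations(tuples, 2))

# Hypothetical tuples; the attribute names are assumptions.
movie = {"movie_id": 1, "title": "Zelig"}
acted = {"movie_id": 1, "actor_id": 7}
other = {"movie_id": 2, "actor_id": 7}

print(join_consistent([movie, acted]))
print(join_consistent([movie, other]))
```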

7 Universal Tuple
A universal tuple $u$ is defined over all the attributes in $R_1 \cup \dots \cup R_n$ and consists of null and non-null values.
We denote by $\hat{u}$ the non-null portion of $u$.
A universal tuple $u$ is called an integrated tuple if there are $m$ connected and join consistent tuples $t_{i_1 j_1}, \dots, t_{i_m j_m}$ such that $\hat{u}$ is the natural join of $t_{i_1 j_1}, \dots, t_{i_m j_m}$.

8 Maximal Universal Tuple
A universal tuple $u$ subsumes a universal tuple $v$ if $u$ is equal to $v$ on all the non-null attributes of $v$ (i.e., $u$ can be created from $v$ by replacing some null values with non-null values).
In a given set $D$, a tuple $u$ is maximal if there is no tuple in $D$, other than $u$, that subsumes $u$.
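Subsumption and maximality translate directly into code. The sketch below assumes universal tuples are dicts over the same attribute set, with `None` standing for null; both the representation and the function names are assumptions made for illustration.

```python
def subsumes(u, v):
    """u agrees with v on every attribute where v is non-null."""
    return all(u[a] == val for a, val in v.items() if val is not None)

def maximal(tuples):
    """Keep the tuples that no other tuple in the set subsumes."""
    return [u for u in tuples
            if not any(w != u and subsumes(w, u) for w in tuples)]

# Hypothetical universal tuples over the attributes title and name.
u = {"title": "Zelig", "name": "Woody Allen"}
v = {"title": "Zelig", "name": None}

print(subsumes(u, v), subsumes(v, u))
print(maximal([u, v]))
```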

9 A Full Disjunction
The full disjunction of $r_1, \dots, r_n$ is the set of all maximal integrated tuples that can be generated from $m$ tuples of $r_1, \dots, r_n$.
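To make the definition concrete, here is a brute-force sketch that follows it literally: enumerate every choice of tuples from distinct relations, keep the connected and join consistent ones, pad them with nulls, and discard subsumed results. It runs in exponential time and is not the algorithm of this presentation. The relation contents, attribute names and function names are hypothetical, and a tuple's scheme is taken to be the key set of its dict.

```python
from itertools import combinations, product

def connected(schemes):
    """schemes: non-empty list of attribute sets; True if their scheme graph is connected."""
    seen, stack = {0}, [0]
    while stack:
        i = stack.pop()
        for j in range(len(schemes)):
            if j not in seen and schemes[i] & schemes[j]:
                seen.add(j)
                stack.append(j)
    return len(seen) == len(schemes)

def consistent(tuples):
    """Every pair of tuples agrees on its common attributes."""
    return all(a[k] == b[k]
               for a, b in combinations(tuples, 2)
               for k in a.keys() & b.keys())

def subsumes(u, v):
    return all(u[k] == x for k, x in v.items() if x is not None)

def full_disjunction(relations):
    """relations: dict mapping a relation name to a list of tuples (dicts)."""
    attrs = sorted({k for rel in relations.values() for t in rel for k in t})
    names = list(relations)
    integrated = []
    for m in range(1, len(names) + 1):
        for subset in combinations(names, m):
            for combo in product(*(relations[n] for n in subset)):
                if connected([set(t) for t in combo]) and consistent(combo):
                    u = dict.fromkeys(attrs)   # start with all nulls (None)
                    for t in combo:
                        u.update(t)            # natural join of the chosen tuples
                    if u not in integrated:
                        integrated.append(u)
    return [u for u in integrated
            if not any(w != u and subsumes(w, u) for w in integrated)]

# Hypothetical data; "Actor B" is a placeholder name.
relations = {
    "Movies": [{"mid": 1, "title": "Zelig"}, {"mid": 2, "title": "Antz"}],
    "Acted-in": [{"mid": 1, "aid": 7}],
    "Actors": [{"aid": 7, "name": "Woody Allen"}, {"aid": 8, "name": "Actor B"}],
}

for row in full_disjunction(relations):
    print(row)
```

On this data the result has three tuples: the full join for movie 1, and two null-padded tuples for the movie and the actor that join with nothing else.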

10 Acyclic Scheme
Given a set of schemes $R_1, \dots, R_n$, their scheme hypergraph consists of
– a node for each attribute that appears in some $R_i$
– for each $R_i$ ($1 \le i \le n$), a hyperedge that includes the attributes of $R_i$
α-acyclic scheme hypergraph:
– all the hyperedges can be removed by a sequence of ear removals
γ-acyclic scheme hypergraph:
– the Bachman diagram of the scheme hypergraph is acyclic
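One common way to test α-acyclicity is the GYO reduction, which is a standard formulation of repeated ear removal; the slides do not spell out a procedure, so the sketch below is an assumed illustration and the function name is made up. It repeatedly drops attributes that occur in a single hyperedge and hyperedges that are empty or contained in another hyperedge; the hypergraph is α-acyclic exactly when nothing is left.

```python
def is_alpha_acyclic(schemes):
    """schemes: iterable of attribute sets (the hyperedges)."""
    edges = [set(s) for s in schemes]
    changed = True
    while changed:
        changed = False
        # Rule 1: remove attributes that occur in exactly one hyperedge.
        for e in edges:
            lonely = {a for a in e if sum(a in f for f in edges) == 1}
            if lonely:
                e -= lonely
                changed = True
        # Rule 2: remove one hyperedge that is empty or contained in another.
        for i, e in enumerate(edges):
            if not e or any(i != j and e <= f for j, f in enumerate(edges)):
                del edges[i]
                changed = True
                break
    return not edges

chain = [{"A", "B"}, {"B", "C"}, {"C", "D"}]
triangle = [{"A", "B"}, {"B", "C"}, {"A", "C"}]

print(is_alpha_acyclic(chain))
print(is_alpha_acyclic(triangle))
print(is_alpha_acyclic(triangle + [{"A", "B", "C"}]))
```

The last call illustrates a known property of α-acyclicity: adding a hyperedge that covers the triangle makes the scheme acyclic again.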

11

12 Computing Full Disjunctions

13 Product Graph
Given a query $Q$ and a database $D$, the product of $Q$ and $D$ is a graph such that
– the nodes are pairs consisting of a node of $Q$ and a node of $D$
– there is an edge between two nodes when the corresponding pair of nodes of $Q$ and the corresponding pair of nodes of $D$ are both connected by edges with the same label, in $Q$ and in $D$ respectively
– the root is the pair consisting of the root of $Q$ and the root of $D$
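A small sketch of the product construction, assuming both graphs are given as sets of labeled edges `(source, label, target)`. The representation, the function names and the tiny query and database are assumptions for illustration. Only nodes that are the root or incident to some edge are materialized; the remaining pairs are isolated and never reachable.

```python
def product_graph(q_edges, d_edges, q_root, d_root):
    """Return the root pair and the labeled edges of the product of Q and D."""
    edges = {((qs, ds), label, (qt, dt))
             for qs, label, qt in q_edges
             for ds, dlabel, dt in d_edges
             if label == dlabel}
    return (q_root, d_root), edges

def reachable(root, edges):
    """Nodes reachable from root by following edges forward."""
    seen, stack = {root}, [root]
    while stack:
        node = stack.pop()
        for s, _, t in edges:
            if s == node and t not in seen:
                seen.add(t)
                stack.append(t)
    return seen

# Hypothetical query and database fragments.
q_edges = {("v1", "movie", "v2"), ("v2", "title", "w1")}
d_edges = {(1, "movie", 2), (1, "movie", 3),
           (2, "title", 5), (3, "title", 8), (4, "title", 9)}

root, edges = product_graph(q_edges, d_edges, "v1", 1)
print(sorted(reachable(root, edges)))
```

Here the product contains the edge from `("v2", 4)` to `("w1", 9)`, but neither node is reachable from the root, which is the situation the next slides point out.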

14 [Figure: an example query graph with nodes $v_1, v_2, v_3, w_1, w_2, w_3, w_4$ and edge labels movie, actor, director, filmography item, title, language, name and date of birth; and an example database graph with edge labels movie, actor, director, filmography item, title, year, language, name and date of birth, and values Zelig, Antz, 1998, English, Woody Allen and 1/12/1935]
The product of the query and the database is shown in the next graph.

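15 [Figure: the product graph, with nodes $(V_1, 1)$, $(V_2, 2)$, $(V_2, 3)$, $(V_3, 4)$, $(w_1, 5)$, $(w_2, 6)$, $(w_1, 8)$, $(w_3, 10)$, $(w_4, 11)$ and edges labeled movie, actor, director, filmography item, title, language, name and date of birth]
There are additional nodes that are not reachable from the root.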

16 Matching as a Subgraph of the Product Graph
Conditions on a subgraph $G$ of the product graph:
1. $G$ has no repeated variables
2. $G$ contains the root
3. Each node in $G$ is reachable from the root
4. $G$ preserves the constraints (edges) of the query
A subgraph that satisfies conditions 1–3 is an OR-matching graph.
A subgraph that satisfies conditions 1–4 is a weak-matching graph.
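Conditions 1–3 can be checked mechanically. The sketch below assumes a subgraph is given as a set of `(variable, object)` nodes plus a set of labeled edges; the representation and the function name are assumptions. Condition 4 is left out because its precise form depends on the query constraints, which the transcript does not spell out.

```python
def is_or_matching_graph(nodes, edges, root):
    """Check conditions 1-3 for a subgraph of the product graph."""
    variables = [var for var, _ in nodes]
    no_repeated_variables = len(variables) == len(set(variables))  # condition 1
    has_root = root in nodes                                       # condition 2
    # Condition 3: every node is reachable from the root inside the subgraph.
    seen, stack = {root}, [root]
    while stack:
        node = stack.pop()
        for s, _, t in edges:
            if s == node and t in nodes and t not in seen:
                seen.add(t)
                stack.append(t)
    all_reachable = set(nodes) <= seen
    return no_repeated_variables and has_root and all_reachable

root = ("V1", 1)
edges = {(("V1", 1), "movie", ("V2", 2)), (("V2", 2), "title", ("w1", 5))}

good = {("V1", 1), ("V2", 2), ("w1", 5)}
repeated = good | {("V2", 3)}        # variable V2 appears twice
unreachable = good | {("w2", 6)}     # no edge leads to (w2, 6)

print(is_or_matching_graph(good, edges, root))
print(is_or_matching_graph(repeated, edges, root))
print(is_or_matching_graph(unreachable, edges, root))
```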

17 [Figure: the product graph of slide 15, with a highlighted subgraph on the nodes $(V_1, 1)$, $(V_2, 2)$, $(V_3, 4)$, $(w_1, 5)$, $(w_2, 6)$, $(w_3, 10)$, $(w_4, 11)$]
An OR-matching graph. It is also a weak-matching graph.

18 [Figure: the product graph of slide 15, with a highlighted subgraph on the nodes $(V_1, 1)$, $(V_2, 3)$, $(V_3, 4)$, $(w_1, 8)$, $(w_3, 10)$, $(w_4, 11)$]
Another OR-matching graph. It is not a weak-matching graph, since the "director" edge of the query is not preserved.

19 Matching Graphs
Each OR-matching graph represents an OR-matching (and each weak-matching graph represents a weak matching).
An OR-matching can be represented by many OR-matching graphs, but all these graphs have the same set of nodes and differ only in their edges (and the same is true for weak matchings and weak-matching graphs).

20 Intuition
For DAG queries, matching graphs are constructed by adding edges according to the query constraints
– the order of the extensions is simply given by a topological sort of the query nodes
For cyclic queries, neither a simple traversal over the query nor a simple traversal over the database will work
– instead, we use a stratum traversal over the matching graph

21 Dividing the Edges into Strata
[Figure: the product graph of slide 15, with its edges divided into Stratum 1, Stratum 2, Stratum 3, …]

22 Stratum Traversal
A stratum traversal is an ordered list that
– starts with the edges of stratum 1
– followed by the edges of stratum 2
– …
– followed by the edges of stratum $n$
– …
The order of the edges within each stratum is unimportant.
The same edge can occur multiple times, in different strata.
We only look at the first $n$ strata, where $n$ is the size of the query.

23 Computing the OR-Matching Graphs
A set of OR-matching graphs is created.
We extend each OR-matching graph in the set by adding edges according to the stratum traversal.
Initially, the set includes a single graph that consists only of the root of the product graph.
In each extension step, we try to add the current edge to the graphs produced so far, and this may cause
– the creation of a new graph that replaces the extended graph
– the creation of a new graph that is added to the set in addition to the existing graphs
– no change to the set of graphs

24 Adding an Edge
After each addition of an edge, subsumed matching graphs are removed, to avoid an exponential blowup.
There are six cases to handle.
The cases of extending a graph by an edge are described next.

25 Case 1: The target of the added edge is a node whose pair includes the root of $Q$ without the root of $D$. No change is made.
[Figure: a graph with nodes $(V_1, O_1)$, $(V_2, O_2)$, $(V_3, O_4)$ and edges labeled movie and actor; the added edge is title, from $(V_2, O_2)$ to $(V_1, O_3)$]
Case 2: The graph already includes the added edge. No change is made.
[Figure: the same graph; the added edge is movie, from $(V_1, O_1)$ to $(V_2, O_2)$]

26 Case 3: The graph does not include the source of the added edge. No change is made.
[Figure: a graph with nodes $(V_1, O_1)$, $(V_2, O_2)$, $(V_3, O_4)$ and edges labeled movie and actor; the added edge is title, from $(V_2, O_3)$ to $(W_1, O_8)$]
Case 4: The graph includes the source of the added edge and no node with the variable of the target. The edge is added to the graph and the new graph replaces the existing graph.
[Figure: the same graph; the added edge is title, from $(V_2, O_2)$ to $(W_1, O_5)$; the new graph also contains $(W_1, O_5)$]

27 Case 5: The graph already includes the source and the target of the added edge but does not include the added edge itself. The edge is added to the graph and the new graph replaces the existing graph.
[Figure: a graph with nodes $(V_1, O_1)$, $(V_2, O_2)$, $(V_3, O_4)$, $(W_1, O_3)$ and edges labeled movie, actor and title; the added edge is a.k.a, from $(V_2, O_2)$ to $(W_1, O_3)$]

28 Case 6: The graph includes the source of the added edge but also includes a node with the same variable as the target of the added edge (different nodes with the same variable $V_2$). A new graph is created and added to the set of graphs, without replacing the existing graph.
[Figure: a graph with nodes $(V_1, O_1)$, $(V_2, O_2)$, $(V_3, O_4)$, $(W_1, O_3)$ and edges labeled movie, actor and title; the added edge is film, from $(V_3, O_4)$ to $(V_2, O_4)$; the new graph has nodes $(V_1, O_1)$, $(V_2, O_4)$, $(V_3, O_4)$]
In the new graph, $(V_2, O_2)$ is replaced by $(V_2, O_4)$, and nodes that are not reachable from the root are erased.
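The six cases can be read as a decision procedure. The sketch below only classifies which case applies; it does not perform the extension itself. The graph representation (`nodes` as `(variable, object)` pairs, `edges` as `(source, label, target)` triples), the order in which the tests are applied, and the function name are assumptions made for illustration.

```python
def classify_extension(nodes, edges, edge, q_root, d_root):
    """Return the case number (1-6) for extending the graph by `edge`."""
    source, _, target = edge
    target_var, target_obj = target
    if target_var == q_root and target_obj != d_root:
        return 1  # target pairs the query root with a non-root database node
    if edge in edges:
        return 2  # the graph already includes the edge
    if source not in nodes:
        return 3  # the source is not in the graph
    if target in nodes:
        return 5  # source and target are present, the edge is new
    if any(var == target_var for var, _ in nodes):
        return 6  # another node already carries the target's variable
    return 4      # source present, target variable not yet used

# Node and edge names follow the figures on slides 25-28.
nodes = {("V1", "O1"), ("V2", "O2"), ("V3", "O4"), ("W1", "O3")}
edges = {(("V1", "O1"), "movie", ("V2", "O2")),
         (("V1", "O1"), "actor", ("V3", "O4")),
         (("V2", "O2"), "title", ("W1", "O3"))}

print(classify_extension(nodes, edges, (("V3", "O4"), "film", ("V2", "O4")), "V1", "O1"))
```

For the film edge of slide 28 this reports case 6: the source $(V_3, O_4)$ is in the graph and the variable $V_2$ is already bound to a different object.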

29 Applying the algorithm to the movies example
[Figure: the first extension steps, up to step 3, which add the movie edges from $(V_1, 1)$ to $(V_2, 2)$ and to $(V_2, 3)$]

30 [Figure: step 4 adds the actor edge from $(V_1, 1)$ to $(V_3, 4)$; step 5 adds the title edge from $(V_2, 2)$ to $(w_1, 5)$]

31 [Figure: step 6 adds the language edge from $(V_2, 2)$ to $(w_2, 6)$; step 7 adds a title edge at $(V_2, 3)$]

32 [Figure: step 8 adds the name edge from $(V_3, 4)$ to $(w_3, 10)$; step 9 adds the date of birth edge from $(V_3, 4)$ to $(w_4, 11)$]

33 [Figure: step 10 adds the director edge from $(V_2, 2)$ to $(V_3, 4)$]

34 [Figure: step 11 adds the filmography item edge from $(V_3, 4)$ to $(V_2, 2)$. One of the resulting graphs is marked "Subsumed by the left matching graph".]

35 [Figure: step 12 adds the filmography item edge from $(V_3, 4)$ to $(V_2, 3)$. One of the resulting graphs is marked "Subsumed by the right matching graph".]

36 [Figure: "The Product Graph" (as on slide 15) next to "The OR-Matchings": the two OR-matching graphs produced by the algorithm, one containing $(V_2, 2)$ and one containing $(V_2, 3)$]

37 Computing Maximal Weak-Matching Graphs
To compute maximal weak-matching graphs, the same algorithm is used with a slight change.
After each addition of an edge, the nodes that cause a query constraint not to be preserved are removed (along with the edges that contain these nodes).
Nodes that are no longer reachable from the root as a result of this deletion are removed as well.

38 The Algorithm Computes Weak-Queries in Polynomial Time
Theorem: Given a query $Q$ and a database $D$, the revised algorithm terminates with the set of maximal weak-matching graphs of $Q$ w.r.t. $D$.
The runtime of the algorithm is $O(q^3 d m^2)$, where $q$ is the size of the query, $d$ is the size of the database and $m$ is the size of the result.