
Fuzzy Querying of XML Documents: The Minimum Spanning Tree
Abdeslame ALILAOUAR, Florence SEDES
IRIT - CNRS
IRIT: Research Institute for Computer Science of Toulouse (France)

Talk Outline
- The XML model
- The problem of querying XML documents
- Proposed techniques
- Our approach
- Implementation details
- Conclusion and future tasks

The XML Data Model: Document-centric vs. Data-centric
Document-centric
- Less regular or irregular structure
- The order of sibling elements is important
- Examples: s, books, etc.
Data-centric
- More structured
- The order of sibling elements is often unimportant
- Examples: sales orders, configuration files, etc.

The XML Data Model (continued)
- Data are commonly modeled by a tree structure
- Nodes represent objects
- Edges represent relationships between objects
- Atomic values are attached to leaf nodes
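The tree view of XML data can be illustrated with a minimal sketch using Python's standard `xml.etree.ElementTree` module. The document below is hypothetical: it is loosely modeled on the cottage example used later in the talk, and the exact nesting is an assumption. The helper `leaf_values` is likewise an illustrative name, not part of the authors' system.

```python
import xml.etree.ElementTree as ET

# Hypothetical document, loosely modeled on the cottage example in the slides.
# The two cottages deliberately place <nbeds> at different depths.
DOC = """
<cotglist>
  <cottage>
    <identifier>40</identifier>
    <nbeds>4</nbeds>
    <price>1300</price>
  </cottage>
  <cottage>
    <identifier>23</identifier>
    <room><nbeds>4</nbeds></room>
    <price>1700</price>
  </cottage>
</cotglist>
"""

def leaf_values(node, path=()):
    """Yield (path, text) for every leaf element of the tree.

    Inner elements are objects, parent/child links are the edges,
    and atomic values appear only at the leaves.
    """
    path = path + (node.tag,)
    children = list(node)
    if not children:
        yield "/".join(path), (node.text or "").strip()
    for child in children:
        yield from leaf_values(child, path)

root = ET.fromstring(DOC)
leaves = list(leaf_values(root))
for leaf_path, value in leaves:
    print(leaf_path, "=", value)
```

The two `nbeds` leaves end up on different paths (`cotglist/cottage/nbeds` and `cotglist/cottage/room/nbeds`), which is the kind of structural variation a rigid path query would miss.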

The XML Data Model (continued): Variations in Structure
[Figure: a cotglist tree containing two cottage subtrees with different structures. Node labels: cottage, identifier (″40″, ″23″), character, nbeds (4), price (1300, 1700), room, summer, winter.]

The Problem of Querying XML Documents
[Diagram labels: Query = Content + Structure; XML Document = Content + Structure; Unknown, Irregular; Irregular structure; Structure matching; Content matching; R.I.; Result]
In most cases, queries return an empty or incomplete set of answers:
- Data has structural variations: relationships between objects are represented differently in different parts of the documents.
- Data has ontology variations: different labels are used to describe objects of the same type (e.g. house, cottage).

The Problem of Querying XML Documents (continued)
Solution
- Queries should deal with different data structures
- Queries should not be rigid structural patterns
- Queries should be handled flexibly, in order to find not only the answers that match exactly but also those with a similar structure and/or content

Proposed Techniques
- Query relaxation (S. Amer-Yahia, AT&T, 2002)
- Tree-edit distance (D. Shasha, K. Zhang, 1989)
- Correlation (A. Tversky, 1977)
- Data relaxation (Damiani & Tanca, 2000)

Our approach: The minimum spanning tree (MST)
- Optimization problem
- Input: a weighted graph
- Output: the cheapest subset of edges that keeps the graph in one connected component

Proposed algorithm
- Prim's algorithm (1957): compute a minimum spanning tree by starting with any vertex as the current tree. At each step, add a least-weight edge between a vertex in the tree and a vertex not in the tree. Continue until all vertices have been added.
- Kruskal's algorithm (1956): maintain a set of partial minimum spanning trees, and repeatedly add the shortest edge in the graph whose endpoints are in different partial minimum spanning trees.
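The two descriptions above can be sketched in a few lines of Python. This is a textbook illustration on a small made-up graph, not the authors' implementation; the function names, the edge list `EDGES`, and the input layouts are assumptions chosen for the example.

```python
import heapq

def prim(n, adj, start=0):
    """Prim: grow one tree from `start`, always adding the cheapest edge
    that joins a tree vertex to a non-tree vertex.
    adj maps each vertex to a list of (weight, neighbor) pairs."""
    in_tree = {start}
    heap = [(w, start, v) for w, v in adj[start]]
    heapq.heapify(heap)
    mst = []
    while heap and len(in_tree) < n:
        w, u, v = heapq.heappop(heap)
        if v in in_tree:
            continue  # both endpoints already in the tree
        in_tree.add(v)
        mst.append((u, v, w))
        for w2, x in adj[v]:
            if x not in in_tree:
                heapq.heappush(heap, (w2, v, x))
    return mst

def kruskal(n, edges):
    """Kruskal: scan edges by increasing weight and keep an edge only if
    its endpoints lie in different partial trees (union-find).
    edges is a list of (weight, u, v) triples."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    mst = []
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv  # merge the two partial trees
            mst.append((u, v, w))
    return mst

# Made-up example graph on vertices 0..3, as (weight, u, v).
EDGES = [(1, 0, 1), (4, 0, 2), (2, 1, 2), (5, 1, 3), (3, 2, 3)]
ADJ = {i: [] for i in range(4)}
for w, u, v in EDGES:
    ADJ[u].append((w, v))
    ADJ[v].append((w, u))

prim_tree = prim(4, ADJ)
kruskal_tree = kruskal(4, EDGES)
print(prim_tree, kruskal_tree)
```

Both functions select the same three edges here. Kruskal's bottom-up merging of partial trees is the behavior the querying approach borrows when it assembles answer subtrees from the leaves upward.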

Querying XML documents with MST
- Represent the queries by a weighted tree pattern
- Replace the criteria by preferences with their importance levels
  - The importance level determines the priority between the preferences
  - The satisfaction degree of a preference is at least equal to its importance level
- Define a similarity function used to estimate the matching degree of the preferences
- The answer subtrees are built gradually, starting by evaluating the leaf nodes and the most important preferences, then going up until the answer tree is constructed, as in Kruskal's algorithm
Example: [Figure: weighted tree pattern with root cottage and children nbeds and price; weights 0.8 and 0.6]

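Example
[Figure: the weighted query pattern (cottage with children nbeds and price, weights 0.8 and 0.6) matched against the cotglist document tree, whose cottages have identifiers ″140″ and ″123″. Node labels: cottage, identifier, character, nbeds (4, 2), price, room, summer, winter.]
Similarity values shown: Sim(price, price) = 1; Sim(1300, 1400) = 0.9; Sim(1300, 1700) = 0.7. Node annotations: Sim = 1, Sim = 1.0, Sim = 0.9, Sim = 0.7.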
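The acceptance rule behind this example (a preference is satisfied when its matching degree is at least its importance level, and the most important preferences are evaluated first) can be sketched as follows. Several details are assumptions made only for illustration: the pairing of the weights 0.8 and 0.6 with nbeds and price, the assignment of the Sim values to two candidate cottages, and the names `PREFERENCES`, `CANDIDATES`, and `evaluate`. The similarity function itself is not reproduced; its results are taken as given inputs.

```python
# Query preferences as (label, importance level).
# Which weight belongs to which label is an assumption.
PREFERENCES = [("price", 0.6), ("nbeds", 0.8)]

# Matching degrees of two candidate cottages, using Sim values from the
# example slide. The assignment to candidates is an assumption.
CANDIDATES = {
    "cottage_a": {"nbeds": 1.0, "price": 0.9},
    "cottage_b": {"nbeds": 1.0, "price": 0.7},
}

def evaluate(preferences, degrees):
    """Check preferences from most to least important.

    A preference is satisfied when its matching degree is at least its
    importance level. Evaluation stops at the first failure.
    Returns (accepted, trace), where each trace entry is
    (label, importance, degree, satisfied).
    """
    trace = []
    for label, importance in sorted(preferences, key=lambda p: -p[1]):
        degree = degrees.get(label, 0.0)  # a missing node counts as no match
        satisfied = degree >= importance
        trace.append((label, importance, degree, satisfied))
        if not satisfied:
            return False, trace
    return True, trace

results = {name: evaluate(PREFERENCES, d) for name, d in CANDIDATES.items()}
for name, (accepted, trace) in results.items():
    print(name, accepted, trace)
```

Under these assumed weights both candidates are accepted, since 0.7 still meets the importance level 0.6 attached to price. A candidate whose price similarity fell below 0.6 would be rejected.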

Some Implementation Details: The architecture of our querying system
[Architecture diagram. Components: XML document, XML collection, Index builder, Indexed collection, Tag Index, Attribute Index, Data Index, Term Index, Query, Query Processor, Answer list.]

Indexing method
- Goal: efficiently determine the ancestors and descendants of any node
- Dietz's method (1982)
- Why Dietz's method: it is a straightforward method that uses traversal order to determine the ancestor-descendant relationship
- For two given nodes x and y of a tree T, x is an ancestor of y iff x occurs before y in the preorder traversal and after y in the postorder traversal
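The ancestor test stated above can be sketched directly: number every node once in preorder and once in postorder, then compare ranks. The tree below and the names `number_tree` and `is_ancestor` are illustrative assumptions, not the authors' index structures.

```python
def number_tree(tree, root):
    """Assign preorder and postorder ranks to every node.
    tree maps a node to the ordered list of its children."""
    pre, post = {}, {}
    counters = {"pre": 0, "post": 0}

    def visit(node):
        pre[node] = counters["pre"]      # rank on the way down
        counters["pre"] += 1
        for child in tree.get(node, []):
            visit(child)
        post[node] = counters["post"]    # rank on the way back up
        counters["post"] += 1

    visit(root)
    return pre, post

def is_ancestor(x, y, pre, post):
    """x is an ancestor of y iff x comes before y in preorder
    and after y in postorder."""
    return pre[x] < pre[y] and post[x] > post[y]

# Hypothetical tree shaped like the cottage example.
TREE = {
    "cotglist": ["cottage1", "cottage2"],
    "cottage1": ["nbeds1", "price1"],
    "cottage2": ["room", "price2"],
    "room": ["nbeds2"],
}
pre, post = number_tree(TREE, "cotglist")
print(is_ancestor("cottage2", "nbeds2", pre, post))
print(is_ancestor("cottage1", "nbeds2", pre, post))
```

Once the two ranks are stored with each indexed node, the ancestor-descendant check needs only two integer comparisons and no tree traversal at query time. The recursive numbering is fine for a sketch; very deep documents would need an iterative traversal.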

Future work
- Experiments within INEX (Initiative for the Evaluation of XML retrieval)
- Improving the similarity functions (using a thesaurus, etc.)
- Introducing qualitative preferences (cheapest, nearest, small, etc.)

Thank You
Questions?