Simulation Revised for Graph Pattern Matching. Yinghui Wu, LFCS DB talk, Database Group Meeting Talk, 10/11/2010.

Presentation transcript:

Simulation Revised for Graph Pattern Matching
Yinghui Wu, LFCS DB talk, Database Group Meeting Talk, 10/11/2010

Outline
- Graph Simulation: label equality, edge-to-edge matching relation
- Bounded Simulation: node predicates, edge bound, edge-to-path matching relation
- Reachability Queries and Graph Pattern Queries
  - query containment and minimization – cubic time
  - query evaluation – cubic time
- Conclusion
A first step towards revising simulation for graph pattern matching.

Graph Pattern Matching: the problem
Given a pattern graph P and a data graph G, decide whether G matches P and, if so, find all the matches of P in G.
Applications:
- social queries, social matching
- biology and chemistry network querying
- keyword search, proximity search, …
Widely employed in a variety of emerging real-life applications.
How should a match be defined?

Graph Simulation
- Node label equivalence
- Edge-to-edge relation
Identical label matching, edge-to-edge relations. Capable enough?
[Figure: data graph G and pattern graph P over node labels A, B, D, E, with data nodes v1 and v2.]
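The two conditions on this slide (label equality and edge-to-edge matching) can be sketched as a pruning loop. This is a minimal illustration, not the algorithm from the talk: the function names `graph_simulation` and `matches`, the dict-and-set graph encoding, and the toy graphs are all assumptions made for the example, and the naive fixpoint makes no attempt to be efficient.

```python
def graph_simulation(p_nodes, p_edges, g_nodes, g_edges):
    """Return the maximum simulation as {pattern_node: set(data_nodes)}.

    p_nodes / g_nodes: dict mapping node -> label (hypothetical encoding).
    p_edges / g_edges: iterable of (source, target) pairs.
    """
    # Successor sets of the data graph.
    succ = {v: set() for v in g_nodes}
    for a, b in g_edges:
        succ[a].add(b)

    # Start from label equality, then prune until nothing changes.
    sim = {u: {v for v in g_nodes if g_nodes[v] == p_nodes[u]} for u in p_nodes}
    changed = True
    while changed:
        changed = False
        for u, u2 in p_edges:
            # v may stay in sim[u] only if one of its successors matches u2.
            keep = {v for v in sim[u] if succ[v] & sim[u2]}
            if keep != sim[u]:
                sim[u] = keep
                changed = True
    return sim


def matches(sim):
    """G matches P when every pattern node keeps at least one data node."""
    return all(sim.values())


# Toy example: pattern a(A) -> b(B); data x(A) -> y(B), plus an isolated z(A).
pattern_nodes = {"a": "A", "b": "B"}
pattern_edges = {("a", "b")}
data_nodes = {"x": "A", "y": "B", "z": "A"}
data_edges = {("x", "y")}
example = graph_simulation(pattern_nodes, pattern_edges, data_nodes, data_edges)
# example == {"a": {"x"}, "b": {"y"}}; z is pruned because it has no B successor.
```

Starting from the largest candidate relation and only ever removing pairs is what makes the result the maximum simulation rather than an arbitrary one.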

An example from real-life social matching
[Figure: pattern P (Alice, biologist, doctors) matched against data graph G through edge-to-path mappings.]
Graph simulation is too restrictive!

Bounded Simulation
- data graph G = (V, E, f_A)
- pattern graph P = (V_p, E_p, f_v, f_e)
G matches P via bounded simulation if there is a binary relation from V_p to V such that, for every edge of P, there exists a path in G satisfying the constraints of the edge.
Bounded simulation vs. graph simulation (graph simulation is the special case):
- node matches vs. label equality
- edge-to-path matching vs. edge-to-edge matching
Enriched model for capturing meaningful matches.
[Figure: pattern P with node predicates Id = ‘Alice’, Job = ‘biologist’, Job = ‘doctors’; data graph G with nodes Id = ‘Alice’, Job = ‘biologist’, Job = ‘doctors’, Job = ‘CTO’.]
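A rough sketch of the definition above, assuming node predicates are Python callables over attribute dicts and each pattern edge carries an integer bound (or None for "any nonempty path"). The names `bounded_simulation` and `nonempty_path_lengths` and the data encoding are hypothetical; the sketch recomputes distances with plain breadth-first search and is not the cubic-time algorithm referred to in the talk.

```python
from collections import deque


def nonempty_path_lengths(succ, source):
    """Shortest length of a nonempty path from source to each reachable node."""
    dist = {}
    queue = deque((w, 1) for w in succ[source])
    while queue:
        node, d = queue.popleft()
        if node in dist:
            continue  # already reached by a path that is no longer
        dist[node] = d
        queue.extend((w, d + 1) for w in succ[node])
    return dist


def bounded_simulation(p_preds, p_edges, g_attrs, g_edges):
    """Return the maximum match as {pattern_node: set(data_nodes)}.

    p_preds: {pattern_node: predicate over an attribute dict}
    p_edges: {(u, u2): bound}, bound an int or None for unbounded
    g_attrs: {data_node: attribute dict}
    g_edges: iterable of (source, target) pairs
    """
    succ = {v: [] for v in g_attrs}
    for a, b in g_edges:
        succ[a].append(b)
    dist = {v: nonempty_path_lengths(succ, v) for v in g_attrs}

    # Start from the node predicates, then prune on the edge constraints.
    sim = {u: {v for v in g_attrs if pred(g_attrs[v])} for u, pred in p_preds.items()}
    changed = True
    while changed:
        changed = False
        for (u, u2), bound in p_edges.items():
            keep = {
                v for v in sim[u]
                if any(w in sim[u2] and (bound is None or d <= bound)
                       for w, d in dist[v].items())
            }
            if keep != sim[u]:
                sim[u] = keep
                changed = True
    return sim


# Toy data graph: alice -> cto -> bio (two hops from Alice to the biologist).
attrs = {"alice": {"id": "Alice"}, "cto": {"job": "CTO"}, "bio": {"job": "biologist"}}
edges = [("alice", "cto"), ("cto", "bio")]
preds = {
    "A": lambda a: a.get("id") == "Alice",
    "B": lambda a: a.get("job") == "biologist",
}
within_two = bounded_simulation(preds, {("A", "B"): 2}, attrs, edges)
within_one = bounded_simulation(preds, {("A", "B"): 1}, attrs, edges)
# within_two["A"] == {"alice"}; within_one["A"] is empty, since no 1-hop path exists.
```

With every bound set to 1 and predicates that test label equality, the same loop reduces to plain graph simulation.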

Basic results for bounded simulation
- For any graph G and pattern P, if G matches P, then there is a unique maximum match in G for P.
- The graph pattern matching problem via bounded simulation can be solved in cubic time.
- The incremental bounded simulation problem
Efficient approaches for graph pattern matching.
Extension for multiple edge colors?

Considering edge types…
Real-life graphs have multiple edge types.
Essembly network edge types: friends-allies, friends-nemeses, strangers-nemeses, strangers-allies.

Querying the Essembly network: an example
Essembly network edge types: fa, fn, sn, sa.
[Figure: pattern P over nodes Alice, biologists supporting cloning, and doctors against cloning, with edge constraints fa<=2, sa<=2, fn, fa<=2, sn, fa+.]
Pattern queries with multiple edge types.

Graph reachability and pattern queries
Real-life graphs usually bear different edge types…
- data graph G = (V, E, f_A, f_C)
- Reachability query (RQ): (u_1, u_2, f_u1, f_u2, f_e), where f_e is drawn from a subclass of regular expressions:
    F ::= c | c≤k | c+ | FF
- Q_r(G): the set of node pairs (v_1, v_2) such that there is a nonempty path from v_1 to v_2 and the edge colors on the path match the pattern specified by f_e.
[Figure: an RQ from (Job=‘biologist’, sp=‘cloning’) to (Job=‘doctors’) with edge constraint fa<=2 fn.]
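To make the semantics of Q_r(G) concrete, here is a small sketch that evaluates an RQ over an edge-colored graph. The encoding is an assumption for illustration: an expression F is a list of (color, max_len) atoms, where c is (c, 1), c≤k is (c, k), c+ is (c, None), and concatenation FF is list order. The names `step` and `eval_rq` are hypothetical and not taken from the talk.

```python
def step(frontier, color, max_len, succ):
    """Nodes reachable from frontier via 1..max_len edges of one color.

    max_len=None means one or more edges (the c+ case).
    """
    reached, current, hops = set(), set(frontier), 0
    while current and (max_len is None or hops < max_len):
        nxt = {w for v in current for w in succ.get((v, color), ())}
        current = nxt - reached  # expand only nodes not seen at a shorter distance
        reached |= nxt
        hops += 1
    return reached


def eval_rq(nodes, edges, pred1, pred2, expr):
    """Return the set of pairs (v1, v2) answering the reachability query.

    nodes: {node: attribute dict}; edges: iterable of (source, color, target);
    pred1 / pred2: predicates on the attributes of the two endpoints;
    expr: list of (color, max_len) atoms, read left to right.
    """
    succ = {}
    for a, color, b in edges:
        succ.setdefault((a, color), set()).add(b)

    result = set()
    for v1, attrs in nodes.items():
        if not pred1(attrs):
            continue
        frontier = {v1}
        for color, max_len in expr:
            frontier = step(frontier, color, max_len, succ)
        result |= {(v1, v2) for v2 in frontier if pred2(nodes[v2])}
    return result


# Toy graph for the expression fa≤2 fn between biologists and doctors.
people = {
    "b1": {"job": "biologist"}, "x": {}, "y": {},
    "d1": {"job": "doctors"}, "d2": {"job": "doctors"}, "d3": {"job": "doctors"},
}
links = [
    ("b1", "fa", "x"), ("x", "fa", "y"), ("y", "fn", "d1"),
    ("x", "fn", "d3"),
    ("b1", "fn", "d2"),  # no leading fa edge, so this path does not match
]
answer = eval_rq(
    people, links,
    lambda a: a.get("job") == "biologist",
    lambda a: a.get("job") == "doctors",
    [("fa", 2), ("fn", 1)],
)
# answer == {("b1", "d1"), ("b1", "d3")}
```

Because every atom requires at least one edge, any path accepted by this evaluation is nonempty, as the definition demands.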

Graph pattern queries
- Graph pattern query (PQ): Q_p = (V_p, E_p, f_v, f_e), where for each edge e = (u, u'), Q_e = (u, u', f_v(u), f_v(u'), f_e(e)) is an RQ.
- Q_p(G) is the maximum set of pairs (e, S_e) such that
  - for any two edges e_1 = (u_1, u_2) and e_2 = (u_2, u_3), if (v_1, v_2) is in S_e1, then there is a v_3 such that (v_2, v_3) is in S_e2;
  - for any two edges e_1 = (u_1, u_2) and e_2 = (u_1, u_3), if (v_1, v_2) is in S_e1, then there is a v_3 such that (v_1, v_3) is in S_e2.
- PQ vs. simulation and bounded simulation:
  - search conditions on query nodes
  - mapping edges to paths
  - constraining the edges on the path with a regular expression
RQ and bounded simulation are special cases of PQ.
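The two closure conditions on the sets S_e can be read as a pruning rule. The sketch below assumes the candidate pairs for each pattern edge (for instance, the answers of the edge's RQ) have already been computed and only illustrates the refinement to a fixpoint. The name `refine_pq` and the dict encoding are hypothetical, at most one pattern edge per ordered node pair is assumed, and this naive loop is neither JoinMatch nor SplitMatch.

```python
def refine_pq(edge_matches):
    """Prune candidate pairs until both closure conditions hold.

    edge_matches: {(u, u2): set of (v, v2)} candidate pairs per pattern edge.
    Returns a new dict with the same keys and the pruned sets S_e.
    """
    S = {e: set(pairs) for e, pairs in edge_matches.items()}
    edges_out_of = {}
    for e in S:
        edges_out_of.setdefault(e[0], []).append(e)

    changed = True
    while changed:
        changed = False
        # Data nodes that currently start at least one pair, per pattern edge.
        sources = {e: {v for v, _ in pairs} for e, pairs in S.items()}
        for (u1, u2), pairs in list(S.items()):
            keep = {
                (v1, v2) for v1, v2 in pairs
                # v1 must start a pair for every pattern edge leaving u1 ...
                if all(v1 in sources[e] for e in edges_out_of.get(u1, ()))
                # ... and v2 must start a pair for every pattern edge leaving u2.
                and all(v2 in sources[e] for e in edges_out_of.get(u2, ()))
            }
            if keep != pairs:
                S[(u1, u2)] = keep
                changed = True
    return S


# Pattern A -> B -> C with candidate pairs per edge.
candidates = {
    ("A", "B"): {("a1", "b1"), ("a2", "b2")},
    ("B", "C"): {("b1", "c1")},
}
refined = refine_pq(candidates)
# refined[("A", "B")] == {("a1", "b1")}: b2 starts no pair for edge (B, C).
```

The `sources` snapshot may be stale within a pass, but the loop only stops after a full pass with no change, at which point the snapshot is exact and both conditions hold.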

Reachability and graph pattern query: examples
Edge types: fa, fn, sn, sa.
[Figure: an RQ from (Job=‘biologist’, sp=‘cloning’) to (Job=‘doctors’) with edge constraint fa<=2 fn, and a PQ over nodes Id=‘Alice’, (Job=‘biologist’, sp=‘cloning’), (Job=‘doctors’, dsp=‘cloning’) with edge constraints fa<=2, sa<=2, fn, fa<=2, sn, fa+.]

Fundamental problems: query containment
- A PQ Q_1 = (V_1, E_1, f_v1, f_e1) is contained in Q_2 = (V_2, E_2, f_v2, f_e2) if there exists a mapping λ from E_1 to E_2 such that, for any data graph G and any e in E_1, S_e is a subset of S_λ(e); i.e., λ is a renaming function under which Q_1(G) is mapped to Q_2(G).
- Query containment and equivalence can both be determined in cubic time:
  - query similarity is based on a revision of graph simulation;
  - query similarity can be determined in cubic time.
Query containment and equivalence for PQs can be solved efficiently.

Query containment: example
[Figure: pattern queries Q1, Q2, Q3 over nodes B1, B2, B3 and C1 through C6, with edge constraints h<=1, h<=2, h<=3.]

Fundamental problems: query minimization
Query minimization problem:
- input: a PQ Q_p
- output: a minimized PQ Q_m equivalent to Q_p
The query minimization problem can be solved in cubic time:
- compute the maximum node equivalence classes based on a revision of graph simulation;
- determine the number of redundant nodes and edges based on the equivalence classes;
- remove redundant and isolated nodes and edges.
Query minimization for PQs can be solved efficiently.

Query minimization: example
[Figure: pattern queries Q1, Q2, Q3 over nodes labeled R, B, C, with edge constraints f, g, g<=3, h<=2.]

Evaluating graph pattern queries
PQs can be answered in cubic time.
Join-based algorithm JoinMatch:
- matrix index vs. distance cache
- join operation for each edge in the PQ until a fixpoint is reached (w.r.t. a reversed topological order)
Split-based algorithm SplitMatch:
- blocks: treating pattern nodes and data nodes uniformly
- partition-relation pair
Graph pattern matching can be solved in polynomial time.

Example of JoinMatch
Edge types: fa, fn, sn, sa.
[Figure: PQ over nodes Id=‘Alice’, (Job=‘biologist’, sp=‘cloning’), (Job=‘doctors’, dsp=‘cloning’) with edge constraints fa<=2, sa<=2, fn, fa<=2, sn, fa+.]

Experimental results – effectiveness of PQs
Effectiveness of PQs: edge-to-path relations.

Experimental results – querying real-life graphs
Evaluation algorithms are sensitive to pattern edges.
[Figure: two panels, varying |Vp| and varying |Ep|.]

Experimental results – querying real-life graphs
The algorithms are sensitive to the number of predicates.
[Figure: two panels, varying |pred| and varying b.]

Experimental results – querying synthetic graphs
The algorithms scale well over large synthetic graphs.
[Figure: two panels, varying |V| (x10^5) and varying b.]

Experimental results – querying synthetic graphs
The algorithms scale well over large synthetic graphs.
[Figure: two panels, varying α and varying cr.]

Conclusion
Simulation revised for graph pattern matching:
- Bounded Simulation: node predicates, edge bound, edge-to-path matching relation
- Reachability Queries and Graph Pattern Queries
  - query containment and minimization – cubic time
  - query evaluation – cubic time
Future work:
- extending RQs and PQs by supporting general regular expressions
- incremental evaluation of RQs and PQs

“Those who were trained to fly didn’t know the others. One group of people did not know the other group.” (Bin Laden)
Terrorist Collaboration Network ( )
Thank you!