Mining Graphs with Constraints on Symmetry and Diameter Natalia Vanetik Deutsche Telekom Laboratories at Ben-Gurion University IWGD10 workshop July 14th,

Presentation transcript:

Mining Graphs with Constraints on Symmetry and Diameter
Natalia Vanetik, Deutsche Telekom Laboratories at Ben-Gurion University
IWGD10 workshop, July 14th, 2010, Jiuzhaigou, China

Graph mining (1) Problem statement

Graph mining (2) Motivation
Graphs are everywhere
– Chemical compounds (cheminformatics)
– Protein structures, biological pathways/networks (bioinformatics)
– Program control flow, traffic flow, and workflow analysis
– XML databases, Web, and social network analysis
Graph is a general model
– Trees, lattices, sequences, and items are degenerate graphs
Diversity of graphs
– Directed vs. undirected, labeled vs. unlabeled (edges & vertices), weighted, with angles & geometry (topological vs. 2-D/3-D)
Complexity of algorithms: many problems are of high complexity (NP-complete or even PSPACE!)

Graphs, graphs, everywhere
[Figures: aspirin molecule; yeast protein interaction network, from H. Jeong et al., Nature 411, 41 (2001); the Internet; a co-author network]

Constraints: diameter
Diameter d(G) of a graph G is the maximum, over all pairs of its vertices, of the shortest-path distance between them. d(G)=1 implies that G is complete. d(G)=∞ implies that G is not connected.
[Figure: example graphs with d(G)=1, d(G)=2, and d(G)=∞]
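
The definition of d(G) can be made concrete with a minimal sketch that runs a breadth-first search from every vertex. The adjacency-dictionary representation and the function name `diameter` are assumptions made for illustration; they do not come from the slides.

```python
from collections import deque
import math

def diameter(adj):
    """Return d(G) for an undirected graph given as {vertex: set_of_neighbours}.

    Returns math.inf when the graph is not connected.
    """
    best = 0
    for source in adj:
        # Breadth-first search gives shortest-path distances from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        if len(dist) < len(adj):
            return math.inf  # some vertex is unreachable from `source`
        best = max(best, max(dist.values()))
    return best

# Illustrative graphs (hypothetical, not the ones drawn on the slide).
complete_k4 = {v: {u for u in range(4) if u != v} for v in range(4)}
path_p3 = {0: {1}, 1: {0, 2}, 2: {1}}
two_components = {0: {1}, 1: {0}, 2: set()}
```

Each search costs time linear in the size of the graph, so the whole computation is polynomial.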

Constraints: symmetry
Symmetries of a graph G are determined by its automorphism group Aut(G). Aut(G) is a permutation group. The largest possible automorphism group for a graph on n vertices is the symmetric group S_n, which has order n!.
[Figure: four example graphs with Aut(G)=S_5, Aut(G)=S_3, Aut(G)=D_5, and Aut(G)=S_5]

Measuring symmetry and diameter (1)
Graph diameter is computable in polynomial time. No polynomial-time algorithm is known for computing the automorphism group of a graph.
– Best-known algorithm: nauty by B. McKay, which outputs a set of generators of Aut(G).
Intuitively, graphs with smaller diameter and higher symmetry are more interesting.
[Figure: example graphs with d(G)=2 and d(G)=3]

Measuring symmetry and diameter (2)
Symmetry is harder to measure. Observation: a graph achieves maximum symmetry when its automorphism group is the full symmetric group on its vertices. Suggestion: measure symmetry of a graph G on n vertices as s(G)=|Aut(G)|/|S_n|=|Aut(G)|/n!.
[Figure: three example graphs with s(G)=|S_5|/|S_5|=1, s(G)=|S_3|/|S_5|=1/20, and s(G)=|D_5|/|S_5|=1/12]
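
As a sanity check on the ratio of |Aut(G)| to |S_n|, the sketch below counts automorphisms by brute force over all vertex permutations. This is an illustration for very small graphs only, not the nauty-based approach mentioned in the talk; the function name `symmetry` and the edge-list representation are assumptions.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def symmetry(n, edges):
    """Return |Aut(G)| / n! for a graph on vertices 0..n-1 (brute force)."""
    edge_set = {frozenset(e) for e in edges}
    automorphisms = 0
    for perm in permutations(range(n)):
        # perm is an automorphism iff it maps the edge set onto itself.
        image = {frozenset((perm[u], perm[v])) for u, v in edges}
        if image == edge_set:
            automorphisms += 1
    return Fraction(automorphisms, factorial(n))

# Complete graph K5 (Aut = S_5) and the 5-cycle (Aut = D_5, order 10).
k5 = [(u, v) for u in range(5) for v in range(u + 1, 5)]
c5 = [(i, (i + 1) % 5) for i in range(5)]
```

For these two graphs the function returns 1 and 1/12, matching the values quoted for S_5 and D_5.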

Tree decomposition of a graph
Let G=(V,E) be a graph. Tree T is called a tree decomposition of G if
– Nodes of T are subsets X_1,…,X_n ⊆ V such that X_1 ∪ … ∪ X_n = V.
– If vertex v ∈ X_i ∩ X_j, then every node X_k of T on the path from X_i to X_j contains v as well.
– For every edge e=(v,u) there exists i so that u,v ∈ X_i.
[Figure: graph G with two tree decompositions, T_1=({{1,2,3,4}}, ∅) and T_2=({{1,2,4},{2,3,4}}, {({1,2,4},{2,3,4})})]
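
The three conditions can be checked mechanically. The following sketch assumes that bags are given as a list of sets and that `tree_edges` (pairs of bag indices) already form a tree. The example graph is hypothetical: the drawing of G is not part of the transcript, so its edge set here is a guess that is merely compatible with T_1 and T_2.

```python
def is_tree_decomposition(vertices, edges, bags, tree_edges):
    """Check the three tree-decomposition conditions.

    bags: list of vertex sets (nodes of T); tree_edges: pairs of bag indices.
    Assumes tree_edges form a tree on the bags.
    """
    # Condition 1: every vertex appears in some bag.
    if set().union(*bags) != set(vertices):
        return False
    # Condition 3: every graph edge lies inside some bag.
    for u, v in edges:
        if not any(u in bag and v in bag for bag in bags):
            return False
    # Condition 2: for each vertex, the bags containing it must induce a
    # connected subtree (equivalent to the path condition in a tree).
    for v in vertices:
        holding = {i for i, bag in enumerate(bags) if v in bag}
        start = next(iter(holding))
        seen, stack = {start}, [start]
        while stack:
            i = stack.pop()
            for a, b in tree_edges:
                for x, y in ((a, b), (b, a)):
                    if x == i and y in holding and y not in seen:
                        seen.add(y)
                        stack.append(y)
        if seen != holding:
            return False
    return True

# Hypothetical graph on vertices 1..4 (assumed, figure not available).
g_vertices = [1, 2, 3, 4]
g_edges = [(1, 2), (1, 4), (2, 4), (2, 3), (3, 4)]
t1 = ([{1, 2, 3, 4}], [])
t2 = ([{1, 2, 4}, {2, 3, 4}], [(0, 1)])
# Vertex 4 sits in the first and last bag but not the middle one.
broken = ([{1, 2, 4}, {2, 3}, {3, 4}], [(0, 1), (1, 2)])
```

Under this assumed edge set both T_1 and T_2 pass, while `broken` fails the path condition for vertex 4.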

Minimal tree decomposition
Width of a tree decomposition T is (max_i |X_i|)-1. The minimum width among all tree decompositions of a graph is called the tree width of the graph. Tree width is at least the maximum clique size minus 1, with equality for chordal graphs, because every clique of G must be contained in a single node of any tree decomposition. A tree decomposition of minimum width is called a minimal tree decomposition. Computing a minimal tree decomposition is an NP-hard problem.
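
A small sketch of the width computation, together with a brute-force maximum-clique routine, illustrates how clique size relates to width. The helper names and the example graph are assumptions for illustration; the clique search is exponential and only suitable for tiny inputs.

```python
from itertools import combinations

def width(bags):
    """Width of a tree decomposition: largest bag size minus one."""
    return max(len(bag) for bag in bags) - 1

def max_clique_size(vertices, edges):
    """Brute-force maximum clique size; exponential, tiny graphs only."""
    edge_set = {frozenset(e) for e in edges}
    for k in range(len(vertices), 0, -1):
        for subset in combinations(vertices, k):
            # A subset is a clique when every pair in it is an edge.
            if all(frozenset(p) in edge_set for p in combinations(subset, 2)):
                return k
    return 0

# Hypothetical graph on vertices 1..4 (assumed, figure not available).
g_vertices = [1, 2, 3, 4]
g_edges = [(1, 2), (1, 4), (2, 4), (2, 3), (3, 4)]
bags_t1 = [{1, 2, 3, 4}]
bags_t2 = [{1, 2, 4}, {2, 3, 4}]
```

Here the single-bag decomposition has width 3 and the two-bag one has width 2. The width of any tree decomposition is at least the maximum clique size minus 1, since a clique must fit inside one bag, and for this graph the two-bag decomposition meets that bound.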

Different tree decompositions
[Figures: a minimal tree decomposition and a non-minimal tree decomposition of the same graph]

Intuition behind the proposed algorithm
1. Compute the finest tree decomposition possible for every DB transaction under given time constraints.
2. Use a basic pattern-growing algorithm, such as FSG or gSpan, to extend instances of frequent patterns.
3. Every time an instance of a frequent pattern is extended by an edge or a node:
   a. Compute its diameter and symmetry estimates based on the pattern's position within the tree decomposition of a DB transaction.
   b. If one of the estimates fails to satisfy the user-specified symmetry or diameter constraint, remove the pattern's instance from the instance list.
   c. Otherwise, keep the instance in the list.
   d. If the count of instances is higher than the support bound, this is a frequent pattern.
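
Step 3 can be sketched as a filter over pattern instances. This is only an outline under stated assumptions: the constraints are taken to be a maximum diameter and a minimum symmetry, the estimates are a lower bound on diameter and an upper bound on symmetry, and the two estimate functions are hypothetical callbacks supplied by the caller. None of these names come from the slides.

```python
def filter_instances(instances, diameter_lower_bound, symmetry_upper_bound,
                     max_diameter, min_symmetry, min_support):
    """Drop instances whose estimates rule out satisfying the constraints.

    diameter_lower_bound, symmetry_upper_bound: hypothetical functions that
    map an instance to its estimate. Returns (kept_instances, is_frequent).
    """
    kept = []
    for inst in instances:
        if diameter_lower_bound(inst) > max_diameter:
            continue  # diameter is certainly too large
        if symmetry_upper_bound(inst) < min_symmetry:
            continue  # symmetry is certainly too small
        kept.append(inst)
    # The slide says "higher than support bound", hence the strict comparison.
    return kept, len(kept) > min_support

# Toy instances carrying precomputed estimates (made-up numbers).
toy = [{"d": 1, "s": 1.0}, {"d": 3, "s": 0.5}, {"d": 2, "s": 0.01}]
kept, frequent = filter_instances(
    toy, lambda i: i["d"], lambda i: i["s"],
    max_diameter=2, min_symmetry=0.1, min_support=0)
```

Because the estimates are one-sided bounds, an instance that survives the filter is not guaranteed to satisfy the constraints; the filter only removes instances that provably cannot.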

How does it work?
Let T be a tree decomposition of DB graph transaction t. Let G ⊆ t be an instance of a candidate pattern. Let T_G=(V_G,E_G) ⊆ T be the minimal subtree of T containing G.
Claim 1. d(G) ≥ d(T_G).
Claim 2. s(G) ≤ (|LAut(T_G)| · ∏_{X ∈ V_G} |X\E_G|! · ∏_{e ∈ E_G} |e|!) / |G|!, where LAut(T_G) is the automorphism group of T_G viewed as a tree in which each node X is labeled by |X|.

Example (1)
Pattern instance and corresponding subtree of a minimal tree decomposition.
[Figures: two pattern instances; for the first, diameter is at least 1; for the second, diameter is at least 2]

Example (2)
Pattern instance and corresponding subtree of a minimal tree decomposition.
[Figures: two pattern instances; for the first, symmetry is at most 1; for the second, symmetry is at most 2*2!*1!*1!/4!=1/6]
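
The arithmetic in this example can be reproduced with a short sketch of the bound from Claim 2. Several readings are assumptions here: |e| is taken to be the number of vertices shared by the two bags joined by tree edge e, |X\E_G| is the number of vertices of bag X not shared with any neighbouring bag, and |LAut(T_G)| is supplied by the caller rather than computed. The bags used below are those of T_2 from the tree decomposition slide, which is a guess at what the missing figure shows.

```python
from fractions import Fraction
from math import factorial

def symmetry_upper_bound(bags, tree_edges, laut_order, pattern_size):
    """Evaluate |LAut| * prod(|X minus E_G|!) * prod(|e|!) / |G|!.

    bags: list of vertex sets (nodes of T_G); tree_edges: pairs of bag
    indices; laut_order: |LAut(T_G)|, supplied by the caller (assumption).
    """
    numerator = laut_order
    shared_by_bag = [set() for _ in bags]
    for i, j in tree_edges:
        shared = bags[i] & bags[j]  # |e| read as the size of this overlap
        numerator *= factorial(len(shared))
        shared_by_bag[i] |= shared
        shared_by_bag[j] |= shared
    for bag, shared in zip(bags, shared_by_bag):
        # Vertices private to this bag.
        numerator *= factorial(len(bag - shared))
    return Fraction(numerator, factorial(pattern_size))

single_bag = symmetry_upper_bound([{1, 2, 3, 4}], [], 1, 4)
two_bags = symmetry_upper_bound([{1, 2, 4}, {2, 3, 4}], [(0, 1)], 2, 4)
```

Under these readings the single-bag case gives 1 and the two-bag case gives 2*2!*1!*1!/4! = 1/6, in agreement with the numbers on the slide.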

Properties of estimates

The algorithm

Correctness

Complexity concerns

Test results (symmetry) [three slides; content not captured in the transcript]

Test results (diameter) [three slides; content not captured in the transcript]