Graph Database Mining and Its Applications Luke Huan Feb 19, 2004
Presentation Overview
  Introduction
    Finding recurring subgraphs from graph databases
  Algorithms
  Applications
    Bioinformatics
    Social network
    Communication network
  [Figure: left, a social network; right, protein structure 1L06]
11/30/2018
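An Example of Graph Database Mining
  Input: a database of labeled undirected graphs and a support threshold (greater than 0, at most 1).
  [Figure: three labeled graphs, (P) with nodes p1-p5, (Q) with nodes q1-q3, and (S) with nodes s1-s3; labels a, b, c, d, x, y. Threshold = 2/3.]
  Output: all (connected) frequent subgraphs from the graph database, that is, those occurring in at least 2/3 of the graphs in this example.
  [Figure: the frequent subgraphs found.]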
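The notion of support behind this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy database below is hypothetical (it is not the P, Q, S database from the slide), the names `contains`, `support`, and `is_frequent` are made up for this sketch, and containment is checked by brute force as a non-induced labeled subgraph match, which is practical only for very small graphs.

```python
from fractions import Fraction
from itertools import permutations

# A graph is a pair (node_labels, edge_labels):
#   node_labels: {node_id: label}
#   edge_labels: {frozenset({u, v}): label}   (undirected, no self-loops)

def contains(graph, pattern):
    """Return True if `pattern` occurs in `graph` with matching labels.

    Brute force: try every injective mapping of pattern nodes to graph nodes.
    """
    g_nodes, g_edges = graph
    p_nodes, p_edges = pattern
    p_ids = list(p_nodes)
    for image in permutations(g_nodes, len(p_ids)):
        m = dict(zip(p_ids, image))
        # Node labels must agree under the mapping.
        if any(p_nodes[v] != g_nodes[m[v]] for v in p_ids):
            continue
        # Every pattern edge must map to a graph edge with the same label.
        if all(g_edges.get(frozenset(m[v] for v in e)) == lab
               for e, lab in p_edges.items()):
            return True
    return False

def support(pattern, database):
    """Fraction of graphs in the database that contain the pattern."""
    return Fraction(sum(contains(g, pattern) for g in database), len(database))

def is_frequent(pattern, database, threshold):
    return support(pattern, database) >= threshold

# Hypothetical toy database of three labeled graphs.
G1 = ({1: "c", 2: "d", 3: "a"},
      {frozenset({1, 2}): "x", frozenset({2, 3}): "y"})
G2 = ({1: "c", 2: "d", 3: "b"},
      {frozenset({1, 2}): "x", frozenset({1, 3}): "y"})
G3 = ({1: "c", 2: "d"},
      {frozenset({1, 2}): "y"})
database = [G1, G2, G3]

# Pattern A: a single edge labeled x between a c-node and a d-node.
pattern_a = ({"u": "c", "v": "d"}, {frozenset({"u", "v"}): "x"})
# Pattern B: pattern A extended by a y-edge from the d-node to an a-node.
pattern_b = ({"u": "c", "v": "d", "w": "a"},
             {frozenset({"u", "v"}): "x", frozenset({"v", "w"}): "y"})

threshold = Fraction(2, 3)
print(support(pattern_a, database), is_frequent(pattern_a, database, threshold))
print(support(pattern_b, database), is_frequent(pattern_b, database, threshold))
```

With a threshold of 2/3 and three graphs, a pattern is frequent when it occurs in at least two of them. Using `Fraction` avoids floating-point surprises when the support equals the threshold exactly.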
FSG: an Apriori-like method
Framework
Candidate Generation
How about using paths?
  Graphs can be decomposed into a set of paths.
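To make the idea of decomposing a graph into paths concrete, here is a minimal sketch of one simple way to do it: a greedy walk that splits the edges of an undirected graph into edge-disjoint simple paths. It is an illustration only, not the procedure of any paper in the references, and it does not try to minimize the number of paths. The function name `decompose_into_paths` is hypothetical, and the sketch assumes no self-loops and vertex ids that can be sorted.

```python
def decompose_into_paths(edges):
    """Greedily split an undirected graph into edge-disjoint simple paths.

    `edges` is an iterable of (u, v) pairs. Every edge ends up on exactly
    one returned path; each path is a list of vertices.
    """
    # Adjacency restricted to edges not yet assigned to any path.
    unused = {}
    for u, v in edges:
        unused.setdefault(u, set()).add(v)
        unused.setdefault(v, set()).add(u)

    paths = []
    while True:
        # Start from the smallest vertex that still has an unused edge.
        starts = [v for v in sorted(unused) if unused[v]]
        if not starts:
            break
        path = [starts[0]]
        on_path = {starts[0]}
        while True:
            here = path[-1]
            # Only step to vertices not already on this path (keeps it simple).
            options = sorted(unused[here] - on_path)
            if not options:
                break
            nxt = options[0]
            unused[here].discard(nxt)
            unused[nxt].discard(here)
            path.append(nxt)
            on_path.add(nxt)
        paths.append(path)
    return paths

# A triangle 1-2-3 with a pendant edge 3-4.
example_edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
example_paths = decompose_into_paths(example_edges)
print(example_paths)
```

On this example the walk first follows 1, 2, 3, 4 and then covers the remaining edge between 1 and 3 as a second path. Choosing the smallest available vertex at each step keeps the output deterministic.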
Application I: Protein structure analysis
  [Figure: protein structure with labeled residues ASP 102, ALA 55, SER 214, HIS 57, GLY 140, GLY 142, ASP 194, GLY 43, GLY 196]
Application II: Finding social contacts
References
  X. Yan, J. Han. gSpan: Graph-Based Substructure Pattern Mining. ICDM 2002.
  J. Huan, W. Wang, J. Prins. Efficient Mining of Frequent Subgraphs in the Presence of Isomorphism. ICDM 2003.
  M. Kuramochi, G. Karypis. Finding Frequent Patterns in a Large Sparse Graph. SIAM 2004.
  N. Vanetik, E. Gudes. Mining Frequent Labeled and Partially Labeled Graph Patterns. ICDE 2004.