Published by Loreen Cross. Modified over 9 years ago.
1
An Efficient Algorithm for Discovering Frequent Subgraphs
Michihiro Kuramochi and George Karypis
ICDM, 2001
Presenter: 蔡明瑾
2
Introduction
Structural patterns: biology, chemistry (chemical compounds)
Graph: vertex = item, edge = relation between items
Undirected, connected, labeled graph
[Figure: example graph with vertex labels b, a, a and edge labels x, x, y]
3
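Graph Isomorphism
G1(V1, E1) and G2(V2, E2) are isomorphic if they are topologically identical to each other: there is a mapping from V1 to V2 such that each edge in E1 is mapped to a single edge in E2 and vice versa. For labeled graphs, the mapping must also preserve vertex and edge labels.
[Figure: two drawings of the same graph on vertices v0, v1, v2 (vertex labels b, a, a; edge labels x, x, y), shown as equal]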
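The definition above can be checked directly on small graphs by trying every vertex mapping. The following is a minimal brute-force sketch, not the method used in the paper. The representation (a list of vertex labels plus a dict from vertex pairs to edge labels) and the names `edge` and `is_isomorphic` are assumptions made for illustration.

```python
from itertools import permutations


def edge(u, v):
    """Undirected edge key (hypothetical helper): the pair {u, v}."""
    return frozenset((u, v))


def is_isomorphic(labels1, edges1, labels2, edges2):
    """Brute-force isomorphism test for small labeled, undirected graphs.

    labels: list of vertex labels, indexed by vertex id.
    edges:  dict mapping edge(u, v) to an edge label.
    """
    n = len(labels1)
    if n != len(labels2) or len(edges1) != len(edges2):
        return False
    for perm in permutations(range(n)):
        # perm[u] is the vertex of G2 that vertex u of G1 is mapped to.
        if any(labels1[u] != labels2[perm[u]] for u in range(n)):
            continue  # vertex labels are not preserved
        if all(edges2.get(frozenset(perm[u] for u in e)) == lab
               for e, lab in edges1.items()):
            return True  # every edge maps to an edge with the same label
    return False


# The two drawings from the slide: vertex labels b, a, a and edge labels x, x, y.
g1 = (["b", "a", "a"], {edge(0, 1): "x", edge(0, 2): "x", edge(1, 2): "y"})
g2 = (["a", "b", "a"], {edge(0, 1): "x", edge(0, 2): "y", edge(1, 2): "x"})
# Same labels, but here the b vertex has one x edge and one y edge.
g3 = (["a", "b", "a"], {edge(0, 1): "x", edge(0, 2): "x", edge(1, 2): "y"})
```

Because both graphs have the same number of edges and the vertex mapping is a bijection, checking the edges of G1 against G2 is enough to cover the "vice versa" part of the definition. The search tries up to |V|! mappings, which is why the following slides look for ways to cut the number of orderings.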
4
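Canonical labeling
Adjacency matrix
The code is the sequence of vertex labels followed by the entries of the upper triangle of the adjacency matrix.

Ordering 1 (v0 = b, v1 = a, v2 = a):
          v0   v1   v2
v0 (b)    -    x    x
v1 (a)    x    -    y
v2 (a)    x    y    -
code = baaxxy

Ordering 2 (v0 = a, v1 = b, v2 = a):
          v0   v1   v2
v0 (a)    -    x    y
v1 (b)    x    -    x
v2 (a)    y    x    -
code = abaxyx

Both orderings describe the same graph, yet they produce different codes.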
5
Canonical labeling
Different permutations of the vertices lead to different codes.
A graph with |V| vertices has |V|! vertex orderings.
The canonical label is the largest code over all orderings.
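A minimal sketch of this brute-force definition follows. It assumes single-character labels, writes a missing edge as "0", and reads the upper triangle column by column. The names `code` and `canonical_label` are illustrative and not taken from the paper.

```python
from itertools import permutations


def code(labels, edges, order):
    """Code of a graph for one vertex ordering.

    Vertex labels in the given order, followed by the upper triangle of the
    adjacency matrix read column by column. A missing edge is written as "0"
    (an assumption of this sketch).
    """
    n = len(order)
    vertex_part = "".join(labels[v] for v in order)
    edge_part = "".join(
        edges.get(frozenset((order[i], order[j])), "0")
        for j in range(1, n)
        for i in range(j)
    )
    return vertex_part + edge_part


def canonical_label(labels, edges):
    """Largest code over all |V|! vertex orderings."""
    return max(code(labels, edges, p) for p in permutations(range(len(labels))))


# Example graph from the slides: v0 = b, v1 = a, v2 = a.
labels = ["b", "a", "a"]
edges = {frozenset((0, 1)): "x", frozenset((0, 2)): "x", frozenset((1, 2)): "y"}

print(code(labels, edges, (0, 1, 2)))   # baaxxy
print(code(labels, edges, (1, 0, 2)))   # abaxyx
print(canonical_label(labels, edges))   # baaxxy
```

Two isomorphic graphs get the same canonical label regardless of how their vertices were numbered, so comparing labels replaces an explicit isomorphism test.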
6
Vertex invariants
Properties that do not change across isomorphism mappings:
Vertex degree
Vertex label
Siblings
[Figure: example graph with vertex labels b, a, a and edge labels x, x, y]
7
Vertex Degrees and Labels
Adjacency matrix
Partition the vertices by degree and label so that every partition contains vertices with the same degree and the same label.
8
Vertex Degrees and Labels
Degree: p0 = {v0, v1, v2}: 2
Degree + label: p0 = {v1, v2}: (2, a), p1 = {v0}: (2, b)

          v0   v1   v2
v0 (b)    -    x    x
v1 (a)    x    -    y
v2 (a)    x    y    -
code = baaxxy
9
Vertex Degrees and Labels
Partitions: p0 = {v1, v2}: (2, a), p1 = {v0}: (2, b)
Vertices are ordered partition by partition:

          v1   v2   v0
v1 (a)    -    y    x
v2 (a)    y    -    x
v0 (b)    x    x    -
code = aabyxx

Orderings to check. Before: 3!. Now: 2! × 1!.
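The saving from 3! to 2! × 1! orderings can be reproduced with a short sketch. It assumes partitions are sorted by their (degree, label) key and that vertices are permuted only inside their own partition. The function names are hypothetical, and the code convention (missing edge written as "0", upper triangle read column by column) is the same assumption as in the brute-force reading of the code.

```python
from itertools import permutations, product


def partition_vertices(labels, edges):
    """Group vertices by the invariant (degree, label), sorted by that key."""
    degree = [0] * len(labels)
    for e in edges:
        for v in e:
            degree[v] += 1
    groups = {}
    for v, lab in enumerate(labels):
        groups.setdefault((degree[v], lab), []).append(v)
    return [groups[key] for key in sorted(groups)]


def restricted_orderings(partitions):
    """Yield vertex orderings that permute vertices only within a partition."""
    for combo in product(*(permutations(p) for p in partitions)):
        yield [v for part in combo for v in part]


def code(labels, edges, order):
    """Vertex labels, then the upper triangle column by column ("0" = no edge)."""
    n = len(order)
    return "".join(labels[v] for v in order) + "".join(
        edges.get(frozenset((order[i], order[j])), "0")
        for j in range(1, n)
        for i in range(j)
    )


def canonical_label(labels, edges):
    """Largest code over the restricted orderings only."""
    parts = partition_vertices(labels, edges)
    return max(code(labels, edges, o) for o in restricted_orderings(parts))


# Example graph from the slides: v0 = b, v1 = a, v2 = a, every vertex has degree 2.
labels = ["b", "a", "a"]
edges = {frozenset((0, 1)): "x", frozenset((0, 2)): "x", frozenset((1, 2)): "y"}

print(partition_vertices(labels, edges))   # [[1, 2], [0]]
print(canonical_label(labels, edges))      # aabyxx
```

The label obtained this way differs from the largest code over all |V|! orderings, but it is still the same for every isomorphic copy of the graph, because the partition keys are invariants and the partitions are always placed in the same order.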
10
Running example
minsup = 2
[Figure: three transaction graphs g0, g1, g2 with numeric vertex and edge labels]

1-subgraphs found in the transactions:
cl     tid list     frequent
010    {0,1,2}      yes
021    {0,2}        yes
123    {0,1}        yes
(not shown)  {2}    no

Frequent 1-subgraphs: 010, 021, 123
11
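Running example
minsup = 2

Frequent 1-subgraphs:
cl     tid list     child
010    {0,1,2}      c0, c1, c2, c3
021    {0,2}        c2
123    {0,1}        c3

[Figure: candidate 2-subgraphs c0, c1, c2, c3, ... generated from the frequent 1-subgraphs, each annotated with a possible tid list; the lists shown are {0,1,2}, {0,2}, {0,1}, {0,1,2}]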
12
Frequent 2-subgraphs
[Figure: drawings of the frequent 2-subgraphs c1, c2, c3, c4]

Frequent 2-subgraphs:
cl        tid list
01201x    {0,2}
10000x    {0,1,2}
10203x    {0,1}
21133x    {0,1}

Frequent 1-subgraphs, with their children updated:
cl     tid list     child
010    {0,1,2}      c1, c2, c3
021    {0,2}        c2
123    {0,1}        c3, c4
13
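Frequency computing
TID lists
Intersect the TID lists of the two frequent k-subgraphs that were joined to form the candidate.
If the intersection has at least minsup transactions: compute the actual support of the candidate.
If not: the candidate cannot be frequent and is pruned.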
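The pruning step can be sketched in a few lines. The TID lists below are the frequent 1-subgraph lists from the running example; the function name `possible_tids` and the choice of returning `None` for a pruned candidate are assumptions made for illustration.

```python
def possible_tids(tids_a, tids_b, minsup):
    """Intersect the TID lists of the two subgraphs joined into a candidate.

    Returns the set of transactions that may contain the candidate, or None
    when the intersection is already smaller than minsup (candidate pruned).
    The size of the returned set is only an upper bound on the support; the
    actual support still has to be counted on those transactions.
    """
    shared = tids_a & tids_b
    return shared if len(shared) >= minsup else None


# TID lists of the frequent 1-subgraphs in the running example (minsup = 2).
tid_lists = {"010": {0, 1, 2}, "021": {0, 2}, "123": {0, 1}}

kept = possible_tids(tid_lists["010"], tid_lists["021"], minsup=2)
pruned = possible_tids(tid_lists["021"], tid_lists["123"], minsup=2)
print(kept)    # {0, 2}
print(pruned)  # None
```

A candidate that survives this test only needs to be matched against the transactions in the returned set, not against the whole database.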
14
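Candidate generation
Join two frequent k-subgraphs to form (k+1)-candidate subgraphs.
The two k-subgraphs must share the same (k-1)-subgraph, called the core.
One join can produce more than one candidate because of:
Vertex labeling
Multiple cores
Multiple automorphisms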
15
Vertex labeling
16
Multiple automorphisms
17
Multiple cores
18
Candidate 3-subgraphs
[Figure: candidate 3-subgraphs q0, q1, q2 generated by joining the frequent 2-subgraphs c1, c2, c3, c4; the frequent 1-subgraph table (cl 010, 021, 123) is repeated from the previous slide]

Frequent 2-subgraphs:
cl        tid list
01201x    {0,2}
10000x    {0,1,2}
10203x    {0,1}
21133x    {0,1}

Possible tid lists shown for the candidates: {0, 2}, {0,}, {0, 2}
One candidate is marked as violating downward closure.
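The downward closure test on a candidate can be sketched as follows: remove one edge at a time, and require every connected subgraph obtained this way to be among the frequent subgraphs of the previous level. This is an illustrative sketch on a small letter-labeled graph, not the running example. It assumes single-character labels, no self-loops, and a brute-force canonical label with missing edges written as "0"; all function names are hypothetical.

```python
from itertools import permutations


def canonical_label(labels, edges):
    """Brute-force canonical label: largest code over all vertex orderings."""
    n = len(labels)

    def code(order):
        return "".join(labels[v] for v in order) + "".join(
            edges.get(frozenset((order[i], order[j])), "0")
            for j in range(1, n)
            for i in range(j)
        )

    return max(code(p) for p in permutations(range(n)))


def drop_edge(labels, edges, removed):
    """Graph without `removed`, isolated vertices dropped; None if disconnected."""
    rest = {e: lab for e, lab in edges.items() if e != removed}
    used = sorted({v for e in rest for v in e})
    if not used:
        return None
    # Depth-first traversal to test connectivity of the remaining edges.
    seen, stack = {used[0]}, [used[0]]
    while stack:
        u = stack.pop()
        for e in rest:
            if u in e:
                (w,) = e - {u}
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
    if len(seen) != len(used):
        return None
    index = {v: i for i, v in enumerate(used)}  # renumber vertices 0..m-1
    return ([labels[v] for v in used],
            {frozenset(index[v] for v in e): lab for e, lab in rest.items()})


def satisfies_downward_closure(labels, edges, frequent_labels):
    """True if every connected one-edge-smaller subgraph is frequent."""
    for e in edges:
        sub = drop_edge(labels, edges, e)
        if sub is not None and canonical_label(*sub) not in frequent_labels:
            return False
    return True


# Candidate: triangle with vertex labels b, a, a and edge labels x, x, y.
cand_labels = ["b", "a", "a"]
cand_edges = {frozenset((0, 1)): "x", frozenset((0, 2)): "x", frozenset((1, 2)): "y"}

# Its two distinct 2-edge subgraphs: a-b-a with edges x, x and b-a-a with edges x, y.
path_xx = canonical_label(["b", "a", "a"], {frozenset((0, 1)): "x", frozenset((0, 2)): "x"})
path_xy = canonical_label(["b", "a", "a"], {frozenset((0, 1)): "x", frozenset((1, 2)): "y"})
```

With both paths in the frequent set the candidate passes; if either is missing, the candidate cannot be frequent and can be discarded before any support counting.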
19
Experiment
Environment: AMD 1.53GHz, 2GB main memory, Linux OS
Chemical compound datasets:
PTE (340 compounds): 66 atom types, four bond types, 27 edges/graph on average
DTP (223,644 compounds): 104 atom types, three bond types, 22 edges/graph on average
Synthetic datasets
20
PTE and DTP
21
Synthetic datasets
22
Synthetic datasets
|D| = 10000, |S| = 200, |L_E| = 1, minsup = 2%