Published by Elaine Cross. Modified over 8 years ago.
1
Center-Piece Subgraphs: Problem Definition and Fast Solutions. Hanghang Tong, Christos Faloutsos. Carnegie Mellon University
2
2 Center-Piece Subgraph (Ceps) Given Q query nodes Find Center-piece Input of Ceps –Q query nodes –Budget b –k softAND coefficient App. –Social Network –Law Enforcement –Gene Network –…
3
3 Challenges in Ceps Q1: How to measure the importance? Q2: How to extract connection subgraph? Q3: How to do it efficiently?
4
4 Roadmap Ceps Overview Q1: Goodness Score Calculation Q2: Extract Alg. Q3: Efficiency Issue Experimental Results Conclusion
5
5 Ceps Overview Individual Score Calculation –Measure importance wrt individual query Combine Individual Scores –Measure importance wrt query set “Extract” Alg. – … the connection subgraphs
6
6 Roadmap Ceps Overview Q1: Goodness Score Calculation Q2: “Extract” Alg. Q3: Efficiency Issue Experimental Results Conclusion
7
7 RWR: Individual Score Calculation Goal –Individual importance score r(i,j) = r_{i,j} –For each node j wrt each query i How to –Random walk with restart –Steady-state prob.
8
8 An Illustrating Example [Figure: example graph with nodes 1 to 13] –Start from node 1 –Move randomly to a neighbor –With some probability p, return to 1 –Prob(RW will finally stay at j)
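The walk described on this slide can be sketched with plain power iteration. This is a minimal illustration, not the authors' implementation: the function name `rwr_scores`, the continuation probability `c = 0.5`, the uniform choice among neighbors, and the three-node toy graph are all assumptions made for the example.

```python
def rwr_scores(adj, start, c=0.5, iters=200):
    """Random walk with restart, solved by power iteration.

    adj   : dict mapping each node to a list of its neighbors
            (every node is assumed to have at least one neighbor)
    start : the query node the walker restarts at
    c     : probability of continuing the walk (assumed value);
            with probability 1 - c the walker jumps back to `start`
    """
    nodes = list(adj)
    r = {v: 0.0 for v in nodes}
    r[start] = 1.0  # the walker begins at the query node
    for _ in range(iters):
        nxt = {v: 0.0 for v in nodes}
        for u in nodes:
            # Spread the mass at u evenly over its neighbors.
            share = c * r[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        nxt[start] += 1.0 - c  # restart mass returns to the query node
        r = nxt
    return r


# Hypothetical toy graph: a path 1 - 2 - 3, queried from node 1.
toy = {1: [2], 2: [1, 3], 3: [2]}
scores = rwr_scores(toy, start=1)
```

Running one such walk per query node gives one column of individual scores r(i,j) per query.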
9
9 Individual Score Calculation
          Q1      Q2      Q3
Node 1    0.5767  0.0088  0.0088
Node 2    0.1235  0.0076  0.0076
Node 3    0.0283  0.0283  0.0283
Node 4    0.0076  0.1235  0.0076
Node 5    0.0088  0.5767  0.0088
Node 6    0.0076  0.0076  0.1235
Node 7    0.0088  0.0088  0.5767
Node 8    0.0333  0.0024  0.1260
Node 9    0.1260  0.0024  0.0333
Node 10   0.1260  0.0333  0.0024
Node 11   0.0333  0.1260  0.0024
Node 12   0.0024  0.1260  0.0333
Node 13   0.0024  0.0333  0.1260
11
11 AND: Combine Scores Q: How to combine scores? A: Multiply …= prob. 3 random particles coincide on node j
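Written out, the "multiply" rule reads as follows. This is a sketch that assumes the Q random particles move independently; the symbol r(Q,j) for the combined score is introduced here for illustration.

```latex
r(Q, j) \;=\; \prod_{i=1}^{Q} r(i, j)
```

For example, a node whose individual score is 0.0283 for each of three queries gets 0.0283^3 ≈ 2.27 × 10^{-5}.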
13
13 K_SoftAnd: Combine Scores Generalization – softAND: We want nodes close to k of Q (k<Q) query nodes. Q: How to do that? A: Prob(at least k-out-of-Q will meet each other at j)
14
14 K_SoftAnd: Relaxation of AND Asking an AND query? It may have no answer! –Disconnected communities –Noise
15
15 K_SoftAnd: Combine Score Goal –Importance wrt query set –Depend on query scenario! How to… –Meeting Probability –K_SoftAnd –Prob(at least k-out-of-Q will meet each other at j)
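The meeting probability "at least k-out-of-Q" can be sketched as below, assuming the Q walks are independent so that each query's individual score r(i,j) acts as an independent event probability. The function name `k_softand` and the dynamic-programming formulation are choices made for this example and are not taken from the slides.

```python
def k_softand(probs, k):
    """Probability that at least k of the independent events occur.

    probs : individual scores r(i, j) of one node j, one per query i
    k     : softAND coefficient (k = len(probs) is AND, k = 1 is OR)
    """
    # dist[m] = probability that exactly m of the events seen so far occurred
    dist = [1.0]
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for m, mass in enumerate(dist):
            nxt[m] += mass * (1.0 - p)   # this particle is not at j
            nxt[m + 1] += mass * p       # this particle is at j
        dist = nxt
    return sum(dist[k:])


# A node with the same individual score for each of three queries.
node_scores = [0.0283, 0.0283, 0.0283]
and_score = k_softand(node_scores, 3)       # all three particles meet
soft_score = k_softand(node_scores, 2)      # at least two meet
or_score = k_softand(node_scores, 1)        # at least one is there
```

Lowering k can only raise the score, which is why a softAND query can still return nodes when a strict AND query finds none.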
16
16 AND query vs. K_SoftAnd query [Figure panels: And Query; 2_SoftAnd Query. Scale label: x 1e-4]
17
17 1_SoftAnd query = OR query
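Under the same independence assumption, the k = 1 case is the probability that at least one particle is at node j, which is the complement of none being there. The symbol r(Q,j,1) is notation introduced for this sketch.

```latex
r(Q, j, 1) \;=\; 1 - \prod_{i=1}^{Q} \bigl(1 - r(i, j)\bigr)
```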
18
18 Measuring Importance
Individual Scores (random walk with restart; steady-state prob.): Q1, Q2, Q3 for Nodes 1 to 13, same values as on slide 9
Combining Scores (K_SoftAnd, ranging from OR to And; meeting prob.)
And (x 1e-4): 0.4505 0.0710 0.2267 0.0710 0.4505 0.0710 0.4505 0.1010
2_SoftAnd: 0.0103 0.0019 0.0103 0.0019 0.0103 0.0019 0.0024 0.0046
19
19 Roadmap Ceps Overview Q1: Goodness Score Calculation Q2: “Extract” Alg. Q3: Efficiency Experimental Results Conclusion
20
20 “Extract” Alg. Goal –Maximize total scores and –‘Appropriate’ connections How to… –Dynamic Programming –Greedy Alg.: pick up promising node, find ‘best’ path [Figure: two example graphs, with nodes 1 to 16 and nodes 1 to 13]
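The greedy loop (pick a promising node, then connect it to the queries) can be sketched as follows. This is a simplified stand-in, not the slide's algorithm: it swaps the dynamic-programming search for the 'best' path with a plain breadth-first shortest path, and the names `shortest_path`, `extract`, and the rule that the budget counts non-query nodes are assumptions for illustration.

```python
from collections import deque


def shortest_path(adj, src, dst):
    """Breadth-first shortest path from src to dst, or None if unreachable."""
    prev = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            break
        for v in adj[u]:
            if v not in prev:
                prev[v] = u
                queue.append(v)
    if dst not in prev:
        return None
    path, node = [], dst
    while node is not None:
        path.append(node)
        node = prev[node]
    return path[::-1]


def extract(adj, queries, score, budget):
    """Greedily grow a subgraph around the query nodes.

    score  : dict of combined goodness scores per node
    budget : maximum number of non-query nodes to add (assumed meaning of b)
    """
    chosen, edges = set(queries), set()
    candidates = sorted((v for v in adj if v not in chosen),
                        key=lambda v: score[v], reverse=True)
    for dest in candidates:
        if dest in chosen:          # already pulled in by an earlier path
            continue
        new_nodes, new_edges = set(), set()
        for q in queries:
            path = shortest_path(adj, q, dest)
            if path is None:
                continue
            new_nodes.update(path)
            new_edges.update(frozenset(e) for e in zip(path, path[1:]))
        if len(chosen | new_nodes) - len(queries) > budget:
            break                   # the next promising node does not fit
        chosen |= new_nodes
        edges |= new_edges
    return chosen, edges


# Hypothetical toy graph: queries "a" and "b" share the neighbor "x".
toy = {"a": ["x"], "b": ["x"], "x": ["a", "b", "y"], "y": ["x"]}
nodes, links = extract(toy, ["a", "b"], {"x": 0.9, "y": 0.1}, budget=1)
```

With a budget of one extra node, the sketch keeps the high-scoring connector "x" and leaves out "y".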
21
21 Roadmap Ceps Overview Q1: Goodness Score Calculation Q2: “Extract” Alg. Q3: Efficiency Experimental Results Conclusion
22
22 Graph Partition: Efficiency Issue Straightforward way –Solve Q linear systems –Cost linear in # of edges Observation –Skewed dist. How to… –Graph partition
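One way to read the graph-partition idea is to compute scores only on the partitions that contain a query node. The sketch below assumes a partition assignment `part_of` has already been computed offline by some partitioner; that input, the function name, and the choice to simply drop edges that cross into other partitions are assumptions, not details given on the slide.

```python
def restrict_to_query_partitions(adj, part_of, queries):
    """Keep only the nodes whose partition contains at least one query node.

    adj     : dict mapping each node to a list of neighbors
    part_of : dict mapping each node to a partition id (precomputed, assumed)
    queries : iterable of query nodes
    """
    keep_parts = {part_of[q] for q in queries}
    keep = {v for v in adj if part_of[v] in keep_parts}
    # Edges that cross into a discarded partition are dropped, so scores
    # computed on the result are an approximation. A node may end up with
    # no neighbors; the caller has to handle that case.
    return {v: [u for u in adj[v] if u in keep] for v in keep}


# Hypothetical toy graph: a path 1 - 2 - 3 - 4 split into two partitions.
toy = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
parts = {1: 0, 2: 0, 3: 1, 4: 1}
small = restrict_to_query_partitions(toy, parts, [1])
```

The smaller graph makes each linear system cheaper, at the price of ignoring whatever score mass would have flowed through the dropped edges.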
23
23 Roadmap Ceps Overview Q1: Goodness Score Calculation Q2: “Extract” Alg. Q3: Efficiency Issue Experimental Results Conclusion
24
24 Experimental Setup Dataset –DBLP/authorship –Author-Paper –315k nodes –1,800k edges Evaluation Criteria –I Node Ratio –I Edge Ratio
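The slides do not define the node ratio. The sketch below assumes it is the share of the total combined score that the extracted subgraph retains, which is consistent with the "90%+ preserved" wording used later; the function name and the toy numbers are hypothetical.

```python
def node_ratio(scores, subgraph_nodes):
    """Share of the total goodness score captured by the extracted nodes.

    scores         : dict of combined goodness scores for every node
    subgraph_nodes : nodes kept by the extraction step
    (This definition is an assumption; the slides do not spell it out.)
    """
    total = sum(scores.values())
    kept = sum(scores[v] for v in subgraph_nodes)
    return kept / total


# Hypothetical scores for a three-node graph, keeping two nodes.
ratio = node_ratio({"a": 0.6, "b": 0.3, "c": 0.1}, {"a", "b"})
```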
25
25 Experimental Setup We want to check –Do the goodness criteria make sense? –Does “extract” alg. capture most of the important nodes/edges? –Efficiency
26
26 Case Study: AND query
27
27 Case Study: 2_SoftAnd query [Figure: result subgraph with groups labeled Statistic and database]
28
28 Evaluation of “Extract” Alg. [Plot: Node Ratio vs. Budget (b), for 2 query nodes and 3 query nodes] With 20 nodes, 90%+ preserved
29
29 Running Time vs. Quality for Fast Ceps [Plot: Quality vs. Running Time] ~90% quality, 6:1 speedup
30
30 Roadmap Ceps Overview Q1: Goodness Score Calculation Q2: “Extract” Alg. Q3: Efficiency Issue Experimental Results Conclusion
31
31 Conclusion Q1: How to measure the importance? A1: RWR + K_SoftAnd Q2: How to find connection subgraph? A2: “Extract” Alg. Q3: How to do it efficiently? A3: Graph Partition (Fast Ceps) –~90% quality –6:1 speedup
32
32 Q&A Thank you! htong@cs.cmu.edu