Published by Hugh Allison. Modified over 9 years ago.
1/52 Online Search of Overlapping Communities (running title: Overlapping Community Search)
Graph Data Management Lab (GDM@FUDAN), School of Computer Science
www.gdm.fudan.edu.cn
Email: zhenjiong@gmail.com
Authors: Wanyun Cui (Fudan University), Yanghua Xiao (Fudan University), Haixun Wang (Microsoft Research Asia), Yiqi Lu (Fudan University), Wei Wang (Fudan University)
Presenter: Wanyun Cui
2/52 Outline
Motivation; Model; Algorithm; Experiments; Applications
4/52 Complex network
Complex networks are everywhere. Example: social networks.
5/52 Complex network
Complex networks are everywhere. Example: the Internet.
6/52 Complex network
Complex networks are everywhere. Example: protein networks.
7/52 Complex network
Complex networks are everywhere. Examples: the Internet, social networks, protein networks.
8/52 Community structures
Complex networks are everywhere (the Internet, social networks, protein networks), and most real-life networks have community structure: the graph can be divided into groups such that vertices within a group are densely connected and vertices in different groups are sparsely connected.
9/52 Overlapping community structure
Overlapping community: a vertex may belong to multiple communities.
10/52 Overlapping community structure
Overlapping community: a vertex may belong to multiple communities.
Example communities (figure labels): C1: small boat; C2: meaning of bucket; C3: big boat; C4: tableware.
11/52 Finding community structures
Two possible ways to find the community structure:
OCD: overlapping community detection
OCS: overlapping community search
12/52 OCD vs. OCS
OCD divides the entire network to find communities.
13/52 OCD vs. OCS
Disadvantages of OCD: too costly; global criterion; unfriendly to dynamic graphs.
Too costly: the Facebook network has over 800 million nodes and 100 billion links.
Algorithm        Complexity
Girvan–Newman    O(|E|^3)
LPA              almost linear
LA               O(|C||E| + |V|)
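To make the "too costly" point concrete, here is a rough back-of-envelope sketch in Python. It plugs the slide's figure of 100 billion links into the quoted bounds. The machine speed of 10^9 elementary steps per second is a hypothetical figure chosen only for illustration, and constant factors are ignored, so the results indicate orders of magnitude at best.

```python
# Back-of-envelope only: constant factors are ignored and the machine
# speed below is an assumed, hypothetical figure, not a measurement.
num_edges = 100 * 10**9          # "100 billion links" from the slide
steps_per_second = 10**9         # assumption for illustration
seconds_per_year = 365 * 24 * 3600

cubic_steps = num_edges ** 3     # O(|E|^3) bound quoted for Girvan-Newman
linear_steps = num_edges         # "almost linear" bound quoted for LPA

cubic_years = cubic_steps / steps_per_second / seconds_per_year
linear_seconds = linear_steps / steps_per_second

print(f"cubic bound: about {cubic_years:.1e} years")
print(f"linear bound: about {linear_seconds:.0f} seconds per pass")
```

Even the near-linear bound describes a single pass over the whole graph, which is the cost OCS tries to avoid by working only around the query vertex.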
14/52 OCD vs. OCS
Disadvantages of OCD: too costly; global criterion; unfriendly to dynamic graphs.
Global criterion: a fixed parameter or criterion is not appropriate for all vertices and queries. Compare the communities of a student with the communities of Barack Obama.
15/52 OCD vs. OCS
Disadvantages of OCD: too costly; global criterion; unfriendly to dynamic graphs.
Unfriendly to dynamic graphs: real-life graphs evolve over time, and we cannot afford to run OCD very frequently, so OCD results lose their freshness and effectiveness.
16/52 OCD vs. OCS
Disadvantages of OCD: too costly; global criterion; unfriendly to dynamic graphs.
OCD is usually performed offline.
17/52 OCS: problem definition
Given: a graph G and a query vertex v.
Return: all communities that v belongs to.
18/52 OCD vs. OCS
Advantages of OCS: more efficient; personalized criterion; lightweight.
More efficient: we only need to find communities within the local neighborhood of the vertex. Our OCS solution needs only several milliseconds to find the answer.
19/52 OCD vs. OCS
Advantages of OCS: more efficient; personalized criterion; friendly to dynamic graphs.
20/52 OCD vs. OCS
Advantages of OCS: more efficient; personalized criterion; lightweight.
OCS is a good choice for finding communities online.
21/52 Applications of OCS
Friend recommendation on Facebook; semantic expansion; infectious disease control; etc.
22/52 Challenges of OCS
Challenges: modeling; complexity and scalability.
Modeling: a community should be dense enough; the model should be overlapping-aware; the model should be general.
23/52 Challenges of OCS
Challenges: modeling; complexity and scalability.
Complexity and scalability: in the worst case, OCS may need to enumerate an exponential number of valid communities. The problem is computationally hard, which motivates an approximate approach.
24/52 Outline
Introduction; Model; Algorithm; Experiments; Applications
25/52 Model
Requirements: community structure awareness; overlapping awareness; generality.
The inner edges of a community should be dense, so the clique is used as the unit of a community. (Figure: a clique of 6 vertices.)
26/52 Model
Requirements: community structure awareness; overlapping awareness; generality.
Two k-cliques are adjacent if they share k-1 vertices. A community is a connected component of the k-clique graph. (Figure: original graph and its clique graph for k=4.)
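The k-clique model above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it enumerates all k-cliques by brute force (only feasible on toy graphs), links cliques that share k-1 vertices with a small union-find, and returns the communities that contain a query vertex. The function names `build_adj`, `k_cliques`, and `communities_of`, and the toy graph around "bob", are hypothetical.

```python
from itertools import combinations

def build_adj(edges):
    """Undirected adjacency sets from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def k_cliques(adj, k):
    """Enumerate all k-cliques by brute force (toy graphs only)."""
    return [frozenset(c) for c in combinations(sorted(adj), k)
            if all(v in adj[u] for u, v in combinations(c, 2))]

def communities_of(adj, k, query):
    """Vertex sets of the k-clique communities that contain `query`."""
    cliques = k_cliques(adj, k)
    parent = list(range(len(cliques)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Two k-cliques are adjacent when they share k-1 vertices.
    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) == k - 1:
            parent[find(i)] = find(j)

    # Each connected component of the clique graph is one community.
    members = {}
    for i, clique in enumerate(cliques):
        members.setdefault(find(i), set()).update(clique)
    return [m for m in members.values() if query in m]

# Toy graph: two 4-cliques that overlap only in "bob", plus "g" attached to d, e, f.
edges = [("bob", "a"), ("bob", "b"), ("bob", "c"), ("a", "b"), ("a", "c"), ("b", "c"),
         ("bob", "d"), ("bob", "e"), ("bob", "f"), ("d", "e"), ("d", "f"), ("e", "f"),
         ("g", "d"), ("g", "e"), ("g", "f")]
adj = build_adj(edges)
for community in communities_of(adj, 4, "bob"):
    print(sorted(community))
```

In this toy graph "bob" sits in two communities, which is the overlapping behaviour the model is meant to capture; "g" joins the second community only because its 4-clique shares three vertices with a clique containing "bob".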
27/52 Model
Requirements: community structure awareness; overlapping awareness; generality.
28/52 Model
Requirements: community structure awareness; overlapping awareness; generality.
It is acceptable for a few edges to be missing from a clique.
29/52 Model
Requirements: community structure awareness; overlapping awareness; generality.
Two cliques are adjacent if they share at least α vertices (α as in the (α, γ)-OCS notation used on later slides).
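The two relaxations above can be sketched as a pair of small predicates. The slide text lost its formulas, so the details here are assumptions: `gamma` is taken to be the fraction of the k(k-1)/2 possible edges that must be present (rounded down), and `alpha` is the minimum number of shared vertices. The function names are hypothetical.

```python
from itertools import combinations
from math import floor

def is_quasi_clique(nodes, adj, gamma):
    """True if `nodes` has at least floor(gamma * k*(k-1)/2) internal edges.

    The threshold is an assumption; the slide only says that a few
    edges may be missing.
    """
    nodes = list(nodes)
    k = len(nodes)
    present = sum(1 for u, v in combinations(nodes, 2) if v in adj.get(u, ()))
    required = floor(gamma * k * (k - 1) / 2)
    return present >= required

def are_adjacent(clique_a, clique_b, alpha):
    """True if the two cliques share at least `alpha` vertices."""
    return len(set(clique_a) & set(clique_b)) >= alpha

# Four vertices with 5 of the 6 possible edges (c-d is missing).
adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c", "d"},
    "c": {"a", "b"},
    "d": {"a", "b"},
}
print(is_quasi_clique(["a", "b", "c", "d"], adj, 1.0))   # strict clique test
print(is_quasi_clique(["a", "b", "c", "d"], adj, 0.8))   # relaxed test
print(are_adjacent({"a", "b", "c", "d"}, {"c", "d", "e", "f"}, 2))
```

With `gamma = 1` and `alpha = k - 1` the predicates reduce to the strict k-clique model from the earlier slide.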
30/52 Model
Requirements: community structure awareness; overlapping awareness; generality.
(Figure: original graph.)
31/52 k=4
32/52 (α, γ)-OCS, k=3
33/52 Parameter selection
34/52 Outline
Introduction; Model; Algorithm; Experiments; Applications
35/52 Algorithm
Exact algorithm; approximate algorithm
36/52 Exact algorithm
Example: k=4, (3,1)-OCS, query vertex = Bob.
37/52 Exact algorithm
Example: k=4, (3,1)-OCS, query vertex = Bob.
Drawback: an exponential number of enumerations.
38/52 Approximate algorithm
Example: k=4, (3,1)-OCS, query vertex = Bob.
Approximation: each new clique must contain at least one new vertex.
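The local search idea can be sketched as follows for the strict case (α = k-1, γ = 1). This is an illustrative reading of the slides, not the authors' algorithm: starting from each k-clique that contains the query vertex, it repeatedly swaps one vertex out of a clique and a common neighbour in. With `approximate=True` it applies the "at least one new vertex" rule and skips cliques whose added vertex is already in the community. The function name `search_communities` and the toy graph are hypothetical.

```python
from itertools import combinations

def search_communities(adj, k, query, approximate=False):
    """Local k-clique community search around `query` (alpha=k-1, gamma=1)."""
    def is_clique(nodes):
        return all(v in adj[u] for u, v in combinations(nodes, 2))

    # Seed cliques: every k-clique that contains the query vertex.
    neighbours = sorted(adj[query])
    seeds = [frozenset(c) | {query}
             for c in combinations(neighbours, k - 1) if is_clique(c)]

    visited, communities = set(), []
    for seed in seeds:
        if seed in visited:
            continue
        visited.add(seed)
        members, stack = set(seed), [seed]
        while stack:
            clique = stack.pop()
            # Keep a (k-1)-subset and add one common neighbour of that subset.
            for kept in combinations(sorted(clique), k - 1):
                common = set.intersection(*(adj[u] for u in kept)) - clique
                for w in common:
                    nxt = frozenset(kept) | {w}
                    if nxt in visited:
                        continue
                    if approximate and w in members:
                        continue  # heuristic: require at least one new vertex
                    visited.add(nxt)
                    members.add(w)
                    stack.append(nxt)
        communities.append(members)
    return communities

# Toy graph matching the slide's parameters: k=4, query vertex "bob".
edges = [("bob", "a"), ("bob", "b"), ("bob", "c"), ("a", "b"), ("a", "c"), ("b", "c"),
         ("bob", "d"), ("bob", "e"), ("bob", "f"), ("d", "e"), ("d", "f"), ("e", "f"),
         ("g", "d"), ("g", "e"), ("g", "f")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

for community in search_communities(adj, 4, "bob", approximate=True):
    print(sorted(community))
```

On this small graph both modes return the same two communities. On larger graphs the pruning rule visits fewer cliques at the price of possibly missing vertices or splitting a community, which is the trade-off the accuracy experiments quantify.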
40/52 Outline
Introduction; Model; Algorithm; Experiments; Applications
41/52 Experiments: setup
Machine: Intel Core2 2.13 GHz, 4 GB memory, 64-bit Windows 7.
42/52 Experiments: datasets
Dataset        |V|        |E|
WordNet        82676      133445
DBLP           560851     1816613
Google         916427     4322051
LiveJournal    4847572    42851237
43/52 Effectiveness
OCS successfully unveils multiple research interests.
Example: query vertex Jiawei Han, k=6. Communities: C1: multimedia data mining; C2: stream data mining; C3: information network.
44/52 Effectiveness
Our model is flexible enough to support different parameters.
Example: query vertex Jiawei Han, k=9.
45/52 Effectiveness
For most vertices, the OCS model finds non-trivial results.
46/52 Performance
OCS is more efficient than OCD.
Competitors: LA and OSLOM.
Amortized time = (total time of OCD) / n
47/52 Performance: influence of parameters
48/52 Accuracy of the approximate algorithm
Accuracy is consistently above 70% and in some cases reaches almost 90%.
49/52 Outline
Introduction; Model; Algorithm; Experiments; Applications
50/52 Diversity-based social network analysis
What is the distribution of diversity? Can we find people with very large diversity?
51/52 Name disambiguation
Ambiguous names that cover a significant number of entities also have a large number of communities. A real person has fewer communities than these ambiguous names.
52/52 Contributions
Problem definition; model; guide for parameter selection; algorithms; extensive experiments and applications.
53/52 Q&A
Thank you!