1
Constant-Factor Approximation Algorithms for Identifying Dynamic Communities
Chayant Tantipathananandh with Tanya Berger-Wolf
2
Social Networks These are snapshots and networks change over time
3
Dynamic Networks
Interactions occur in the form of disjoint groups. Groups are not communities.
[Figure: disjoint groups of individuals 1 to 5 at time steps t=1, t=2, …, next to the aggregated network]
4
Communities
What is a community? “Cohesive subgroups are subsets of actors among whom there are relatively strong, direct, intense, frequent, or positive ties.” [Wasserman & Faust 1994]
Dynamic Community Identification
– GraphScope [Sun et al 2005]
– Metagroups [Berger-Wolf & Saia 2006]
– Dynamic Communities [TBK 2007]
– Clique Percolation [Palla et al 2007]
– FacetNet [Lin et al 2009]
– Bayesian approach [Yang et al 2009]
5
Ship of Theseus Jeannot's knife “has had its blade changed fifteen times and its handle fifteen times, but is still the same knife.” [French story] from Wikipedia “The ship … was preserved by the Athenians …, for they took away the old planks as they decayed, putting in new and stronger timber in their place, insomuch that this ship became a standing example among the philosophers, for the logical question of things that grow; one side holding that the ship remained the same, and the other contending that it was not the same.” [Plutarch, Theseus]
6
Ship of Theseus … Individual parts never change identities Cost for changing identity
7
Ship of Theseus … Identity changes to match the group Costs for visiting and being absent
8
Approach
9
Community = Color Valid coloring: In each time step, different groups have different colors.
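The validity condition can be sketched as a small check. The representation here (one list of group colors per time step) is an illustrative assumption, not something taken from the talk.

```python
def is_valid_coloring(group_colors):
    """group_colors[t] lists the colors assigned to the groups at time step t.

    A coloring is valid when no two groups in the same time step share a color.
    (Hypothetical representation, chosen only for this sketch.)
    """
    return all(len(colors) == len(set(colors)) for colors in group_colors)


# Two time steps with two groups each.
print(is_valid_coloring([["red", "blue"], ["blue", "red"]]))  # True
print(is_valid_coloring([["red", "red"], ["blue", "red"]]))   # False
```

The same color may reappear in different time steps; only groups within one time step have to differ.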
10
Interpretation Group color: How does community c interact at time t?
11
Interpretation Individual color: Who belongs to community c at time t?
[Figure: individuals labeled with colors 1 and 2]
12
Social Costs: Conservatism
Switching cost α, absence cost β1, visiting cost β2
[Figure: example coloring with switching costs α marked]
13
Social Costs: Loyalty
Switching cost α, absence cost β1, visiting cost β2
[Figure: example coloring with colors 1, 2, 3 and absence costs β1 marked]
14
Social Costs: Loyalty
Switching cost α, absence cost β1, visiting cost β2
[Figure: example coloring with colors 2 and 3 and visiting costs β2 marked]
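A minimal sketch of how the three social costs could be added up for a given coloring. The slides name the costs but do not spell out the counting rules, so the rules below are assumptions: a switch is charged whenever an individual's color differs between consecutive time steps, a visit whenever an individual sits in a group whose color differs from its own, and an absence whenever a group of the individual's color exists at that time step but the individual is not in it. The function name and data layout are hypothetical.

```python
def total_cost(groups, group_color, ind_color, alpha, beta1, beta2):
    """Total social cost of a coloring (counting rules are assumptions).

    groups[t]      -- list of sets of individuals observed together at time t
    group_color[t] -- list of colors, parallel to groups[t]
    ind_color[i]   -- list of colors of individual i, one per time step
    """
    switches = absences = visits = 0
    for i, colors in ind_color.items():
        # Switching: color changes between consecutive time steps.
        switches += sum(1 for a, b in zip(colors, colors[1:]) if a != b)
        for t, c in enumerate(colors):
            for members, gc in zip(groups[t], group_color[t]):
                if i in members and gc != c:
                    visits += 1      # in a group of another color
                elif i not in members and gc == c:
                    absences += 1    # own community meets without i
    return alpha * switches + beta1 * absences + beta2 * visits


groups = [[{1, 2}, {3}], [{1}, {2, 3}]]
group_color = [["A", "B"], ["A", "B"]]

# Individual 2 keeps color A: one absence plus one visit at the second step.
stay = {1: ["A", "A"], 2: ["A", "A"], 3: ["B", "B"]}
# Individual 2 switches to B: one switch, no absence or visit.
move = {1: ["A", "A"], 2: ["A", "B"], 3: ["B", "B"]}

print(total_cost(groups, group_color, stay, alpha=7, beta1=2, beta2=3))  # 5
print(total_cost(groups, group_color, move, alpha=7, beta1=2, beta2=3))  # 7
```

With a high switching cost the conservative coloring wins; lowering α below β1 + β2 would make the switch cheaper.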
15
Problem Complexity
Minimizing total cost is hard: NP-complete and APX-hard [with Berger-Wolf and Kempe 2007]
Constant-Factor Approximation [details in paper]
Easy special case: if there are no missing individuals and 2α ≤ β2, then the problem is simply weighted bipartite matching [details in paper]
17
– assume all individuals are observed at all time steps
18
Greedy Approximation
No visiting or absence, and minimizing switching
[Figure: example along the time axis]
19
Greedy Approximation
No visiting or absence, and minimizing switching ≈ maximizing path coverage
Improvement by dynamic programming
The greedy algorithm guarantees an approximation factor of max{2, 2α/β1, 4α/β2}, which depends on α, β1, β2 but is independent of the input size
[Figure: numbered example along the time axis]
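As a quick worked example, the stated bound max{2, 2α/β1, 4α/β2} can be evaluated for concrete cost settings. The function name is hypothetical; the formula is the one on the slide.

```python
def greedy_guarantee(alpha, beta1, beta2):
    """Approximation factor max{2, 2*alpha/beta1, 4*alpha/beta2} from the slide."""
    return max(2, 2 * alpha / beta1, 4 * alpha / beta2)


# All costs equal: the visiting term dominates.
print(greedy_guarantee(1, 1, 1))  # 4.0
# alpha <= beta1 and 2*alpha <= beta2: the factor drops to 2.
print(greedy_guarantee(1, 2, 4))  # 2
```

The factor is 2 exactly when α ≤ β1 and 2α ≤ β2; otherwise it grows with the ratio of the switching cost to the other two costs.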
20
Southern Women Data Set [DGG 1941] 18 individuals, 14 time steps Collected in Natchez, MS, 1935 aggregated network
21
Ethnography [DGG1941] Core note: columns not ordered by time
22
Optimal Communities (all costs equal; white circles = unknown)
[Figure: individuals vs. time, with the ethnography core marked]
23
[Figure: approximate and optimal communities over time, with the ethnography core marked]
24
Approximation Power
[Figure panels: 28 inds, 44 times; 29 inds, 82 times; 313 inds, 758 times]
25
Approximation Power
[Figure panels: 41 inds, 418 times; 264 inds, 425 times; 96 inds, 1577 times]
26
Conclusions
Identity of objects that change over time (Ship of Theseus Paradox)
Formulate an optimization problem
Greedy approximation
– Fast
– Near-optimal
Future Work
– Algorithm with guarantee not depending on α, β1, β2
– Network snapshots instead of disjoint groups
27
Thank You
Arun Maiya, Saad Sheikh, Habiba, David Kempe, Jared Saia, Mayank Lahiri, Dan Rubenstein, Tanya Berger-Wolf, Rajmonda Sulo, Robert Grossman, Siva Sundaresan, Ilya Fischoff, Anushka Anand, Chayant
NSF grant, KDD student travel award
28
Ravi Kumar, Jasmine Novak, Prabhakar Raghavan, Andrew Tomkins IBM Almaden Research Center On the Bursty Evolution of Blogspace
29
Blogspace: a collection of blogs with their links
Motivation
– Sociological: different from traditional web pages
– Technical: from static snapshots to dynamic graphs
30
Web communities (Ravi Kumar, 1999): groups of individuals who share a common interest, characterized by dense directed bipartite subgraphs.
Bursty communities of blogs
– Exhibit striking temporal characteristics
– Extract the community within a time interval
31
Time graph G = (V, E)
– each v in V has an associated duration D(v)
– each e in E is a triple (u, v, t), where t is a time in the interval D(u) ∩ D(v)
Prefix of G at time t: G_t = (V_t, E_t)
– V_t = {v in V | D(v) ∩ [0, t] ≠ Ø}
– E_t = {(u, v, t’) in E | t’ ≤ t}
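The prefix definition can be sketched directly in code. The data layout is an assumption made for illustration: durations are closed intervals (start, end) keyed by vertex, and edges are (u, v, time) triples.

```python
def prefix(durations, edges, t):
    """Prefix G_t of a time graph.

    durations -- dict mapping each vertex to its duration (start, end)
    edges     -- list of (u, v, time) triples
    Returns (V_t, E_t): vertices whose duration intersects [0, t],
    and edges whose timestamp is at most t.
    """
    v_t = {v for v, (start, end) in durations.items() if start <= t and end >= 0}
    e_t = [(u, v, time) for (u, v, time) in edges if time <= t]
    return v_t, e_t


durations = {"a": (0, 10), "b": (2, 8), "c": (5, 9)}
edges = [("a", "b", 3), ("b", "c", 6), ("a", "c", 7)]

print(prefix(durations, edges, 4))
```

At t = 4 only blogs a and b have started and only the edge created at time 3 is included; vertex c and the two later edges appear in larger prefixes.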
32
Two-step approach
– Community extraction: extract dense subgraphs (potential communities)
– Burst analysis: analyze each dense subgraph to identify and rank bursts in these communities
33
Finding the densest subgraph: NP-hard
Two steps:
– Pruning
  Remove vertices of degree no more than one
  Check whether vertices of degree two belong to a K3 (triangle)
  Output and remove communities (those that pass a threshold)
  Repeat the 3 steps above
– Expanding
  Determine the vertex with the most links to the community
  Add it to the community if its number of links is larger than the threshold t_k
34
Kleinberg’s method (SIGKDD 2002)
– model the generation of events by an automaton in one of two states, “low” and “high”
– the high state is hypothesized as generating bursts of events
– a cost is associated with any state transition to discourage short bursts
– find a low-cost state sequence that is likely to generate the stream
– solves the problem of enumerating all the bursts by order of weight (dynamic programming)
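A simplified sketch of the two-state idea, not the code used in the paper. Assumptions made here: inter-arrival gaps are modeled as exponential, the high state has a rate `s` times the base rate, and only low-to-high transitions are charged (cost `gamma * ln n`). The function name and defaults are hypothetical. The cheapest state sequence is found by Viterbi-style dynamic programming.

```python
import math


def two_state_bursts(times, s=2.0, gamma=1.0):
    """Label each inter-arrival gap as low (0) or high/bursty (1).

    times -- sorted event timestamps (at least two, not all equal)
    s     -- rate multiplier of the high state (assumed parameter)
    gamma -- scales the cost of entering the high state (assumed parameter)
    """
    gaps = [b - a for a, b in zip(times, times[1:])]
    n = len(gaps)
    base = n / sum(gaps)
    rates = (base, s * base)            # event rates of the low and high state
    up_cost = gamma * math.log(n)       # charged only for low -> high

    def emit(state, gap):
        # Negative log-density of an exponential gap under the state's rate.
        return rates[state] * gap - math.log(rates[state])

    cost = [0.0, math.inf]              # the automaton starts in the low state
    back = []                           # back[k][j]: best predecessor of state j at gap k
    for gap in gaps:
        new_cost, choices = [], []
        for j in (0, 1):
            cands = [cost[i] + (up_cost if (i, j) == (0, 1) else 0.0) for i in (0, 1)]
            best = 0 if cands[0] <= cands[1] else 1
            new_cost.append(cands[best] + emit(j, gap))
            choices.append(best)
        cost = new_cost
        back.append(choices)

    # Trace the cheapest path backwards from the final gap.
    state = 0 if cost[0] <= cost[1] else 1
    states = [state]
    for choices in reversed(back[1:]):
        state = choices[state]
        states.append(state)
    states.reverse()
    return states


# Sparse events, then a dense run, then sparse again.
events = [0, 10, 20, 30, 31, 32, 33, 34, 35, 45, 55, 65]
print(two_state_bursts(events))
```

Raising `gamma` makes the automaton more reluctant to enter the high state, so short dense runs stop being reported as bursts.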
35
Expansion in community extraction Edges must grow to triangles; communities of size up to six will only grow vertices that link to all but one vertex; Communities of size up to nine will only grow vertices that link to all but two vertices; communities up to size 20 will grow only vertices that link to 70% of the community; larger communities will grow only vertices that link to at least 60% of the community
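The size-dependent thresholds above can be sketched as a function, together with a greedy expansion loop. Assumptions for this sketch: links are treated as undirected, the graph is a dict of neighbor sets, percentages are rounded up, and a vertex is added when it meets the threshold. The names `required_links` and `expand` are hypothetical.

```python
def required_links(k):
    """Minimum number of links into a community of size k needed to join it."""
    if k <= 2:
        return k                   # an edge must grow to a triangle
    if k <= 6:
        return k - 1               # all but one vertex
    if k <= 9:
        return k - 2               # all but two vertices
    if k <= 20:
        return (7 * k + 9) // 10   # ceil(70% of k), integer arithmetic
    return (6 * k + 9) // 10       # ceil(60% of k)


def expand(seed, adjacency):
    """Grow a community from `seed` while some neighbor meets the threshold.

    adjacency -- dict mapping every vertex to the set of its neighbors
    """
    community = set(seed)
    while True:
        candidates = set().union(*(adjacency[v] for v in community)) - community
        if not candidates:
            break
        # Pick the outside vertex with the most links into the community.
        best = max(sorted(candidates), key=lambda v: len(adjacency[v] & community))
        if len(adjacency[best] & community) < required_links(len(community)):
            break
        community.add(best)
    return community


adjacency = {
    "a": {"b", "c", "d"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"a", "b", "c", "e"},
    "e": {"d"},
}
print(sorted(expand({"a", "b"}, adjacency)))  # ['a', 'b', 'c', 'd']
```

Vertex e links to only one member of the four-vertex community, below the required three, so the expansion stops there.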