Published by Scot Sherman. Modified over 9 years ago.
1
Greedy Approximation Algorithms for finding Dense Components in a Graph
Paper by Moses Charikar. Presentation by Paul Horn.
2
Overview
Differing definitions of density; The problem; Undirected case: linear programming, network flows, approximation; Directed case.
3
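Defining Density
The natural definition of density relates the number of edges to the number of possible edges. In other words, given G(V,E), the density is |E| / C(|V|, 2), where C(|V|, 2) = |V|(|V|-1)/2 is the number of possible edges.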
4
Problems with Density
This simple definition of density does not make sense when looking for a densest subgraph: any two vertices connected by an edge already have density 1, and asking for the largest subgraph of density 1 reduces to the maximum clique problem.
5
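Redefining Density
Instead we define the density of a subgraph in terms of its average degree. Charikar's paper uses f(S) = |E(S)| / |S|, where E(S) is the set of edges with both endpoints in S; this is half the average degree of the subgraph induced by S. This definition of density is appropriate for sparse graphs. It is, however, inappropriate for Erdős-Rényi random graphs.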
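To make the contrast between the two definitions concrete, here is a minimal Python sketch. The helper names (`induced_edges`, `edge_density`, `f`) and the example graph (a 4-clique with one pendant vertex) are illustrative assumptions, not taken from the slides. It assumes `f(S) = |E(S)| / |S|`, where E(S) is the set of edges with both endpoints in S.

```python
def induced_edges(S, edges):
    """Edges with both endpoints in the vertex set S."""
    S = set(S)
    return [(u, v) for u, v in edges if u in S and v in S]

def edge_density(S, edges):
    """Edges present divided by edges possible; assumes |S| >= 2."""
    n = len(set(S))
    return len(induced_edges(S, edges)) / (n * (n - 1) / 2)

def f(S, edges):
    """Average-degree style density |E(S)| / |S|; assumes S is non-empty."""
    return len(induced_edges(S, edges)) / len(set(S))

# Hypothetical example: a 4-clique on {0,1,2,3} plus a pendant vertex 4.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (0, 4)]

single_edge = {0, 1}
clique = {0, 1, 2, 3}
print(edge_density(single_edge, edges), edge_density(clique, edges))
print(f(single_edge, edges), f(clique, edges))
```

Under the first definition a single edge and the 4-clique are indistinguishable, while f separates them.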
6
Density of a Directed Graph
Introduced by Kannan and Vinay. Given a digraph G(V,E), consider vertex subsets S, T of V and let E(S,T) be the set of directed edges from S to T. Then the density of the sets S and T is d(S,T) = |E(S,T)| / sqrt(|S| |T|). The density of the graph G is d(G) = max over all S, T of d(S,T).
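As a small illustration, the sketch below computes the Kannan and Vinay density d(S,T) = |E(S,T)| / sqrt(|S| |T|) and, for tiny graphs only, d(G) by brute force over all pairs of non-empty subsets. The function names and the example digraph are assumptions made for this sketch.

```python
import math
from itertools import combinations

def directed_density(S, T, edges):
    """d(S,T) = |E(S,T)| / sqrt(|S| * |T|) for non-empty S and T."""
    S, T = set(S), set(T)
    count = sum(1 for u, v in edges if u in S and v in T)
    return count / math.sqrt(len(S) * len(T))

def nonempty_subsets(vertices):
    vertices = list(vertices)
    for k in range(1, len(vertices) + 1):
        yield from combinations(vertices, k)

def brute_force_density(vertices, edges):
    """d(G) by exhaustive search; exponential, for tiny examples only."""
    return max(directed_density(S, T, edges)
               for S in nonempty_subsets(vertices)
               for T in nonempty_subsets(vertices))

# Hypothetical example: every edge from {0, 1} to {2, 3}.
digraph = [(0, 2), (0, 3), (1, 2), (1, 3)]
print(directed_density({0, 1}, {2, 3}, digraph))
print(brute_force_density(range(4), digraph))
```

S and T are not required to be disjoint here, matching the definition used in the slides.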
7
The problem
Known exact algorithms for finding a maximum density subgraph of a graph are cubic or slower. For large graphs, such as the webgraph or even any sizable chunk of it, this is too slow.
8
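Linear programming
In the undirected case, an exact solution can be found by solving the following LP, as given in Charikar's paper:
maximize the sum of x_ij over all edges ij in E
subject to x_ij <= y_i and x_ij <= y_j for every edge ij in E,
the sum of y_i over all vertices i is at most 1,
x_ij >= 0 and y_i >= 0.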
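A short worked check of why this LP relates to density, assuming the form used in Charikar's paper: one variable y_i per vertex and x_ij per edge, with x_ij <= y_i, x_ij <= y_j, and the y_i summing to at most 1. With f(S) = |E(S)|/|S|, any vertex set S yields a feasible solution whose objective value is exactly f(S), so the LP optimum is at least the maximum density:

```latex
\begin{align*}
y_i &= \tfrac{1}{|S|} \ \text{for } i \in S, \qquad y_i = 0 \ \text{for } i \notin S,\\
x_{ij} &= \tfrac{1}{|S|} \ \text{for } i, j \in S, \qquad x_{ij} = 0 \ \text{otherwise},\\
\sum_{i} y_i &= |S| \cdot \tfrac{1}{|S|} = 1, \qquad
\sum_{ij \in E} x_{ij} = \frac{|E(S)|}{|S|} = f(S).
\end{align*}
```

The paper also proves the converse direction (the LP optimum is no larger than the maximum density), which this check does not show.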
9
Go with the flow?
A flow-based algorithm to find a maximum density subgraph exists: Finding a Maximum Density Subgraph, by A.V. Goldberg. It creates a digraph from the undirected graph and uses flows to partition the graph. Requires O(log n) executions of a max flow algorithm.
10
Getting Greedy…
Since the density of a subgraph S is its average degree, nodes of lowest degree are least likely to be a part of the densest subgraph. Algorithm: repeatedly remove the vertex of lowest degree, and return the densest of the subgraphs seen along the way. Runs in O(|V| + |E|) time. Theorem: the algorithm is a 2-approximation for the maximum of f(S).
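The peeling procedure can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes density f(S) = |E(S)|/|S|, a simple undirected graph on vertices 0..n-1 with n >= 1, and it picks the minimum-degree vertex by a linear scan, so it runs in O(n^2 + m) rather than linear time (a bucket queue keyed by degree would be needed for that). The name `greedy_densest_subgraph` is hypothetical.

```python
def greedy_densest_subgraph(n, edges):
    """Peel minimum-degree vertices; return (best density, best vertex set)."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        if u != v:                      # ignore self-loops in this sketch
            adj[u].add(v)
            adj[v].add(u)

    alive = set(range(n))
    deg = {v: len(adj[v]) for v in alive}
    m = sum(deg.values()) // 2          # edges inside the current subgraph

    best_density, best_set = m / len(alive), set(alive)
    while len(alive) > 1:
        # Lowest-degree vertex; ties broken by vertex id for determinism.
        v = min(alive, key=lambda x: (deg[x], x))
        alive.remove(v)
        m -= deg[v]
        for w in adj[v]:
            if w in alive:
                deg[w] -= 1
        density = m / len(alive)
        if density > best_density:
            best_density, best_set = density, set(alive)
    return best_density, best_set

# Hypothetical example: a 4-clique on {0,1,2,3} plus a pendant vertex 4.
peel_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (0, 4)]
peel_density, peel_set = greedy_densest_subgraph(5, peel_edges)
print(peel_density, sorted(peel_set))
```

On this example the pendant vertex is peeled first, leaving the clique as the densest intermediate subgraph.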
11
Directing our Insight
Finding the maximum d(S,T) is harder, as we need to maximize over all pairs of subsets S and T. For the exact case, we can generalize our LP, using |S|/|T| = c as a parameter, to give a new LP(c). The exact optimum can be found by solving O(n^2) linear programs, one per possible value of c.
12
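LP(c)
As given in Charikar's paper, LP(c) is:
maximize the sum of x_ij over all edges ij in E
subject to x_ij <= s_i and x_ij <= t_j for every edge ij in E,
the sum of s_i is at most sqrt(c), the sum of t_j is at most 1/sqrt(c),
x_ij >= 0, s_i >= 0, t_j >= 0.
A solution to this linear program corresponds to the densest sets S, T such that |S|/|T| = c for a given value of c. Therefore d(G) is the maximum of the optimum of LP(c) over all values of c.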
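A short worked check of the normalization, assuming the form of LP(c) in Charikar's paper (variables s_i, t_j per vertex and x_ij per edge, with x_ij <= s_i, x_ij <= t_j, the s_i summing to at most sqrt(c) and the t_j to at most 1/sqrt(c)), and the density d(S,T) = |E(S,T)| / sqrt(|S||T|). For sets S, T with |S|/|T| = c, the natural assignment is feasible and has objective value d(S,T):

```latex
\begin{align*}
s_i &= \frac{\sqrt{c}}{|S|} \ (i \in S), \qquad
t_j = \frac{1}{\sqrt{c}\,|T|} \ (j \in T), \qquad
\frac{\sqrt{c}}{|S|} = \frac{\sqrt{c}}{c\,|T|} = \frac{1}{\sqrt{c}\,|T|},\\
x_{ij} &= \frac{\sqrt{c}}{|S|} \ \text{for } ij \in E(S,T), \qquad x_{ij} = 0 \ \text{otherwise},\\
\sum_{ij \in E} x_{ij} &= |E(S,T)| \cdot \frac{\sqrt{c}}{|S|}
 = \frac{|E(S,T)|}{\sqrt{|S|^2 / c}}
 = \frac{|E(S,T)|}{\sqrt{|S|\,|T|}} = d(S,T).
\end{align*}
```

The two values s_i and t_j coincide exactly because |S| = c|T|, which is what lets a single value serve as x_ij on every edge from S to T.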
13
Approximate this.
Idea: maintain two sets, S and T. At each iteration, remove either the vertex of lowest ‘degree’ in S or the vertex of lowest ‘degree’ in T, based on a certain rule. We define the degree of a vertex x in S to be |E({x}, T)| and the degree of a vertex y in T to be |E(S,{y})|. Our rule is based on the same idea of c = |S|/|T| that we found in the linear program, so each pass finds an S and T that maximize for that particular c.
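A minimal sketch of one pass for a fixed c follows. The slides do not state the removal rule, so the rule used here is an assumption based on the form in Charikar's paper: remove from S when sqrt(c) * d_S <= d_T / sqrt(c) (equivalently c * d_S <= d_T), otherwise remove from T, where d_S and d_T are the current minimum degrees. The function name and example are hypothetical, and the minimum is found by a linear scan, so this is not an O(m+n) implementation.

```python
import math

def directed_greedy(n, edges, c):
    """One greedy pass for ratio parameter c; returns (density, S, T)."""
    out_nbrs = {v: set() for v in range(n)}
    in_nbrs = {v: set() for v in range(n)}
    for u, v in edges:
        out_nbrs[u].add(v)
        in_nbrs[v].add(u)

    S, T = set(range(n)), set(range(n))
    out_deg = {v: len(out_nbrs[v]) for v in S}   # |E({v}, T)| for v in S
    in_deg = {v: len(in_nbrs[v]) for v in T}     # |E(S, {v})| for v in T
    num_edges = sum(out_deg.values())            # |E(S, T)|

    best = (num_edges / math.sqrt(len(S) * len(T)), set(S), set(T))
    while S and T:
        i = min(S, key=lambda v: (out_deg[v], v))
        j = min(T, key=lambda v: (in_deg[v], v))
        # ASSUMED rule: compare sqrt(c)*d_S with d_T/sqrt(c), i.e. c*d_S vs d_T.
        if c * out_deg[i] <= in_deg[j]:
            S.remove(i)
            num_edges -= out_deg[i]
            for w in out_nbrs[i]:
                if w in T:
                    in_deg[w] -= 1
        else:
            T.remove(j)
            num_edges -= in_deg[j]
            for w in in_nbrs[j]:
                if w in S:
                    out_deg[w] -= 1
        if S and T:
            d = num_edges / math.sqrt(len(S) * len(T))
            if d > best[0]:
                best = (d, set(S), set(T))
    return best

# Hypothetical example: every edge from {0, 1} to {2, 3}, with c = 1.
pass_density, pass_S, pass_T = directed_greedy(
    4, [(0, 2), (0, 3), (1, 2), (1, 3)], 1.0)
print(pass_density, sorted(pass_S), sorted(pass_T))
```

To approximate d(G), this pass would be repeated for each candidate value of c and the best result kept.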
14
Analyzing our Approximation
When run over all values of c, this algorithm gives a 2-approximation of d(G). There are, however, roughly n^2 possible values of c, and each iteration (one value of c) can run in O(m+n) time, so running through all possible values becomes prohibitive. An approximation with a weaker guarantee is possible in fewer iterations of the algorithm.
15
Generalizations, and notes
While there is a flow-based algorithm for finding a maximum density subgraph of an undirected graph, none is known for a digraph. Both cases can be generalized to weighted graphs, but the linear running time of the algorithm no longer holds. Using Fibonacci heaps, it can run in O(m + n log n) (in the directed case, for a single value of c).
16
Wrapping Up
Finding dense subgraphs is important in areas such as clustering. The Kannan and Vinay definition of density is motivated by the idea of hubs and authorities. With large graphs (such as any sizable chunk of the webgraph), solving the n^2 LPs to find the exact densest subgraph is unrealistic.
17
Wrapping Up: The Sequel
Therefore, the paper:
Provides LP solutions to both the directed and undirected cases.
Provides a linear-time approximation algorithm for undirected graphs.
Generalizes the algorithm to directed graphs, finding sets S and T given |S|/|T| = c.
Observes that this is a 2-approximation when run over all values of c, and that an approximation with a weaker guarantee is possible in fewer iterations.
18
Future Work
A flow-based algorithm for the directed case.
The definition of density which we used does not require S and T to be disjoint. How does this requirement affect the algorithm and its complexity? An n-approximation of d(G) can provide an O(n)-approximation of d’(G).