Liaoruo Wang and John E. Hopcroft
Dept. of Computer Engineering & Computer Science, Cornell University
In Proc. 7th Annual Conference on Theory and Applications of Models of Computation (TAMC), June 2010
Presented by Nam Nguyen
◦ Motivation
◦ Introduction
◦ Contributions of the paper
◦ Definitions
◦ WHISKER is NP-Complete
◦ Algorithms
Community structure (C.S.) is a classical but still active topic in complex networks.
Previous studies: communities were assumed to be densely connected inside but sparsely connected to the outside.
A different point of view: we should disregard “whiskers” and focus on the “cores” of the network.
Roughly speaking
◦ Whiskers: subsets of vertices that are barely connected to the rest of the network.
◦ Cores: connected subgraphs that are densely connected inside and well connected to the rest of the network, i.e., “real communities”.
Why?
◦ In real-world societies, communities are also well connected to the rest of the network.
◦ Imagine a close-knit community, e.g. the CISE Dept., with only one connection to the outside world.
Formal definitions follow.
◦ More concrete definitions of “whiskers” and “cores” in a network.
◦ WHISKER is NP-Complete.
◦ Three heuristic algorithms for finding approximate cores.
◦ Simulation results.
Graph $G = (V, E)$, undirected, $A = (A_{ij})$. For $S \subseteq V$, let $S^C = V \setminus S$. Conductance of S where A suitable cut
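The conductance formula did not survive on this slide. As a minimal sketch, the snippet below uses the standard definition, cut(S, S^C) / min(vol(S), vol(S^C)), which is an assumption here and may differ in detail from the paper's. `A` is taken to be a symmetric adjacency matrix; the helper names `adjacency_from_edges` and `conductance` are hypothetical.

```python
def adjacency_from_edges(n, edges):
    """Build a symmetric 0/1 adjacency matrix A = (A_ij) for an undirected graph."""
    adj = [[0] * n for _ in range(n)]
    for u, v in edges:
        adj[u][v] = 1
        adj[v][u] = 1
    return adj


def conductance(adj, subset):
    """Conductance of S under the ASSUMED standard definition:
    cut(S, S^C) / min(vol(S), vol(S^C)), where vol is the sum of degrees."""
    n = len(adj)
    s = set(subset)
    comp = set(range(n)) - s
    if not s or not comp:
        raise ValueError("S and its complement must both be non-empty")
    # Total weight of edges crossing from S to S^C.
    cut = sum(adj[i][j] for i in s for j in comp)
    # Volume = sum of (weighted) degrees on each side.
    vol_s = sum(sum(adj[i]) for i in s)
    vol_c = sum(sum(adj[i]) for i in comp)
    denom = min(vol_s, vol_c)
    if denom == 0:
        raise ValueError("one side of the cut has no incident edges")
    return cut / denom


# Two triangles {0,1,2} and {3,4,5} joined by the single edge (2, 3):
# one cut edge over a volume of 7 on each side.
TWO_TRIANGLES = adjacency_from_edges(
    6, [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
)
print(conductance(TWO_TRIANGLES, [0, 1, 2]))
```

A low value like this one marks a set that is barely attached to the rest of the graph, which is the intuition behind a whisker.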
A k-whisker A maximal k-whisker
A whisker A maximal whisker
A core
Proof The only suitable cut of size = 26 |S ⋃ T| = 25 >
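Proof
(1a) $e_{xr} + e_{xz} + e_{yr} + e_{yz} \le v_x + v_y$
(1b) $e_{yr} + e_{xy} + e_{zr} + e_{xz} \le v_y + v_z$
(1c) $e_{xr} + e_{yr} + e_{zr} > v_x + v_y + v_z$
Adding (1a) and (1b), then applying (1c), gives
$e_{xr} + 2e_{yr} + e_{zr} + e_{xy} + e_{yz} + 2e_{xz} \le v_x + 2v_y + v_z < e_{xr} + e_{yr} + e_{zr} + v_y$
Cancelling $e_{xr} + e_{yr} + e_{zr}$ and dropping the non-negative term $2e_{xz}$:
$e_{yr} + e_{xy} + e_{yz} < v_y$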
NAE-3-SAT: the problem of determining whether there exists a truth assignment for a 3-CNF Boolean formula such that each clause has at least one true literal and at least one false literal.
Fact: NAE-3-SAT is NP-Complete [1]
WHISKER: given an unweighted undirected graph, determine whether it contains a whisker.
WHISKER is NP-Complete (by a reduction from NAE-3-SAT)
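To make the NAE-3-SAT definition concrete, here is a small brute-force checker. It is only an illustration of the definition (exponential in the number of variables), not part of the paper. The encoding is an assumption: literal `+i` stands for x_i and `-i` for ¬x_i, and the function names are hypothetical.

```python
from itertools import product


def nae_satisfied(clause, assignment):
    """True if the clause has at least one true AND at least one false literal."""
    values = [assignment[abs(lit)] == (lit > 0) for lit in clause]
    return any(values) and not all(values)


def solve_nae3sat(num_vars, clauses):
    """Try all 2^num_vars assignments; return the first NAE-satisfying one, or None."""
    for bits in product([False, True], repeat=num_vars):
        assignment = {i + 1: bits[i] for i in range(num_vars)}
        if all(nae_satisfied(c, assignment) for c in clauses):
            return assignment
    return None


# (x1 v x2 v x3): any assignment that is not all-equal works.
print(solve_nae3sat(3, [(1, 2, 3)]))
# (x1 v x2 v x2) forces x1 != x2, (x1 v ~x2 v ~x2) forces x1 == x2: no solution.
print(solve_nae3sat(2, [(1, 2, 2), (1, -2, -2)]))
```

Note that flipping every variable of a NAE-satisfying assignment gives another NAE-satisfying assignment, since each clause keeps one true and one false literal.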
Road map
◦ 1. Construct a special graph G of 2n vertices and show that G admits $2^n$ whiskers and no more.
◦ 2. Construct a G-like graph for the 3-SAT problem.
◦ 3. Reduce NAE-3-SAT to WHISKER.
WHISKER is in NP
Reduction from NAE-3-SAT to WHISKER
◦ Consider the following graph (constructed in polynomial time)
In each row, pick exactly one vertex (i.e., either $x_i$ or $\neg x_i$).
The resulting subgraph of n vertices is a whisker.
The total number of whiskers is $2^n$, and no more than that.
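The count of $2^n$ comes from choosing one literal in each of the n rows. The sketch below only enumerates those candidate vertex sets; it does not verify the whisker property, because the graph from the slide's figure is not reproduced here. The encoding (`+i` for x_i, `-i` for ¬x_i) and the name `row_selections` are assumptions.

```python
from itertools import product


def row_selections(n):
    """Yield every way to pick exactly one literal per row.

    Row i (1-based) holds the vertices x_i (encoded +i) and ~x_i (encoded -i).
    """
    for signs in product((1, -1), repeat=n):
        yield tuple(sign * (i + 1) for i, sign in enumerate(signs))


selections = list(row_selections(3))
print(len(selections))   # one selection per truth assignment of 3 variables
print(selections[0])     # the all-positive selection
```

Each selection corresponds to one truth assignment, which is what lets the reduction tie whiskers to assignments.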
$2^n$ whiskers and no more than that. Why?
Suppose there is a whisker W of 2k+j vertices
Cut size of W
By definition of suitable cut size, we have which implies !!!!
NAE-3-SAT $\le_P$ WHISKER
Consider an instance of NAE-3-SAT with n variables and c clauses.
Construct $G_1, G_2, \ldots, G_c$ as follows
NAE-3-SAT $\le_P$ WHISKER
Now, combine all $G_i$’s and add up all edge weights to get G’.
Next: update G and G’ and combine them into G*, so that the 3CNF has a satisfying assignment if and only if G* contains a whisker.
Update G ( )
Update G’
◦ Scale all edge weights of G’ by a small factor δ, where $cn^2\delta \ll 1$
All whiskers in the new G are the same as in the old G.
G* = G + G’
Goal: if the 3CNF instance has a satisfying truth assignment, then selecting the true literal from each row of G* gives a whisker of size n, and vice versa.
For any truth assignment of 3SAT, rearrange the literals into TRUE and FALSE columns.
If there is a satisfying not-all-equal assignment for 3SAT
◦ Each clause must have at least one TRUE and at least one FALSE literal.
◦ Not all the literals in each clause can be in the same column.
◦ For each clause i, $G_i$ contains $n^2 - 2$ edges connecting its two columns.
◦ The total cut size is required to satisfy
If there is NO satisfying not-all-equal assignment for 3SAT
◦ At least one clause i has all its literals located in the same column, so there are $n^2$ edges between the two columns of $G_i$.
◦ For the other (c-1) clauses, there are at most $n^2 - 2$ edges connecting their two columns. Total number of edges: $(c-1)(n^2-2) + n^2 = cn^2 - 2c + 2$.
◦ In this case, selecting the true literal in each row must not give a whisker, thus
Combining the two inequalities, if ε and δ are chosen such that
then: if the 3CNF instance has a satisfying truth assignment, selecting the true literal from each row of G* gives a whisker of size n, and vice versa.
◦ Hence, NAE-3-SAT $\le_P$ WHISKER □
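The edge counts in the two cases can be checked numerically. This sketch takes the slide's per-clause counts at face value, as an assumption: $n^2 - 2$ cross-column edges for a clause whose literals are split across the columns, and $n^2$ for a clause whose literals all sit in one column. The function name and the `violated` parameter are hypothetical.

```python
def cross_column_edges(n, c, violated):
    """Edges of G' crossing the TRUE/FALSE columns, summed over c clause graphs.

    ASSUMED counts from the slides: n^2 - 2 per NAE-satisfied clause,
    n^2 per violated clause (all literals in the same column).
    """
    if not 0 <= violated <= c:
        raise ValueError("violated must be between 0 and c")
    return (c - violated) * (n * n - 2) + violated * n * n


n, c = 4, 3
satisfied_total = cross_column_edges(n, c, violated=0)   # c(n^2 - 2) = cn^2 - 2c
one_violated = cross_column_edges(n, c, violated=1)      # cn^2 - 2c + 2
print(satisfied_total, one_violated, one_violated - satisfied_total)
```

Under these counts every violated clause adds 2 edges to the cut, and the small scaling factor δ on G’ is what turns that gap into the difference between a suitable and an unsuitable cut.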
On random graphs
◦ Alg 2 can find an approximate core
◦ Alg 3 fails to find an approximate core
◦ The size of the core grows linearly with d = np (fixed n) and logarithmically with n (fixed d)
◦ Does G(n,p) display core structure with high probability when p > 1/n?
Textual graph
◦ Vertices and edges: words and their semantic correlations
◦ Data is crawled from 10K scientific papers of the KDD conference ( )
◦ Pointwise mutual information
◦ Total: 685 vertices and edges
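As a rough sketch of how pointwise mutual information could weight word pairs, the snippet below uses the standard formula PMI(a, b) = log(p(a, b) / (p(a) p(b))). The slides do not say how probabilities were estimated, so counting co-occurrence per document is an assumption, as are the function name `pmi_edges` and the toy documents.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_edges(documents):
    """Return {(word_a, word_b): PMI} for every pair that co-occurs in a document.

    Probabilities are estimated as document frequencies (an ASSUMPTION):
    p(a) = fraction of documents containing a, p(a, b) = fraction containing both.
    """
    n_docs = len(documents)
    word_count = Counter()
    pair_count = Counter()
    for doc in documents:
        words = sorted(set(doc))          # count each word once per document
        word_count.update(words)
        pair_count.update(combinations(words, 2))
    edges = {}
    for (a, b), n_ab in pair_count.items():
        p_ab = n_ab / n_docs
        p_a = word_count[a] / n_docs
        p_b = word_count[b] / n_docs
        edges[(a, b)] = math.log(p_ab / (p_a * p_b))
    return edges


# Toy corpus, invented for illustration only.
docs = [["graph", "core"], ["graph", "core"], ["graph", "whisker"], ["cut"]]
for pair, score in sorted(pmi_edges(docs).items()):
    print(pair, round(score, 3))
```

Positive scores mark pairs that co-occur more often than independence would predict; thresholding such scores is one plausible way to decide which word pairs become edges.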
Both Alg 2 and Alg 3 successfully find approximate cores.
Higher values of λ indicate smaller core sizes.
In Fig. (b), the best community of the textual graph has a large conductance of 0.3, so the best community has as many internal edges as cut edges.
Alg 3 is believed to be more useful.
Does a “whisker” make sense?
[1] Schaefer, T. J. The complexity of satisfiability problems. In Proc. 10th Ann. ACM Symp. on Theory of Computing (1978), Association for Computing Machinery, pp