Community Detection in a Large Real-World Social Network
  Karsten Steinhaeuser, Nitesh V. Chawla
  DIAL Research Group, University of Notre Dame
  April 1, 2008
Cellular Phone Network
  Real social network
    Represents actual interactions between individuals
    Requires intent to communicate
  Network dimensions
    1.3 million nodes (customers)
    1.2 million edges (aggregate of voice and text)
  Contains a wealth of data
    Communication links
    Customer demographics
    Temporal data
    Spatial data
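The slide describes each edge as an aggregate of voice and text interactions. A minimal sketch of such an aggregation follows, assuming a hypothetical record layout of (caller, callee, kind, seconds); the actual schema of the data set is not given in the slides, and the function name `aggregate_edges` is illustrative only.

```python
from collections import defaultdict


def aggregate_edges(records):
    """Collapse per-event records into one undirected edge per customer pair.

    Each record is (caller, callee, kind, seconds). This layout is an
    assumption made for illustration; text messages carry 0 seconds.
    """
    edges = defaultdict(lambda: {"frequency": 0, "duration": 0})
    for caller, callee, kind, seconds in records:
        if caller == callee:
            continue  # skip self-contacts
        key = tuple(sorted((caller, callee)))  # undirected: (a, b) == (b, a)
        edges[key]["frequency"] += 1  # voice and text both count as contacts
        edges[key]["duration"] += seconds
    return dict(edges)


records = [
    ("A", "B", "voice", 120),
    ("B", "A", "text", 0),
    ("A", "C", "voice", 30),
]
edges = aggregate_edges(records)
for pair, stats in sorted(edges.items()):
    print(pair, stats)
```

Keeping both a contact count and a total duration per edge gives the two kinds of real edge weight (call frequency and call duration) that can later be compared against topology-based weights.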
Community Detection with Random Walks
  [Figure: processing pipeline]
  Input Graph
    -> Weight (No Weighting, Topology-Based, or Attribute-Based)
    -> Weighted Graph
    -> Walk (Clustering Using Random Walks)
    -> Co-Association Matrix
    -> Combine (Agglomeration with EA)
    -> Community Structure
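The slide names the stages but not their details, so the following is only a sketch of the general idea under stated assumptions, not the authors' algorithm. It assumes that short weighted random walks are started from every node, that two nodes co-occurring in a walk increment a co-association count, and that pairs whose count reaches a threshold are merged. The merge step here is plain single-link thresholding with union-find, swapped in as a stand-in for the "Agglomeration with EA" stage. All function names and parameters are hypothetical.

```python
import random
from collections import defaultdict


def co_association(adj, walks_per_node=20, walk_length=3, seed=0):
    """Count how often two nodes are visited by the same short random walk.

    adj maps node -> {neighbor: weight}; steps are weight-proportional.
    """
    rng = random.Random(seed)
    counts = defaultdict(int)
    for start in adj:
        for _ in range(walks_per_node):
            visited = {start}
            node = start
            for _ in range(walk_length):
                neighbors = list(adj[node])
                if not neighbors:
                    break  # isolated node: the walk cannot move
                weights = [adj[node][v] for v in neighbors]
                node = rng.choices(neighbors, weights=weights)[0]
                visited.add(node)
            members = sorted(visited)
            for i, u in enumerate(members):
                for v in members[i + 1:]:
                    counts[(u, v)] += 1
    return counts


def agglomerate(adj, counts, threshold):
    """Merge every pair whose co-association count reaches the threshold."""
    parent = {v: v for v in adj}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for (u, v), c in counts.items():
        if c >= threshold:
            parent[find(u)] = find(v)
    groups = defaultdict(set)
    for v in adj:
        groups[find(v)].add(v)
    return list(groups.values())


# Two triangles joined by one weak bridge edge (c-d).
graph = {
    "a": {"b": 1.0, "c": 1.0},
    "b": {"a": 1.0, "c": 1.0},
    "c": {"a": 1.0, "b": 1.0, "d": 0.1},
    "d": {"c": 0.1, "e": 1.0, "f": 1.0},
    "e": {"d": 1.0, "f": 1.0},
    "f": {"d": 1.0, "e": 1.0},
}
counts = co_association(graph)
print(agglomerate(graph, counts, threshold=10))
```

The threshold and walk length are the kind of tuning knobs that make a method of this type "still parameterized"; the values above are arbitrary.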
Algorithm Comparison

  Algorithm               Complexity      Comments / Assessment
  Scalable Random Walks   O(n) with EA    Finds good divisions with high efficiency, still parameterized
  FastQ                   O(n log^2 n)    Computationally efficient, limited by modularity
  WalkTrap                O(n^2 log n)    Finds divisions similar to FastQ but at higher complexity
  MCL                     O(n^3)          Better divisions, but matrix computations limit scalability
Experimental Results
  Edge weighting based on topology
    CCS = clustering coefficient similarity
    CNS = common neighbor similarity
  Real edge weights
    Call frequency
    Call duration
  NAS = edge weighting based on node attributes

  Weighting           Modularity   Time (s)
  CCS (topological)   <
  CNS (topological)   <
  Frequency
  Duration
  NAS
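The results are scored by modularity, and one of the weightings is based on common neighbors. The sketch below shows how both quantities could be computed. The modularity function follows the standard Newman definition for a weighted undirected graph. The slides do not define CNS, so `common_neighbor_weight` assumes a Jaccard-style overlap of the two endpoints' neighborhoods; that choice, and both function names, are illustrative assumptions.

```python
from collections import defaultdict


def common_neighbor_weight(adj, u, v):
    """Assumed CNS definition: Jaccard overlap of the two neighborhoods.

    adj maps node -> set of neighbors. The endpoints themselves are
    excluded so an edge does not count as its own shared neighbor.
    """
    nu = set(adj[u]) - {v}
    nv = set(adj[v]) - {u}
    union = nu | nv
    return len(nu & nv) / len(union) if union else 0.0


def modularity(edges, community_of):
    """Newman modularity Q = sum_c [ w_in(c)/W - (deg(c)/(2W))^2 ].

    edges maps (u, v) -> weight with each undirected edge listed once;
    community_of maps node -> community label.
    """
    total = sum(edges.values())
    if total == 0:
        return 0.0
    inside = defaultdict(float)  # weight of edges inside each community
    degree = defaultdict(float)  # summed weighted degree per community
    for (u, v), w in edges.items():
        degree[community_of[u]] += w
        degree[community_of[v]] += w
        if community_of[u] == community_of[v]:
            inside[community_of[u]] += w
    return sum(inside[c] / total - (degree[c] / (2 * total)) ** 2 for c in degree)


# Two triangles joined by a single bridge edge, unit weights.
pairs = [("a", "b"), ("a", "c"), ("b", "c"),
         ("d", "e"), ("d", "f"), ("e", "f"), ("c", "d")]
adj = defaultdict(set)
for u, v in pairs:
    adj[u].add(v)
    adj[v].add(u)

unit_edges = {p: 1.0 for p in pairs}
labels = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
q = modularity(unit_edges, labels)
cns_edges = {(u, v): common_neighbor_weight(adj, u, v) for u, v in pairs}
print(q, cns_edges)
```

Under this assumed definition, an edge inside a triangle receives a positive weight while the bridge edge receives zero, which is the behavior a topology-based weighting needs in order to steer random walks away from inter-community links.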
Future Work
  Incorporate network dynamics
  Spatial data
  Temporal data