A Distributed and Privacy Preserving Algorithm for Identifying Information Hubs in Social Networks M.U. Ilyas, Z Shafiq, Alex Liu, H Radha Michigan State.

A Distributed and Privacy Preserving Algorithm for Identifying Information Hubs in Social Networks
M.U. Ilyas, Z. Shafiq, Alex Liu, H. Radha
Michigan State University
INFOCOM’11 Mini Conference

2 / 13 Background and Motivation
• Information hubs in social networks
  ─ Definition: users who have a large number of interactions with others.
  ─ Interaction = transmission of information from one user to another, such as posting a comment.
• Hubs are important for the spread of propaganda, ideologies, or gossip.
• Applications
  ─ Free sample distribution
    ● Samsung used Twitter feeds to identify dissatisfied iPhone 4 owners who were the most active in communicating with their friends, and offered them free Galaxy S phones.
  ─ Word-of-mouth advertisement
Alex X. Liu

3 / 13 Problem Statement
• Top-k information hub identification from the friendship graph
  ─ Ground truth: node degree in the interaction graph
  ─ Identifying the top-k hubs directly from the interaction graph is difficult.
    ● Data collection is difficult.
      – Building the interaction graph requires collecting data over a long time.
    ● There is more user information to keep private.
• Distributed
  ─ The friendship graph may not be accessible.
• Privacy-preserving
  ─ Users do not reveal their friends’ lists.

4 / 13 Limitations of Prior Art
• Use interaction graph information
  ─ Influence maximization [Leskovec07, Goyal08]
    ● Centralized
    ● Needs access to the complete graph
• Use friendship graph information [Marsden02, Shi08]
  ─ Degree centrality = number of friends of a node
    ● Measures the immediate rate of spread of a replicable commodity by a node
  ─ Closeness centrality = 1 / (sum of the lengths of the shortest paths from a node to all other nodes)
    ● Optimizes detection time of information flows
  ─ Betweenness centrality = fraction of all-pairs shortest paths passing through a node
    ● Optimizes detection probability of information flows
  ─ Eigenvector centrality
    ● Better than the other three metrics.
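
The degree and closeness definitions on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the graph is connected and unweighted, the adjacency-dictionary representation and the function names are hypothetical, and the toy path graph is invented for the example rather than taken from the talk.

```python
from collections import deque

def degree_centrality(adj):
    """Number of friends of each node; adj maps node -> set of neighbours."""
    return {v: len(nbrs) for v, nbrs in adj.items()}

def closeness_centrality(adj):
    """1 / (sum of shortest-path lengths to all other nodes), via BFS.

    Assumes a connected, unweighted graph.
    """
    scores = {}
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total = sum(dist.values())
        scores[src] = 1.0 / total if total else 0.0
    return scores

# Hypothetical toy friendship graph: a path a - b - c - d.
toy_graph = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
deg = degree_centrality(toy_graph)
clo = closeness_centrality(toy_graph)
```

Both functions need the whole graph in one place, which is the kind of centralized access the distributed setting of the talk tries to avoid.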

5 / 13 Limitations of Eigenvector Centrality
• Eigenvector centrality (EVC)
  ─ Principal eigenvector of the adjacency matrix
• EVC works well enough in graphs consisting of a single cluster/community of nodes.
• The principal eigenvector is “pulled” in the direction of the largest community.

6 / 13 Proposed Approach
1. Top-k information hub identification
  ─ Principal Component Centrality (PCC)
2. Distributed and privacy-preserving
  ─ Power method [Lehoucq96]
  ─ Kempe-McSherry (KM) algorithm [Kempe08]

7 / 13 Principal Component Centrality
• Principal Component Centrality (PCC)
• Use the P << N most significant eigenvectors, not just 1.
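
The transcript does not preserve the PCC formula, so the following is a sketch under an assumed definition: a node's PCC is taken to be the Euclidean norm of its coordinates in the P leading eigenvectors of the adjacency matrix, each scaled by its eigenvalue, with "most significant" read as largest eigenvalue magnitude. The function name and the toy graph (two triangles joined by one edge) are hypothetical, and NumPy's dense `eigh` is used for convenience; it is not the distributed computation discussed in the talk.

```python
import numpy as np

def principal_component_centrality(A, P):
    """PCC sketch (assumed definition): row norms of A @ X, where the
    columns of X are the P leading eigenvectors of the symmetric matrix A."""
    eigvals, eigvecs = np.linalg.eigh(A)        # A must be symmetric
    order = np.argsort(-np.abs(eigvals))[:P]    # P largest-magnitude eigenvalues
    X = eigvecs[:, order]                       # N x P eigenvector matrix
    lam = eigvals[order]                        # matching eigenvalues
    # A @ X equals X scaled column-wise by lam, so no extra product is needed.
    return np.sqrt(((X * lam) ** 2).sum(axis=1))

# Hypothetical toy graph: two triangles (0,1,2) and (3,4,5) joined by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

pcc_1 = principal_component_centrality(A, 1)    # proportional to EVC
pcc_2 = principal_component_centrality(A, 2)
pcc_all = principal_component_centrality(A, 6)  # uses every eigenvector
```

Under this assumed definition, P = 1 gives a scaled copy of EVC, and P = N gives the square root of each node's degree for a 0/1 adjacency matrix, so P interpolates between the two.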

8 / 13 Determine Appropriate # of Eigenvectors in PCC
• Method: phase angle between the EVC vector and the PCC vector
• For our data set, P = 10 is good enough.
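
The phase angle between two centrality vectors is the angle whose cosine is their normalized dot product. The helper below is a minimal sketch; the function name and the example vectors are hypothetical. How the angle is turned into a choice of P is not spelled out on the slide, so one plausible reading (an assumption) is to increase P until the angle stops changing appreciably.

```python
import math

def phase_angle(u, v):
    """Angle in radians between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_phi = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cos_phi)

# Hypothetical centrality vectors for four nodes.
evc_vector = [0.5, 0.5, 0.5, 0.5]
pcc_vector = [0.6, 0.5, 0.5, 0.4]
angle = phase_angle(evc_vector, pcc_vector)
```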

9 / 13 Distributed and Privacy-Preserving
• Iterative algorithms
• Power algorithm
  ─ Pros: Implementation is simple
  ─ Cons:
    ● Communication overhead grows exponentially with each additional eigenvector computation
    ● Suffers from rounding errors
• Kempe & McSherry’s (KM) algorithm
  ─ Pros:
    ● Communication overhead grows linearly with each additional eigenvector computation
    ● Accurate estimation, good convergence
  ─ Cons: Implementation is more complex
• Users don’t reveal friends’ lists to others
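
To show why the power method suits this setting, here is a single-process sketch of power iteration in which each node's update uses only the values held by its own friends. It is an illustration, not the authors' implementation: the function name and the toy graph are hypothetical, only the principal eigenvector is computed, and the normalization step is done centrally here, whereas a real deployment would need some distributed aggregation for it (an assumption of this sketch).

```python
import math

def power_iteration(adj, rounds=100):
    """Estimate the principal eigenvector of the adjacency matrix.

    adj maps node -> set of neighbours. Convergence assumes a connected,
    non-bipartite graph.
    """
    x = {v: 1.0 for v in adj}
    for _ in range(rounds):
        # Node-local step: each node sums the current values of its friends.
        y = {v: sum(x[w] for w in adj[v]) for v in adj}
        # Global step (simplified here): rescale to unit Euclidean length.
        norm = math.sqrt(sum(val * val for val in y.values()))
        x = {v: val / norm for v, val in y.items()}
    return x

# Hypothetical toy graph: triangle a-b-c with an extra node d attached to c.
paw = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
evc = power_iteration(paw)
```

In the node-local step a node only exchanges a running numeric value with its neighbours, so no node has to send its friends' list to anyone else.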

10 / 13 Data Set
• Facebook data collected by Wilson et al. at UCSB
• Consists of:
  1. Friendship graph [input data]
  2. Messages exchanged [ground truth]
• # Users: 3,097,165
• # Friendship Links: 23,667,394
• Average Clustering Coefficient
• # Cliques: 28,889,110

11 / 13 Experimental Results (1/2)
• Correlation coefficient between the PCC vector and the degree centrality vector from the interaction graph
• Logs of 3 time durations
  ─ 1 month, 6 months, ~1 year
• Observation 1: PCC outperforms EVC
• Observation 2: Better accuracy for longer-duration data
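
The slide does not say which correlation coefficient was used; the sketch below assumes the Pearson coefficient. The function name is hypothetical, and the two inputs stand for per-user scores listed in the same user order (for example PCC scores and interaction-graph degrees).

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)
```

A value near 1 would mean that users ranked highly by the friendship-graph metric also tend to have high interaction degree.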

12 / 13 Experimental Results (2/2)
• Evaluate |top-k users identified by the PCC vector ∩ top-k users identified by the degree centrality vector from the interaction graph| / k
• k = 2000 in our experiments
• Observation 1: PCC outperforms EVC
• Observation 2: Better results for longer-duration data
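
The overlap metric on this slide can be computed directly from two score tables. The sketch below is illustrative only: the function name, user identifiers, and scores are hypothetical, and ties at the k-th position are broken arbitrarily by the sort.

```python
def top_k_overlap(scores_a, scores_b, k):
    """Fraction of the top-k users under scores_a that are also top-k under scores_b."""
    top_a = set(sorted(scores_a, key=scores_a.get, reverse=True)[:k])
    top_b = set(sorted(scores_b, key=scores_b.get, reverse=True)[:k])
    return len(top_a & top_b) / k

# Hypothetical scores: PCC from the friendship graph, degree from the interaction graph.
pcc_scores = {"u1": 0.9, "u2": 0.7, "u3": 0.4, "u4": 0.1}
interaction_degree = {"u1": 50, "u2": 3, "u3": 40, "u4": 1}
overlap = top_k_overlap(pcc_scores, interaction_degree, k=2)
```

Here the top-2 sets are {u1, u2} and {u1, u3}, so one of the two identified users matches the ground truth.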

13 / 13 Questions?