Page 1 Inferring Relevant Social Networks from Interpersonal Communication Munmun De Choudhury, Winter Mason, Jake Hofman and Duncan Watts WWW ’10 Summarized.


1 Page 1 Inferring Relevant Social Networks from Interpersonal Communication Munmun De Choudhury, Winter Mason, Jake Hofman and Duncan Watts WWW ’10 Summarized and presented by Kim Chungrim

2 Page 2 Contents Introduction Motivation Inferring Social Networks –Dataset –Constructing Thresholded Networks Network Descriptive Statistics –Network level features –Node level features Network-based Prediction –Node Status / Gender –Future Communication / Community Detection Discussion/Conclusion

3 Page 3 Introduction The rapidly growing volume of electronic communication data has been a great benefit to social network analysis. However, social network analysts face two problems: –Inference problem: the “real” social ties are not directly observable and hence must be inferred from observation of events –Relevance problem: there is no one “true” social network, but rather many such networks, each corresponding to a different definition of a tie, and each relevant to different social processes Depending on the definition of an ‘edge’, a network can have different meanings –1) An edge exists between i and j if either has communicated with the other at least once in the past year –2) An edge exists if each has communicated with the other at least once in the past week –3) An edge exists if each has communicated with the other at least once per week for the past year Which of these networks is the “relevant” one depends on the research question of interest

4 Page 4 Motivation Define a minimum threshold τ on the edges of a network Infer networks for various definitions of the “threshold” over a tie Study the impact of the different thresholded networks on: –Descriptive statistics –Ability of the network to predict node characteristics

5 Page 5 Inferring Social Networks - Datasets University Email –A compiled registry of all email associated with individuals at a large university –Duration: 2 years (6 trimesters) –Number of users: 19,817 –Number of emails: 1.09M –Emails involving non-university domains are disregarded –A node contains information about a person: id, gender, position, etc. Enron Email –A repository of the emails exchanged internally among the employees at Enron –Duration: 4 years –Number of users: 4,736 –Number of emails: 1.06M –A node contains information about a person: id, position, etc.

6 Page 6 Inferring Social Networks - Constructing Thresholded Networks Edge definition –Edge weight w_ij: the geometric mean of the annualized rate of messages exchanged between i and j Edge threshold –Minimum of τ emails between each pair of individuals, over a period of time T –A social graph G(V, E; τ) s.t. an edge (i, j) is in E only if w_ij ≥ τ –A family of networks: {G(τ_1), G(τ_2), …, G(τ_K)}
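The edge definition and threshold on this slide can be sketched in a few lines of Python. This is a minimal sketch, not the authors' code: it assumes the edge weight is sqrt(m_ij * m_ji) / T, where m_ij is the number of messages sent from i to j and T is the observation period in years. The function name `thresholded_edges`, the toy message list, and the threshold values are hypothetical.

```python
from collections import Counter
from math import sqrt

def thresholded_edges(messages, years, tau):
    """Return {(i, j): weight} for pairs whose weight is at least tau.

    messages: iterable of (sender, recipient) pairs (hypothetical input format).
    Assumed weight: geometric mean of the two directed message counts,
    divided by the observation period in years.
    """
    counts = Counter(messages)
    edges = {}
    for (i, j), m_ij in counts.items():
        if i == j:
            continue  # ignore self-addressed messages
        m_ji = counts.get((j, i), 0)
        weight = sqrt(m_ij * m_ji) / years
        # One-directional pairs have weight 0 and are never kept.
        if weight > 0 and weight >= tau:
            edges[tuple(sorted((i, j)))] = weight
    return edges

# Toy data: 6 messages a->b, 24 b->a, 10 a->c (never reciprocated), 2 each way b<->c.
messages = ([("a", "b")] * 6 + [("b", "a")] * 24 + [("a", "c")] * 10
            + [("b", "c")] * 2 + [("c", "b")] * 2)

# A family of networks, one per threshold value.
family = {tau: thresholded_edges(messages, years=2, tau=tau) for tau in (1, 5, 10)}
```

Under this assumed weight, raising the threshold removes weak or one-sided ties first, so the family of networks is nested: every edge in a higher-threshold network also appears in every lower-threshold one.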

7 Page 7 Network Descriptive Statistics – Network Level Features Network density: –Number of edges –Number of connected nodes –Number of components –Relative Sizes of Components
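The network-level features listed on this slide can be computed from an edge list with a breadth-first search. The sketch below is illustrative only: the function name `network_level_features` is hypothetical, isolated nodes are assumed to be absent from the edge list, and the relative size of a component is assumed to be its share of the connected nodes.

```python
from collections import defaultdict, deque

def network_level_features(edges):
    """Compute simple network-level counts from an undirected edge list."""
    adjacency = defaultdict(set)
    for i, j in edges:
        adjacency[i].add(j)
        adjacency[j].add(i)
    adjacency = dict(adjacency)

    seen = set()
    component_sizes = []
    for start in adjacency:
        if start in seen:
            continue
        # Breadth-first search over one connected component.
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        component_sizes.append(size)

    n_nodes = len(adjacency)
    return {
        "n_edges": len({frozenset(edge) for edge in edges}),
        "n_connected_nodes": n_nodes,
        "n_components": len(component_sizes),
        # Assumed normalization: component size / number of connected nodes.
        "relative_sizes": [s / n_nodes for s in sorted(component_sizes, reverse=True)],
    }

features = network_level_features([("a", "b"), ("b", "c"), ("d", "e")])
```

Running this on each network in a thresholded family shows how the counts change as the threshold rises.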

8 Page 8 Network Descriptive Statistics – Network Level Features

9 Page 9 Network Descriptive Statistics – Node Level Features Reach of a node: –Node degree: the number of neighbors of the node –Average Neighbor Degree: the average degree over all of a node’s neighbors –Size of Two-hop Neighborhood: count of all of the node’s neighbors plus all of the node’s neighbors’ neighbors
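The three reach features can be sketched as follows. This is a hedged illustration rather than the paper's implementation: the helper names are hypothetical, and the two-hop neighborhood is assumed to exclude the focal node itself, which the slide does not specify.

```python
from collections import defaultdict

def build_adjacency(edges):
    """Undirected adjacency sets from an edge list."""
    adjacency = defaultdict(set)
    for i, j in edges:
        adjacency[i].add(j)
        adjacency[j].add(i)
    return dict(adjacency)

def reach_features(adjacency, node):
    neighbors = adjacency[node]
    degree = len(neighbors)
    # Mean degree of the node's neighbors (0.0 for an isolated node).
    avg_neighbor_degree = (
        sum(len(adjacency[n]) for n in neighbors) / degree if degree else 0.0
    )
    # Neighbors plus neighbors' neighbors, with the focal node removed (assumption).
    two_hop = set(neighbors)
    for n in neighbors:
        two_hop |= adjacency[n]
    two_hop.discard(node)
    return {
        "degree": degree,
        "avg_neighbor_degree": avg_neighbor_degree,
        "two_hop_size": len(two_hop),
    }

adjacency = build_adjacency(
    [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
)
features_a = reach_features(adjacency, "a")
features_d = reach_features(adjacency, "d")
```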

10 Page 10 Network Descriptive Statistics – Node Level Features Closure of the ego-network: –Embeddedness –Normalized clustering coefficient
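The formulas for these two closure measures did not survive in the transcript, so the sketch below uses common textbook definitions as stand-ins. It assumes embeddedness is the mean Jaccard overlap between the neighborhoods of a node and each of its neighbors, and it computes the standard (unnormalized) local clustering coefficient. Neither is confirmed to match the paper's exact definition, and the function names are hypothetical.

```python
from itertools import combinations

def local_clustering(adjacency, node):
    """Fraction of pairs of the node's neighbors that are themselves connected."""
    neighbors = adjacency[node]
    if len(neighbors) < 2:
        return 0.0
    pairs = list(combinations(sorted(neighbors), 2))
    closed = sum(1 for u, v in pairs if v in adjacency[u])
    return closed / len(pairs)

def embeddedness(adjacency, node):
    """Assumed definition: mean Jaccard overlap of neighborhoods with each neighbor."""
    neighbors = adjacency[node]
    if not neighbors:
        return 0.0
    overlaps = [
        len(neighbors & adjacency[n]) / len(neighbors | adjacency[n])
        for n in neighbors
    ]
    return sum(overlaps) / len(overlaps)

# Toy graph: a triangle a-b-c with a pendant node d attached to c.
adjacency = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
```

In this toy graph, node a sits inside the triangle and has a clustering coefficient of 1, while node c also connects to the pendant node d and so has a lower value.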

11 Page 11 Network Descriptive Statistics – Node Level Features To what extent does a node “bridge” communities: –Network constraint [Burt ’04] –Number of ego components: count of the number of connected components that remain when the focal node and its incident edges are removed
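The ego-component count can be sketched as a component search restricted to the focal node's neighbors. This assumes the ego network consists of the focal node's direct neighbors and the edges among them; the function name `ego_components` and the toy graph are hypothetical.

```python
def ego_components(adjacency, focal):
    """Count connected components among the focal node's neighbors
    once the focal node and its incident edges are removed."""
    unseen = set(adjacency.get(focal, ()))
    unseen.discard(focal)
    components = 0
    while unseen:
        components += 1
        stack = [unseen.pop()]
        while stack:
            node = stack.pop()
            # Only follow edges that stay inside the ego network.
            reachable = adjacency[node] & unseen
            unseen -= reachable
            stack.extend(reachable)
    return components

# Toy graph: a is tied to b, c and d; b and c also know each other.
adjacency = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
```

Here removing a leaves its neighbors split into {b, c} and {d}, so a bridges two otherwise disconnected groups.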

12 Page 12 Network-based Prediction The characteristics of a network depend on the threshold Which network should be chosen for an experiment? Experiments to find the right threshold for various research interests –Predictions on node status/gender –Predictions on future communication activity –Predictions on community detection

13 Page 13 Prediction Tasks: Node Status/Gender Given a feature vector x_i for each node i, a feature matrix X is built from the feature vectors, and a vector y of the status/gender attribute of each node i is constructed. X and y are split into a training set and a test set –Training set: 90% of X and y –Test set: 10% of X and y Using an SVM with a Gaussian RBF kernel, learn the parameters and kernel width with 10-fold cross-validation
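The training procedure on this slide (90/10 split, RBF-kernel SVM, 10-fold cross-validation over the kernel width) can be sketched with scikit-learn. This is an illustration under stated assumptions, not the authors' setup: the data are synthetic stand-ins for node features and a binary attribute, and the parameter grid values are arbitrary.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# Synthetic stand-in data: 200 "nodes" with 4 features and a binary label.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

# 90% training set, 10% test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.1, random_state=0, stratify=y
)

# 10-fold cross-validation over the penalty C and the RBF kernel width gamma.
param_grid = {"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]}  # arbitrary example grid
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=10)
search.fit(X_train, y_train)

# Accuracy of the best cross-validated model on the held-out 10%.
test_accuracy = search.score(X_test, y_test)
```

Repeating this for the feature matrix of each thresholded network gives one test accuracy per threshold, which is how accuracy can be compared across the family of networks.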

14 Page 14 Prediction Tasks: Future Communication / Community Detection Future communication: given a feature vector built from the activity of node j from time t0 to tm and the activity of node i at time tl, the model of communication activity can be expressed as a function of these features –The best-fit regression coefficient is used to predict the future node activity Community detection: fit a stochastic block model to G(τ) using variational Bayes inference [Hofman et al. 2008]

15 Page 15 Experimental Result – University Dataset

16 Page 16 Experimental Results – Enron Dataset

17 Page 17 Conclusion It is hard to find the optimal threshold –Accuracy is maximized at a non-obvious point –Still, accuracy improves by 30% over the unthresholded network –Deleting edges removes noise The optimal threshold falls at a consistent value –For different prediction tasks –For different data sets

18 Page 18 Summary / Discussion / Future work Network inference procedures assume ad-hoc edge filtering The paper introduced a threshold on edges and a family of networks to find an optimal threshold for a given prediction task –The prediction accuracies peak in a non-obvious yet relatively narrow threshold range Limitations: tested on too few datasets, which is not enough to give a solid conclusion Future work: apply the method to a variety of networks and test various thresholds for more research interests

