1 Yuxiao Dong *, Jie Tang $, Tiancheng Lou #, Bin Wu &, Nitesh V. Chawla * How Long will She Call Me? Distribution, Social Theory and Duration Prediction.

Presentation transcript:

1 Yuxiao Dong *, Jie Tang $, Tiancheng Lou #, Bin Wu &, Nitesh V. Chawla * How Long will She Call Me? Distribution, Social Theory and Duration Prediction *University of Notre Dame $ Tsinghua University # Google Inc. & Beijing U. of Posts & Telecoms Yuxiao Dong, Jie Tang, Tiancheng Lou, Bin Wu, Nitesh V. Chawla. How Long will She Call Me? Distribution, Social Theory and Duration Prediction. In ECML/PKDD’13.

2 Outline • Motivation • Dynamic Distribution on Duration • Social Theory on Duration • Duration Prediction • Conclusion

3 Motivation • Mobile calls between people are ubiquitous. 91% of American adults had a mobile phone as of May 2013 [1]. Mobile users can't leave their phones alone for 6 minutes and check them up to 150 times a day [2]. People make, receive or avoid 22 phone calls every day [2]. 1. Pew Internet: Mobile Reports. June 6. 2. Tomi Ahonen. Communities Dominate Brands.

4 Duration Macro-Distribution Double Pareto lognormal distribution (DPLN) [1]. Truncated log-logistic distribution (TLAC) [2]. 1. M. Seshadri, A. Srid., J. Bolot, C. Faloutsos and J. Leskovec. Mobile Call Graphs: Beyond Power-Law and Lognormal Distributions. In KDD’08. 2. P. Melo, L. Akoglu, C. Faloutsos and A. Loureiro. Surprising Patterns for the Call Duration Distribution of Mobile Phone Users. In PKDD’10.

5 Mobile Data • Call Detail Records (CDR): 3.9 million CDRs; 2 months (Dec & Jan. 2008); not from America. • Mobile Network: 272,345 users and 521,925 call edges. • Pareto Principle: 20% of user pairs produce 80% of calls. One-week data is available at
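
The Pareto observation on this slide can be checked on any call log with a few lines of code. The following is a minimal sketch, not the authors' code: the function name `top_pair_share`, the `(caller, callee)` record layout and the toy data are all hypothetical, and treating a pair as undirected is an assumption.

```python
from collections import Counter

def top_pair_share(calls, top_fraction=0.2):
    """Fraction of all calls produced by the most active `top_fraction` of user pairs."""
    # Assumption: a pair is undirected, so (a, b) and (b, a) count as the same pair.
    counts = Counter(frozenset((a, b)) for a, b in calls)
    ranked = sorted(counts.values(), reverse=True)
    k = max(1, round(len(ranked) * top_fraction))
    return sum(ranked[:k]) / sum(ranked)

# Hypothetical toy records: five pairs, one of which makes most of the calls.
calls = [("u1", "u2")] * 16 + [("u3", "u4"), ("u5", "u6"), ("u7", "u8"), ("u9", "u10")]
share = top_pair_share(calls)
print(f"Top 20% of pairs produce {share:.0%} of calls")
```

On real CDR data the same function would be fed one `(caller, callee)` tuple per record.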

6 Roadmap [1] • Existing Macro-Distribution: DPLN distribution; TLAC distribution. • Dynamic Dist. on Duration: Temporal distribution; Demographics distribution. 1. V. Palchykov, K. Kaski, J. Kertesz, A.-L. Barabási and R.I.M. Dunbar. Sex differences in intimate relationships. Scientific Reports 2:

7 Roadmap [1] • Existing Macro-Distribution: DPLN distribution; TLAC distribution. • Dynamic Dist. on Duration: Temporal distribution; Demographics distribution. • Social Theory on Duration: Strong/weak tie; Homophily; Opinion leader; Social balance. 1. V. Palchykov, K. Kaski, J. Kertesz, A.-L. Barabási and R.I.M. Dunbar. Sex differences in intimate relationships. Scientific Reports 2:

8 Roadmap [1] • Existing Macro-Distribution: DPLN distribution; TLAC distribution. • Dynamic Dist. on Duration: Temporal distribution; Demographics distribution. • Social Theory on Duration: Strong/weak tie; Homophily; Opinion leader; Social balance. • Duration Prediction: Dynamic factors; Social factors. 1. V. Palchykov, K. Kaski, J. Kertesz, A.-L. Barabási and R.I.M. Dunbar. Sex differences in intimate relationships. Scientific Reports 2:

9 Dynamic Distribution on Duration

10 Periodicity • Periodic patterns for mobile call duration: Working time (8:00AM-7:00PM), 75 seconds on average; Evening (7:00PM-12:00AM), increasing to 150 seconds at midnight; Early Morning (12:00AM-8:00AM), decreasing to 50 seconds.
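
The three periods named on this slide suggest a simple way to reproduce the temporal averages from raw records. The sketch below is illustrative only: the `(hour, duration_seconds)` record layout, the function names and the toy values are assumptions, not taken from the paper.

```python
from collections import defaultdict

def period_of_day(hour):
    """Map an hour (0-23) to the three periods named on the slide."""
    if 8 <= hour < 19:
        return "working"        # 8:00AM-7:00PM
    if hour >= 19:
        return "evening"        # 7:00PM-12:00AM
    return "early_morning"      # 12:00AM-8:00AM

def mean_duration_by_period(records):
    """Average call duration per period for (hour, duration_seconds) records."""
    totals = defaultdict(lambda: [0.0, 0])  # period -> [sum of durations, call count]
    for hour, duration in records:
        bucket = totals[period_of_day(hour)]
        bucket[0] += duration
        bucket[1] += 1
    return {period: s / n for period, (s, n) in totals.items()}

# Hypothetical toy records: (hour of day, duration in seconds).
records = [(9, 70), (14, 80), (21, 120), (23, 150), (3, 50)]
print(mean_duration_by_period(records))
```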

11 Demographics • Call Duration vs. Demographics: Females make longer calls than males; calls between two females are longer than calls between two males; calls from M to F are longer than calls from F to M; younger users make longer calls.

12 Social Theory on Duration

13 Social Theory • Strong/weak tie: How long do people with a strong or weak tie call? • Link homophily: Do similar users tend to call each other with long or short duration? • Opinion leader: How different are the calling behaviours between opinion leaders and ordinary users? • Social balance: How does the duration-based network satisfy social balance theory?

14 Strong/Weak Tie Using the #calls to measure the tie strength between two users [1]. 1. Jure Leskovec and Eric Horvitz. Planetary-Scale Views on a Large Instant-Messaging Network. In WWW’08.

15 Strong/Weak Tie • Call Duration vs. Social Tie: The stronger the tie, the shorter the calls. There is an 80% probability that a call is < 60s if the two users call each other 1000 times in two months. This differs from the online instant-messaging network [1]. Using the #calls to measure the tie strength between two users. Figure: probability that the call is < 60s. 1. Jure Leskovec and Eric Horvitz. Planetary-Scale Views on a Large Instant-Messaging Network. In WWW’08.
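
The tie-strength measurement described here (number of calls between two users, plotted against the probability of a short call) can be sketched as follows. The record layout `(caller, callee, duration_seconds)`, the function name and the toy data are hypothetical; counting calls per undirected pair is an assumption.

```python
from collections import Counter, defaultdict

def short_call_rate_by_tie_strength(calls, threshold=60):
    """Map tie strength (#calls between a pair) to the fraction of calls shorter than `threshold`."""
    # Assumption: tie strength counts calls in both directions between the two users.
    pair_calls = Counter(frozenset((a, b)) for a, b, _ in calls)
    short = defaultdict(int)
    total = defaultdict(int)
    for a, b, duration in calls:
        strength = pair_calls[frozenset((a, b))]
        total[strength] += 1
        if duration < threshold:
            short[strength] += 1
    return {s: short[s] / total[s] for s in total}

# Hypothetical toy records: one frequently calling pair, one pair with a single call.
calls = [("a", "b", 20), ("b", "a", 30), ("a", "b", 45), ("a", "b", 90), ("c", "d", 200)]
print(short_call_rate_by_tie_strength(calls))
```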

16 Link Homophily Using #common neighbours between two users to measure homophily [1]. 1. Lilian Weng, Filippo Menczer, Yong-Yeol Ahn. Virality Prediction and Community Structure in Social Networks. Scientific Reports. Aug

17 Link Homophily • Call Duration vs. Link Homophily: The more common neighbours, the shorter the calls. There is an 80% probability that the call is < 60s when the two users have > 30 common neighbours. • Call Duration vs. Social Tie + Link Homophily: More homophily and stronger ties, shorter calls. Using #common neighbours between two users to measure homophily. Figure: probability that the call is < 60s. 1. Lilian Weng, Filippo Menczer, Yong-Yeol Ahn. Virality Prediction and Community Structure in Social Networks. Scientific Reports. Aug
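
Counting common neighbours, the homophily measure used on this slide, is a set intersection over adjacency lists. A minimal sketch, assuming an undirected call graph given as an edge list; the function name and the toy edges are hypothetical.

```python
def common_neighbors(edges, u, v):
    """Users who have a call edge with both u and v (graph treated as undirected)."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    # Exclude u and v themselves in case they are directly connected.
    return (neighbors.get(u, set()) & neighbors.get(v, set())) - {u, v}

# Hypothetical toy call edges.
edges = [("v3", "v5"), ("v3", "v4"), ("v4", "v5"), ("v1", "v3"), ("v1", "v5"), ("v2", "v1")]
shared = common_neighbors(edges, "v3", "v5")
print(len(shared), sorted(shared))
```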

18 Opinion Leader Using PageRank to mine the top 1% of users as opinion leaders in the mobile call network [1]; the others are ordinary users. 1. Katz, E. The two-step flow of communication: an up-to-date report of an hypothesis. In: Enis, Cox (eds.) Marketing Classics, 1973

19 Opinion Leader • Call Duration vs. Opinion Leader: Opinion leaders make shorter calls in general; the probability is about 80% that an opinion leader's calls are < 60s. Calls between two opinion leaders are shorter. Using PageRank to mine the top 1% of users as opinion leaders in the mobile call network [1]; the others are ordinary users. OL: opinion leader. OU: ordinary user. Figure: probability that the call is < 60s. 1. Katz, E. The two-step flow of communication: an up-to-date report of an hypothesis. In: Enis, Cox (eds.) Marketing Classics, 1973
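
Selecting opinion leaders by PageRank can be illustrated with a plain power iteration. This is a sketch under stated assumptions rather than the authors' setup: the damping factor 0.85, the iteration count, the caller-to-callee edge direction and the names `pagerank` and `opinion_leaders` are all choices made here for illustration.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """PageRank by power iteration on directed (caller, callee) edges."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by users with no outgoing calls is spread uniformly.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        new_rank = {n: (1 - damping + damping * dangling) / len(nodes) for n in nodes}
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank

def opinion_leaders(edges, top_fraction=0.01):
    """Top `top_fraction` of users by PageRank (at least one user)."""
    rank = pagerank(edges)
    k = max(1, int(len(rank) * top_fraction))
    return sorted(rank, key=rank.get, reverse=True)[:k]

# Hypothetical toy network in which everyone calls v5.
edges = [("v1", "v5"), ("v2", "v5"), ("v3", "v5"), ("v4", "v5"), ("v5", "v1")]
print(opinion_leaders(edges))
```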

20 Social Balance Structural balance: a triad is balanced if all three users are friends or only one pair of them are friends. Assume two users are friends if they call each other at least once. Relationship balance: the balance rate is the percentage of triangles with an even number of negative ties. Assume a tie is a negative one based on #calls or average duration between two nodes.

21 Social Balance • Call Duration vs. Social Balance: Unbalanced in structural balance; balanced in relationship balance. Structural balance: a triad is balanced if all three users are friends or only one pair of them are friends. Assume two users are friends if they call each other at least once. Relationship balance: the balance rate is the percentage of triangles with an even number of negative ties. Assume a tie is a negative one based on #calls or average duration between two nodes. Figure annotation: < 20%, not balanced.
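
The relationship balance rate defined on this slide (share of triangles with an even number of negative ties) can be computed directly once every tie has a sign. The sketch below assumes the signs are already given; how a tie becomes negative (by #calls or average duration) is left out, and the function name and toy signs are hypothetical.

```python
from itertools import combinations

def relationship_balance_rate(signs):
    """Fraction of closed triangles with an even number of negative ties.

    `signs` maps frozenset({u, v}) to +1 or -1 for every tie in the network.
    """
    nodes = sorted({n for pair in signs for n in pair})
    balanced = total = 0
    for a, b, c in combinations(nodes, 3):
        ties = [frozenset((a, b)), frozenset((b, c)), frozenset((a, c))]
        if all(t in signs for t in ties):  # only closed triangles count
            total += 1
            negatives = sum(1 for t in ties if signs[t] < 0)
            balanced += negatives % 2 == 0
    return balanced / total if total else 0.0

# Hypothetical toy signs: one triangle with two negative ties, one with a single negative tie.
signs = {
    frozenset(("v3", "v4")): +1, frozenset(("v4", "v5")): -1, frozenset(("v3", "v5")): -1,
    frozenset(("v1", "v2")): +1, frozenset(("v2", "v3")): +1, frozenset(("v1", "v3")): -1,
}
print(relationship_balance_rate(signs))
```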

22 Duration Prediction

23 Prediction Scenario [Figure: call network among users v1 to v5 at Time 1, with edges labeled by call duration in seconds.] Attribute factors: v1: female, 29y; v2: male, 31y; v3: male, 60y; v4: female, 63y; v5: female, 27y.

24 Prediction Scenario [Figure: call networks among users v1 to v5 at Time 1 and Time 2, with edges labeled by call duration in seconds.] Attribute factors: v1: female, 29y; v2: male, 31y; v3: male, 60y; v4: female, 63y; v5: female, 27y. Social factors: Opinion leader: v5; Strong tie: v4, v5; Weak tie: v1, v3; Homophily: v3, v5; Social balance: v3, v4, v5.

25 Prediction Scenario Can we predict how long this call lasts? [Figure: call networks among users v1 to v5 at Time 1, Time 2 and Time 3, with edges labeled by call duration in seconds.] Attribute factors: v1: female, 29y; v2: male, 31y; v3: male, 60y; v4: female, 63y; v5: female, 27y. Social factors: Opinion leader: v5; Strong tie: v4, v5; Weak tie: v1, v3; Homophily: v3, v5; Social balance: v3, v4, v5. Temporal factors: v5 calls v3 on Mon. 10:00PM.

26 Social Time-dependent Factor Graph (STFG) • PFG: partially labeled factor graph [1] • TRFG: social triad based factor graph [2] 1. W. Tang, H. Zhuang and J. Tang. Learning to infer social ties in large networks. In ECML/PKDD’11. 2. J. Hopcroft, T. Lou and J. Tang. Who will follow you back? Reciprocal relationship prediction. In CIKM’11.

27 Social Time-dependent Factor Graph (STFG) • PFG: partially labeled factor graph [1] • TRFG: social triad based factor graph [2] • STFG: partially labeled + social triad + time dependent 1. W. Tang, H. Zhuang and J. Tang. Learning to infer social ties in large networks. In ECML/PKDD’11. 2. J. Hopcroft, T. Lou and J. Tang. Who will follow you back? Reciprocal relationship prediction. In CIKM’11.

28 Social Time-dependent FG

29 Social Time-dependent FG • Joint distribution: [equation not preserved in the transcript; its terms are labeled Attributes, Social and Temporal]

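30 Social Time-dependent FG • Joint distribution: [equation not preserved in the transcript; its terms are labeled Attributes, Social and Temporal] • Exponential-linear functions to initialize factors: Attribute factor: [equation not preserved]; Social factor: [equation not preserved]; Temporal factor: [equation not preserved]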
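
The formulas on this slide were images and are not in the transcript. As a reading aid only, a generic factor-graph form that matches the slide's wording (a joint distribution built from attribute, social and temporal factors, each an exponential-linear function) could look like the block below. Every symbol in it ($f$, $g$, $h$, $\alpha$, $\beta$, $\gamma$, $\Phi$, $\Psi$, $\Omega$, $Z$) is an assumption introduced for illustration and may differ from the authors' notation and exact definitions.

```latex
% Illustrative generic form, not the authors' equation.
% y_i: label of call i; x_i: its attributes; Y_c: labels in a social triad c;
% y_i^{t-1}, y_i^{t}: labels of the same user pair at consecutive time steps.
P(Y \mid G) = \frac{1}{Z}
  \prod_{i} f(y_i, \mathbf{x}_i)
  \prod_{c} g(Y_c)
  \prod_{i} h(y_i^{t-1}, y_i^{t})

f(y_i, \mathbf{x}_i) = \exp\{\alpha^{\top} \Phi(y_i, \mathbf{x}_i)\}, \qquad
g(Y_c) = \exp\{\beta^{\top} \Psi(Y_c)\}, \qquad
h(y_i^{t-1}, y_i^{t}) = \exp\{\gamma^{\top} \Omega(y_i^{t-1}, y_i^{t})\}
```

In such a form, $Z$ is the normalizing constant and the parameters to learn are the weight vectors $\alpha$, $\beta$ and $\gamma$.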

31 Social Time-dependent FG • STFG objective function: [equation not preserved in the transcript] • Learning: [equation not preserved]; Parameters: [not preserved]

32 Learning Algorithm Gradient descent method. 1. J. Hopcroft, T. Lou and J. Tang. Who will follow you back? Reciprocal relationship prediction. In CIKM’11.

33 Learning Algorithm Gradient descent method, using Loopy Belief Propagation to compute the expectation. 1. J. Hopcroft, T. Lou and J. Tang. Who will follow you back? Reciprocal relationship prediction. In CIKM’11.
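
For exponential-linear models, the gradient of the log-likelihood is the observed feature value minus its expectation under the current model, which is why an expectation has to be computed at every step. The sketch below shows that update on a deliberately tiny model with one binary label per example, where the expectation is computed by exact enumeration instead of Loopy Belief Propagation. The feature function, the learning rate, the step count and the toy data are all assumptions for illustration; this is not the STFG learning algorithm itself.

```python
import math

def feature(x, y):
    """Hypothetical exponential-linear feature: attributes are active only when y = 1."""
    return [xi * y for xi in x]

def expected_feature(x, weights):
    """Model expectation of the feature vector, by exact enumeration over y in {0, 1}."""
    scores = [math.exp(sum(w * f for w, f in zip(weights, feature(x, y)))) for y in (0, 1)]
    z = sum(scores)  # normalizing constant
    return [sum(scores[y] / z * feature(x, y)[i] for y in (0, 1)) for i in range(len(x))]

def train(data, dims, rate=0.5, steps=2000):
    """Gradient ascent on the log-likelihood (descent on its negative)."""
    weights = [0.0] * dims
    for _ in range(steps):
        grad = [0.0] * dims
        for x, y in data:
            observed = feature(x, y)
            expected = expected_feature(x, weights)
            for i in range(dims):
                grad[i] += observed[i] - expected[i]
        weights = [w + rate * g / len(data) for w, g in zip(weights, grad)]
    return weights

# Hypothetical toy data: x = [bias, is_strong_tie], y = 1 if the call is short.
data = [([1, 1], 1), ([1, 1], 1), ([1, 1], 0), ([1, 0], 0), ([1, 0], 0), ([1, 0], 1)]
weights = train(data, dims=2)
print(weights)
```

In a factor graph with loops, exact enumeration is infeasible, so an approximate inference method such as Loopy Belief Propagation takes the place of `expected_feature`.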

34 Experimental Setup • Prediction: Case 1: predict the duration of the next call in the future; Case 2: predict the average duration of calls in a future period.

35 Experimental Setup • Prediction: Case 1: predict the duration of the next call in the future; Case 2: predict the average duration of calls in a future period. • Data: first 7 weeks of CDR data as historic data; Case 1: 1st call duration in the 8th week as the next-call prediction target; Case 2: average duration in the 8th week as the next-average prediction target.

36 Experimental Setup • Prediction: Case 1: predict the duration of the next call in the future; Case 2: predict the average duration of calls in a future period. • Data: first 7 weeks of CDR data as historic data; Case 1: 1st call duration in the 8th week as the next-call prediction target; Case 2: average duration in the 8th week as the next-average prediction target. • Binary Prediction: 60% of calls are shorter than 60 seconds and the remaining 40% are > 60s; there is a jump in the telephone bill when a call reaches 1 minute; the threshold is set to 60 seconds to classify calls as long or short in this work.
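
The Case 1 setup (seven weeks of history, first call of week 8 as the target, 60-second threshold) can be sketched as a small data-preparation step. The record layout `(week, timestamp, caller, callee, duration_seconds)`, the function name and the toy data are hypothetical, and labeling a call of exactly 60 seconds as short is an assumption the slide does not settle.

```python
def build_case1_examples(calls, threshold=60):
    """Split records into 7 weeks of history and week-8 next-call labels per (caller, callee)."""
    history = [c for c in calls if c[0] <= 7]
    targets = {}
    # Walk week-8 calls in time order and keep only the first call of each pair.
    for week, timestamp, caller, callee, duration in sorted(calls, key=lambda c: c[1]):
        pair = (caller, callee)
        if week == 8 and pair not in targets:
            # Assumption: exactly `threshold` seconds counts as a short call.
            targets[pair] = "long" if duration > threshold else "short"
    return history, targets

# Hypothetical toy records: (week, timestamp, caller, callee, duration in seconds).
calls = [
    (1, 10, "v5", "v3", 40),
    (7, 70, "v5", "v3", 90),
    (8, 80, "v5", "v3", 130),
    (8, 85, "v5", "v3", 20),
    (8, 81, "v1", "v2", 60),
]
history, targets = build_case1_examples(calls)
print(len(history), targets)
```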

37 Experimental Setup (Cont.) • Baseline Predictors: SVM: support vector machine by SVM-light; LRC: logistic regression in Weka; Bnet: Bayes network; CRF: conditional random field. • Evaluation: Precision / Recall / F1-Measure
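
The evaluation metrics listed on this slide follow the standard definitions, which can be written out as below. Which class counts as positive is an assumption here ("long" is used for illustration), and the labels are toy values, not results from the paper.

```python
def precision_recall_f1(y_true, y_pred, positive="long"):
    """Precision, recall and F1 for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Hypothetical toy labels.
y_true = ["long", "long", "short", "short", "long"]
y_pred = ["long", "short", "long", "short", "long"]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1)
```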

38 Results Case 1: Next Call Duration Prediction Case 2: Average Call Duration Prediction

39 Factor Contribution G: gender; A: age; B: social balance; T: social tie; H: homophily; O: opinion leader; W: week; D: day.

40 STFG Convergence Our learning algorithm is able to reach convergence quickly.

41 Conclusion & Future Work • Conclusions: Social-theory patterns and dynamic distributions are clearly present in the call duration network; our proposed model can significantly improve the prediction accuracy. • Interesting observations: Young females tend to make long calls, particularly in the evening; familiar people (more calls and more common neighbours) make shorter calls. • Future work: Inferring call duration with a regression model; building duration prediction into a mobile application.

42 Thanks Data&Code: Yuxiao Dong, Jie Tang, Tiancheng Lou, Bin Wu, Nitesh V. Chawla. How Long will She Call Me? Distribution, Social Theory and Duration Prediction. In ECML/PKDD’13.