Exploit of Online Social Networks with Community-Based Graph Semi-Supervised Learning Mingzhen Mo and Irwin King Department of Computer Science and Engineering.

Presentation transcript:

Exploit of Online Social Networks with Community-Based Graph Semi-Supervised Learning
Mingzhen Mo and Irwin King
Department of Computer Science and Engineering, The Chinese University of Hong Kong
ICONIP 2010, Sydney, Australia

Motivation
Online social networks are an important way to interact with friends
A large number of users are attracted to them
– 500 million active users (Facebook)
– 700 billion minutes (Facebook)
The security of users' information attracts much attention from researchers and developers

Problem
Hidden information?
– Users' profile
– Friendship
– Group & network
[Figure: users U1 to U5 linked by friendship, with a group and a network marked]

Example on Facebook
Given:
– Users' profiles, e.g., age, location and phone
– Friendship relationships
– Member lists of groups and networks
Output:
– Predicted university information

Objective
Build a model with a proper algorithm to predict the hidden information
Make better use of community information
Related work
– Graph Theory [G. Flake et al., SIGKDD 2000]
– Supervised Learning [E. Zheleva et al., WWW2009]
– Semi-Supervised Learning [M. Mo et al., IJCNN2010]

Contributions
Propose a novel community-based model
– Predicts hidden information more accurately
Provide two algorithms
– Able to deal with different conditions
Help to understand the security level in social networks

Preparation for Modeling
Definition
– Online social network: G(V, E)
  Profile P_i
  Friendship W_ij
– Two sets
  Labeled data V_l
  Unlabeled data V_u
[Figure: example graph of five users with profiles P_1 to P_5, known labels Y_1 and Y_5, and friendship weights W_1,2, W_1,3, W_2,4, W_3,4, W_3,5, W_4,5]
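The definitions on this slide can be sketched as a small data structure. This is only an illustration: the function name `build_graph`, the unit edge weights and the labels "A" and "B" are assumptions, and only the edge list and the two labeled users follow the slide's figure.

```python
import numpy as np

def build_graph(n_users, friendships, known_labels):
    """Return (W, labeled, unlabeled) for an undirected friendship graph.

    friendships: iterable of (i, j, weight) with 0-based user indices.
    known_labels: dict mapping user index -> class label (the set V_l).
    """
    W = np.zeros((n_users, n_users))
    for i, j, w in friendships:
        W[i, j] = w
        W[j, i] = w  # friendship is treated as symmetric (assumption)
    labeled = sorted(known_labels)
    unlabeled = [v for v in range(n_users) if v not in known_labels]
    return W, labeled, unlabeled

# Edges W_1,2, W_1,3, W_2,4, W_3,4, W_3,5, W_4,5 from the figure, 0-based,
# with hypothetical unit weights; users 1 and 5 carry known labels.
edges = [(0, 1, 1.0), (0, 2, 1.0), (1, 3, 1.0),
         (2, 3, 1.0), (2, 4, 1.0), (3, 4, 1.0)]
W, V_l, V_u = build_graph(5, edges, {0: "A", 4: "B"})
```

In practice the weights W_ij would come from profile similarity or interaction strength; the transcript does not say which.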

Consistency on Graph
Model 1: Basic Graph-Based SSL with Harmonic Function
– Local consistency
Model 2: Local and Global Consistency (LGC) Graph SSL
– Local consistency and global consistency
Model 3: Community-Based Graph (CG) SSL
– Local, global and community consistency
Local consistency: label Y_1 should be similar to label Y_2
Global consistency: the predicted label should be close to the true label Y_2
Community consistency: the predicted label should be close to the true label if user 2 and user 4 are in the same network
[Figure: users U1 to U5 with labels Y_1 and Y_2 and a network community]
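For reference, the two baseline models named on this slide have well-known closed forms in the literature. The sketch below uses those textbook formulations (harmonic function, and LGC with a symmetrically normalized weight matrix); whether the slides use exactly these variants is an assumption, as are the function names and the default `alpha`. Every vertex is assumed to have at least one edge.

```python
import numpy as np

def harmonic_function(W, Y_l, labeled, unlabeled):
    """Harmonic solution F_u = (D_uu - W_uu)^{-1} W_ul Y_l (local consistency)."""
    L = np.diag(W.sum(axis=1)) - W            # graph Laplacian
    L_uu = L[np.ix_(unlabeled, unlabeled)]
    W_ul = W[np.ix_(unlabeled, labeled)]
    return np.linalg.solve(L_uu, W_ul @ Y_l)

def lgc(W, Y, alpha=0.9):
    """LGC closed form F = (I - alpha * S)^{-1} Y with S = D^{-1/2} W D^{-1/2}."""
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.linalg.solve(np.eye(len(W)) - alpha * S, Y)

# Path graph 0-1-2: the ends are labeled with different classes.
W3 = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]])
F_u = harmonic_function(W3, np.eye(2), labeled=[0, 2], unlabeled=[1])

# Path graph 0-1-2-3: only the two ends carry one-hot labels.
W4 = np.zeros((4, 4))
for i in range(3):
    W4[i, i + 1] = W4[i + 1, i] = 1.0
Y4 = np.array([[1.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 1.0]])
F_lgc = lgc(W4, Y4)
```

The middle vertex of the three-vertex path is equidistant from both labels, so the harmonic solution splits evenly; on the four-vertex path each inner vertex leans toward its nearer labeled end.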

Community-based Graph (CG) Model
Input: basic graph, community graph
Output: predicted labels
Objective: [equation not preserved in the transcript; it combines a Local & Global Consistency (LGC) learning term with a community term, and its annotated symbols are the true labels, parameter 1 and parameter 2]
– One symbol in the objective is the Laplacian matrix of the community information
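The objective on this slide was an image and its formula is not in the transcript. One form that would be consistent with the slide's annotations (an LGC learning term, a community term, true labels and two parameters) is sketched below. It is an assumption offered for orientation, not the authors' exact formula; the symbols L (Laplacian of the basic graph), L_c (Laplacian of the community information), Y (true labels), F (predicted labels) and the parameters mu_1, mu_2 are all introduced here for illustration.

```latex
\min_{F}\;
  \underbrace{\operatorname{tr}\!\left(F^{\top} L F\right)
  + \mu_1 \lVert F - Y \rVert_F^{2}}_{\text{LGC learning}}
  \;+\;
  \underbrace{\mu_2 \operatorname{tr}\!\left(F^{\top} L_c F\right)}_{\text{community term}}
```

Under this assumed form, the first trace penalizes label differences across friendship edges, the fitting term keeps predictions near the known labels, and the last trace penalizes label differences inside a community.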

Community-based Graph (CG) Model
Generating the community matrix
– Clustering vertices
  "Distance" is measured by group and network info.
– Mark down each cluster in a matrix
  E.g., a cluster contains the vertices 1, 2 and 3
– [formula not preserved], where n_c is the total number of clusters
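The step of marking each cluster in a matrix could look like the sketch below. The exact matrix used in the talk is not in the transcript, so this assumes one plausible construction: vertices in the same cluster are connected with unit weight, and the Laplacian of that community graph is returned. The function name and the second cluster are hypothetical.

```python
import numpy as np

def community_laplacian(n_vertices, clusters):
    """Build L_c = D_c - W_c from a list of clusters.

    clusters: list of lists of 0-based vertex indices; n_c = len(clusters).
    """
    W_c = np.zeros((n_vertices, n_vertices))
    for members in clusters:
        for i in members:
            for j in members:
                if i != j:
                    W_c[i, j] = 1.0   # same cluster -> connected
    D_c = np.diag(W_c.sum(axis=1))
    return D_c - W_c

# The slide's example cluster {1, 2, 3} becomes indices [0, 1, 2];
# the second cluster is made up to complete the toy example.
L_c = community_laplacian(5, [[0, 1, 2], [3, 4]])
```

The clustering itself, with distance measured on group and network membership, is left out because the transcript does not specify the clustering method.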

Algorithms
Algorithm one
– Closed-form algorithm
– Simple and time-saving
[Flowchart: input, process, output]
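A closed-form solver can be sketched as follows. The transcript does not preserve the authors' formula, so this assumes an objective of the form tr(F'LF) + mu1 * ||F - Y||^2 + mu2 * tr(F'L_c F), whose minimizer satisfies (L + mu1 * I + mu2 * L_c) F = mu1 * Y. The function name, parameter names and toy data are hypothetical.

```python
import numpy as np

def cg_closed_form(W, L_c, Y, mu1=1.0, mu2=1.0):
    """Solve (L + mu1*I + mu2*L_c) F = mu1*Y for the assumed CG objective."""
    L = np.diag(W.sum(axis=1)) - W                  # Laplacian of the basic graph
    A = L + mu1 * np.eye(len(W)) + mu2 * L_c        # positive definite for mu1 > 0
    return np.linalg.solve(A, mu1 * Y)

# Path graph 0-1-2-3 with communities {0, 1} and {2, 3}; only the ends are labeled.
W = np.zeros((4, 4))
for i in range(3):
    W[i, i + 1] = W[i + 1, i] = 1.0
L_c = np.array([[ 1.0, -1.0,  0.0,  0.0],
                [-1.0,  1.0,  0.0,  0.0],
                [ 0.0,  0.0,  1.0, -1.0],
                [ 0.0,  0.0, -1.0,  1.0]])
Y = np.array([[1.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 1.0]])
F = cg_closed_form(W, L_c, Y)
pred = F.argmax(axis=1)
```

A single dense solve is simple and fast for small graphs, which matches the slide's description, but its cost grows roughly cubically with the number of vertices.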

Algorithms
Algorithm two
– Iterative algorithm
– Able to deal with large-scale data
[Flowchart: input, process, True/False decision, output]
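An iterative variant can be sketched as a Jacobi-style fixed-point update that repeats until the labels stop changing, which would correspond to the True/False decision in the flowchart. As the authors' update rule is not in the transcript, this assumes the linear system (L + mu1 * I + mu2 * L_c) F = mu1 * Y with L = D - W and L_c = D_c - W_c; all names and defaults are hypothetical.

```python
import numpy as np

def cg_iterative(W, W_c, Y, mu1=1.0, mu2=1.0, tol=1e-8, max_iter=10_000):
    """Jacobi iteration for (L + mu1*I + mu2*L_c) F = mu1*Y.

    W: friendship weights, W_c: community adjacency (1 if same cluster).
    """
    M = W + mu2 * W_c                        # combined off-diagonal weights
    p = M.sum(axis=1) + mu1                  # diagonal of the system matrix
    F = Y.astype(float).copy()
    for _ in range(max_iter):
        F_next = (M @ F + mu1 * Y) / p[:, None]
        if np.abs(F_next - F).max() < tol:   # converged: stop iterating
            return F_next
        F = F_next
    return F

# Path graph 0-1-2-3 with communities {0, 1} and {2, 3}; only the ends are labeled.
W = np.zeros((4, 4))
W_c = np.zeros((4, 4))
for i in range(3):
    W[i, i + 1] = W[i + 1, i] = 1.0
W_c[0, 1] = W_c[1, 0] = W_c[2, 3] = W_c[3, 2] = 1.0
Y = np.array([[1.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 1.0]])
F_iter = cg_iterative(W, W_c, Y)
```

Each step needs only matrix-vector products, so with sparse matrices the cost per iteration scales with the number of edges; the update converges for mu1 > 0 because the system matrix is strictly diagonally dominant.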

Experiments
Datasets
– One synthetic dataset: TwoMoons
– Two real-world datasets: StudiVZ & Facebook
Objectives
– Classification in TwoMoons
– Predict university names in StudiVZ & Facebook
Comparison
– Supervised learning
– Basic and LGC graph learning
Evaluation
– Accuracy and confidence
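A two-moons style dataset and the accuracy measure can be sketched as below. The generator is a common construction and not necessarily the one the authors used; the noise level, the seed and the function names are assumptions. The transcript does not define the "confidence" measure, so only accuracy is shown.

```python
import numpy as np

def two_moons(n=200, noise=0.05, seed=0):
    """Generate two interleaving half-circles with Gaussian noise."""
    rng = np.random.default_rng(seed)
    half = n // 2
    t = rng.uniform(0.0, np.pi, size=half)
    upper = np.column_stack([np.cos(t), np.sin(t)])
    lower = np.column_stack([1.0 - np.cos(t), 0.5 - np.sin(t)])
    X = np.vstack([upper, lower]) + rng.normal(scale=noise, size=(2 * half, 2))
    y = np.repeat([0, 1], half)
    return X, y

def accuracy(F, y_true, unlabeled):
    """Fraction of unlabeled vertices whose arg-max prediction is correct."""
    pred = F.argmax(axis=1)
    return float((pred[unlabeled] == y_true[unlabeled]).mean())

X, y = two_moons()
perfect = np.eye(2)[y]   # one-hot scores that match the true classes
acc = accuracy(perfect, y, [0, 150])
```

Accuracy is computed on the unlabeled vertices only, since the labeled ones are given to the learner.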

Datasets
Statistics
Visualization
– TwoMoons

Experimental Results
TwoMoons (200 vertices)
Observations
– The community information does help prediction in terms of accuracy
– CG SSL is consistently better than the others

Experimental Results
StudiVZ (1,423 users)
Observations
– All graph-based SSL methods outperform supervised learning
– CG SSL remains consistently better than the others

Experimental Results
Facebook (10,410 users)
Observations
– In most cases, CG SSL outperforms the other learning methods
– There is a little instability in the CG SSL model

Experiments
CG SSL performs best in most cases
The curves of CG SSL do not increase monotonically
The accuracies on the Facebook dataset are lower than on the others

Conclusion
The community-based graph SSL model describes the real world more exactly
CG SSL predicts the hidden information of online social networks with higher accuracy and confidence
The security level of users' information is consequently lower

THANK YOU
Q & A