Biased Random Walk based Social Regularization for Word Embeddings

Biased Random Walk based Social Regularization for Word Embeddings Ziqian Zeng*, Xin Liu*, and Yangqiu Song The Hong Kong University of Science and Technology * Equal Contribution

Outline: Socialized Word Embeddings; Motivation (Problems of SWE and Solutions); Algorithm; Random Walk Methods; Experiments

Socialized Word Embeddings (SWE) Everyone has his/her own personal characteristics of language use. Zeng et al., Socialized Word Embeddings, IJCAI, 2017

Socialized Word Embeddings (SWE) Socially connected individuals tend to use language in similar ways [1].
Frequently used words: Language feature
pawsome, anipals, furever: animal based punctuation
kradam, glambert, glamily: puns around pop star Adam Lambert
tdd, mvc, linq: acronyms
[1] Yi Yang et al., Overcoming Language Variation in Sentiment Analysis with Social Attention, 2016. Table is from John Bryden et al., Word usage mirrors community structure in the online social network Twitter, 2013, with partial deletion.

Socialized Word Embeddings (SWE) Figure 1. Illustration of Socialized Word Embeddings [1]. Vector representation of a word for a user. [1] Zeng et al., Socialized Word Embeddings, IJCAI, 2017

Socialized Word Embeddings (SWE) Personalized CBOW (Eq. 1). Socialized Regularization (Eq. 2). Socialized Word Embeddings (Eq. 3), with a trade-off parameter and constraints.
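
The equations on this slide did not survive in the transcript, so the following is only a minimal sketch of how the three named pieces could fit together. It assumes that a user-specific word vector is the sum of a global word vector and a user vector, that the social regularizer penalizes the squared distance between the vectors of friends, and that the constraint bounds the norm of each user vector. The names `lam` (trade-off parameter) and `radius` (norm bound) are hypothetical, and `cbow_loss` stands in for the personalized CBOW term, which is not implemented here.

```python
import math


def personalized_vector(word_vec, user_vec):
    """Assumed form of a word vector for one user: global word vector + user vector."""
    return [w + u for w, u in zip(word_vec, user_vec)]


def social_regularizer(user_vecs, friends):
    """Assumed regularizer: sum over users and their friends of 0.5 * ||u_i - u_j||^2."""
    total = 0.0
    for user, neighbours in friends.items():
        for friend in neighbours:
            diff = zip(user_vecs[user], user_vecs[friend])
            total += 0.5 * sum((a - b) ** 2 for a, b in diff)
    return total


def total_objective(cbow_loss, user_vecs, friends, lam):
    """Combine the (externally computed) CBOW loss with the regularizer via `lam`."""
    return cbow_loss + lam * social_regularizer(user_vecs, friends)


def project_to_ball(vec, radius):
    """Enforce the assumed constraint ||u|| <= radius by rescaling if needed."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm <= radius:
        return list(vec)
    return [x * radius / norm for x in vec]
```

In this sketch a larger `lam` pulls the vectors of friends closer together, at the cost of fitting each user's own text less closely.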

Problems of SWE: Transitivity is modeled only implicitly. (Figure legend: not affected node; node being considered; node being affected.)

Problems of SWE: The spread of language use is transitive.

Solution: Use random walk methods to generate paths.

Problems of SWE: Too little attention is paid to users who have fewer friends. (Figure labels: Case 2; More Friends.)

Solution: Go to users who have fewer friends with larger probability. (Figure: nodes A, B, and C.)
Node: # friends
B: 1
C: 4

Social Regularization with Random Walk. Algorithm: Work under the SWE framework. Generate paths using random walk methods. Each user on a path gets updated. Figure 2. Illustration of Socialized Word Embeddings. Zeng et al., Socialized Word Embeddings, IJCAI, 2017

Random Walk
Transition probability (row: current node; column: next node)
     A    B    C
A    -    1/3  2/3
B    1    -    -
C    -    1    -
Path: A - C - B - A
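
The walk on this slide can be reproduced with a few lines of code. The sketch below assumes the transition probabilities read off the example graph (A to B with 1/3, A to C with 2/3, B to A with 1, C to B with 1); the function name `random_walk` and the dictionary layout are illustrative choices, not part of the presentation.

```python
import random

# Transition probabilities as read from the example graph (an assumption
# about how the flattened table on the slide maps to edges).
TRANSITIONS = {
    "A": {"B": 1 / 3, "C": 2 / 3},
    "B": {"A": 1.0},
    "C": {"B": 1.0},
}


def random_walk(transitions, start, length, rng):
    """Sample a path of `length` nodes, choosing each step from the last node only."""
    path = [start]
    while len(path) < length:
        options = transitions[path[-1]]
        nodes = list(options)
        weights = [options[node] for node in nodes]
        path.append(rng.choices(nodes, weights=weights, k=1)[0])
    return path


if __name__ == "__main__":
    # One sampled path of four nodes starting at A, e.g. A - C - B - A.
    print(" - ".join(random_walk(TRANSITIONS, "A", 4, random.Random())))
```

Starting from A with four nodes, only three paths are possible under these probabilities: A - C - B - A, A - B - A - B, and A - B - A - C.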

First-order Random Walk: A random walker moves to the next node based only on the last node. Perozzi et al., DeepWalk: Online learning of social representations, KDD, 2014.

Second-order Random Walk: A random walker moves to the next node based on the last two nodes. (Figure labels: second last node; last node; next node.) Grover et al., node2vec: Scalable feature learning for networks, KDD, 2016.
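
As an illustration of what "based on the last two nodes" means, the sketch below follows the node2vec weighting scheme from the cited paper on an unweighted, undirected graph: returning to the second last node is weighted 1/p, moving to a node that is also a neighbour of the second last node is weighted 1, and moving further away is weighted 1/q. The example graph, the function names, and the values of `p` and `q` are assumptions made for the example.

```python
import random


def node2vec_weights(graph, prev, curr, p, q):
    """Unnormalized weights for the next node, given the last two nodes."""
    weights = {}
    for nxt in sorted(graph[curr]):
        if nxt == prev:
            weights[nxt] = 1.0 / p      # step back to the second last node
        elif nxt in graph[prev]:
            weights[nxt] = 1.0          # stay close to the second last node
        else:
            weights[nxt] = 1.0 / q      # move away from the second last node
    return weights


def second_order_step(graph, prev, curr, p, q, rng):
    """Sample the next node in proportion to the node2vec weights."""
    weights = node2vec_weights(graph, prev, curr, p, q)
    nodes = list(weights)
    return rng.choices(nodes, weights=[weights[n] for n in nodes], k=1)[0]


# Hypothetical undirected graph stored as adjacency sets.
GRAPH = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}
```

With `prev="B"`, `curr="A"`, `p=2`, and `q=4`, the weights are 0.5 for B, 1.0 for C, and 0.25 for D, so the walker most likely moves to C.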

Biased Second-order Random Walk: A random walker moves to the next node based on the last two nodes.

Biased Second-order Random Walk: The bias depends on the number of friends of the node, the average number of friends, and a hyper-parameter.
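
The bias formula itself is not preserved in this transcript. The sketch below is therefore a purely hypothetical stand-in, not the authors' formula: it weights each candidate node by `(avg_friends / n_friends) ** rho`, which uses the same three ingredients named on the slide (number of friends of the node, average number of friends, and a hyper-parameter, here called `rho`) and makes users with fewer friends more likely to be visited. In a second-order walk, such a factor could multiply the weights derived from the last two nodes; only the bias factor is shown here.

```python
def friend_count_bias(n_friends, avg_friends, rho):
    """HYPOTHETICAL bias: larger for nodes with fewer friends than average."""
    return (avg_friends / n_friends) ** rho


def biased_transition_probs(graph, curr, rho):
    """Normalize the hypothetical bias over the neighbours of `curr`."""
    avg_friends = sum(len(nbrs) for nbrs in graph.values()) / len(graph)
    weights = {
        nxt: friend_count_bias(len(graph[nxt]), avg_friends, rho)
        for nxt in sorted(graph[curr])
    }
    total = sum(weights.values())
    return {node: weight / total for node, weight in weights.items()}


# Example in the spirit of the earlier slide: B has 1 friend, C has 4 friends.
# Nodes D, E, and F are hypothetical friends added so that C has 4 friends.
GRAPH = {
    "A": {"B", "C"},
    "B": {"A"},
    "C": {"A", "D", "E", "F"},
    "D": {"C"},
    "E": {"C"},
    "F": {"C"},
}
```

With `rho=1`, a walker at A moves to B with probability 0.8 and to C with probability 0.2; with `rho=0` the bias vanishes and both neighbours are equally likely.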

Experiments: Dataset – Yelp Challenge. Yelp Challenge datasets contain millions of reviews, ratings, and large social networks. Table 1. Statistics of Yelp Round 9 and 10 datasets.

Sentiment Classification: Two phenomena of language use help the sentiment analysis task. A word such as “good” can indicate different sentiment ratings depending on the author [1]. [1] Yang et al., Overcoming Language Variation in Sentiment Analysis with Social Attention, TACL, 2017.

Sentiment Classification: average review + user -> Logistic Regression classifier -> rating. Figure 2. Illustration of the procedure of sentiment classification

Sentiment Classification – Head & Tail: Train on only head users or only tail users. Head users published more reviews; tail users published fewer. Half of the total reviews come from head users and half from tail users. Table 2. Statistics of head users and tail users in the one-fifth of the training set.
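
One way to obtain such a split is sketched below. The slide only states that head users published more reviews and that each group accounts for half of the reviews; the exact procedure used here (rank users by review count, then add users to the head group until half of all reviews are covered) is an assumption, and the function name `split_head_tail` is hypothetical.

```python
def split_head_tail(review_counts):
    """Split users so that head users cover roughly half of all reviews.

    `review_counts` maps a user id to the number of reviews by that user.
    Users are ranked by review count (ties broken by user id), and the most
    active users are moved to the head group until half of the reviews are
    covered; everyone else is a tail user.
    """
    total = sum(review_counts.values())
    ranked = sorted(review_counts, key=lambda user: (-review_counts[user], user))
    head, covered = [], 0
    for user in ranked:
        if covered >= total / 2:
            break
        head.append(user)
        covered += review_counts[user]
    head_set = set(head)
    tail = [user for user in ranked if user not in head_set]
    return head, tail


if __name__ == "__main__":
    counts = {"u1": 50, "u2": 30, "u3": 10, "u4": 5, "u5": 5}
    head_users, tail_users = split_head_tail(counts)
    print(head_users, tail_users)
```

In the small example, one very active user accounts for half of the reviews, so the head group is much smaller than the tail group, which mirrors the imbalance the slide describes.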

Experiments: Figure 3. Classification Accuracy on Head & Tail Users on Yelp Round 9 (y-axis: accuracy)

Experiments: Figure 4. Classification Accuracy on Head & Tail Users on Yelp Round 10 (y-axis: accuracy)

Conclusion: Explicitly model the transitivity. Sample more users who have fewer friends. Thank You