1
Biased Random Walk based Social Regularization for Word Embeddings
Ziqian Zeng*, Xin Liu*, and Yangqiu Song
The Hong Kong University of Science and Technology
* Equal Contribution
2
Outline
Socialized Word Embeddings
Motivation (Problems of SWE and Solutions)
Algorithm
Random Walk Methods
Experiments
3
Socialized Word Embeddings (SWE)
Everyone has his/her own personal characteristics of language use. Zeng et al., Socialized Word Embeddings, IJCAI, 2017
4
Socialized Word Embeddings (SWE)
Socially connected individuals tend to use language in similar ways. [1]
Frequently Used Words | Language Feature
pawsome, anipals, furever | animal-based puns
kradam, glambert, glamily | puns around pop star Adam Lambert
tdd, mvc, linq | acronyms
[1] Yang et al., Overcoming Language Variation in Sentiment Analysis with Social Attention, TACL, 2017. Table adapted, with rows omitted, from Bryden et al., Word usage mirrors community structure in the online social network Twitter, 2013.
5
Socialized Word Embeddings (SWE)
Figure 1. Illustration of Socialized Word Embeddings [1]: the vector representation of a word for a user. [1] Zeng et al., Socialized Word Embeddings, IJCAI, 2017
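The formula itself is not reproduced in this transcript. As a hedged sketch of the SWE representation (our notation: v_w for the global word vector, u_u for the user vector), a word seen by a particular user is:

```latex
% Sketch of the SWE user-specific word representation (our notation, not the slide's):
% the word vector seen by user u is the global word vector shifted by the user vector.
v_w^{(u)} = v_w + u_u
```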
6
Socialized Word Embeddings (SWE)
Personalized CBOW (1)
Socialized Regularization (2)
Socialized Word Embeddings: (1) combined with (2) through a trade-off parameter (3)
Constraints
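Equations (1)-(3) are not reproduced here; the following is only a sketch of the general shape implied by the bullets above, with our own symbols (L_CBOW for the personalized CBOW loss, lambda for the trade-off parameter, and a norm bound standing in for the constraints):

```latex
% Sketch only: personalized CBOW loss (1) plus socialized regularization (2),
% combined via the trade-off parameter lambda (3), subject to constraints on
% the user vectors (written here as a norm bound, an assumption).
\min \; \mathcal{L}_{\mathrm{CBOW}}
      + \frac{\lambda}{2} \sum_{(i,j)\,\in\,\mathrm{friends}} \lVert u_i - u_j \rVert_2^2
\quad \text{s.t. } \lVert u_i \rVert_2 \le r \ \ \forall i
```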
7
Problems of SWE: transitivity is only modeled implicitly.
[Figure: social network; legend: not affected node, node being considered, node being affected]
11
Problems of SWE: the spread of language use is transitive.
[Figure: social network; legend: not affected node, node being considered, node being affected]
12
Solution: use random walk methods to generate paths.
[Figure: social network; legend: not affected node, node being considered, node being affected]
13
Problems of SWE: too little attention is paid to users who have fewer friends (Case 2).
[Figure: social network; a node with more friends highlighted; legend: not affected node, node being considered, node being affected]
14
Solution: go to users who have fewer friends with larger probability.
[Figure: nodes A, B, C]
15
Solution: go to users who have fewer friends with larger probability.
Node | # friends
B | 1
C | 4
[Figure: nodes A, B, C]
17
Social Regularization with Random Walk
Algorithm: work under the SWE framework. Figure 2. Illustration of Socialized Word Embeddings (Zeng et al., Socialized Word Embeddings, IJCAI, 2017).
18
Social Regularization with Random Walk
Algorithm: work under the SWE framework; generate paths using random walk methods.
19
Social Regularization with Random Walk
Algorithm: work under the SWE framework; generate paths using random walk methods; each user on a path gets updated.
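A minimal sketch of how the walk-generated paths could feed the user-vector updates, assuming users that co-occur within a small window on a path are pulled toward each other (the window size, learning rate, and function name are illustrative choices, not taken from the paper):

```python
import numpy as np

def regularize_users_on_walk(user_vecs, walk, lam=0.1, lr=0.025, window=2):
    """Pull together the user vectors of users co-occurring on a random-walk path.

    user_vecs: dict mapping user id -> np.ndarray embedding (updated in place).
    walk: list of user ids produced by a random walk on the social graph.
    lam, lr, window: illustrative hyper-parameters.
    """
    for i, u in enumerate(walk):
        for j in range(i + 1, min(i + 1 + window, len(walk))):
            v = walk[j]
            diff = user_vecs[u] - user_vecs[v]
            # One gradient step on (lam / 2) * ||u_u - u_v||^2 for both endpoints.
            user_vecs[u] -= lr * lam * diff
            user_vecs[v] += lr * lam * diff
```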
20
Random Walk
Transition probabilities (from row node to column node):
From \ To | A | B | C
A | - | 1/3 | 2/3
B | 1 | - | -
C | - | 1 | -
[Figure: transition diagram over nodes A, B, C]
21
Random Walk. Path: A -
[Transition diagram and probability table as above]
22
Random Walk. Path: A - C
[Transition diagram and probability table as above]
23
Random Walk. Path: A - C - B
[Transition diagram and probability table as above]
24
Random Walk. Path: A - C - B - A
[Transition diagram and probability table as above]
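A small sketch that samples a path from transition probabilities like the ones in this example; the table below is our reading of the diagram above (A goes to B with 1/3 and to C with 2/3, B always returns to A, C always moves to B):

```python
import random

# Transition probabilities, as read off the example diagram (an assumption).
TRANSITIONS = {
    "A": {"B": 1 / 3, "C": 2 / 3},
    "B": {"A": 1.0},
    "C": {"B": 1.0},
}

def random_walk(start, length, transitions=TRANSITIONS):
    """Generate a walk of `length` steps starting from `start`."""
    path = [start]
    for _ in range(length):
        neighbors, probs = zip(*transitions[path[-1]].items())
        path.append(random.choices(neighbors, weights=probs, k=1)[0])
    return path

print(random_walk("A", 3))  # e.g. ['A', 'C', 'B', 'A']
```

Because each step depends only on the current node, this is exactly the first-order walk described on the next slide.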
25
First-order Random Walk
A random walker moves to the next node based only on the last node. [Figure labels: second-last node, last node, next node] Perozzi et al., DeepWalk: Online learning of social representations, KDD, 2014.
26
Second-order Random Walk
A random walker moves to the next node based on the last two nodes. [Figure labels: second-last node, last node, next node; candidate nodes A and B] Grover et al., node2vec: Scalable feature learning for networks, KDD, 2016.
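For concreteness, node2vec (cited above) biases this second-order choice with a return parameter p and an in-out parameter q. A minimal sketch on an unweighted adjacency list (the graph representation and default parameter values are ours):

```python
import random

def node2vec_step(adj, prev, curr, p=1.0, q=1.0):
    """Pick the next node given the last two nodes, using node2vec's p/q bias.

    adj: dict mapping node -> set of neighbors (unweighted, undirected graph).
    prev, curr: the second-last and last nodes of the walk so far.
    """
    candidates, weights = [], []
    for nxt in adj[curr]:
        if nxt == prev:           # step back to the previous node
            w = 1.0 / p
        elif nxt in adj[prev]:    # stay close: candidate is also a neighbor of prev
            w = 1.0
        else:                     # move outward, away from prev
            w = 1.0 / q
        candidates.append(nxt)
        weights.append(w)
    return random.choices(candidates, weights=weights, k=1)[0]
```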
29
Biased Second-order Random Walk
A random walker moves to the next node based on the last two nodes.
30
Biased Second-order Random Walk
The transition probability is biased using the number of friends of a node, the average number of friends, and a hyper-parameter.
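The exact bias formula is not in this transcript. Purely as an illustrative assumption consistent with the quantities listed above, one could up-weight low-degree candidates by a factor such as (average number of friends / number of friends)^gamma, with gamma as the hyper-parameter:

```python
import random

def biased_step(adj, prev, curr, avg_degree, p=1.0, q=1.0, gamma=1.0):
    """Second-order step additionally biased toward candidates with fewer friends.

    The (avg_degree / degree)**gamma factor is an illustrative assumption,
    not the paper's exact formula; gamma plays the role of the hyper-parameter.
    """
    candidates, weights = [], []
    for nxt in adj[curr]:
        if nxt == prev:
            w = 1.0 / p
        elif nxt in adj[prev]:
            w = 1.0
        else:
            w = 1.0 / q
        w *= (avg_degree / len(adj[nxt])) ** gamma  # favor users with fewer friends
        candidates.append(nxt)
        weights.append(w)
    return random.choices(candidates, weights=weights, k=1)[0]
```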
31
Experiments Dataset – Yelp Challenge
The Yelp Challenge datasets contain millions of reviews and ratings together with large social networks. Table 1. Statistics of the Yelp Round 9 and Round 10 datasets.
32
Sentiment Classification
Two phenomena of language use help the sentiment analysis task: each author has personal characteristics of language use (a word such as "good" can indicate different sentiment ratings depending on the author), and socially connected individuals tend to use language in similar ways. [1] Yang et al., Overcoming Language Variation in Sentiment Analysis with Social Attention, TACL, 2017.
33
Sentiment Classification
Pipeline: average the (word + user) vectors of a review, feed the result to a logistic regression classifier, and predict the rating. Figure 2. Illustration of the sentiment classification procedure.
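A minimal sketch of this pipeline, assuming word_vecs and user_vecs are the learned embedding lookup tables (all names here are illustrative):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def review_feature(tokens, user, word_vecs, user_vecs):
    """Average the user-specific word vectors (word vector + user vector) of a review.

    Assumes the review contains at least one in-vocabulary token.
    """
    vecs = [word_vecs[w] + user_vecs[user] for w in tokens if w in word_vecs]
    return np.mean(vecs, axis=0)

# Illustrative usage: build features from (tokens, author) pairs, fit on the ratings.
# X = np.stack([review_feature(toks, u, word_vecs, user_vecs) for toks, u in train_reviews])
# clf = LogisticRegression(max_iter=1000).fit(X, y_train)
# y_pred = clf.predict(X_test)
```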
34
Sentiment Classification – Head & Tail
Train on only head users or only tail users. Head users published more reviews; tail users published fewer. Half of the total reviews come from head users and half from tail users. [Figure: reviews split into head data and tail data]
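A sketch of one way to realize such a split, assuming review_counts maps each user to the number of reviews they published (the helper name is ours):

```python
def split_head_tail(review_counts):
    """Split users so that head users (most prolific) cover roughly half of all reviews."""
    total = sum(review_counts.values())
    head, covered = set(), 0
    # Walk users from most to least prolific until half of the reviews are covered.
    for user, count in sorted(review_counts.items(), key=lambda kv: -kv[1]):
        if covered >= total / 2:
            break
        head.add(user)
        covered += count
    tail = set(review_counts) - head
    return head, tail
```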
35
Sentiment Classification – Head & Tail
Train on only head users or only tail users. Head users published more reviews; tail users published fewer. Half of the total reviews come from head users and half from tail users. Table 2. Statistics of head users and tail users in one-fifth of the training set.
36
Experiments. Figure 3. Classification accuracy on head and tail users, Yelp Round 9.
37
Experiments. Figure 4. Classification accuracy on head and tail users, Yelp Round 10.
38
Conclusion
Explicitly model the transitivity of language use.
Sample users who have fewer friends with larger probability.
Thank You