Socialized Word Embeddings


Socialized Word Embeddings
Ziqian Zeng¹, Yichun Yin¹,², Yangqiu Song¹ and Ming Zhang²
¹The Hong Kong University of Science and Technology, ²Peking University

Motivation
Fact: Everyone has his/her own personal characteristics of language use.

Motivation
Fact: Linguistic homophily: socially connected individuals tend to use language in similar ways. [1]

Table [2]:
Frequently Used Words       | Language Feature
tdd, mvc, linq              | acronyms
anipals, pawsome, furever   | animal-based puns
kradam, glambert, glamily   | puns around pop star Adam Lambert

[1] Yi Yang et al. Overcoming Language Variation in Sentiment Analysis with Social Attention, 2016.
[2] Table is from Bryden John et al., Word usage mirrors community structure in the online social network Twitter, 2013, with partial deletion.

Motivation
Facts:
- Everyone has his/her own personal characteristics of language use.
- Linguistic homophily: socially connected individuals tend to use language in similar ways. [1]
Goal: Develop a word embedding algorithm that takes both facts into account.

[1] Yi Yang et al. Overcoming Language Variation in Sentiment Analysis with Social Attention, 2016.

CBOW
We take CBOW as an example to introduce our algorithm. Given a corpus of $T$ words, CBOW maximizes the average log-probability of each word given its surrounding context:

$\mathcal{L}_{\text{CBOW}} = \frac{1}{T}\sum_{t=1}^{T} \log p(w_t \mid w_{t-c}, \dots, w_{t+c})$    (1)

Figure 1: Illustration of CBOW.
Mikolov Tomas et al. Efficient estimation of word representations in vector space. 2013.
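To make objective (1) concrete, here is a minimal NumPy sketch of the CBOW prediction step. The names (W_in, W_out, the sizes) are illustrative, and real word2vec replaces the full softmax with negative sampling or hierarchical softmax:

```python
import numpy as np

V, d = 10000, 100                     # vocabulary size, embedding dimension
W_in = 0.01 * np.random.randn(V, d)   # input (context) embeddings
W_out = 0.01 * np.random.randn(V, d)  # output (target) embeddings

def cbow_log_prob(context_ids, target_id):
    """log p(w_t | context): average the context vectors, score every
    vocabulary word, and take a numerically stable log-softmax."""
    h = W_in[context_ids].mean(axis=0)            # averaged context vector
    scores = W_out @ h                            # one score per word
    m = scores.max()
    log_z = m + np.log(np.exp(scores - m).sum())  # log-sum-exp
    return scores[target_id] - log_z
```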

Socialized Word Embeddings
Figure 2: Illustration of Socialized Word Embeddings.

Socialized Word Embeddings: Personalization
Fact: Everyone has his/her own personal characteristics of language use.
Figure 2: Illustration of Socialized Word Embeddings.

Socialized Word Embeddings: Socialization
Fact: Linguistic homophily: socially connected individuals tend to use language in similar ways.
Figure 2: Illustration of Socialized Word Embeddings.

Personalization
Notations:
- Users: $u_1, \dots, u_{|\mathcal{U}|}$.
- A word $w_i$'s context: $C(w_i) = \{w_{i-c}, \dots, w_{i-1}, w_{i+1}, \dots, w_{i+c}\}$, where $c$ is the half window size.
- A corpus $\mathcal{D}_u$ for user $u$.
- Global word vector: $\mathbf{g}_w$; local user vector: $\mathbf{p}_u$.
- Vector representation of a word $w$ for user $u$: $\mathbf{v}_w^{(u)} = \mathbf{g}_w + \mathbf{p}_u$.

Personalized CBOW:
$\mathcal{L}_u = \sum_{w_i \in \mathcal{D}_u} \log p\big(w_i \mid C(w_i), u\big)$    (2)
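A minimal sketch of the personalization step, assuming trained matrices G (global word vectors, one row per word) and P (user vectors, one row per user); both names are our own:

```python
import numpy as np

def user_word_vector(word_id, user_id, G, P):
    """v_w^(u) = g_w + p_u: the word vector a specific user 'sees'."""
    return G[word_id] + P[user_id]

def personalized_context_vector(context_ids, user_id, G, P):
    # Averaging the personalized vectors equals the averaged global
    # context vector plus p_u, since p_u is shared across the window.
    return G[context_ids].mean(axis=0) + P[user_id]
```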

Socialization
Notations: $F(u)$ denotes the set of friends of user $u$; $|F(u)|$ is the number of friends of $u$.

Socialized Regularization (friends' user vectors should be close):
$R = \sum_{u} \frac{1}{|F(u)|} \sum_{u' \in F(u)} \|\mathbf{p}_u - \mathbf{p}_{u'}\|_2^2$    (3)
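A direct sketch of regularizer (3), with `friends` mapping each user id to a list of friend ids (our own data layout):

```python
import numpy as np

def social_regularizer(P, friends):
    """R = sum_u (1/|F(u)|) * sum_{u' in F(u)} ||p_u - p_u'||^2.
    A small R means friends' user vectors stay close together."""
    total = 0.0
    for u, fs in friends.items():
        if fs:  # skip users with no friends
            total += sum(np.sum((P[u] - P[v]) ** 2) for v in fs) / len(fs)
    return total
```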

Socialization
Combining the Personalized CBOW objective (2) with the Socialized Regularization (3) gives the Socialized Word Embeddings objective, with trade-off parameter $\lambda$:

$\max \; \sum_{u \in \mathcal{U}} \mathcal{L}_u - \lambda R$    (4)

Constraint: $\|\mathbf{p}_u\|_2 \le r$ for every user $u$.
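The norm constraint can be enforced by projection after each gradient update; a one-function sketch of that standard projected-gradient step (not necessarily the paper's exact optimizer):

```python
import numpy as np

def project_user_vector(p_u, r):
    """Rescale p_u back onto the ball ||p_u||_2 <= r if it left it."""
    norm = np.linalg.norm(p_u)
    return p_u if norm <= r else p_u * (r / norm)
```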

Experiments: Dataset (Yelp Challenge)
Yelp Challenge datasets contain millions of reviews and ratings for various businesses.

Experiments: Perplexity
Perplexity evaluates how well a model predicts the current word given several previous words.
We report: the perplexity trend with varied $r$ (the $\ell_2$-norm constraint for user vectors) and fixed $\lambda$ (the socialized regularization parameter). Lower perplexity is better.
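For reference, perplexity is the exponential of the negative average log-probability assigned to the held-out words; a minimal sketch:

```python
import numpy as np

def perplexity(log_probs):
    """log_probs: log p(w_t | context) for each held-out word.
    Lower perplexity means the model predicts the words better."""
    return float(np.exp(-np.mean(log_probs)))
```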

Experiments
Perplexity trend with varied $\lambda$ and fixed $r$. Lower perplexity is better.

Experiments: SVM Sentiment Classification
For each review, average its word vectors and concatenate the user vector; feed the result to an SVM classifier to predict the rating.
Figure 5: Illustration of the sentiment classification procedure.
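A minimal scikit-learn sketch of this pipeline, assuming trained matrices G and P (our own names) and reviews given as (word_ids, user_id, rating) triples:

```python
import numpy as np
from sklearn.svm import LinearSVC

def review_feature(word_ids, user_id, G, P):
    """Average the review's word vectors and append the user vector."""
    return np.concatenate([G[word_ids].mean(axis=0), P[user_id]])

def train_sentiment_svm(reviews, G, P):
    X = np.stack([review_feature(w, u, G, P) for w, u, _ in reviews])
    y = np.array([rating for _, _, rating in reviews])
    return LinearSVC().fit(X, y)  # predicts the review's rating
```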

Experiments: Sentiment Classification, Head & Tail
We select the users who contributed half of the total reviews as head users; the remaining users are tail users. We then train on only head users or only tail users.
[Statistics table of head and tail users.]
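One way to implement this split, given a dict of per-user review counts (a sketch under our own data layout):

```python
def split_head_tail(review_counts):
    """Head users: the most active users who together contributed half of
    all reviews; everyone else is a tail user."""
    users = sorted(review_counts, key=review_counts.get, reverse=True)
    half = sum(review_counts.values()) / 2
    head, covered = set(), 0
    for u in users:
        if covered >= half:
            break
        head.add(u)
        covered += review_counts[u]
    return head, set(users) - head
```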

Experiments
[Results figures for head/tail sentiment classification; higher is better.]

Experiments: User Vector as Attention
User attention vectors can improve sentiment classification. We use user vectors as fixed attention vectors, and compare with a baseline without attention and an upper bound with trainable attention.
Figure 8: The architecture of the User Product Attention based Neural Sentiment Classification model. [1]
[1] Figure 8 is from Chen Huimin et al. Neural sentiment classification with user and product attention. 2016.
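One simple way to plug a fixed user vector in as attention over a review's word states (a sketch only: the actual model of Chen et al. 2016 is more elaborate, and the bilinear parameter W here is an assumption of ours):

```python
import numpy as np

def user_attention_pool(H, p_u, W):
    """H: (n_words, d) hidden states; p_u: fixed user vector; W: (d, d).
    Weights each word by softmax(H W p_u) and returns the weighted sum."""
    scores = H @ W @ p_u               # one attention score per word
    w = np.exp(scores - scores.max())  # numerically stable softmax
    w /= w.sum()
    return w @ H                       # attention-pooled review vector
```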

Experiments
Figure 9: Comparison of our model and other baseline methods on user-attention-based deep learning for sentiment analysis (supervised and unsupervised settings; higher is better).

Conclusion
- Representation: global word vector + local user vector.
- Socialized regularization: friends' user vectors should be similar.
Figure 2: Illustration of Socialized Word Embeddings.
Thank You!