GhostLink: Latent Network Inference for Influence-aware Recommendation

Similar presentations
Bayesian Belief Propagation

Google News Personalization: Scalable Online Collaborative Filtering
A probabilistic model for retrospective news event detection
Autonomic Scaling of Cloud Computing Resources
Pete Bohman Adam Kunk.  Introduction  Related Work  System Overview  Indexing Scheme  Ranking  Evaluation  Conclusion.
1.Accuracy of Agree/Disagree relation classification. 2.Accuracy of user opinion prediction. 1.Task extraction performance on Bing web search log with.
Expectation Maximization
Complex Feature Recognition: A Bayesian Approach for Learning to Recognize Objects by Paul A. Viola Presented By: Emrah Ceyhan Divin Proothi Sherwin Shaidee.
GS 540 week 6. HMM basics Given a sequence, and state parameters: – Each possible path through the states has a certain probability of emitting the sequence.
Models of Influence in Online Social Networks
Incomplete Graphical Models Nan Hu. Outline Motivation K-means clustering Coordinate Descending algorithm Density estimation EM on unconditional mixture.
Extracting Places and Activities from GPS Traces Using Hierarchical Conditional Random Fields Yong-Joong Kim Dept. of Computer Science Yonsei.
Modeling Relationship Strength in Online Social Networks Rongjing Xiang: Purdue University Jennifer Neville: Purdue University Monica Rogati: LinkedIn.
1 Naïve Bayes Models for Probability Estimation Daniel Lowd University of Washington (Joint work with Pedro Domingos)
A Comparative Study of Search Result Diversification Methods Wei Zheng and Hui Fang University of Delaware, Newark DE 19716, USA
Chengjie Sun, Lei Lin, Yuan Chen, Bingquan Liu, Harbin Institute of Technology, School of Computer Science and Technology.
User Interests Imbalance Exploration in Social Recommendation: A Fitness Adaptation Authors : Tianchun Wang, Xiaoming Jin, Xuetao Ding, and Xiaojun Ye.
Context-Sensitive Information Retrieval Using Implicit Feedback Xuehua Shen : department of Computer Science University of Illinois at Urbana-Champaign.
Learning Geographical Preferences for Point-of-Interest Recommendation Author(s): Bin Liu Yanjie Fu, Zijun Yao, Hui Xiong [KDD-2013]
By Sharath Kumar Aitha. Instructor: Dr. Dongchul Kim.
CHAPTER 11 SECTION 2 Inference for Relationships.
Objectives Objectives Recommendz: A Multi-feature Recommendation System Matthew Garden, Gregory Dudek, Center for Intelligent Machines, McGill University.
Probabilistic Models for Discovering E-Communities Ding Zhou, Eren Manavoglu, Jia Li, C. Lee Giles, Hongyuan Zha The Pennsylvania State University WWW.
Tell Me What You See and I will Show You Where It Is Jia Xu 1 Alexander G. Schwing 2 Raquel Urtasun 2,3 1 University of Wisconsin-Madison 2 University.
LOGO Identifying Opinion Leaders in the Blogosphere Xiaodan Song, Yun Chi, Koji Hino, Belle L. Tseng CIKM 2007 Advisor : Dr. Koh Jia-Ling Speaker : Tu.
Active learning Haidong Shi, Nanyi Zeng Nov,12,2008.
Lecture PowerPoint Slides Basic Practice of Statistics 7 th Edition.
1 Adaptive Subjective Triggers for Opinionated Document Retrieval (WSDM 09’) Kazuhiro Seki, Kuniaki Uehara Date: 11/02/09 Speaker: Hsu, Yu-Wen Advisor:
Speaker : Yu-Hui Chen Authors : Dinuka A. Soysa, Denis Guangyin Chen, Oscar C. Au, and Amine Bermak From : 2013 IEEE Symposium on Computational Intelligence.
The Practice of Statistics, 5th Edition Starnes, Tabor, Yates, Moore Bedford Freeman Worth Publishers CHAPTER 11 Inference for Distributions of Categorical.
Edge Preserving Spatially Varying Mixtures for Image Segmentation Giorgos Sfikas, Christophoros Nikou, Nikolaos Galatsanos (CVPR 2008) Presented by Lihan.
Jointly Modeling Aspects, Ratings and Sentiments for Movie Recommendation (JMARS) Authors: Qiming Diao, Minghui Qiu, Chao-Yuan Wu Presented by Gemoh Mal.
A Collapsed Variational Bayesian Inference Algorithm for Latent Dirichlet Allocation Yee W. Teh, David Newman and Max Welling Published on NIPS 2006 Discussion.
Using Blog Properties to Improve Retrieval Gilad Mishne (ICWSM 2007)
Hao Ma, Dengyong Zhou, Chao Liu Microsoft Research Michael R. Lyu
1 Dongheng Sun 04/26/2011 Learning with Matrix Factorizations By Nathan Srebro.
Unsupervised Learning Part 2. Topics How to determine the K in K-means? Hierarchical clustering Soft clustering with Gaussian mixture models Expectation-Maximization.
Topic Modeling for Short Texts with Auxiliary Word Embeddings
Recommendation in Scholarly Big Data
Mining Utility Functions based on user ratings
Introduction to Analysis of Algorithms
Online Multiscale Dynamic Topic Models
Statistical Models for Automatic Speech Recognition
Methods and Metrics for Cold-Start Recommendations
Multimodal Learning with Deep Boltzmann Machines
CHAPTER 11 Inference for Distributions of Categorical Data
Collective Network Linkage across Heterogeneous Social Platforms
Machine Learning Basics
A Consensus-Based Clustering Method
Log Linear Modeling of Independence
Bayesian Models in Machine Learning
AP Stats Check In Where we’ve been… Chapter 7…Chapter 8…
Chapter 11: Inference for Distributions of Categorical Data
Matching Words with Pictures
Unsupervised Learning II: Soft Clustering with Gaussian Mixture Models
Introduction Previous lessons have demonstrated that the normal distribution provides a useful model for many situations in business and industry, as.
Michal Rosen-Zvi University of California, Irvine
Topic Models in Text Processing
Hierarchical Relational Models for Document Networks
NON-NEGATIVE COMPONENT PARTS OF SOUND FOR CLASSIFICATION Yong-Choon Cho, Seungjin Choi, Sung-Yang Bang Wen-Yi Chu Department of Computer Science &
Measuring The Influence In The News Media’s Narratives
Presentation transcript:

GhostLink: Latent Network Inference for Influence-aware Recommendation Yuchong Zheng

CONTENTS 01 Introduction 02 GhostLink: Influence-Facet Model 03 Joint Probabilistic Inference 04 Experiment 05 Conclusion

01 PART Introduction

Introduction

The Motivation of the Paper
The traditional assumption is that similar users have similar rating behavior and facet preferences. Recent works use review content and temporal context to extract further cues. However, all of these works assume that users behave independently, which is not how things work in the real world. How, then, can we detect the influence of online activities?

01 Some Works
The first approach is to exploit an observed social network or the interactions of users. Some recent works use explicit user-user relationships to propose social-network-based recommendation.

02 Problem
Many large online review communities, such as Amazon or BeerAdvocate, do not have an explicit social network. So the authors ask: can influence be detected from other signals?

The Method of the Paper

The Method
In this paper, the authors leverage opinion conformity, based on writing style, as an indication of influence, and use GhostLink to analyze the problem.

Conclusion of Figure
It shows that only a few users influence most of the others, and the distribution of influencers follows a power-law-like distribution.

Goals
Given only time-stamped reviews of users in online communities: (1) extract the underlying influence network of who-influences-whom based on opinion conformity, and analyze the characteristics of the network; (2) leverage the implicit social influence to improve item rating prediction.

The Contributions of the Paper

Model
The paper proposes an unsupervised probabilistic generative model, GhostLink, to learn the influence graph in online communities.

Algorithm
It proposes an efficient algorithm based on Gibbs sampling to estimate the hidden parameters of GhostLink, which empirically demonstrates fast convergence.

Experiments
The paper performs large-scale experiments in four communities with 13 million reviews. Moreover, it analyzes the properties of the influence graph and uses it for use-cases like finding influential members in the community.

02 PART GhostLink: Influence-Facet Model

Assumptions and Setup
The goal of the paper is to learn the influence graph between users based on their review content. It argues that influence is reflected in the words and facets a user echoes. The generative process is shown below: v is the sampled influencer; d′ is the influencing review; θ is the influencer's facet distribution; t_d is the timestamp; φ is the categorical word distribution per facet.

Conclusion
In summary, the paper treats a user's review as a mixture of her own latent preferences and the preferences of her influencers.

03 PART Joint Probabilistic Inference

Overall Processing
Exploiting the above results, the overall inference is an iterative process consisting of the following steps. The paper sorts all reviews on an item by timestamp. For each word in each review on an item:
1. Estimate whether the word has been written under influence.
2. In case of influence (s = 1), an influencer v is jointly sampled in the same step.
3. Sample a facet for the word, keeping all influencers and influence variables fixed.
The process is repeated until the Gibbs sampling process converges.
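The iterative process above can be sketched as a toy collapsed-Gibbs loop. This is a simplified illustration with made-up data and surrogate count-based weights, not the authors' implementation: the corpus, user names, and smoothing are all hypothetical.

```python
import random

random.seed(0)

K = 3  # number of latent facets (toy setting)

# Toy corpus: reviews on one item, already sorted by timestamp.
reviews = [
    ("adam", ["hoppy", "amber", "malty"]),
    ("bob",  ["hoppy", "citrus", "amber"]),
    ("sam",  ["citrus", "hoppy", "malty"]),
]

# Latent assignments per token: facet z, influence flag s, influencer v (or None).
assignments = []
for i, (user, words) in enumerate(reviews):
    for w in words:
        s = random.randrange(2) if i > 0 else 0  # first reviewer cannot be influenced
        v = random.choice([reviews[j][0] for j in range(i)]) if s == 1 else None
        assignments.append({"user": user, "word": w, "z": random.randrange(K), "s": s, "v": v})

def gibbs_sweep(assignments, reviews):
    """One sweep: re-sample (s, v), then z, for every token, holding the rest fixed."""
    order = {u: i for i, (u, _) in enumerate(reviews)}
    for a in assignments:
        idx = order[a["user"]]
        if idx == 0:
            continue  # no earlier review can influence the first reviewer
        # Count-based weights: a simplified surrogate for the paper's full conditionals.
        n_s1 = sum(1 for b in assignments if b is not a and b["user"] == a["user"] and b["s"] == 1)
        n_s0 = sum(1 for b in assignments if b is not a and b["user"] == a["user"] and b["s"] == 0)
        a["s"] = random.choices([0, 1], weights=[n_s0 + 1, n_s1 + 1])[0]
        if a["s"] == 1:
            # Jointly pick an influencer among earlier reviewers, weighted by past influence.
            earlier = [reviews[j][0] for j in range(idx)]
            w = [1 + sum(1 for b in assignments if b["v"] == u) for u in earlier]
            a["v"] = random.choices(earlier, weights=w)[0]
        else:
            a["v"] = None
        a["z"] = random.randrange(K)  # facet re-sampling kept uniform in this toy

for _ in range(5):
    gibbs_sweep(assignments, reviews)
```

The structural invariants of the sampler (influencers only from earlier reviews, s fixed to 0 for the first reviewer) mirror the timestamp ordering described above.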

Example
Consider a set of reviews written by three users in the following time order: Adam, Bob, and Sam (see Table 2). The table also shows the current assignments of the latent variables z and s. The goal is to re-sample the influence variables. Let n(u, s) be the number of tokens written by u with influence variable s, n(d, z) the total number of tokens with topic z in document d, n(d) the number of tokens in document d, and n(u) the total number of tokens written by u.

In a community setting, especially for communities dealing with items of fine taste like movies, food, and beer, where users co-review multiple items, such statistics are aggregated over several other items. This provides a stronger signal for influence when a user copies/echoes similar facet descriptions from a particular user across several items. Therefore, the paper relies on three main factors to model influence and influencers in the community:
1. The vulnerability of a user u to being influenced, modeled by π and captured in the counts n(u, s).
2. The textual focus of the influencing review d′ by v on the specific facet z, modeled by θ and captured in the counts n(vd, z), as well as how many times the influencer v influenced u, modeled by ψ and captured in the counts n(u, v, s = 1), aggregated over all facets and items they co-reviewed.
3. The latent preference of u for z, modeled by θu and captured in the counts n(u, z, s = 0).
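A minimal sketch of how these count-based factors could combine into an unnormalized sampling score. The smoothing constants (alpha, eta) and the function name are illustrative assumptions, not the paper's exact conditionals:

```python
def influence_score(n_u_s1, n_u, n_vd_z, n_vd, n_u_v_s1, num_facets=20, alpha=0.5, eta=0.5):
    """Unnormalized score for 'user u wrote this word under influence of v'.

    A simplified product of the three count-based factors described above;
    alpha and eta are illustrative smoothing constants, not the paper's.
    """
    vulnerability = (n_u_s1 + eta) / (n_u + 2 * eta)               # pi: how often u copies
    facet_focus = (n_vd_z + alpha) / (n_vd + num_facets * alpha)   # theta: v's focus on facet z
    affinity = n_u_v_s1 + 1.0                                      # psi: how often v influenced u
    return vulnerability * facet_focus * affinity
```

Note that the score grows with each factor, so a user who copies often, from an influencer focused on the right facet whom they copied before, gets the highest weight.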

04 PART Experiment

Sources of the Experiment
To test GhostLink, the paper uses four online communities in different domains: BeerAdvocate and RateBeer for beer reviews, and Amazon for movie and food reviews. The number of latent facets is set to K = 20. For each community, the symmetric concentration parameters are set as below:

Likelihood, Smoothness, Fast Convergence
The figure shows the log-likelihood of the data per iteration. Learning is stable, with a smooth increase in the data log-likelihood. It is also easy to see a clear gap when GhostLink takes influence into account. The table shows the runtime comparison to convergence between the basic and fast implementations of GhostLink.

Influence-aware Item Rating Prediction
The paper next presents the effectiveness of GhostLink in rating prediction. Four methods are compared: 1. GhostLink; 2. Rating + Text-aware; 3. Rating + Time + Network; 4. Rating + Time-aware. The results are shown on the right. From these results, it is easy to see that textual features alone are not helpful; performance improves as more influence-specific features are incorporated.

Facet Preference Divergence
In this part, the paper examines whether there is any difference between users' latent facet preferences and their observed preferences, by computing the Jensen-Shannon divergence. The results show that: 1. there is a strong occurrence of social influence on user preferences in online communities; 2. users are more likely to use their original latent preferences to influence others in the community.
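The Jensen-Shannon divergence between two facet distributions can be computed with a small stdlib-only sketch; the toy distributions below are illustrative, not from the paper:

```python
from math import log2

def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; assumes q > 0 wherever p > 0."""
    return sum(pi * log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, and bounded in [0, 1] when using base 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

latent = [0.7, 0.2, 0.1]    # toy latent facet preference of a user
observed = [0.3, 0.4, 0.3]  # toy observed (influence-mixed) preference
gap = js_divergence(latent, observed)
```

A large divergence between a user's latent and observed facet distributions is exactly the kind of gap the paper interprets as evidence of social influence.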

Structure of the Influence Network
Finally, the author analyzes the structure of the influence network ψ to find out how the mass is distributed in the network. For this, the author computes a maximum weighted spanning tree, whose features are shown below. It turns out that the majority of the mass of the influence graph is concentrated in giant tree-components.
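The maximum weighted spanning tree step can be illustrated with a small stdlib-only Kruskal sketch on a toy influence graph; the node names and edge weights are made up for illustration:

```python
def max_spanning_tree(nodes, edges):
    """Kruskal's algorithm for a maximum weighted spanning tree:
    greedily add the heaviest edge that does not close a cycle,
    using union-find with path halving to track components."""
    parent = {n: n for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = []
    for u, v, w in sorted(edges, key=lambda e: -e[2]):  # heaviest first
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append((u, v, w))
    return tree

# Toy influence graph: (u, v, weight) = strength with which u influenced v.
edges = [("a", "b", 0.9), ("b", "c", 0.8), ("a", "c", 0.3), ("c", "d", 0.7)]
tree = max_spanning_tree(["a", "b", "c", "d"], edges)
```

On this toy graph the tree keeps the three heaviest edges and drops the light a-c edge, mirroring how the analysis concentrates the graph's mass into tree-components.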

Structure of the Influence Network (cont.)
The tree structure on the right shows another characteristic: only a few users seem to influence many others. The authors next analyze whether specific power-law behaviors can be observed; the results are shown below. There are also only a few hubs, and these have very high hub scores.

05 PART Conclusion

This paper uses GhostLink to learn the underlying influence graph in online communities. By capturing implicit social influence, the method improves item rating prediction by 23% over state-of-the-art methods.

THANKS