New and Improved: Modeling Versions to Improve App Recommendation
Date: 2014/10/2
Authors: Jovian Lin, Kazunari Sugiyama, Min-Yen Kan, Tat-Seng Chua
Source: SIGIR’14
Advisor: Jia-Ling Koh
Speaker: Pei-Hao Wu
Outline: Introduction, Method, Experiment, Conclusion
Introduction: Motivation
1. Existing recommender systems usually model items as static: unchanging in attributes, description, and features.
2. For mobile apps, however, a version update may bring substantial changes to an app, and such updates may attract a consumer's interest in a previously unappealing app.
Introduction: Example (figure omitted)
Introduction: Framework Overview
[Pipeline diagram: version features → semi-supervised topic model → latent topics. The latent topics feed two components: identifying important latent topics per genre, and user personalization (user preference). Both feed the calculation of the version-snippet score Score(d, u), which is used to recommend apps.]
Outline: Introduction, Method, Experiment, Conclusion
Version Features
◦ Version-categories: major, minor, maintenance
◦ Version-snippet: i.e., the app's changelog
◦ Other information: genre mixture, ratings
Semi-supervised Topic Model
◦ Observed labels: version categories (i.e., major, minor, maintenance) and genre mixture (e.g., Books, Business, Action, ...)
◦ Corpus enhancement with pseudo-terms: inject pseudo-terms (i.e., genre + version category) into the corpus of version-snippets. This helps produce topic distributions that are more consistent with the nature of version-snippets.
◦ Ex: Version 1.2.1 (maintenance update): "Retina Display graphics. Background post completion (iOS 4 only). Bug fixes." → injected pseudo-terms: #Books_genre, #maintenance_update
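As a rough illustration of the corpus-enhancement step, the sketch below appends genre and version-category pseudo-terms to a version-snippet's tokens. The helper name and token format are assumptions for illustration, not the authors' code.

```python
# Minimal sketch of pseudo-term injection into a version-snippet.
# The token format (#<genre>_genre, #<category>_update) mirrors the
# slide's example; it is an assumption, not the paper's exact scheme.

def inject_pseudo_terms(tokens, genres, category):
    """Append genre and version-category pseudo-terms to a snippet's tokens."""
    pseudo = ["#{}_genre".format(g) for g in genres]
    pseudo.append("#{}_update".format(category))
    return tokens + pseudo

snippet = ["retina", "display", "graphics", "background",
           "post", "completion", "bug", "fixes"]
print(inject_pseudo_terms(snippet, genres=["Books"], category="maintenance"))
# [..., '#Books_genre', '#maintenance_update']
```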
Semi-supervised Topic Model: LDA vs. LLDA
[Diagram: both LDA and Labeled LDA (LLDA) take the corpus of documents (d1: w1, w2; d2: w3, w4; d3: w5) and infer per-document topic distributions p(topic|d) and per-topic word distributions p(w|topic). LLDA additionally conditions each document's topics on its observed labels (label1, ..., label4).]
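For concreteness, a labeled-LDA pass over the enhanced corpus might look like the sketch below, assuming the third-party tomotopy library and its LLDAModel API; the paper does not specify an implementation.

```python
# Hedged sketch: train a labeled LDA on label-annotated snippets,
# assuming tomotopy's LLDAModel (not the authors' implementation).
import tomotopy as tp

mdl = tp.LLDAModel(seed=42)
docs = [
    (["retina", "display", "bug", "fixes"], ["Books", "maintenance"]),
    (["new", "levels", "multiplayer", "mode"], ["Games", "major"]),
]
for words, labels in docs:
    mdl.add_doc(words, labels=labels)

mdl.train(100)  # Gibbs-sampling iterations

for doc in mdl.docs:
    print(doc.get_topic_dist())  # p(topic | d) per version-snippet
```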
Identifying Important Latent Topics
◦ We want to know which latent topics are more important for recommendation.
◦ Each genre responds differently to the same type of version update. Ex: HD display support would be more enticing and relevant for a game app than for a music app.
◦ Popularity score: reflected by the votes a version receives. Positive votes: ratings 3~5; negative votes: ratings 0~2.
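A simple way to turn those votes into a popularity score APP(d) is the positive-vote fraction sketched below; the exact formula is an assumption, since the slide only defines the vote buckets.

```python
# Sketch of a popularity score APP(d) for a version-snippet d, assuming
# the positive-vote fraction n+ / (n+ + n-); the paper's exact formula
# may differ.
def popularity(ratings):
    pos = sum(1 for r in ratings if 3 <= r <= 5)  # positive votes (3~5)
    neg = sum(1 for r in ratings if 0 <= r <= 2)  # negative votes (0~2)
    total = pos + neg
    return pos / total if total else 0.0

print(popularity([5, 4, 1, 3]))  # 0.75
```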
Identifying Important Latent Topics
◦ We use the popularity score to define the importance weight of a genre-topic pair:
  w(g, z) = Σ_{d ∈ D(g)} APP(d) · p(z|d) / Σ_{z' ∈ Z} Σ_{d ∈ D(g)} APP(d) · p(z'|d)
  where D(g) is the set of version-snippets in genre g and APP(d) is d's popularity score.
◦ Ex: g = game, D(game) = {d1, d2}, Z = {topic 1, topic 2}, z = topic 1;
  p(topic 1|d1) = 0.5, p(topic 1|d2) = 0.4, p(topic 2|d1) = 0.6, p(topic 2|d2) = 0.3
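The weight can be computed directly from the slide's example numbers; the sketch below assumes equal popularity scores (APP(d) = 1), since the example does not give them.

```python
# Sketch of the genre-topic importance weight w(g, z) for a fixed genre g,
# given p[(z, d)] = p(z|d) and app[d] = APP(d); names are illustrative.
def importance_weight(z, topics, genre_docs, p, app):
    num = sum(app[d] * p[(z, d)] for d in genre_docs)
    den = sum(app[d] * p[(z2, d)] for z2 in topics for d in genre_docs)
    return num / den

# Slide's example, assuming APP(d) = 1 for both documents:
p = {("t1", "d1"): 0.5, ("t1", "d2"): 0.4,
     ("t2", "d1"): 0.6, ("t2", "d2"): 0.3}
app = {"d1": 1.0, "d2": 1.0}
print(importance_weight("t1", ["t1", "t2"], ["d1", "d2"], p, app))  # 0.5
```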
User Personalization
◦ We need to know each user's preference with respect to the set of latent topics.
◦ We analyze the topics present in the apps that a user u has previously consumed:
  p(z|u) = Σ_{d ∈ D(u)} APP(d) · p(z|d) / Σ_{z' ∈ Z} Σ_{d ∈ D(u)} APP(d) · p(z'|d)
  where D(u) is the set of apps user u has consumed.
◦ Ex: z = topic 1, Z = {topic 1, topic 2}, u = user 1, D(user 1) = {d1}, p(topic 1|d1) = 0.5, p(topic 2|d1) = 0.6.
  With a single consumed app, APP(d1) cancels:
  p(topic 1|user 1) = p(topic 1|d1) / [p(topic 1|d1) + p(topic 2|d1)] = 0.5 / (0.5 + 0.6) ≈ 0.45
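The same normalization pattern gives the user-preference distribution; the sketch below reproduces the slide's single-app example (function and variable names are illustrative).

```python
# Sketch of the user-preference distribution p(z | u), mirroring the
# popularity-weighted normalization used for w(g, z).
def user_preference(z, topics, user_docs, p, app):
    num = sum(app[d] * p[(z, d)] for d in user_docs)
    den = sum(app[d] * p[(z2, d)] for z2 in topics for d in user_docs)
    return num / den

p = {("t1", "d1"): 0.5, ("t2", "d1"): 0.6}
app = {"d1": 1.0}  # APP(d1) cancels: the user consumed a single app
print(round(user_preference("t1", ["t1", "t2"], ["d1"], p, app), 2))  # 0.45
```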
Calculation of the Version-snippet Score
◦ Each candidate app is scored based on its latest version to decide whether it should be recommended:
  Score(d, u) = Σ_{z ∈ Z} p(z|d) · w(g(d), z) · p(z|u)
◦ Ex: d = d1, u = user 1, Z = {topic 1, topic 2}, genre(d) = game:
  [p(topic 1|d1) · w(game, topic 1) · p(topic 1|user 1)] + [p(topic 2|d1) · w(game, topic 2) · p(topic 2|user 1)]
  = (0.5 × 0.45 × 0.45) + (0.6 × 0.4 × 0.3) ≈ 0.173
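Putting the pieces together, the final score is a topic-wise weighted sum; the sketch below reproduces the slide's example numbers.

```python
# Sketch of Score(d, u) = sum over z of p(z|d) * w(g(d), z) * p(z|u),
# using the slide's example values for d1, genre "game", and user 1.
def score(topics, p_zd, w_gz, p_zu):
    return sum(p_zd[z] * w_gz[z] * p_zu[z] for z in topics)

p_zd = {"t1": 0.5, "t2": 0.6}    # p(z | d1)
w_gz = {"t1": 0.45, "t2": 0.4}   # w(game, z)
p_zu = {"t1": 0.45, "t2": 0.3}   # p(z | user 1)
print(score(["t1", "t2"], p_zd, w_gz, p_zu))  # ≈ 0.17325
```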
Outline: Introduction, Method, Experiment, Conclusion
Dataset
◦ App metadata: app ID, title, description, genre
◦ Version information: all version information for each app
◦ Ratings: app ID, version number, rating, reviewer's ID, review comments
Dataset
◦ Filtering: apps with at least 5 versions, documents (version-snippets) with at least 10 ratings, and users who rated at least 20 apps
◦ The resulting dataset consists of 9,797 users and 6,524 apps, together with their versions and ratings
◦ 20% of the users are selected as target users to receive recommendations
◦ For each target user, the 25% most recently downloaded apps are withheld as the test set
Evaluation Metric
◦ Recall@M, defined as: Recall@M = N_M / N_total
◦ N_M: the number of apps the user likes among the top-M recommendations
◦ N_total: the total number of apps the user likes
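A minimal sketch of the metric, assuming the Recall@M reading above (the slide leaves the symbols blank):

```python
# Recall@M: fraction of a user's liked apps found in the top-M list.
def recall_at_m(ranked_apps, liked_apps, m):
    hits = sum(1 for a in ranked_apps[:m] if a in liked_apps)
    return hits / len(liked_apps) if liked_apps else 0.0

print(recall_at_m(["a", "b", "c", "d"], {"b", "d", "e"}, m=3))  # 0.333...
```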
Recommendation Accuracy Obtained by Different Numbers of Latent Topics
◦ K = 1000 gives the best recall scores
Importance of Genre Information
◦ We compare the recommendation accuracies of models with and without genre information
Comparison of Different Topic Models
◦ The enhanced corpus generally provides better recall
◦ The LLDA and inj+LLDA models consistently outperform the pure LDA models
Comparison against Other Recommendation Techniques
◦ Although VSR underperforms against CF, it outperforms CBF
◦ The textual features in the app descriptions are noisy, which likely explains CBF's weaker performance
Dissecting Specific LDA Topics
◦ We observe that the injected pseudo-terms act as a guide for inj+LLDA's inference process, contributing to better latent topic generation
Outline: Introduction, Method, Experiment, Conclusion
Conclusion
◦ Our framework utilizes a semi-supervised variant of LDA that accounts for both text and metadata to characterize version features as a set of latent topics
◦ We used genre information to discriminate the topic distributions and found it to be a key factor
◦ Our method targets particular versions of apps, allowing previously disfavored apps to be recommended