
Similar presentations
Recommender System A Brief Survey.

Collaborative QoS Prediction in Cloud Computing Department of Computer Science & Engineering The Chinese University of Hong Kong Hong Kong, China Rocky.
Active Learning for Streaming Networked Data Zhilin Yang, Jie Tang, Yutao Zhang Computer Science Department, Tsinghua University.
1 RegionKNN: A Scalable Hybrid Collaborative Filtering Algorithm for Personalized Web Service Recommendation Xi Chen, Xudong Liu, Zicheng Huang, and Hailong.
Oct 14, 2014 Lirong Xia Recommender systems acknowledgment: Li Zhang, UCSC.
COLLABORATIVE FILTERING Mustafa Cavdar Neslihan Bulut.
Learning to Recommend Hao Ma Supervisors: Prof. Irwin King and Prof. Michael R. Lyu Dept. of Computer Science & Engineering The Chinese University of Hong.
An Approach to Evaluate Data Trustworthiness Based on Data Provenance Department of Computer Science Purdue University.
Chen Cheng1, Haiqin Yang1, Irwin King1,2 and Michael R. Lyu1
Carnegie Mellon 1 Maximum Likelihood Estimation for Information Thresholding Yi Zhang & Jamie Callan Carnegie Mellon University
ACM Multimedia th Annual Conference, October , 2004
Collaborative Ordinal Regression Shipeng Yu Joint work with Kai Yu, Volker Tresp and Hans-Peter Kriegel University of Munich, Germany Siemens Corporate.
Scalable Text Mining with Sparse Generative Models
1 Collaborative Filtering: Latent Variable Model LIU Tengfei Computer Science and Engineering Department April 13, 2011.
Chapter 12 (Section 12.4) : Recommender Systems Second edition of the book, coming soon.
Item-based Collaborative Filtering Recommendation Algorithms
SIGIR’09 Boston 1 Entropy-biased Models for Query Representation on the Click Graph Hongbo Deng, Irwin King and Michael R. Lyu Department of Computer Science.
Performance of Recommender Algorithms on Top-N Recommendation Tasks RecSys 2010 Intelligent Database Systems Lab. School of Computer Science & Engineering.
Distributed Networks & Systems Lab. Introduction Collaborative filtering Characteristics and challenges Memory-based CF Model-based CF Hybrid CF Recent.
1 Information Filtering & Recommender Systems (Lecture for CS410 Text Info Systems) ChengXiang Zhai Department of Computer Science University of Illinois,
Group Recommendations with Rank Aggregation and Collaborative Filtering Linas Baltrunas, Tadas Makcinskas, Francesco Ricci Free University of Bozen-Bolzano.
1 Formal Models for Expert Finding on DBLP Bibliography Data Presented by: Hongbo Deng Co-worked with: Irwin King and Michael R. Lyu Department of Computer.
A Hybrid Recommender System: User Profiling from Keywords and Ratings Ana Stanescu, Swapnil Nagar, Doina Caragea 2013 IEEE/WIC/ACM International Conferences.
Training and Testing of Recommender Systems on Data Missing Not at Random Harald Steck at KDD, July 2010 Bell Labs, Murray Hill.
Evaluation Methods and Challenges. 2 Deepak Agarwal & Bee-Chung ICML’11 Evaluation Methods Ideal method –Experimental Design: Run side-by-side.
CIKM’09 Date:2010/8/24 Advisor: Dr. Koh, Jia-Ling Speaker: Lin, Yi-Jhen 1.
Chengjie Sun,Lei Lin, Yuan Chen, Bingquan Liu Harbin Institute of Technology School of Computer Science and Technology 1 19/11/ :09 PM.
Google News Personalization: Scalable Online Collaborative Filtering
Online Learning for Collaborative Filtering
RecBench: Benchmarks for Evaluating Performance of Recommender System Architectures Justin Levandoski Michael D. Ekstrand Michael J. Ludwig Ahmed Eldawy.
Exploiting Context Analysis for Combining Multiple Entity Resolution Systems -Ramu Bandaru Zhaoqi Chen Dmitri V.kalashnikov Sharad Mehrotra.
EigenRank: A Ranking-Oriented Approach to Collaborative Filtering IDS Lab. Seminar Spring 2009 강민석 May 21st, 2009 Nathan.
A Novel Local Patch Framework for Fixing Supervised Learning Models Yilei Wang 1, Bingzheng Wei 2, Jun Yan 2, Yang Hu 2, Zhi-Hong Deng 1, Zheng Chen 2.
The Effect of Dimensionality Reduction in Recommendation Systems
Jiafeng Guo(ICT) Xueqi Cheng(ICT) Hua-Wei Shen(ICT) Gu Xu (MSRA) Speaker: Rui-Rui Li Supervisor: Prof. Ben Kao.
EigenRank: A ranking oriented approach to collaborative filtering By Nathan N. Liu and Qiang Yang Presented by Zachary 1.
WSP: A Network Coordinate based Web Service Positioning Framework for Response Time Prediction Jieming Zhu, Yu Kang, Zibin Zheng and Michael R. Lyu The.
Xutao Li1, Gao Cong1, Xiao-Li Li2
Pairwise Preference Regression for Cold-start Recommendation Speaker: Yuanshuai Sun
CoNMF: Exploiting User Comments for Clustering Web2.0 Items Presenter: He Xiangnan 28 June School of Computing National.
Hongbo Deng, Michael R. Lyu and Irwin King
Panther: Fast Top-k Similarity Search in Large Networks JING ZHANG, JIE TANG, CONG MA, HANGHANG TONG, YU JING, AND JUANZI LI Presented by Moumita Chanda.
Recommender Systems with Social Regularization Hao Ma, Dengyong Zhou, Chao Liu Microsoft Research Michael R. Lyu The Chinese University of Hong Kong Irwin.
A Novel Relational Learning-to- Rank Approach for Topic-focused Multi-Document Summarization Yadong Zhu, Yanyan Lan, Jiafeng Guo, Pan Du, Xueqi Cheng Institute.
11 A Classification-based Approach to Question Routing in Community Question Answering Tom Chao Zhou 1, Michael R. Lyu 1, Irwin King 1,2 1 The Chinese.
Service Reliability Engineering The Chinese University of Hong Kong
Collaborative Filtering via Euclidean Embedding M. Khoshneshin and W. Street Proc. of ACM RecSys, pp , 2010.
ICONIP 2010, Sydney, Australia 1 An Enhanced Semi-supervised Recommendation Model Based on Green’s Function Dingyan Wang and Irwin King Dept. of Computer.
Personalization Services in CADAL Zhang yin Zhuang Yuting Wu Jiangqin College of Computer Science, Zhejiang University November 19,2006.
Unsupervised Streaming Feature Selection in Social Media
User Modeling and Recommender Systems: recommendation algorithms
Learning to Rank: From Pairwise Approach to Listwise Approach Authors: Zhe Cao, Tao Qin, Tie-Yan Liu, Ming-Feng Tsai, and Hang Li Presenter: Davidson Date:
1 Learning to Impress in Sponsored Search Xin Supervisors: Prof. King and Prof. Lyu.
Item-Based Collaborative Filtering Recommendation Algorithms Badrul Sarwar, George Karypis, Joseph Konstan, and John Riedl GroupLens Research Group/ Army.
Reputation-aware QoS Value Prediction of Web Services Weiwei Qiu, Zhejiang University Zibin Zheng, The Chinese University of HongKong Xinyu Wang, Zhejiang.
Collaborative Deep Learning for Recommender Systems
Collaborative Filtering With Decoupled Models for Preferences and Ratings Rong Jin 1, Luo Si 1, ChengXiang Zhai 2 and Jamie Callan 1 Language Technology.
Hao Ma, Dengyong Zhou, Chao Liu Microsoft Research Michael R. Lyu
1 Dongheng Sun 04/26/2011 Learning with Matrix Factorizations By Nathan Srebro.
A Collaborative Quality Ranking Framework for Cloud Components
TJTS505: Master's Thesis Seminar
WSRec: A Collaborative Filtering Based Web Service Recommender System
Asymmetric Correlation Regularized Matrix Factorization for Web Service Recommendation Qi Xie1, Shenglin Zhao2, Zibin Zheng3, Jieming Zhu2 and Michael.
FRM: Modeling Sponsored Search Log with Full Relational Model
Q4 : How does Netflix recommend movies?
RECOMMENDER SYSTEMS WITH SOCIAL REGULARIZATION
ITEM BASED COLLABORATIVE FILTERING RECOMMENDATION ALGORITHEMS
Probabilistic Latent Preference Analysis
Fusing Rating-based and Hitting-based Algorithms in Recommender Systems Xin Xin
Learning to Rank with Ties
Presentation transcript:

1 Effective Fusion-based Approaches for Recommender Systems Xin Supervisors: Prof. King and Prof. Lyu Thesis Committees: Prof. Lam, Prof. Lui, and Prof. Yang Department of Computer Science and Engineering The Chinese University of Hong Kong July 8, 2011

2 Outline Background of Recommender Systems Motivation of the Thesis Part 1: Relational Fusion of Multiple Features Part 2: Effective Fusion of Regression and Ranking Part 3: Effective Fusion of Quality and Relevance Part 4: Impression Efficiency Optimization Conclusion

3 Exponential Increase of Information

4 Information Overload Too much information Noise

5 Recommender Systems To filter useful information for users –Movie recommendation from MovieLens

6 Ratings in Recommender Systems Ratings − Recommendation result quality evaluation

7 Classical Regression Problem Task: predict unrated user-item pairs Figure. User-item matrix in recommender systems

8 An Overview of Techniques Content-based –[Melville 2002] Memory-based –User-based [Breese 1998], [Jin 2004] –Item-based [Deshpande 2004], [Sarwar 2001] –Hybrid [Wang 2006], [Ma 2007] Model-based –[Hofmann 2004], [Salakhutdinov 2008] State-of-the-art methods – Memory-based [Ma 2007] [Liu 2008] – Model-based [Salakhutdinov 2008] [Koren 2008] [Koren 2010] [Weimer 2007] Main Idea: Content Features Naive Method: If a user has given a high rating to a movie directed by Ang Lee, other movies directed by Ang Lee will be recommended to this user. Main Idea: Common Behavior Patterns Naive Method: If a user has given a high rating to movie A, and many users who like A also like B, movie B will be recommended to the user.
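The naive common-behavior idea above ("users who like A also like B") can be sketched in a few lines. The data, names, and rating threshold are hypothetical; real memory-based methods weight neighbors by similarity rather than simple co-occurrence counts.

```python
from collections import defaultdict

# Hypothetical ratings: user -> {movie: rating}.
ratings = {
    "u1": {"A": 5, "B": 4},
    "u2": {"A": 4, "B": 5, "C": 2},
    "u3": {"A": 5, "C": 1},
}

def item_cooccurrence_recommend(ratings, user, like_threshold=4):
    """Naive memory-based idea: movies liked by users who share the
    target user's liked movies are recommended, ranked by how many
    such users liked them."""
    liked = {i for i, r in ratings[user].items() if r >= like_threshold}
    scores = defaultdict(int)
    for other, their in ratings.items():
        if other == user:
            continue
        # Only consider users who share at least one liked movie.
        if liked & {i for i, r in their.items() if r >= like_threshold}:
            for i, r in their.items():
                if r >= like_threshold and i not in ratings[user]:
                    scores[i] += 1
    return sorted(scores, key=scores.get, reverse=True)

print(item_cooccurrence_recommend(ratings, "u3"))  # u1 and u2 like A; both like B
```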

9 Applications of Recommender Systems

10 Outline Background of Recommender Systems Motivation of the Thesis Part 1: Relational Fusion of Multiple Features Part 2: Effective Fusion of Regression and Ranking Part 3: Effective Fusion of Quality and Relevance Part 4: Impression Efficiency Optimization Conclusion

11 Fusion-based Approaches To combine multiple sources of information and multiple algorithms to achieve better performance Two heads are better than one Fusion is effective –Reports on competitions such as Netflix [Koren 2008] and KDD CUP [Kurucz 2007][Rosset 2007] Ratings + Social Information + Location → Better Recommendation

12 Roadmap of the Thesis (1): Evaluation Structure of Recommender Systems [Figure: evaluation structure covering quality, relevance, impression efficiency, regression, ranking, coverage, diversity, and privacy] 1. Single measure and single dimension 2. Multi-measure adaption 3. Multi-dimensional adaption 4. Success of recommender systems

13 Roadmap of the Thesis (2): Summary [Figure: thesis roadmap with Parts 1-4] Part 1: Relational fusion of multiple features − Full and oral paper in CIKM 2009 (citation count: 14) Part 2: Effective fusion of regression and ranking − Submitted to CIKM 2011 Part 3: Effective fusion of quality and relevance − Full paper in WSDM 2011 Part 4: Impression efficiency optimization − Prepared for WWW 2012

14 Outline Background of Recommender Systems Motivation of the Thesis Part 1: Relational Fusion of Multiple Features Part 2: Effective Fusion of Regression and Ranking Part 3: Effective Fusion of Quality and Relevance Part 4: Impression Efficiency Optimization Conclusion

15 Part 1: Relational Fusion of Multiple Features [Figure: thesis roadmap] Limitations of previous work − Lack of relational dependency − Difficulty in integrating features

16 Limitation (1): Lack of Relational Dependency within Predictions [Figure: user-item matrix with observed ratings r and predictions y over items i1-i7 and users u1-u4] Heuristic Fusion [Wang 2006] − Difficult to measure the similarity between y35 and y43 − Cannot guarantee the nearness between y35 and y33 (or y35 and y45) EMDP [Ma 2007] − Error propagation

17 Limitation (2): Difficult to Integrate Features into a Unified Approach Linear Integration [Ma 2007] –Difficult to calculate the feature function weights Trust relationship Content information of items Profile information of users

18 Our Solution: Multi-scale Continuous Conditional Random Fields (MCCRF) We propose to utilize MCCRF as a relational fusion-based approach, which extends single-scale continuous conditional random fields Relational dependency within predictions can be modeled by the Markov property Feature weights are globally optimized

19 Relational Recommendation Formulation Let X denote observations. Let Y denote predictions. Traditional Recommendation vs. Relational Recommendation

20 Traditional Single-scale Continuous Conditional Random Fields Feature example: Avg. rating of item m; Similarity between item m and item n
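A single-scale continuous CRF of the kind described here typically takes the following form; this is a hedged reconstruction, and the exact feature functions used in the thesis are assumptions:

```latex
P(\mathbf{y}\mid\mathbf{x}) = \frac{1}{Z(\mathbf{x})}
\exp\Big( \sum_{m}\sum_{k} \alpha_k\, f_k(y_m, \mathbf{x})
        + \sum_{m \sim n}\sum_{l} \beta_l\, g_l(y_m, y_n, \mathbf{x}) \Big)
```

where Z(x) is the normalization factor, the f_k are local features (e.g., the average rating of item m), the g_l are relational features (e.g., the similarity between items m and n), and m ∼ n ranges over neighboring predictions.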

21 Multi-scale Continuous Conditional Random Fields Feature example: Avg. rating of user l; Similarity between item m and item n; Trust between user l and user j

22 Features Local features Avg. rating of the same occupation Avg. rating of the same age and gender Avg. rating of the same genre Relational features Similarity among items Similarity among users Trust relation

23 Algorithms: Training and Inference Training process – Objective function – Gradient – Gibbs sampling Inference process – Objective function – Simulated annealing
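The simulated-annealing inference step can be illustrated with a generic sketch. The scoring function below is a toy stand-in for the CRF's log-density (a local term pulling predictions toward observations plus a relational smoothness term), not the thesis' actual model.

```python
import math
import random

def simulated_annealing(score, y0, steps=2000, t0=1.0, seed=0):
    """Generic simulated-annealing sketch for inference: maximize an
    unnormalized log-density `score` over a vector of continuous
    predictions y. CRF details are abstracted into `score`."""
    rng = random.Random(seed)
    y, best = list(y0), list(y0)
    s, s_best = score(y0), score(y0)
    for step in range(steps):
        t = t0 * (1 - step / steps) + 1e-6        # cooling schedule
        cand = list(y)
        i = rng.randrange(len(cand))
        cand[i] += rng.gauss(0, 0.3)              # local Gaussian proposal
        s_cand = score(cand)
        # Always accept improvements; accept worse moves with prob exp(dS/t).
        if s_cand > s or rng.random() < math.exp((s_cand - s) / t):
            y, s = cand, s_cand
            if s > s_best:
                best, s_best = list(y), s
    return best

# Toy target: predictions pulled toward observed ratings (local term)
# and toward each other (relational smoothness term).
obs = [4.0, 3.0]
score = lambda y: -sum((a - b) ** 2 for a, b in zip(y, obs)) - 0.5 * (y[0] - y[1]) ** 2
y_hat = simulated_annealing(score, [2.5, 2.5])
```

For this quadratic toy target the exact maximizer is (3.75, 3.25), so the sketch can be sanity-checked against it.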

24 Experiment-Setup Datasets –MovieLens –Epinions Metrics –Mean Absolute Error (MAE) –Root Mean Square Error (RMSE) Baselines –EPCC: combination of UPCC and IPCC (memory) –Aspect Model (AM): classical latent method (model) –Fusion: directly find similar users’ similar items –EMDP: two rounds prediction

25 Effectiveness of Dependency MovieLens Epinions Our MCCRF approach outperforms the others consistently Fusion [Wang 2006] calculates inaccurate similarities between two predictions EMDP [Ma 2007] suffers from the error propagation problem

26 Effectiveness of Features MovieLens Epinions Approaches with more features perform better Two heads are better than one MCCRF is effective in fusion of multiple features

27 Overall Performance The proposed MCCRF performs the best Effectiveness of relational feature dependency Effective fusion of multiple features Table. Performance on the MovieLens dataset Table. Performance on the Epinions dataset

28 Summary of Part 1 We propose a novel model MCCRF as a framework for relational recommendation We propose an MCMC-based method for training and inference Experimental verification on the effectiveness of the proposed approach on MovieLens and Epinions

29 Outline Background of Recommender Systems Motivation of the Thesis Part 1: Relational Fusion of Multiple Features Part 2: Effective Fusion of Regression and Ranking Part 3: Effective Fusion of Quality and Relevance Part 4: Impression Efficiency Optimization Conclusion

30 Part 2: Effective Fusion of Regression and Ranking [Figure: thesis roadmap] Limitations of previous work − Over-bias toward a single measure: such methods cannot adapt to the other measure − Information is not fully utilized for the data-sparsity problem

31 Regression vs. Ranking Regression: modeling and predicting the ratings –Output: y_j2 = 3.6, y_j3 = 3.5, y_j5 = 1.0 … Ranking: modeling and predicting the ranking orders –Output: y_j3 > y_j2 > y_j5 … Comparisons –Advantages of regression: (1) intuitive (2) low complexity –Advantages of ranking: (1) richer information (2) direct for applications

32 Our Solution: Combining Regression and Ranking in Collaborative Filtering This work is the first attempt to investigate the combination of regression and ranking in the collaborative filtering community As the first solution, we propose combination methods for both model-based and memory-based algorithms

33 Model-based Method Selection Regression method –Probabilistic Matrix Factorization (PMF) [Salakhutdinov 2007] Ranking method –List-wise Matrix Factorization (LMF) [Shi 2010] [Figure: probabilistic graphs of the models]

34 Model-based Combination Objective function Gradient descent optimization
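As a hedged illustration of such a combined objective, the sketch below mixes a squared-error (regression) term with a list-wise top-one cross-entropy (ranking) term on one user's predicted scores, optimized by plain gradient descent. The actual PMF/LMF combination factorizes over latent user and item vectors, which is abstracted away here; the data and the weight alpha are hypothetical.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def combined_loss_grad(pred, truth, alpha=0.5):
    """Combined objective on one user's predicted scores:
    alpha * squared error (regression) +
    (1 - alpha) * list-wise top-one cross-entropy (ranking)."""
    reg_loss = 0.5 * np.sum((pred - truth) ** 2)
    p, q = softmax(truth), softmax(pred)
    rank_loss = -np.sum(p * np.log(q))
    loss = alpha * reg_loss + (1 - alpha) * rank_loss
    grad = alpha * (pred - truth) + (1 - alpha) * (q - p)
    return loss, grad

truth = np.array([5.0, 3.0, 1.0])     # hypothetical true ratings
pred = np.array([2.0, 2.0, 2.0])      # initial predictions
for _ in range(200):                  # plain gradient descent
    loss, grad = combined_loss_grad(pred, truth)
    pred -= 0.1 * grad
```

Both terms are minimized when the predictions match the true ratings, so the descent recovers both the rating values and their order.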

35 Memory-based Methods Selection Regression Method –User-based PCC [Breese 1998] Pearson Correlation Coefficient (PCC) similarity Rating Prediction Ranking Method –EigenRank [Liu SIGIR 2008] Kendall Rank Correlation Coefficient (KRCC) similarity Ranking Prediction
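The rating-prediction step of user-based CF is the standard mean-centered weighted average over neighbors. The data and the constant similarity function below are hypothetical; in practice `sim` would be the PCC similarity from the appendix.

```python
def predict_rating(user, item, ratings, sim):
    """Standard user-based CF prediction: the user's mean rating plus a
    similarity-weighted average of neighbors' mean-centered ratings."""
    mean = lambda u: sum(ratings[u].values()) / len(ratings[u])
    num = den = 0.0
    for other in ratings:
        if other == user or item not in ratings[other]:
            continue
        s = sim(user, other)
        num += s * (ratings[other][item] - mean(other))
        den += abs(s)
    return mean(user) + (num / den if den else 0.0)

# Hypothetical data and a fixed similarity for illustration.
ratings = {
    "a": {"i1": 4, "i2": 2},
    "b": {"i1": 5, "i2": 3, "i3": 5},
    "c": {"i1": 3, "i3": 1},
}
print(predict_rating("a", "i3", ratings, lambda u, v: 1.0))
```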

36 Memory-based Combination No objective function –To combine the results from regression and ranking algorithms Challenge –The output values are incompatible Naive combination Conflict: The results of the ranking model do not follow the Gaussian distribution of the real data.

37 Sampling Trick [Figure: distribution of EigenRank results vs. sampling results]
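One way to realize such a sampling trick is sketched below: draw values from the empirical rating distribution and assign them to items so that the ranking order is preserved, making the ranking output compatible with the rating scale. This is a heavily hedged sketch; the thesis' exact procedure may differ, and the scores and rating pool are hypothetical.

```python
import random

def rank_to_rating_samples(rank_scores, observed_ratings, seed=0):
    """Replace ranking scores with rating-scale values: sample from the
    empirical rating distribution, then assign the sampled values to
    items in descending order so the original ranking is preserved."""
    rng = random.Random(seed)
    samples = sorted((rng.choice(observed_ratings) for _ in rank_scores),
                     reverse=True)
    order = sorted(range(len(rank_scores)), key=lambda i: -rank_scores[i])
    out = [0.0] * len(rank_scores)
    for pos, idx in enumerate(order):
        out[idx] = samples[pos]
    return out

scores = [0.9, 0.1, 0.5]          # EigenRank-style preference scores
pool = [1, 2, 3, 4, 5, 4, 3, 5]   # hypothetical observed ratings
converted = rank_to_rating_samples(scores, pool)
```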

38 Experimental Setup Datasets –MovieLens –1,682 items, 943 users –100,000 ratings –Netflix –17,770 items, 480,000 users –100,000,000 ratings Metrics –Regression measures –MAE, RMSE –Ranking measure –Normalized Discounted Cumulative Gain (NDCG) Setup –2/3 users for training, 1/3 users for testing –Given 5, Given 10, Given 15 Regression-prior and Ranking-prior –RegPModel, RegPMemory –RankPModel, RankPMemory

39 Performance of Model-based Combination Netflix Dataset / MovieLens Dataset The combination methods outperform single-measure-adapted methods on all the metrics The regression-prior model also shows an improvement on the ranking-based measure The ranking-prior model also shows an improvement on the regression-based measure

40 Performance of Memory-based Combination Netflix Dataset / MovieLens Dataset The combination methods outperform single-measure-adapted methods on all the metrics The regression-prior model also shows an improvement on the ranking-based measure The ranking-prior model also shows an improvement on the regression-based measure

41 Summary of Part 2 We conduct the first attempt to investigate the fusion of regression and ranking to overcome the limitation of single-measure collaborative filtering algorithms We propose combination methods from both the model-based and the memory-based perspectives Experimental results demonstrate that the combination enhances performance on both metrics

42 Outline Background of Recommender Systems Motivation of the Thesis Part 1: Relational Fusion of Multiple Features Part 2: Effective Fusion of Regression and Ranking Part 3: Effective Fusion of Quality and Relevance Part 4: Impression Efficiency Optimization Conclusion

43 Part 3: Effective Fusion of Quality and Relevance [Figure: thesis roadmap] Limitations of previous work − A single-dimensional algorithm cannot adapt to multi-dimensional performance − Incomplete recommendation

44 Quality vs. Relevance Quality: Whether recommended items can be rated with high scores Relevance: How many recommended items will be visited by the user Example –A user may give a high rating to a classical movie for its good quality, but may be more likely to watch a recent one that is more relevant and interesting to his/her life, though the latter may be lower in quality

45 Incompleteness Limitation (Qualitative Analysis) Quality-based methods ignore relevance –Type A is missing –Users may not show interest in visiting some of the recommended items Relevance-based methods ignore quality –Type D is missing –Users will suffer from normal-quality recommended results

46 Incompleteness Limitation (Quantitative Analysis) Quality-based algorithms –PMF [Salakhutdinov 2007], EigenRank [Liu 2008] Relevance-based algorithms –Assoc [Deshpande 2004], Freq [Sueiras 2007] Table. Performance on quality-based NDCG Table. Performance on relevance-based NDCG

47 Our Solution: Combining Quality-based and Relevance-based Algorithms Fusion of quality-based and relevance-based algorithms Continuous-time MArkov Process (CMAP) –Integration-unnatural limitation –Quantity-missing limitation

48 Integrated Objective Normalized Discounted Cumulative Gain (NDCG) –Both quality-based NDCG and relevance-based NDCG are accepted as practical measures in previous work [Guan 2009] [Liu 2008] Quality-based NDCG Relevance-based NDCG Integrated NDCG

49 Fundamental Methods Selection Competitive quality-based method –EigenRank (random walk theory) Competitive relevance-based methods –Association-based methods [Deshpande 2004] (relational feature) –Frequency-based methods [Sueiras 2007] (local feature)

50 Combination Methods Linear Combination –Disadvantages: Incompatible values Rank Combination –Disadvantages: Missing quantity information Continuous-time MArkov Process (CMAP) – Combination with an intuitive interpretation without missing quantity information

51 Continuous-time MArkov Process (CMAP) Association feature (relational) combination Frequency feature (local) combination 1) Customers' arrival follows the time-homogeneous Poisson process. 2) Service time follows an exponential distribution with the same service rate u. 3) Waiting time of a customer on the condition that there is a queue:

52 Performance (Quantitative Analysis) The three combination methods outperform the baselines in all the configurations The CMAP algorithm outperforms the other two fundamental combination methods

53 Performance (Qualitative Analysis) The incompleteness problem can be well solved by the combination [Figures: single-dimensional-adapted algorithms vs. multi-dimensional-adapted algorithms]

54 Summary of Part 3 Incompleteness limitation identification for single-dimensional-adapted algorithms CMAP, as well as two other novel approaches, for fusing quality-based and relevance-based algorithms Experimental verification of the effectiveness of CMAP

55 Outline Background of Recommender Systems Motivation of the Thesis Part 1: Relational Fusion of Multiple Features Part 2: Effective Fusion of Regression and Ranking Part 3: Effective Fusion of Quality and Relevance Part 4: Impression Efficiency Optimization Conclusion

56 Part 4: Impression Efficiency Optimization [Figure: thesis roadmap] Limitations of previous work − The importance has been identified, but the issue has never been carefully studied Impression Efficiency: How much revenue can be obtained by impressing a recommendation result?

57 Commercial Intrusion from Over-quantity Recommendation in Sponsored Search [Figure: result page for a query with organic results and sponsored search results] Evidence of commercial intrusion –Users have shown bias against sponsored search results [Marable 2003] –More than 82% of users will see organic results first [Jansen 2006] –Organic results have obtained a much higher click-through rate [Danescu-Niculescu-Mizil 2010] –Irrelevant ads will train users to ignore ads [Buscher 2010]

58 Our Solution We formulate the task of optimizing impression efficiency We identify the unstable problem, which prevents the static method from working well We propose a novel dynamic approach to solve the unstable problem

59 A Background Example N = 6; λ = 0.33 I(q1) + I(q2) + I(q3) + I(q4) + I(q5) + I(q6) <= N·λ = 6 × 0.33 ≈ 2 Best strategy: I(q1) = 0; I(q2) = 0; I(q3) = 0; I(q4) = 1; I(q5) = 1; I(q6) = 0
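With the revenues known in advance, the best strategy is simply to spend the budget of about N·λ impressions on the highest-revenue queries. The revenue values below are hypothetical, chosen so that q4 and q5 are the best two, matching the slide's best strategy.

```python
def best_strategy(revenues, n, lam):
    """Offline oracle: with a budget of roughly n * lam impressions,
    impress the queries with the highest estimated revenues."""
    k = int(n * lam + 0.5)                 # impression budget, about n * lam
    top = sorted(range(len(revenues)), key=lambda i: -revenues[i])[:k]
    return [1 if i in top else 0 for i in range(len(revenues))]

# Hypothetical revenues for q1..q6 in which q4 and q5 are the best two.
revenues = [5, 8, 10, 40, 80, 12]
print(best_strategy(revenues, n=6, lam=0.333))  # [0, 0, 0, 1, 1, 0]
```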

60 Problem Formulation Impression efficiency optimization: Evaluation metric

61 Experimental Setup The data is collected from Oct to December CCM [Guo 2009] is employed as the revenue estimation function Assumption: for a query, the estimated revenue of an ad at a later position is always smaller than that at an earlier position

62 The Static Method Case: N = 6, lambda = 0.33 Static method process: 1. Learn a threshold from the previous window (threshold = 26) 2. Select the elements that are larger than the threshold
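A minimal sketch of the static method, assuming the threshold is set so that roughly a fraction lambda of the previous window exceeds it. All revenue values here are hypothetical; the window is chosen so the learned threshold is 26, as in the slide's example.

```python
def static_select(previous_window, stream, lam):
    """Static method sketch: learn one revenue threshold from the
    previous window, then impress every incoming query above it."""
    k = max(1, int(len(previous_window) * lam))      # target selection count
    threshold = sorted(previous_window, reverse=True)[k - 1]
    return [x for x in stream if x > threshold]

# Hypothetical revenues; the window makes the learned threshold 26.
history = [26, 10, 5, 18, 12, 20]
print(static_select(history, [12, 27, 33, 20], lam=0.33))  # [27, 33]
```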

63 Performance of Static Method [Figure: error rate vs. lambda; a 5% gap] The static method outperforms the random method significantly There is a 5% gap between the static method and the best strategy

64 The Unstable Problem The data is not stable and changes over time Verification through statistics on the real-world dataset –The change of average revenue –The change of threshold –The change of query distribution –The change of click-through rate (CTR) [Figure: change of CTR] It follows a Gaussian distribution In more than 20% of cases, the data has changed by more than 20%

65 The Dynamic Method Case: N = 6, lambda = 0.333 Select 2 items from 6 queries 1) Generate a division from B(6, 0.5) = 4 2) Observe the first ⌊4/e⌋ = 1 group 3) Set the threshold = 30 4) Select the first element that is larger than 30 5) 40 would be selected 6) Reset the threshold to 40 7) Select the first element that is larger than 40 8) 80 would be selected 9) Output: 40 + 80 = 120
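The worked example can be traced with a small secretary-style sketch: observe a prefix to set an initial threshold, then greedily take each element exceeding the current threshold and raise the threshold to the value just taken. Only the values 30, 40, 80 and the output 120 come from the slide; the rest of the stream is hypothetical.

```python
import math

def dynamic_select(stream, budget, observe):
    """Dynamic (secretary-style) sketch: set the threshold to the max of
    the observed prefix, then pick each later element above the current
    threshold, resetting the threshold to each pick."""
    threshold = max(stream[:observe]) if observe else float("-inf")
    picked = []
    for x in stream[observe:]:
        if len(picked) == budget:
            break
        if x > threshold:
            picked.append(x)
            threshold = x                  # reset threshold to the pick
    return picked

# Slide's trace: observe floor(4/e) = 1 element, threshold 30, pick 40 then 80.
stream = [30, 40, 15, 80, 20, 10]
picked = dynamic_select(stream, budget=2, observe=math.floor(4 / math.e))
print(sum(picked))  # 120
```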

66 Empirical Analysis of the Dynamic Algorithm Competitive ratio: − the ratio between the algorithm's performance and the best performance The competitive ratio in all configurations is above 0.97

67 Performance of The Dynamic Algorithm The dynamic algorithm outperforms the static algorithm significantly

68 Summary of Part 4 Impression efficiency optimization formulation Unstable limitation identification in static methods We propose a dynamic algorithm Significant improvement

69 Outline Background of Recommender Systems Motivation of the Thesis Part 1: Relational Fusion of Multiple Features Part 2: Effective Fusion of Regression and Ranking Part 3: Effective Fusion of Quality and Relevance Part 4: Impression Efficiency Optimization Conclusion

70 Conclusion [Figure: thesis roadmap with Parts 1-4] Part 1: Relational fusion of multiple features − Relational dependency is ignored − Difficulty in learning feature weights Part 2: Fusion of regression and ranking for multi-measure adaption − Single-measure-adapted algorithms cannot adapt to multiple measures Part 3: Fusion of relevance and quality for multi-dimensional adaption − Single-dimensional algorithms cannot adapt to multiple dimensions Part 4: Impression efficiency optimization − The unstable problem in the static method

71 Q & A Thanks very much to my dearest supervisors, thesis committees and colleagues in the lab for your great help!

72 Appendix (1) Publications Ph.D. study –Xin Xin, Michael R. Lyu, and Irwin King. CMAP: Effective Fusion of Quality and Relevance for Multi-criteria Recommendation (Full Paper). In Proceedings of the ACM 4th International Conference on Web Search and Data Mining (WSDM 2011), Hong Kong, February. –Xin Xin, Irwin King, Hongbo Deng, and Michael R. Lyu. A Social Recommendation Framework Based on Multi-scale Continuous Conditional Random Fields (Full and Oral Paper). In Proceedings of the ACM 18th Conference on Information and Knowledge Management (CIKM 2009), Hong Kong, November. Previous work –Xin Xin, Juanzi Li, Jie Tang, and Qiong Luo. Academic Conference Homepage Understanding Using Hierarchical Conditional Random Fields (Full and Oral Paper). In Proceedings of the ACM 17th Conference on Information and Knowledge Management (CIKM 2008), Napa Valley, CA, October. –Xin Xin, Juanzi Li, and Jie Tang. Enhancing Semantic Web by Semantic Annotation: Experiences in Building an Automatic Conference Calendar (Short Paper). In Proceedings of the 2007 IEEE/WIC/ACM International Conference on Web Intelligence (WI 2007), Fremont, CA, November 2007.

73 Appendix (2) Unpublished Work –Xin Xin, Haiqin Yang, Michael R. Lyu, and Irwin King. Combining Regression and Ranking in Collaborative Filtering. Submitted to CIKM 2011. –Xin Xin, Wei Wang, Wei Yu, Jie Tang, Irwin King, and Michael R. Lyu. Learning to Impress in Sponsored Search. Preparing to submit to WWW 2012. –Xin Xin, Michael R. Lyu, and Irwin King. Relational Fusion-based Framework for Recommender Systems. Preparing to submit to TOIS. –Wei Wang, Xin Xin, Irwin King, Jie Tang, and Michael R. Lyu. Compete or Collaborate? Incorporating Relational Influence within Search Results into Click Models in Sponsored Search. Submitted to CIKM 2011.

74 Appendix (3) MAE and RMSE Definition Mean Absolute Error (MAE) Root Mean Square Error (RMSE)
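The two definitions above translate directly into code; the example values are hypothetical.

```python
import math

def mae(preds, truths):
    """Mean Absolute Error: average absolute deviation of predictions."""
    return sum(abs(p - t) for p, t in zip(preds, truths)) / len(preds)

def rmse(preds, truths):
    """Root Mean Square Error: like MAE but penalizes large errors more."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(preds, truths)) / len(preds))

print(mae([3, 4], [4, 2]), rmse([3, 4], [4, 2]))  # 1.5 and sqrt(2.5)
```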

75 Appendix (4) NDCG Definition Normalized Discounted Cumulative Gain (NDCG) –U is the number of test users. –Z is the normalization factor for a single user. –P is the position. –r_{u,p} is the rating of user u at position p. Example –Ideal rank: 3, 2, 1. Value1 = Z = 7/1 + 3/log(3) + 1/log(4). –Current rank: 2, 3, 1. Value2 = 3/1 + 7/log(3) + 1/log(4). –NDCG = Value2/Value1.
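The worked example can be reproduced directly, assuming base-2 logarithms and the exponential gain 2^r − 1 (so the position-1 term 7/1 matches a rating of 3):

```python
import math

def dcg(rated):
    """DCG with exponential gain (2^r - 1) and a log2(position + 1)
    discount, matching the slide's worked example."""
    return sum((2 ** r - 1) / math.log2(p + 1)
               for p, r in enumerate(rated, start=1))

def ndcg(current, ideal):
    """NDCG: DCG of the current order normalized by the ideal DCG."""
    return dcg(current) / dcg(ideal)

# Slide example: ideal ratings 3, 2, 1; the current order swaps the top two.
print(round(ndcg([2, 3, 1], [3, 2, 1]), 4))
```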

76 Appendix (5) PCC Definition Pearson Correlation Coefficient (PCC) –a and u are two users. –I(a) is the set of items user a has rated. –r_{a,i} is the rating of item i by user a. –r̄_a is the average rating of user a.
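The definition can be sketched as follows, with user means taken over all of each user's ratings and the sums over co-rated items; the example ratings are hypothetical.

```python
import math

def pcc(ra, ru):
    """Pearson Correlation Coefficient between two users' rating
    dictionaries, computed over their co-rated items."""
    common = set(ra) & set(ru)
    if not common:
        return 0.0
    ma = sum(ra.values()) / len(ra)    # average rating of user a
    mu = sum(ru.values()) / len(ru)    # average rating of user u
    num = sum((ra[i] - ma) * (ru[i] - mu) for i in common)
    den = math.sqrt(sum((ra[i] - ma) ** 2 for i in common)
                    * sum((ru[i] - mu) ** 2 for i in common))
    return num / den if den else 0.0

# Perfectly linearly related ratings give a correlation of 1.
print(pcc({"i1": 1, "i2": 2, "i3": 3}, {"i1": 2, "i2": 4, "i3": 6}))
```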

77 Appendix (5) KRCC Definition Kendall Rank Correlation Coefficient (KRCC) –I_u is the item set of user u. –I− is the indicator function.
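A minimal sketch of KRCC over co-rated items: concordant pairs minus discordant pairs, normalized by the number of pairs. Tied pairs are counted as discordant here for simplicity, which may differ from the thesis' exact definition; the example ratings are hypothetical.

```python
from itertools import combinations

def krcc(ra, ru):
    """Kendall Rank Correlation Coefficient over the items both users
    rated: (#concordant - #discordant) / #pairs."""
    common = sorted(set(ra) & set(ru))
    pairs = list(combinations(common, 2))
    if not pairs:
        return 0.0
    score = 0
    for i, j in pairs:
        # Concordant if both users order the pair the same way.
        score += 1 if (ra[i] - ra[j]) * (ru[i] - ru[j]) > 0 else -1
    return score / len(pairs)

# Completely reversed preferences give a correlation of -1.
print(krcc({"a": 3, "b": 2, "c": 1}, {"a": 1, "b": 2, "c": 3}))
```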

78 Appendix (6) Complexity Analysis for MCCRF For each user-item pair, the calculation complexity is –O(#feature * #neighbor * #iteration)

79 Appendix (7) Cluster Method in Large Datasets for MCCRF Running on the whole data takes too much memory for large datasets like Epinions [Xue 2005] proposes employing clustering methods to solve this problem The figures show the impact of cluster size on Epinions using the K-means clustering method

80 Appendix (8) Complexity Analysis for Model-based Combination of Regression and Ranking For each user-item pair, the calculation complexity is –O(#observation * #latent feature)

81 Appendix (9) Complexity Analysis for Memory-based Combination of Regression and Ranking Similarity calculation –PCC O(#user * #item * #common item) –KRCC O(#user * #item * #item * #common item) Rating calculation –O(1) Stationary distribution calculation –O(#items * #iteration)

82 Appendix (10) Complexity Analysis for CMAP Stationary distribution calculation. –O(#item * #iteration)

83 Appendix (11) Sensitivity Analysis for Model-based Regression-prior Combination

84 Appendix (12) Sensitivity Analysis for Model-based Ranking-prior Combination

85 Appendix (13) Sensitivity Analysis for Memory-based Combination

86 Appendix (14) Convergence in Model- based Combination

87 Appendix (15) Sensitivity Analysis for CMAP –MovieLens –Netflix