Advisor: Prof. Shou-de Lin (林守德) Student: Eric L. Lee (李揚)


Collaborative Filtering Based Model for Privacy-Preserving Course Recommendation

Outline Motivation and Introduction, Baselines, Our Methods, Experiments and Results, Conclusion

Motivation In 2012 there were 10,572 courses for students to choose from, so searching for the courses they want to take costs students a lot of time. In this paper we aim to develop a course recommendation system that uses only course records, in order to protect the privacy of the users.

Example Input: the former and current students' course records, stored as (student, course, time) triples, e.g. Calculus 2009-1, Statistics 2010-2, Probability 2010-1, Physics 2008-2, ... Output: our model produces a recommendation list for each current student, ranked from most to least recommended (e.g. Machine Learning, Statistics, ...).

Related Work Collaborative Filtering [1][2][3] (our baselines). Privacy-Preserving Course Recommendation: our work!
[1] Deshpande, M., & Karypis, G. (2004). Item-based top-n recommendation algorithms. ACM Transactions on Information Systems (TOIS), 22(1), 143-177.
[2] Koren, Y., Bell, R., & Volinsky, C. (2009). Matrix factorization techniques for recommender systems. Computer, 42(8), 30-37.
[3] Rendle, S., Freudenthaler, C., Gantner, Z., & Schmidt-Thieme, L. (2009). BPR: Bayesian personalized ranking from implicit feedback. In Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence (pp. 452-461). AUAI Press.

One-Class Collaborative Filtering Problem In the traditional recommendation problem, the matrix holds numeric ratings (e.g. 1-5) with unknown entries (?). In the OCCF problem, each entry only records whether the action happened (1) or is unknown (?).

Lack of positive training data in the next grade Former students have course records for both the 1st-3rd grades and the 4th grade. The current students we want to predict have no training data in the 4th grade.

Courses' Order Traditional CF assumes course-taking actions are i.i.d. However, we think course records in consecutive years have a closer relationship.

Our Framework Problems: P1: OCCF. P2: lack of training data in the next grade. P3: courses' order. Components: Memory-Based CF (reduced search space), Matrix Factorization (MF) and Bayesian Personalized Ranking (BPR) for P1, the Two Stage Method for P2, and Course Network Regularization for P3.

Modified Memory-Based CF Original CF: the search space for similar students is all other students, and the course candidates are all courses taken by the compared students. Modified CF: the search space is only the former students, and the candidates are only the 4th-grade courses taken by the compared students.

Modified Memory-Based Collaborative Filtering Using the 1st-3rd grade records, compute current student u's similarity to each former student, then score each 4th-grade course c by Score(u, c) := Σ_{s ∈ former students} sim(u, s) · preference(s, c)
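The scoring rule above can be sketched in a few lines of Python. The record layout and the use of Jaccard similarity here are illustrative assumptions (the deck also evaluates intersection and TFIDF variants of sim):

```python
def score_courses(current_record, former_students):
    """Modified memory-based CF (sketch).

    current_record: set of courses the current student took in grades 1-3.
    former_students: list of (early_courses, fourth_grade_courses) pairs,
      both sets, one pair per former student.
    Returns {course: score} over 4th-grade candidate courses only.
    """
    scores = {}
    for early, fourth in former_students:
        # Jaccard similarity on the grade 1-3 portion of the records
        union = current_record | early
        sim = len(current_record & early) / len(union) if union else 0.0
        # preference(s, c) is 1 iff former student s took course c in grade 4
        for c in fourth:
            scores[c] = scores.get(c, 0.0) + sim
    return scores
```

Note that, exactly as the modified-CF table requires, only former students contribute similarities and only their 4th-grade courses become candidates.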

Our Framework (recap) P1: OCCF. P2: lack of training data in the next grade. P3: courses' order. Next: Matrix Factorization (MF) and Bayesian Personalized Ranking (BPR).

MF: Minimizing Square Error Approximate the (students × courses) rating matrix R by P × Qᵀ: min_{P,Q} Σ_{(u,i) ∈ ratings} (R_{u,i} − P_u · Q_i)² + C(||P||² + ||Q||²) Koren, Y., Bell, R., & Volinsky, C. (2009). Matrix factorization techniques for recommender systems. Computer, 42(8), 30-37.
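A minimal SGD implementation of this square-error objective; the rank, learning rate and regularization constant here are arbitrary illustrative choices, not the values used in the experiments:

```python
import numpy as np

def mf_sgd(ratings, n_users, n_items, k=2, C=0.01, alpha=0.05,
           epochs=500, seed=0):
    """Plain matrix factorization by SGD on the observed entries (sketch).

    ratings: list of (u, i, r) triples. Minimizes
    sum (r - P_u . Q_i)^2 + C(||P||^2 + ||Q||^2).
    """
    rng = np.random.default_rng(seed)
    P = 0.1 * rng.standard_normal((n_users, k))
    Q = 0.1 * rng.standard_normal((n_items, k))
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - P[u] @ Q[i]          # residual of this observed entry
            P[u] += alpha * (err * Q[i] - C * P[u])
            Q[i] += alpha * (err * P[u] - C * Q[i])
    return P, Q
```

Unknown entries are simply skipped during training and later filled in by the dot products P_u · Q_i.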

Our Framework (recap) P1: OCCF. P2: lack of training data in the next grade. P3: courses' order. Next: Bayesian Personalized Ranking (BPR) for P1.

One-Class Collaborative Filtering (OCCF) Problem Each (student, course) entry is either 1 (the course was taken) or ? (unknown). Characteristic: the rating matrix only has one class (take / not take the course).

Bayesian Personalized Ranking: Intuition For each student u, every course i ∈ I_u⁺ should be ranked above every course j ∈ I_u⁻; maximize the likelihood of these pairwise preferences! I_u⁺: the courses the student has taken. I_u⁻: the courses the student hasn't taken.

Bayesian Personalized Ranking: Intuition We want to maximize the AUC, i.e. the probability that a taken course (+) ranks higher than an unknown one (?). Let D_S := {(u, i, j) | (i, j) ∈ I_u⁺ × I_u⁻}. max_θ Π_{(u,i,j) ∈ D_S} Pr(θ | R_{ui} > R_{uj}), which leads to min_θ Σ_{(u,i,j) ∈ D_S} ln(1 + e^{−(y_{ui}(θ) − y_{uj}(θ))}) Rendle, S., et al. (2009). BPR: Bayesian personalized ranking from implicit feedback. In Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence. AUAI Press.

Matrix Factorization using Bayesian Personalized Ranking (BPRMF) Approximate the rating matrix by P × Qᵀ and minimize Σ_{(u,i,j)} ln(1 + exp(−P_u · (Q_i − Q_j))) + C₁||P||² + C₂||Q||²

Our Framework (recap) P1: OCCF. P2: lack of training data in the next grade. P3: courses' order. Next: the Two Stage Method for P2.

Recall of BPRMF In BPRMF, student u's latent features capture which courses have been taken by student u, and course c's latent features capture who takes course c; training pushes courses in I_u⁺ above courses in I_u⁻.

Problems Lack of training data in the next grade: former students have 1st-3rd grade and 4th-grade records, while current students have none for the 4th grade. Therefore, for current students, all 4th-grade courses fall into the set I_u⁻.

Problems Because student u has no positive training data in the 4th grade, the final model tends to acquire an unreasonable bias, ranking 4th-grade courses lower than 1st-3rd grade courses.

Two Stage Method - Motivation The latent features of two courses will be similar if they are taken by the same group of students.

Two Stage Method - Motivation Although BPRMF has a bias against 4th-grade courses, it still learns the types of courses. If a student often takes a certain type of course in the 1st-3rd grades, he may take the same type of course in the 4th grade.

Two Stage Method - Intuition First learn the types of the courses, then rank the types of courses each current student likes. Notation: P is the latent matrix for students; Q is the latent matrix for courses.

First Stage Train on the former students' records to get the courses' latent features Q: min_{P,Q} Σ_{(u,i,j)} ln(1 + exp(−P_u · (Q_i − Q_j))) + C||P||²

Second Stage Fix Q and train on the current students' records to get the current students' latent features: min_P Σ_{(u,i,j)} ln(1 + exp(−P_u · (Q_i − Q_j))) + C||P||²

Review of the Two Stage Method First stage: use all the training course records (former students) to learn P and Q. Second stage: keep Q fixed and use the course records of the current students only to learn their P.
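The two stages can be sketched compactly, assuming the pairwise preference triples (u, i, j) have already been sampled from the records; the rank, learning rate, regularization and epoch count are illustrative assumptions:

```python
import numpy as np

def bpr_epoch(P, Q, triples, alpha=0.05, C=0.01, update_q=True):
    """One SGD pass over (u, i, j) triples for the BPR log-loss.
    update_q=False freezes the course factors (second stage)."""
    for u, i, j in triples:
        x = P[u] @ (Q[i] - Q[j])
        sig = -1.0 / (1.0 + np.exp(x))       # derivative of ln(1 + e^{-x})
        pu = P[u].copy()
        P[u] -= alpha * (sig * (Q[i] - Q[j]) + C * P[u])
        if update_q:
            Q[i] -= alpha * (sig * pu + C * Q[i])
            Q[j] -= alpha * (-sig * pu + C * Q[j])

def two_stage(former_triples, current_triples, n_former, n_current,
              n_courses, k=4, epochs=300, seed=0):
    rng = np.random.default_rng(seed)
    Q = 0.1 * rng.standard_normal((n_courses, k))
    # Stage 1: learn the course factors Q from the former students' records.
    P1 = 0.1 * rng.standard_normal((n_former, k))
    for _ in range(epochs):
        bpr_epoch(P1, Q, former_triples)
    # Stage 2: freeze Q; learn the current students' factors from their
    # grade 1-3 records only, so 4th-grade courses never appear as negatives.
    P2 = 0.1 * rng.standard_normal((n_current, k))
    for _ in range(epochs):
        bpr_epoch(P2, Q, current_triples, update_q=False)
    return P2, Q        # recommend course c to student u by P2[u] @ Q[c]
```

Because Q is frozen in the second stage, a 4th-grade course co-taken with a current student's grade 1-3 courses keeps its learned "type" and can still score highly.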

Our Framework (recap) P1: OCCF. P2: lack of training data in the next grade. P3: courses' order. Next: Course Network Regularization for P3.

Review of the Two Stage Method First stage: learn the types of the courses. Second stage: rank the types of courses the student would like to take.

Idea Observation: some courses are very likely to be taken right after certain other courses in the consecutive year, and such course pairs often belong to the same type. Idea: let these courses have similar latent features.

Recommender with Social Regularization Idea: the behaviors of neighbors might be similar. (Figure: example friend pairs whose behaviors are more likely / less likely to be similar.)

Use a regularization term to make linked latent features similar Individual-based regularization: (β/2) Σ_{i ∈ nodes} Σ_{f ∈ N(i)} link(i, f) ||Q_i − Q_f||², where N(i) is the set of neighbors of node i and link(i, f) is the weight between node i and node f. Ma, H., et al. (2011). Recommender systems with social regularization. In Proceedings of the fourth ACM international conference on Web search and data mining. ACM.

Objective Function Σ_{(u,i,j)} ln(1 + exp(−P_u · (Q_i − Q_j))) + C₁||P||² + C₂||Q||² + (β/2) Σ_i Σ_{j ∈ N_threshold(i)} follow(i, j) ||Q_i − Q_j||², where follow(i, j) is the weight of the link between i and j, and N_threshold(i) is the set of j with follow(i, j) > threshold (pruning; without pruning, all neighbors are used).

Example Name / 1st grade / 2nd grade / 3rd grade: Alice: A B / C / D. Bob: A / C D / B. Christine: A D / - / -. Follow(A, B) = (# B courses taken in the year after course A) / (# courses taken in the year after course A) = 1/4 = 0.25
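One way to compute these follow weights from raw records; the dictionary layout and the toy data below are assumptions for illustration (the slide's Alice/Bob/Christine count depends on exactly which year pairs are included):

```python
from collections import defaultdict

def follow_weights(records):
    """Course-network link weights (sketch).

    records: {student: {year: set_of_courses}} with integer years.
    follow(a, b) = (# times b is taken in the year right after a)
                 / (# courses taken in the year right after a).
    Returns {(a, b): weight} for every observed pair.
    """
    after = defaultdict(int)   # (a, b) -> count of b one year after a
    total = defaultdict(int)   # a -> count of any course one year after a
    for years in records.values():
        for year, courses in years.items():
            nxt = years.get(year + 1, set())
            for a in courses:
                total[a] += len(nxt)
                for b in nxt:
                    after[(a, b)] += 1
    return {(a, b): c / total[a] for (a, b), c in after.items()}
```

Pruning then keeps only the pairs whose weight exceeds the chosen threshold.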

Final Course Network of the simple small university (Figure: the network built from the same records.) Name / 1st grade / 2nd grade / 3rd grade: Alice: A B / C / D. Bob: A / C D / B. Christine: A D / - / -.

Combining the Course Network with the MF Model The pruned course network regularizes the course factor matrix Q in the factorization R ≈ P × Qᵀ.

Stochastic Gradient Descent Target function: Σ_{(u,i,j) ∈ D_S} ln(1 + exp(−P_u · (Q_i − Q_j))) + C₁||P||² + C₂||Q||² + (β/2) Σ_i Σ_{n ∈ N_threshold(i)} follow(i, n) ||Q_i − Q_n||² For each update: sig = −e^{−Σ_k P_{uk}(Q_{ik} − Q_{jk})} / (1 + e^{−Σ_k P_{uk}(Q_{ik} − Q_{jk})}) P_u := P_u − α (sig · (Q_i − Q_j) + C₁ P_u) Q_i := Q_i − α (sig · P_u + C₂ Q_i + β Σ_{n ∈ N_threshold(i)} (Q_i − Q_n) · follow(i, n)) Q_j := Q_j − α (sig · (−P_u) + C₂ Q_j + β Σ_{n ∈ N_threshold(j)} (Q_j − Q_n) · follow(j, n))
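These update rules transcribe almost directly into numpy. This is a single-triple step; the hyperparameter values and the toy link data in the test are hypothetical:

```python
import numpy as np

def bpr_net_update(P, Q, u, i, j, neighbors, follow, alpha=0.05,
                   c1=0.01, c2=0.01, beta=0.1):
    """One SGD step for the network-regularized BPR objective (sketch).

    neighbors[c] is N_threshold(c): courses whose follow weight from c
    survives the pruning threshold; follow[(c, n)] is the link weight.
    """
    x = P[u] @ (Q[i] - Q[j])
    sig = -np.exp(-x) / (1.0 + np.exp(-x))    # = -1 / (1 + e^x)
    pu = P[u].copy()
    # regularizer gradients, computed before the factors move
    reg_i = sum((Q[i] - Q[n]) * follow[(i, n)] for n in neighbors.get(i, []))
    reg_j = sum((Q[j] - Q[n]) * follow[(j, n)] for n in neighbors.get(j, []))
    P[u] -= alpha * (sig * (Q[i] - Q[j]) + c1 * P[u])
    Q[i] -= alpha * (sig * pu + c2 * Q[i] + beta * reg_i)
    Q[j] -= alpha * (-sig * pu + c2 * Q[j] + beta * reg_j)
```

Each step both widens the preference gap P_u · (Q_i − Q_j) and pulls a course's factors toward those of the courses that typically follow it.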

Personalized PageRank Parameters: α, and T, the set of courses the student has taken (e.g. the student has previously taken courses C, D and E). At each step, with probability α the walk moves to a neighboring course chosen with probability proportional to the link weight; with probability 1 − α it restarts at a course chosen uniformly from T.
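A power-iteration sketch of this walk over the course network; the graph layout, restart handling for dangling courses, and the α value are illustrative assumptions:

```python
def personalized_pagerank(links, taken, alpha=0.85, iters=100):
    """Personalized PageRank over the course network (sketch).

    links: {course: {neighbor: weight}} using the follow weights.
    taken: the restart set T, the courses the student has taken.
    With probability alpha the walk follows an outgoing link in
    proportion to its weight; with probability 1 - alpha it restarts
    uniformly in `taken`.
    """
    nodes = set(links) | {n for nb in links.values() for n in nb} | set(taken)
    rank = {c: 1.0 / len(nodes) for c in nodes}
    restart = {c: (1.0 / len(taken) if c in taken else 0.0) for c in nodes}
    for _ in range(iters):
        new = {c: (1 - alpha) * restart[c] for c in nodes}
        for c, nbrs in links.items():
            w = sum(nbrs.values())
            if w == 0:
                continue
            for n, wt in nbrs.items():
                new[n] += alpha * rank[c] * wt / w
        # mass sitting on dangling courses (no outgoing links) also restarts
        dangling = sum(rank[c] for c in nodes if not links.get(c))
        for c in nodes:
            new[c] += alpha * dangling * restart[c]
        rank = new
    return rank
```

The stationary scores then rank the candidate courses for that student.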

Overview of our Methods From the (student, course, time) records we build two models. CF-based model: apply BPR with the Two Stage Method and the course-network regularization. Graph-based model: construct the course network and perform personalized PageRank on it.

Ensemble We combine the CF-based model and the graph-based model with a linear RankSVM.

Data Set Description We use the triples <student_id, course_id, time (year-semester)>. Year / # Students / # Course Actions: 民國97 (2008): 4736 / 311283. 民國98 (2009): 4686 / 299772. 民國99 (2010): 4555 / 285561.

Training, Validation and Testing Sets Each student's records split into grades 1-3 and grade 4. We use the 4th-grade students of 民國98 (2009) as the validation set to choose the parameters, and test our model on the 4th-grade course records of NTU students in 民國99 (2010).

Evaluation We report the average AUC (Area Under the Receiver Operating Characteristic curve) over students. Given a student's ranked course list, e.g. XOXXOOOXXOOOXXO, where O means the student does take the course and X means the student doesn't, the ROC curve plots the true positive rate against the false positive rate as we walk down the ranking.
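The pairwise AUC of one student's ranked list can be computed in a single pass; applied to the slide's illustrative string (best-ranked first, O = taken):

```python
def ranking_auc(ranked_labels):
    """AUC of a ranked list: the fraction of (positive, negative) pairs
    in which the positive is ranked above the negative.

    ranked_labels: iterable of booleans, best-ranked first;
    True = the student actually took the course.
    """
    pos_seen = 0
    correct = 0
    n_pos = n_neg = 0
    for took in ranked_labels:
        if took:
            pos_seen += 1
            n_pos += 1
        else:
            # every positive seen so far outranks this negative
            correct += pos_seen
            n_neg += 1
    return correct / (n_pos * n_neg)
```

Averaging this quantity over all test students gives the reported metric.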

Memory-Based Collaborative Filtering Let A_s be the set of courses student a takes in semester s, and B_s the corresponding set for student b. 1. Number of Intersections: sim(a, b) = Σ_{s ∈ semesters} |A_s ∩ B_s|. 2. Jaccard similarity: sim(a, b) = Σ_{s ∈ semesters} |A_s ∩ B_s| / |A_s ∪ B_s|. 3. TFIDF similarity: sim(a, b) = Σ_{s ∈ semesters} Σ_{c ∈ A_s ∩ B_s} TFIDF(c).
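The three measures, assuming per-semester course sets stored as dictionaries; the idf table in the test is hypothetical (one natural choice would be log(#students / #students who took c)):

```python
def sim_intersection(a, b):
    # a, b: {semester: set_of_courses} for two students
    return sum(len(a[s] & b[s]) for s in a if s in b)

def sim_jaccard(a, b):
    return sum(len(a[s] & b[s]) / len(a[s] | b[s])
               for s in a if s in b and (a[s] | b[s]))

def sim_tfidf(a, b, idf):
    # TF is 1 per course record, so only the idf weight remains
    return sum(idf[c] for s in a if s in b for c in a[s] & b[s])
```

All three plug directly into the modified memory-based CF score as sim(u, s).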

Memory-Based CF Results Methods / AUC: Most Popular: 0.817208. Normal CF: 0.744253. Number of Intersections: 0.867813. Jaccard: 0.892154. TFIDF: 0.932024. TFIDF_Whole: 0.958880 (✗ invalid: it can peek at the testing set). Jaccard_Whole: (value not shown).

Results of the Model-Based Methods Methods / AUC: One Stage BPRMF: 0.936563. Two Stage BPRMF: 0.940351. Two Stage BPRMF with Course Network: 0.942707. P-values (target vs. baseline): Two Stage BPRMF vs. One Stage BPRMF: 3.24 × 10⁻¹². Two Stage BPRMF with Course Network: 0.013456 (baseline not shown).

Personalized PageRank and Ensemble Methods / AUC: Personalized PageRank: 0.933267. Ensemble: 0.970878.

Results - AUC with 95% Confidence Intervals Methods / AUC ± 95% CI: Most Popular: 0.817208 ± 0.004643. Normal CF: 0.744253 ± 0.006043. Number of Intersections Similarity: 0.867813 ± 0.004437. Jaccard Similarity: 0.892154 ± 0.003905. TFIDF Similarity: 0.932024 ± 0.003572. Personalized PageRank: 0.933267 ± 0.003637. BPRMF (one stage): 0.936563 ± 0.003016. BPRMF (two stage): 0.940351 ± 0.003187. BPRMF with course network prior (one stage): 0.940483 ± 0.002990. BPRMF with course network prior (two stage): 0.942707 ± 0.003021. Ensemble: 0.970878 ± 0.001013.

Conclusion and Future Work We propose an accurate privacy-preserving course recommendation system. Course records can be viewed as a recommendation problem with seasonal products, so our method has the potential to be applied to other data sets of this kind.

Thank you for listening