1 Cross-Selling with Collaborative Filtering Qiang Yang HKUST Thanks: Sonny Chee

2 Motivation Question: a user has already bought some products; which other products should we recommend? Collaborative Filtering (CF) automates the “circle of advisors”.

3 Collaborative Filtering “..people collaborate to help one another perform filtering by recording their reactions...” (Tapestry) Finds users whose taste is similar to yours and uses them to make recommendations. Complementary to IR/IF: IR/IF finds similar documents; CF finds similar users.

4 Example Which movie would Sammy watch next? Ratings are on a 1–5 scale (the slide's rating table is not preserved in the transcript). If we just use the average rating of the other users who voted on these movies, we get Matrix = 3 and Titanic = 14/4 = 3.5. Recommend Titanic! But is this reasonable?

5 Types of Collaborative Filtering Algorithms
Collaborative filters: statistical collaborative filters; probabilistic collaborative filters [PHL00]; Bayesian filters [BP99][BHK98]; association rules [Agrawal, Han].
Open problems: sparsity, first rater, scalability.

6 Statistical Collaborative Filters Users annotate items with numeric ratings. Users who rate items “similarly” become mutual advisors. Recommendation computed by taking a weighted aggregate of advisor ratings.

7 Basic Idea: Nearest Neighbor Algorithm Given a user a and an item i: first, find the users most similar to a; call this set Y. Second, find how the users in Y rated i. Then calculate a predicted rating of a on i from some average over Y. How do we calculate the similarity and the average?

8 Statistical Filters GroupLens [Resnick et al 94, MIT] Filters UseNet News postings Similarity: Pearson correlation Prediction: Weighted deviation from mean

9 Pearson Correlation
w(a,u) = Σ_{i∈I} (r_{a,i} − r̄_a)(r_{u,i} − r̄_u) / √( Σ_{i∈I} (r_{a,i} − r̄_a)² · Σ_{i∈I} (r_{u,i} − r̄_u)² )

10 Pearson Correlation w(a,u) is the weight between users a and u. Compute a similarity matrix between users using the Pearson correlation, which ranges over [−1, 1]. Let I be the set of items that both users rated.

11 Prediction Generation Predicts how much a user a likes an item i. Generate predictions using the weighted deviation from the mean:
P_{a,i} = r̄_a + Σ_u w(a,u) · (r_{u,i} − r̄_u) / Σ_u |w(a,u)|    (1)
where the denominator is the sum of all weights.

12 Error Estimation Mean Absolute Error (MAE) for user a over m test items: MAE_a = (1/m) Σ_i |P_{a,i} − r_{a,i}|, together with the standard deviation of those errors.
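
As a concrete illustration of slides 10–12, here is a minimal Python sketch of Pearson-weighted prediction and MAE. The rating data and all names (ratings, predict, mae) are invented for illustration, not taken from the slides:

    import math

    # Hypothetical toy ratings (not the slides' data): ratings[user][item].
    ratings = {
        "Sammy":  {"Matrix": 3, "Shrek": 5},
        "Dylan":  {"Matrix": 3, "Titanic": 4, "Shrek": 4},
        "Mathew": {"Matrix": 2, "Titanic": 3, "Shrek": 2},
    }

    def mean(u):
        vals = ratings[u].values()
        return sum(vals) / len(vals)

    def pearson(a, u):
        # w(a, u): Pearson correlation over the items both users rated (slide 10).
        common = set(ratings[a]) & set(ratings[u])
        ma, mu = mean(a), mean(u)
        num = sum((ratings[a][i] - ma) * (ratings[u][i] - mu) for i in common)
        da = math.sqrt(sum((ratings[a][i] - ma) ** 2 for i in common))
        du = math.sqrt(sum((ratings[u][i] - mu) ** 2 for i in common))
        return num / (da * du) if da and du else 0.0

    def predict(a, item):
        # P(a, i) = mean of a plus weighted deviation from the mean, Equation (1).
        advisors = [u for u in ratings if u != a and item in ratings[u]]
        num = sum(pearson(a, u) * (ratings[u][item] - mean(u)) for u in advisors)
        den = sum(abs(pearson(a, u)) for u in advisors)
        return mean(a) + (num / den if den else 0.0)

    def mae(a, held_out):
        # Mean absolute error for user a over held-out {item: true rating} pairs (slide 12).
        errors = [abs(predict(a, i) - r) for i, r in held_out.items()]
        return sum(errors) / len(errors)

    print(predict("Sammy", "Titanic"))  # predicted rating for an unseen movie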

13 Example [Pairwise user-correlation table for Sammy, Dylan, and Mathew; flattened in the transcript. One pair has correlation = 0.83.]

14 Statistical Collaborative Filters Ringo [Shardanand and Maes 95, MIT] recommends music albums; each user buys certain music artists’ CDs. Base case: weighted average. Prediction schemes: mean squared difference (first compute the dissimilarity between pairs of users, then find all users Y with dissimilarity less than a threshold L, and compute the weighted average of those users’ ratings); Pearson correlation (Equation 1); constrained Pearson correlation (Equation 1 with a weighted average over similar users only, corr > L).

15 Open Problems in CF “Sparsity Problem”: CFs have poor accuracy and coverage in comparison to population averages at low rating density [GSK+99]. “First Rater Problem”: the first person to rate an item receives no benefit; CF depends upon altruism [AZ97].

16 Open Problems in CF “Scalability Problem”: CF is computationally expensive; the fastest published algorithms (nearest-neighbor) are O(n²). Is there any indexing method for speeding this up? The problem has received relatively little attention.

17 References P. Domingos and M. Richardson. Mining the Network Value of Customers. In Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA: ACM Press.

18 Motivation Direct marketing ignores the network value of customers. [Figure: a marketed customer with low expected profit affects, under the network effect, other customers with high expected profit.]

19 Some Successful Cases Hotmail: grew from 0 to 12 million users in 18 months; every outgoing message included a promotional URL. ICQ: expanded quickly; when it first appeared, users became addicted to it and depended on it to contact their friends.

20 Introduction Incorporate the network value in maximizing the expected profit. Social networks are modeled as a Markov random field. Probability to buy = desirability of the item + influence from others. Goal: maximize the expected profit.

21 Focus Making practical use of the network value in recommendation. Although the algorithm may be used in other applications, the focus is NOT a generic algorithm.

22 Assumptions A customer's (buying) decision can be affected by other customers' ratings. Market to people who are inclined to see the film. One will not continue to use the system if one does not find its recommendations useful (natural elimination assumption).

23 Modeling View the market as a social network and model the social network as a Markov random field. What is a Markov random field? A probability model over a set of interdependent random variables [e.g. P(x, y, z)] in which each variable's outcome depends on its neighbors'.
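
A toy sketch of this neighbor dependence, deliberately simpler than the paper's actual random-field equations: a customer's buying probability mixes intrinsic desirability with the fraction of neighbors who already bought. The parameter beta and all names here are assumptions for illustration:

    def p_buy(i, bought, neighbors, desirability, beta=0.5):
        # P(X_i = 1): blend the item's intrinsic desirability for customer i
        # with the "influence from others" term (fraction of buying neighbors).
        if not neighbors[i]:
            return desirability[i]
        influence = sum(bought[j] for j in neighbors[i]) / len(neighbors[i])
        return beta * desirability[i] + (1 - beta) * influence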

24 Variable definitions X = {X_1, …, X_n}: a set of n potential customers; X_i = 1 (buy), X_i = 0 (not buy). X^k: the known values; X^u: the unknown values. N_i = {X_{i,1}, …, X_{i,n}}: the neighbors of X_i. Y = {Y_1, …, Y_m}: a set of attributes describing the product. M = {M_1, …, M_n}: a set of market actions, one per customer.

25 Example (the set Y) Using EachMovie as an example: X_i: whether person i saw the movie; Y: the movie genre; R_i: person i's rating of the movie. This instance sets Y to the movie genre; different problems can set different Y.

26 Goal of modeling Find the market action M for each customer that achieves the best profit. The profit measure is the expected lift in profit (ELP):
ELP_i(X^k, Y, M) = r_1 · P(X_i = 1 | X^k, Y, f_i^1(M)) − r_0 · P(X_i = 1 | X^k, Y, f_i^0(M)) − c
where r_1 is the revenue with the market action, r_0 the revenue without it, c the cost of the action, and f_i^1(M) (resp. f_i^0(M)) denotes M with M_i set to 1 (resp. 0).
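
The ELP formula translates almost directly into code. In this sketch, prob_buy is an assumed callable standing in for the model's estimate of P(X_i = 1 | X^k, Y, M):

    def elp_i(i, M, prob_buy, r1, r0, c):
        # Expected lift in profit of marketing to customer i (slide 26).
        M1 = dict(M); M1[i] = 1   # f_i^1(M): force the action on customer i
        M0 = dict(M); M0[i] = 0   # f_i^0(M): withhold the action
        return r1 * prob_buy(i, M1) - r0 * prob_buy(i, M0) - c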

27 Three search algorithms Single pass, greedy search, and hill-climbing search; a code sketch of all three follows the hill-climbing example below.

28 Scenarios Customers {A, B, C, D}. A will buy the product only if someone suggests it AND there is a discount (M = 1). C and D will buy if someone suggests it OR there is a discount (M = 1). B will never buy the product. [Diagram: influence links over C, A, D, B labeled “discount / suggest” and “discount + suggest”, with the best action marked.]

29 Single pass For each i, set M_i = 1 if ELP(X^k, Y, f_i^1(M^0)) > 0, and set M_i = 0 otherwise. Advantage: fast, one pass only. Disadvantage: a market action applied to a later customer may affect an earlier customer's decision, but such effects are ignored.

30 Single Pass Example Processing A, B, C, D in order:
M = {0,0,0,0}: ELP(X^k, Y, f_A^1(M^0)) ≤ 0
M = {0,0,0,0}: ELP(X^k, Y, f_B^1(M^0)) ≤ 0
M = {0,0,0,0}: ELP(X^k, Y, f_C^1(M^0)) > 0
M = {0,0,1,0}: ELP(X^k, Y, f_D^1(M^0)) > 0
M = {0,0,1,1}: done.

31 Greedy Algorithm Set M = M^0. Loop through the M_i's, setting each M_i to 1 if ELP(X^k, Y, f_i^1(M)) > ELP(X^k, Y, M). Continue until there are no changes in M. Advantage: later changes to the M_i's can affect earlier ones. Disadvantage: much more computation time; several scans are needed.

32 Greedy Example Processing A, B, C, D:
M^0 = {0,0,0,0}: initial
M = {0,0,1,1}: after the first pass
M = {1,0,1,1}: after the second pass
M = {1,0,1,1}: done.

33 Hill-climbing search Set M = M^0. Set M_{i_1} = 1, where i_1 = argmax_i {ELP(X^k, Y, f_i^1(M))}. Repeat: let i = argmax_i {ELP(X^k, Y, f_i^1(M))} and set M_i = 1, until no i yields a larger ELP. Advantage: at each step the single best M_i is selected. Disadvantage: the most expensive algorithm.

34 Hill Climbing Example Processing A, B, C, D:
M = {0,0,0,0}: initial
M = {0,0,1,0}: after the first pass
M = {1,0,1,0}: after the second pass
M = {1,0,1,0}: done (the best).
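
A sketch of all three searches over M (slides 29–33), assuming a callable elp_gain(i, M) that returns ELP(X^k, Y, f_i^1(M)) − ELP(X^k, Y, M), i.e. the lift of flipping M_i on while the rest of M is held fixed:

    def single_pass(customers, elp_gain):
        # One pass over the initial all-zero M0; later flips never
        # revisit earlier decisions (slide 29).
        M0 = {j: 0 for j in customers}
        return {i: 1 if elp_gain(i, M0) > 0 else 0 for i in customers}

    def greedy(customers, elp_gain):
        # Keep re-scanning, flipping any M_i whose flip improves ELP,
        # until a full scan changes nothing (slide 31).
        M = {i: 0 for i in customers}
        changed = True
        while changed:
            changed = False
            for i in customers:
                if M[i] == 0 and elp_gain(i, M) > 0:
                    M[i] = 1
                    changed = True
        return M

    def hill_climb(customers, elp_gain):
        # Each round, flip on the single M_i with the largest gain; stop
        # when no flip improves ELP (slide 33). The costliest of the three.
        M = {i: 0 for i in customers}
        while True:
            off = [i for i in customers if M[i] == 0]
            if not off:
                return M
            best = max(off, key=lambda i: elp_gain(i, M))
            if elp_gain(best, M) <= 0:
                return M
            M[best] = 1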

35 Who Are the Neighbors? Mine the social network by using collaborative filtering (CFinSC), using the Pearson correlation coefficient to calculate similarity. The CFinSC results are used to build the social network, from which ELP and M can be computed.

36 Who are the neighbors? Calculate the weight of every customer with the slide's weighting equation (a Pearson-correlation-based weight, per slide 35). [Equation image not preserved in the transcript.]

37 Neighbors’ Ratings for the Product Calculate the rating of each neighbor with the slide's rating equation; if neighbor j did not rate item k, R_{jk} is set to the mean of R_j. [Equation image not preserved in the transcript.]

38 Estimating the Probabilities P(X_i): estimated from the items rated by user i. P(Y_k | X_i): obtained by counting the number of occurrences of each value of Y_k with each value of X_i. P(M_i | X_i): select users at random, apply the market action to them, and record the effect (if such data are not available, use prior knowledge to judge).

39 Preprocessing Zero-mean the ratings. Prune people whose ratings cover too few movies (fewer than 10). Require a non-zero standard deviation in ratings. Penalize the Pearson correlation coefficient when two users rate very few movies in common. Remove all movies viewed by less than 1% of the people.

40 Experiment Setup Data: EachMovie. The training set and test set are split temporally: older ratings form the training set and newer ratings the test set. [Timeline figure flattened in the transcript; the original marks the dates 1/96, 9/96, 12/96, and 9/97 on rating and release axes.]

41 Experiment Setup (cont.) Target: compare the 3 methods of searching for an optimized marketing action against the baseline (direct marketing).

42 Experiment Results [Results quoted directly from the paper; table not preserved in the transcript.]

43 Experiment Results (cont.) The proposed algorithms are much better than direct marketing: hill-climbing is slightly better than greedy, which is much better than single-pass, which in turn is much better than direct marketing. The higher α, the better the results!

44 Item Selection By “Hub-Authority” Profit Ranking ACM KDD 2002 Ke Wang, Ming-Yen Thomas Su, Simon Fraser University

45 Ranking in an Inter-related World Web pages, social networks, cross-selling.

46 $1.5 $10 $3 $0.5 $5 $3 $15 $2 $8 100% 30% 50% 35% 60% Item Ranking with Cross- selling Effect What are the most profitable items?

47 The Hub/Authority Modeling Hubs i: “introductory” for sales of other items j (i → j). Authorities j: “necessary” for sales of other items i (i → j). Solution: model the mutual reinforcement of hub and authority weights through the links. Challenges: incorporate the items' individual profits and the strength of the links, and ensure the hub/authority weights converge.

48 Selecting the Most Profitable Items Size-constrained selection: given a size s, find the s items that produce the most profit as a whole; solution: select the s items at the top of the ranking. Cost-constrained selection: given the cost of selecting each item, find a collection of items that produces the most profit as a whole; solution: the same as above when the cost is uniform.

49 Solution to cost-constrained selection [Figure: estimated profit and selection cost plotted against the number of items selected, with the optimal cutoff marked.]

50 Web Page Ranking Algorithm: HITS (Hyperlink-Induced Topic Search) A mutually reinforcing relationship. Hub weight: h(i) = Σ a(j), over all pages j such that i has a link to j. Authority weight: a(i) = Σ h(j), over all pages j that have a link to i. a and h converge if normalized before each iteration.
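
A minimal power-iteration sketch of HITS; the link structure and names are illustrative, not from the slides:

    import math

    def hits(links, iters=50):
        # links maps each page to the pages it links to.
        pages = set(links) | {j for targets in links.values() for j in targets}
        a = {p: 1.0 for p in pages}
        h = {p: 1.0 for p in pages}
        for _ in range(iters):
            # a(i) = sum of h(j) over pages j that link to i
            a = {p: sum(h[q] for q in pages if p in links.get(q, ())) for p in pages}
            # h(i) = sum of a(j) over pages j that i links to
            h = {p: sum(a[q] for q in links.get(p, ())) for p in pages}
            for w in (a, h):   # normalize so the weights converge
                norm = math.sqrt(sum(x * x for x in w.values())) or 1.0
                for k in w:
                    w[k] /= norm
        return a, h

    authority, hub = hits({"p1": ["p2", "p3"], "p2": ["p3"], "p3": []})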

51 The Cross-Selling Graph Find frequent items and frequent 2-itemsets. Create a link i → j if conf(i → j) is above a specified value (i and j may be the same item). “Quality” of link i → j: prof(i) · conf(i → j); intuitively, this is the credit given to j due to its influence on i.

52 Computing Weights in HAP For each iteration: authority weights a(i) = Σ_{j → i} prof(j) · conf(j → i) · h(j); hub weights h(i) = Σ_{i → j} prof(i) · conf(i → j) · a(j). Cross-selling matrix B: B[i, j] = prof(i) · conf(i → j) for a link i → j, and B[i, j] = 0 if there is no link i → j (i.e. (i, j) is not a frequent set). Compute the weights iteratively or use eigen-analysis. Rank items by their authority weights.
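
A sketch of the HAP iteration over a cross-selling matrix B stored as nested dicts; the per-round L2 normalization is an assumption to keep the weights bounded (scaling does not change the ranking):

    import math

    def hap_rank(B, iters=50):
        # B[i][j] = prof(i) * conf(i -> j); missing entries mean no link.
        items = list(B)
        a = {i: 1.0 for i in items}
        h = {i: 1.0 for i in items}
        for _ in range(iters):
            # a(i) = sum over links j -> i of prof(j) * conf(j -> i) * h(j)
            a = {i: sum(B[j].get(i, 0.0) * h[j] for j in items) for i in items}
            # h(i) = sum over links i -> j of prof(i) * conf(i -> j) * a(j)
            h = {i: sum(B[i].get(j, 0.0) * a[j] for j in items) for i in items}
            for w in (a, h):
                norm = math.sqrt(sum(x * x for x in w.values())) or 1.0
                for k in w:
                    w[k] /= norm
        # Rank items by authority weight (slide 52).
        return sorted(items, key=lambda i: a[i], reverse=True)

    # Using the slide-53 entries that survived the transcript (the Z -> Y
    # entry was lost, so it is omitted here):
    B = {"X": {"Y": 1.0, "Z": 4.0},
         "Y": {"X": 0.06, "Z": 0.5},
         "Z": {"X": 0.02}}
    print(hap_rank(B))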

53 Example Given frequent items X, Y, and Z with prof(X) = $5, prof(Y) = $1, prof(Z) = $0.1 and confidences conf(X → Y) = 0.2, conf(Y → X) = 0.06, conf(X → Z) = 0.8, conf(Z → X) = 0.2, conf(Y → Z) = 0.5, conf(Z → Y) = (value not preserved), we get the cross-selling matrix B, e.g. B[X, Y] = prof(X) · conf(X → Y) = 5 × 0.2 = 1.0.

54 Example (cont.) With prof(X) = $5, prof(Y) = $1, prof(Z) = $0.1, the authority weights are a(X) = 0.767, a(Y) = 0.166, and a(Z) = (value not preserved). The HAP ranking differs from ranking by individual profit: the cross-selling effect increases the profitability of Z.

55 Empirical Study Experiments were conducted on two datasets, comparing 3 selection methods: HAP, PROFSET [4, 5], and Naïve. HAP generates the highest estimated profit in most cases.

56 Empirical Study
                      Drug Store      Synthetic
Transactions          193,995         10,000
Items                 26,128          1,000
Avg. trans. length    (not preserved) (not preserved)
Total profit          $1,006,970      $317,579
minsupp               0.1% / 0.05%    0.5% / 0.1%
Freq. items           (not preserved) (not preserved)
Freq. pairs           (not preserved) (not preserved)

57 Experiment Results [Results chart not preserved in the transcript.] *PROFSET [4]