Modeling the Internet and the Web School of Information and Computer Science University of California, Irvine E-Commerce.



Outline
  Introduction
  Customer Data on the Web
  Automated Recommender Systems
  Networks and Recommendations
  Web Path Analysis for Purchase Prediction

Introduction
  Some motivating questions
  – Can we design algorithms that recommend new products to visitors based on their browsing behavior?
  – Can we better understand the factors that influence how customers make purchases on a website?
  – Can we predict in real time who will make a purchase, based on observed navigation patterns?

Customer Data on the Web
  Data is collected on the client side, on the server side, and anywhere in between
  Goal: determine who is purchasing which products
  Tracking customer data
  – Web logs, e-commerce logs, cookies, explicit login
  – The data is then used to provide personalized content to site users, in order to:
      Assist customers in locating their target selections
      “Encourage” customers to make certain selections

Automated Recommender Systems
  The problem is framed in two ways
  – Users ‘vote’ for pages/items (binary)
  – Users rank pages/items (multivalued)
  Results are captured in a users x items matrix that is generally sparse
  Complication: a missing vote is not neutral, because users tend not to vote on items they do not like (Breese et al. 1998)
  – Ignored by most recommender systems

Automated Recommender Systems
  [Figure slide; the image is not preserved in this transcript]

Evaluating Recommender Systems
  Cautions in data interpretation
  – Users may purchase items regardless of recommendations
  – Users may also avoid purchases they might have made based on recommendations
  Approaches to recommender algorithms
  – Nearest-neighbor
  – Model-based collaborative filtering
  – Others?

Nearest-Neighbor Collaborative Filtering
  Basic principle: use a user’s vote history to predict future votes/recommendations
  – Find the users most similar to the target user in the training matrix and fill in the target user’s missing vote values based on these “nearest neighbors”
  A typical normalized prediction scheme
  – Goal: predict the vote for item ‘j’ from the votes of other users, weighting most heavily those whose past votes are similar to those of target user ‘a’
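The prediction formula itself did not survive in this transcript. A common normalized form, given in Breese et al. (1998) and assumed here, is $p_{a,j} = \bar{v}_a + \kappa \sum_i w(a,i)\,(v_{i,j} - \bar{v}_i)$, where $\kappa$ makes the absolute weights sum to one. The Python sketch below is one way to read that scheme, using Pearson correlation as the weight $w(a,i)$. The function names and the dictionary layout (item to vote) are illustrative assumptions, not the authors' code.

```python
from math import sqrt

def mean_vote(votes):
    """Average of the votes a user has actually cast (votes: item -> value)."""
    return sum(votes.values()) / len(votes)

def pearson_weight(a, b):
    """Pearson correlation over items both users voted on (0.0 if undefined)."""
    common = set(a) & set(b)
    if len(common) < 2:
        return 0.0
    mean_a, mean_b = mean_vote(a), mean_vote(b)
    num = sum((a[j] - mean_a) * (b[j] - mean_b) for j in common)
    den = sqrt(sum((a[j] - mean_a) ** 2 for j in common) *
               sum((b[j] - mean_b) ** 2 for j in common))
    return num / den if den else 0.0

def predict_vote(target, others, item):
    """Target user's mean vote plus the weighted deviations of the neighbors."""
    neighbors = [u for u in others if item in u]
    weights = [pearson_weight(target, u) for u in neighbors]
    norm = sum(abs(w) for w in weights)  # 1 / kappa in the formula above
    if norm == 0:
        return mean_vote(target)  # no usable neighbors: fall back to the mean
    deviation = sum(w * (u[item] - mean_vote(u))
                    for w, u in zip(weights, neighbors))
    return mean_vote(target) + deviation / norm
```

With a single positively correlated neighbor, the prediction is simply the target's mean vote shifted by how far that neighbor's vote on the item sits above or below the neighbor's own mean.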

Nearest-Neighbor Collaborative Filtering
  Another challenge: defining weights
  – Which weight calculation is best?
      Requires fine-tuning the weighting algorithm for the particular data set
  – What do we do when the target user has not voted enough to provide a reliable set of nearest neighbors?
      One approach: use default votes (popular items) to populate the matrix on items that neither the target user nor the nearest neighbor has voted on
      A different approach: model-based prediction using Dirichlet priors to smooth the votes (see chapter 7)
      Other factors include relative vote counts for all items between users, thresholding, clustering (see Sarwar, 2000)

Nearest-Neighbor Collaborative Filtering
  Structure-based recommendations
  – Recommendations based on similarities between items with positive votes (as opposed to the votes of other users)
  – The structure of item dependencies is modeled through dimensionality reduction via singular value decomposition (SVD), also known as latent semantic indexing (see chapter 4)
      Approximate the set of row-vector votes as a linear combination of basis column-vectors
      – i.e. find the set of columns that minimizes, in the least-squares sense, the difference between the row estimates and their true values
      Perform nearest-neighbor calculations to project predictions for all items
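As a rough illustration of the least-squares idea, the sketch below computes only the best rank-1 approximation of a small vote matrix, using power iteration in place of a full SVD routine. It assumes a dense, non-negative matrix in which missing votes have already been filled (for example with zeros). The function name is hypothetical.

```python
from math import sqrt

def rank_one_approximation(votes, iterations=100):
    """Best rank-1 least-squares fit to `votes` (a list of equal-length rows)."""
    n_items = len(votes[0])
    v = [1.0 / sqrt(n_items)] * n_items  # initial direction in item space
    for _ in range(iterations):
        # u = A v (one score per user), then w = A^T u (back to item space)
        u = [sum(row[j] * v[j] for j in range(n_items)) for row in votes]
        w = [sum(votes[i][j] * u[i] for i in range(len(votes)))
             for j in range(n_items)]
        norm = sqrt(sum(x * x for x in w))
        if norm == 0:
            break  # all-zero matrix: nothing to approximate
        v = [x / norm for x in w]
    # Project every row onto the leading direction v: A v v^T
    u = [sum(row[j] * v[j] for j in range(n_items)) for row in votes]
    return [[u_i * v_j for v_j in v] for u_i in u]
```

Each row of the result is that user's vote vector projected onto the single dominant item direction. A practical system would keep several singular vectors and run the nearest-neighbor step in that reduced space.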

Model-Based Collaborative Filtering
  Recommendations come from a model of the relationships between items, learned from historical voting patterns in the training set
  – Better performance than nearest-neighbor analysis
  Joint distribution modeling
  – Uses one model as the basis for all predictions
  Conditional distribution modeling
  – One model per item, predicting the future vote on that item from the votes on each of the other items

Model-Based Collaborative Filtering
  Joint distribution modeling: a practical approach
  – Model the joint distribution as a finite mixture of simpler distributions
  – A further simplification assumes that, within a component, votes are independent of one another
      Limitation: assumes that each user can be described by a single one of the ‘K’ mixture components
  – Hofmann and Puzicha (1999) propose a workaround in which each row of votes may represent up to ‘K’ mixture components, rather than a single component
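To make the mixture idea concrete, here is a minimal sketch of prediction with an already-fitted mixture over binary votes, assuming votes are independent within each component. The parameter layout (`weights` for the component probabilities, `theta[k][item]` for the probability of a positive vote in component k) and the function names are assumptions for illustration. Fitting the parameters, typically by EM, is not shown.

```python
def component_posterior(observed, weights, theta):
    """P(component k | observed votes), assuming independence within a component."""
    scores = []
    for pi_k, theta_k in zip(weights, theta):
        likelihood = pi_k
        for item, vote in observed.items():
            p = theta_k[item]
            likelihood *= p if vote == 1 else (1.0 - p)
        scores.append(likelihood)
    total = sum(scores)  # assumes at least one component explains the votes
    return [s / total for s in scores]

def predict_positive_vote(item, observed, weights, theta):
    """P(vote on `item` is positive | observed votes) under the mixture."""
    posterior = component_posterior(observed, weights, theta)
    return sum(post_k * theta_k[item]
               for post_k, theta_k in zip(posterior, theta))
```

For example, with two equally likely components that vote positively on item "x" with probability 0.9 and 0.1, a user who voted for "x" is assigned to the first component with posterior 0.9, and predictions for other items are averaged accordingly.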

Model-Based Collaborative Filtering
  – Another limitation: all predictions are based on the (static) training set
  Conditional distribution modeling
  – Better results from creating a model for each item conditioned on the others, rather than using a single joint density model
  – Decision trees
      Heckerman et al. (2000)
      Greedy approach to approximate the tree structure
      Predictions are made for each item not purchased or visited
  Performance
  – Accuracy nearly equal to Bayesian networks
  – Offline memory usage significantly less than Bayesian networks
  – Offline computation time complexity better than Bayesian networks

Model-Based Combining of Votes and Content
  Combine content-specific information with other information (e.g. structure, votes)
  – Useful for determining item similarity (Mooney and Roy 2000) and creating user models
  – Useful when there is no vote history
  – Implementation (Popescul et al 2000)
      Extension of (Hofmann and Puzicha 1999)
      The joint density is determined by assuming a hidden latent variable z that makes users u, documents d, and words w conditionally independent, i.e. $P(u, d, w) = \sum_z P(z)\, P(u \mid z)\, P(d \mid z)\, P(w \mid z)$

Model-Based Combining of Votes and Content
  The hidden variable represents multiple (hidden) topics of a document
  Conditional probabilities of the hidden variable are estimated using EM
  Sparsity remains a problem for content-based modeling

Challenges
  Noisy data
  – The same user may use multiple IP addresses/logins
  – Different users may use the same IP address/login
  Privacy
  – No cookies!
  Changing user habits
  – Previous history may not accurately predict present purchase selection
  Continuous updating of user activities

Networks & Recommendation
  Word-of-mouth
  – Needs little explicit advertising
  – Products are recommended to friends, family, co-workers, etc.
  – This is the primary form of advertising behind the growth of Google

Product Recommendation
  Hotmail
  – Very little direct advertising in the beginning
  – Launched in July 1996
      [...],000 subscribers after a month
      100,000 subscribers after 3 months
      1,000,000 subscribers after 6 months
      12,000,000 subscribers after 18 months
      By April 2002 Hotmail had 110 million subscribers

Product Recommendation
  What was Hotmail’s primary form of advertising?
  – A small link to the sign-up page at the bottom of every email sent by a subscriber
      ‘Spreading Activation’
      Implicit recommendation

Spreading Activation
  Network effects
  – Even if only a small fraction of the people who receive the message subscribe (~0.1%), the service will spread rapidly
  – This can be contrasted with the current practice of spam
      Spam is not sent by friends, family, or co-workers
      No implicit recommendation
      Spam is often viewed as not providing a good service

Modeling Spreading Activation
  Diffusion model
  – Montgomery (2002)
      Applied models from the marketing literature (Bass, 1969) to the Hotmail phenomenon
      Similar word-of-mouth networks are used in selling consumer electronics such as refrigerators and televisions
      We want to predict, at time t, how many individuals k(t) will have adopted the product out of a population of N possible adopters

Modeling Spreading Activation
  Diffusion model
  – Two ways individuals will subscribe
      Direct advertising
      – At time t, N - k(t) individuals have not subscribed
      – A fraction α ≥ 0 of these individuals will subscribe due to direct advertising
      Word-of-mouth
      – At time t, there are k(t)(N - k(t)) possible connections between subscribers and non-subscribers
      – A fraction β ≥ 0 of these connections will cause a non-subscriber to subscribe

Modeling Spreading Activation
  Combine these and we get the following expression:
      $\frac{dk(t)}{dt} = \alpha\,(N - k(t)) + \beta\,k(t)\,(N - k(t))$
  Solve this with $k(0) = 0$ and we get:
      $k(t) = N\,\frac{1 - e^{-(\alpha + \beta N)\,t}}{1 + \frac{\beta N}{\alpha}\,e^{-(\alpha + \beta N)\,t}}$
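The diffusion model can be checked numerically. The sketch below assumes that the growth rate is the sum of the direct-advertising and word-of-mouth terms described on the preceding slides, that nobody has subscribed at time zero, and that α > 0. It evaluates the resulting closed-form curve and compares it against a simple forward-Euler integration. The function names and the parameter values in the usage note are illustrative, not fitted to the Hotmail data.

```python
from math import exp

def adopters_closed_form(t, n, alpha, beta):
    """k(t) for dk/dt = alpha*(N - k) + beta*k*(N - k), k(0) = 0, alpha > 0."""
    rate = alpha + beta * n
    decay = exp(-rate * t)
    return n * (1.0 - decay) / (1.0 + (beta * n / alpha) * decay)

def adopters_euler(t, n, alpha, beta, steps=100_000):
    """The same curve by forward-Euler integration, as a numerical cross-check."""
    k, dt = 0.0, t / steps
    for _ in range(steps):
        k += dt * (alpha * (n - k) + beta * k * (n - k))
    return k
```

Calling both functions with, say, `n=1000, alpha=0.01, beta=0.0005` at `t=10` should give closely matching values, and the closed form approaches `n` as `t` grows. That matches the S-shaped adoption curve the model is meant to describe.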

Modeling Spreading Activation
  [Two figure slides; the plots are not preserved in this transcript]

Modeling Spreading Activation
  Diffusion model
  – It does not completely model what actually occurred
  – However, it is simple and provides a lot of interesting (useful) information
  – Other work
      Domingos & Richardson (2001): Markov random field model
      Daley & Gani (1999): various deterministic and stochastic models

Purchase Prediction
  We want to predict whether or not a shopper will make a purchase
  – We know demographics
  – We know page view patterns
  – Can we accurately predict whether the user will make a purchase?

Purchase Prediction
  Li et al. (2002)
  – Studied 1160 shoppers at [...] between April 1 and April 30, [...]
  – The data was collected on the client side, so they knew exactly which pages were displayed to the user
  – They also knew the demographics (predominantly well-educated and affluent)

Purchase Prediction
  Li et al. (2002)
  – There were 14,512 page views, which they divided into 1659 sessions
      Page views per session:
      – Mean: 8.75
      – Median: 5
      – Standard deviation: 16.4
      – Min: 1
      – Max: 570
      7% of sessions contained a purchase

Purchase Prediction
  Li et al. (2002)
  – Divided the pages into 8 classes
      Home (H): main page
      Account (A): account information pages
      List (L): pages with lists of items
      Product (P): page with a single item
      Information (I): informational pages (shipping etc.)
      Shopping cart (S)
      Order (O): indicates a completed order
      Entry or Exit (E): entering or leaving the site

Purchase Prediction
  Li et al. (2002)
  – Each session was represented by a string of the form:
      I H H I I L I I E
  – A session containing an O is considered to include a purchase
  – The average length of a session with a purchase was 34.5, versus only 6.8 for a session without one

Purchase Prediction
  Markov transition matrix
  – For sessions with no purchase
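The matrix shown on this slide is not preserved in the transcript. As a sketch of how such a matrix could be estimated, the code below counts first-order transitions between page classes in space-separated session strings and normalizes each row. It assumes the single-letter encoding described on the earlier slides. The function names are hypothetical, and this is not necessarily the estimation procedure used by Li et al.

```python
from collections import Counter, defaultdict

def contains_purchase(session):
    """A session counts as a purchase if it includes an Order (O) page."""
    return "O" in session.split()

def transition_matrix(sessions):
    """Row-normalized first-order transition probabilities between page classes."""
    counts = defaultdict(Counter)
    for session in sessions:
        pages = session.split()
        for current, following in zip(pages, pages[1:]):
            counts[current][following] += 1
    return {
        current: {nxt: c / sum(row.values()) for nxt, c in row.items()}
        for current, row in counts.items()
    }

def no_purchase_matrix(sessions):
    """Transition matrix estimated only from sessions without a purchase."""
    return transition_matrix([s for s in sessions if not contains_purchase(s)])
```

For the example session string `I H H I I L I I E`, two of the five transitions out of an Information page lead to another Information page, so the estimated probability of I to I is 0.4.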

Purchase Prediction
  Li et al. (2002)
  – They built several models based on this data
      Tested on predicting the next page and on predicting a purchase
      The best models were 64% accurate at predicting the next page
      After 2 page views, the best models predicted 12% true positives and 5.3% false positives
      After 6 page views, 13.1% true positives and 2.9% false positives