WEB PERSONALIZATION
NLP Course Seminar, Group 14
Vishaal Jatav (04d05013), Varun Garg (04d05015)

Roadmap
- Motivation
- Introduction
- The Personalization Process
- Personalization Approaches
- Personalization Techniques
- Issues
- Conclusion

Motivation
- Some facts
  - Overwhelming amount of information on the web
  - Not all documents are relevant to the user
  - Users cannot convey their information needs
  - Users never find any document 100% relevant
- Users expect more personal behavior
  - I don't want results for Delhi when I am in Bombay.
  - I was looking for crane (the bird), not crane (the machine).

Google Customization

Google (without personalization)

Google (with personalization)

Google Search History

Introduction
- Personalization
  - React differently to different users
  - The system reacts the way the user wants it to
  - Ultimately, bring the user back to the system
- Web personalization
  - Apply machine learning and data mining
  - Build models of user behavior (called profiles)
  - Predict the user's needs and expectations
  - Adaptively estimate better models

The Personalization Process
- Consider the following pieces of information
  - Geographical location
  - Age, gender, ethnicity, religion, etc.
  - Interests
  - Previous reviews of products
- How could these pieces of information help?
- How can this information be collected?

The Personalization Process (Contd.)
- Collect a lot of information on user behavior
  - Information must be attributable to a single user
- Decide on a user model
  - Featuring user needs, lifestyle, situations, etc.
- Create a user profile for each user of the system
  - The profile captures the individuality of the user
    - Habits, browsing behavior, lifestyle, etc.
- With every interaction, modify the user profile

The Personalization Process: More Formally
- The web is a collection of $n$ items, $I = \{i_1, i_2, \ldots, i_n\}$
- Each user comes from a set $U = \{u_1, u_2, \ldots, u_m\}$
- User $u_k$ rates items through a function $r_{u_k} : I \to [0,1] \cup \{\bot\}$
  - where $r_{u_k}(i_j) = \bot$ means $i_j$ is not rated by the user
- $I_k^{(u)}$ is the set of items not yet rated by user $u_k$
- $I_k^{(r)}$ is the set of items rated by user $u_k$
- GOAL: recommend to the active user $u_a$ those items $i_j \in I_a^{(u)}$ that might be of interest to them
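
The formal setup can be made concrete with a small sketch. The item names, user names, and rating values below are hypothetical toy data, and `None` is assumed to stand in for the "not rated" symbol; the two helper functions are illustrative names, not part of the original slides.

```python
from typing import Dict, Optional, Set

# Hypothetical toy data: None plays the role of the "not rated" symbol.
ITEMS = ["i1", "i2", "i3", "i4"]
RATINGS: Dict[str, Dict[str, Optional[float]]] = {
    "u1": {"i1": 0.9, "i2": None, "i3": 0.4, "i4": None},
    "u2": {"i1": None, "i2": 0.7, "i3": None, "i4": 0.2},
}


def rated_items(user: str) -> Set[str]:
    """Items the user has rated (the set written I^(r) on the slide)."""
    return {i for i in ITEMS if RATINGS[user].get(i) is not None}


def unrated_items(user: str) -> Set[str]:
    """Items the user has not rated yet (the set written I^(u) on the slide)."""
    return set(ITEMS) - rated_items(user)
```

Any recommender in the following slides then amounts to choosing and ordering elements of `unrated_items(active_user)`.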

Classification of Personalization Approaches
- Individual vs. Collaborative
- Reactive vs. Proactive
- User vs. Item Information

Classification of Personalization Approaches: Individual vs. Collaborative
- Individual approach (Google Personalized Search)
  - Uses only the individual user's data
  - Generates the user profile by analyzing
    - The user's browsing behavior
    - The user's active feedback on the system
  - Advantage
    - Can be implemented on the client side, so no privacy violation
  - Disadvantage
    - Based only on past interactions, so it lacks serendipity

Classification of Personalization Approaches: Individual vs. Collaborative (Contd.)
- Collaborative approach (Amazon recommendations)
  - Find the neighborhood of the active user
  - React according to an assumption
    - If A is like B, then B likes the same things that A likes
  - Disadvantages
    - New item rating problem
    - New user problem
  - Advantage
    - Better than the individual approach, once the two problems are solved

Classification of Personalization Approaches: Reactive vs. Proactive
- Reactive approach
  - Explicitly ask the user for preferences
    - Either in the form of a query or as feedback
- Proactive approach
  - Learn user preferences from user behavior
    - No explicit preferences are demanded from the user
  - Behavior is extracted from
    - Click-through rates
    - Navigational patterns

Classification of Personalization Approaches: User vs. Item Information
- User information
  - Geographic location (from IP address)
  - Age, gender, marital status, etc. (explicit query)
  - Lifestyle, etc. (inference from past behavior)
- Item information
  - Content of topics: movie genre, etc.
  - Product/domain ontology

Personalization Techniques
- Content-based filtering
- Collaborative filtering
- Model-based personalization
  - Rule based
  - Graph theoretic
  - Language model

Content-Based Filtering
- Syskill and Webert uses explicit feedback
  - Individual, reactive, item information
  - Uses naïve Bayes to distinguish likes from dislikes
  - Initial probabilities are updated with new interactions
  - Uses the 128 most informative words from each item
- Letizia uses implicit feedback
  - Individual, proactive, item information
  - Finds likes/dislikes based on tf-idf similarity
- Others use nearest-neighbor methods for similarity
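
As an illustration of the tf-idf similarity idea mentioned for content-based filtering, the following is a minimal sketch and not the actual implementation of Letizia or any other system. The weighting scheme (term frequency times log inverse document frequency), the function names, and the toy documents are all assumptions made for the example.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Return one {term: weight} dict per tokenized document."""
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is all zeros."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical data: one page the user liked and two candidate pages.
liked = "crane bird nest river".split()
candidates = [
    "crane bird migration wetland".split(),
    "crane machine construction lifting".split(),
]
vectors = tf_idf_vectors([liked] + candidates)
scores = [cosine(vectors[0], v) for v in vectors[1:]]
```

Because "crane" occurs in every document, its idf weight is zero here, so the bird page is ranked above the machine page purely on the remaining shared vocabulary.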

Collaborative Filtering
- Found successful in recommendation systems
- General technique
  - For every user, a user neighborhood is computed
    - The neighborhood contains users who have rated several items almost equally
  - Get candidate items for recommendation
    - Items seen by the neighborhood but not by the active user $u_a$
- Data is stored in the form of a rating matrix
  - Items as rows and users as columns

Collaborative Filtering (Contd.)
- The system must provide the following algorithms
  - Measuring similarity between users
    - For creation of the neighborhood
    - Pearson and Spearman correlation, cosine similarity, etc.
  - Predicting the rank of an item not rated by the user
    - To decide the order in which these items will be presented
    - Weighted sum of ranks is the most common choice
  - Selecting a neighborhood subset for prediction
    - To reduce the large amount of computation
    - A threshold on the similarity value is the most common choice
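
The three algorithms listed above can be sketched together in a few lines. This is a simplified illustration under stated assumptions: Pearson correlation as the similarity measure, a fixed similarity threshold to select the neighborhood, and a similarity-weighted average as the prediction. The rating values and function names are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation over items both users rated; 0.0 if undefined."""
    common = [i for i in a if i in b]
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    num = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    den = math.sqrt(sum((a[i] - mean_a) ** 2 for i in common)) * math.sqrt(
        sum((b[i] - mean_b) ** 2 for i in common)
    )
    return num / den if den else 0.0


def predict(ratings, active, item, threshold=0.0):
    """Similarity-weighted average over neighbors above the threshold."""
    num = den = 0.0
    for other, theirs in ratings.items():
        if other == active or item not in theirs:
            continue
        sim = pearson(ratings[active], theirs)
        if sim > threshold:  # neighborhood selection by similarity threshold
            num += sim * theirs[item]
            den += sim
    return num / den if den else None


# Hypothetical ratings in [0, 1]; a missing key means "not rated".
ratings = {
    "ua": {"i1": 0.9, "i2": 0.2, "i3": 0.8},
    "ub": {"i1": 1.0, "i2": 0.1, "i3": 0.9, "i4": 0.7},
    "uc": {"i1": 0.1, "i2": 0.9, "i3": 0.2, "i4": 0.1},
}
prediction = predict(ratings, "ua", "i4")
```

Here `uc` correlates negatively with `ua` and is excluded by the threshold, so the prediction for `i4` comes from `ub` alone.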

Model Based Personalization Approaches
- Executed in two stages
  - Offline process: creates the actual model
  - Online process: uses the model and the interaction
- Common data used for model generation
  - Web usage data (web history, click-through rates, etc.)
  - The item's structure and content data
- Examples
  - Rule-based models
  - Graph-theoretic models
  - Language models

Model Based Personalization: Rule Based Models
- Association rule-based
  - Item $i_a$ is in unordered association with $i_b$
  - If the user considers $i_b$, then $i_a$ is a good recommendation
- Sequence rule-based
  - Item $i_a$ is in sequential association with $i_b$
  - If the user considers $i_a$, then $i_b$ is a good recommendation
- Associations between items can be stored as a dependency graph
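
A rough sketch of the association-rule case follows. It assumes that rules are mined from co-occurrence within user sessions and kept when their confidence exceeds a cutoff; the session data, the `min_confidence` value, and the function name are hypothetical. The result is a dictionary mapping each item to its recommended items, which is one simple way to store the dependency graph.

```python
from collections import Counter
from itertools import combinations


def association_rules(sessions, min_confidence=0.6):
    """Map each item x to a list of (y, confidence) with P(y | x) >= cutoff."""
    item_count = Counter()
    pair_count = Counter()
    for session in sessions:
        items = set(session)
        item_count.update(items)
        pair_count.update(combinations(sorted(items), 2))
    rules = {}
    for (a, b), together in pair_count.items():
        # Unordered association: test the rule in both directions.
        for x, y in ((a, b), (b, a)):
            confidence = together / item_count[x]
            if confidence >= min_confidence:
                rules.setdefault(x, []).append((y, confidence))
    return rules


# Hypothetical sessions: each list is the set of items one user considered.
sessions = [
    ["camera", "sd_card"],
    ["camera", "sd_card", "tripod"],
    ["camera", "tripod"],
    ["sd_card"],
]
rules = association_rules(sessions)
```

Sequence rules would additionally respect the order of items within a session, which this sketch deliberately ignores.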

Model Based Personalization: Graph Theoretic Model
- Ratings data is transformed into a directed graph
  - Nodes are users
  - An edge between $u_i$ and $u_j$ means that $u_i$ predicts $u_j$
  - Weights on edges represent the predictability
- To predict whether an item $i_k$ will be of interest to $u_i$
  - Calculate the shortest path from $u_i$ to any user $u_r$
    - where $u_r$ has rated $i_k$
  - The predicted rating is calculated as a function of the path between $u_i$ and $u_r$
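
The shortest-path step can be illustrated with Dijkstra's algorithm. This is only a sketch: it assumes edge weights have already been converted into costs where lower means more predictable, and it stops at the closest user who has rated the item. How the rating is then computed from the path is not specified on the slide, so that step is left out. The graph, ratings, and function name are hypothetical.

```python
import heapq


def nearest_rater(graph, ratings, start, item):
    """Dijkstra from `start`; return (cost, path) to the closest user who rated `item`."""
    queue = [(0.0, start, [start])]
    seen = set()
    while queue:
        cost, user, path = heapq.heappop(queue)
        if user in seen:
            continue
        seen.add(user)
        if user != start and item in ratings.get(user, {}):
            return cost, path
        for neighbor, weight in graph.get(user, {}).items():
            if neighbor not in seen:
                heapq.heappush(queue, (cost + weight, neighbor, path + [neighbor]))
    return None


# Hypothetical directed graph: graph[u][v] is the cost of the edge u -> v.
graph = {"u1": {"u2": 1.0, "u3": 4.0}, "u2": {"u4": 1.0}, "u3": {}, "u4": {}}
ratings = {"u3": {"i9": 0.4}, "u4": {"i9": 0.8}}
result = nearest_rater(graph, ratings, "u1", "i9")
```

In this toy graph the two-hop route through `u2` is cheaper than the direct edge to `u3`, so `u4` is selected as the rater.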

Model Based Personalization: Language Modeling Approaches
- Without using the user's relevance feedback
  - Simple language modeling
- Using the user's relevance feedback
  - N-gram based methods
  - Noisy channel model based method

Language Model Approach: Simple Language Modeling
- Without using the user's feedback
- History consists of all the words in the past queries
- Learn the user profile as $\{(w_1, P(w_1)), \ldots, (w_n, P(w_n))\}$, where [definition of $P(w_i)$ not preserved in this transcript]
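
The slide's definition of $P(w)$ did not survive in the transcript. One common choice, assumed here purely for illustration, is the relative frequency of each word across the user's past queries. The function name and the query history are hypothetical.

```python
from collections import Counter


def learn_profile(past_queries):
    """Return {word: P(word)} using relative frequency over all query words."""
    words = [w for query in past_queries for w in query.lower().split()]
    counts = Counter(words)
    total = len(words)
    return {w: c / total for w, c in counts.items()}


# Hypothetical query history of one user.
profile = learn_profile(["crane bird", "bird migration", "crane nest"])
```

The resulting dictionary has the same shape as the profile on the slide: one probability per word seen in the history.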

Language Model Approach: Simple Language Modeling
- Sample user profile

Language Model Approach: Simple Language Modeling
- Re-ranking of unpersonalized results
  - Re-ranking is done according to $P(Q \mid D, u)$
  - $\alpha$ is a weighting parameter between 0 and 1
  - UP is the user profile

Language Model Approach: N-gram Based Approach
- Using the user's relevance feedback
- Learn the user profile
  - Let $H_u$ represent the search history of user $u$
    - $H_u = \{(q_1, rf_1), (q_2, rf_2), (q_3, rf_3), \ldots, (q_n, rf_n)\}$
  - Unigram
    - The user profile now consists of $\{(w_1, P(w_1)), (w_2, P(w_2)), (w_3, P(w_3)), \ldots, (w_n, P(w_n))\}$

Language Model Approach: N-gram Based Approach
- Sample unigram user profile

Language Model Approach: N-gram Based Approach
- Bigram
  - The user profile consists of $\{(w_1 w_2, P(w_2 \mid w_1)), (w_2 w_3, P(w_3 \mid w_2)), \ldots, (w_{n-1} w_n, P(w_n \mid w_{n-1}))\}$

Language Model Approach: N-gram Based Approach
- Sample bigram user profile

Language Model Approach: N-gram Based Approach
- Re-ranking unpersonalized results
  - Based on unigrams ($\alpha$ = weighting parameter)
    - $Q = q_1 q_2 q_3 \ldots q_n$
    - $P(q_1 q_2 q_3 \ldots q_n) = P(q_1)\, P(q_2)\, P(q_3) \cdots P(q_n)$

Language Model Approach: N-gram Based Approach
- Re-ranking unpersonalized results
  - Based on bigrams
    - $Q = q_1 q_2 q_3 \ldots q_n$
    - $P(q_1 q_2 q_3 \ldots q_n) = P(q_2 \mid q_1)\, P(q_3 \mid q_2) \cdots P(q_n \mid q_{n-1})$
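
The unigram and bigram query probabilities can be sketched as follows. The profiles are estimated here by plain relative frequency without smoothing, which is an assumption for the example; the history texts and function names are hypothetical. The bigram profile stores $P(w_2 \mid w_1)$ for each adjacent word pair, matching the profile definition given earlier, and the $\alpha$ interpolation used for re-ranking is not modeled.

```python
import math
from collections import Counter


def unigram_profile(texts):
    """Return {w: P(w)} by relative frequency."""
    words = [w for t in texts for w in t.lower().split()]
    total = len(words)
    return {w: c / total for w, c in Counter(words).items()}


def bigram_profile(texts):
    """Return {(w1, w2): P(w2 | w1)} from adjacent word pairs."""
    pair_count, first_count = Counter(), Counter()
    for t in texts:
        words = t.lower().split()
        for a, b in zip(words, words[1:]):
            pair_count[(a, b)] += 1
            first_count[a] += 1
    return {(a, b): c / first_count[a] for (a, b), c in pair_count.items()}


def unigram_query_prob(query, profile):
    """P(q_1 ... q_n) as a product of unigram probabilities (0.0 if unseen)."""
    return math.prod(profile.get(q, 0.0) for q in query.lower().split())


def bigram_query_prob(query, profile):
    """Product of P(q_i | q_{i-1}) over adjacent query words (0.0 if unseen)."""
    words = query.lower().split()
    return math.prod(profile.get(pair, 0.0) for pair in zip(words, words[1:]))


# Hypothetical history text for one user.
history = ["cheap flights bombay", "cheap hotels bombay", "cheap flights delhi"]
uni = unigram_profile(history)
bi = bigram_profile(history)
p_uni = unigram_query_prob("cheap flights", uni)
p_bi = bigram_query_prob("cheap flights bombay", bi)
```

Without smoothing, a single unseen word or word pair drives the whole product to zero, so a practical system would need some form of smoothing or back-off.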

Language Model Approach: Noisy Channel Based Approach
- Uses the user's feedback (implicit)
- User history is represented as
  - $H_i = (Q_1, D_1), (Q_2, D_2), \ldots, (Q_N, D_N)$
  - $D_i$ is the document visited for $Q_i$
  - $D$ consists of words $w_1, w_2, \ldots, w_m$
- Basic idea: statistical machine translation
  - Given parallel text of languages $S$ and $T$
  - We get $P(t_i \mid s_i)$ for all $s_i \in S$ and $t_i \in T$
  - Using EM we get the optimized model $P(T \mid S)$

Language Model Approach: Noisy Channel Based Approach
- Similarly
  - $T$ = past queries $Q_1, Q_2, \ldots, Q_K$
  - $S$ = text of the relevant documents for the queries in $T$
  - We learn the model $P(Q \mid D)$, or more precisely $P(q_i \mid w_j)$
- Assumption
  - The ideal (information-containing) document is translated into a query
  - Document: a verbose language
  - Query: a compact language
- The user profile is stored as
  - Tuples
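
To make the EM step concrete, here is a minimal sketch in the style of IBM Model 1, treating document words as the source language and query words as the target. Whether the cited work uses exactly this model is an assumption; the sketch omits the NULL word and any smoothing, and the training pairs and function name are hypothetical.

```python
from collections import defaultdict


def train_translation_model(pairs, iterations=10):
    """pairs: list of (query_words, doc_words). Returns t[(q, w)], an estimate of P(q | w)."""
    query_vocab = {q for query, _ in pairs for q in query}
    uniform = 1.0 / len(query_vocab) if query_vocab else 0.0
    t = defaultdict(lambda: uniform)  # uniform initialization
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        # E-step: distribute each query word over the document words.
        for query, doc in pairs:
            for q in query:
                norm = sum(t[(q, w)] for w in doc)
                for w in doc:
                    fraction = t[(q, w)] / norm
                    count[(q, w)] += fraction
                    total[w] += fraction
        # M-step: renormalize so that the sum over q of P(q | w) is 1 for each w.
        t = defaultdict(float, {(q, w): c / total[w] for (q, w), c in count.items()})
    return t


# Hypothetical (query, visited document) pairs from one user's history.
pairs = [
    (["jaguar", "habitat"], ["cat", "forest"]),
    (["jaguar"], ["cat"]),
]
model = train_translation_model(pairs)
```

The second pair disambiguates the first: because "jaguar" co-occurs with "cat" alone, EM shifts probability mass so that "habitat" is explained mostly by "forest".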

Language Model Approach: Noisy Channel Based Approach
- Sample noisy channel user profile

Language Model Approach: Noisy Channel Based Approach
- Re-ranking
  - Re-rank the documents using $P(Q \mid D, u)$
  - $\alpha$ = weighting parameter
  - $P(q_i \mid GE)$ is the lexical probability of $q_i$

Issues in Personalization
- Cold start problem (new user problem)
- Latency problem (new item problem)
- Data sparseness
- Scalability
- Privacy
- Recommendation list diversity
- Robustness

Conclusion
- Web personalization is the need of the hour for e-businesses
- A relatively new research topic
  - Several issues are yet to be solved effectively
- Data should be collected without invading user privacy
- Creating user models effectively and scaling them to a large number of users/items is at the core of personalization

Bibliography
- Rohini U, Vamshi Ambati and Vasudeva Varma. Statistical Machine Translation Models for Personalized Search. In Proceedings of the 3rd International Joint Conference on Natural Language Processing (IJCNLP 2008), January 7-12, 2008, Hyderabad, India.
- Sarabjot S. Anand and Bamshad Mobasher. Intelligent Techniques for Web Personalization. In Intelligent Techniques for Web Personalization. Springer.
- Vasudeva Varma. Personalization in Information Retrieval, Extraction and Access. In Workshop on Ontology, NLP, Personalization and IE/IR, IIT Bombay, Mumbai, July.
- Snapshots from Google Inc.

Questions