Personalized Web Search by Mapping User Queries to Categories Fang Liu Presented by Jing Zhang CS491CXZ February 26, 2004.

Presentation transcript:

Background: Different users issue the same keywords with different meanings; for example, “apple” can be a fruit or a computer. The category hierarchy does not fit on one screen. Users are too impatient to locate a category in the hierarchy before submitting a query.

Central problem: How can web search be personalized by mapping user queries to categories?

Key ideas of this paper: Build profiles (both a user profile and a general profile) from search history. Deduce the appropriate categories from the user's profile. Associate query keywords with categories. Return the top 3 categories to the user each time.
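The idea of returning the top 3 categories can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the profile is stored as a mapping from category to term weights and that cosine similarity is the scoring function. The names `cosine`, `top_categories`, and the sample `profile` are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def top_categories(query_terms, profile, k=3):
    """Rank categories by similarity to the query and keep the best k."""
    q = {t: 1.0 for t in query_terms}  # assumption: binary query weights
    scored = [(cosine(q, terms), cat) for cat, terms in profile.items()]
    scored = [(s, c) for s, c in scored if s > 0]  # drop unrelated categories
    scored.sort(key=lambda sc: (-sc[0], sc[1]))
    return [c for _, c in scored[:k]]

# Hypothetical profile: category -> {term: weight}
profile = {
    "Computers/Hardware": {"apple": 0.8, "mac": 0.6},
    "Home/Cooking": {"apple": 0.3, "pie": 0.9},
    "Sports/Soccer": {"goal": 1.0},
}
print(top_categories(["apple"], profile))
```

For a user whose history is dominated by hardware pages, the ambiguous query "apple" ranks the hardware category first, which is the behavior the slide describes.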

Methods to map keywords to categories: Use both the user profile and the general profile; use the user profile only; use the general profile only.

Build user profiles (1): Tree representation of a search record

Build user profiles (2): Predefined; Input; Output

Build general profile: First two levels of the ODP category hierarchy (619 categories)

Algorithms to learn profiles: Linear Least Squares Fit (LLSF); Rocchio-based algorithm (bRocchio); k-Nearest Neighbor (kNN); adaptive learning (aRocchio)

LLSF: Singular Value Decomposition
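The slide's equations are not preserved in this transcript. As a hedged reconstruction of the standard least-squares formulation, assume DT is the m×n document-term matrix, DC is the m×p document-category matrix, and M is the p×n category-term matrix to be learned (DT and DC are named on the later slides; the dimensions are assumptions). LLSF then looks for the M that best maps documents to their categories, and the SVD of DT gives the solution in closed form:

```latex
M = \arg\min_{M} \left\lVert DT \cdot M^{T} - DC \right\rVert_{F}^{2},
\qquad
DT = U \Sigma V^{T}
\;\Longrightarrow\;
M = DC^{T} \, U \, \Sigma^{-1} \, V^{T}
```

Here $\Sigma^{-1}$ inverts only the nonzero singular values, so $V \Sigma^{-1} U^{T}$ is the pseudo-inverse of DT.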

Pseudo-LLSF (pLLSF)

Rocchio-based Algorithm (bRocchio): Here m is the number of documents in DT, N_i is the number of documents that are related to the i-th category, and M(i,j) is the average weight of the j-th term in all documents that are related to the i-th category.
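The definitions above imply M(i,j) = (1/N_i) · Σ_k DT(k,j) · DC(k,i), summed over the m documents. The following is a minimal sketch of that average, assuming DT holds term weights per document and DC holds 0/1 relatedness flags; the function name `rocchio_profile` and the sample matrices are hypothetical.

```python
def rocchio_profile(DT, DC):
    """Return M, where M[i][j] is the average weight of term j
    over the documents related to category i."""
    m = len(DT)            # number of documents
    n_terms = len(DT[0])
    n_cats = len(DC[0])
    M = [[0.0] * n_terms for _ in range(n_cats)]
    for i in range(n_cats):
        N_i = sum(DC[k][i] for k in range(m))  # documents related to category i
        if N_i == 0:
            continue  # no related documents: leave the row at zero
        for j in range(n_terms):
            M[i][j] = sum(DT[k][j] * DC[k][i] for k in range(m)) / N_i
    return M

# Hypothetical data: 3 documents, 2 terms, 2 categories
DT = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
DC = [[1, 0], [1, 1], [0, 1]]
M = rocchio_profile(DT, DC)
print(M)
```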

kNN: Here q is the query; c_j is the j-th category; d_i is a document among the k nearest neighbors of q and the i-th row vector in DT; Cos(q, d_i) is the cosine similarity between q and d_i; and DC(i,j) denotes whether d_i is related to the j-th category.
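The slide's formula is missing from the transcript. A minimal sketch, assuming the score of category c_j is the sum of Cos(q, d_i) · DC(i,j) over the k nearest neighbors of q, which is the usual kNN categorization rule and matches the symbols defined above. The names `cosine` and `knn_scores` and the sample data are hypothetical.

```python
import math

def cosine(q, d):
    """Cosine similarity between two dense term-weight vectors."""
    dot = sum(a * b for a, b in zip(q, d))
    nq = math.sqrt(sum(a * a for a in q))
    nd = math.sqrt(sum(b * b for b in d))
    return dot / (nq * nd) if nq and nd else 0.0

def knn_scores(q, DT, DC, k):
    """Score each category by summing the similarities of the
    k nearest documents that are related to it."""
    sims = sorted(((cosine(q, d), i) for i, d in enumerate(DT)),
                  key=lambda si: (-si[0], si[1]))[:k]
    n_cats = len(DC[0])
    return [sum(s * DC[i][j] for s, i in sims) for j in range(n_cats)]

# Hypothetical data: 3 documents, 2 terms, 2 categories
DT = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
DC = [[1, 0], [0, 1], [1, 1]]
scores = knn_scores([1.0, 0.0], DT, DC, k=2)
print(scores)
```

With k=2 the neighbors are the first and third documents, so the first category collects both similarities and the second collects only the third document's.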

Adaptive Learning (aRocchio)
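The update rule on this slide is not preserved in the transcript. One way to make the Rocchio average adaptive, assumed here rather than taken from the slide, is to keep a per-category document count and fold each new document into the running mean, so the profile never needs to be rebuilt from the full history. The name `arocchio_update` and the sample data are hypothetical.

```python
def arocchio_update(M, counts, doc_terms, doc_cats):
    """Fold one new document into the running per-category averages.
    M and counts are updated in place."""
    for i, related in enumerate(doc_cats):
        if not related:
            continue
        old_n = counts[i]
        new_n = old_n + 1
        for j, w in enumerate(doc_terms):
            # running mean: old average weighted by old count, plus new weight
            M[i][j] = (old_n * M[i][j] + w) / new_n
        counts[i] = new_n

# Hypothetical stream of (term weights, category flags) pairs
M = [[0.0, 0.0], [0.0, 0.0]]
counts = [0, 0]
stream = [([1.0, 0.0], [1, 0]), ([0.5, 0.5], [1, 1]), ([0.0, 1.0], [0, 1])]
for terms, cats in stream:
    arocchio_update(M, counts, terms, cats)
print(M, counts)
```

Under this assumption, processing the documents one at a time yields the same matrix as averaging them in a single batch.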

Data sets for the experiment

Performance Evaluation: Here n is the number of categories related to the query, score_ci is the score of a related category c_i that is ranked among the top 3, rank_ci is the rank of c_i, and ideal_rank_ci is the highest possible rank for c_i.
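The accuracy formula itself is missing from the transcript. The sketch below assumes score_ci = 1 / (1 + rank_ci − ideal_rank_ci) and accuracy = (sum of scores of related categories in the top 3) / n; both are assumptions that fit the symbols defined above, not a quotation of the slide. The function name `accuracy` and the sample data are hypothetical.

```python
def accuracy(returned, ideal_rank, top=3):
    """returned: ranked list of categories produced by the system.
    ideal_rank: dict mapping each related category to its ideal rank (1-based)."""
    n = len(ideal_rank)
    if n == 0:
        return 0.0
    total = 0.0
    for rank, cat in enumerate(returned[:top], start=1):
        if cat not in ideal_rank:
            continue  # unrelated categories contribute nothing
        gap = rank - ideal_rank[cat]
        if gap < 0:
            # assumption: a category is never ranked above its ideal rank
            raise ValueError("rank is better than the ideal rank")
        total += 1.0 / (1 + gap)
    return total / n

# Hypothetical query with two related categories, A (ideal rank 1) and B (ideal rank 2)
result = accuracy(["A", "X", "B"], {"A": 1, "B": 2})
print(result)
```

In this example A sits at its ideal rank and scores 1, B is one position too low and scores 1/2, giving an accuracy of 0.75.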

Experiment Results (1): Batch Learning Method

Experiment Results (2): Comparison of Mapping Methods

Experiment Results (3): Adaptive Learning (aRocchio)

Discussion: Why does user 1 have the lowest accuracy and user 3 the highest accuracy for the batch learning methods?