UCAIR Project Xuehua Shen, Bin Tan, ChengXiang Zhai


2 Outline Motivation Progress –Framework –Model –System –Evaluation Road ahead –Continuous work –New direction

3 Problem of Context-Independent Search: without context, a query such as "Jaguar" is ambiguous; it may refer to the car, the animal, Apple software (Mac OS X Jaguar), or chemistry software.

4 Put Search in Context. Other context information: dwelling time, mouse movement, clickthrough, query history. (Slide example of user interests: Apple software, hobby, ...)

5 Outline Motivation Progress –Framework –Model –System –Evaluation Road ahead –Continuous work –New direction

6 A Decision Theoretic Framework Model interactive IR as an "action dialog": cycles of user action and system response. User action → system response: submit a new query → retrieve new documents; view a document → rerank documents.

7 A Decision Theoretic Framework (cont.) Search for the optimal system response given a new user action.
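
In symbols, a hedged sketch of this decision rule (the exact form in the UCAIR papers may differ slightly); U is the user, A_t the observed actions, R_{t-1} the past responses, and M the user model, following the user-model slide below:

```latex
% Bayes-risk view: choose the response with minimum expected loss
r_t^{*} = \arg\min_{r \in \mathcal{R}} \int_{M} L(a_t, r, M)\; P\!\left(M \mid U, A_t, R_{t-1}\right)\, dM
```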

8 User Models Components of user model M –User information need –User viewed documents S –User actions A_t and system responses R_{t-1} –…

9 Loss Functions Loss function for result reranking Loss function for query expansion
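
As one illustration (an assumed form for exposition, not necessarily the project's exact loss), a reranking loss that weights each rank position by a decreasing weight w_i and measures the divergence of the document model from the inferred information-need model θ_U:

```latex
% Illustrative reranking loss over a ranking \pi of the unseen documents
L(\pi, M) \;=\; \sum_{i} w_i \, D\!\left(\theta_U \,\middle\|\, \theta_{d_{\pi(i)}}\right),
\qquad w_1 > w_2 > \cdots > 0
```

Under this assumption, minimizing the expected loss with a point estimate of θ_U reduces to ranking documents by increasing KL divergence from the updated user model, i.e., standard KL-divergence retrieval.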

10 Implicit User Modeling Update the user information need given a new query. Learn a better user model when the user skips the top n documents and views the (n+1)-th document.

11 Outline Motivation Progress –Framework –Model –System –Evaluation Road ahead –Continuous work –New direction

12 Four Contextual Language Models. The user issues a sequence of queries Q_1, Q_2, ..., Q_k (e.g., "Apple software"), and each earlier query Q_i has associated clickthrough C_i = {C_{i,1}, C_{i,2}, C_{i,3}, ...} (e.g., a click on "Apple - Mac OS X": "The Apple Mac OS X product page. Describes features in the current version of Mac OS X, a screenshot gallery, latest software downloads, and a directory of..."). Question: how to model and use all of this information to infer the user information need behind the current query Q_k?

13 Retrieval Model. Basis: unigram language models with KL-divergence ranking: the query model θ_{Q_k} and each document model θ_D are compared by a similarity measure to produce the results. Contextual search: the query model is updated using the user U's query history and clickthrough history.
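
A minimal sketch of this basis (the Dirichlet smoothing constant, function names, and whitespace tokenization are illustrative assumptions, not the UCAIR implementation):

```python
# Sketch of KL-divergence retrieval with unigram language models:
# score(Q, D) is rank-equivalent to -KL(theta_Q || theta_D), i.e. the
# cross-entropy sum_w p(w|theta_Q) log p(w|theta_D) (query entropy is constant).
import math
from collections import Counter

def dirichlet_doc_model(doc_tokens, collection_model, mu=2000.0):
    """Dirichlet-smoothed document language model p(w | theta_D)."""
    counts = Counter(doc_tokens)
    dlen = len(doc_tokens)
    return lambda w: (counts[w] + mu * collection_model.get(w, 1e-9)) / (dlen + mu)

def kl_score(query_model, doc_prob):
    """Cross-entropy part of -KL(theta_Q || theta_D)."""
    return sum(p_w * math.log(doc_prob(w)) for w, p_w in query_model.items())

def rank_documents(query_model, docs, collection_model):
    """docs: {doc_id: token list}; returns (doc_id, score) sorted by score."""
    scored = [(doc_id, kl_score(query_model, dirichlet_doc_model(toks, collection_model)))
              for doc_id, toks in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```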

14 Fixed Coefficient Interpolation (FixInt). Average the query history Q_1, ..., Q_{k-1} and the clickthrough history C_1, ..., C_{k-1} into history models; linearly interpolate the two history models; then linearly interpolate the current query Q_k with the combined history model.
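
A hedged sketch of the combination with fixed weights α and β (which history component β weights is an assumption here; the weights correspond to the α and β reported on the evaluation slide):

```latex
% History models: simple averages of past query and clickthrough models
p(w \mid \theta_{H_Q}) = \tfrac{1}{k-1}\sum_{i=1}^{k-1} p(w \mid \theta_{Q_i}),\qquad
p(w \mid \theta_{H_C}) = \tfrac{1}{k-1}\sum_{i=1}^{k-1} p(w \mid \theta_{C_i})

% FixInt: fixed-coefficient interpolation of current query and history
p(w \mid \theta_k) = \alpha\, p(w \mid \theta_{Q_k})
 + (1-\alpha)\left[\beta\, p(w \mid \theta_{H_C}) + (1-\beta)\, p(w \mid \theta_{H_Q})\right]
```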

15 Bayesian Interpolation (BayesInt). Average the query history Q_1, ..., Q_{k-1} and the clickthrough history C_1, ..., C_{k-1}, and use them as a Dirichlet prior on the current query Q_k. Intuition: if the current query Q_k is longer, we should trust Q_k more.
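
A hedged sketch of the Dirichlet-prior update, assuming μ and ν control the prior strength of the query history and the clickthrough history respectively (matching the μ, ν reported on the evaluation slides); c(w, Q_k) is the count of w in Q_k and |Q_k| the query length:

```latex
% BayesInt: history models act as a conjugate (Dirichlet) prior on the current query
p(w \mid \theta_k) = \frac{c(w, Q_k) + \mu\, p(w \mid \theta_{H_Q}) + \nu\, p(w \mid \theta_{H_C})}
                          {|Q_k| + \mu + \nu}
```

A longer Q_k increases |Q_k| and the counts c(w, Q_k), so the estimate leans more on the current query, which is exactly the intuition stated on the slide.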

16 Online Bayesian Update (OnlineUp). Intuition: continuously update the belief about the user information need as each new query (Q_1, Q_2, ..., Q_k) and each clickthrough (C_1, C_2, ...) arrives.

17 Batch Bayesian Update (BatchUp). Intuition: clickthrough data may not decay, so the clickthrough history C_1, C_2, ..., C_{k-1} is folded in as a batch along with the query history Q_1, Q_2, ..., Q_k.

18 Outline Motivation Progress –Framework –Model –System –Evaluation Road ahead –Continuous work –New direction

19 UCAIR Toolbar Architecture. The UCAIR client sits between the user (queries, result views, clickthrough, ...) and the search engine (e.g., Google). Components: a search history log (e.g., past queries, clicked results), user modeling, query modification, result re-ranking, and a result buffer.

20 System Characteristics Client-side personalization –Privacy –Distribution of computation –More clues about the user Implicit user modeling Bayesian decision theory and statistical language models

21 User Actions Submit a keyword query View a document Click the “Back” button Click the “Next” link

22 System Responses Decide the relatedness of neighboring queries and perform query expansion; update the user model according to clickthrough; rerank unseen documents (a minimal sketch of the last two steps follows).
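
For concreteness, a small client-side sketch of folding clicked-result snippets into the current query model and reranking unseen results against the updated model. Function names, whitespace tokenization, and the 0.5 interpolation weight are illustrative assumptions, not the UCAIR implementation:

```python
# Sketch: update a unigram user/query model from clickthrough snippets,
# then rerank unseen results by their overlap with that model.
from collections import Counter

def _normalize(counts):
    total = float(sum(counts.values())) or 1.0
    return {w: c / total for w, c in counts.items()}

def update_user_model(query_model, clicked_snippets, weight=0.5):
    """Interpolate the query model with evidence from clicked snippets."""
    click_model = _normalize(Counter(w for s in clicked_snippets for w in s.lower().split()))
    vocab = set(query_model) | set(click_model)
    return {w: (1 - weight) * query_model.get(w, 0.0) + weight * click_model.get(w, 0.0)
            for w in vocab}

def rerank_unseen(user_model, unseen_results):
    """unseen_results: list of (result_id, snippet); higher model overlap ranks first."""
    def score(snippet):
        tokens = snippet.lower().split()
        return sum(user_model.get(w, 0.0) for w in tokens) / (len(tokens) or 1)
    return sorted(unseen_results, key=lambda r: score(r[1]), reverse=True)
```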

23 Outline Motivation Progress –Framework –Model –System –Evaluation Road ahead –Continuous work –New direction

24 TREC Style Evaluation – Data Set Data collection: TREC AP88-90 Topics: 30 hard topics selected from the TREC topics System: search engine + RDBMS Context: query and clickthrough history of 3 participants

25 Experiment Design Models: FixInt, BayesInt, OnlineUp, and BatchUp Performance comparison: Q_k vs. Q_k + H_Q + H_C Evaluation metrics: MAP and precision at top-ranked documents
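
As a reference for the metrics, a generic sketch of precision@N and (mean) average precision computed from a ranked list and a relevance set; this is not the project's actual evaluation code:

```python
# Standard definitions of precision@N and (mean) average precision.
def precision_at(ranked_ids, relevant_ids, n):
    top = ranked_ids[:n]
    return sum(1 for d in top if d in relevant_ids) / float(n)

def average_precision(ranked_ids, relevant_ids):
    hits, precisions = 0, []
    for i, d in enumerate(ranked_ids, start=1):
        if d in relevant_ids:
            hits += 1
            precisions.append(hits / float(i))
    return sum(precisions) / len(relevant_ids) if relevant_ids else 0.0

def mean_average_precision(runs):
    """runs: list of (ranked_ids, relevant_ids) pairs, one per topic."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)
```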

26 Overall Effect of Search Context. Improvement of Q_k + H_Q + H_C over Q_k alone (MAP / precision at top documents):
–FixInt (α=0.1, β=1.0): Q_3: 72.4% / 32.6%; Q_4: 66.2% / 15.5%
–BayesInt (μ=0.2, ν=5.0): Q_3: 93.8% / 39.4%; Q_4: 78.2% / 19.9%
–OnlineUp (μ=5.0, ν=15.0): Q_3: 67.7% / 20.2%; Q_4: 47.8% / 6.9%
–BatchUp (μ=2.0, ν=15.0): Q_3: 92.4% / 39.4%; Q_4: 77.2% / 16.4%
Interaction history helps the system improve retrieval accuracy; BayesInt is better than FixInt, and BatchUp is better than OnlineUp.

27 Using Clickthrough Data Only. BayesInt (μ=0.0, ν=5.0); improvement of Q_k + H_C over Q_k alone (MAP / precision at top documents) in three experimental settings:
–Q_3: 81.9% / 37.1%; Q_4: 72.6% / 18.1%
–Q_3: 23.8% / 23.0%; Q_4: 15.7% / -4.1%
–Q_3: 99.7% / 42.4%; Q_4: 67.2% / 13.9%
Clickthrough data can improve retrieval accuracy on unseen relevant documents, and clickthrough data corresponding to non-relevant documents are still useful for feedback.

28 Sensitivity of BatchUp Parameters BatchUp is stable across different parameter settings; the best performance is achieved at μ=2.0, ν=15.0.

29 A User Study of Personalized Search Six participants use the UCAIR toolbar to do web search. Topics are selected from the TREC Web track and Terabyte track. Participants explicitly evaluate the relevance of the top 30 search results from Google and from UCAIR.

30 Precision at Top N Documents. Ranking methods compared: Google vs. UCAIR. UCAIR improves precision over Google by 8.0%, 17.8%, 20.2%, and 21.8% at the four reported cutoffs. More user interaction yields a better user model and better retrieval accuracy.

31 Precision-Recall Curve

32 Outline Motivation Progress –Framework –Model –System –Evaluation Road ahead –Continuous work –New direction

33 Decision Theoretic Framework User model –Include more factors (e.g., readability) –Represent the information need in a multi-theme way –Learn the user model from data accurately –Compute the user model efficiently Loss functions that go beyond relevance Make short-term context synergize with long-term context

34 Retrieval Models Bridge existing retrieval models and the decision-theoretic framework (the same applies to the active feedback work) Derive new retrieval models from the decision-theoretic framework Find effective and efficient retrieval models

35 Retrieval Models (cont.) Study specific parameter settings for personalized web search (e.g., ranking of snippets) Utilize context information at a finer granularity (e.g., query relationships and relative judgments from clickthrough data)

36 System Make the system more robust and more efficient Enrich the user profile (bookmarks, local files, etc.) Study user interface design –How many results to personalize –Aggressive vs. conservative personalization –Result representation –… Study session boundary detection algorithms

37 System (cont.) Add new features to the UCAIR toolbar –Incorporate clustering into the system –Predict user preferences based on non-textual features (e.g., website, document format) Analyze logs –Simple statistics –Query similarity within a community Distribute the toolbar

38 Evaluation Build an evaluation data set for contextual search (utilizing the TREC interactive track) Conduct a large-scale user study of contextual search Study the privacy issues of the UCAIR toolbar Study how to share user logs Determine when personalization is more effective than non-personalized search, and vice versa

39 Outline Motivation Progress –Framework –Model –System –Evaluation Road ahead –Continuous work –New direction

40 Application Apply the techniques in different domains –Personalized tutoring systems –Personalized bioinformatics systems Collaborative filtering applications –Goodies for connecting people –Social networks? Combine client and server for personalization

41 "Personalization is a dead end" (Raul Valdes-Perez, CEO of Vivisimo, November 2004): people are not static; surfing data is weak; the whole web page is misleading; home computers are shared by family members; queries are short; the best personalization is done by individuals themselves. The Vivisimo way: clustering, then let users explore by themselves.

42 "Personalization is the Holy Grail for search" (Jerry Yang, co-founder of Yahoo!, March 2005): one size does not fit all. From a CNN report: [Yang] also said that the key challenge for Yahoo! and all search companies going forward will be to find ways to increase the personalization of results, i.e., making sure that a user truly finds what he or she is looking for when typing in a keyword search. "The relevance of search is still the Holy Grail for any search application," Yang said.

43 Thank you! The End