UCAIR Project Xuehua Shen, Bin Tan, ChengXiang Zhai

Presentation on theme: "UCAIR Project Xuehua Shen, Bin Tan, ChengXiang Zhai"— Presentation transcript:

1 UCAIR Project Xuehua Shen, Bin Tan, ChengXiang Zhai http://sifaka.cs.uiuc.edu/ir/ucair/

2 Outline: Motivation; Progress (Framework, Model, System, Evaluation); Road ahead (Continuous work, New direction)

3 Problem of Context-Independent Search: the query "Jaguar" may refer to a car, Apple software, an animal, or chemistry software

4 Put Search in Context. Context sources: query history; clickthrough; other context info (dwelling time, mouse movement). Example labels on the slide: Apple software, Hobby, …

5 Outline: Motivation; Progress (Framework, Model, System, Evaluation); Road ahead (Continuous work, New direction)

6 A Decision Theoretic Framework. Model interactive IR as an “action dialog”: cycles of user action and system response
User action          System response
Submit a new query   Retrieve new documents
View a document      Rerank documents

7 A Decision Theoretic Framework (cont.): Search for the optimal system response given a new user action
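The slide's equation did not survive in this transcript. As a minimal sketch of the general idea only, choosing the response with the smallest expected loss under a belief about the user can be written as below. The response names, the two candidate user models, and the loss table are all hypothetical and are not taken from the slides.

```python
from typing import Callable, Dict, Hashable, Iterable


def best_response(
    responses: Iterable[Hashable],
    model_posterior: Dict[Hashable, float],
    loss: Callable[[Hashable, Hashable], float],
) -> Hashable:
    """Return the response with the smallest expected loss (Bayes risk)."""

    def expected_loss(response: Hashable) -> float:
        # Average the loss over the system's belief about the user model.
        return sum(p * loss(response, m) for m, p in model_posterior.items())

    return min(responses, key=expected_loss)


# Hypothetical belief after the user typed "apple" (illustration only).
posterior = {"apple_software": 0.8, "apple_fruit": 0.2}

# Hypothetical 0/1 loss: a response costs 1 when it mismatches the need.
loss_table = {
    ("show_mac_pages", "apple_software"): 0.0,
    ("show_mac_pages", "apple_fruit"): 1.0,
    ("show_fruit_pages", "apple_software"): 1.0,
    ("show_fruit_pages", "apple_fruit"): 0.0,
}

choice = best_response(
    ["show_mac_pages", "show_fruit_pages"],
    posterior,
    lambda r, m: loss_table[(r, m)],
)
print(choice)  # show_mac_pages (expected loss 0.2 versus 0.8)
```

In a real system the response space (for example, all rankings of the unseen results) is far too large to enumerate, so the minimization is usually approximated.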

8 User Models. Components of user model M: user information need; user-viewed documents S; user actions A_t and system responses R_{t-1}; …

9 Loss Functions: loss function for result reranking; loss function for query expansion

10 Implicit User Modeling: update the user information need given a new query; learn a better user model when the user skips the top n documents and views the (n+1)-th document

11 Outline: Motivation; Progress (Framework, Model, System, Evaluation); Road ahead (Continuous work, New direction)

12 Four Contextual Language Models. User queries: Q_1, Q_2, …, Q_k (e.g., Apple software). User clickthrough: C_1 = {C_1,1, C_1,2, C_1,3, …}, C_2 = {C_2,1, C_2,2, C_2,3, …}, … (e.g., the result summary "Apple - Mac OS X: The Apple Mac OS X product page. Describes features in the current version of Mac OS X, a screenshot gallery, latest software downloads, and a directory of..."). User information need: ? How to model and use all the information?

13 Retrieval Model. Basis: unigram language model + KL divergence. The query Q_k and each document D are represented by language models θ_Qk and θ_D; a similarity measure between the two models produces the results. Contextual search: query model update using the user's query history and clickthrough history
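A minimal sketch of ranking by negative KL divergence between a query model and smoothed unigram document models follows. The toy documents, the Dirichlet smoothing parameter `mu`, and the additive smoothing of the collection model are illustrative assumptions; the transcript does not say which smoothing the authors used.

```python
import math
from collections import Counter


def make_collection_prob(docs, pseudo=0.5):
    """Collection model with additive smoothing so unseen words stay nonzero."""
    counts = Counter(t for tokens in docs.values() for t in tokens)
    total = sum(counts.values())
    vocab_size = len(counts) + 1  # one extra slot for unseen words

    def prob(word):
        return (counts[word] + pseudo) / (total + pseudo * vocab_size)

    return prob


def make_doc_prob(tokens, collection_prob, mu=10.0):
    """Document model p(w | theta_D) with Dirichlet-prior smoothing (assumed)."""
    counts = Counter(tokens)

    def prob(word):
        return (counts[word] + mu * collection_prob(word)) / (len(tokens) + mu)

    return prob


def neg_kl(query_model, doc_prob):
    """-KL(theta_Q || theta_D), summed over words with query probability > 0."""
    return sum(
        p * (math.log(doc_prob(w)) - math.log(p))
        for w, p in query_model.items()
        if p > 0
    )


# Hypothetical two-document collection.
docs = {
    "d1": "apple mac os x software".split(),
    "d2": "apple fruit orchard pie".split(),
}
collection_prob = make_collection_prob(docs)
query_model = {"apple": 0.5, "software": 0.5}

ranking = sorted(
    docs,
    key=lambda d: neg_kl(query_model, make_doc_prob(docs[d], collection_prob)),
    reverse=True,
)
print(ranking)  # ['d1', 'd2']
```

Contextual search then amounts to replacing `query_model` with a model that also reflects the query and clickthrough history, while the ranking function stays the same.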

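14 Fixed Coefficient Interpolation (FixInt). Inputs: current query Q_k; query history Q_1 … Q_{k-1}; clickthrough history C_1 … C_{k-1}. Steps: average the user query history and the clickthrough history; linearly interpolate the two history models; linearly interpolate the current query model and the history model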
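The three steps on this slide can be sketched as follows. The slide's formula is not in the transcript, so the parameter names `alpha` (weight on the current query) and `beta` (weight on the clickthrough history model) and the direction of each weight are assumptions for illustration.

```python
from collections import Counter


def mle(tokens):
    """Maximum-likelihood unigram model of a non-empty token list."""
    counts = Counter(tokens)
    n = sum(counts.values())
    return {w: c / n for w, c in counts.items()}


def average(models):
    """Uniform average of several unigram models."""
    if not models:
        return {}
    vocab = set().union(*models)
    return {w: sum(m.get(w, 0.0) for m in models) / len(models) for w in vocab}


def mix(a, b, weight_a):
    """Linear interpolation: weight_a * a + (1 - weight_a) * b."""
    vocab = set(a) | set(b)
    return {
        w: weight_a * a.get(w, 0.0) + (1.0 - weight_a) * b.get(w, 0.0)
        for w in vocab
    }


def fixint(current_query, past_queries, past_clicks, alpha, beta):
    h_q = average([mle(q) for q in past_queries if q])  # query history model
    h_c = average([mle(c) for c in past_clicks if c])   # clickthrough history model
    history = mix(h_c, h_q, beta)                        # assumed: beta weights clickthrough
    return mix(mle(current_query), history, alpha)       # assumed: alpha weights Q_k


# Hypothetical session: one past query and one clicked result summary.
model = fixint(
    ["apple"],
    past_queries=[["apple", "software"]],
    past_clicks=[["mac", "os", "x"]],
    alpha=0.5,
    beta=1.0,
)
```

With `beta=1.0` the history model comes entirely from clickthrough, so `model["apple"]` is 0.5 and the remaining mass is split evenly over `mac`, `os`, and `x`.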

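15 Bayesian Interpolation (BayesInt). Inputs: current query Q_k; query history Q_1 … Q_{k-1}; clickthrough history C_1 … C_{k-1}. Average the user query and clickthrough history; the averaged history models serve as a Dirichlet prior. Intuition: if the current query Q_k is longer, we should trust Q_k more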
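A minimal sketch of this intuition, assuming the averaged history models act as a Dirichlet prior on the current query counts. The prior strengths `mu` (query history) and `nu` (clickthrough history) are assumed names, and the history models are passed in as plain dictionaries for brevity.

```python
from collections import Counter


def bayesint(current_query, h_q, h_c, mu, nu):
    """Posterior-mean query model: query counts plus two weighted priors."""
    counts = Counter(current_query)
    denom = len(current_query) + mu + nu
    vocab = set(counts) | set(h_q) | set(h_c)
    return {
        w: (counts[w] + mu * h_q.get(w, 0.0) + nu * h_c.get(w, 0.0)) / denom
        for w in vocab
    }


# Hypothetical history models.
h_q = {"apple": 0.5, "software": 0.5}
h_c = {"mac": 1.0}

short_query = ["jaguar"]
long_query = ["jaguar", "car", "dealer", "price"]

short_model = bayesint(short_query, h_q, h_c, mu=0.2, nu=5.0)
long_model = bayesint(long_query, h_q, h_c, mu=0.2, nu=5.0)

# Probability mass that ends up on the current query's own words.
short_mass = sum(short_model[w] for w in set(short_query))
long_mass = sum(long_model[w] for w in set(long_query))
```

Because the query length appears in the denominator, a longer query automatically claims a larger share of the model (`long_mass > short_mass`), which is the "trust Q_k more" behavior, without a hand-set interpolation weight.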

16 Online Bayesian Update (OnlineUp). Diagram sequence: Q_1, C_1, Q_2, C_2, …, Q_k. Intuition: continuous belief update about the user information need

17 Batch Bayesian Update (BatchUp). Diagram: queries Q_1, Q_2, …, Q_k and clickthrough C_1, C_2, …, C_{k-1}. Intuition: clickthrough data may not decay
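The update equations for OnlineUp and BatchUp are not in this transcript. The sketch below is one reading of the two intuitions, assuming each new piece of evidence is folded in with a Dirichlet-prior update, with an assumed strength `mu` when a query arrives and `nu` when clickthrough arrives (`nu` must be positive). The function names and the toy session are hypothetical.

```python
from collections import Counter


def dirichlet_update(prior, tokens, strength):
    """Posterior mean after observing tokens under a prior of given strength."""
    counts = Counter(tokens)
    denom = len(tokens) + strength
    vocab = set(prior) | set(counts)
    return {w: (counts[w] + strength * prior.get(w, 0.0)) / denom for w in vocab}


def online_up(queries, clicks, mu, nu):
    """Fold in C_1, Q_2, C_2, ..., Q_k in time order; old evidence decays."""
    model = dirichlet_update({}, queries[0], 0.0)  # start from Q_1 alone
    for i in range(1, len(queries)):
        model = dirichlet_update(model, clicks[i - 1], nu)
        model = dirichlet_update(model, queries[i], mu)
    return model


def batch_up(queries, clicks, mu, nu):
    """Update on queries in order, then apply all clickthrough at once."""
    model = dirichlet_update({}, queries[0], 0.0)
    for query in queries[1:]:
        model = dirichlet_update(model, query, mu)
    all_clicks = [t for click in clicks for t in click]
    return dirichlet_update(model, all_clicks, nu)


# Hypothetical session with three queries and two clicked summaries.
queries = [["apple"], ["apple", "software"], ["mac", "os"]]
clicks = [["mac", "download"], ["os", "x", "update"]]

online = online_up(queries, clicks, mu=2.0, nu=2.0)
batch = batch_up(queries, clicks, mu=2.0, nu=2.0)
```

In this toy session the word `download`, seen only in the earliest clickthrough, keeps more probability under `batch_up` than under `online_up`, which illustrates the "clickthrough data may not decay" intuition.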

18 Outline: Motivation; Progress (Framework, Model, System, Evaluation); Road ahead (Continuous work, New direction)

19 UCAIR Toolbar Architecture (http://sifaka.cs.uiuc.edu/ir/ucair/download.html). Diagram elements: User; UCAIR (search history log, e.g., past queries and clicked results; user modeling; query modification; result re-ranking; result buffer); Search Engine (e.g., Google). Labeled flows: query, results, clickthrough, …

20 System Characteristics: client-side personalization (privacy; distribution of computation; more clues about the user); implicit user modeling; Bayesian decision theory and statistical language models

21 User Actions: submit a keyword query; view a document; click the “Back” button; click the “Next” link

22 System Responses: decide the relatedness of neighboring queries and do query expansion; update the user model according to clickthrough; rerank unseen documents

23 Outline: Motivation; Progress (Framework, Model, System, Evaluation); Road ahead (Continuous work, New direction)

24 TREC Style Evaluation – Data Set. Data collection: TREC AP88-90. Topics: 30 hard topics from TREC topics 1-150. System: search engine + RDBMS. Context: query and clickthrough history of 3 participants (http://sifaka.cs.uiuc.edu/ir/ucair/QCHistory.zip)

25 Experiment Design. Models: FixInt, BayesInt, OnlineUp, and BatchUp. Performance comparison: Q_k vs. Q_k + H_Q + H_C. Evaluation metrics: MAP and Pr@20 docs
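For readers unfamiliar with the two metrics, here is a minimal sketch of precision at k and (mean) average precision using their standard definitions. The document IDs and relevance judgments are hypothetical, and this is not the evaluation script used for the experiments.

```python
def precision_at_k(ranked, relevant, k=20):
    """Fraction of the top k results that are relevant."""
    return sum(1 for doc in ranked[:k] if doc in relevant) / k


def average_precision(ranked, relevant):
    """Mean of the precision values at the rank of each relevant document."""
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(runs):
    """MAP over (ranked list, relevant set) pairs, one pair per topic."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Hypothetical topic: relevant documents appear at ranks 1 and 3.
ranked = ["d1", "d2", "d3", "d4"]
relevant = {"d1", "d3"}

ap = average_precision(ranked, relevant)    # (1/1 + 2/3) / 2
p_at_2 = precision_at_k(ranked, relevant, k=2)
map_score = mean_average_precision([(ranked, relevant), (["d2", "d1"], {"d1"})])
```

Here `ap` is about 0.833 and `p_at_2` is 0.5. Pr@20 on the slide corresponds to `precision_at_k(..., k=20)`.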

26 Overall Effect of Search Context
Query            FixInt             BayesInt           OnlineUp           BatchUp
                 (α=0.1, β=1.0)     (μ=0.2, ν=5.0)     (μ=5.0, ν=15.0)    (μ=2.0, ν=15.0)
                 MAP     pr@20      MAP     pr@20      MAP     pr@20      MAP     pr@20
Q_3              0.0421  0.1483     0.0421  0.1483     0.0421  0.1483     0.0421  0.1483
Q_3 + H_Q + H_C  0.0726  0.1967     0.0816  0.2067     0.0706  0.1783     0.0810  0.2067
Improve          72.4%   32.6%      93.8%   39.4%      67.7%   20.2%      92.4%   39.4%
Q_4              0.0536  0.1933     0.0536  0.1933     0.0536  0.1933     0.0536  0.1933
Q_4 + H_Q + H_C  0.0891  0.2233     0.0955  0.2317     0.0792  0.2067     0.0950  0.2250
Improve          66.2%   15.5%      78.2%   19.9%      47.8%   6.9%       77.2%   16.4%
Interaction history helps the system improve retrieval accuracy
BayesInt is better than FixInt; BatchUp is better than OnlineUp

27 Using Clickthrough Data Only. Model: BayesInt (μ=0.0, ν=5.0). The slide shows three tables:
Table 1
Query        MAP     pr@20
Q_3          0.0421  0.1483
Q_3 + H_C    0.0766  0.2033
Improve      81.9%   37.1%
Q_4          0.0536  0.1930
Q_4 + H_C    0.0925  0.2283
Improve      72.6%   18.1%
Table 2
Query        MAP     pr@20
Q_3          0.0421  0.1483
Q_3 + H_C    0.0521  0.1820
Improve      23.8%   23.0%
Q_4          0.0536  0.1930
Q_4 + H_C    0.0620  0.1850
Improve      15.7%   -4.1%
Table 3
Query        MAP     pr@20
Q_3          0.0331  0.125
Q_3 + H_C    0.0661  0.178
Improve      99.7%   42.4%
Q_4          0.0442  0.165
Q_4 + H_C    0.0739  0.188
Improve      67.2%   13.9%
Clickthrough data can improve retrieval accuracy of unseen relevant docs
Clickthrough data corresponding to non-relevant docs are useful for feedback

28 Sensitivity of BatchUp Parameters: BatchUp is stable with different parameter settings; best performance is achieved when μ=2.0, ν=15.0

29 A User Study of Personalized Search: six participants use the UCAIR toolbar to do web search; topics are selected from the TREC web track and terabyte track; participants explicitly evaluate the relevance of the top 30 search results from Google and UCAIR

30 Precision at Top N Documents
Ranking Method   prec@5   prec@10   prec@20   prec@30
Google           0.538    0.472     0.377     0.308
UCAIR            0.581    0.556     0.453     0.375
Improvement      8.0%     17.8%     20.2%     21.8%
More user interaction, better user model and retrieval accuracy
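The improvement figures on this slide are consistent with relative improvement over the Google baseline, (UCAIR - Google) / Google. A short check, using only the precision values from the slide (the function name is ours):

```python
def relative_improvement(baseline, improved):
    """Relative gain of `improved` over `baseline`, in percent."""
    return 100.0 * (improved - baseline) / baseline


# Precision values copied from the slide, keyed by cutoff N.
google = {5: 0.538, 10: 0.472, 20: 0.377, 30: 0.308}
ucair = {5: 0.581, 10: 0.556, 20: 0.453, 30: 0.375}

improvement = {
    n: round(relative_improvement(google[n], ucair[n]), 1) for n in google
}
print(improvement)  # {5: 8.0, 10: 17.8, 20: 20.2, 30: 21.8}
```

The same formula reproduces the "Improve" rows in the TREC-style tables, for example (0.0726 - 0.0421) / 0.0421 ≈ 72.4%.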

31 Precision-Recall Curve

32 Outline: Motivation; Progress (Framework, Model, System, Evaluation); Road ahead (Continuous work, New direction)

33 Decision Theoretic Framework. User model: include more factors (e.g., readability); represent the information need in a multi-theme way; learn the user model from data accurately; compute the user model efficiently. Loss function goes beyond relevance. Short-term context synergizes with long-term context

34 Retrieval Models: bridge existing retrieval models and the decision theoretic framework (same for the active feedback work); deduce new retrieval models from the decision theoretic framework; find effective and efficient retrieval models

35 Retrieval Models (cont.): study specific parameter settings for personalized web search (e.g., ranking of snippets); utilize context information at a finer granularity (e.g., query relationships and relative judgments from clickthrough data)

36 System: make the system more robust and more efficient; enrich the user profile (bookmarks, local files, etc.); study user interface design (how many results are personalized; aggressive vs. conservative personalization; result representation; …); study session boundary detection algorithms

37 System (cont.): add new features to the UCAIR toolbar (incorporate clustering into the system; predict user preference based on non-textual features, e.g., website, document format); analyze logs (simple statistics; query similarity in a community); distribute the toolbar

38 Evaluation: build an evaluation data set for contextual search (utilize the TREC interactive track); conduct a large-scale user study of contextual search; study the privacy issues of the UCAIR toolbar; study how to share user logs; determine when personalization is more effective than non-personalization and vice versa

39 Outline: Motivation; Progress (Framework, Model, System, Evaluation); Road ahead (Continuous work, New direction)

40 Application: apply techniques in different domains (personalized tutoring system; personalized bioinfo system); collaborative filtering application (goodies for connecting people; social network?); combination of client and server for personalization

41 Personalization is a dead end (Raul Valdes-Perez, CEO of Vivisimo, Nov. 2004): people are not static; surfing data is weak; the whole web page is misleading; home computers are shared by family members; queries are short; the best personalization is done by individuals themselves. The Vivisimo way: clustering, then users explore by themselves

42 Personalization is the Holy Grail for search (Jerry Yang, co-founder of Yahoo!, March 2005). One size does not fit all. CNN report: [Yang] also said that the key challenge for Yahoo! and all search companies going forward will be to find ways to increase the personalization of results, i.e. making sure that a user truly finds what he or she is looking for when typing in a keyword search. "The relevance of search is still the Holy Grail for any search application," Yang said.

43 Thank you! The End

