Web Query Disambiguation from Short Sessions Lilyana Mihalkova* and Raymond Mooney University of Texas at Austin *Now at University of Maryland College.

Slides:



Advertisements
Similar presentations
Online Max-Margin Weight Learning with Markov Logic Networks Tuyen N. Huynh and Raymond J. Mooney Machine Learning Group Department of Computer Science.
Advertisements

Processing XML Keyword Search by Constructing Effective Structured Queries Jianxin Li, Chengfei Liu, Rui Zhou and Bo Ning Swinburne University of Technology,
University of Texas at Austin Machine Learning Group Department of Computer Sciences University of Texas at Austin Discriminative Structure and Parameter.
1 Evaluation Rong Jin. 2 Evaluation  Evaluation is key to building effective and efficient search engines usually carried out in controlled experiments.
Bayesian Abductive Logic Programs Sindhu Raghavan Raymond J. Mooney The University of Texas at Austin 1.
Online Max-Margin Weight Learning for Markov Logic Networks Tuyen N. Huynh and Raymond J. Mooney Machine Learning Group Department of Computer Science.
Markov Logic Networks Instructor: Pedro Domingos.
Speeding Up Inference in Markov Logic Networks by Preprocessing to Reduce the Size of the Resulting Grounded Network Jude Shavlik Sriraam Natarajan Computer.
Efficient Weight Learning for Markov Logic Networks Daniel Lowd University of Washington (Joint work with Pedro Domingos)
Markov Logic: A Unifying Framework for Statistical Relational Learning Pedro Domingos Matthew Richardson
Speaker:Benedict Fehringer Seminar:Probabilistic Models for Information Extraction by Dr. Martin Theobald and Maximilian Dylla Based on Richards, M., and.
Evaluating Search Engine
Daozheng Chen 1, Mustafa Bilgic 2, Lise Getoor 1, David Jacobs 1, Lilyana Mihalkova 1, Tom Yeh 1 1 Department of Computer Science, University of Maryland,
Basic IR: Queries Query is statement of user’s information need. Index is designed to map queries to likely to be relevant documents. Query type, content,
PROBLEM BEING ATTEMPTED Privacy -Enhancing Personalized Web Search Based on:  User's Existing Private Data Browsing History s Recent Documents 
Context-Aware Query Classification Huanhuan Cao 1, Derek Hao Hu 2, Dou Shen 3, Daxin Jiang 4, Jian-Tao Sun 4, Enhong Chen 1 and Qiang Yang 2 1 University.
CSE 574 – Artificial Intelligence II Statistical Relational Learning Instructor: Pedro Domingos.
Mobile Web Search Personalization Kapil Goenka. Outline Introduction & Background Methodology Evaluation Future Work Conclusion.
1 Learning Entity Specific Models Stefan Niculescu Carnegie Mellon University November, 2003.
CSE 574: Artificial Intelligence II Statistical Relational Learning Instructor: Pedro Domingos.
Web Logs and Question Answering Richard Sutcliffe 1, Udo Kruschwitz 2, Thomas Mandl University of Limerick, Ireland 2 - University of Essex, UK 3.
1 CS 430 / INFO 430 Information Retrieval Lecture 24 Usability 2.
COMP 630L Paper Presentation Javy Hoi Ying Lau. Selected Paper “A Large Scale Evaluation and Analysis of Personalized Search Strategies” By Zhicheng Dou,
Topic-Sensitive PageRank Taher H. Haveliwala. PageRank Importance is propagated A global ranking vector is pre-computed.
University of Kansas Department of Electrical Engineering and Computer Science Dr. Susan Gauch April 2005 I T T C Dr. Susan Gauch Personalized Search Based.
1 Learning the Structure of Markov Logic Networks Stanley Kok & Pedro Domingos Dept. of Computer Science and Eng. University of Washington.
Statistical Relational Learning Pedro Domingos Dept. Computer Science & Eng. University of Washington.
Cohort Modeling for Enhanced Personalized Search Jinyun YanWei ChuRyen White Rutgers University Microsoft BingMicrosoft Research.
Improving web image search results using query-relative classifiers Josip Krapacy Moray Allanyy Jakob Verbeeky Fr´ed´eric Jurieyy.
1 CS 178H Introduction to Computer Science Research What is CS Research?
Personalization in Local Search Personalization of Content Ranking in the Context of Local Search Philip O’Brien, Xiao Luo, Tony Abou-Assaleh, Weizheng.
1 Transfer Learning by Mapping and Revising Relational Knowledge Raymond J. Mooney University of Texas at Austin with acknowledgements to Lily Mihalkova,
Searching the Web Dr. Frank McCown Intro to Web Science Harding University This work is licensed under Creative Commons Attribution-NonCommercial 3.0Attribution-NonCommercial.
Implicit An Agent-Based Recommendation System for Web Search Presented by Shaun McQuaker Presentation based on paper Implicit:
Dr. Susan Gauch When is a rock not a rock? Conceptual Approaches to Personalized Search and Recommendations Nov. 8, 2011 TResNet.
1 Applying Collaborative Filtering Techniques to Movie Search for Better Ranking and Browsing Seung-Taek Park and David M. Pennock (ACM SIGKDD 2007)
Improving Web Search Ranking by Incorporating User Behavior Information Eugene Agichtein Eric Brill Susan Dumais Microsoft Research.
Query Routing in Peer-to-Peer Web Search Engine Speaker: Pavel Serdyukov Supervisors: Gerhard Weikum Christian Zimmer Matthias Bender International Max.
Grouping search-engine returned citations for person-name queries Reema Al-Kamha, David W. Embley (Proceedings of the 6th annual ACM international workshop.
UOS 1 Ontology Based Personalized Search Zhang Tao The University of Seoul.
Glasgow 02/02/04 NN k networks for content-based image retrieval Daniel Heesch.
Interpreting Advertiser Intent in Sponsored Search BHANU C VATTIKONDA, SANTHOSH KODIPAKA, HONGYAN ZHOU, VACHA DAVE, SAIKAT GUHA, ALEX C SNOEREN 1.
Markov Logic And other SRL Approaches
Transfer in Reinforcement Learning via Markov Logic Networks Lisa Torrey, Jude Shavlik, Sriraam Natarajan, Pavan Kuppili, Trevor Walker University of Wisconsin-Madison,
Giorgos Giannopoulos (IMIS/”Athena” R.C and NTU Athens, Greece) Theodore Dalamagas (IMIS/”Athena” R.C., Greece) Timos Sellis (IMIS/”Athena” R.C and NTU.
Personalized Search Xiao Liu
CONCLUSION & FUTURE WORK Given a new user with an information gathering task consisting of document IDs and respective term vectors, this can be compared.
Detecting Dominant Locations from Search Queries Lee Wang, Chuang Wang, Xing Xie, Josh Forman, Yansheng Lu, Wei-Ying Ma, Ying Li SIGIR 2005.
Web Image Retrieval Re-Ranking with Relevance Model Wei-Hao Lin, Rong Jin, Alexander Hauptmann Language Technologies Institute School of Computer Science.
Query Suggestion Naama Kraus Slides are based on the papers: Baeza-Yates, Hurtado, Mendoza, Improving search engines by query clustering Boldi, Bonchi,
BEHAVIORAL TARGETING IN ON-LINE ADVERTISING: AN EMPIRICAL STUDY AUTHORS: JOANNA JAWORSKA MARCIN SYDOW IN DEFENSE: XILING SUN & ARINDAM PAUL.
Personalized Social Search Based on the User’s Social Network David Carmel et al. IBM Research Lab in Haifa, Israel CIKM’09 16 February 2011 Presentation.
Web Information Retrieval Prof. Alessandro Agostini 1 Context in Web Search Steve Lawrence Speaker: Antonella Delmestri IEEE Data Engineering Bulletin.
Mining Dependency Relations for Query Expansion in Passage Retrieval Renxu Sun, Chai-Huat Ong, Tat-Seng Chua National University of Singapore SIGIR2006.
DeepDive Model Dongfang Xu Ph.D student, School of Information, University of Arizona Dec 13, 2015.
Post-Ranking query suggestion by diversifying search Chao Wang.
Bloom Cookies: Web Search Personalization without User Tracking Authors: Nitesh Mor, Oriana Riva, Suman Nath, and John Kubiatowicz Presented by Ben Summers.
26/01/20161Gianluca Demartini Ranking Categories for Faceted Search Gianluca Demartini L3S Research Seminars Hannover, 09 June 2006.
A New Algorithm for Inferring User Search Goals with Feedback Sessions.
Divided Pretreatment to Targets and Intentions for Query Recommendation Reporter: Yangyang Kang /23.
Happy Mittal (Joint work with Prasoon Goyal, Parag Singla and Vibhav Gogate) IIT Delhi New Rules for Domain Independent Lifted.
Identifying “Best Bet” Web Search Results by Mining Past User Behavior Author: Eugene Agichtein, Zijian Zheng (Microsoft Research) Source: KDD2006 Reporter:
Predicting Short-Term Interests Using Activity-Based Search Context CIKM’10 Advisor: Jia Ling, Koh Speaker: Yu Cheng, Hsieh.
To Personalize or Not to Personalize: Modeling Queries with Variation in User Intent Presented by Jaime Teevan, Susan T. Dumais, Daniel J. Liebling Microsoft.
Progress Report ekker. Problem Definition In cases such as object recognition, we can not include all possible objects for training. So transfer learning.
University Of Seoul Ubiquitous Sensor Network Lab Query Dependent Pseudo-Relevance Feedback based on Wikipedia 전자전기컴퓨터공학 부 USN 연구실 G
Inferring People’s Site Preference in Web Search
Author: Kazunari Sugiyama, etc. (WWW2004)
Panagiotis G. Ipeirotis Luis Gravano
Retrieval Performance Evaluation - Measures
Presentation transcript:

Web Query Disambiguation from Short Sessions Lilyana Mihalkova* and Raymond Mooney University of Texas at Austin *Now at University of Maryland College Park

Web Query Disambiguation 2 scrubs

3 Existing Approaches Well-studied problem: – [e.g., Sugiyama et al. 04, Sun et al. 05, Dou et al. 07] Build a user profile from a long history of that user’s interactions with the search engine

4 Concerns Privacy concerns – AOL Data Release – “Googling Considered Harmful” [Conti 06] Pragmatic concerns – Storing and protecting data – Identifying users across searches

Proposed Setting Base personalization only on short glimpses of user search activity captured in brief short- term sessions Do not assume across-session user identifiers of any sort 5

How Short is Short-Term? Number of sessions with that many queries Number of queries before ambiguous query

7 Is This Enough Info? 98.7 fm kroq scrubs ??? huntsville hospital ebay.com scrubs ??? scrubs-tv.com scrubs.com

More Closely Related Work [Almeida & Almeida 04]: Similar assumption of short sessions, but better suited for a specialized search engine (e.g. on computer science literature)

Main Challenge How to harness this small amount of potentially noisy information available for each user? – Exploit the relations among sessions, URLs, and queries

Relationship are established by shared queries or clicks that are themselves related Query strings and URLs are related by sharing identical keywords, or by being identical The choices users make over the set of possible results are also interdependent

Exploiting Relational Information To overcome sparsity in individual sessions, need techniques that can effectively combine effects of noisy relations of varying types among entities – Use statistical relational learning (SRL) [Getoor & Taskar 07] – We used one particular SRL model: Markov logic networks (MLNs) [Richardson & Domingos 06]

Rest of This Talk Background on MLNs Details of our model – Ways of relating users – Clauses defined in the model Experimental results Future work

Markov Logic Networks (MLNs) Set of weighted first-order clauses The larger the weight of a clause, the greater the penalty for not satisfying a grounding of this clause – Clauses can be viewed as relational features 13 [Richardson & Domingos 06]

MLNs Continued Q: set of unknown query atoms E: set of evidence atoms What is P(Q|E), according to an MLN: All formulas Number of satisfied groundings of formula i Normalizing constant

MLN Learning and Inference A wide range of algorithms available in the Alchemy software package [Kok et al. 05] Weight learning: we used contrastive divergence [Lowd & Domingos 07] – Adapted it for streaming data Inference: we used MC-SAT [Poon & Domingos 06]

16 Re-Ranking of Search Results MLN Hand-code a set of formulas to capture regularities in the domain -Challenge: define an effective set of relations Learn weights over these formulas from the data -Challenge: noisy data

17 Specific Relationships huntsville hospital ebay scrubs huntsvillehospital.org ebay.com ??? huntsville school... hospitallink.com scrubs scrubs-tv.com … ebay.com scrubs scrubs.com

Collaborative Clauses The user will click on result chosen by sessions related by: ambiguous query some query ambiguous query Shared click Shared keyword click-to-click, click-to- search, search-to-click, or search-to-search ambiguous query some query ambiguous query some other 18

Popularity Clause User will choose result chosen by any previous session, regardless of whether it is related – Effect is that the most popular result becomes the most likely pick 19

Local Clauses User will choose result that shares a keyword with a previous search or click in the current session ambiguous query some query Not effective because of data sparsity 20

Balance Clause If the user chooses one of the results, she will not choose another – Sets up a competition among possible results – Allows the same set of weights to work well for different-size problems 21

22 Empirical Evaluation: Data Provided by MSR Collected in May 2006 on MSN Search First 25 days is training, last 6 days is test ×Does not specify which queries are ambiguous Used DMOZ hierarchy of pages to establish ambiguity ×Only lists actually clicked results, not all that were seen Assumed user saw all results that were clicked at least once for that exact query string

Empirical Evaluation: Models Tested Random re-ranker Popularity Standard collaborative filtering baselines based on preferences of a set of the most closely similar sessions – Collaborative-Pearson – Collaborative-Cosine MLNs – Collaborative + Popularity + Balance – Collaborative + Balance – Collaborative + Popularity + Local + Balance In Paper

24 Empirical Evaluation: Measure Area under the ROC curve (mean average true negative rate) In Paper: Area under precision-recall curve, aka mean average precision

25 AUC-ROC Intuitive Interpretation Assuming that user scans results from the top until a relevant one is found, AUC-ROC captures what proportion of the irrelevant results are not considered by the user

26 Results Collaborative + Popularity + Balance Collaborative + Popularity + Balance

Difficulty Levels 27 Average Case MLN Popularity Possible Results for a given query Proportion Clicked KL

Difficulty Levels 28 Worst Case Worst Case

Future Directions Incorporate more info into the models – How to retrieve relevant information quickly Learn more nuanced models – Cluster shared clicks/keywords and learn separate weight for clusters More advanced measures of user relatedness – Incorporate similarity measures – Use time spent on a page as an indicator of interestedness

Thank you Questions?

First-Order Logic Relations and attributes are represented as predicates 31 Predicate Variable Predicate Literal WorkedFor(brando, coppola) Ground Literal, “Grounding” for short WorkedFor(A, B) Actor(A)

Clauses 32 Dependencies and regularities among the predicates are represented as clauses: PremisesConclusio n Movie(T, A) Director(A) Actor(A) To obtain a grounding of a clause, replace all variables with entity names: Movie(godfather, pacino)Director(pacino) Actor(pacino)