A Markov Random Field Model for Term Dependencies

Presentation transcript:

A Markov Random Field Model for Term Dependencies (Donald Metzler and W. Bruce Croft). Presented by Hongyu Li & Chaorui Chang

Background
- Dependencies exist between terms in a collection of text.
- Estimating statistical models for general term dependencies is infeasible due to data sparsity.
- Most past work on modeling term dependencies has focused on phrases, proximity, or term co-occurrences.

Hypothesis and Solution
- Hypothesis 1: dependence models will be more effective for larger collections than for smaller ones.
- Hypothesis 2: incorporating several types of evidence into a dependence model will further improve effectiveness.
- Proposed solution: a Markov Random Field (MRF) model.

Markov Random Field
- Also called undirected graphical models; they model joint distributions.
- In the paper, the MRF models the joint distribution $P_\Lambda(Q, D)$ over queries $Q$ and documents $D$.
- Assume the graph $G$ consists of query-term nodes $q_i$ and a document node $D$.
- The joint distribution is defined by
  $P_\Lambda(Q, D) = \frac{1}{Z_\Lambda} \prod_{c \in C(G)} \psi(c; \Lambda)$
  where $C(G)$ is the set of cliques in $G$ and each $\psi(\cdot\,; \Lambda)$ is a non-negative potential function.

Three variants of the MRF model
- Full independence (FI): query terms $q_i$ are mutually independent given $D$.
- Sequential dependence (SD): dependencies between neighboring query terms.
- Full dependence (FD): all query terms are in some way dependent on each other.
The clique construction for the three variants is sketched in code below.
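To make the clique structure concrete, here is a minimal Python sketch (illustrative code of our own, not from the paper; the function name and tuple representation are assumptions):

```python
from itertools import combinations

def mrf_cliques(query_terms, variant):
    """Return (T, O, U): single-term cliques, cliques scored with the
    ordered feature, and cliques scored with the unordered feature."""
    n = len(query_terms)
    T = [(q,) for q in query_terms]              # every variant keeps these
    if variant == "full_independence":
        return T, [], []                         # no multi-term cliques
    if variant == "sequential_dependence":
        pairs = [tuple(query_terms[i:i + 2]) for i in range(n - 1)]
        return T, pairs, pairs                   # adjacent pairs get both features
    if variant == "full_dependence":
        idx_sets = [c for k in range(2, n + 1)
                    for c in combinations(range(n), k)]   # all multi-term subsets
        contiguous = [c for c in idx_sets
                      if c[-1] - c[0] == len(c) - 1]      # runs of adjacent terms
        O = [tuple(query_terms[i] for i in c) for c in contiguous]
        U = [tuple(query_terms[i] for i in c) for c in idx_sets]  # O ∪ U in the slides' notation
        return T, O, U
    raise ValueError(f"unknown variant: {variant}")
```

For the three-term query "markov random field", FI yields only the three single-term cliques; SD adds the adjacent pairs; FD adds all four multi-term subsets (three pairs plus the full triple).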

Potential functions
- Potential function for a 2-clique (a single query term $q_i$ and the document):
  $\psi_T(c) = \lambda_T \log P(q_i \mid D) = \lambda_T \log \left[ (1 - \alpha_D) \frac{tf_{q_i, D}}{|D|} + \alpha_D \frac{cf_{q_i}}{|C|} \right]$
- Contiguous sets of query terms within the clique (exact-phrase operator #1):
  $\psi_O(c) = \lambda_O \log P(\#1(q_i, \ldots, q_{i+k}) \mid D) = \lambda_O \log \left[ (1 - \alpha_D) \frac{tf_{\#1(q_i, \ldots, q_{i+k}), D}}{|D|} + \alpha_D \frac{cf_{\#1(q_i, \ldots, q_{i+k})}}{|C|} \right]$
- Non-contiguous sets of query terms (unordered-window operator #uwN):
  $\psi_U(c) = \lambda_U \log P(\#uwN(q_i, \ldots, q_j) \mid D) = \lambda_U \log \left[ (1 - \alpha_D) \frac{tf_{\#uwN(q_i, \ldots, q_j), D}}{|D|} + \alpha_D \frac{cf_{\#uwN(q_i, \ldots, q_j)}}{|C|} \right]$
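All three features share the same smoothed mixture; only the way tf and cf are counted changes. A Python sketch, assuming Dirichlet smoothing so that $\alpha_D = \mu / (|D| + \mu)$ (the paper reports a smoothing parameter $\mu$; the default value below is only a placeholder):

```python
import math

def smoothed_log_prob(tf, doc_len, cf, coll_len, mu=2500.0):
    """Shared log-probability underlying f_T, f_O, and f_U.

    With alpha_D = mu / (|D| + mu), the mixture
    (1 - alpha_D) * tf/|D| + alpha_D * cf/|C|
    simplifies to (tf + mu * cf/|C|) / (|D| + mu).
    """
    return math.log((tf + mu * cf / coll_len) / (doc_len + mu))

# f_T counts plain term occurrences; f_O counts exact-phrase (#1) matches;
# f_U counts unordered-window (#uwN) matches.
```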

Ranking
- Define the ranking function:
  $P_\Lambda(D \mid Q) = \frac{P_\Lambda(Q, D)}{P_\Lambda(Q)} \overset{rank}{=} \log P_\Lambda(Q, D) - \log P_\Lambda(Q)$
- Each potential function can be parameterized as $\psi(c; \Lambda) = \exp(\lambda_c f(c))$, which yields
  $P_\Lambda(D \mid Q) \overset{rank}{=} \sum_{c \in C(G)} \lambda_c f(c) = \sum_{c \in T} \lambda_T f_T(c) + \sum_{c \in O} \lambda_O f_O(c) + \sum_{c \in O \cup U} \lambda_U f_U(c)$
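Given precomputed feature values for each clique, the rank-equivalent score is just this weighted sum; a minimal sketch (the default weights are placeholders on the simplex, not the trained values):

```python
def mrf_score(f_T_vals, f_O_vals, f_U_vals, lam=(0.8, 0.1, 0.1)):
    """Rank-equivalent MRF score: sum over cliques of lambda_c * f(c)."""
    lam_T, lam_O, lam_U = lam
    return (lam_T * sum(f_T_vals)
            + lam_O * sum(f_O_vals)
            + lam_U * sum(f_U_vals))
```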

Training
- Set the parameter values $(\lambda_T, \lambda_O, \lambda_U)$.
- Train the model by directly maximizing mean average precision.
- The ranking function is invariant to the scale of the parameters, so we can constrain $\lambda_T + \lambda_O + \lambda_U = 1$.
- (Figure: example mean average precision surface for the GOV2 collection using the full dependence model.)
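With the simplex constraint there are only two free parameters, so direct maximization of mean average precision can be done with a simple grid search. A sketch assuming a hypothetical eval_map callable that runs retrieval with the given weights and returns MAP on the training queries:

```python
def train_lambdas(eval_map, step=0.05):
    """Grid-search the simplex lambda_T + lambda_O + lambda_U = 1,
    keeping the weights with the highest mean average precision."""
    best_lam, best_map = None, float("-inf")
    steps = int(round(1.0 / step))
    for i in range(steps + 1):
        for j in range(steps + 1 - i):
            lam = (i * step, j * step, 1.0 - (i + j) * step)
            current = eval_map(*lam)   # assumed: returns MAP for these weights
            if current > best_map:
                best_lam, best_map = lam, current
    return best_lam, best_map
```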

3. Experimental Results
- We analyze retrieval effectiveness across different collections.
- Journal and press collections (WSJ, AP): small, homogeneous collections.
- Web collections (WT10g, GOV2): larger and less homogeneous.

3.1 Full Independence variant
- In this variant the cliques are only members of the set $T$, so we set $\lambda_O = \lambda_U = 0$ and $\lambda_T = 1$.
- Ranking function:
  $P_\Lambda(D \mid Q) \overset{rank}{=} \sum_{c \in T} f_T(c)$,
  i.e. standard query-likelihood scoring with a smoothed language model.
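The full-independence score is therefore just a sum of smoothed log-probabilities. Reusing smoothed_log_prob from above (the doc and coll objects and their tf/cf accessors are hypothetical interfaces):

```python
def fi_score(query_terms, doc, coll):
    """Full independence: plain query-likelihood scoring."""
    return sum(smoothed_log_prob(doc.tf(q), doc.length, coll.cf(q), coll.length)
               for q in query_terms)
```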

- AvgP refers to mean average precision, P@10 is precision at 10 ranked documents, and $\mu$ is the smoothing parameter used.
- These results provide a baseline against which to compare the sequential and full dependence variants.
- (Table: full independence variant results.)

3.2 Sequential Dependence variant
- Models of this form have cliques in $T$, $O$, and $U$, where the multi-term cliques are the adjacent query-term pairs.
- Ranking function:
  $P_\Lambda(D \mid Q) \overset{rank}{=} \lambda_T \sum_{q_i \in Q} f_T(q_i) + \lambda_O \sum_{i=1}^{|Q|-1} f_O(q_i, q_{i+1}) + \lambda_U \sum_{i=1}^{|Q|-1} f_U(q_i, q_{i+1})$
- The unordered feature function $f_U$ has a free parameter N that lets the size of the unordered window (the scope of proximity) vary.
- We explore window sizes of 2, sentence-sized (8), 50, and "unlimited" to see what impact they have on effectiveness; a naive window-match counter is sketched below.
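For intuition, a naive sliding-window counter for the unordered-window match (#uwN). Real implementations define the match semantics more carefully; this is only a sketch:

```python
def uw_count(terms, doc_tokens, N):
    """Count windows of N consecutive tokens that contain every term,
    in any order (naive #uwN approximation; overlapping windows are
    each counted)."""
    need = set(terms)
    return sum(1 for start in range(max(0, len(doc_tokens) - N + 1))
               if need.issubset(doc_tokens[start:start + N]))
```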

- The results show very little difference across the various window sizes.
- For the AP, WT10g, and GOV2 collections, the sentence-sized windows performed best; for the WSJ collection, N = 2 performed best.
- The sequential dependence variant outperforms the full independence variant.
- (Table: sequential dependence variant results.)

3.3 Full Dependence variant
- Consists of cliques in $T$, $O$, and $U$, covering every subset of two or more query terms.
- Ranking function: the general form from the Ranking slide,
  $P_\Lambda(D \mid Q) \overset{rank}{=} \sum_{c \in T} \lambda_T f_T(c) + \sum_{c \in O} \lambda_O f_O(c) + \sum_{c \in O \cup U} \lambda_U f_U(c)$
- We set the parameter N in the feature function $f_U$ to four times the number of query terms in the clique $c$.
- We analyze the impact that the ordered and unordered window feature functions have on effectiveness.
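Tying this to the sketches above (hypothetical tokens; uw_count is the naive counter from the previous section):

```python
# Window size for a full-dependence clique is four times its length,
# so a 3-term clique is matched with #uw12.
clique = ("markov", "random", "field")
doc_tokens = ("a markov random field is an undirected graphical model "
              "whose cliques carry potential functions").split()
matches = uw_count(clique, doc_tokens, N=4 * len(clique))  # -> 2 windows
```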

- For the AP collection, there is very little difference between the feature types.
- For the WSJ collection, the ordered features produce a clear improvement over the unordered features, but there is very little difference between using ordered features alone and the combination of ordered and unordered.
- The results for the two web collections, WT10g and GOV2, are similar: in both, unordered features perform better than ordered features, and the combination of ordered and unordered features leads to noticeable improvements in mean average precision.
- (Table: full dependence variant results.)

- Strict matching via the ordered window features is more important for the smaller newswire collections, due to the homogeneous, clean nature of the documents.
- For the larger, noisier web collections, the opposite is true.

4. Conclusions
- Three dependence model variants are described; each captures different dependencies between query terms.
- Modeling dependencies can significantly improve retrieval effectiveness across a range of collections.
- Possible future work includes exploring a wider range of potential functions and applying the model to other retrieval tasks.

Thank you!