A Markov Random Field Model for Term Dependencies Chetan Mishra CS 6501 Paper Presentation Ideas, graphs, charts, and results from paper of same name by.

Presentation transcript:

A Markov Random Field Model for Term Dependencies Chetan Mishra CS 6501 Paper Presentation Ideas, graphs, charts, and results from paper of same name by Metzler and Croft 2005 (SIGIR)

Agenda
1. Motivation behind the work
2. Background – What is a Markov Random Field (MRF)?
3. Research Insight – How did the authors use MRF to model term dependencies? Results?
4. Future Work – If you thought this was interesting, how could you build on this?
5. Conclusion
6501: Text Mining

Motivation
Terms are not independently distributed
– A model incorporating term dependencies should outperform a model that ignores them
One problem: in practice, models incorporating term dependencies performed no better, and sometimes worse
– Statistical models weren't effectively modeling term dependencies
– Why?

Motivation
Two problems (from the authors' perspective):
– Problem 1: Most models have taken bag-of-words-like approaches (which have tremendous data requirements)
– Solution 1: We need a new type of model
– Problem 2: Term dependency modeling (even with a reasonable model) requires a significant corpus
– Solution 2: Add large, web-scraped corpora to the research test collections

Background
What is a Markov random field (MRF) model?
– Fancy name for an undirected graph-based model
– Often used in machine learning to succinctly model joint distributions
MRF models are used in the paper to tackle the problem of document retrieval in response to a query

Model Overview

Model Overview

Model Overview
By the joint probability law
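The equations on this slide did not survive in the transcript. As a reminder of what the "joint probability law" step refers to, the ranking derivation in Metzler and Croft's paper runs roughly as follows, where G is the graph over the query terms and the document, C(G) is its set of cliques, and ψ are the potential functions with parameters Λ. This is a sketch from the paper as I recall it, not a copy of the slide.

```latex
P_\Lambda(Q, D) = \frac{1}{Z_\Lambda} \prod_{c \in C(G)} \psi(c; \Lambda)

P_\Lambda(D \mid Q) = \frac{P_\Lambda(Q, D)}{P_\Lambda(Q)}
  \;\stackrel{rank}{=}\; \log P_\Lambda(Q, D) - \log P_\Lambda(Q)
  \;\stackrel{rank}{=}\; \sum_{c \in C(G)} \log \psi(c; \Lambda)
```

The normalizer and the query term drop out because neither depends on the document, so ranking only needs the sum of log potentials.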

The Markov Random Field Model

The Markov Random Field Model

The Markov Random Field Model
The paper looks at the performance of three general types of dependencies:
– Independence
– Sequential dependence
– Full dependence
Visual depiction: [figure from Metzler and Croft '05]
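To make the three variants concrete, here is a minimal Python sketch of which groups of query terms end up in a clique with the document node under each assumption. The function name `query_term_cliques` and the variant labels are hypothetical, and the reading of the graph structure (adjacent pairs for sequential dependence, every subset for full dependence) is my interpretation of the paper, not code from the authors.

```python
from itertools import combinations


def query_term_cliques(terms, variant):
    """List the query-term groups that share a clique with the document node.

    Hypothetical helper: `variant` is one of "independence",
    "sequential", or "full".
    """
    # Every variant has one clique per single query term.
    cliques = [(t,) for t in terms]
    if variant == "independence":
        return cliques
    if variant == "sequential":
        # Assumption: edges link neighbouring query terms only,
        # so the extra cliques are the adjacent pairs.
        cliques += [(a, b) for a, b in zip(terms, terms[1:])]
        return cliques
    if variant == "full":
        # Assumption: every query term is linked to every other,
        # so any subset of two or more terms forms a clique.
        for size in range(2, len(terms) + 1):
            cliques += list(combinations(terms, size))
        return cliques
    raise ValueError(f"unknown variant: {variant}")


for name in ("independence", "sequential", "full"):
    print(name, query_term_cliques(["markov", "random", "field"], name))
```

The number of cliques grows linearly with query length for the first two variants and exponentially for full dependence, which is one practical reason to compare them.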

Potential Functions
All log scale!
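The formulas on this slide are missing from the transcript. As an illustration of what "log scale" means here, the sketch below computes a single-term log potential as a weighted log of a smoothed language-model estimate, which is my recollection of the general form used in the paper. The function name, argument names, and default values are hypothetical.

```python
import math


def log_term_potential(tf, doc_len, cf, coll_len, alpha=0.5, weight=1.0):
    """Weighted log of a smoothed estimate for one query term.

    tf / doc_len is the term's relative frequency in the document and
    cf / coll_len its relative frequency in the collection; alpha mixes
    the two (assumed smoothing scheme, hypothetical defaults).
    Requires cf > 0 so the logarithm is defined.
    """
    smoothed = (1.0 - alpha) * tf / doc_len + alpha * cf / coll_len
    return weight * math.log(smoothed)


# Because potentials are kept in log space, a document's score is a sum.
toy_score = sum(
    log_term_potential(tf, 10, cf, 100)
    for tf, cf in [(2, 5), (0, 3)]  # made-up counts for two query terms
)
print(toy_score)
```

Working in log space turns the product over cliques into a sum and avoids numerical underflow when many small probabilities are multiplied.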

Parameter Training

Parameter Training
What optimization technique do we use?
– Via a parameter sweep, the authors found that the metric surfaces share a common shape
– So a hill-climbing search should work well
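A hill-climbing search over the model weights could look something like the sketch below. It is not the authors' procedure: the objective `toy_score` is a made-up stand-in for a retrieval metric such as mean average precision, the three weights are assumed to be non-negative and to sum to one, and all names and step sizes are hypothetical.

```python
def hill_climb(score, start=(1 / 3, 1 / 3, 1 / 3), step=0.1, min_step=0.001):
    """Greedy search over non-negative weights that sum to one."""
    best = list(start)
    best_score = score(best)
    while step >= min_step:
        improved = False
        for i in range(len(best)):
            for delta in (step, -step):
                cand = best[:]
                cand[i] += delta
                if cand[i] < 0:
                    continue  # weights must stay non-negative
                total = sum(cand)
                cand = [w / total for w in cand]  # renormalize to sum to one
                s = score(cand)
                if s > best_score + 1e-12:
                    best, best_score, improved = cand, s, True
        if not improved:
            step /= 2  # no neighbour is better: refine the step size
    return best, best_score


TARGET = (0.8, 0.1, 0.1)  # arbitrary peak of the toy objective


def toy_score(weights):
    """Stand-in objective with a single peak at TARGET (not real MAP)."""
    return -sum((w - t) ** 2 for w, t in zip(weights, TARGET))


weights, value = hill_climb(toy_score)
print([round(w, 3) for w in weights], value)
```

Greedy search like this only finds the global optimum when the surface has a single peak, which is why the observation about the shape of the metric surface matters.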

Results
Did MRFs help?
– I'd say so. Significant gains across data sets
[Results tables: Independent, Sequential Dependence, Full Dependence]

Future Work

1. Motivation behind the work
2. Background – What is a Markov Random Field Model (MRF)?
3. Research Insight – How did the authors use MRF to model term dependencies? Results?
4. Future Work – If you thought this was interesting, how could you build on this?
5. Conclusion

Questions?