Published by Kathleen Hunt. Modified over 9 years ago.
1
Survey on Long Queries in Keyword Search: Phrase-based IR
Sungchan Park
2008. 08. 07.
2
Copyright 2008 by CEBT
Survey So Far…
Jaehui: Term Proximity Scoring
Jung-Yeon: Semantic Query
Jongheum: Index Structure Optimized for Multi-keyword Query
3
My Topic: Phrase-based IR
Why?
The presence of phrases is one significant difference between single-word queries and multi-word queries.
Identifying phrases is important for understanding the real meaning of a sentence.
– Ex) “hot dog”
Thus, how to identify and use phrases in queries is important when devising a processing strategy for multi-word queries.
Focus of Survey
Using phrases (judging relevance)
– The material on identifying phrases is skipped
4
Early Research on Phrase-based IR
Using fixed proximity constraints (window size)
“The Use of Phrase and Structural Queries in Information Retrieval” (1991)
“Evaluation of Syntactic Phrase Indexing” (1996)
…
[Figure: a query phrase (word#1 word#2 word#3) and a relevant document in which word#1, word#2, and word#3 all fall inside one window]
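A minimal sketch of the fixed-window idea, not the method of either paper cited above: a document is treated as matching a query phrase when every phrase word occurs inside some run of `window` consecutive tokens. The function name `phrase_in_window`, the whitespace tokenization, the example sentences, and the window sizes are all assumptions made for illustration.

```python
def phrase_in_window(doc_tokens, phrase_words, window):
    """Return True if all phrase words occur within `window` consecutive tokens.

    Word order inside the window is ignored (an assumption of this sketch).
    """
    needed = set(phrase_words)
    for start in range(len(doc_tokens)):
        # Slide a window of `window` tokens over the document.
        if needed.issubset(doc_tokens[start:start + window]):
            return True
    return False


# Hypothetical documents: "hot" and "dog" adjacent vs. six tokens apart.
near_doc = "a hot dog stand".split()
far_doc = "the hot summer day ended with a dog barking".split()

near = phrase_in_window(near_doc, ["hot", "dog"], window=5)
far = phrase_in_window(far_doc, ["hot", "dog"], window=5)
```

With a window of 5 tokens the first document matches and the second does not; widening the window to 7 would make the second one match too, which is the weakness of a single fixed size for every phrase.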
5
Progress #1: Structural Proximity
“Phrase-based Information Retrieval”, A. T. Arampatzis et al., 1998
Identifying noun phrases in documents, and using those noun phrases as the criterion of “nearness”
[Figure: the query phrase words “BBC”, “radio”, and “programs” matched within a noun phrase that an NLP engine identified in a relevant document]
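The structural notion of nearness can be sketched as follows, assuming the noun phrases have already been found by some NLP engine and are supplied as token index spans. This is an illustration only, not the implementation from the 1998 paper: the function name `phrase_in_noun_phrase`, the example sentence, and the hand-written spans are hypothetical.

```python
def phrase_in_noun_phrase(doc_tokens, noun_phrase_spans, phrase_words):
    """Return True if all phrase words fall inside a single noun phrase.

    `noun_phrase_spans` is a list of (start, end) token indices, end exclusive,
    assumed to come from an external noun-phrase chunker.
    """
    needed = {w.lower() for w in phrase_words}
    for start, end in noun_phrase_spans:
        words = {t.lower() for t in doc_tokens[start:end]}
        if needed <= words:
            return True
    return False


# Hypothetical sentence with hand-annotated noun phrases:
# "The studios" -> (0, 2), "later BBC radio programs" -> (5, 9)
doc_tokens = "The studios were used for later BBC radio programs".split()
spans = [(0, 2), (5, 9)]

same_np = phrase_in_noun_phrase(doc_tokens, spans, ["BBC", "radio", "programs"])
cross_np = phrase_in_noun_phrase(doc_tokens, spans, ["studios", "BBC"])
```

Here the three query words share one noun phrase and match, while "studios" and "BBC" sit in different noun phrases and do not, regardless of how few tokens separate them.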
6
Progress #1: Structural Proximity, Experiment
Experiment Result
Gained high precision
But lost recall
– The authors wrote that this can be addressed by taking linguistic variation and anaphora into account.
7
Progress #2: Varied Window Size
“An Effective Approach to Document Retrieval via Utilizing WordNet and Recognizing Phrases”, Shuang Liu et al., 2004
– Their follow-up work was published in 2007
Classifying phrases into four types
– Proper name
– Dictionary phrase
– Simple phrase
– Complex phrase
– The proximity constraint differs for each type!
8
Progress #2: Varied Window Size, Example
Query Phrase #1: “Sungchan Park”
– NOT relevant document: “Sungchan” … “Park”
Query Phrase #2: “mental illness”
– Relevant document: “… was hospitalized for mental problem … and had been on lithium for his illness Recently …”
9
Progress #2: Varied Window Size, Solution
Solution
Learning the window size for each phrase type
– Result by decision tree
Proper name: 0
Dictionary phrase: 16
Simple phrase: 48
Complex phrase: 78
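To make the type-dependent constraint concrete, here is a rough sketch that plugs the four learned window sizes from this slide into a proximity check. The slide does not say how a window size is measured, so the semantics below are an assumption: all phrase words must fit in a span of at most `len(phrase_words) + window` tokens, so a window of 0 requires the words to be adjacent. The names `WINDOW_BY_TYPE` and `matches_phrase` and the example documents are hypothetical.

```python
# Window sizes taken from the slide; their exact unit is an assumption here.
WINDOW_BY_TYPE = {
    "proper name": 0,
    "dictionary phrase": 16,
    "simple phrase": 48,
    "complex phrase": 78,
}


def matches_phrase(doc_tokens, phrase_words, phrase_type):
    """Check a phrase against a document using the window for its type.

    Assumed semantics: the phrase words must all occur in a span of at most
    len(phrase_words) + window tokens. Word order is ignored.
    """
    max_span = len(phrase_words) + WINDOW_BY_TYPE[phrase_type]
    needed = set(phrase_words)
    for start in range(len(doc_tokens)):
        if needed.issubset(doc_tokens[start:start + max_span]):
            return True
    return False


# Hypothetical documents modeled on the example slide.
name_doc = "Sungchan met Mr. Park yesterday".split()
illness_doc = ("he was hospitalized for a mental problem and had been "
               "on lithium for his illness").split()

name_hit = matches_phrase(name_doc, ["Sungchan", "Park"], "proper name")
illness_hit = matches_phrase(illness_doc, ["mental", "illness"], "simple phrase")
```

Under these assumptions the proper name does not match because its two words are separated, while "mental illness" matches even though nine tokens lie between its first and last word.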
10
Progress #2: Varied Window Size, Experiment
Experiment Result
The authors did not compare their approach with a naïve approach.
From my point of view, the above result only shows that phrase-based IR can improve the performance of an IR system.
11
Conclusion
Phrase-based relevance models have been studied by only a few researchers.
However, the progress so far is interesting
– Determining nearness via sentence structure.
– Varying proximity constraints according to the type of query phrase.
12
References
The Use of Phrase and Structural Queries in Information Retrieval, 1991
Evaluation of Syntactic Phrase Indexing, 1996
Phrase-based Information Retrieval, 1998
Phrase Recognition and Expansion for Short, Precision-biased Queries based on a Query log, 1999
The Use of Phrases from Query Texts in Information Retrieval, 2000
An Effective Approach to Document Retrieval via Utilizing WordNet and Recognizing Phrases, 2004
The Role of Multi-word Units in Interactive Information Retrieval, 2005
Recognition and Classification of Noun Phrases in Queries for Effective Retrieval, 2007