1
Modern Information Retrieval Chapter 4 Query Languages
2
The type of query the user might formulate is largely dependent on the underlying information retrieval model
3
Keyword-based querying
  Single-word queries
    Text documents are long sequences of words.
    Results are ranked by term frequency and inverse document frequency.
    The exact positions where the query word appears may also need to be output.
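The single-word ranking described on this slide can be sketched as follows. This is a minimal illustration, not the book's implementation: the function name `rank_single_word`, the whitespace tokenizer, and the weighting `tf * log(N / df)` are all assumptions chosen for brevity.

```python
import math


def rank_single_word(query, docs):
    """Rank documents for a one-word query by tf * idf.

    Hypothetical helper: the weighting tf * log(N / df) is one common
    variant and is assumed here, not taken from the slides.
    Returns (doc_id, score, positions) tuples, best score first.
    """
    tokenized = [doc.lower().split() for doc in docs]
    # Document frequency: how many documents contain the query word.
    df = sum(1 for words in tokenized if query in words)
    if df == 0:
        return []
    idf = math.log(len(tokenized) / df)
    results = []
    for doc_id, words in enumerate(tokenized):
        # Exact word positions, which the slide notes may need to be output.
        positions = [i for i, w in enumerate(words) if w == query]
        if positions:
            results.append((doc_id, len(positions) * idf, positions))
    results.sort(key=lambda r: r[1], reverse=True)
    return results
```

With this weighting, a word that occurs in every document gets an idf of zero, so all matching documents tie.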
4
Context queries: search for words near other words
  Phrase query: a sequence of single words
  Proximity query: a sequence of single words with a maximum allowed distance between them
  Both enhance the power of retrieval.
  The words may or may not be required to appear in the same order as in the query.
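A rough sketch of phrase and proximity matching over a single text is given here. The helper names `phrase_match` and `proximity_match` are hypothetical, and distance is assumed to be the difference in word positions, so adjacent words are at distance 1.

```python
def phrase_match(text, phrase):
    """True if the words of `phrase` occur consecutively in `text`."""
    words = text.lower().split()
    target = phrase.lower().split()
    n = len(target)
    return any(words[k:k + n] == target for k in range(len(words) - n + 1))


def proximity_match(text, first, second, max_distance, ordered=True):
    """True if `first` and `second` occur within `max_distance` words.

    Assumption: distance is the difference in word positions.
    With ordered=True, `first` must appear before `second`.
    """
    words = text.lower().split()
    first_pos = [i for i, w in enumerate(words) if w == first]
    second_pos = [i for i, w in enumerate(words) if w == second]
    for i in first_pos:
        for j in second_pos:
            gap = j - i if ordered else abs(j - i)
            if 0 < gap <= max_distance:
                return True
    return False
```

The `ordered` flag corresponds to the slide's remark that the words may or may not be required to appear in query order.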
5
Boolean queries
  e1 BUT e2: documents that satisfy e1 but not e2
  NOT e2: documents that do not satisfy e2
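Boolean operators map naturally onto set operations over document ids. The sketch below assumes a toy inverted index (a dict from term to a set of document ids) and a hypothetical tuple encoding of the query tree. AND and OR are included as the usual companions of BUT and NOT, although this slide does not list them.

```python
def evaluate(query, index, all_docs):
    """Evaluate a Boolean query tree against a toy inverted index.

    `query` is either a term (str) or a tuple:
      ("AND" | "OR" | "BUT", left, right)  or  ("NOT", operand).
    The tuple encoding and the index layout are illustrative assumptions.
    """
    if isinstance(query, str):
        return set(index.get(query, set()))
    op = query[0]
    if op == "NOT":
        # NOT needs the full collection to complement against.
        return set(all_docs) - evaluate(query[1], index, all_docs)
    left = evaluate(query[1], index, all_docs)
    right = evaluate(query[2], index, all_docs)
    if op == "AND":
        return left & right
    if op == "OR":
        return left | right
    if op == "BUT":
        # e1 BUT e2: satisfies e1 and does not satisfy e2.
        return left - right
    raise ValueError(f"unknown operator: {op}")
```

Note that BUT only touches documents already selected by its left operand, whereas NOT has to consider every document in the collection.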
6
Pattern matching
  Data retrieval capabilities as enhanced tools for IR
  Types of patterns:
    word: computer
    prefix of a word: comput
    suffix of a word: ter
    substring of a word: ute
    range formed by two words in lexicographical order: communication and computer
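The pattern types listed on this slide can be illustrated with plain string operations over a vocabulary. The function `find_words` and its `kind` labels are hypothetical, and treating the range endpoints as inclusive is an assumption.

```python
def find_words(vocabulary, kind, pattern, upper=None):
    """Return the vocabulary words matching one of the slide's pattern types.

    `kind` is one of "word", "prefix", "suffix", "substring", "range".
    For "range", `pattern` is the lower bound and `upper` the upper bound
    (both inclusive here, which is an assumption).
    """
    tests = {
        "word": lambda w: w == pattern,
        "prefix": lambda w: w.startswith(pattern),
        "suffix": lambda w: w.endswith(pattern),
        "substring": lambda w: pattern in w,
        # Python compares strings lexicographically by code point.
        "range": lambda w: pattern <= w <= upper,
    }
    test = tests[kind]
    return [w for w in vocabulary if test(w)]
```

For example, with a vocabulary containing "commuter", the range from "communication" to "computer" includes it, because "commuter" sorts between the two bounds.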
7
Word with an error threshold
  Edit distance: the minimum number of character insertions, deletions, and replacements needed to make the query and the target equal
  Examples: computeers; computational biology
  Unit-cost edit distance:
    w(a → b) = 1 for a ≠ b (replacement)
    w(a → ε) = w(ε → b) = 1 (deletion and insertion)
8
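Given any two strings S1 = abac and S2 = aaccb, compute the edit distance from x to y by dynamic programming.
  Moves in the table: H (horizontal): delete; V (vertical): insert; C: replace

          a  b  a  c
       0  1  2  3  4
    a  1  0  1  2  3
    a  2  1  1  1  2
    c  3  2  2  2  1
    c  4  3  3  3  2
    b  5  4  3  4  3

  The edit distance is 3 (1 deletion, 2 insertions):
    abac → aac (delete b) → aacc (insert c) → aaccb (insert b)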
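The dynamic-programming computation for this example can be reproduced with a short sketch. The function name `edit_distance_table` is hypothetical. The layout follows the slide's table: columns correspond to the characters of S1 (the source) and rows to the characters of S2 (the target), with unit costs for every operation.

```python
def edit_distance_table(source, target):
    """Build the unit-cost edit distance table from `source` to `target`.

    Columns index `source`, rows index `target`. The bottom-right
    cell holds the edit distance.
    """
    rows, cols = len(target) + 1, len(source) + 1
    table = [[0] * cols for _ in range(rows)]
    for j in range(cols):
        table[0][j] = j  # delete the first j characters of source
    for i in range(rows):
        table[i][0] = i  # insert the first i characters of target
    for i in range(1, rows):
        for j in range(1, cols):
            replace_cost = 0 if target[i - 1] == source[j - 1] else 1
            table[i][j] = min(
                table[i][j - 1] + 1,                 # horizontal move: delete
                table[i - 1][j] + 1,                 # vertical move: insert
                table[i - 1][j - 1] + replace_cost,  # diagonal move: replace or match
            )
    return table


table = edit_distance_table("abac", "aaccb")
print(table[-1][-1])  # prints 3
```

Matching a word against an error threshold then amounts to accepting a target when the bottom-right cell does not exceed the threshold.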
9
Regular expression
  A regular expression is a pattern built up from simple strings and the union, concatenation, and repetition operators.
  Example: pro(blem|tein)(s|ε)(0|1|2)*
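The example expression can be tried out with Python's `re` module. This is only an illustration: the empty-string alternative ε is written as an optional `s?`, and `fullmatch` is used on the assumption that the pattern must match the whole word.

```python
import re

# pro(blem|tein)(s|ε)(0|1|2)* with ε expressed as an optional "s".
PATTERN = re.compile(r"pro(blem|tein)s?(0|1|2)*")


def matches(word):
    """True if the entire word is matched by the example expression."""
    return PATTERN.fullmatch(word) is not None


for word in ["problem", "proteins", "problem02", "program", "protein3"]:
    print(word, matches(word))
```

Here union is `|`, concatenation is juxtaposition, and repetition is `*`, the three operators named on the slide.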