1
Modern Information Retrieval, Chapter 2: Modeling
2
Can keywords be used to represent a document or a query?
Using keywords as the query and simple matching as query processing does not, in general, produce good results.
Central issues: the ranking algorithm, document relevance, and the IR model.
3
Taxonomy of IR models
4
Ad hoc and filtering retrieval
Ad hoc retrieval: the document collection is static while new queries are submitted.
Filtering retrieval: the queries are static while new documents stream in.
A user profile describes the user's preferences.
The profile is built from keywords and adjusted dynamically through relevance feedback.
5
Formal characterization of IR models
6
Classic IR
Index terms: deciding on the importance of a term is difficult.
Consider a term's semantics as well as its distribution across all documents.
Weights are used to quantify the importance of the index terms for describing the document contents.
7
Assuming that index term weights are mutually independent simplifies the task of computing rankings quickly.
8
Boolean model
Index term weights are binary.
A query is a Boolean expression with not, and, or as connectives.
Users might find it difficult to express their information needs as Boolean expressions.
9
Advantages and disadvantages
Each document is either relevant or non-relevant.
Given the term vector d_j = (0,1,0), is document d_j an answer?
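The question about the vector (0,1,0) can be worked through with a small sketch of Boolean matching. The slide text does not preserve the query, so the query used here, ka AND (kb OR NOT kc), is an assumption chosen for illustration, and the function names are hypothetical. The sketch enumerates every binary term vector that satisfies the query (its disjunctive normal form) and treats a document as an answer only if its vector is one of them.

```python
from itertools import product

def query(ka: int, kb: int, kc: int) -> bool:
    # Assumed example query (not taken from the slides): ka AND (kb OR NOT kc)
    return bool(ka and (kb or not kc))

def dnf(q, n_terms: int = 3) -> set:
    # Every binary term-weight vector for which the query evaluates to true
    return {bits for bits in product((0, 1), repeat=n_terms) if q(*bits)}

def is_answer(doc, q) -> bool:
    # Boolean model: a document is relevant only if its binary vector
    # matches one conjunctive component of the query exactly
    return tuple(doc) in dnf(q, len(doc))

print(sorted(dnf(query)))
print(is_answer((0, 1, 0), query))
```

Under this assumed query, the document (0,1,0) is not an answer because it lacks the first term, and there is no notion of it being a near miss. That all-or-nothing outcome is the disadvantage the slide points to.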
10
Vector model
Allows partial matching and ranking by a similarity measure.
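A minimal sketch of ranking by a similarity measure, assuming the cosine of the angle between the document and query weight vectors is used (the slide text does not name the measure). The helper names and the toy vectors are hypothetical.

```python
import math

def cosine_similarity(d, q) -> float:
    # Cosine of the angle between two weight vectors of equal length
    dot = sum(x * y for x, y in zip(d, q))
    norm_d = math.sqrt(sum(x * x for x in d))
    norm_q = math.sqrt(sum(y * y for y in q))
    if norm_d == 0 or norm_q == 0:
        return 0.0  # an all-zero vector shares no terms with anything
    return dot / (norm_d * norm_q)

def rank(docs: dict, q) -> list:
    # Sort documents by decreasing similarity to the query
    scored = [(name, cosine_similarity(vec, q)) for name, vec in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

docs = {"d1": [1, 0, 0], "d2": [1, 1, 0], "d3": [0, 0, 1]}
for name, score in rank(docs, [1, 1, 0]):
    print(name, round(score, 3))
```

Here d1 shares only one of the two query terms yet still receives a nonzero score, so it is ranked below d2 and above d3 rather than being rejected outright. That is the partial matching the slide refers to.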
12
Computing index term weights
Term frequency (tf factor): how well the term describes the document contents.
Inverse document frequency (idf factor): how well the term distinguishes the document from the rest of the collection.
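The two factors can be combined into a single weight. The sketch below assumes one common formulation, which the slide text itself does not spell out: tf is the raw term count normalized by the count of the most frequent term in the document, idf is log(N / n_i) with N the number of documents and n_i the number of documents containing the term, and the weight is their product. The function name and the toy collection are hypothetical.

```python
import math
from collections import Counter

def tf_idf(docs: list) -> list:
    # docs: one list of tokens per document
    n_docs = len(docs)
    counts = [Counter(doc) for doc in docs]
    # Document frequency: number of documents in which each term occurs
    df = Counter(term for c in counts for term in c)
    weights = []
    for c in counts:
        max_freq = max(c.values()) if c else 1
        weights.append({
            term: (freq / max_freq) * math.log(n_docs / df[term])
            for term, freq in c.items()
        })
    return weights

docs = [["ir", "model", "model"], ["ir", "vector"], ["ir", "boolean"]]
for w in tf_idf(docs):
    print(w)
```

In this toy collection the term "ir" occurs in every document, so its idf is log(1) = 0 and it contributes nothing to any weight, while "model" occurs in only one document and is its most frequent term, so it receives the largest weight.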
14
The vector model is a popular retrieval model due to its simplicity and performance.