

1 Boolean Model

2 A document is represented as a set of keywords. Queries are Boolean expressions over keywords, connected by AND, OR, and NOT, with brackets to indicate scope. –[[[Rio & Brazil] | [Hilo & Hawaii]] & hotel & !Hilton] Output: a document is either relevant or not. No partial matches or ranking.
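A minimal sketch of the example query, assuming a small hypothetical collection (the document names and keyword sets below are invented for illustration). Each document is a Python set of keywords, and the query is evaluated as a plain Boolean expression that yields relevant or not, with no score.

```python
# Hypothetical toy collection: each document is a set of keywords.
docs = {
    "d1": {"rio", "brazil", "hotel"},
    "d2": {"hilo", "hawaii", "hotel", "hilton"},
    "d3": {"hilo", "hawaii", "hotel"},
    "d4": {"rio", "hotel"},
}


def matches(doc):
    """Evaluate ((Rio AND Brazil) OR (Hilo AND Hawaii)) AND hotel AND NOT Hilton."""
    return (
        (("rio" in doc and "brazil" in doc) or ("hilo" in doc and "hawaii" in doc))
        and "hotel" in doc
        and "hilton" not in doc
    )


# The output is a plain set of matching documents; there is no ordering by relevance.
relevant = sorted(name for name, keywords in docs.items() if matches(keywords))
print(relevant)  # ['d1', 'd3']
```

Here d2 is rejected only because it mentions Hilton, and d4 is rejected because it lacks Brazil; neither gets partial credit.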

3 Boolean Model. Simple model based on set theory. Queries are specified as Boolean expressions: –precise semantics; –neat formalism; –$q = k_a \wedge (k_b \vee \neg k_c)$. Terms are either present or absent, thus $w_{ij} \in \{0,1\}$. Consider: –$q = k_a \wedge (k_b \vee \neg k_c)$ –$\vec{q}_{dnf} = (1,1,1) \vee (1,1,0) \vee (1,0,0)$ –$\vec{q}_{cc} = (1,1,0)$ is a conjunctive component.
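As a worked step, the disjunctive normal form of the example query can be obtained by distributing the conjunction and then completing each term over all three keywords. This derivation is a sketch added for illustration; the slide only states the result.

```latex
\begin{align*}
q &= k_a \wedge (k_b \vee \neg k_c) \\
  &= (k_a \wedge k_b) \vee (k_a \wedge \neg k_c) \\
  &= (k_a \wedge k_b \wedge k_c) \vee (k_a \wedge k_b \wedge \neg k_c)
     \vee (k_a \wedge \neg k_b \wedge \neg k_c)
\end{align*}
```

Reading each conjunct as a binary weight vector over $(k_a, k_b, k_c)$ gives $(1,1,1)$, $(1,1,0)$, and $(1,0,0)$.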

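4 Boolean Model. For $q = k_a \wedge (k_b \vee \neg k_c)$: $sim(q,d_j) = 1$ if $\exists\, \vec{q}_{cc} \mid (\vec{q}_{cc} \in \vec{q}_{dnf}) \wedge (\forall k_i,\ g_i(\vec{d}_j) = g_i(\vec{q}_{cc}))$, and $sim(q,d_j) = 0$ otherwise. [Figure: diagram of $k_a$, $k_b$, $k_c$ with the components (1,1,1), (1,1,0), and (1,0,0) labeled.]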
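The similarity definition can be sketched in Python as follows. The names `TERMS`, `query`, `Q_DNF`, and `sim` are hypothetical, and the sketch assumes the comparison is restricted to the three query terms, so other keywords in a document are ignored.

```python
from itertools import product

TERMS = ("ka", "kb", "kc")  # query index terms, in vector order


def query(ka, kb, kc):
    """q = ka AND (kb OR NOT kc)."""
    return bool(ka and (kb or not kc))


# vec(q_dnf): every binary weight vector over (ka, kb, kc) that satisfies q.
Q_DNF = {bits for bits in product((1, 0), repeat=len(TERMS)) if query(*bits)}


def sim(doc_keywords):
    """Return 1 if the document's binary vector equals some conjunctive component, else 0."""
    doc_vec = tuple(int(term in doc_keywords) for term in TERMS)
    return 1 if doc_vec in Q_DNF else 0


print(sorted(Q_DNF, reverse=True))  # [(1, 1, 1), (1, 1, 0), (1, 0, 0)]
print(sim({"ka", "kb"}))            # 1
print(sim({"ka", "kc"}))            # 0
```

A document containing ka and kc but not kb maps to (1,0,1), which is not a conjunctive component, so its similarity is 0.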

5 Boolean Retrieval Model. Popular retrieval model because: –Easy to understand for simple queries. –Clean formalism. Boolean models can be extended to include ranking. Reasonably efficient implementations are possible for normal queries.
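The slides do not say how an efficient implementation would look. One common approach, shown here only as an assumption, is an inverted index that maps each keyword to the set of documents containing it, so that AND, OR, and NOT become set intersection, union, and difference. The collection and the helper `postings` are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy collection keyed by document id.
docs = {
    1: {"rio", "brazil", "hotel"},
    2: {"hilo", "hawaii", "hotel", "hilton"},
    3: {"hilo", "hawaii", "hotel"},
}

# Inverted index: keyword -> set of ids of the documents that contain it.
index = defaultdict(set)
for doc_id, keywords in docs.items():
    for keyword in keywords:
        index[keyword].add(doc_id)

all_ids = set(docs)


def postings(term):
    """Ids of documents containing the term (empty set for unknown terms)."""
    return index.get(term, set())


# AND is intersection, OR is union, NOT is difference from the whole collection.
result = (
    ((postings("rio") & postings("brazil")) | (postings("hilo") & postings("hawaii")))
    & postings("hotel")
    & (all_ids - postings("hilton"))
)
print(sorted(result))  # [1, 3]
```

With this layout a query touches only the posting sets of its own terms rather than scanning every document.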

6 Boolean Models: Problems. Retrieval is based on a binary decision criterion with no notion of partial matching. No ranking of the documents is provided (absence of a grading scale). Very rigid: AND means all; OR means any. The information need has to be translated into a Boolean expression, which most users find awkward. The Boolean queries formulated by users are most often too simplistic. It is difficult to express complex user requests.

7 Boolean Models: Problems. As a consequence, the Boolean model frequently returns either too few or too many documents in response to a user query. Difficult to control the number of documents retrieved. –All matched documents will be returned. Difficult to rank output. –All matched documents logically satisfy the query. Difficult to perform relevance feedback. –If a document is identified by the user as relevant or irrelevant, how should the query be modified?
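The "too few or too many" effect can be illustrated with a small sketch over a hypothetical five-document collection (all names and keyword sets are invented). The same three terms are combined once with AND and once with OR.

```python
# Hypothetical toy collection: each document is a set of keywords.
docs = {
    "d1": {"boolean", "retrieval", "model"},
    "d2": {"boolean", "algebra"},
    "d3": {"retrieval", "evaluation"},
    "d4": {"vector", "model"},
    "d5": {"boolean", "retrieval"},
}
terms = {"boolean", "retrieval", "model"}

# AND: every term must be present (terms is a subset of the document's keywords).
and_hits = sorted(d for d, keywords in docs.items() if terms <= keywords)

# OR: at least one term must be present (non-empty intersection).
or_hits = sorted(d for d, keywords in docs.items() if terms & keywords)

print(and_hits)  # ['d1']
print(or_hits)   # ['d1', 'd2', 'd3', 'd4', 'd5']
```

Document d5 contains two of the three terms, yet AND excludes it entirely and OR returns it on equal footing with d2, which contains only one. Neither result set carries an order, which is why the number and ranking of returned documents are hard to control.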