Published by Audra Hudson. Modified over 8 years ago.
1
BOOLEAN INFORMATION RETRIEVAL
Adrienn Skrop
2
Boolean Information Retrieval
The Boolean model of IR (BIR) is a classical IR model and, at the same time, the first and most widely adopted one. Boolean queries are still supported by a great many commercial IR systems. The BIR is based on Boolean logic and classical set theory: both the documents to be searched and the user's query are conceived as sets of terms. Retrieval is based on whether or not the documents contain the query terms.
3
Boolean logic
4
Proposition
A proposition is a statement (formulation, assertion) that can be assigned either the value T or the value F (there is no third alternative), where T and F are two different values, i.e., T ≠ F. For example, T = true, F = false; or T = yes, F = no; or T = white, F = black; or T = 1, F = 0. The values T and F are referred to as truth values. A proposition cannot be true and false at the same time (principle of noncontradiction).
5
Example
“I am reading this text.” is a true proposition. The sentence “The sun is shining.” is also a proposition, because either the value T or the value F can be assigned to it. The sentence “The cooks wearing red hats are playing football at the North Pole.” becomes a proposition if a truth value can be assigned to it. In general, a proposition does not have to be true. For example, the proposition “It is raining.” may be true or false. However, there are propositions that are ‘absolutely’ true (e.g., “The year 2001 is the first year of the 21st century.”).
6
Negation
The negation of a proposition P is a proposition denoted by ¬P and pronounced “not P”. If P is true, then ¬P is false, and if P is false, then ¬P is true. Hence, ¬(¬P) always has the same truth value as P (law of double negation). “I am not reading this text.” is a false proposition, and is the negation of the proposition “I am reading this text.”
Truth table of logical negation:
P | ¬P
T | F
F | T
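The negation table and the law of double negation can be checked mechanically. The following is a minimal sketch that assumes T and F are modelled by Python's `True` and `False`; the names `negation`, `negation_table` and `double_negation_holds` are illustrative only.

```python
def negation(p: bool) -> bool:
    """Return the truth value of 'not P'."""
    return not p

# Truth table of negation as (P, not P) rows, in the order T, F.
negation_table = [(p, negation(p)) for p in (True, False)]

# Law of double negation: not (not P) has the same truth value as P.
double_negation_holds = all(negation(negation(p)) == p for p in (True, False))
```

Because a proposition has only two possible truth values, checking both of them is an exhaustive check of the law.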
7
Conjunction
Given two propositions P and Q, the proposition denoted by P ∧ Q (pronounced “P and Q”) is called their conjunction. The conjunction is true if and only if both P and Q are true, and false otherwise. Thus, P ∧ (¬P) is always false (law of contradiction).
Truth table of logical conjunction:
P | Q | P ∧ Q
T | T | T
T | F | F
F | T | F
F | F | F
8
Example
“I am reading this text. ∧ It is raining.” is a proposition, and its truth value can be assigned by the Reader. “I am thinking about myself. ∧ A bicycle has two wheels.” is also a proposition (the Reader can assign a truth value to it), although one would rarely join its two constituent propositions into one sentence in everyday speech.
9
Disjunction
Given two propositions P and Q, the proposition denoted by P ∨ Q (pronounced “P or Q”) is called their disjunction. The disjunction is false if and only if both P and Q are false, and true otherwise. Thus, P ∨ (¬P) is always true (law of excluded third, also known as the law of the excluded middle).
Truth table of logical disjunction:
P | Q | P ∨ Q
T | T | T
T | F | T
F | T | T
F | F | F
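The truth tables of conjunction and disjunction, together with the laws of contradiction and of excluded third, can be reproduced by enumerating all truth assignments. This is a small sketch that assumes Python's `and`/`or` on `bool` values as the model of the two connectives; all function and variable names are illustrative.

```python
from itertools import product

def conjunction(p: bool, q: bool) -> bool:
    """P and Q: true only when both operands are true."""
    return p and q

def disjunction(p: bool, q: bool) -> bool:
    """P or Q: false only when both operands are false."""
    return p or q

# Rows are (P, Q, result), in the order TT, TF, FT, FF.
rows = list(product((True, False), repeat=2))
conjunction_table = [(p, q, conjunction(p, q)) for p, q in rows]
disjunction_table = [(p, q, disjunction(p, q)) for p, q in rows]

# Law of contradiction: P and (not P) is false for every P.
contradiction_law = all(conjunction(p, not p) is False for p in (True, False))
# Law of excluded third: P or (not P) is true for every P.
excluded_third_law = all(disjunction(p, not p) is True for p in (True, False))
```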
10
Example
“I am reading this text. ∨ It is raining.” is a true proposition (regardless of whether it is actually raining or not), because its first constituent is true.
11
Set theory
12
Sets
The notion of set is a fundamental one. It is taken as primitive and has no formal mathematical definition. Informally, a set is a collection of distinct objects. The objects in a set are called elements. If an object x is an element of a set S (equivalently: x belongs to S), this is denoted by x ∈ S. x ∉ S means that x does not belong to S. It is very important to note that:
An element can occur at most once in a set.
The order of the elements in a set is unimportant.
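Python's built-in `set` type behaves the same way, which makes the two remarks above easy to observe. The sketch below is illustrative; the element values and variable names are arbitrary.

```python
# An element occurs at most once: the repeated "ape" is stored only once.
S = {"ape", "quantum", "ape"}
size = len(S)

# Order is unimportant: two sets with the same elements are equal.
same_set = {"ape", "quantum"} == {"quantum", "ape"}

# Membership: "x in S" plays the role of x ∈ S, "x not in S" of x ∉ S.
is_member = "ape" in S
is_not_member = "thought" not in S
```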
13
Sets
A set can be given by enumerating its elements between braces, e.g., A = {a_1, a_2, ..., a_n}, or by giving a property P(x) that all of its elements must share, as follows: A = {x | P(x)}. A set having a fixed number of elements is finite, and infinite otherwise. The empty set contains no elements and is denoted by ∅.
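Both ways of giving a set have a direct counterpart in Python. In this sketch the property `P` (being even) and the set `A` are made-up examples, and a set comprehension needs an explicit collection to draw x from, which the mathematical notation {x | P(x)} leaves implicit.

```python
# A set given by enumeration.
A = {1, 2, 3, 4, 5, 6}

def P(x: int) -> bool:
    """Hypothetical property: x is even."""
    return x % 2 == 0

# A set given by a property: {x | x ∈ A and P(x)}.
evens = {x for x in A if P(x)}

# The empty set. Note that {} creates an empty dict, not an empty set.
empty = set()
```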
14
Example
ℕ = {1, 2, ..., n, ...} denotes the set of natural numbers.
ℤ = {..., −2, −1, 0, 1, 2, ...} denotes the set of integers.
ℚ denotes the set of rational numbers, ℝ the set of real numbers, and ℂ the set of complex numbers.
{thought, ape, quantum, Rembrandt} is a set.
{mammal | water content of mammal’s milk is less than 20%} is a set.
15
Union
The union of sets A and B is denoted by the symbol ∪ and defined as follows: A ∪ B = {x | (x ∈ A) ∨ (x ∈ B)}.
16
Union example
{thought, ape, quantum, Rembrandt} ∪ {1, 2} = {thought, ape, quantum, Rembrandt, 1, 2}. Note that the operation of union is a purely formal one (just like the other set operations): it does not require that the elements of the sets be compatible with each other, or have the same nature, in any way.
17
Union
Set union satisfies the following properties (as can be easily checked using the definitions of set equality and union):
Commutativity: A ∪ B = B ∪ A, for any two sets A, B;
Associativity: A ∪ (B ∪ C) = (A ∪ B) ∪ C, for any three sets A, B, C;
Idempotency: A ∪ A = A, for any set A.
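The union example and the three properties can be tried out with Python sets, where `|` is the union operator. The sets `A` and `B` are taken from the union example; the third set `C` and the `universe` list are assumptions introduced only for this sketch.

```python
A = {"thought", "ape", "quantum", "Rembrandt"}
B = {1, 2}
C = {"ape", 3}  # hypothetical third set, used only for associativity

# Built-in union.
union_ab = A | B

# Union spelled out from the definition {x | (x ∈ A) ∨ (x ∈ B)},
# with x ranging over every object that appears in the sketch.
universe = list(A) + list(B) + list(C)
union_by_definition = {x for x in universe if (x in A) or (x in B)}

commutative = (A | B) == (B | A)
associative = (A | (B | C)) == ((A | B) | C)
idempotent = (A | A) == A
```

Checking the properties on particular sets illustrates them but does not prove them; the proof follows from the definitions, as the slide notes.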
18
Intersection
The intersection of sets A and B is denoted by the symbol ∩ and defined as follows: A ∩ B = {x | (x ∈ A) ∧ (x ∈ B)}.
[Figure: visualisation of the set intersection A ∩ B]
19
Disjoint sets
If A ∩ B = ∅, sets A and B are said to be disjoint.
[Figure: visualisation of the disjoint sets A and B]
20
Intersection example
{thought, ape, quantum, Rembrandt} ∩ {thought, Rembrandt, 1, 2} = {thought, Rembrandt}. Note that the result of the intersection consists of exactly those elements that occur in both sets.
21
Intersection
Set intersection satisfies the following properties (as can be easily checked using the definitions of set equality and intersection):
Commutativity: A ∩ B = B ∩ A, for any two sets A, B;
Associativity: A ∩ (B ∩ C) = (A ∩ B) ∩ C, for any three sets A, B, C;
Idempotency: A ∩ A = A, for any set A.
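Intersection, disjointness and the properties above map onto Python's `&` operator and `isdisjoint` method. In this sketch `A` and `B` come from the intersection example, while `C` is an assumed third set chosen to be disjoint from `A`.

```python
A = {"thought", "ape", "quantum", "Rembrandt"}
B = {"thought", "Rembrandt", 1, 2}
C = {1, 2}  # hypothetical set sharing no element with A

# Elements that occur in both A and B.
common = A & B

# A and C are disjoint: their intersection is the empty set.
disjoint_ac = (A & C) == set()
disjoint_ac_builtin = A.isdisjoint(C)

commutative = (A & B) == (B & A)
associative = (A & (B & C)) == ((A & B) & C)
idempotent = (A & A) == A
```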
22
Difference
The difference of sets A and B (in this order) is denoted by the symbol \ and is defined as follows: A \ B = {x | (x ∈ A) ∧ (x ∉ B)}.
Note: in general, A \ B ≠ B \ A (i.e., set difference does not commute).
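A short sketch makes the non-commutativity concrete. Python writes set difference as `-`; the two sets are reused from the intersection example purely for illustration.

```python
A = {"thought", "ape", "quantum", "Rembrandt"}
B = {"thought", "Rembrandt", 1, 2}

a_minus_b = A - B  # elements of A that are not in B
b_minus_a = B - A  # elements of B that are not in A

# The two differences are not equal, so the operation does not commute.
differs = a_minus_b != b_minus_a
```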
23
Powerset
The powerset ℘(A) of a set A is defined as follows: ℘(A) = {X | X ⊆ A}, i.e., the set of all subsets of A. The empty set is a member of the powerset of any set A, i.e., ∅ ∈ ℘(A).
Example: ℘({thought, ape, quantum}) = {∅, {thought}, {ape}, {quantum}, {thought, ape}, {thought, quantum}, {ape, quantum}, {thought, ape, quantum}}.
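One possible way to enumerate a powerset in Python is to collect all combinations of every size. The helper name `powerset` is an assumption of this sketch (it is not a built-in), and subsets are stored as `frozenset` because ordinary Python sets cannot be elements of another set.

```python
from itertools import chain, combinations

def powerset(a):
    """Return the set of all subsets of a, each subset as a frozenset."""
    items = list(a)
    # All combinations of size 0, 1, ..., len(items).
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return {frozenset(s) for s in subsets}

P = powerset({"thought", "ape", "quantum"})
number_of_subsets = len(P)          # 2 ** 3 subsets for a 3-element set
contains_empty = frozenset() in P   # the empty set is always a member
```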
24
Notation
Given a finite set T = {t_1, t_2, ..., t_j, ..., t_m} of elements called terms (e.g., words or expressions, possibly stemmed, that describe or characterise documents, such as the keywords given for a journal article).
25
Notation
Given a finite set D = {D_1, ..., D_i, ..., D_n}, D_i ∈ ℘(T), of elements called documents. These documents are formally conceived, for retrieval purposes, as being represented by sets of terms.
26
Notation
Given a Boolean expression Q called a query. For example (the terms are A, B, C, D, E):
[Example query expression not preserved in this transcript]
27
Boolean Information Retrieval
Retrieval is defined as follows:
1. For each term in the query, the set S_i of documents that contain the term (or, for a negated term, do not contain it) is obtained:
for a term A: S_i = {D | A ∈ D};
for a negated term, i.e., ¬A: S_i = {D | A ∉ D}.
2. The documents retrieved in response to Q are those that belong to the set obtained by applying the corresponding set operations to the sets S_i: intersection corresponds to logical AND, union corresponds to logical OR.
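The two retrieval steps can be sketched as a small recursive function. Everything specific here is an assumption made for illustration: the function name `retrieve`, the encoding of a query as nested tuples tagged "AND", "OR" and "NOT", and the three-document collection `docs`.

```python
def retrieve(query, docs):
    """Return the names of the documents in docs that satisfy query.

    query is a term (str), ("NOT", term), ("AND", q1, q2) or ("OR", q1, q2).
    docs maps a document name to its set of terms.
    """
    if isinstance(query, str):
        # Step 1, plain term A: {D | A ∈ D}.
        return {name for name, terms in docs.items() if query in terms}
    op = query[0]
    if op == "NOT":
        # Step 1, negated term: {D | A ∉ D}.
        term = query[1]
        return {name for name, terms in docs.items() if term not in terms}
    # Step 2: combine the sub-results with the matching set operation.
    left = retrieve(query[1], docs)
    right = retrieve(query[2], docs)
    if op == "AND":
        return left & right
    if op == "OR":
        return left | right
    raise ValueError(f"unknown operator: {op!r}")

# Hypothetical collection: document name -> set of terms.
docs = {"D1": {"A", "B"}, "D2": {"B", "C"}, "D3": {"C"}}
result = retrieve(("AND", "B", ("NOT", "A")), docs)
```

In this sketch negation is applied to single terms only, which is all that the definition above requires.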
28
BIR example 1
Q = A OR (B AND C)
Let S_1 be the result set for term A, S_2 the result set for term B, and S_3 the result set for term C.
The set retrieved in response to Q is S_1 ∪ (S_2 ∩ S_3).
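To see the correspondence between the logical operators and the set operations, the example can be run on a concrete collection. The slide does not give one, so the four documents below are hypothetical; the second computation evaluates Q directly on each document as a cross-check.

```python
# Hypothetical collection: document name -> set of terms.
docs = {
    "D1": {"A"},
    "D2": {"B", "C"},
    "D3": {"B"},
    "D4": {"C", "E"},
}

# Result sets for the individual terms.
S1 = {name for name, terms in docs.items() if "A" in terms}
S2 = {name for name, terms in docs.items() if "B" in terms}
S3 = {name for name, terms in docs.items() if "C" in terms}

# Q = A OR (B AND C) becomes S1 ∪ (S2 ∩ S3).
retrieved = S1 | (S2 & S3)

# Cross-check: evaluate the Boolean expression on each document directly.
retrieved_directly = {
    name for name, terms in docs.items()
    if ("A" in terms) or ("B" in terms and "C" in terms)
}
```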
29
BIR example 2
Let the set of original documents be O = {O_1, O_2, O_3}, where:
O_1 = Bayes' Principle: The principle that, in estimating a parameter, one should initially assume that each possible value has equal probability (a uniform prior distribution).
O_2 = Bayesian Decision Theory: A mathematical theory of decision-making which presumes utility and probability functions, and according to which the act to be chosen is the Bayes act, i.e. the one with highest Subjective Expected Utility. If one had unlimited time and calculating power with which to make every decision, this procedure would be the best way to make any decision.
30
BIR example 2
O_3 = Bayesian Epistemology: A philosophical theory which holds that the epistemic status of a proposition (i.e. how well proven or well established it is) is best measured by a probability and that the proper way to revise this probability is given by Bayesian conditionalisation or similar procedures. A Bayesian epistemologist would use probability to define, and explore the relationship between, concepts such as epistemic status, support or explanatory power.
31
BIR example 2
Let the set T of terms be: T = {t_1 = Bayes' Principle, t_2 = probability, t_3 = decision-making, t_4 = Bayesian Epistemology}.
Then, the set D of documents is as follows: D = {D_1, D_2, D_3}, where
D_1 = {Bayes' Principle, probability},
D_2 = {probability, decision-making},
D_3 = {probability, Bayesian Epistemology}.
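The slides do not say how the term sets D_i are derived from the original texts. One simple possibility, assumed here purely for illustration, is case-insensitive substring matching of each term against the text. The texts below are shortened excerpts of O_1, O_2 and O_3, and the helper name `index_document` is hypothetical.

```python
terms = ["Bayes' Principle", "probability", "decision-making", "Bayesian Epistemology"]

# Shortened excerpts of the original documents O_1, O_2, O_3.
originals = {
    "D1": "Bayes' Principle: The principle that, in estimating a parameter, "
          "one should initially assume that each possible value has equal "
          "probability (a uniform prior distribution).",
    "D2": "Bayesian Decision Theory: A mathematical theory of decision-making "
          "which presumes utility and probability functions.",
    "D3": "Bayesian Epistemology: A philosophical theory which holds that the "
          "epistemic status of a proposition is best measured by a probability.",
}

def index_document(text, terms):
    """Return the subset of terms occurring in text (assumed matching rule)."""
    lowered = text.lower()
    return {t for t in terms if t.lower() in lowered}

# Each document represented as a set of terms.
D = {name: index_document(text, terms) for name, text in originals.items()}
```

With this matching rule the excerpts yield the same term sets as given above; a real system would typically use tokenisation and stemming instead of raw substring tests.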
32
BIR example 2
Let the query Q be: Q = probability ∧ decision-making.
Firstly, the following sets S_1 and S_2 of documents D_i are obtained:
S_1 = {D_i | probability ∈ D_i} = {D_1, D_2, D_3},
S_2 = {D_i | decision-making ∈ D_i} = {D_2}.
Finally, the following documents D_i are retrieved in response to Q:
{D_i | D_i ∈ S_1 ∩ S_2} = {D_1, D_2, D_3} ∩ {D_2} = {D_2}.
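The computation can be replayed in a few lines. This sketch uses the term sets D_1, D_2, D_3 as given in the example; the lookup table `term_to_docs`, which stores the set of documents containing each term, is an illustrative choice and not part of the slides.

```python
# Documents as sets of terms, as given in the example.
D = {
    "D1": {"Bayes' Principle", "probability"},
    "D2": {"probability", "decision-making"},
    "D3": {"probability", "Bayesian Epistemology"},
}

# For every term, the set of documents that contain it.
term_to_docs = {}
for name, doc_terms in D.items():
    for term in doc_terms:
        term_to_docs.setdefault(term, set()).add(name)

S1 = term_to_docs["probability"]
S2 = term_to_docs["decision-making"]

# Q = probability AND decision-making corresponds to S1 ∩ S2.
retrieved = S1 & S2
```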