Evaluation of IR Performance Dr. Bilal IS 530 Fall 2005
Searching for Information Imprecise Incomplete Tentative Challenging
IR Performance Precision Ratio = (the number of relevant documents retrieved) / (the total number of documents retrieved)
IR Performance Recall Ratio = (the number of relevant documents retrieved) / (the total number of relevant documents in the database)
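The two ratios can be worked through on a small example. The sketch below is illustrative only: the function name `precision_recall` and the document ids are assumptions made for this example, and it presumes that the full set of relevant documents is already known, which is rarely true in practice.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single search.

    retrieved: set of ids of the documents the search returned
    relevant:  set of ids of the documents judged relevant to the query
    """
    hits = retrieved & relevant  # relevant documents that were retrieved
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical search: 8 documents retrieved, 10 relevant documents exist,
# and 4 of the retrieved documents are relevant.
retrieved = {"d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8"}
relevant = {"d1", "d2", "d3", "d4", "d9", "d10", "d11", "d12", "d13", "d14"}

precision, recall = precision_recall(retrieved, relevant)
print(f"precision = {precision:.2f}, recall = {recall:.2f}")
```

Here 4 of the 8 retrieved documents are relevant (precision 0.50), and 4 of the 10 relevant documents were found (recall 0.40).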
Why Do We Miss Items? Indexing errors Wrong search terms Wrong database Language variations Other???
Why Do We Get Unwanted Items? Indexing errors Wrong search terms Homographs Incorrect term relations Other???
Boolean Operators: OR increases recall; AND increases precision; NOT increases precision
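The effect of each operator can be sketched with set operations on a toy index. Everything in this example is assumed for illustration: the terms, the document numbers, and the relevance judgments. It shows one case where the tendencies hold, not a guarantee that they hold for every query.

```python
# Hypothetical postings: each term maps to the ids of documents indexed with it.
index = {
    "cats": {1, 2, 3, 4},
    "dogs": {3, 4, 5, 6},
    "training": {4, 6, 7},
}
relevant = {3, 5}  # assumed relevance judgments for the user's query


def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one retrieved set."""
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


searches = {
    "cats": index["cats"],
    "cats OR dogs": index["cats"] | index["dogs"],  # union broadens the set
    "cats AND dogs": index["cats"] & index["dogs"],  # intersection narrows it
    "(cats OR dogs) NOT training": (index["cats"] | index["dogs"]) - index["training"],
}

results = {}
for query, retrieved in searches.items():
    results[query] = precision_recall(retrieved, relevant)
    p, r = results[query]
    print(f"{query:30s} precision = {p:.2f}  recall = {r:.2f}")
```

In this toy data, OR raises recall over the single-term search, AND raises precision over it, and NOT raises precision over the OR search by removing documents indexed with the unwanted term.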
Recall and Precision in Practice: Inversely related; search strategies are designed for high precision or high recall (or medium); the needs of users dictate whether the search strategy leans towards recall or precision; practice helps in adjusting queries to favor recall or precision
Recall and Precision [Figure: graph with Recall and Precision axes, each scaled to 1.0]
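The inverse relationship between recall and precision can be illustrated by examining a ranked result list at successive cutoffs. This is a minimal sketch under assumed data: the ranking, the relevance judgments, and the function name `precision_recall_at_cutoffs` are hypothetical.

```python
def precision_recall_at_cutoffs(ranking, relevant):
    """Return a list of (k, precision, recall) after each of the top k results."""
    points = []
    hits = 0
    for k, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
        points.append((k, hits / k, hits / len(relevant)))
    return points


# Hypothetical ranked output and relevance judgments.
ranking = ["d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8"]
relevant = {"d1", "d2", "d4", "d7"}

points = precision_recall_at_cutoffs(ranking, relevant)
for k, p, r in points:
    print(f"top {k}: precision = {p:.2f}, recall = {r:.2f}")
```

Reading further down the list can only hold or raise recall, while precision tends to fall as more non-relevant documents are included. Plotting these points gives a recall-precision graph.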
Problems with Relevance, Recall, and Precision Yes or no decision Things are more or less relevant In practice not easy to measure Not focused on user needs
Relevance A match between a query and information retrieved Is a judgment Can be judged by anyone who is informed of the query and views the retrieved information
Relevance (cont.) Judgments may differ Is the basis for information retrieval evaluation methods (recall and precision) Documents can be ranked by likely relevance
Pertinence Based on information need rather than request and documents Can only be judged by user May differ from relevance judgments
Pertinence (cont.) Transient, varies with many factors Not often used in evaluation May be used as a measure of satisfaction
High Precision Searching: controlled vocabulary; limits (specific fields, major descriptors, date, language, etc.); AND operator; proximity; careful with truncation
High Recall Searching OR logic Keyword searching No limits Truncate Broader terms
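Truncation appears in both strategies: it is encouraged for high recall and needs care for high precision. The sketch below shows why, using simple right-hand truncation modeled as prefix matching. The vocabulary and the function name `truncate` are assumptions for this example; real search systems have their own truncation symbols and rules.

```python
# Hypothetical index vocabulary.
vocabulary = ["librarian", "libraries", "library", "liberal", "liberty", "libel"]


def truncate(stem, vocabulary):
    """Return the index terms matched by a right-truncated stem (stem*)."""
    return [term for term in vocabulary if term.startswith(stem)]


narrow = truncate("librar", vocabulary)  # variants of one concept
broad = truncate("lib", vocabulary)      # also picks up unrelated words

print("librar* matches:", narrow)
print("lib*    matches:", broad)
```

The longer stem gathers word variants and supports recall without much noise. The shorter stem also matches unrelated terms, which lowers precision.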
Related Concepts Topicality Aboutness Utility Pertinence Satisfaction
Hints for Improving Performance Good interview User presence, if possible Preliminary search and user response Evaluation during search (you or you and user) User feedback Search refinement as you progress