Information Retrieval Evaluation and the Retrieval Process.

Presentation transcript:

Information Retrieval Evaluation and the Retrieval Process

Why evaluate an IR system?
- To select between alternative systems
- To determine if a system meets the expressed and unexpressed needs of current users and non-users
- To improve IR systems and determine if improvement actually occurred
- To develop cost models

4 levels of evaluation (Lancaster)
- Effectiveness
- Benefits
- Cost effectiveness
- Cost benefits

Effectiveness
What a system does well, e.g., the percentage of reference questions answered accurately, or the recall and precision of a literature search. There are a number of measures of effectiveness.

Measuring Effectiveness

Measures of Effectiveness
- Recall
- Precision
- Relevance
- Pertinence or utility
- Novelty ratio
- Fallout and noise
- Timeliness
- Coverage
- Generality

Recall and Precision Ratios
(Here a = relevant items retrieved, b = non-relevant items retrieved, c = relevant items not retrieved, and d = non-relevant items not retrieved.)
- Recall = a/(a+c): the proportion of relevant items retrieved out of the total number of relevant items contained in the database
- Precision = a/(a+b): a signal-to-noise ratio, the proportion of retrieved materials that are relevant to the query
- Used together, the two ratios express the filtering capacity of the system
- Recall and precision tend to be inversely related
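A minimal sketch of these two ratios in code, assuming retrieved and relevant items are available as sets of document IDs (the function name and example values are illustrative, not from the lecture):

```python
def recall_precision(retrieved, relevant, collection_size):
    """Compute recall and precision from the 2x2 contingency cells."""
    a = len(retrieved & relevant)      # relevant and retrieved
    b = len(retrieved - relevant)      # retrieved but not relevant
    c = len(relevant - retrieved)      # relevant but not retrieved
    d = collection_size - (a + b + c)  # neither retrieved nor relevant (used by fallout and generality)
    recall = a / (a + c) if (a + c) else 0.0
    precision = a / (a + b) if (a + b) else 0.0
    return recall, precision

# Hypothetical search: 4 of the 6 relevant documents appear in a
# 10-document result set drawn from a 1,000-document collection.
r, p = recall_precision({1, 2, 3, 4, 7, 8, 9, 10, 11, 12},
                        {1, 2, 3, 4, 5, 6}, collection_size=1000)
print(f"recall = {r:.2f}, precision = {p:.2f}")  # recall = 0.67, precision = 0.40
```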

Relevance and Pertinence
- Relevance (or generality ratio) = (a+c)/(a+b+c+d): the number or proportion of materials in a system that are relevant to a query; can be hard to ascertain without scanning the entire database
- Pertinence: the relationship between a document and an information need
- Utility: the subset of a that is actually used
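With the hypothetical contingency cells from the sketch above (a = 4, b = 6, c = 2, d = 988), the generality ratio would work out to:

    (a + c) / (a + b + c + d) = (4 + 2) / 1000 = 0.006

i.e., roughly 0.6% of the collection is relevant to that query.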

Novelty Ratio, Fallout and Noise
- Novelty ratio: the subset of a that is actually new to the person evaluating relevance
- Fallout and noise: based on b, the subset of retrieved items that are not relevant
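The slide gives only verbal definitions; in the standard contingency-table formulation (an assumption here, not stated on the slide), fallout and noise can be computed from the same cells:

```python
def fallout_noise(a, b, c, d):
    """Fallout and noise from the 2x2 contingency cells (textbook formulas,
    not spelled out on the slide):
      fallout = b / (b + d)  -- share of all non-relevant documents that were retrieved
      noise   = b / (a + b)  -- share of the retrieved set that is non-relevant (1 - precision)
    """
    fallout = b / (b + d) if (b + d) else 0.0
    noise = b / (a + b) if (a + b) else 0.0
    return fallout, noise

print(fallout_noise(4, 6, 2, 988))  # hypothetical cells from the earlier example
```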

Timeliness, Coverage, and Generality
- Timeliness and coverage: factors that affect assessments of relevance and pertinence
- Generality: the number of documents related to a particular request in the entire database; the denser this ratio, the easier a search should be
- Accuracy

Criteria Commonly Used to Evaluate Retrieval Performance
- Recall
- Precision
- User effort
  - amount of time a user spends conducting a search
  - amount of time a user spends negotiating the inquiry and then separating relevant from irrelevant items
- Response time
- Benefits
- Search costs
- Cost effectiveness
- Cost benefits

Objective vs. Subjective Knowledge
- Factual or artifactual knowledge vs. how knowledge is constructed or modeled within an individual’s mind
- Subjective knowledge (and therefore relevance judgments) varies from person to person, e.g., individual aesthetic judgments or problem-solving methods

Benefits
What good a system does, e.g., how an information system benefits its users. Benefits are hard to measure.

Search Costs
- The economics of using different databases
- Using natural language indexing can shift effort onto the searcher

Cost Effectiveness
The relationship of cost criteria to quality criteria, e.g., the unit cost per relevant or new item retrieved.
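A purely hypothetical worked example: if a search costs $30 in connect time and staff effort and retrieves 12 relevant items, its cost effectiveness on this criterion is $30 / 12 = $2.50 per relevant item retrieved.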

Cost Benefits
- Cost savings through use of one information system over another
- Increased productivity, or avoidance of lost productivity
- Improved decision-making, or a reduction in the personnel needed to make decisions
- Avoidance of duplication of effort

Components of an Evaluation
1. Defining the scope of the evaluation (formative vs. summative)
2. Designing the evaluation program
3. Execution of the evaluation
4. Analysis and interpretation of the results
5. Modifying the system based on the results
6. Iteration if necessary (go back to step 3)

Real Life vs. Experimental Systems
- Experiments and benchmark tests (see the sketch after this list)
  - standardized collections, queries, and relevance judgments
  - tested against multiple systems
  - evaluated on recall and precision
  - biases often built into system design
- Predictive evaluation
  - expert reviews
  - usage simulation such as walkthroughs
- Real life
  - observing users’ interactions with the system
  - eliciting users’ opinions
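A minimal sketch of the benchmark-style approach, assuming relevance judgments (qrels) and ranked result lists keyed by query ID; all names and data are hypothetical. Each system is scored on the same queries and judgments so that results are directly comparable:

```python
def evaluate_run(run, qrels, cutoff=10):
    """Mean precision and recall at a fixed cutoff for one system.

    run:   {query_id: [doc_id, ...]}  ranked results per query
    qrels: {query_id: {doc_id, ...}}  documents judged relevant per query
    """
    precisions, recalls = [], []
    for qid, relevant in qrels.items():
        top = run.get(qid, [])[:cutoff]
        hits = sum(1 for doc in top if doc in relevant)
        precisions.append(hits / cutoff)
        recalls.append(hits / len(relevant) if relevant else 0.0)
    n = len(qrels)
    return sum(precisions) / n, sum(recalls) / n

# Compare two hypothetical systems on the same collection and queries.
qrels = {"q1": {"d1", "d3"}, "q2": {"d2"}}
system_a = {"q1": ["d1", "d3", "d9"], "q2": ["d2", "d7"]}
system_b = {"q1": ["d9", "d1"], "q2": ["d7", "d8"]}
print("System A:", evaluate_run(system_a, qrels, cutoff=3))
print("System B:", evaluate_run(system_b, qrels, cutoff=3))
```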

Classic IR Model (Bates)
- Document --> document representation
- Information need --> query
- The document representation is matched up with the query
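A toy illustration of that matching step, assuming a simple bag-of-words representation on both sides (the representation and matching function here are stand-ins; the model itself does not prescribe them):

```python
def represent(text):
    """Crude representation of a document or query: a bag of lower-cased terms."""
    return set(text.lower().split())

def match(query, document):
    """Match the query representation against the document representation,
    using term overlap as a stand-in for any real matching function."""
    q, d = represent(query), represent(document)
    return len(q & d) / len(q) if q else 0.0

print(match("retrieval evaluation", "Evaluation of information retrieval systems"))  # 1.0
```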

Problems with the Classic IR Model
- Users cannot use their own language
- Different users have different needs
- Users have different information needs at different times
- Users are not always able to read and write
- The information need may evolve during the search process
- Some users are not concerned about precision and recall
- Users may want to eliminate known items
- Users may want more cues to assist in assessing relevance

Other Factors Influencing Use
- Accessibility (physical, intellectual, and psychological) and ease of use are the most important determinants of whether an information service is used
- Principle of Least Effort
- Perceived technical quality also affects the choice of first source
- Perceptions of accessibility are influenced by experience

Berrypicking
- Search queries are not static, but evolve
- Searchers gather information in bits and pieces
- Searchers use a variety of search techniques
- Searchers use a variety of other sources as well as databases

Search Strategies
- Footnote chasing
- Citation searching
- Journal run
- Area scanning
- Subject searches in bibliographies, abstracts, and indexes
- Author searching

Making Retrieval More Effective
- The more techniques used, the more effective a search is likely to be
- Users should be able to search in ways that are already familiar or that they have found to be effective
- A visual representation of the contents of a system may aid users in orienting themselves