The Structure of Information Retrieval Systems LBSC 708A/CMSC 838L Douglas W. Oard and Philip Resnik Session 1: September 4, 2001.


Agenda
5:30-6:00 Introductions
6:00-6:15 What is “Information Retrieval?”
6:15-6:40 Applications
6:40-6:55 (Break)
6:55-7:20 Applications
7:20-7:50 System design
7:50-7:55 (Stretch break)
7:55-8:15 Course outline

Introductions
Pair up with a partner from another department
Get to know each other in 5 minutes (total)
Tell us about your partner in 30 seconds
  –Name
  –Degree program (department, Master’s/Ph.D./Visitor)
  –Information retrieval experience
  –One thing they would like to learn

What Do We Mean by “Information?”
How is it different from “data”?
  –Information is data in context
      Databases contain data and produce information
      IR systems contain and provide information
How is it different from “knowledge”?
  –Knowledge is a basis for making decisions
      Many “knowledge bases” contain decision rules

What Do We Mean by “Retrieval?”
Find something that you want
  –The information need may or may not be explicit
Known item search
  –Find the class home page
Answer seeking
  –Is Lexington or Louisville the capital of Kentucky?
Directed exploration
  –Who makes videoconferencing systems?

Global Internet User Population
[Chart of Internet users by language; surviving labels: English, Chinese. Source: Global Reach]

IR is More Than Web Searching!
Form into four groups to discuss:
  1: A system to search a collection of oral history interviews
  2: Construction of a personalized newspaper
  3: Software to find music recordings in an online CD store
  4: Searching all the Xerox copies ever made at an office

What To Do
Form 2-3 person subgroups to discuss: (10 min)
  –How would you describe what to search for?
      What makes one object “better” than another?
  –How would you recognize when you have found it?
      How would you explain the way you made a choice?
  –What kind of technology might be helpful?
      Speech recognition, optical character recognition, …
Get together to discuss what you learned (10 min)
  –Build a 5 minute PowerPoint presentation

Supporting the Search Process
[Diagram of the search process around an IR system. Labels: Source Selection, Query Formulation, Query, Search, Ranked List, Selection, Document, Examination, Document, Delivery, IR System, Query Reformulation and Relevance Feedback, Source Reselection, Nominate, Choose, Predict]

Supporting the Search Process
[Diagram of the search process with the system side added. Labels: Source Selection, Query Formulation, Query, Search, Ranked List, Selection, Document, Examination, Document, Delivery, IR System, Indexing, Index, Acquisition, Collection]
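The system-side labels in this diagram (Collection, Indexing, Index, Search, Ranked List) can be illustrated with a small sketch. This is not the course's implementation: the function names `build_index` and `search`, the whitespace tokenizer, and the raw term-count scoring are all assumptions chosen to keep the example short.

```python
from collections import Counter, defaultdict


def build_index(collection):
    """Indexing: map each term to a {doc_id: term_count} posting table.

    `collection` is a hypothetical dict of doc_id -> text.
    """
    index = defaultdict(dict)
    for doc_id, text in collection.items():
        # Naive tokenization (lowercase, split on whitespace) is an assumption.
        for term, count in Counter(text.lower().split()).items():
            index[term][doc_id] = count
    return index


def search(index, query):
    """Search: return doc ids ranked by the summed counts of matching query terms."""
    scores = Counter()
    for term in query.lower().split():
        # Only postings for the query terms are touched, not every document.
        for doc_id, count in index.get(term, {}).items():
            scores[doc_id] += count
    # Highest score first; ties broken by doc id so the order is deterministic.
    return [doc_id for doc_id, _ in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))]


collection = {
    "d1": "oral history interviews",
    "d2": "music recordings online music store",
    "d3": "history of music",
}
index = build_index(collection)
ranked_list = search(index, "music history")
```

The ranked list is what the user then inspects in the Selection and Examination stages; a real system would replace the raw counts with a better-motivated weighting.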

Design Strategies
Foster human-machine synergy
  –Exploit complementary strengths
  –Accommodate shared weaknesses
Divide-and-conquer
  –Divide task into stages with well-defined interfaces
  –Continue dividing until problems are easily solved
Co-design related components
  –Iterative process of joint optimization

Human-Machine Synergy
Machines are good at:
  –Doing simple things accurately and quickly
  –Scaling to larger collections in sublinear time
People are better at:
  –Accurately recognizing what they are looking for
  –Evaluating intangibles such as “quality”
Both are pretty bad at:
  –Mapping consistently between words and concepts

Divide and Conquer
Strategy: use encapsulation to limit complexity
Approach:
  –Define interfaces (input and output) for each component
      Query interface: input terms, output representation
  –Define the functions performed by each component
      Remove common words, weight rare terms higher, …
  –Repeat the process within components as needed
Result: a hierarchical decomposition
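The query-interface example on this slide (input terms, output representation; remove common words, weight rare terms higher) might look like the following sketch. The stopword list, the function names, and the smoothed inverse-document-frequency formula are illustrative assumptions, not something the slides specify.

```python
import math
from collections import Counter

# Hypothetical stopword list, for illustration only.
STOPWORDS = {"a", "an", "and", "in", "is", "of", "or", "the", "to"}


def tokenize(text):
    """Naive tokenizer (an assumption): lowercase and split on whitespace."""
    return text.lower().split()


def document_frequencies(docs):
    """Count how many documents contain each term."""
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return df


def query_representation(query, df, num_docs):
    """Query interface: input is raw query text, output is a {term: weight} dict.

    Common words are removed; the remaining terms get a smoothed IDF weight,
    so terms that occur in fewer documents receive higher weights.
    """
    weights = {}
    for term in tokenize(query):
        if term in STOPWORDS:
            continue
        weights[term] = math.log((num_docs + 1) / (df.get(term, 0) + 1))
    return weights


docs = [
    "the capital of kentucky",
    "the history of kentucky",
    "the class home page",
]
df = document_frequencies(docs)
rep = query_representation("the capital of kentucky", df, len(docs))
```

Because callers see only the input (query text) and the output (weighted terms), the stopword list or the weighting formula can be swapped without touching other components, which is the encapsulation the slide describes.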

Co-design
Design of one component may affect another
  –Effect may be direct or indirect
Approach:
  –Develop alternatives for each interacting component
  –Assess the effects of each practical combination on efficiency, effectiveness, and usability
  –Repeat the process until a suitable combination is found

Some Examples of Co-design in IR
[Diagram of the search process. Labels: Source Selection, Query Formulation, Query, Search, Ranked List, Selection, Document, Examination, Document, Delivery, IR System, Indexing, Index, Acquisition, Collection]

Course Goals
Appreciate IR system capabilities and limitations
Understand IR system design & implementation
  –For a broad range of applications and media
Evaluate IR system performance
Identify current IR research problems

Course Design
Text/readings provide background and detail
  –At least one recommended reading is required
Class provides organization and direction
  –We will not cover every important detail
Assignments and project provide experience
  –The TA can help CLIS students with the project
Final exam helps focus your effort

Grading
Assignments (30% total)
  –Mastery of concepts and experience using tools
  –708A: “homework,” 838L: “programming”
Term project (30%)
  –3 options, described on course Web page
Final exam (40%)
  –Two different in-class exams

Handy Things to Know
Classes will be videotaped
  –Available in the CLIS library if you miss class
Office hours are by appointment
  –Send an email or ask after class
Everything is on the web
  –At [URL not preserved in the transcript]
We are most easily reached by email

Do This Week
Do the reading before class
  –Don’t fall behind!
Start on assignment 1
  –Due in 2 weeks!
Explore the Web site
  –Start thinking about the term project