10/4/01 IS202: Information Organization & Retrieval Interfaces for Information Retrieval Ray Larson & Warren Sack IS202: Information Organization and Retrieval.

Presentation transcript:

10/4/01 IS202: Information Organization & Retrieval Interfaces for Information Retrieval Ray Larson & Warren Sack IS202: Information Organization and Retrieval Fall 2001 UC Berkeley, SIMS lecture authors: Marti Hearst, Ray Larson, Warren Sack

10/4/01 IS202: Information Organization & Retrieval Today What is HCI? Interfaces for IR using the standard model of IR Interfaces for IR using new models of IR and/or different models of interaction

IS202: Information Organization & Retrieval What is HCI? Human-Computer Interaction (HCI) Human –the end-user of a program Computer –the machine the program runs on Interaction –the user tells the computer what they want –the computer communicates results (slide adapted from James Landay)

IS202: Information Organization & Retrieval What is HCI? Human, Technology, Task, Design, Organizational & Social Issues (slide by James Landay)

10/4/01 IS202: Information Organization & Retrieval Shneiderman on HCI Well-designed interactive computer systems: –Promote positive feelings of success, competence, and mastery –Allow users to concentrate on their work, rather than on the system

IS202: Information Organization & Retrieval Usability Design Goals Ease of learning –faster the second time and so on... Recall –remember how from one session to the next Productivity –perform tasks quickly and efficiently Minimal error rates –if they occur, good feedback so user can recover High user satisfaction –confident of success (slide by James Landay)

IS202: Information Organization & Retrieval Who builds UIs? A team of specialists –graphic designers –interaction / interface designers –technical writers –marketers –test engineers –software engineers (slide by James Landay)

IS202: Information Organization & Retrieval How to Design and Build UIs Task analysis Rapid prototyping Evaluation Implementation (Design → Prototype → Evaluate) Iterate at every stage! (slide adapted from James Landay)

IS202: Information Organization & Retrieval Task Analysis Observe existing work practices Create examples and scenarios of actual use Try out new ideas before building software

10/4/01 IS202: Information Organization & Retrieval Task = Information Access The standard interaction model for information access –(1) start with an information need –(2) select a system and collections to search on –(3) formulate a query –(4) send the query to the system –(5) receive the results –(6) scan, evaluate, and interpret the results –(7) stop, or –(8) reformulate the query and go to step 4
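The eight steps above form a loop that can be sketched in code. This is a toy illustration, not a real engine: the `rank` stub scores documents by simple query-term overlap, and the `evaluate` and `reformulate` callbacks stand in for the human judgments of steps 6–8.

```python
# Minimal sketch of the standard IR interaction loop (steps 1-8 above).
# The ranking function is a stand-in: it scores documents by query-term
# overlap, not by a real retrieval model.

def rank(query, collection):
    """Score each document by the number of query terms it contains."""
    q_terms = set(query.lower().split())
    scored = []
    for doc_id, text in collection.items():
        score = len(q_terms & set(text.lower().split()))
        if score > 0:
            scored.append((score, doc_id))
    return [doc_id for score, doc_id in sorted(scored, reverse=True)]

def search_session(initial_query, collection, evaluate, reformulate, max_rounds=5):
    """Iterate: send the query, scan results, stop or reformulate (steps 4-8)."""
    query = initial_query
    for _ in range(max_rounds):
        results = rank(query, collection)   # steps 4-5: send query, receive results
        if evaluate(results):               # steps 6-7: scan/evaluate, maybe stop
            return query, results
        query = reformulate(query, results) # step 8: reformulate and repeat
    return query, results
```

The caller supplies `evaluate` (is the need satisfied?) and `reformulate` (how to revise the query), since those are exactly the points where the human is in the loop.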

10/4/01 IS202: Information Organization & Retrieval HCI Interface questions using the standard model of IR Where does a user start? Faced with a large set of collections, how can a user choose one to begin with? How will a user formulate a query? How will a user scan, evaluate, and interpret the results? How can a user reformulate a query?

10/4/01 IS202: Information Organization & Retrieval Interface design: Is it always HCI or the highway? No, there are other ways to design interfaces, including using methods from –Art –Architecture –Sociology –Anthropology –Narrative theory –Geography

10/4/01 IS202: Information Organization & Retrieval Information Access: Is the standard IR model always the model? No, other models have been proposed and explored including –Berrypicking (Bates, 1989) –Sensemaking (Russell et al., 1993) –Orienteering (O’Day and Jeffries, 1993) –Intermediaries (Maglio and Barrett, 1996) –Social Navigation (Dourish and Chalmers, 1994) –Agents (e.g., Maes, 1992) –And don’t forget experiments like (Blair and Maron, 1985)

10/4/01 IS202: Information Organization & Retrieval IR+HCI Question 1: Where does the user start?

10/4/01 IS202: Information Organization & Retrieval Dialog box for choosing sources in the old Lexis-Nexis interface

10/4/01 IS202: Information Organization & Retrieval Where does a user start? Supervised (Manual) Category Overviews –Yahoo! –HiBrowse –MeSHBrowse Unsupervised (Automated) Groupings –Clustering –Kohonen Feature Maps

10/4/01 IS202: Information Organization & Retrieval Incorporating Categories into the Interface Yahoo! is the standard method Problems: –Hard to search; meant to be navigated –Only one category per document (usually)

10/4/01 IS202: Information Organization & Retrieval More Complex Example: MeSH and MedLine MeSH Category Hierarchy –Medical Subject Headings –~18,000 labels –manually assigned –~8 labels/article on average –avg depth: 4.5, max depth 9 Top Level Categories: anatomy, diagnosis, related disciplines, animals, psych, technology, disease, biology, humanities, drugs, physics

10/4/01 IS202: Information Organization & Retrieval MeSHBrowse (Korn & Shneiderman 95) Only the relevant subset of the hierarchy is shown at one time.

10/4/01 IS202: Information Organization & Retrieval HiBrowse (Pollitt 97) Browsing several different subsets of category metadata simultaneously.

10/4/01 IS202: Information Organization & Retrieval Large Category Sets: Problems for User Interfaces Too many categories to browse Too many docs per category Docs belong to multiple categories Need to integrate search Need to show the documents

10/4/01 IS202: Information Organization & Retrieval Text Clustering Finds overall similarities among groups of documents Finds overall similarities among groups of tokens Picks out some themes, ignores others
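The similarity computation behind text clustering can be sketched in a few lines: documents become bag-of-words term-frequency vectors, similarity is the cosine of the angle between vectors, and a simple single-pass threshold scheme stands in for the k-means or agglomerative methods real systems use.

```python
# A minimal sketch of similarity-based document grouping in pure Python.
# Documents are bag-of-words term-frequency vectors; similarity is cosine;
# the single-pass threshold clustering is a stand-in for real algorithms
# (k-means, agglomerative clustering) over weighted vectors.

import math
from collections import Counter

def term_vector(text):
    """Bag-of-words term-frequency vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def single_pass_cluster(docs, threshold=0.3):
    """Assign each doc to the first cluster whose leader it is similar enough to."""
    clusters = []  # each cluster is a list of (doc_id, vector) pairs
    for doc_id, text in docs:
        vec = term_vector(text)
        for cluster in clusters:
            leader_vec = cluster[0][1]
            if cosine(vec, leader_vec) >= threshold:
                cluster.append((doc_id, vec))
                break
        else:
            clusters.append([(doc_id, vec)])
    return [[doc_id for doc_id, _ in cluster] for cluster in clusters]
```

The threshold and the choice of leader are arbitrary here; the point is only that "overall similarities among groups of documents" reduces to pairwise vector comparisons.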

10/4/01 IS202: Information Organization & Retrieval Scatter/Gather Cutting, Pedersen, Tukey & Karger 92, 93, Hearst & Pedersen 95 How it works –Cluster sets of documents into general “themes”, like a table of contents –Display the contents of the clusters by showing topical terms and typical titles –User chooses subsets of the clusters and re-clusters the documents within –Resulting new groups have different “themes” Originally used to give collection overview Evidence suggests more appropriate for displaying retrieval results in context

10/4/01 IS202: Information Organization & Retrieval Another use of clustering Use clustering to map the entire huge multidimensional document space into a huge number of small clusters. “Project” these onto a 2D graphical representation –Group by doc: SPIRE/Kohonen maps –Group by words: Galaxy of News/HotSauce/Semio

10/4/01 IS202: Information Organization & Retrieval Clustering Multi-Dimensional Document Space (image from Wise et al 95)

10/4/01 IS202: Information Organization & Retrieval Kohonen Feature Maps on Text (from Chen et al., JASIS 49(7))

10/4/01 IS202: Information Organization & Retrieval Summary: Clustering Advantages: –Get an overview of main themes –Domain independent Disadvantages: –Many of the ways documents could group together are not shown –Not always easy to understand what they mean –Different levels of granularity

10/4/01 IS202: Information Organization & Retrieval IR+HCI Question 2: How will a user formulate a query?

10/4/01 IS202: Information Organization & Retrieval Query Specification Interaction Styles (Shneiderman 97) –Command Language –Form Fill –Menu Selection –Direct Manipulation –Natural Language What about gesture, eye-tracking, or implicit inputs like reading habits?

10/4/01 IS202: Information Organization & Retrieval Command-Based Query Specification command attribute value connector … –find pa shneiderman and tw user# What are the attribute names? What are the command names? What are allowable values?
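A query like "find pa shneiderman and tw user#" can be parsed mechanically. The grammar below is hypothetical, inferred from the "command attribute value connector" pattern on the slide; the attribute codes and the treatment of a trailing "#" as a truncation wildcard are assumptions, not documented syntax of any particular system.

```python
# Toy parser for the command-language style above
# ("find pa shneiderman and tw user#"). The grammar is hypothetical:
# a command word, then attribute/value pairs joined by connectors.

KNOWN_ATTRS = {"pa": "personal author", "tw": "title word"}  # assumed codes
CONNECTORS = {"and", "or", "not"}

def parse_command(query):
    """Return (command, [(attr, value, truncated)], [connectors])."""
    tokens = query.split()
    command, rest = tokens[0], tokens[1:]
    clauses, connectors = [], []
    i = 0
    while i < len(rest):
        if rest[i] in CONNECTORS:
            connectors.append(rest[i])
            i += 1
        attr, value = rest[i], rest[i + 1]
        # treat a trailing '#' as a truncation wildcard
        truncated = value.endswith("#")
        clauses.append((attr, value.rstrip("#"), truncated))
        i += 2
    return command, clauses, connectors
```

The parser makes the slide's three questions concrete: the user must already know the command names, the attribute codes, and the allowable values for any of this to work.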

10/4/01 IS202: Information Organization & Retrieval Form-Based Query Specification (Altavista)

10/4/01 IS202: Information Organization & Retrieval Form-Based Query Specification (Melvyl)

10/4/01 IS202: Information Organization & Retrieval Form-based Query Specification (Infoseek)

10/4/01 IS202: Information Organization & Retrieval Direct Manipulation Spec. VQUERY (Jones 98)

10/4/01 IS202: Information Organization & Retrieval Menu-based Query Specification (Young & Shneiderman 93)

10/4/01 IS202: Information Organization & Retrieval IR+HCI Question 3: How will a user scan, evaluate, and interpret the results?

10/4/01 IS202: Information Organization & Retrieval Display of Retrieval Results Goal: minimize time/effort for deciding which documents to examine in detail Idea: show the roles of the query terms in the retrieved documents, making use of document structure

10/4/01 IS202: Information Organization & Retrieval Putting Results in Context Interfaces should –give hints about the roles terms play in the collection –give hints about what will happen if various terms are combined –show explicitly why documents are retrieved in response to the query –summarize compactly the subset of interest

10/4/01 IS202: Information Organization & Retrieval Putting Results in Context Visualizations of Query Term Distribution –KWIC, TileBars, SeeSoft Visualizing Shared Subsets of Query Terms –InfoCrystal, VIBE, Lattice Views Table of Contents as Context –Superbook, Cha-Cha, DynaCat Organizing Results with Tables –Envision, SenseMaker Using Hyperlinks –WebCutter

10/4/01 IS202: Information Organization & Retrieval KWIC (Keyword in Context) An old standard, ignored by internet search engines –used in some intranet engines, e.g., Cha-Cha
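KWIC output is straightforward to produce: find each occurrence of the keyword and show it centered in a window of surrounding words, one line per hit. A minimal sketch:

```python
# Keyword-in-context (KWIC): show each occurrence of a query term
# centered in a window of surrounding words, one line per hit.

def kwic(text, keyword, window=3):
    words = text.split()
    lines = []
    for i, w in enumerate(words):
        if w.lower() == keyword.lower():
            left = " ".join(words[max(0, i - window):i])
            right = " ".join(words[i + 1:i + 1 + window])
            lines.append(f"{left} [{w}] {right}".strip())
    return lines
```

For example, `kwic("the quick brown fox jumps over the lazy dog", "fox", window=2)` yields one line with "fox" bracketed between its two neighbors on each side.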

10/4/01 IS202: Information Organization & Retrieval TileBars Graphical representation of term distribution and overlap Simultaneously indicates: –relative document length –query term frequencies –query term distributions –query term overlap

10/4/01 IS202: Information Organization & Retrieval TileBars Example Query terms: what roles do they play in retrieved documents? DBMS (Database Systems), Reliability –Mainly about both DBMS & reliability –Mainly about DBMS, discusses reliability –Mainly about, say, banking, with a subtopic discussion on DBMS/Reliability –Mainly about high-tech layoffs
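The data behind a TileBars display can be approximated by counting each query term's hits per text segment. In this sketch, fixed-size word windows stand in for the real topic segmentation (TextTiling) that TileBars uses.

```python
# Minimal sketch of the data behind a TileBars display: for each document,
# split the text into segments and count each query term's hits per segment.
# Fixed-size word windows stand in for real topic segmentation (TextTiling).

def segments(text, size=4):
    words = text.lower().split()
    return [words[i:i + size] for i in range(0, len(words), size)]

def tilebar(text, query_terms, size=4):
    """One row per query term; each cell is that term's count in a segment."""
    segs = segments(text, size)
    return {term: [seg.count(term.lower()) for seg in segs] for term in query_terms}
```

Rendering each count as a shaded tile gives the bar: row = query term, column = document segment, darkness = frequency, and the bar's length shows relative document length.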

10/4/01 IS202: Information Organization & Retrieval SeeSoft: Showing Text Content using a linear representation and brushing and linking (Eick & Wills 95)

10/4/01 IS202: Information Organization & Retrieval David Small: Virtual Shakespeare

10/4/01 IS202: Information Organization & Retrieval Other Approaches Show how often each query term occurs in retrieved documents –VIBE (Korfhage ‘91) –InfoCrystal (Spoerri ‘94)

10/4/01 IS202: Information Organization & Retrieval VIBE (Olson et al. 93, Korfhage 93)

10/4/01 IS202: Information Organization & Retrieval InfoCrystal (Spoerri 94)

10/4/01 IS202: Information Organization & Retrieval Problems with InfoCrystal –can’t see overlap of terms within docs –quantities not represented graphically –more than 4 terms hard to handle –no help in selecting terms to begin with

10/4/01 IS202: Information Organization & Retrieval Cha-Cha (Chen & Hearst 98) Shows “table-of-contents”-like view, like Superbook Takes advantage of human-created structure within hyperlinks to create the TOC

10/4/01 IS202: Information Organization & Retrieval IR+HCI Question 4: How can a user reformulate a query?

[Diagram: the standard IR pipeline – information need → query text input → parse → query; collections → pre-process → index; query + index → rank; results feed back through query modification]

10/4/01 IS202: Information Organization & Retrieval Query Modification Problem: how to reformulate the query? –Thesaurus expansion: Suggest terms similar to query terms –Relevance feedback: Suggest terms (and documents) similar to retrieved documents that have been judged to be relevant

10/4/01 IS202: Information Organization & Retrieval Using Relevance Feedback Known to improve results –in TREC-like conditions (no user involved) What about with a user in the loop?

10/4/01 IS202: Information Organization & Retrieval Terms available for relevance feedback made visible (from Koenemann & Belkin, 1996)

10/4/01 IS202: Information Organization & Retrieval How much of the guts should the user see? Opaque (black box) –(like web search engines) Transparent –(see available terms after the r.f.) Penetrable –(see suggested terms before the r.f.) Which do you think worked best?

10/4/01 IS202: Information Organization & Retrieval Effectiveness Results Subjects with R.F. performed 17-34% better than those without R.F. Subjects in the penetrable case did 15% better as a group than those in the opaque and transparent cases.

10/4/01 IS202: Information Organization & Retrieval Summary: HCI Interface questions using the standard model of IR Where does a user start? Faced with a large set of collections, how can a user choose one to begin with? How will a user formulate a query? How will a user scan, evaluate, and interpret the results? How can a user reformulate a query?

10/4/01 IS202: Information Organization & Retrieval Standard Model Assumptions: –Maximizing precision and recall simultaneously –The information need remains static –The value is in the resulting document set
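The precision/recall tradeoff the standard model tries to optimize is computed directly from the retrieved set and the relevant set:

```python
# Precision: fraction of retrieved documents that are relevant.
# Recall: fraction of relevant documents that were retrieved.
# Maximizing both simultaneously is the standard model's (often
# unattainable) goal, since growing the retrieved set tends to raise
# recall while lowering precision.

def precision_recall(retrieved, relevant):
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

For example, retrieving four documents of which two are among three relevant ones gives precision 0.5 and recall 2/3.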

10/4/01 IS202: Information Organization & Retrieval Problem with Standard Model: Users learn during the search process: –Scanning titles of retrieved documents –Reading retrieved documents –Viewing lists of related topics/thesaurus terms –Navigating hyperlinks Some users don’t like long disorganized lists of documents

10/4/01 IS202: Information Organization & Retrieval “Berrypicking” as an Information Seeking Strategy (Bates 89) Standard IR model –assumes the information need remains the same throughout the search process Berrypicking model –interesting information is scattered like berries among bushes –the query is continually shifting –people are learning as they go

10/4/01 IS202: Information Organization & Retrieval A sketch of a searcher… “moving through many actions towards a general goal of satisfactory completion of research related to an information need.” (after Bates 89) Q0 → Q1 → Q2 → Q3 → Q4 → Q5

10/4/01 IS202: Information Organization & Retrieval Implications Interfaces should make it easy to store intermediate results Interfaces should make it easy to follow trails with unanticipated results

10/4/01 IS202: Information Organization & Retrieval Information Access: Is the standard IR model always the model? No, other models have been proposed and explored including –Berrypicking (Bates, 1989) –Sensemaking (Russell et al., 1993) –Orienteering (O’Day and Jeffries, 1993) –Intermediaries (Maglio and Barrett, 1996) –Social Navigation (Dourish and Chalmers, 1994) –Agents (e.g., Maes, 1992) –And don’t forget experiments like (Blair and Maron, 1985)

10/4/01 IS202: Information Organization & Retrieval Next Time Abbe Don, Guest speaker –Information architecture and novel interfaces for information access. –See Apple Guides paper listed on IS202 assignments page, along with other readings –Also, here is a request from Abbe: look at the following websites – – – go at least "3 levels" deep to get a sense of how the sites are organized.