SIMS 202 Information Organization and Retrieval Prof. Marti Hearst and Prof. Ray Larson UC Berkeley SIMS Tues/Thurs 9:30-11:00am Fall 2000.

Presentation transcript:

SIMS 202 Information Organization and Retrieval Prof. Marti Hearst and Prof. Ray Larson UC Berkeley SIMS Tues/Thurs 9:30-11:00am Fall 2000

Today l Review Basic Human-Computer Interaction Principles l Starting Points for Search

UI and Viz in IA: Chapter Contents

Slide by James Landay Human-Computer Interaction (HCI) l Human –the end-user of a program –the others in the organization l Computer –the machine the program runs on l Interaction –the user tells the computer what they want –the computer communicates results

Slide by James Landay What is HCI? [Diagram labels: Humans, Technology, Task, Design, Organizational & Social Issues]

Shneiderman on HCI l Well-designed interactive computer systems: –Promote positive feelings of success, competence, and mastery. –Allow users to concentrate on their work, rather than on the system.

Slide by James Landay Usability Design Goals l Ease of learning –faster the second time and so on... l Recall –remember how from one session to the next l Productivity –perform tasks quickly and efficiently l Minimal error rates –if they occur, good feedback so user can recover l High user satisfaction –confident of success

Adapted from slide by James Landay Usability Slogans (from Nielsen’s Usability Engineering) l Your best guess is not good enough l The user is always right l The user is not always right l Users are not designers l Designers are not users l Less is more l Details matter

Adapted from slide by James Landay Design Guidelines l Set of design rules to follow l Apply at multiple levels of design l Are neither complete nor orthogonal l Have psychological underpinnings (ideally)

Slide by James Landay Who builds UIs? l A team of specialists (ideally) –graphic designers –interaction / interface designers –technical writers –marketers –test engineers –software engineers

Adapted from slide by James Landay How to Design and Build UIs l Task analysis l Rapid prototyping l Evaluation l Implementation l Iterate at every stage! [Cycle diagram: Design, Prototype, Evaluate]

Slide by James Landay Task Analysis l Observe existing work practices l Create examples and scenarios of actual use l Try out new ideas before building software

Slide by James Landay Rapid Prototyping l Build a mock-up of design l Low fidelity techniques –paper sketches –cut, copy, paste –video segments l Interactive prototyping tools –Visual Basic, HyperCard, Director, etc. l UI builders –NeXT, etc.

Slide by James Landay Evaluation l Test with real users (participants) l Build models l Low-cost techniques –expert evaluation –walkthroughs

Information Seeking Behavior l Two parts of a process: »search and retrieval »analysis and synthesis of search results l This is a fuzzy area; we will look at several different working theories.

Standard Model l Assumptions: –Maximizing precision and recall simultaneously –The information need remains static –The value is in the resulting document set
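The two measures named in the first assumption can be made concrete with a small sketch. Precision is the fraction of retrieved documents that are relevant; recall is the fraction of relevant documents that were retrieved. The function name and the document ids below are hypothetical, chosen only for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query.

    retrieved: ids the system returned
    relevant:  ids a judge marked as relevant (assumed known in advance)
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Toy data (hypothetical): 4 documents retrieved, 3 relevant overall, 2 in common.
retrieved_docs = ["d1", "d2", "d3", "d4"]
relevant_docs = ["d2", "d4", "d7"]
precision, recall = precision_recall(retrieved_docs, relevant_docs)
print(precision, recall)
```

Note that the sketch treats the relevant set as fixed for the whole query, which is exactly the "information need remains static" assumption listed on the slide.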

Problem with Standard Model: l Users learn during the search process: –Scanning titles of retrieved documents –Reading retrieved documents –Viewing lists of related topics/thesaurus terms –Navigating hyperlinks l Some users don’t like long disorganized lists of documents

“Berry-Picking” as an Information Seeking Strategy (Bates 90) l Standard IR model –assumes the information need remains the same throughout the search process l Berry-picking model –interesting information is scattered like berries among bushes –the query is continually shifting

A sketch of a searcher… “moving through many actions towards a general goal of satisfactory completion of research related to an information need.” (after Bates 89) [Diagram: a path through successive queries Q0, Q1, Q2, Q3, Q4, Q5]

Implications l Interfaces should make it easy to store intermediate results l Interfaces should make it easy to follow trails with unanticipated results l Makes evaluation more difficult.

Search Tactics and Strategies l Search Tactics –Bates 79 l Search Strategies –Bates 89 –O’Day and Jeffries 93

Tactics vs. Strategies l Tactic: short term goals and maneuvers –operators, actions l Strategy: overall planning –link a sequence of operators together to achieve some end

Information Search Tactics (after Bates 79) l Source-level tactics –navigate to and within sources l Term and Search Formulation tactics –designing search formulation –selection and revision of specific terms within search formulation l Monitoring tactics –keep search on track –(should really be called a strategy)

Term Tactics l Move around a thesaurus –(more on this in 2nd half of class)

Source-level Tactics l “Bibble”: – look for a pre-defined result set – e.g., a good link page on web l Survey: –look ahead, review available options –e.g., don’t simply use the first term or first source that comes to mind l Cut: –eliminate large proportion of search domain –e.g., search on rarest term first

Source-level Tactics (cont.) l Stretch –use source in unintended way –e.g., use patents to find addresses l Scaffold –take an indirect route to goal –e.g., when looking for references to obscure poet, look up contemporaries
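The "Cut" tactic from the source-level tactics (search on the rarest term first) has a direct analogue in how a conjunctive query can be evaluated. The following is a minimal sketch, assuming a toy inverted index that maps each term to the set of document ids containing it; the index contents and the function name are hypothetical.

```python
def search_rarest_first(query_terms, index):
    """Evaluate an AND query, starting with the rarest term.

    index: dict mapping term -> set of document ids (a toy inverted index).
    Starting with the rarest term cuts the candidate set down as early
    as possible, so later intersections touch fewer documents.
    """
    # Order terms by how many documents contain them (rarest first).
    terms = sorted(query_terms, key=lambda t: len(index.get(t, set())))
    candidates = None
    for term in terms:
        postings = index.get(term, set())
        candidates = set(postings) if candidates is None else candidates & postings
        if not candidates:
            break  # nothing can match any more, stop early
    return candidates or set()


# Hypothetical toy index.
index = {
    "poet": {1, 2, 3, 4, 5, 6},
    "victorian": {2, 3, 5},
    "sonnet": {3, 5, 9},
    "obscure": {5},
}
result = search_rarest_first(["poet", "victorian", "obscure"], index)
print(result)
```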

Monitoring Tactics (strategy-level) l Check –compare original goal with current state l Weigh –make a cost/benefit analysis of current or anticipated actions l Pattern –recognize common strategies l Correct Errors l Record –keep track of (incomplete) paths

Additional Considerations (Bates 79) l Need a Sort tactic l More detail is needed about short-term cost/benefit decision rule strategies l When to stop? –How to judge when enough information has been gathered? –How to decide when to give up an unsuccessful search? –When to stop searching in one source and move to another?

After the Search l How to synthesize information is part of the information use process l One “theory” is called sensemaking –Russell et al. paper –Dan Russell is speaking today at 4pm! Room 110. Different topic.

Post-Search Analysis Types (O’Day & Jeffries 93) l Trends l Comparisons l Aggregation and Scaling l Identifying a Critical Subset l Assessing l Interpreting l The rest: »cross-reference »summarize »find evocative visualizations »miscellaneous

SenseMaking (Russell et al. 93) l The process of encoding retrieved information to answer task-specific questions l Combine –internal cognitive resources –external retrieved resources l Create a good representation –an iterative process –contend with a cost/benefit tradeoff

The SenseMaking Loop (from Russell et al. 93)

Observed Activities of Business Analysts Working (from Russell et al. 93)

The SenseMaking Process (from Russell et al., InterCHI 93)

Sensemaking (Russell et al. 93) l An anytime activity –At any point a workable solution is available –Usually more time -> better solution –Usually more properties -> better solution

Sensemaking (Russell et al. 93) l A good strategy –Maximizes long term rate of gain –Example: »new technology brings more info faster »this causes a uniform increase in useful and useless information »best strategy: throw out bad stuff faster
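The example on this slide can be checked with a little arithmetic. The sketch below is an illustrative model, not the formulation in Russell et al.: it assumes the rate of gain is the number of useful items divided by total time spent, where each useful item takes `t_use` to process and each useless item takes `t_discard` to throw away. All names and numbers are hypothetical.

```python
def rate_of_gain(useful, useless, t_use, t_discard):
    """Useful items obtained per unit of time (illustrative model only)."""
    total_time = useful * t_use + useless * t_discard
    return useful / total_time


# Baseline: 10 useful and 90 useless items.
base = rate_of_gain(useful=10, useless=90, t_use=5.0, t_discard=1.0)

# New technology doubles both useful and useless information uniformly.
more_info = rate_of_gain(useful=20, useless=180, t_use=5.0, t_discard=1.0)

# Same doubled volume, but useless items are discarded twice as fast.
faster_discard = rate_of_gain(useful=20, useless=180, t_use=5.0, t_discard=0.5)

print(base, more_info, faster_discard)
```

Under these assumptions a uniform increase in volume leaves the rate unchanged, because numerator and denominator scale together; only lowering the cost of discarding useless items raises it.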

Sensemaking (Russell et al. 93) l Most of the effort is in the synthesis of a good representation –covers the data –increases usability –decreases cost-of-use

UI and Viz in IA: Chapter Contents

Starting Points for Search l Types: –Lists –Overviews »Categories »Clusters »Links/Hyperlinks –Examples, Wizards, Guided Tours

Starting Points for Search l Faced with a prompt or an empty entry form … how to start? –Lists of sources –Overviews »Clusters »Category Hierarchies/Subject Codes »Co-citation links –Examples, Wizards, and Guided Tours –Automatic source selection

List of Sources l Have to guess based on the name l Requires prior exposure/experience

Dialog box for choosing sources in old Lexis-Nexis interface

Overviews in the User Interface l Supervised (Manual) Category Overviews –Yahoo! –HiBrowse –MeSHBrowse l Unsupervised (Automated) Groupings –Clustering –Kohonen Feature Maps

Incorporating Categories into the Interface l Yahoo is the standard method l Problems: –Hard to search, meant to be navigated. –Only one category per document (usually)

Evidence l Web search engines are heavily using –Link analysis –Page popularity –Interwoven categories l These all find dominant home pages

More Complex Example: MeSH and MedLine l MeSH Category Hierarchy –Medical Subject Headings –~18,000 labels –manually assigned –~8 labels/article on average –avg depth: 4.5, max depth 9 l Top Level Categories: –anatomy –diagnosis –related disc –animals –psych –technology –disease –biology –humanities –drugs –physics

Category Labels l Advantages: –Interpretable –Capture summary information –Describe multiple facets of content –Domain dependent, and so descriptive l Disadvantages –Do not scale well (for organizing documents) –Domain dependent, so costly to acquire –May mis-match users’ interests

MeSHBrowse (Korn & Shneiderman 95) Grow the category structure gradually and in response to semantic similarity

HiBrowse (Pollitt 97) Show combinations of categories given that some categories already seen

Large Category Sets l Problems for User Interfaces » Too many categories to browse » Too many docs per category » Docs belong to multiple categories » Need to integrate search » Need to show the documents

Text Clustering l Finds overall similarities among groups of documents l Finds overall similarities among groups of tokens l Picks out some themes, ignores others

Scatter/Gather (Cutting, Pedersen, Tukey & Karger 92, 93; Hearst & Pedersen 95) l How it works –Cluster sets of documents into general “themes”, like a table of contents –Display the contents of the clusters by showing topical terms and typical titles –User chooses subsets of the clusters and re-clusters the documents within –Resulting new groups have different “themes” l Originally used to give collection overview l Evidence suggests more appropriate for displaying retrieval results in context
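The scatter, display, gather, re-scatter loop described above can be sketched in a few lines. This is a simplified illustration, not the algorithm from the cited papers: it swaps in a plain k-means-style clustering over raw term counts with cosine similarity and deterministic seeds. The function names and the toy documents are hypothetical.

```python
from collections import Counter
import math


def vectorize(text):
    """Bag-of-words term counts for one document."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(w * b.get(t, 0) for t, w in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(
        sum(w * w for w in b.values())
    )
    return dot / norm if norm else 0.0


def centroid(vectors):
    """Summed term counts (cosine ignores scale, so no need to average)."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return total


def scatter(docs, k, iterations=10):
    """Partition docs (id -> text) into at most k clusters of ids."""
    ids = sorted(docs)
    if not ids:
        return []
    k = max(1, min(k, len(ids)))
    vecs = {i: vectorize(docs[i]) for i in ids}
    # Deterministic seeds: evenly spaced documents (an assumption of this sketch).
    centroids = [vecs[ids[j * len(ids) // k]] for j in range(k)]
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for i in ids:
            best = max(range(k), key=lambda c: cosine(vecs[i], centroids[c]))
            clusters[best].append(i)
        centroids = [
            centroid([vecs[i] for i in members]) if members else centroids[c]
            for c, members in enumerate(clusters)
        ]
    return [members for members in clusters if members]


def topical_terms(docs, members, n=3):
    """Most frequent terms in a cluster, used as its display label."""
    total = centroid([vectorize(docs[i]) for i in members])
    return [term for term, _ in total.most_common(n)]


def gather(docs, clusters, chosen):
    """Merge the user-chosen clusters into a smaller document set."""
    return {i: docs[i] for c in chosen for i in clusters[c]}


# Hypothetical toy collection for a query on "star".
docs = {
    1: "star galaxy astronomy telescope",
    2: "film star actor movie",
    3: "galaxy star constellation sky",
    4: "movie actor film award",
    5: "star athlete sports team",
    6: "sports team league athlete",
}
clusters = scatter(docs, k=3)            # scatter: cluster the collection
for members in clusters:                 # display: topical terms per cluster
    print(members, topical_terms(docs, members))
subset = gather(docs, clusters, chosen=[0])  # gather: user picks a cluster
subclusters = scatter(subset, k=2)           # re-scatter the chosen subset
```

A real system would also remove stop words and weight terms before clustering; both are left out here to keep the loop itself visible.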

S/G Example: query on “star” (Encyclopedia text) l Clusters (size and theme): –14 sports –8 symbols –47 film, tv –68 film, tv (p) –7 music –97 astrophysics –67 astronomy (p) –12 stellar phenomena –10 flora/fauna –49 galaxies, stars –29 constellations –7 miscellaneous l Clustering and re-clustering is entirely automated

Using Clustering in Document Ranking l Cluster entire collection l Find cluster centroid that best matches the query l This has been explored extensively –it is expensive –it doesn’t work well
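The matching step named on this slide (find the cluster centroid that best matches the query) can be sketched as follows. This only illustrates the idea; as the slide notes, the approach is expensive and has not worked well in practice. The clusters here are assigned by hand rather than computed, and the names and texts are hypothetical.

```python
from collections import Counter
import math


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(w * b.get(t, 0) for t, w in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(
        sum(w * w for w in b.values())
    )
    return dot / norm if norm else 0.0


def best_cluster(query, clusters):
    """Return (name, scores) for the cluster whose centroid best matches query.

    clusters: dict mapping cluster name -> list of document texts.
    The centroid is the summed term counts of the cluster's documents.
    """
    q = Counter(query.lower().split())
    scores = {}
    for name, texts in clusters.items():
        cluster_centroid = Counter()
        for text in texts:
            cluster_centroid.update(text.lower().split())
        scores[name] = cosine(q, cluster_centroid)
    return max(scores, key=scores.get), scores


# Hypothetical, hand-assigned clusters.
clusters = {
    "electric": ["battery electric car technology", "california electric auto battery"],
    "safety": ["air bag safety study", "accident death car safety"],
}
name, scores = best_cluster("car safety", clusters)
print(name, scores)
```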

Two Queries: Two Clusterings l AUTO, CAR, ELECTRIC –8 control drive accident … –25 battery california technology … –48 import j. rate honda toyota … –16 export international unit japan –3 service employee automatic … l AUTO, CAR, SAFETY –6 control inventory integrate … –10 investigation washington … –12 study fuel death bag air … –61 sale domestic truck import … –11 japan export defect unite … l The main differences are the clusters that are central to the query

Another use of clustering l Use clustering to map the entire huge multidimensional document space into a huge number of small clusters. l “Project” these onto a 2D graphical representation –Group by doc: SPIRE/Kohonen maps –Group by words: Galaxy of News/HotSauce/Semio

Clustering Multi-Dimensional Document Space (image from Wise et al 95)

Kohonen Feature Maps on Text (from Chen et al., JASIS 49(7))

UWMS Data Mining Workshop Study of Kohonen Feature Maps l H. Chen, A. Houston, R. Sewell, and B. Schatz, JASIS 49(7) l Comparison: Kohonen Map and Yahoo l Task: –“Window shop” for interesting home page –Repeat with other interface l Results: –Starting with map could repeat in Yahoo (8/11) –Starting with Yahoo unable to repeat in map (2/14)

Study (cont.) l Participants liked: –Correspondence of region size to # documents –Overview (but also wanted zoom) –Ease of jumping from one topic to another –Multiple routes to topics –Use of category and subcategory labels

Study (cont.) l Participants wanted: –hierarchical organization –other ordering of concepts (alphabetical) –integration of browsing and search –correspondence of color to meaning –more meaningful labels –labels at same level of abstraction –fit more labels in the given space –combined keyword and category search –multiple category assignment (sports+entertain)

Visualization of Clusters –Huge 2D maps may be an inappropriate focus for information retrieval »Can’t see what documents are about »Documents forced into one position in semantic space »Space is difficult to use for IR purposes »Hard to view titles –Perhaps more suited for pattern discovery »problem: often only one view on the space

Summary: Clustering l Advantages: –Get an overview of main themes –Domain independent l Disadvantages: –Many of the ways documents could group together are not shown –Not always easy to understand what they mean –Different levels of granularity

Next Time Interfaces for Query Specification