Content Metadata and Search Remarks to the Dublin Core Workshop Marti Hearst SIMS, UC Berkeley September 28, 2003.

M. Hearst, Faceted Metadata in Search

Resource Finding and the Web
Web search vs. collection search
  – When a single page is all that's needed, web search is fine
      – Although validity is an issue
  – Unsolved problem: How to make source-focused search more intuitive on the web?
      – One idea (untested): task-based search

What about Content?
Dublin Core takes stances on the content-neutral aspects of metadata
Q: What about content?
  – The Metadata Marsh
      – Getting agreement on metadata terms is difficult
      – Even worse when talking about content!
A: Domain-specific solutions
  – Don't worry about cross-domain consistency (a necessary drawback)
  – Success: b-to-b protocols

Hypothesis (as yet untested):
Assuming we've focused on a domain, agreement on category assignment can converge much more quickly by:
1. Focusing on the applications that will use the category system.
2. Designing metadata to be used in interfaces that show items represented by many different categories in a highly flexible, but intuitive, manner.

One Example: Flamenco Project
Goal: create intuitive, inviting search interfaces that make use of hierarchical faceted metadata
Challenge: How to provide flexibility and power without overwhelming?
(Answer: careful interface design)

The Flamenco Project Team
Brycen Chun
Ame Elliott
Jennifer English
Kevin Li
Rashmi Sinha
Kirsten Swearingen
Ping Yee
Research funded by:
NSF CAREER Grant IIS
IBM Faculty Fellowship

Our Approach
Integrate the search seamlessly into the information architecture.
  – Use proper HCI methodologies.
Use faceted metadata:
  – More flexible than canned hyperlinks
  – Less complex than full search
  – Help users see where to go next and return to what happened previously
What's new?
  – Putting hierarchical facets into a usable interface.

Metadata: data about data
Facets: orthogonal categories
Time/Date, Topic, GeoRegion
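
One way to picture "orthogonal categories" is as independent fields on each item, each of which can be constrained separately. The following is a minimal sketch, not the Flamenco implementation; the item titles, facet values, and the names `Item`, `matches`, and `search` are all hypothetical. Only the three facet names come from the slide.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Item:
    title: str
    # Facet name -> set of values; each facet is independent of the others.
    facets: Dict[str, Set[str]] = field(default_factory=dict)


def matches(item, constraints):
    """True when the item carries every requested (facet, value) pair."""
    return all(value in item.facets.get(facet, set())
               for facet, value in constraints)


def search(items, constraints):
    """Return the titles of all items that satisfy every constraint."""
    return [item.title for item in items if matches(item, constraints)]


# Hypothetical sample collection (invented for illustration).
ITEMS = [
    Item("Harbor at dusk",
         {"Time/Date": {"1890s"}, "Topic": {"ships"}, "GeoRegion": {"Europe"}}),
    Item("Desert caravan",
         {"Time/Date": {"1890s"}, "Topic": {"travel"}, "GeoRegion": {"Africa"}}),
    Item("Steamship poster",
         {"Time/Date": {"1920s"}, "Topic": {"ships"}, "GeoRegion": {"Europe"}}),
]
```

Because the facets are orthogonal, a constraint on Topic narrows the result set without saying anything about Time/Date or GeoRegion, and constraints can be combined freely.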

Hierarchical Faceted Metadata
Example: Biological Subject Headings
1. Anatomy [A]
2. Organisms [B]
3. Diseases [C]
4. Chemicals and Drugs [D]
5. Analytical, Diagnostic and Therapeutic Techniques and Equipment [E]
6. Psychiatry and Psychology [F]
7. Biological Sciences [G]
8. Physical Sciences [H]
9. Anthropology, Education, Sociology and Social Phenomena [I]
10. Technology and Food and Beverages [J]
11. Humanities [K]
12. Information Science [L]
13. Persons [M]
14. Health Care [N]
15. Geographic Locations [Z]

Hierarchical Faceted Metadata
1. Anatomy [A]
     Body Regions [A01]
         Abdomen [A01.047]
         Back [A01.176]
         Breast [A01.236]
         Extremities [A01.378]
         Head [A01.456]
         Neck [A01.598]
         ….
     Musculoskeletal System [A02]
     Digestive System [A03]
     Respiratory System [A04]
     Urogenital System [A05]
     ……
2. [B]
3. [C]
4. [D]
5. [E]
6. [F]
7. [G]
8. Physical Sciences [H]
     Electronics
         Amplifiers
         Electronics, Medical
         Transducers
     Astronomy
     Nature
     Time
     Weights and Measures
         Calibration
         Metric System
         Reference Standard
     ….
9. [I]
10. [J]
11. [K]
12. [L]
13. [M]
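
The bracketed codes on these slides encode each heading's position in the hierarchy: [A01.047] sits under [A01], which sits under [A]. A minimal sketch of how such codes can be unpacked follows, assuming the prefix convention holds for every code. The function names are hypothetical, and the label table contains only headings that appear on the slides.

```python
# Labels copied from the slides; only a handful are included here.
LABELS = {
    "A": "Anatomy",
    "A01": "Body Regions",
    "A01.047": "Abdomen",
    "A01.176": "Back",
    "A02": "Musculoskeletal System",
    "H": "Physical Sciences",
}


def ancestors(code):
    """Return the chain of codes from the top-level category down to `code`."""
    parts = code.split(".")
    chain = [code[0]]                 # one-letter top-level category, e.g. "A"
    if len(parts[0]) > 1:
        chain.append(parts[0])        # second level, e.g. "A01"
    for i in range(1, len(parts)):
        chain.append(".".join(parts[: i + 1]))   # deeper levels, e.g. "A01.047"
    return chain


def breadcrumb(code):
    """Render the path to a heading as a readable trail of labels."""
    return " > ".join(LABELS[c] for c in ancestors(code))


def is_under(code, ancestor):
    """True when `code` equals `ancestor` or lies somewhere beneath it."""
    return ancestor in ancestors(code)
```

For example, `breadcrumb("A01.047")` yields `Anatomy > Body Regions > Abdomen`, which is the kind of trail an interface can show so that users see where they are in a facet.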

The Interface Design
Chess metaphor
  – Opening
  – Middle game
  – End game

The Interface Design
Tightly Integrated Search
Supports Expand as well as Refine
Dynamically Generated Pages
  – Paths can be taken in any order
  – Links are idempotent
Consistent Color Coding
Consistent Backup and Bookmarking
Standard HTML
  – No JavaScript
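
The properties "paths can be taken in any order" and "links are idempotent" fall out naturally if the query state is modeled as a set of facet constraints. The sketch below is one possible model under that assumption, not a description of how Flamenco builds its pages; the `/browse` URL and the function names are hypothetical, and "expand" is simplified here to dropping a constraint.

```python
from urllib.parse import urlencode


def refine(query, facet, value):
    """Narrow the result set by adding a constraint (a no-op if already present)."""
    return query | {(facet, value)}


def expand(query, facet, value):
    """Broaden the result set by dropping a constraint (a no-op if absent)."""
    return query - {(facet, value)}


def link(query):
    """Build a canonical, bookmarkable URL for a query state.

    Sorting the constraints means the URL depends only on which constraints
    are active, not on the order in which the user picked them.
    """
    qs = urlencode(sorted(query))
    return "/browse" + ("?" + qs if qs else "")


# Two users reach the same state by different paths.
start = frozenset()
via_media = refine(refine(start, "Media", "woodcut"), "Location", "US")
via_place = refine(refine(start, "Location", "US"), "Media", "woodcut")
```

Because the state is a set, following the same refine link twice leaves it unchanged, and a plain URL is enough to support the browser's Back button and bookmarks without any client-side scripting.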

What is Tricky About This?
It is easy to do it poorly
  – Yahoo directory structure
It is hard not to be overwhelming
  – Most users prefer simplicity unless complexity really makes a difference
It is hard to make it flow
  – Can it feel like browsing the shelves?
  – Yes, but we iterated the design 3 times

Usability Study
Participants & Collection
  – 32 Art History Students
  – ~35,000 images from SF Fine Arts Museum
Study Design
  – Within-subjects
      – Each participant sees both interfaces
      – Balanced in terms of order and tasks
  – Participants assess each interface after use
  – Afterwards they compare them directly
Data recorded in behavior logs, server logs, paper surveys; one or two experienced testers at each trial.
Used 9-point Likert scales.
Session took about 1.5 hours; pay was $15/hour
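
A within-subjects design that is "balanced in terms of order and tasks" can be set up by rotating participants through every combination of interface order and task-set order. The sketch below shows one such counterbalancing scheme; it is an assumption about how the balancing could be done, not the study's documented procedure. The task-set labels "A" and "B" and the function name `assign` are hypothetical.

```python
from itertools import product

INTERFACES = ("FC", "Baseline")
TASK_SETS = ("A", "B")   # hypothetical labels for two matched sets of tasks


def assign(n_participants):
    """Rotate participants through all interface-order x task-order conditions."""
    conditions = list(product(
        [INTERFACES, INTERFACES[::-1]],   # which interface comes first
        [TASK_SETS, TASK_SETS[::-1]],     # which task set comes first
    ))
    plan = []
    for p in range(n_participants):
        order, tasks = conditions[p % len(conditions)]
        plan.append({
            "participant": p + 1,
            # Each session pairs one interface with one task set.
            "sessions": list(zip(order, tasks)),
        })
    return plan


plan = assign(32)
```

With 32 participants and four conditions, each condition is used eight times, so every interface is seen first by half the group and is paired with each task set equally often.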

The Baseline System
Floogle
Take the best of the existing keyword-based image search systems

[Screenshot slide; visible text: "sword"]

Hypotheses
We attempted to design tasks to test the following hypotheses:
  – Participants will experience greater search satisfaction, feel greater confidence in the results, produce higher recall, and encounter fewer dead ends using FC over Baseline
  – FC will be perceived to be more useful and flexible than Baseline
  – Participants will feel more familiar with the contents of the collection after using FC
  – Participants will use FC to create multi-faceted queries

Four Types of Tasks
  – Unstructured (3): Search for images of interest
  – Structured Task (11-14): Gather materials for an art history essay on a given topic, e.g.
      – Find all woodcuts created in the US
      – Choose the decade with the most
      – Select one of the artists in this period and show all of their woodcuts
      – Choose a subject depicted in these works and find another artist who treated the same subject in a different way.
  – Structured Task (10): Compare related images
      – Find images by artists from 2 different countries that depict conflict between groups.
  – Unstructured (5): Search for images of interest

Other Points
Participants were NOT walked through the interfaces.
The wording of Task 2 reflected the metadata; not the case for Task 3
Within tasks, queries were not different in difficulty (ts 0.05 according to post-task questions)
Flamenco is an order of magnitude slower than Floogle on average.
  – In task 2 users were allowed 3 more minutes in FC than in Baseline.
  – Time spent in tasks 2 and 3 was significantly longer in FC (about 2 min more).

Post-Interface Assessments
All significant at p<.05 except "simple" and "overwhelming"

Perceived Uses of Interfaces
[Figure: Baseline vs. FC]

Post-Test Comparison
Which Interface Preferable For:
  – Find images of roses
  – Find all works from a given period
  – Find pictures by 2 artists in same media
Overall Assessment:
  – More useful for your tasks
  – Easiest to use
  – Most flexible
  – More likely to result in dead ends
  – Helped you learn more
  – Overall preference
[Figure columns: FC, Baseline]

Study Results Summary
Strongly positive results for the faceted metadata interface.
Moderate use of multiple facets.
Strong preference over the current state of the art.
  – Chair of Architecture Dept: "It felt like I was browsing the shelves!"
  – This kind of enthusiasm is not seen in similarity-based image search interfaces.
Hypotheses are supported.

Study Summary
Usability studies done on 3 collections:
  – Recipes: 13,000 items
  – Architecture Images: 40,000 items
  – Fine Arts Images: 35,000 items
Conclusions:
  – Users like and are successful with the dynamic faceted hierarchical metadata, especially for browsing tasks
  – Very positive results, in contrast with studies on earlier iterations
  – Note: it seems you have to care about the contents of the collection to like the interface

Advantages of the Approach
Supports different search types
  – Highly constrained known-item searches
  – Open-ended, browsing tasks
  – Can easily switch from one mode to the other midstream
  – Can both expand and refine
Allows different people to add content without breaking things
Can make use of standard technology

Metadata Availability
Many collections already have rich metadata associated with them.
Automated methods are improving.
Have applied this to:
  – Tobacco documents archive
  – MEDLINE

Back to the Hypothesis
This kind of tool may be helpful for resolving metadata creation wars.
  – Multiple paths to get to the same item
  – Different views on different subsets of items
  – No need to force everything into one hierarchy
What do you think?
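
To make "multiple paths to get to the same item" concrete: when an item is described by several independent facets, every ordering of those facets is a valid route to it, so no single hierarchy has to be agreed upon. The sketch below simply enumerates those routes; the facet values and the function name `paths_to` are hypothetical.

```python
from itertools import permutations


def paths_to(item_facets):
    """List every order in which an item's facet values could be selected."""
    steps = sorted(item_facets.items())
    return [" -> ".join(f"{facet}: {value}" for facet, value in order)
            for order in permutations(steps)]


# Hypothetical item described by three facets.
woodcut = {"Media": "woodcut", "Location": "US", "Date": "1930s"}
routes = paths_to(woodcut)
```

An item with three facets has six routes, whereas a single fixed hierarchy would offer exactly one, which is where disagreements over "the right" top-level category tend to start.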