Faceted Metadata in Image Search & Browsing Using Words to Browse a Thousand Images Ka-Ping Yee, Kirsten Swearingen, Kevin Li, Marti Hearst Group for User.

Presentation transcript:

Faceted Metadata in Image Search & Browsing Using Words to Browse a Thousand Images Ka-Ping Yee, Kirsten Swearingen, Kevin Li, Marti Hearst Group for User Interface Research UC Berkeley CHI 2003 Research funded by: NSF CAREER Grant IIS IBM Faculty Fellowship

M. Hearst, Faceted Metadata in Search

Outline
How do people search and browse for images?
Current approaches:
  – Keywords
  – Spatial similarity
Our approach:
  – Hierarchical Faceted Metadata
  – Very careful UI design and testing
Usability Study
Conclusions

How do people want to search and browse images?
Ethnographic studies of people who use images intensely:
  – Finding specific objects is easy
      e.g., find images of the Empire State Building
  – Browsing is difficult
  – People want to use rich descriptions.

Ethnographic Study
Markkula & Sormunen ’00
  – Journalists and newspaper editors
  – Choosing photos from a digital archive
      Searching for specific objects is trivial
      Stressed a need for browsing
      Photos need to deal with themes, places, types of objects, views
  – Had access to a powerful interface, but it had 40 entry forms and was generally hard to use; no one used it.

Markkula & Sormunen ’00

Query Study
Armitage & Enser ’97
  – Analyzed 1,749 queries submitted to 7 image and film archives
  – Classified queries into a 3x4 facet matrix
      Rio Carnivals: Geo Location x Kind of Event
  – Concluded that users want to search images according to combinations of topical categories.

Ethnographic Study
Ame Elliott ’02
  – Architects
Common activities:
  – Use images for inspiration
      Browsing during early stages of design
  – Collage making, sketching, pinning up on walls
      This is different from illustrating PowerPoint slides
Maintain sketchbooks & shoeboxes of images
  – Young professionals have ~500, older ~5k
No formal organization scheme
  – None of 10 architects interviewed about their image collections used indexes
Do not like to use computers to find images

Current Approaches to Image Search
Keyword based
  – WebSeek (Smith and Jain ’97)
  – Commercial web image search systems
  – Commercial image vendors (Corbis, Getty)
  – Museum web sites

Current Approaches to Image Search
Using Visual “Content”
  – Extract color, texture, shape
      QBIC (Flickner et al. ’95)
      Blobworld (Carson et al. ’99)
      Piction: images + text (Srihari et al. ’91, ’99)
  – Two uses:
      Show a clustered similarity space
      Show those images similar to a selected one
  – Usability studies:
      Rodden et al.: a series of studies
      Clusters don’t work; showing textual labels is promising.

Rodden et al., CHI 2001

How Best to Support Browsing?
To support serendipity, want to view images that are related along multiple dimensions.
But clusters are not comprehensible.
Instead, allow users to “steer” through the multi-dimensional category space in a flexible manner.

Some Challenges
Users don’t like new search interfaces.
How to show lots more information without overwhelming or confusing?

Our Approach
Integrate the search seamlessly into the information architecture.
  – Use proper HCI methodologies.
Use faceted metadata:
  – More flexible than canned hyperlinks
  – Less complex than full search
  – Help users see where to go next and return to what happened previously

Metadata: data about data
Facets: orthogonal categories
Time/Date, Topic, Geo Region

Hierarchical Faceted Metadata
Example: Biological Subject Headings
1. Anatomy [A]
2. Organisms [B]
3. Diseases [C]
4. Chemicals and Drugs [D]
5. Analytical, Diagnostic and Therapeutic Techniques and Equipment [E]
6. Psychiatry and Psychology [F]
7. Biological Sciences [G]
8. Physical Sciences [H]
9. Anthropology, Education, Sociology and Social Phenomena [I]
10. Technology and Food and Beverages [J]
11. Humanities [K]
12. Information Science [L]
13. Persons [M]
14. Health Care [N]
15. Geographic Locations [Z]

Hierarchical Faceted Metadata
1. Anatomy [A]
     Body Regions [A01]
       Abdomen [A01.047]
       Back [A01.176]
       Breast [A01.236]
       Extremities [A01.378]
       Head [A01.456]
       Neck [A01.598]
       ….
     Musculoskeletal System [A02]
     Digestive System [A03]
     Respiratory System [A04]
     Urogenital System [A05]
     ……
2. [B]
3. [C]
4. [D]
5. [E]
6. [F]
7. [G]
8. Physical Sciences [H]
     Electronics
       Amplifiers
       Electronics, Medical
       Transducers
     Astronomy
     Nature
     Time
     Weights and Measures
       Calibration
       Metric System
       Reference Standard
     ….
9. [I]
10. [J]
11. [K]
12. [L]
13. [M]
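One way to picture hierarchical faceted metadata in code is to tag each item with one or more paths, each path running from a facet root down its hierarchy. The Python sketch below is only an illustration under that assumption: the item names (`doc1` and so on), the `ITEMS` table, and the `matches` and `select` helpers are hypothetical and are not taken from the Flamenco code. The category labels are borrowed from the subject headings shown on the slides.

```python
from typing import Dict, List, Tuple

# A facet value is a path from the facet root down the hierarchy.
FacetPath = Tuple[str, ...]

# Hypothetical items, each tagged with paths in one or more orthogonal facets.
ITEMS: Dict[str, List[FacetPath]] = {
    "doc1": [("Anatomy", "Body Regions", "Head"),
             ("Physical Sciences", "Electronics", "Amplifiers")],
    "doc2": [("Anatomy", "Body Regions", "Neck")],
    "doc3": [("Anatomy", "Digestive System"),
             ("Physical Sciences", "Weights and Measures", "Calibration")],
}


def matches(paths: List[FacetPath], prefix: FacetPath) -> bool:
    """True if any of the item's paths lies at or below `prefix`."""
    return any(path[:len(prefix)] == prefix for path in paths)


def select(items: Dict[str, List[FacetPath]],
           constraints: List[FacetPath]) -> List[str]:
    """Names of items that satisfy every constraint (one prefix per facet)."""
    return sorted(name for name, paths in items.items()
                  if all(matches(paths, c) for c in constraints))


# Selecting a node high in one hierarchy keeps everything beneath it.
print(select(ITEMS, [("Anatomy", "Body Regions")]))
# Combining two facets intersects the result sets.
print(select(ITEMS, [("Anatomy",), ("Physical Sciences",)]))
```

Because each constraint is a prefix, choosing a broad category such as Anatomy automatically includes items filed under any of its descendants, which is what lets a user start broad and narrow down.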

Questions we are trying to answer
How many facets are allowable?
Should facets be mixed and matched?
How much is too much?
Should hierarchies be progressively revealed, tabbed, some combination?
How should free-text search be integrated?

An Important Trend in Information Architecture Design
Generating web pages from databases
Implications:
  – Web sites can adapt to user actions
  – Web sites can be instrumented

A Taxonomy of Web Sites
[Figure: site types arranged by Complexity of Data and Complexity of Applications, each running from low to high. Site types shown: Web-Presence Sites, Catalog Sites, Service-Oriented Sites, Web-based Information Systems]
From: The (Short) Araneus Guide to Website development, by Mecca et al., Proceedings of WebDB ’99

The Interface Design
Chess metaphor
  – Opening
  – Middle game
  – End game

The Interface Design
Tightly Integrated Search
Supports Expand as well as Refine
Dynamically Generated Pages
  – Paths can be taken in any order
Consistent Color Coding
Consistent Backup and Bookmarking
Standard HTML
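The refine and expand operations, and the query previews mentioned later in the study results, can be sketched roughly as follows. This is a simplified Python illustration, not the Flamenco implementation: the `IMAGES` records and the `results`, `refine`, `expand`, and `preview` functions are hypothetical, and each image carries a single flat value per facet to keep the example short. The facet names (Media, Location, Date) come from the slides.

```python
from collections import Counter

# Hypothetical image records: one flat value per facet, for brevity.
IMAGES = [
    {"id": 1, "Media": "woodcut", "Location": "US", "Date": "1930s"},
    {"id": 2, "Media": "woodcut", "Location": "US", "Date": "1940s"},
    {"id": 3, "Media": "etching", "Location": "US", "Date": "1930s"},
    {"id": 4, "Media": "woodcut", "Location": "Japan", "Date": "1930s"},
]


def results(query):
    """Images matching every facet=value constraint in the query."""
    return [img for img in IMAGES
            if all(img[facet] == value for facet, value in query.items())]


def refine(query, facet, value):
    """Narrow the result set by adding a constraint on one facet."""
    return {**query, facet: value}


def expand(query, facet):
    """Broaden the result set by dropping the constraint on one facet."""
    return {f: v for f, v in query.items() if f != facet}


def preview(query, facet):
    """Query preview: result count for each value of `facet` under `query`."""
    return Counter(img[facet] for img in results(query))


query = refine(refine({}, "Media", "woodcut"), "Location", "US")
print([img["id"] for img in results(query)])
print(preview(query, "Date"))
print([img["id"] for img in results(expand(query, "Media"))])
```

Since the query is just a set of constraints, the same state is reached whichever facet is chosen first, which mirrors the point that paths can be taken in any order.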

What is Tricky About This?
It is easy to do it poorly
  – Yahoo directory structure
It is hard to be not overwhelming
  – Most users prefer simplicity unless complexity really makes a difference
It is hard to “make it flow”
  – Can it feel like “browsing the shelves”?

Project History
Identify Target Population
  – Architects, city planners
Needs assessment.
  – Interviewed architects and conducted contextual inquiries.
Lo-fi prototyping.
  – Showed paper prototype to 3 professional architects.
Design / Study Round 1.
  – Simple interactive version. Users liked metadata idea.
Design / Study Round 2:
  – Developed 4 different detailed versions; evaluated with 11 architects; results somewhat positive but many problems identified. Matrix emerged as a good idea.
Metadata revision.
  – Compressed and simplified the metadata hierarchies

Project History
Design / Study Round 3.
  – New version based on results of Round 2
  – Highly positive user response
Identified new user population/collection
  – Students and scholars of art history
  – Fine arts images
Study Round 4
  – Compare the metadata system to a strong, representative baseline

New Usability Study
Participants & Collection
  – 32 Art History Students
  – ~35,000 images from SF Fine Arts Museum
Study Design
  – Within-subjects
      Each participant sees both interfaces
      Balanced in terms of order and tasks
  – Participants assess each interface after use
  – Afterwards they compare them directly
Data recorded in behavior logs, server logs, paper surveys; one or two experienced testers at each trial.
Used 9 point Likert scales.
Session took about 1.5 hours; pay was $15/hour
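A within-subjects design that is balanced for order and tasks is commonly set up by rotating participants through the possible combinations. The Python sketch below shows one such counterbalancing scheme. It is an assumption about how the balancing could be done, not a description of the study's actual assignment procedure: the task-set labels `A` and `B` and the `assign` function are hypothetical.

```python
from collections import Counter
from itertools import product

INTERFACES = ("FC", "Baseline")
TASK_SETS = ("A", "B")  # hypothetical labels for two matched sets of tasks


def assign(n_participants):
    """Rotate participants through the four (first interface, first task set)
    conditions so that order and task pairing are balanced."""
    conditions = list(product(INTERFACES, TASK_SETS))
    plan = []
    for p in range(n_participants):
        first_ui, first_tasks = conditions[p % len(conditions)]
        # The second session uses the other interface and the other task set.
        second_ui = INTERFACES[1 - INTERFACES.index(first_ui)]
        second_tasks = TASK_SETS[1 - TASK_SETS.index(first_tasks)]
        plan.append(((first_ui, first_tasks), (second_ui, second_tasks)))
    return plan


plan = assign(32)
# How many participants start in each of the four conditions.
print(Counter(first for first, _ in plan))
```

With 32 participants the rotation places the same number of people in each of the four starting conditions, so neither interface benefits systematically from being seen first or from an easier task set.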

The Baseline System
Floogle
Take the best of the existing keyword-based image search systems

Comparison of Common Image Search Systems

System      Collection         # Results/page   Categories?   # Familiar
Google      Web                20               No            27
AltaVista   Web                15               No            8
Corbis      Photos             9-36             No            8
Getty       Photos, Art        12-90            Yes           6
MS Office   Photos, Clip art   6-100            Yes           N/A
Thinker     Fine arts images   10               Yes           4
BASELINE    Fine arts images   40               Yes           N/A

[Screenshot; query shown: sword]

Evaluation Quandary
How to assess the success of browsing?
  – Timing is usually not a good indicator
  – People often spend longer when browsing is going well.
      Not the case for directed search
  – Can look for comprehensiveness and correctness (precision and recall) …
  – … But subjective measures seem to be most important here.
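For reference, precision is the fraction of retrieved items that are relevant, and recall is the fraction of relevant items that were retrieved. A minimal Python sketch of the standard definitions follows; the `precision_recall` function name and the example identifiers are illustrative only.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved item ids
    against the set of ids judged relevant."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# A participant collects images 1-4; images 2, 4 and 6 were the relevant ones.
precision, recall = precision_recall([1, 2, 3, 4], [2, 4, 6])
print(f"precision={precision:.2f} recall={recall:.2f}")
```

A participant who "found all relevant results" in the sense used in the recall scores corresponds to a recall of 1.0 under this definition.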

Hypotheses
We attempted to design tasks to test the following hypotheses:
  – Participants will experience greater search satisfaction, feel greater confidence in the results, produce higher recall, and encounter fewer dead ends using FC over Baseline
  – FC will be perceived to be more useful and flexible than Baseline
  – Participants will feel more familiar with the contents of the collection after using FC
  – Participants will use FC to create multi-faceted queries

Four Types of Tasks
  – Unstructured (3): Search for images of interest
  – Structured Task (11-14): Gather materials for an art history essay on a given topic, e.g.
      Find all woodcuts created in the US
      Choose the decade with the most
      Select one of the artists in this period and show all of their woodcuts
      Choose a subject depicted in these works and find another artist who treated the same subject in a different way.
  – Structured Task (10): compare related images
      Find images by artists from 2 different countries that depict conflict between groups.
  – Unstructured (5): search for images of interest

Other Points
Participants were NOT walked through the interfaces.
The wording of Task 2 reflected the metadata; not the case for Task 3
Within tasks, queries were not different in difficulty (t’s 0.05 according to post-task questions)
Flamenco is an order of magnitude slower than Floogle on average.
  – In task 2 users were allowed 3 more minutes in FC than in Baseline.
  – Time spent in tasks 2 and 3 was significantly longer in FC (about 2 min more).

Results
Participants felt significantly more confident they had found all relevant images using FC (Task 2: t(62)=2.18, p<.05; Task 3: t(62)=2.03, p<.05)
Participants felt significantly more satisfied with the results (Task 2: t(62)=3.78, p<.001; Task 3: t(62)=2.03, p<.05)
Recall scores:
  – Task 2a: In Baseline 57% of participants found all relevant results, in FC 81% found all.
  – Task 2b: In Baseline 21% found all relevant, in FC 77% found all.

Post-Interface Assessments
All significant at p<.05 except simple and overwhelming

Perceived Uses of Interfaces
[Chart: Baseline vs. FC]

Post-Test Comparison
[Chart: FC vs. Baseline]
Which Interface Preferable For:
  – Find images of roses
  – Find all works from a given period
  – Find pictures by 2 artists in same media
Overall Assessment:
  – More useful for your tasks
  – Easiest to use
  – Most flexible
  – More likely to result in dead ends
  – Helped you learn more
  – Overall preference

Facet Usage
Facets driven largely by task content
  – Multiple facets 45% of time in structured tasks
For unstructured tasks,
  – Artists (17%)
  – Date (15%)
  – Location (15%)
  – Others ranged from 5-12%
  – Multiple facets 19% of time
From end game, expansion from
  – Artists (39%)
  – Media (29%)
  – Shapes (19%)

Qualitative Observations
Baseline:
  – Simplicity, similarity to Google a plus
  – Also noted the usefulness of the category links
FC:
  – Starting page “well-organized”, gave “ideas for what to search for”
  – Query previews were commented on explicitly by 9 participants
  – Commented on matrix prompting where to go next
      3 were confused about what the matrix shows
  – Generally liked the grouping and organizing
  – End game links seemed useful; 9 explicitly remarked positively on the guidance provided there.
  – Often get requests to use the system in future

Study Results Summary
Strongly positive results for the faceted metadata interface.
Moderate use of multiple facets.
Strong preference over the current state of the art.
  – Chair of Architecture Dept: “It felt like I was browsing the shelves!”
  – This kind of enthusiasm is not seen in similarity-based image search interfaces.
Hypotheses are supported.

Implementation
All open source code
  – MySQL database
  – Python web server (WebKit)
  – Python code
  – Lucene search engine (Java)
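With facet assignments stored in a relational database, the counts shown next to each category can be produced by a grouped query. The sketch below is a guess at the general shape of such a query, not the project's actual schema or code: the `item_facet` table, its columns, and the `facet_counts` function are hypothetical, and Python's built-in `sqlite3` module stands in for MySQL so the example runs without a database server.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL store (an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item_facet (item_id INTEGER, facet TEXT, value TEXT);
    INSERT INTO item_facet VALUES
        (1, 'Media', 'woodcut'), (1, 'Location', 'US'),
        (2, 'Media', 'woodcut'), (2, 'Location', 'Japan'),
        (3, 'Media', 'etching'), (3, 'Location', 'US');
""")


def facet_counts(conn, facet, where_facet, where_value):
    """Count items per value of `facet`, restricted to items that carry
    the already-selected constraint where_facet = where_value."""
    rows = conn.execute(
        """
        SELECT f.value, COUNT(DISTINCT f.item_id)
        FROM item_facet AS f
        JOIN item_facet AS sel ON sel.item_id = f.item_id
        WHERE f.facet = ? AND sel.facet = ? AND sel.value = ?
        GROUP BY f.value
        ORDER BY f.value
        """,
        (facet, where_facet, where_value),
    )
    return rows.fetchall()


# Where were the woodcuts made?
print(facet_counts(conn, "Location", "Media", "woodcut"))
```

Each additional selected facet would add another self-join in this layout; a production system might precompute or cache such counts instead.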

Metadata Availability
Many collections already have rich metadata associated with them.
Automated methods are improving.
This tool may be helpful for resolving metadata creation wars.

Summary
Usability studies done on 3 collections:
  – Recipes: 13,000 items
  – Architecture Images: 40,000 items
  – Fine Arts Images: 35,000 items
Conclusions:
  – Users like and are successful with the dynamic faceted hierarchical metadata, especially for browsing tasks
  – Very positive results, in contrast with studies on earlier iterations
  – Note: it seems you have to care about the contents of the collection to like the interface

Advantages of the Approach
Supports different search types
  – Highly constrained known-item searches
  – Open-ended, browsing tasks
  – Can easily switch from one mode to the other midstream
  – Can both expand and refine
Allows different people to add content without breaking things
Can make use of standard technology

Other Domains
Applying this to
  – Text
      Tobacco Documents Archives
      Medline biomedical texts
  – Products/Catalogs
      Don’t have a collection; would like one

Future Work
What about information visualization?
How to integrate with relevance feedback (more like this)?
How to incorporate user preferences and past behavior?
How to combine facets to reflect tasks?

Thanks to:
Andrea Sahli
Rashmi Sinha
NSF CAREER Grant IIS
IBM Faculty Fellowship
Try the Demo: flamenco.berkeley.edu