Incorporating Metadata into Search User Interfaces Marti Hearst UC Berkeley March 2001
Main Ideas Search is changing: More emphasis on flexibly showing next choices Less emphasis on ranking Web design is changing: More emphasis on dynamically determined views Less emphasis on pre-determined links Two key ideas: Task-specific design Harnessing the power of metadata
Outline Background Search Metadata Information architecture Two web-based examples Recipes Sporting Goods Two collection-based examples Medical text Architectural Images Conclusions
From Web Search is Working! Survey finds high user satisfaction. Study by NPD Group.
Why is Web Search Working? Web Search is successful at finding good starting points (home pages) Evidence: Search engines using Link analysis Page popularity Directory categories These all find dominant home pages
Consequences Web search engines are providing source selection! So … what happens when the user reaches the site? Follow Links … or … Search
Following Hyperlinks Works great when it is clear where to go next Frustrating when the desired directions are undetectable or unavailable Site Search This is not getting good reviews
An Analogy text search hypertext
Analogy Hypertext: A fixed number of choices of where to go next; A glance at the map tells you where you are; But may not go where you want to go. To get from Topeka to Santa Fe, may have to go through Frostbite Falls Site Search: Can go anywhere; But may get stuck, disoriented, in a crevasse!
Goal: An All-Terrain Vehicle The best of both techniques: a vehicle that magically lays down track to suggest choices of where you want to go next, based on what you’ve done so far and what you are trying to do. The tracks follow the lay of the land and go everywhere, but cross over the crevasses. The tracks allow you to back up easily.
How to make an all-terrain vehicle? Two ideas: Focus on the task. Use metadata explicitly.
The Importance of the Task Data-centric vs. task-centric: Searching patent databases vs. proving non-infringement; browsing newsgroups vs. finding the denial-of-service hacker; getting all recent news vs. anticipating the competition.
The Importance of the Task: Indirect Evidence How does Web page download time affect usability? In one study (56 kbit modem), Jared Spool’s UIE team found: Amazon: 36 sec/page (avg); About.com: 8 sec/page (avg). Users rated the sites: Fastest: Amazon; Slowest: About.com. Why?
The Importance of the Task Perceived speed Strong correlation between perceived speed and whether the users felt they completed their task Strong correlation between perceived speed and whether the users felt they always knew what to do next (scent).
Metadata Metadata is: Data about data Structures and languages for the description of information resources and their elements
Types of Metadata systems & standards Naming and ID systems – URLs, ISBNs Bibliographic description – MARC, Dublin Core, TEI, etc. Music -- SMDL Images and objects – CIMI Numeric Data – DDI, SDSM Geospatial Data – FGDC Collections – EAD
Thesauri (Categories) A collection of selected vocabulary Broader, narrower, related-to relations Describe the content Medical text Anatomy, Disease, Chemicals, Procedures… Architectural images Location, Style, Materials, Period … Recipes! Cuisine, Ingredients, Season, Calories … These are often organized as hierarchical and faceted
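The broader, narrower, and related-to relations can be sketched as a small data structure. This is only an illustration: the class names `Term` and `Thesaurus`, the method names, and the sample recipe terms are assumptions made for the example, not part of any system described in the talk.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """One vocabulary entry with its three kinds of thesaurus relations."""
    name: str
    broader: set = field(default_factory=set)
    narrower: set = field(default_factory=set)
    related: set = field(default_factory=set)


class Thesaurus:
    """Hypothetical container for a selected vocabulary."""

    def __init__(self):
        self.terms = {}

    def term(self, name):
        # Create the term on first use so relations can be added in any order.
        if name not in self.terms:
            self.terms[name] = Term(name)
        return self.terms[name]

    def add_narrower(self, broad, narrow):
        # Broader and narrower are inverse relations, so record both sides.
        self.term(broad).narrower.add(narrow)
        self.term(narrow).broader.add(broad)

    def add_related(self, a, b):
        # Related-to is treated as symmetric in this sketch.
        self.term(a).related.add(b)
        self.term(b).related.add(a)

    def descendants(self, name):
        # Collect every term reachable through narrower links.
        seen, stack = set(), [name]
        while stack:
            for child in self.term(stack.pop()).narrower:
                if child not in seen:
                    seen.add(child)
                    stack.append(child)
        return seen


# Made-up sample terms for one facet of a recipe thesaurus.
recipes = Thesaurus()
recipes.add_narrower("Ingredients", "Fish")
recipes.add_narrower("Fish", "Sea bass")
recipes.add_narrower("Ingredients", "Pasta")
recipes.add_related("Pasta", "Italian")
```

A faceted thesaurus would hold one such hierarchy per facet (Cuisine, Ingredients, Season, and so on), so that an item can be described by terms from several hierarchies at once.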
New interfaces are mixing and matching thesaurus-style metadata: Time/Date, Topic, Role, GeoRegion. The question: how to do this effectively?
What about Yahoo? Let’s try to find UCB
What about Yahoo?
Where is UCB?
Yahoo does use some metadata well Yahoo restaurant guide combines: Region Topic (restaurants) Related Information Other attributes (cuisines) Other topics related in place and time (movies)
Green: restaurants & attributes Red: related in place & time Yellow: geographic region
Combining Information Types Region State City A & E Film Theatre Music Restaurants California Eclectic Indian French Assumed task: looking for evening entertainment
Other Possible Combinations Region + A&E City + Restaurant + Movies City + Weather City + Education: Schools Restaurants + Schools …
Bookstore preview combinations topic + related topics topic + publications by same author topic + books of same type but related topic
Goals for Metadata Usage Well-integrated with search Provides useful hints of where to go next Tailored to task as it develops Personalized Dynamic
Recipe Example
soar.berkeley.edu/recipes
Epicurious Metadata Usage Advantages Creates combinations of metadata on the fly. Different metadata choices show the same information in different ways. Previews show how many recipes will result. Easy to back up. Supports several task types: “Help me find a summer pasta” (ingredient type with event type), “How can I use an avocado in a salad?” (ingredient type with dish type), “How can I bake sea-bass?” (preparation type and ingredient type).
A View of Web Site Structure (Newman et al. 00) Information design structure, categories of information Navigation design interaction with information structure Graphic design visual presentation of information and navigation (color, typography, etc.) Courtesy of Mark Newman
Information Architecture includes management and more responsibility for content User Interface Design includes testing and evaluation Information Architecture vs. UI (Newman et al. 00) Courtesy of Mark Newman
Recipe Information Architecture Information design Recipes have five types of metadata categories Cuisine, Preparation, Ingredients, Dish, Occasion Each category has one level of subcategories
Recipe Information Architecture Navigation design Home page: show top level of all categories Other pages: A link on an attribute ANDS that attribute to the current query; results are shown according to a category that is not yet part of the query A change-view link does not change the query, but does change which category’s metadata organizes the results
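The navigation rules above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function names, the facet keys, and the three sample recipes are invented for the example and do not come from the Epicurious site.

```python
from collections import defaultdict

# The five metadata categories named in the information design.
FACETS = ("cuisine", "preparation", "ingredient", "dish", "occasion")

# Made-up sample data; a recipe may carry several values per facet.
RECIPES = [
    {"title": "Avocado salad", "cuisine": {"Mexican"}, "preparation": {"Raw"},
     "ingredient": {"Avocado"}, "dish": {"Salad"}, "occasion": {"Summer"}},
    {"title": "Baked sea bass", "cuisine": {"French"}, "preparation": {"Bake"},
     "ingredient": {"Sea bass"}, "dish": {"Main"}, "occasion": {"Dinner party"}},
    {"title": "Summer pasta", "cuisine": {"Italian"}, "preparation": {"Boil"},
     "ingredient": {"Pasta", "Avocado"}, "dish": {"Main"}, "occasion": {"Summer"}},
]


def matches(recipe, query):
    """A recipe matches when it carries every selected attribute (AND)."""
    return all(value in recipe[facet] for facet, value in query.items())


def select(query, facet, value):
    """Follow an attribute link: AND the attribute to the current query."""
    if facet in query:
        raise ValueError("only one value per category in this sketch")
    new_query = dict(query)
    new_query[facet] = value
    return new_query


def default_view(query):
    """Pick a category that is not yet part of the query, if any remains."""
    for facet in FACETS:
        if facet not in query:
            return facet
    return None


def grouped_results(recipes, query, view=None):
    """Group matching recipes by the chosen view; the query is unchanged."""
    view = view or default_view(query)
    hits = [r for r in recipes if matches(r, query)]
    if view is None:
        return {"all": [r["title"] for r in hits]}
    groups = defaultdict(list)
    for recipe in hits:
        for value in sorted(recipe[view]):
            groups[value].append(recipe["title"])
    return dict(groups)


query = select({}, "ingredient", "Avocado")
by_cuisine = grouped_results(RECIPES, query)            # default view
by_dish = grouped_results(RECIPES, query, view="dish")  # change-view link
```

Here `select` models a link on an attribute, while passing a different `view` models the change-view link: the same two avocado recipes are shown, first organized by cuisine and then by dish.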
Metadata usage in Epicurious [Diagram sequence: the categories Prepare, Cuisine, Ingredient, and Dish describe a Recipe. The user selects Ingredient (I); results are then grouped by one of the remaining categories (Prepare, Cuisine, Dish); after a further selection, Prepare and Cuisine remain for grouping.]
Metadata Usage in Epicurious Can choose category types in any order But categories never more than one level deep And can never use more than one instance of a category Even though items may be assigned more than one of each category type Items (recipes) are dead-ends Don’t link to “more like this” Not fully integrated with search
Epicurious Metadata Usage Problem: lacks integration with search
Sporting Goods Example
REI example
REI example -- searching
REI example
REI doesn’t seem to be “conscious” of its metadata Doesn’t seem to be integrating the product metadata with the text information Don’t find search hits in “learn and share” Hard-codes relations Camping product attribute linked to the interior of a pre-coded page No “breadcrumbs”
DSG example
Seems to be doing many things right But … maybe too much Extensive, dynamic use of metadata and query previews and postviews Complex relationship between search, information design, and navigation design Hitting against some strange edge cases
The FLAMENCO Project FLexible Access using MEtadata in Novel COmbinations Main goal: Perform systematic studies to determine how metadata should be incorporated into search Answer questions such as: Given a set of user goals and a set of information with certain characteristics (size, inter-connectivity) How many metadata combinations to show? What level of detail to show? How best to preview and postview choices?
The FLAMENCO Project Focusing on very large collections whose items are not easily classified Medical text, image databases However, much should apply to website design as well
Evaluation Methodology Regression Test Select a set of tasks Use these throughout the evaluation Start with a baseline system Evaluate using the test tasks Add a feature Evaluation again Compare to baseline Only retain those changes that improve results
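The evaluation loop can be sketched as follows. This is a hedged illustration, not the project's actual procedure: the function names, the scoring convention (a system is a callable returning a numeric score per task), and the toy feature effects are all assumptions. The sketch also assumes that each new feature is compared against the best configuration retained so far, which is one reading of "compare to baseline".

```python
def evaluate(system, tasks):
    """Mean score of a system over the fixed set of test tasks."""
    return sum(system(task) for task in tasks) / len(tasks)


def stepwise_evaluation(baseline_features, candidates, build_system, tasks):
    """Add one feature at a time and keep it only if the score improves."""
    kept = list(baseline_features)
    best = evaluate(build_system(kept), tasks)
    history = [("baseline", best)]
    for feature in candidates:
        trial = kept + [feature]
        score = evaluate(build_system(trial), tasks)
        history.append((feature, score))
        if score > best:  # retain only changes that improve results
            kept, best = trial, score
    return kept, best, history


# Toy stand-in for a real user study: invented effect sizes per feature.
TOY_EFFECT = {"query previews": 0.2, "extra clutter": -0.1}


def build_toy_system(features):
    bonus = sum(TOY_EFFECT[f] for f in features)
    return lambda task: 0.5 + bonus


tasks = ["find a summer pasta", "use an avocado in a salad"]
kept, best, history = stepwise_evaluation(
    [], ["query previews", "extra clutter"], build_toy_system, tasks
)
```

In practice the score would come from observing users on the same tasks at every step, which is what makes the comparison between configurations meaningful.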
First: determine appropriate functionality Later: Incorporate more sophisticated displays
Application to Biomedical Text
Asthma > Steroids 1. A steroid-induced acute psychosis in a child with asthma. 2. Management of steroid-dependent asthma with methotrexate. Steroids: Pregnanes, Pregnadienes (5), Prednisone (5), Pregnenes, Budesonide (4), Corticosterone (3). Other Views: Admin & Dosage (50), Drug Effects (20), Therapeutic Use (25), Risk Factors (4), More … User Preferred: Musculoskeletal (4), Drug Resistance (6). All Categories (99). 99 Documents: [Sort by author] [Sort by popularity] [Sort by Steroids] [Cluster] 1. Effect of short-course budesonide on the bone turnover of asthmatic children. 2. Effect of prednisone on response to influenza virus vaccine in asthmatic children. …
Asthma > Steroids > Admin & Dosage 1. Dosage levels for asthmatic steroids: A survey. Steroids: Pregnanes, Pregnadienes (3), Prednisone (5). Related Categories: Inhalators (40), Emotional Effects (25), Preferred Suppliers (30). User Preferred: Musculoskeletal (0), Drug Resistance (2). All Categories (50). 50 Documents: [Sort by author] [Sort by popularity] [Sort by Dosage] [Cluster] 1. Optimal dosage levels for prednisone in the treatment of childhood asthma. 2. …
Other paths: back up and go forward Asthma > Steroids > Budesonide > Huang Asthma > Huang > Budesonide Asthma > Steroids Asthma > Steroids > Budesonide
Medical example Use dynamic previews Allow user to select metadata in any order At each step, show different types of relevant metadata, based on prior steps and personal history, include # of documents Previews restricted to only those metadata types that might be helpful
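A dynamic preview of this kind can be sketched as a count over the current result set. The sketch below is illustrative only: the function names are invented, the document titles are taken from the mock-up, but the category assignments for each document are assumptions, and personal history is left out for brevity.

```python
from collections import Counter

# Titles from the mock-up; the category assignments are assumed for the demo.
DOCS = [
    {"title": "Effect of short-course budesonide on the bone turnover of "
              "asthmatic children.",
     "terms": {"Asthma", "Steroids", "Budesonide", "Drug Effects"}},
    {"title": "Effect of prednisone on response to influenza virus vaccine "
              "in asthmatic children.",
     "terms": {"Asthma", "Steroids", "Prednisone", "Drug Effects"}},
    {"title": "Optimal dosage levels for prednisone in the treatment of "
              "childhood asthma.",
     "terms": {"Asthma", "Steroids", "Prednisone", "Admin & Dosage"}},
    {"title": "Management of steroid-dependent asthma with methotrexate.",
     "terms": {"Asthma", "Steroids", "Therapeutic Use"}},
]


def results(docs, path):
    """Documents carrying every category on the path, in any order."""
    selected = set(path)
    return [d for d in docs if selected <= d["terms"]]


def previews(docs, path):
    """For each category not yet chosen, count the documents it would keep."""
    counts = Counter()
    for doc in results(docs, path):
        counts.update(doc["terms"] - set(path))
    # Categories that would lead to zero documents never appear in the counts.
    return dict(counts)


step_one = previews(DOCS, ["Asthma", "Steroids"])
step_two = previews(DOCS, ["Asthma", "Steroids", "Prednisone"])
```

Because the path is treated as a set, `Asthma > Steroids` and `Steroids > Asthma` reach the same documents, which is what lets the user select metadata in any order and back up along any step.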
Dynamic Metadata Previews How different from Yahoo & Amazon? Dynamically determine what to show next Yahoo’s combos are predefined Amazon’s are also predefined, and limited to taste and general topic only A way to seamlessly integrate Related topics User preferences (personalization) Context-sensitivity
Application to Image Search
Image Search: What is the task? Illustrate my slides? “Find a crevasse” Keyword match works pretty well Find inspiration for an architectural design? General similarity: maybe But more control might be better
How different from medical example? More open-ended. Easier to scan many images quickly. Terrain metaphor not used here. Not narrowing down a large set; rather, always viewing more images. A mechanism for “steering” through the metadata.
Our Approach Architecture task: Emphasize images over text Use hypertext-style interface as a reasonable baseline for comparison Find out how much choice is too much Find out whether explicit metadata is better than implicit more-like-this
SPIRO: >40,000 art & architecture images Detailed metadata
SPIRO Query Form
SPIRO query on Subject: church
A Better Example Greatbuildings.com Hyperlinks metadata together But a small collection ~1000 buildings ~4500 images total
Our Approach Create a system that allows experimentation with different interfaces Add functionality in a stepwise fashion Architecture task: Emphasize images over text Use hypertext-style interface as a reasonable baseline for comparison Find out how much choice is too much Find out whether explicit metadata is better than implicit more-like-this
Summary Standard search is too flexible. Hyperlinks are too restrictive. Metadata is being mixed and matched in interesting ways, but how is not well understood: in information structure, in navigation structure, in database design.
Summary Our goals Systematically determine what works, with the following emphases: Task-centric Integrate metadata with search Dynamic previews Easily retrace steps Develop recommendations that reflect both the task structure and the richness of the information structure
Conclusions Search & hypertext are becoming more interwoven. Metadata is being mixed and matched in interesting ways, but how is not well understood: in information structure, in navigation structure, in database design.
Conclusions Our goals Systematically determine what works, with the following emphases: Task-centric Integrate metadata with search Dynamic previews Easily retrace steps Develop recommendations that reflect both the task structure and the richness of the information structure In future: integrate with more sophisticated displays