Considering a Faceted Search-based Model Marti Hearst UCB SIMS NAS CSTB DNS Meeting on Internet Navigation and the Domain Name.

Presentation transcript:

Considering a Faceted Search-based Model Marti Hearst UCB SIMS NAS CSTB DNS Meeting on Internet Navigation and the Domain Name System: Technical Alternatives and Policy Implications July 12, 2001

Outline
- The Klensin proposal
  - Synopsis
  - Issues
  - Recommendations
- UIs and faceted search

A Proposal
- A Search-based access model for the DNS
  - IETF Internet-Draft by John Klensin
- A multi-layer approach to naming
  - Faceted descriptions are used to facilitate both flexible naming and inexact search
- This talk: What does research tell us about the search issues?

Klensin’s proposal (layer diagram)
- Layer 4: Free-text Search (unregulated)
- Layer 3: Faceted System (detailed, unregulated)
- Layer 2: Faceted Classification System (simple, regulated)
- Layer 1: DNS (unchanged)
- Diagram labels: Search, Lookup

Layer 2
- Industry Category: Restaurant
- Geolocation: Miami
- Language: Spanish
- Network Location
- Name: Jose’s Pizza

Layer 2: Faceted System (simple, regulated)
- Inputs: search values for one or more facets
- Outputs: appropriate DNS names and all tuples with matched facets
- Allow for partial (fuzzy) match
- Example:
  - Jose’s Pizza, Miami
  - Alberto’s Pizza, Miami
  - Jose’s Bistro, Miami
  - Jose’s Pizza, Saratoga
  - Joe’s Pizza, Miami
  - …
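The layer 2 behaviour described on this slide (facet values in, DNS names and matching tuples out, with partial matches allowed) can be sketched roughly as follows. This is only an illustration, not anything specified in the Klensin draft: the record layout, the `.example` DNS names, the `facet_search` function, the 0.5 threshold, and the choice of `difflib.SequenceMatcher` as the fuzzy matcher are all assumptions.

```python
from difflib import SequenceMatcher

# Hypothetical layer 2 records: facet values plus the DNS name they resolve to.
# The DNS names are invented placeholders.
RECORDS = [
    {"name": "Jose's Pizza", "geolocation": "Miami", "dns": "joses-pizza-miami.example"},
    {"name": "Alberto's Pizza", "geolocation": "Miami", "dns": "albertos-pizza.example"},
    {"name": "Jose's Bistro", "geolocation": "Miami", "dns": "joses-bistro.example"},
    {"name": "Jose's Pizza", "geolocation": "Saratoga", "dns": "joses-pizza-saratoga.example"},
    {"name": "Joe's Pizza", "geolocation": "Miami", "dns": "joes-pizza.example"},
]


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in the range [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def facet_search(query: dict, records=RECORDS, threshold: float = 0.5):
    """Rank records by mean similarity over the facets named in the query."""
    if not query:
        return []
    hits = []
    for record in records:
        scores = [similarity(value, record.get(facet, "")) for facet, value in query.items()]
        score = sum(scores) / len(scores)
        if score >= threshold:  # assumed cut-off for a "partial match"
            hits.append((score, record))
    return sorted(hits, key=lambda hit: hit[0], reverse=True)


results = facet_search({"name": "Jose's Pizza", "geolocation": "Miami"})
for score, record in results:
    print(f"{score:.2f}  {record['name']}, {record['geolocation']} -> {record['dns']}")
```

Because several near matches come back, the caller still has to pick one, which is the confirmation step the later slides argue needs a user interface.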

Layer 2 Selling Points
- Allows sharing of name space among different (commercial) entities
- Allows specification according to meaningful attributes

Layer 2 DNS Issues
- How to guarantee uniqueness?
- How to determine appropriate descriptors?
- How to use in a hyperlink?
- Requires a user interface for confirmation of correct choice

Layer 2 Descriptor Issues
- Emphasis on geolocation may be problematic
- May be too sparse
  - SFMOMA
  - SFMOMA exhibits
  - SFMOMA exhibit on digital art called

Layer 3: Faceted System (detailed, unregulated)
- Not centrally coordinated (provided by commercial services)
- More detailed facets
- Allow for inheritance
- Context-sensitive (e.g., restaurant has menu attribute, auto repair has services, etc.)
- Inputs: service-dependent
- Outputs: layer 2 names
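One way to picture "inheritance" and "context-sensitive" attributes at layer 3 is a category schema in which each category adds attributes to those of its parent. The sketch below is purely illustrative: the `SCHEMA` contents, the `business` root category, and the `attributes_for` helper are assumptions, not part of the proposal.

```python
# Hypothetical layer 3 schema: each category lists its own attributes and an
# optional parent category whose attributes it inherits.
SCHEMA = {
    "business": {"parent": None, "attributes": ["name", "geolocation", "language"]},
    "restaurant": {"parent": "business", "attributes": ["menu"]},
    "auto repair": {"parent": "business", "attributes": ["services"]},
}


def attributes_for(category: str, schema=SCHEMA) -> list:
    """Return inherited attributes first, then the category's own attributes."""
    chain = []
    while category is not None:
        node = schema[category]
        chain.append(node["attributes"])
        category = node["parent"]
    # The chain was collected child-first, so reverse it to list ancestors first.
    return [attr for attrs in reversed(chain) for attr in attrs]


print(attributes_for("restaurant"))
print(attributes_for("auto repair"))
```

A service built this way would show a menu field only for restaurants and a services field only for auto repair shops, while both share the common facets.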

Layer 4: Free-text Search (unregulated)
- Use standard search to find sites that discuss topics that relate to the query (as web search works today)

Relation to Web Search
- Web search is perceived to work better today than two years ago. Why?
  - Finds appropriate starting points
  - Also known as source selection
  - Search for “toyota” no longer returns “Tony’s Toyota pages” as the top-ranked hit
- Before the web, source selection was a separate operation from free text search
  - Also, queries tended to be longer
- Web search engines could do this exclusively – but they do other things as well.

Recommendations on Klensin Proposal
- A promising, intriguing approach
- One tweak: Combine layers 2 and 3
  - Have a partly regulated portion, and an open portion
  - This however is susceptible to spamming
- Not clear if this should be regulated

General Pitfalls of Controlled Vocabularies
- Difficult to get agreement on the set of labels
- Difficult to assign labels consistently
  - Granularity
  - Salience / Emphasis
  - Context
  - Connotations
- New labels always appearing; old ones shift in meaning
- Lay people won’t know the system

How to do it wrong
- Force into a Hierarchy
- Let’s try to find UCB

What is the problem?
- Two deeply hierarchical facets
  - Region
  - Education
- Forced in convoluted ways into one hierarchy with irregular cross links

Two Approaches
- Statistical approaches map words into metadata terms
- Create flexible user interfaces that progressively reveal appropriate subparts of the system (How to do so is a topic of our research.)

The Practice
- Using descriptors “under the hood”
  - The limited empirical work indicates combining free text + descriptors works best
  - Some e-commerce sites do this for finding products
  - Can sometimes match queries to standard information needs
    - “buy” + “palm”
    - “review” + “crouching tiger”
    - “berkeley” + “gap”
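A rough sketch of what matching a query to a standard information need could look like: cue words are mapped to descriptors and the remainder is passed on as free text. The cue table, the descriptor labels such as `task:purchase`, and the `split_query` helper are invented for illustration; real sites presumably use richer, often statistical, mappings.

```python
# Hypothetical cue words mapped to metadata descriptors. The labels are made up.
CUES = {
    "buy": "task:purchase",
    "review": "task:read-review",
    "berkeley": "geo:berkeley",
}


def split_query(query: str, cues=CUES):
    """Separate a query into matched descriptors and leftover free text."""
    descriptors, free_text = [], []
    for token in query.lower().split():
        if token in cues:
            descriptors.append(cues[token])
        else:
            free_text.append(token)
    return descriptors, " ".join(free_text)


for query in ("buy palm", "review crouching tiger", "berkeley gap"):
    print(query, "->", split_query(query))
```

The descriptors can then restrict or rerank results "under the hood" while the free-text part is searched as usual.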

walmart.com
- Uses metadata “under the hood”

The Promise
- Using descriptors in the User Interface
- Use faceted metadata for navigation
  - Query Previews
  - Tailored Search Forms
  - Tightly Combine Navigation & Search

Facets
- Orthogonal sets of descriptors
- Gets complicated when they are hierarchical
- Example: recipes

Metadata Facets
- Time/Date, Topic, Task, GeoRegion
- Advantage: Great for Mixing and Matching

Faceted Recipe Metadata
- Prepare, Cuisine, Ingredient, Dish (facets of a Recipe)

Sunset.com
- Not the right way

Dynamic Previews
- Avoid empty result sets
- Show the possible next steps
- A way to seamlessly integrate
  - Related topics
  - User preferences (personalization)
  - Context-sensitivity

Metadata Usage in Epicurious
- Can choose category types in any order
- But categories never more than one level deep
- And can never use more than one instance of a category
  - Even though items may be assigned more than one of each category type
- Items (recipes) are dead-ends
  - Don’t link to “more like this”
- Not fully integrated with search

Epicurious Metadata Usage
- Problem: lacks integration with search

This is fixed in marthastewart.com

Advanced search more specific than sunset.com; also allows for disjunction; thus less likely to get null results

UIs for faceted metadata
- Use dynamic previews
- Allow user to select metadata in any order
- At each step, show different types of relevant metadata, based on prior steps and personal history, include # of documents
- Previews restricted to only those metadata types that might be helpful
- Tightly integrate with keyword search
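The preview idea (show, for each remaining facet, the values that are still available together with document counts) can be sketched as below. The recipe data, the facet names, and the `refine` and `previews` helpers are made up for illustration; this is not the Flamenco implementation.

```python
from collections import Counter

# Hypothetical recipe collection with three flat facets.
RECIPES = [
    {"title": "Margherita pizza", "cuisine": "Italian", "ingredient": "tomato", "dish": "main"},
    {"title": "Gazpacho", "cuisine": "Spanish", "ingredient": "tomato", "dish": "soup"},
    {"title": "Minestrone", "cuisine": "Italian", "ingredient": "bean", "dish": "soup"},
    {"title": "Tiramisu", "cuisine": "Italian", "ingredient": "coffee", "dish": "dessert"},
]
FACETS = ("cuisine", "ingredient", "dish")


def refine(items, selections):
    """Keep only the items that match every selected facet value."""
    return [item for item in items if all(item[f] == v for f, v in selections.items())]


def previews(items, selections, facets=FACETS):
    """Count the values of each unselected facet among the matching items."""
    matching = refine(items, selections)
    return {
        facet: Counter(item[facet] for item in matching)
        for facet in facets
        if facet not in selections
    }


# Facets can be chosen in any order; here the user starts with an ingredient.
for facet, counts in previews(RECIPES, {"ingredient": "tomato"}).items():
    print(facet, dict(counts))
```

Since only values that occur in the current result set are counted, every option offered leads to at least one document, which is how previews avoid empty result sets.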

The Flamenco Research Project
- Systematically determine what works for integrating metadata into search interfaces
- Develop recommendations that reflect both the task structure and the richness of the information structure

Summary
- Agreement on metadata descriptor assignment is difficult to achieve
  - Descriptors need to be constantly updated
  - Layer 2 is probably not rich enough
- Assigning specifiers is quite different from searching for specified items
- Fuzzy search can help, but
  - Requires a UI for confirmation of correct choices
  - This will end up looking like a search service
  - Can make search more meaningful and task-based

Summary
- Web search engines can do source selection, but
  - Sometimes users do want source selection,
  - But search hits based on the content of pages are often closer to what users want to do
- We need to be certain not to confuse source selection with content search