When/How/Why to use Grouping/Categorizing/Clustering in Search Interfaces
Marti Hearst, January 21, 2005


1 When/How/Why to use Grouping/Categorizing/Clustering in Search Interfaces Marti Hearst January 21, 2005

2 Main Points Grouping search results is desirable However, getting good groups is difficult Furthermore, incorporation of groups into interfaces has not been done well Good news: improvements are happening

3 Talk Outline Definition of categories and clusters Studies showing failure of clustering in interfaces New developments in results grouping

4 The Need to Group Interviews with lay users often reveal a desire for better organization of retrieval results Useful for suggesting where to look next –People prefer links over generating search terms –But only when the links are for what they want Three main approaches for text and images: –Group items according to pre-defined categories –Group items into automatically-created clusters –Group items according to common keywords (new!) Ojakaar and Spool, Users Continue After Category Links, UIETips Newsletter, http://world.std.com/~uieweb/Articles/

5 Categories Human-created –But often automatically assigned to items Arranged in hierarchy, network, or facets –Can assign multiple categories to items –Or place items within categories Usually restricted to a fixed set –So they help reduce the space of concepts Intended to be readily understandable –To those who know the underlying domain –Provide a novice with a conceptual structure There are many already made up! However, until recently, their use in interfaces has –Been under-investigated –Not met its promise

6 Clustering “The art of finding groups in data” –Kaufman and Rousseeuw Groups are formed according to associations and commonalities among the data’s features. –There are dozens of algorithms, with more all the time –Most need a way of determining the similarity or difference between a pair of items –In text clustering, documents are usually represented as vectors of weighted features, which are some transformation of the words –Similarity between documents is a weighted measure of feature overlap
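To make the representation described on this slide concrete, here is a minimal sketch of weighted term vectors and cosine (feature-overlap) similarity. The use of scikit-learn and the three-document toy corpus are assumptions added for illustration; the talk does not prescribe any particular implementation.

```python
# Minimal sketch: documents as weighted term-feature vectors, and similarity
# as a weighted measure of feature overlap (cosine similarity).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "prophylactic mastectomy as a preventative measure",      # invented examples
    "breast reconstruction and prostheses after surgery",
    "psychological and emotional effects of radical surgery",
]

vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(docs)   # one weighted feature vector per document

# Entry (i, j) is the cosine of the angle between documents i and j.
print(cosine_similarity(X).round(2))
```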

7 Clustering Potential benefits: –Find the main themes in a set of documents Potentially useful if the user wants a summary of the main themes in the subcollection Potentially harmful if the user is interested in less dominant themes –More flexible than pre-defined categories There may be important themes that have not been anticipated –Disambiguate ambiguous terms (e.g., ACL) –Clustering retrieved documents tends to group those relevant to a complex query together Hearst, Pedersen, Revisiting the Cluster Hypothesis, SIGIR’96

8 Scatter/Gather Clustering Developed at PARC in the late 80’s/early 90’s Top-down approach –Start with k seeds (documents) to represent k clusters –Each document is assigned to the cluster with the most similar seed To choose the seeds, cluster in a bottom-up manner (hierarchical agglomerative clustering): –Start with n documents, compare all pairs by similarity, and combine the two most similar documents into a cluster –Now compare both clusters and individual documents to find the most similar pair to combine –Continue until k clusters remain –Use the centroid of each of these as a seed (centroid: average of the weighted vectors) Can recluster a cluster to produce a hierarchy of clusters Cutting, Karger, Pedersen, Tukey, Scatter/Gather: A Cluster-based Approach to Browsing Large Document Collections, SIGIR 1992
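Below is a rough sketch of the procedure the slide outlines: bottom-up clustering of a small sample to pick k seed centroids, then assigning every document to its most similar seed. This is not the original PARC implementation; scikit-learn, the sample size, and the toy parameters are assumptions for illustration only.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import AgglomerativeClustering
from sklearn.metrics.pairwise import cosine_similarity

def scatter_gather_step(docs, k=3, sample_size=10, seed=0):
    X = TfidfVectorizer(stop_words="english").fit_transform(docs).toarray()

    # Choose seeds: hierarchical agglomerative clustering on a small random
    # sample yields k groups; the centroid of each group becomes a seed.
    rng = np.random.default_rng(seed)
    sample = rng.choice(len(docs), size=min(sample_size, len(docs)), replace=False)
    labels = AgglomerativeClustering(n_clusters=k).fit_predict(X[sample])
    seeds = np.vstack([X[sample][labels == c].mean(axis=0) for c in range(k)])

    # Top-down step: assign every document to the cluster with the most similar seed.
    return cosine_similarity(X, seeds).argmax(axis=1)
```

Reclustering the documents assigned to one cluster with the same function would produce the cluster hierarchy mentioned on the slide.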

9 Clustering Example: Medical Text Query: “mastectomy” on a breast cancer collection; 250 documents retrieved Summary of cluster themes (subjective): –prophylactic mastectomy (preventative) –prostheses and reconstruction –conservative vs radical surgery –side effects of surgery –psychological effects of surgery The first two clusters found themes for which there was no corresponding MeSH category Hearst, The Use of Categories and Clusters for Organizing Retrieval Results, in Natural Language Information Retrieval, Kluwer, 1999

10 A Clustering Failure Query: “implant” and “prosthesis” Four clusters returned: –use of implants to administer radiation dosages –complications resulting from breast implants –other issues surrounding breast implants –other kinds of prostheses Reclustering clusters 2 and 3 does not find cohesive subgroups –An examination of the documents indicates that a valid subdivision was possible (by type of surgical procedure and by risk factors) –This seems to happen when there are too many features in common –Perhaps a better clustering algorithm could help in this case

11 Clustering Interface Problems Big problem: –Clusters used primarily as part of a visualization This just doesn’t work –Every usability study says so –Lots of dots scattered about the screen is meaningless to users –There is no inherent spatial relationship among the documents –Need text to understand content Another big problem: –Clustering images according to an approximation of visual similarity This just doesn’t work –The limited studies that have been done say so –Instead: group according to textual categories

12 Visualizing Clustering Results Use clustering to map the entire huge multidimensional document space into a huge number of small clusters. Use dimension reduction and then project these onto a 2D/3D graphical representation
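As a concrete (hedged) example of this pipeline, the sketch below clusters documents into many small clusters and then projects the cluster centroids onto a 2-D plane for plotting. scikit-learn, k-means, and truncated SVD are implementation choices made here for illustration; the systems on the following slides used different algorithms.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans
from sklearn.decomposition import TruncatedSVD

def project_clusters_2d(docs, n_clusters=50):
    # Map the high-dimensional document space into many small clusters.
    X = TfidfVectorizer(stop_words="english").fit_transform(docs)
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(X)
    # Reduce the cluster centroids to 2-D coordinates for a scatter-plot display.
    coords = TruncatedSVD(n_components=2).fit_transform(km.cluster_centers_)
    return coords  # one (x, y) point per cluster
```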

13 Clustering Multi-Dimensional Document Space (image from Wise et al 95)

14 Clustering Multi-Dimensional Document Space (image from Wise et al 95)

15 Kohonen Feature Maps on Text (from Chen et al., JASIS 49(7))
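For readers unfamiliar with Kohonen (self-organizing) maps, the bare-bones numpy sketch below shows the training loop that produces this kind of 2-D category map. The grid size, learning rate, and decay schedule are arbitrary illustrative choices, not the settings used by Lin or Chen et al.

```python
import numpy as np

def train_som(X, grid=(10, 10), epochs=20, lr0=0.5, sigma0=3.0, seed=0):
    """X: array of document vectors, shape (n_docs, n_features)."""
    rng = np.random.default_rng(seed)
    n_rows, n_cols = grid
    # One weight vector per grid cell, same dimensionality as the documents.
    W = rng.random((n_rows * n_cols, X.shape[1]))
    cells = np.array([(r, c) for r in range(n_rows) for c in range(n_cols)])

    for epoch in range(epochs):
        lr = lr0 * (1 - epoch / epochs)                  # learning rate decays
        sigma = sigma0 * (1 - epoch / epochs) + 1e-3     # neighbourhood shrinks
        for x in X[rng.permutation(len(X))]:
            bmu = np.argmin(((W - x) ** 2).sum(axis=1))  # best-matching unit
            # Cells near the winner on the grid are pulled toward the input too,
            # which is what makes nearby regions of the map share topics.
            grid_dist = ((cells - cells[bmu]) ** 2).sum(axis=1)
            h = np.exp(-grid_dist / (2 * sigma ** 2))
            W += lr * h[:, None] * (x - W)
    return W, cells   # documents are then placed at their best-matching cell
```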

16 Is it useful? 4 Clustering Visualization Usability Studies

17 Clustering for Search Study 1 This study compared –a system with 2D graphical clusters –a system with 3D graphical clusters –a system that shows textual clusters Novice users Only textual clusters were helpful (and they were difficult to use well) Kleiboemer, Lazear, and Pedersen. Tailoring a retrieval system for naive users. SDAIR’96

18 Clustering Study 2: Kohonen Feature Maps Comparison: Kohonen Map and Yahoo Task: –“Window shop” for an interesting home page –Repeat with the other interface Results: –Participants who started with the map could usually repeat the task in Yahoo (8/11) –Participants who started with Yahoo were mostly unable to repeat the task in the map (2/14) Chen, Houston, Sewell, Schatz, Internet Browsing and Searching: User Evaluations of Category Map and Concept Space Techniques. JASIS 49(7), 1998

19 Kohonen Feature Maps (Lin 92, Chen et al. 97)

20 Study 2 (cont.) Participants liked: –Correspondence of region size to # documents –Overview (but also wanted zoom) –Ease of jumping from one topic to another –Multiple routes to topics –Use of category and subcategory labels Chen, Houston, Sewell, Schatz, Internet Browsing and Searching: User Evaluations of Category Map and Concept Space Techniques. JASIS 49(7), 1998

21 Study 2 (cont.) Participants wanted: –hierarchical organization –other ordering of concepts (alphabetical) –integration of browsing and search –correspondence of color to meaning –more meaningful labels –labels at the same level of abstraction –more labels fit into the given space –combined keyword and category search –multiple category assignment (sports + entertainment) (These can all be addressed with faceted hierarchical categories) Chen, Houston, Sewell, Schatz, Internet Browsing and Searching: User Evaluations of Category Map and Concept Space Techniques. JASIS 49(7), 1998

22 Clustering Study 3: NIRVE Each rectangle is a cluster. Larger clusters closer to the “pole”. Similar clusters near one another. Opening a cluster causes a projection that shows the titles.

23 Study 3 This study compared: –3D graphical clusters –2D graphical clusters –textual clusters 15 participants, between-subject design Tasks –Locate a particular document –Locate and mark a particular document –Locate a previously marked document –Locate all clusters that discuss some topic –List the most frequently represented topics Visualization of search results: a comparative evaluation of text, 2D, and 3D interfaces. Sebrechts, Cugini, Laskowski, Vasilakis and Miller, SIGIR ‘99.

24 Study 3 Results (time to locate targets) –Text clusters fastest –2D next –3D last –With practice (6 sessions) 2D neared text results; 3D still slower –Computer experts were just as fast with 3D Certain tasks equally fast with 2D & text –Find particular cluster –Find an already-marked document But anything involving text (e.g., find title) much faster with text. –Spatial location rotated, so users lost context Helpful viz features –Color coding (helped text too) –Relative vertical locations Visualization of search results: a comparative evaluation of text, 2D, and 3D interfaces Sebrechts, Cugini, Laskowski, Vasilakis and Miller, SIGIR ‘99.

25 Clustering Study 4 Compared several factors. Findings: –Topic effects dominate (this is a common finding) –Strong difference in results based on spatial ability –No difference between librarians and other people –No evidence of usefulness for the cluster visualization Aspect windows, 3-D visualizations, and indirect comparisons of information retrieval systems, Swan and Allan, SIGIR 1998.

26 Summary: Visualizing for Search Using Clusters Huge 2D maps may be an inappropriate focus for information retrieval –cannot see what the documents are about –space is difficult to browse for IR purposes –(tough to visualize abstract concepts) Perhaps more suited for pattern discovery and gist-like overviews

27 Clustering Algorithm Problems Doesn’t work well if the data is too homogeneous or too heterogeneous Often difficult to interpret quickly –Automatically generated labels are unintuitive and occur at different levels of description Often the top level can be ok, but the subsequent levels are very poor Need a better way to handle items that fall into more than one cluster
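One reason the labels come out unintuitive is that they are typically just the highest-weighted terms in each cluster centroid. A sketch of that common labelling baseline follows; scikit-learn and the parameter values are assumptions for illustration, not a specific system from the talk.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

def label_clusters(docs, n_clusters=5, n_terms=3):
    vec = TfidfVectorizer(stop_words="english")
    X = vec.fit_transform(docs)
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(X)
    terms = vec.get_feature_names_out()
    labels = []
    for centroid in km.cluster_centers_:
        top = centroid.argsort()[::-1][:n_terms]   # highest-weighted terms
        labels.append(", ".join(terms[i] for i in top))
    return labels   # labels are frequent terms, not necessarily descriptive ones
```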

28 How do people want to search and browse images? Ethnographic studies of people who use images intensely find: –Finding specific objects is easy Find images of the Empire State Building –Browsing is hard In a usability study with architects, to our surprise we found that their response to an image-browsing interface mock-up was that they wanted to see more text (categories). Elliott, A. (2001). "Flamenco Image Browser: Using Metadata to Improve Image Search During Architectural Design," in the Proceedings of CHI 2001.

29 An Alternative In the Flamenco project, we have shown that hierarchical faceted metadata, paired with a good interface, is highly effective for browsing image collections –Flamenco.berkeley.edu (But that’s a different talk)

30 Study 5: Comparing Textual Cluster Interfaces to Category Interfaces DynaCat system Decide on important question types in advance –What are the adverse effects of drug D? –What is the prognosis for treatment T? Make use of MeSH categories Retain only those types of categories known to be useful for this type of query. Pratt, W., Hearst, M, and Fagan, L. A Knowledge-Based Approach to Organizing Retrieved Documents. AAAI-99

31 DynaCat Interface Pratt, W., Hearst, M, and Fagan, L. A Knowledge-Based Approach to Organizing Retrieved Documents. AAAI-99

32 DynaCat Study Design –Three queries –24 cancer patients –Compared three interfaces: ranked list, clusters, categories Results –Participants strongly preferred categories –Participants found more answers using categories –Participants took the same amount of time with all three interfaces Pratt, W., Hearst, M, and Fagan, L. A Knowledge-Based Approach to Organizing Retrieved Documents. AAAI-99

33 Study 6: Categories vs. Lists One study found users preferred one level of categories over lists, and were faster at finding answers –Only 13 top-level categories shown –Secondary-level categories not very accurate However, the queries appeared to be somewhat set up to optimize the usefulness of the clusters –Example: Query word: “indian” Task: find Indian motorcycles Query: “alaska” Task: find yachting adventures in Alaska Chen, Dumais, Bringing order to the web: Automatically categorizing search results. CHI 2000

34 What about Textual Displays of Clusters? Text-based clustering is more promising Text-based clustering on the Web –In the early days, Excite (when it was called Architext) had a mockup on about 10 documents that pretended to do Scatter/Gather Quickly removed it and started providing standard search –For a while Northern Light had a clustering interface Didn’t really get anywhere –The latest entry is Vivisimo, which has a lot of problems BUT … there’s a new development from Vivisimo called Clusty, which seems to have much improved clustering and interface

35 An Analysis of Vivisimo Query: barcelona Query: dog pregnancy

36

37

38

39 An Analysis of Vivisimo Query: barcelona –Hotels and Travel Guide are both at top level –Also, Barcelona City –But Travel Guide contains Hotels –Spain, Spanish: not really helping to make useful distinctions

40

41

42 An Analysis of Vivisimo Query: pregnant dog –What does the category pregnant mean here? –Why does it have a subcategory of whelping, when there is also a main category of whelping? –And what is the relationship to Pregnancy and Birth? –The pages shown don’t seem strongly related to one another How to follow up? –There is a “find in clusters” box, but it is not very helpful because there are no hints about which words might work

43 Search within Results

44 Then along came Clusty … Announced a few months ago Produced by Vivisimo Much better interface Much better clusters

45

46

47

48

49

50 Clusty Improvements Labels tend to be more at the same level of description Subcategories are more cautious, reflecting groups of very similar documents –Do a better job of really showing subcategories Nice interface touches –Better use of color for distinguishing –Small icons are inviting –Incorporation of encyclopedia results high up Search results are better –(Not always – pregnant dog not much better) –Using metasearch –May be throwing out some docs to get more distribution in the types of results found –Looks like they are focusing on term proximity to get more meaningful grouping –Don’t allow very many results

51

52

53

54 Clusty Improvements Doing sense disambiguation for abbreviations like ACL –However, no good follow-up for how to make use of this –E.g., to search on ACL (meaning comp ling) plus some other concepts –On the other hand, using multiple terms is how most disambiguation is done now ACL + disambiguation Jaguar + prey –So it is not clear if there is a net benefit Trying to approximate faceted queries –Under the Jaguar query, for history, show the history of the band together with the history of the car and the video game

55

56 Analysis Is it really helping? Or are the categories now too general and overlapping? The main effect seems to be that the search results are better due to the metasearch and term proximity

57

58 More Analysis Reflects the frequency of topics in the data –So no discussion of nukes in the Spain categories –No discussion of hotels in the North Korea categories –Is this good or bad? It depends.

59 Brand New Results!! Mika Käki: “Findex: Search Result Categories Help Users when Document Ranking Fails” –To appear at CHI in April Two innovations: –Used a very simple method to create the groupings, so that the grouping is not opaque to users Based on frequent keywords Allows docs to appear in multiple categories –Did a naturalistic, longitudinal study of use Other things done correctly: –Took care to ensure good response time –Analyzed the results in interesting ways
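Here is a minimal sketch of the keyword-grouping idea as summarized on this slide: results are grouped under the most frequent keywords in the result set, and a document may appear under every keyword it contains. This is a reconstruction for illustration only, not Käki's actual Findex algorithm or its parameter settings.

```python
from collections import Counter, defaultdict
import re

STOPWORDS = {"the", "a", "an", "of", "and", "in", "for", "to", "on", "with", "is"}

def keyword_categories(results, n_categories=15):
    """results: list of (title, snippet) pairs from the underlying search engine."""
    tokens_per_doc = [
        {w for w in re.findall(r"[a-z]+", (title + " " + snippet).lower())
         if w not in STOPWORDS}
        for title, snippet in results
    ]
    # The most frequent keywords across the result set become the category labels.
    freq = Counter(w for toks in tokens_per_doc for w in toks)
    categories = [w for w, _ in freq.most_common(n_categories)]

    groups = defaultdict(list)
    for i, toks in enumerate(tokens_per_doc):
        for cat in categories:
            if cat in toks:          # overlapping membership is allowed
                groups[cat].append(i)
    return groups   # category label -> list of result indices
```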

60

61

62 Study Design 16 academics –8F, 8M –No CS background –Frequent searchers 2 months of use Special logging –3099 queries issued –3232 results accessed Two surveys (start and end) Google as the underlying search engine; rank order retained

63 Key Findings (all significant) Category use takes almost 2 times longer –First doc selected in 24.4 sec vs 13.7 sec No difference in the average number of docs opened per search (1.05 vs. 1.04) However, when categories were used, users selected >1 doc in 28.6% of the queries (vs 13.6%) The number of searches with no results selected is lower when the categories are used Median position of the selected doc when: –Using categories: 22 (sd=38) –Just ranking: 2 (sd=8.6)

64

65 Key Findings Category Selections –1915 category selections in 817 searches –Used in 26.4% of the searches –During the last 4 weeks of use, the proportion of searches using categories stayed above the average (27-39%) –When categories were used, users selected 2.3 categories on average –Labels of selected categories used 1.9 words on average (the overall average was 1.4 words) –Out of 15 categories (default): First quartile at the 2nd category Median at the 5th Third quartile at the 9th

66 Survey Results Qualitative views improved over time Realization that categories useful only some of the time Freeform responses indicate that categories useful when queries vague, broad or ambiguous Second survey indicated that people felt that their search habits began to change –Consider query formulation less than before (27%) –Use less precise search terms (45%) –Use less time to evaluate results (36%) –Use categories for evaluating results (82%)

67

68 Conclusions from Käki Study Simplicity of category assignment made the groupings understandable –(my view, not stated by them) Keyword-based Categories: –Are beneficial when result ranking fails –Find results lower in the ranking –Reduce empty results –May make it easier to access multiple results –Availability changed user querying behavior

69 Summary Grouping search results is desirable –Often requested by lay users –Very positive results for category interfaces However, until recently, getting good groups has been difficult –Two main approaches: Predefined category sets – too hard to get, don’t reflect the data Automatically created clusters – too hard to understand –An alternative: Frequent keywords, overlapping categories Findex, and Clusty Finally, a believable, well-done study of category use for search results reveals some insight! –Not always useful, but not harmful if understandable (my assertion) and fast –Useful in the situations we have surmised –Interesting result: people change behavior.

70 More Recent Attempts Analyzing retrieval results –KartOO –Grokker

71

72

73

74

75 References
Chen, Houston, Sewell, and Schatz, Internet Browsing and Searching: User Evaluations of Category Map and Concept Space Techniques, JASIS 49(7), 1998
Chen and Yu, Empirical studies of information visualization: a meta-analysis, IJHCS 53(5), 2000
Dumais, Cutrell, Cadiz, Jancke, Sarin and Robbins, Stuff I've Seen: A system for personal information retrieval and re-use, SIGIR 2003
Hearst, English, Sinha, Swearingen, Yee, Finding the Flow in Web Site Search, CACM 45(9), 2002
Hearst, User Interfaces and Visualization, Chapter 10 of Modern Information Retrieval, Baeza-Yates and Ribeiro-Neto (Eds), Addison-Wesley
Johnson, Manning, Hagen, and Dorsey, Specialize Your Site's Search, Forrester Research (Dec. 2001), Cambridge, MA

76 References
Sebrechts, Cugini, Laskowski, Vasilakis and Miller, Visualization of search results: a comparative evaluation of text, 2D, and 3D interfaces, SIGIR ‘99
Swan and Allan, Aspect windows, 3-D visualizations, and indirect comparisons of information retrieval systems, SIGIR 1998
Yee, Swearingen, Li, Hearst, Faceted Metadata for Image Search and Browsing, Proceedings of CHI 2003