Sackler – May 11, 2003 Organizing Search Results Susan Dumais Microsoft Research.



Organizing Search Results
- Algorithms and interfaces that improve the effectiveness of search
  - Beyond ranked lists
  - Main goal to support search
  - Also information analysis and discovery
- Example applications
  - SWISH, results classification
  - GridViz, results summarization
  - SIS, personal landmarks for context

Searching with Information Structured Hierarchically (SWISH)
- Collaborators
  - Edward Cutrell, Hao Chen (Berkeley)
- Key Themes
  - Going beyond long lists of results
  - Classification algorithms
  - UI techniques
- More about it
  - /~sdumais

Organizing Search Results
- Query: “jaguar”
- [Screenshots: List Organization; SWISH Category Organization, with results grouped under Shopping, Automotive, Computers]

LookSmart Directory Structure
- ~400k pages; 17k categories; 7 levels
- 13 top-level categories; 150 second-level categories
- Top-level categories (Web Directory)
  - Automotive
  - Business & Finance
  - Computers & Internet
  - Entertainment & Media
  - Health & Fitness
  - Hobbies & Interests
  - Home & Family
  - People & Chat
  - Reference & Education
  - Shopping & Services
  - Society & Politics
  - Sports & Recreation
  - Travel & Vacations
- Second-level categories shown on the slide
  - Buy or Sell a Car
  - Chat
  - Finance & Insurance
  - Magazines & Books
  - Maintenance & Repair
  - Makes, Models & Clubs
  - Motorcycles
  - New Car Showrooms
  - Off-Road, 4X4 & RVs
  - Other Auto Interests
  - Shows & Museums
  - Trucks & Tractors
  - Vintage & Classic

SWISH System
- Combines the advantages of
  - Directories: manually crafted structure but small
  - Search engines: broad coverage but limited metadata
- Project search engine results to category structure
- Two main components
  - Text classification models
  - UI for integrating search results and structure
    - Context (category structure) plus focus (search results)
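The idea of projecting search results onto a category structure can be sketched as follows. This is a minimal illustration, not the SWISH implementation: `group_by_category`, `toy_classify`, the keyword rules, the sample results, and the choice to show larger categories first are all assumptions made for this example.

```python
from collections import defaultdict

def group_by_category(results, classify, top_n=3):
    """Group (title, snippet) results under the category chosen by `classify`."""
    groups = defaultdict(list)
    for title, snippet in results:
        groups[classify(title + " " + snippet)].append(title)
    # Assumption: larger categories first, ties broken alphabetically.
    ordered = sorted(groups.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    # Each entry: (category, first top_n titles, total results in category).
    return [(cat, titles[:top_n], len(titles)) for cat, titles in ordered]

def toy_classify(text):
    """Hypothetical keyword rules standing in for a trained text classifier."""
    words = set(text.lower().split())
    if words & {"car", "cars", "dealer"}:
        return "Automotive"
    if words & {"mac", "os", "apple"}:
        return "Computers & Internet"
    return "Other"

# Made-up results for the query "jaguar".
results = [
    ("Jaguar Cars", "official site for jaguar car models"),
    ("Jaguar dealer locator", "find a dealer near you"),
    ("Mac OS X Jaguar", "apple os release notes"),
    ("Jaguar facts", "big cat habitat and diet"),
]
grouped = group_by_category(results, toy_classify)
for category, titles, total in grouped:
    print(f"{category} ({total}): {titles}")
```

The category headings supply the context and the few titles listed under each supply the focus; a real system would replace `toy_classify` with a trained model.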

SWISH Architecture
- Train (offline): manually classified web pages => SVM model
- Classify (online): SVM model applied to web search results, local search results, ...

Learning & Classification
- Support Vector Machine (SVM)
  - Accurate and efficient for text classification (Dumais et al., Joachims)
- Model = weighted vector of words
  - “Automobile” = motorcycle, vehicle, parts, automobile, harley, car, auto, honda, porsche …
  - “Computers & Internet” = rfc, software, provider, windows, user, users, pc, hosting, os, downloads ...
- Hierarchical models for LS directory
  - 1 model for top level; N models for second
- Very useful in conjunction w/ user interaction
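A trained linear SVM reduces to a weight per word plus a bias, so classifying a page amounts to a weighted sum. The sketch below shows that scoring step and a two-level (top, then second) decision. Training is not shown, and every weight, bias, and function name here is made up for illustration; only the category names and example words come from the slides.

```python
def score(weights, bias, text):
    """Linear decision value: sum of weights of the words present, plus bias."""
    words = set(text.lower().split())
    return sum(w for term, w in weights.items() if term in words) + bias

def classify(models, text):
    """Return the label whose model gives the highest decision value."""
    return max(models, key=lambda label: score(*models[label], text))

# Hypothetical weights; a real model would be learned from labeled pages.
TOP = {
    "Automotive": ({"car": 1.2, "auto": 1.0, "motorcycle": 0.9, "porsche": 0.8}, -0.5),
    "Computers & Internet": ({"software": 1.1, "windows": 1.0, "pc": 0.9, "downloads": 0.7}, -0.5),
}
# One set of second-level models per top-level category (only one shown here).
SECOND = {
    "Automotive": {
        "Motorcycles": ({"motorcycle": 1.5, "harley": 1.3}, -0.5),
        "Vintage & Classic": ({"vintage": 1.4, "classic": 1.2}, -0.5),
    },
}

def classify_hierarchical(text):
    """Pick a top-level category, then a second-level one if models exist."""
    top = classify(TOP, text)
    second_models = SECOND.get(top)
    second = classify(second_models, text) if second_models else None
    return top, second

print(classify_hierarchical("used harley motorcycle parts"))
print(classify_hierarchical("windows software downloads"))
```

Deciding the top level first means only the second-level models under the winning category need to be evaluated, which keeps online classification cheap.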

User Interface Experiments
- [Screenshots: List Organization; Category Organization]

[Figure: interface conditions. Labels: Group Interface, List Interface; Hover, Inline, Browse; + Cat Names, No Cat Names]

Effect of Query Difficulty
- [Chart: Group vs. List, for easy and hard queries]
- Easy queries are faster (p<0.01)
- Group faster than List (p<0.01)
- Benefit is larger for hard queries (p<0.06)

SWISH: Summary and Design Implications
- Text Classification
  - Learn accurate category models
  - Classify new web pages on the fly
  - Organize search results
- User Interface
  - Tightly couple search results with category structure
  - User manipulation of presentation of category structure

Organizing Search Results, other examples

GridViz
- Collaborators
  - George Robertson, Edward Cutrell, Jeremy Goecks (Georgia Tech)
- Key Themes
  - Abstract beyond individual results
  - Highly interactive interface to support understanding of trends and relationships
- More about it

GridViz
- Summarize the results of a search
- Grid-based design
  - Axes represent topic, time, people
  - Cells encode frequency, recency
- Supports activities like:
  - What newsgroups are active (on topic x)?
  - What people are active, authoritative (on topic x)?
  - When did I last interact w/ people?
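The grid summary described above can be sketched as a dictionary keyed by (row, column), where each cell stores a count (frequency) and the latest date (recency). This is an illustrative sketch only: `build_grid`, the field names, and the sample posts are assumptions, not GridViz code.

```python
from datetime import date

def build_grid(items, row_key, col_key):
    """Map (row, col) -> {"count": frequency, "latest": most recent date}."""
    grid = {}
    for item in items:
        key = (item[row_key], item[col_key])
        cell = grid.setdefault(key, {"count": 0, "latest": item["date"]})
        cell["count"] += 1
        cell["latest"] = max(cell["latest"], item["date"])
    return grid

# Made-up search results with person, topic, and date attributes.
posts = [
    {"person": "alice", "topic": "svm", "date": date(2003, 4, 28)},
    {"person": "alice", "topic": "svm", "date": date(2003, 5, 2)},
    {"person": "bob", "topic": "ui", "date": date(2003, 3, 15)},
]

# People on one axis, topics on the other.
grid = build_grid(posts, "person", "topic")
for (person, topic), cell in sorted(grid.items()):
    print(person, topic, cell["count"], cell["latest"].isoformat())
```

Swapping `row_key` and `col_key` for other attributes (for example a newsgroup or a month field) gives the other axis pairings mentioned on the slide.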

GridViz Demo

User Interface Experiments
- [Screenshots: List View; GridViz]

GridViz Summary
- Abstracting beyond individual results
- Highly interactive interface
- Grid-based design
  - Axes represent people, topic, time
  - Cells encode frequency, recency
- Preliminary but promising

Stuff I’ve Seen (SIS)
- Collaborators
  - Edward Cutrell, Raman Sarin, JJ Cadiz, Gavin Jancke, Daniel Robbins, Merrie Ringel (Stanford)
- Key Themes
  - Your content
  - Information re-use
  - Integration across sources
- More about it
  - … internal for now

Search Today …
- Many locations, interfaces for finding things (e.g., web, mail, local files, help, history, intranet)
- Often slow

Search with SIS
- Unified index of stuff you’ve seen
  - Unify access to information regardless of source: mail, archives, calendar, files, web pages, etc.
  - Full-text index of content plus metadata attributes (e.g., creation time, author, title, size)
  - Automatic and immediate update of index
  - Rich UI possibilities, since it’s your content
- Architecture
  - Client side indexing and storage
  - Built using MS Search components
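A unified index of this kind can be sketched as one inverted index over all sources, with metadata stored per item and the index updated as soon as an item changes. The sketch below is a toy, not the SIS architecture (which the slide says is built on MS Search components): `TinyIndex`, its methods, the item IDs, and the newest-first ordering are assumptions for illustration.

```python
from collections import defaultdict
from datetime import date

class TinyIndex:
    """Toy full-text index with per-item metadata (must include `date`)."""

    def __init__(self):
        self.docs = {}                      # item id -> metadata dict
        self.postings = defaultdict(set)    # word -> set of item ids

    def remove(self, doc_id):
        if doc_id in self.docs:
            del self.docs[doc_id]
            for ids in self.postings.values():
                ids.discard(doc_id)

    def add(self, doc_id, text, **metadata):
        # Re-adding an item replaces its old entry, so updates are immediate.
        self.remove(doc_id)
        self.docs[doc_id] = metadata
        for word in set(text.lower().split()):
            self.postings[word].add(doc_id)

    def search(self, query, **filters):
        """Items containing every query word and matching metadata filters."""
        terms = query.lower().split()
        if not terms:
            return []
        hits = set.intersection(*(self.postings.get(t, set()) for t in terms))
        hits = [d for d in hits
                if all(self.docs[d].get(k) == v for k, v in filters.items())]
        # Assumption for this sketch: newest items first.
        return sorted(hits, key=lambda d: self.docs[d]["date"], reverse=True)

# Made-up items from three different sources in one index.
idx = TinyIndex()
idx.add("mail:1", "budget review meeting", source="mail", author="kim", date=date(2003, 4, 1))
idx.add("file:7", "budget spreadsheet draft", source="file", author="lee", date=date(2003, 5, 1))
idx.add("web:3", "svm tutorial", source="web", author="unknown", date=date(2003, 3, 1))

print(idx.search("budget"))
print(idx.search("budget", source="mail"))
```

Because every source shares one index, a single query returns mail, files, and web pages together, and metadata filters such as `source` or `author` narrow the results without a separate search tool.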

SIS Demo

SIS Alpha Observations
- 800+ internal users
- Usage logs (incl. different interfaces), survey data
- File types opened
  - 76% Email
  - 14% Web pages
  - 10% Files
- Age of items accessed
  - 7% today
  - 22% within the last week
  - 46% within the last month

SIS Alpha Observations
- Use of other search tools
  - Non-SIS search for web, email, and files decreases
- Importance of people
  - 25% of the queries involve people’s names
- Importance of time
  - Date by far the most popular sort field, followed by rank, author, title
  - Even when rank is the default

SIS UI Innovations: Timeline w/ Landmarks
- Importance of time
- Timeline interface
- Contextualize results using important landmarks as pointers into human memory
  - General: holidays, world events
  - Personal: important photos, appointments
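One way to picture landmarks as pointers into memory is to interleave dated search results with dated landmarks on a single timeline. The sketch below does only that; `timeline`, `nearest_landmark`, and all sample entries are hypothetical and say nothing about how the SIS interface is built.

```python
from datetime import date

def timeline(results, landmarks):
    """Interleave (date, label) results and landmarks, newest first."""
    rows = [(d, "result", label) for d, label in results]
    rows += [(d, "landmark", label) for d, label in landmarks]
    return sorted(rows, key=lambda row: row[0], reverse=True)

def nearest_landmark(when, landmarks):
    """Label of the landmark closest in time to `when`."""
    return min(landmarks, key=lambda lm: abs((lm[0] - when).days))[1]

# Made-up search results and landmarks (one general, one personal).
results = [
    (date(2003, 5, 9), "Slides draft"),
    (date(2003, 1, 3), "Trip report"),
]
landmarks = [
    (date(2003, 1, 1), "New Year's Day"),
    (date(2003, 4, 20), "Project review appointment"),
]

rows = timeline(results, landmarks)
for when, kind, label in rows:
    print(when.isoformat(), kind, label)
print(nearest_landmark(date(2003, 1, 3), landmarks))
```

Showing "Trip report" next to "New Year's Day" lets a searcher locate it by remembering roughly when it happened rather than what it was called.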

Milestones in Time Demo

Milestones in Timeline

SIS Summary
- Unified index of stuff you’ve seen
  - Fast access to full-text and metadata, from heterogeneous sources
  - Automatic and immediate update of index
  - Rich UI possibilities
- Next steps
  - Better support for tagging -> “flatland”
  - Implicit queries for finding related info, and identifying “Stuff I Should See”
  - Integration with richer activity-based info, Eve

Organizing Search Results
- Algorithms and interfaces to improve search
  - Use structure and context
- Examples and key themes
  - SWISH … grouping
  - GridViz … abstraction
  - SIS … personal content and landmarks
- Also
  - Important attributes: People, topics, time
  - Interaction
  - Evaluation
- More information
  - Christopher Lee of (SIG)IR …