Haystack: an Adaptive Personalized Information Retrieval System


David Karger, Lynn Stein, Eytan Adar, Mark Asdoorian, Aidan Low, Jing Qian, Orion Richardson
MIT Laboratory for Computer Science and AI Laboratory, Cambridge MA 02139, USA

What is Haystack?

An information retrieval system focused on exploiting interaction with individuals
  - complements large search engines
  - treats different people differently

Interesting research issues:
  - Heterogeneous Data: Deal with the variety of content individuals tend to collect
  - User Interface: Offer ubiquitous access
  - Big Brother: Develop user interface tools to gather all possible information about users
  - Learning: Develop mechanisms for letting past user reactions influence future system actions
  - Collaboration: Share data and metadata among a large community of users

The Bookshelf Metaphor

Search Engines are Like Libraries
  - Massive corpora: Mostly irrelevant, often out of date
  - Anonymous: Treat all users exactly the same
  - Rigid: Use librarians’ one-for-all ontology

People prefer to start with bookshelves
  - My data: Information gathered personally. High quality, easy to understand. Annotated.
  - My organization: Owner-chosen subject partition. Best items near desk.

Then they turn to colleagues
  - Trust: Colleagues are authorities on other topics; can recommend good data
  - Leverage: Colleagues have organized their data; makes searching easy

Integration

Haystack archives all user content, adds metadata
  - Plug-in components extract data from content
  - Textifiers: ASCII, HTML, PostScript, OCR, ...
  - Field finders: author, title, date, summary, ...

Haystack mediates user-selected search tools
  - Text search: mg, verity, isearch, grep, ...
  - Database: LORE

Haystack can be used without thinking
  - access during all standard activities (mail, web, edit)
  - application-specific stubs talk to kernel
  - keyword search for file to edit
  - archive and annotate current web page

Target Queries
  - What research is being done on multicast?
  - Where is the email about Haystack that I sent Lynn last month?
  - Which DARPA BAAs should I read?

Adaptation

Goal: improved performance over time

Annotations by user
  - user can add to/change all data/metadata
  - requires active intervention, so undesirable

Observation of query process
  - user performs query; Haystack returns results
  - user selects relevant result
  - Haystack records connection for future queries
  - adapt using machine learning techniques

Observation of general activity (proposed)
  - items that are used a lot
  - items that are used together
  - items used after a search

Collaboration

Leverage individual users’ Haystacks
  - simple RPC to query other Haystacks’ data
  - gather data from several; combine evidence

Exploits self interest
  - individuals seek/organize info for own gain
  - organization provides benefit to others

Need to identify “expert colleagues”
  - Haystacks of people I contact often
  - Haystacks with much overlapping data
  - Haystacks that gave good answers in past referrals

Collaborative Filtering model & techniques
  - CF finds “stuff you’ll like”
  - Haystack finds “stuff you’ll like for this query”

Current Status
  - Prototype completed Fall 1997
  - some functionality in all categories; limited extensibility
  - Kernel reimplemented from scratch, due Summer 1998

Check out Web Site at http://www.ai.mit.edu/projects/haystack
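
Illustrative sketch (not from the poster): the "observation of query process" idea under Adaptation can be pictured as a small feedback loop. The system remembers which result the user selected for a query and boosts that item when a later query shares terms with it. The Python below is only a sketch under stated assumptions. The class name FeedbackRanker, the per-term selection counts, and the additive boost are all hypothetical and stand in for the unspecified "machine learning techniques"; none of it is Haystack's actual code.

```python
from collections import defaultdict


class FeedbackRanker:
    """Hypothetical re-ranker that boosts items the user selected before."""

    def __init__(self, boost=1.0):
        self.boost = boost  # assumed weight of one past selection
        # term -> doc_id -> times the doc was selected for a query with that term
        self.selections = defaultdict(lambda: defaultdict(int))

    def record_selection(self, query, doc_id):
        """Store the connection between a query's terms and the chosen result."""
        for term in set(query.lower().split()):
            self.selections[term][doc_id] += 1

    def rerank(self, query, scored_results):
        """Re-order (doc_id, base_score) pairs returned by any search tool."""
        terms = set(query.lower().split())

        def adjusted(item):
            doc_id, base_score = item
            picks = sum(self.selections.get(t, {}).get(doc_id, 0) for t in terms)
            return base_score + self.boost * picks

        return sorted(scored_results, key=adjusted, reverse=True)


# Usage: the user once picked "lynn-mail.eml" for the query "haystack email".
ranker = FeedbackRanker()
ranker.record_selection("haystack email", "lynn-mail.eml")

# A later, related query: the base scores alone would rank "memo.txt" first.
base_results = [("memo.txt", 0.9), ("lynn-mail.eml", 0.7)]
reranked = ranker.rerank("email about haystack", base_results)
print(reranked[0][0])
```

Counting selections per query term is the simplest way to let one recorded selection influence related but non-identical queries. A real system would probably need decay, normalization, or a learned model.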
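
Illustrative sketch (not from the poster): the Collaboration section describes querying other users' Haystacks, combining the evidence, and favoring "expert colleagues". One way to picture this is a trust-weighted merge of result lists. The Python below is a sketch under stated assumptions. The Peer class, its three trust signals scaled to [0, 1], the unweighted mean used as the trust score, and the plain callable standing in for the RPC are all hypothetical; the poster does not specify how evidence is combined.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Peer:
    """Hypothetical handle on a colleague's Haystack."""
    name: str
    search: Callable[[str], List[Tuple[str, float]]]  # stand-in for the RPC call
    contact_frequency: float      # how often I contact this person, in [0, 1]
    data_overlap: float           # share of overlapping data, in [0, 1]
    past_referral_quality: float  # how good past answers were, in [0, 1]

    def trust(self) -> float:
        # Assumption: an unweighted mean of the three "expert colleague" signals.
        return (self.contact_frequency + self.data_overlap
                + self.past_referral_quality) / 3.0


def federated_search(query: str, peers: List[Peer], top_k: int = 5):
    """Query every peer and sum trust-weighted scores per document."""
    combined = defaultdict(float)
    for peer in peers:
        weight = peer.trust()
        for doc_id, score in peer.search(query):
            combined[doc_id] += weight * score
    ranked = sorted(combined.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top_k]


# Usage with two in-process stand-ins for remote Haystacks.
alice = Peer("alice",
             lambda q: [("multicast-survey.ps", 0.8), ("rfc-notes.txt", 0.4)],
             contact_frequency=0.9, data_overlap=0.6, past_referral_quality=0.9)
bob = Peer("bob",
           lambda q: [("rfc-notes.txt", 0.9)],
           contact_frequency=0.3, data_overlap=0.3, past_referral_quality=0.3)

ranked = federated_search("research on multicast", [alice, bob])
print([doc_id for doc_id, _ in ranked])
```

Summing weighted scores rewards documents that several colleagues return, which is one reading of "combine evidence". Taking the maximum per document would be an equally plausible choice.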