L C SL C S Haystack: Per-User Information Environments David Karger.

Slides:



Advertisements
Similar presentations
In the Format section, we have activated the Bibliographic style drop down menu. From this page, you can choose a specific journal or format (e.g. BMC.
Advertisements

Unference: UI (not AI) as key to the Semantic Web David Karger.
Haystack: Per-User Information Environment 1999 Conference on Information and Knowledge Management Eytan Adar et al Presented by Xiao Hu CS491CXZ.
Accessing and Using the e-Book Collection from EBSCOhost ® When an arrow appears, click to proceed to the next slide at your own pace. To go back, click.
© 2010 Delmar, Cengage Learning Chapter 1 Getting Started with Dreamweaver.
Exploring the Deep Web Brunvand, Amy, Kate Holvoet, Peter Kraus, and David Morrison. "Exploring the Deep Web." PPT--Download University of Utah.
Introducing new web content management tools for Priority...
Copyright 2004 Monash University IMS5401 Web-based Systems Development Topic 2: Elements of the Web (g) Interactivity.
The KB on its way to Web 2.0 Lower the barrier for users to remix the output of services. Theo van Veen, ELAG 2006, April 26.
. Website and file organization. How websites work.
Searching The Web Search Engines are computer programs (variously called robots, crawlers, spiders, worms) that automatically visit Web sites and, starting.
Introducing Symposia : “ The digital repository that thinks like a librarian”
XLink: Open Linking Standard XML / XSL separate  data semantics  presentation semantics Need to also separate out  navigation semantics Single unique.
Searching and Researching the World Wide: Emphasis on Christian Websites Developed from the book: Searching and Researching on the Internet and World Wide.
Connecting Diverse Web Search Facilities Udi Manber, Peter Bigot Department of Computer Science University of Arizona Aida Gikouria - M471 University of.
University of Kansas Data Discovery on the Information Highway Susan Gauch University of Kansas.
1 Agenda Overview Review Roles Lists Libraries Columns.
Lesson 46: Using Information From the Web copy and paste information from a Web site print a Web page download information from a Web site customize Web.
Internet and Social Networking Research Tools for Academic Writing Copyright © 2014 Todd A. Whittaker
Adobe Dreamweaver CS3 Revealed CHAPTER ONE: GETTING STARTED WITH DREAMWEAVER.
Aurora: A Conceptual Model for Web-content Adaptation to Support the Universal Accessibility of Web-based Services Anita W. Huang, Neel Sundaresan Presented.
Classroom User Training June 29, 2005 Presented by:
Chapter 16 The World Wide Web Chapter Goals ( ) Compare and contrast the Internet and the World Wide Web Describe general Web processing.
LBTO IssueTrak User’s Manual Norm Cushing version 1.3 August 8th, 2007.
16-1 The World Wide Web The Web An infrastructure of distributed information combined with software that uses networks as a vehicle to exchange that information.
Topshare websites consists of two area’s: A public domain and a Secure domain. The public domains are regular website, viewable for everyone with a internet.
Adventures in Radio UserLand Lincoln Cushing, UC Berkeley Institute of Industrial Relations Library.
MetaLib Tutorial. 2 MetaLib may be used at two different levels – guest users and logged on users. Guest users have limited access to the databases as.
Building Search Portals With SP2013 Search. 2 SharePoint 2013 Search  Introduction  Changes in the Architecture  Result Sources  Query Rules/Result.
Copyright © 2008 Pearson Prentice Hall. All rights reserved. 1 Exploring Microsoft Office Word 2007 Chapter 8 Word and the Internet Robert Grauer, Keith.
LIS510 lecture 3 Thomas Krichel information storage & retrieval this area is now more know as information retrieval when I dealt with it I.
How did the internet develop?. What is Internet? The internet is a network of computers linking many different types of computers all over the world.
Microsoft Internet Explorer and the Internet Using Microsoft Explorer 5.
Session 1 SESSION 1 Working with Dreamweaver 8.0.
Chapter 8 Browsing and Searching the Web. Browsing and Searching the Web FAQs: – What’s a Web page? – What’s a URL? – How does a browser work? – How do.
Chapter Chapter 3 Internet Agents. Chapter Contents Background Web Search Agents Information Filtering Agents Notification Agents Other Service.
Introduction to Nutch CSCI 572: Information Retrieval and Search Engines Summer 2010.
Personal Information Management Vitor R. Carvalho : Personalized Information Retrieval Carnegie Mellon University February 8 th 2005.
Creating Research Alerts Sarah Lester & Mike Nack, Stanford Engineering Library Winter Quarter 2011.
Haystack: Per-User Information Environments David Karger.
Key Applications Module Lesson 21 — Access Essentials
ASSISTED BROWSING THROUGH SEMISTRUCTURED DATA PROBLEM The development of the RDF standard highlights the fact that a great deal of useful information is.
ITCS373: Internet Technology Lecture 5: More HTML.
XP Practical PC, 3e Chapter 8 1 Browsing and Searching the Web.
2007. Software Engineering Laboratory, School of Computer Science S E Web-Harvest Web-Harvest: Open Source Web Data Extraction tool 이재정 Software Engineering.
Individualized Knowledge Access David Karger Lynn Andrea Stein Mark Ackerman Ralph Swick.
MetaLib 4 User Guide. 2 MetaLib 4 Access MetaLib at: – MetaLib may be used at two different levels –
Knowledge Management Platform Communities of Practice User Guide for CoP users Copyright © 2010 Group Technology Solutions. All Rights Reserved.
CONTENTS  Definition And History  Basic services of INTERNET  The World Wide Web (W.W.W.)  WWW browsers  INTERNET search engines  Uses of INTERNET.
Digital Libraries1 David Rashty. Digital Libraries2 “A library is an arsenal of liberty” Anonymous.
January 2006Colby College ITS Setting Up Course Pages.
Individualized Knowledge Access David Karger Lynn Andrea Stein.
The World Wide Web. What is the worldwide web? The content of the worldwide web is held on individual pages which are gathered together to form websites.
Website Design, Development and Maintenance ONLY TAKE DOWN NOTES ON INDICATED SLIDES.
Augmenting (personal) IR Readings Review Evaluation Papers returned & discussed Papers and Projects checkin time.
Text Information Management ChengXiang Zhai, Tao Tao, Xuehua Shen, Hui Fang, Azadeh Shakery, Jing Jiang.
Microsoft Office 2008 for Mac – Illustrated Unit D: Getting Started with Safari.
COMP 143 Web Development with Adobe Dreamweaver CC.
Haystack: Per-User Information Environments David Karger.
TechKnowlogy Conference August 2, 2011 Using GoogleDocs for Collaboration.
Weebly Elements, Continued
Web Mining Ref:
Augmenting (personal) IR
Lisa Ruff Business Productivity/Accessibility TS Microsoft Federal
Oracle Sales Cloud Sales campaign
Unit# 5: Internet and Worldwide Web
Haystack: an Adaptive Personalized Information Retrieval System
INTELLIGENT BROWSERS Cenk Ursavas.
Presentation transcript:

L C SL C S Haystack: Per-User Information Environments David Karger

L C SL C S Motivation

David Karger — MIT Laboratory for Computer Science and Artificial Intelligence Laboratory Individualized Information Retrieval One size does NOT fit all –Library is to bookshelf as google is to …. Best IR tools must adapt to their individual users –Hold content that is appropriate to that user –Organize it to help that user navigate and organize it –Adapt over time to how that user wants things done –Like a bookshelf, or a personal secretary

David Karger — MIT Laboratory for Computer Science and Artificial Intelligence Laboratory Haystack Approach Data Model –Define a rich data model that lets user represent all interesting info –Rich search capabilities –Machine readable so that agents can augment/share/exchange info User Interface –Strengthen UI tools to show rich data model to user –And let them navigate/manipulate it Adaptability –People are lazy, unwilling to “waste time” telling system what to do, even if it could help them later –System must introspect about user actions, deduce user needs and preferences, and self-adjuss to provide better behavior

L C SL C S Data Model A semantic web of information

David Karger — MIT Laboratory for Computer Science and Artificial Intelligence Laboratory The Haystack Data Model W3C RDF/DAML standard Arbitrary objects, connected by named links –A semantic web –Links can be linked No fixed schema –User extensible –Add annotations –Create brand new attributes Doc D. Karger Haystack title author Outstanding quality says HTML type

David Karger — MIT Laboratory for Computer Science and Artificial Intelligence Laboratory Agent Environment Various types rooted in RDF containers –Extract structured data from traditional formats –Extend RDF through analysis/integration of other RDF –Take actions (notify user gui, fetch web info, send ) Various Triggers –Scheduled actions –Actions triggered by arrival/creation of new RDF patterns Belief Server –Agents will disagree –User specifies which are more trustworthy –Belief server filters each disagreement User is ultimate arbiter (via user interface)

David Karger — MIT Laboratory for Computer Science and Artificial Intelligence Laboratory Database Needs Power –Support general purpose SQL-style queries over arbitrary RDF Speed –Haystack stores all state in data model –So issues huge number of tiny, trivial queries to model –Traditional databases assume real work of query will dominate intialization/marshalling costs –So traditional databases don’t work for haystack Wanted: all-in-one data repository

David Karger — MIT Laboratory for Computer Science and Artificial Intelligence Laboratory Gathering Data Active user input –Interfaces let user add data, note relationships Mining data from prior data –Plug-in services opportunistically extract data Passive observation of user –Plug-ins to other interfaces record user actions Other Users

David Karger — MIT Laboratory for Computer Science and Artificial Intelligence Laboratory Data Extraction Services Web Observer Proxy RDF Store Mail Observer Proxy Machine Learning Services Web Viewer Haystack UI Spider

L C SL C S User Interface Uniform Access to All Information

David Karger — MIT Laboratory for Computer Science and Artificial Intelligence Laboratory Current Barriers to Information Flow Partitions by Location –Some data on this computer, some on that –Remote access always noticeable, distracting Partitions by Application –Mail reader for this, web browser for that, text editor for those –Todo list, but without needed elements Invisibility –Where did I put that file? –Tendency for objects to have single (inappropriate) location (folder) Missing attributes –Too lazy to add keywords that would aid searching later

David Karger — MIT Laboratory for Computer Science and Artificial Intelligence Laboratory Goal: Task-Based Interface When working on X, all information relevant to X (and no other) should be at my fingertips –Planning the day: todo list, news articles, urgent , seminars –Editing a paper: relevant citations, from coauthors, prior versions –Hacking: code modules, documentation, working notes, threads Location, source and format of data irrelevant

David Karger — MIT Laboratory for Computer Science and Artificial Intelligence Laboratory Sign of Need: Usage as todo list –Anything not yet “done” kept there –Reminder to ourselves –Single interface containing numerous document types Overflowing Inboxes –Navigate only by brute-force scanning –Unsafe file/categorize anything: out of sight, out of mind

David Karger — MIT Laboratory for Computer Science and Artificial Intelligence Laboratory Options Folders –Out of sight, out of mind –Still need applications to see data –Which is the right folder? Desktops –Allow arbitrary data types –But coupling between applications & data types too light –A smear of many tasks, so hard to focus *Hundreds of icons, tens of windows, huge menus *No partitioning RDF (our choice) –Treat information uniformly –Let each information object present itself in contect

David Karger — MIT Laboratory for Computer Science and Artificial Intelligence Laboratory The Big Picture

David Karger — MIT Laboratory for Computer Science and Artificial Intelligence Laboratory User Interface Architecture Views: Data about how to display data Views are persistent, manipulable data Data to be displayed UI data Mapping View Underlying information UI data Mapping 2 View 2

David Karger — MIT Laboratory for Computer Science and Artificial Intelligence Laboratory Semantic User Interface Present information by assembling different views together Information manipulation decoupled from presentation –Lower barrier of entry for development –New data types can be added without designing new UIs Uniform support for features like context menus –Actions apply to objects on screen in various “roles” –E.g. as word, as name of mail message, as member of collection View for Favorites collection View for cnn.com View for yahoo.com View for ~/documents/thesis.pdf

David Karger — MIT Laboratory for Computer Science and Artificial Intelligence Laboratory

Tasks Become Modeless Data

David Karger — MIT Laboratory for Computer Science and Artificial Intelligence Laboratory Persistence of Views Views are data like all other data Stored persistently, manipulated by user User can customize a view –View for particular task can be cloned from another –Can evolve over time to need of task –To an extent previously limited to sophisticated UI designer Views can be shared (future work) –Once someone determines “right” way to look at data, others can benefit

L C SL C S Adaptation Learning from the User over Time (Future Work)

David Karger — MIT Laboratory for Computer Science and Artificial Intelligence Laboratory Approach Haystack is ideally positioned to adapt to user –RDF data model provides rich attribute set for learning –In particular, can record user actions with information *(which flexible UI can capture) –Extensive record can be built up over time Introspect on that information –Make Haystack adapt to needs, skills, and preferences of that user

David Karger — MIT Laboratory for Computer Science and Artificial Intelligence Laboratory Observe User Instrument all interfaces, report user actions to haystack –Mail sent, files edited, web pages browsed Discover quality –What does the user visit often? Discover semantic relationships –What gets used at the same time? Discover search intent –Which results were actually used?

David Karger — MIT Laboratory for Computer Science and Artificial Intelligence Laboratory Learning from Queries Searching involves a dialogue –First query doesn’t work –So look at the results, change the query –Iterate till home in on desired results Haystack remembers the dialogue –instead of first query attempt, use last one –record items user picked as good matches –on future, similar searches, have better query plus examples to compare to candidate results –Use data to modify queries to big search engines, filter results coming back

David Karger — MIT Laboratory for Computer Science and Artificial Intelligence Laboratory Mediation Haystack can be a lens for viewing data from the rest of the world –Stored content shows what user knows/likes –Selectively spider “good” sites –Filter results coming back *Compare to objects user has liked in the past –Can learn over time Example - personalized news service

David Karger — MIT Laboratory for Computer Science and Artificial Intelligence Laboratory News Service

David Karger — MIT Laboratory for Computer Science and Artificial Intelligence Laboratory News Service Scavenges articles from your favorite news sources –Html parsing/extracting services Over time, learns types of articles that interest you –Prioritizes those for display Uses attributes other than article content –Current system based entirely on URL of story

David Karger — MIT Laboratory for Computer Science and Artificial Intelligence Laboratory Personalized News Service

David Karger — MIT Laboratory for Computer Science and Artificial Intelligence Laboratory Underway Projects Mail Auto-classifier Generalized querying/relevance feedback based on Haystack’s rich attribute set

L C SL C S Collaboration Haystack’s Ulterior Motive

David Karger — MIT Laboratory for Computer Science and Artificial Intelligence Laboratory Hidden Knowledge People know a lot that they are –Willing to share –But too lazy to publish Haystack passively collects that knowledge –Without interfering with user Once there, share it! –RDF---uniform language for data exchange Challenges –As people individualize systems, semantics diverge –Who is the “expert” on a topic? (collaborative filtering)

David Karger — MIT Laboratory for Computer Science and Artificial Intelligence Laboratory Example Info on probabilistic models in data mining –My haystack doesn’t know, but “probability” is in lots of I got from Tommi Jaakola –Tommi told his haystack that “Bayesian” refers to “probability models” –Tommi has read several papers on Bayesian methods in data mining –Some are by Daphne Koller –I read/liked other work by Koller –My Haystack queries “Daphne Koller Bayes” on Yahoo –Tommi’s haystack can rank the results for me…