Haystack: Per-User Information Environments David Karger.

Motivation

Web Search Tools
- Indices search by keyword
- Taxonomies classify by subject
- Cool site of the day
- A lot like libraries...
  - Library catalogues
  - Dewey digital
  - New book shelf, suggested reading
- Is a universal library enough?

Library/web Limitations
- Huge
  - Too many answers, mostly irrelevant
- Only published material
  - Miss info known to few, leading-edge content
- Rigid
  - All get same search results
  - Even if come back and try again
- The library is the last place we look

Start with Bookshelf
- I try solving problems using my data:
  - Information gathered personally
  - High quality, easy for me to understand
  - Not limited to publicly available content
- My organization:
  - Personal annotations and metadata
  - Choose own subject arrangement
  - Optimize for my kind of searching
  - Adapts to my needs

Then Turn to a Friend
- Leverage
  - They organize information for their own use
  - Let them find things for me too
- Shared vocabulary
  - They know me and what I want
- Personal expertise
  - They know things not in any library
- Trust
  - Their recommendations are good

Last to Library/web
- Answer usually there
- But hard to find
- Wish: rearrange to suit my needs
- Wish: help from my friends in looking
- E.g. NY public library catalogue

Lessons
- Individualized access: The best tools adapt to individual ways of organizing and seeking data.
- Individualized knowledge: People know much more than they publish. That knowledge is useful to them and others.
- End user: understands their data the best, so should control organization and presentation

Problems with Current Tools
- Applications designed by few for use by many
  - Developers decide what information is important
  - Provide model to hold that information
  - Provide interfaces to view/manipulate that info
- Users discover uses/needs for other info
  - Tool cannot store, cannot support interaction
- Users discover connections between info
  - If connected info is in different applications, neither app can record connection
- People could do a lot more with information, if environment let them record/use what they know

Haystack Approach
- Data Model
  - Define rich data model that lets user represent all interesting info
  - Rich search capabilities
  - Machine readable so agents can augment/share/exchange info
- User Interface
  - Strengthen UI tools to show rich data model to user
  - And let them navigate/manipulate it
- Collaboration
  - As system gathers information from one user, share with others
  - Rich data model maximizes useful knowledge transfer

Data Model: A semantic web of information

Motivation
- Tremendous amount of information is relational
- Named relationships
  - Written by, married to, traveling to, owned by…
- Collections
  - Directories, bookmarks, menus, albums
  - Families, workgroups, Web links
- People can take huge advantage of navigating relationships
- Network of relationships much more “structured” than a textual description, but much less regular than a spreadsheet/database

The Haystack Data Model
- W3C RDF/DAML standard
- Arbitrary objects, connected by named links
  - A semantic web
  - Links can be linked
- No fixed schema
  - User extensible
  - Add annotations
  - Create brand new attributes
[Figure: example graph with nodes Doc, D. Karger, Haystack, Outstanding, HTML and named links title, author, quality, says, type]
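
The data model above can be illustrated with a minimal sketch. It is not Haystack's API and not a real RDF library: `Statement`, `Store`, the `a-reviewer` node, and the `read-on` attribute are hypothetical names chosen for illustration. It shows arbitrary objects joined by named links, a link used as the target of another link, and a new attribute added without any schema change.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Statement:
    """One named link: subject --predicate--> obj."""
    subject: "Node"
    predicate: str
    obj: "Node"


# A node is a plain identifier/literal or another statement,
# which is what allows links themselves to be linked.
Node = Union[str, Statement]


class Store:
    """Schema-free set of statements (hypothetical, for illustration only)."""

    def __init__(self):
        self.statements = set()

    def add(self, subject, predicate, obj):
        statement = Statement(subject, predicate, obj)
        self.statements.add(statement)
        return statement

    def match(self, subject=None, predicate=None, obj=None):
        """Return statements that agree with every field that is not None."""
        return [
            s for s in self.statements
            if (subject is None or s.subject == subject)
            and (predicate is None or s.predicate == predicate)
            and (obj is None or s.obj == obj)
        ]


store = Store()
store.add("doc", "title", "Haystack")
store.add("doc", "author", "D. Karger")
store.add("doc", "type", "HTML")
quality = store.add("doc", "quality", "Outstanding")
# Annotate the link itself: someone *says* the quality is outstanding.
store.add("a-reviewer", "says", quality)
# A brand new attribute (made-up value) needs no schema change.
store.add("doc", "read-on", "some date")
```

Searching by any combination of attributes is then a single `match` call, for example `store.match(subject="doc", predicate="author")`.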

RDF Lowers Barriers
- Location Independent
  - Universal Locators, even for local data (as may become non-local)
- Application Independent
  - Simple, common language suitable for variety of information types
  - Enables interlinking and exchange of information from all apps
- Extensible
  - Can add attributes as needed, leave them out if unimportant
- Enables powerful search
  - Based on broad variety of attributes
- Support for data agents
  - Extract information from raw data
  - Make available for search and other forms of navigation

Where does data come from?
- Pull from outside sources
  - Web, databases, news feeds…
- Active user input
  - Interfaces let user add data, note relationships
- Mining data from prior data
  - Plug-in services opportunistically extract data
- Passive observation of user
  - Plug-ins to other interfaces record user actions
- Other Users
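
As a rough sketch of "plug-in services opportunistically extract data": each plug-in below is a plain function that looks at a raw item and yields whatever (subject, attribute, value) triples it can find. The registry `EXTRACTORS`, the function names, the attribute names, and the sample message are all assumptions made for illustration; only the standard-library `email` parser is real.

```python
from email import message_from_string


def mail_extractor(item_id, raw):
    """Yield triples for the mail headers that are present."""
    msg = message_from_string(raw)
    for header, attribute in (("From", "sender"), ("Subject", "title")):
        if msg[header]:
            yield (item_id, attribute, msg[header])


def size_extractor(item_id, raw):
    """Yield one triple recording the size of the raw item in characters."""
    yield (item_id, "size", len(raw))


# Hypothetical plug-in registry: adding a service means appending a function.
EXTRACTORS = [mail_extractor, size_extractor]


def extract_all(item_id, raw):
    """Run every registered plug-in and collect the triples it produces."""
    triples = []
    for extractor in EXTRACTORS:
        triples.extend(extractor(item_id, raw))
    return triples


raw_mail = "From: Alice Example\nSubject: Draft of section 2\n\nPlease review.\n"
triples = extract_all("mail-1", raw_mail)
```

Each plug-in is independent of the others, so one that finds nothing simply contributes no triples.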

[Figure: architecture diagram with components Data Extraction Services, Web Services, RDF Store, Mail Observer, Machine Learning Services, Web Viewer, Haystack UI, Spider]

User Interface: Uniform Access to All Information

Current Barriers to Information Flow
- Partitions by Location
  - Some data on this computer, some on that
  - Remote access always noticeable, distracting
- Partitions by Application
  - Mail reader for this, web browser for that, text editor for those
  - To-do list, but without needed elements
- Invisibility
  - Where did I put that file?
  - Tendency for objects to have single (inappropriate) location (folder)
- Missing attributes
  - Too lazy to add keywords that would aid searching later

Goal: Task-Based Interface
- When working on X, all information relevant to X (and no other) should be at my fingertips
  - Planning the day: to-do list, news articles, urgent email, seminars
  - Editing a paper: relevant citations, email from coauthors, prior versions
  - Hacking: code modules, documentation, working notes, email threads
- Location, source and format of data irrelevant

Sign of Need: Email Usage
- As to-do list
  - Anything not yet “done” kept there
  - Reminders to ourselves
- Single interface containing numerous document types
- Overflowing Inboxes
  - Navigate only by brute-force scanning
  - Unsafe to file/categorize anything: out of sight, out of mind

The Big Picture

User Interface Architecture
- Views: Data about how to display data
- Views are persistent, manipulable data
[Figure: diagram with labels Data to be displayed, Underlying information, UI data, Mapping, View, Mapping 2, View 2]

Semantic User Interface
- Present information by assembling different views together
- Information manipulation decoupled from presentation
  - New views can be added without mucking with data types
  - New data types can be added without designing new UIs
- Uniform support for features like context menus
  - Actions apply to objects on screen in various “roles”
  - E.g. as word, as title of mail message, as member of collection
[Figure: views labeled View for Favorites collection, View for cnn.com, View for yahoo.com, View for ~/documents/thesis.pdf]
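
One way to picture "actions apply to objects on screen in various roles" is a context menu assembled from the roles an on-screen object currently plays. The sketch below is an assumption about how that could work, not a description of Haystack's implementation; the role keys and action names are invented, with the roles taken from the slide's example.

```python
# Hypothetical mapping from a role to the actions that make sense for it.
ACTIONS_BY_ROLE = {
    "word": ["look up word"],
    "mail-title": ["open message", "reply"],
    "collection-member": ["remove from collection"],
}


def context_menu(roles):
    """Merge the actions of every role, keeping first-seen order, no repeats."""
    menu = []
    for role in roles:
        for action in ACTIONS_BY_ROLE.get(role, []):
            if action not in menu:
                menu.append(action)
    return menu


# The same piece of text is a word, a mail title, and a collection member.
menu = context_menu(["word", "mail-title", "collection-member"])
```

Because the menu is computed from roles rather than from the widget that drew the object, a new view gets the same menu support without extra code.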

Persistence of Views
- Views are data like all other data
  - Stored persistently, manipulated by user
- User can customize a view
  - View for particular task can be cloned from another
  - Can evolve over time to need of task
  - To an extent previously limited to sophisticated UI designer
- Views can be shared
  - Once someone determines “right” way to look at data, others can benefit
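
A small sketch of "views are data like all other data", under the assumption that a view can be written as a plain dictionary naming the attributes to show and how to label them. The dictionary layout and the functions `render` and `clone_view` are hypothetical. Since the view is ordinary data, cloning one for a new task and then changing it is a copy plus an update.

```python
import copy

item = {"title": "Haystack", "author": "D. Karger", "type": "HTML"}

# A view is plain data: which attributes to show, in what order, with which
# labels. The renderer below hard-codes nothing about any particular view.
paper_view = {"name": "paper", "fields": ["title", "author"], "labels": {"author": "by"}}


def render(view, obj):
    """Produce one text line per field the view asks for and the object has."""
    lines = []
    for field in view["fields"]:
        if field in obj:
            label = view["labels"].get(field, field)
            lines.append(f"{label}: {obj[field]}")
    return "\n".join(lines)


def clone_view(view, name, **changes):
    """Copy an existing view, rename it, and apply the given customizations."""
    new_view = copy.deepcopy(view)
    new_view["name"] = name
    new_view.update(changes)
    return new_view


triage_view = clone_view(paper_view, "triage", fields=["type", "title"])
```

Because both views are dictionaries, they could be saved, exchanged, or edited with the same tools as any other stored data.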

Role of Schemata
- Benefits
  - Help people look at information the right way
  - Help creators avoid creation mistakes
- Risks of Enforcement
  - Deters lazy users from entering data
  - Prevents creative users from stretching the boundaries
- Is there a middle ground?
  - Can schemata be “advisory”?
- One or many?
  - If each user makes own schema, how translate?
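
The slide leaves "advisory" schemata as an open question. One possible reading, sketched here purely as an assumption, is a check that never rejects data: it accepts whatever the user enters and only returns suggestions. The schema layout and the `advise` function are hypothetical.

```python
# Hypothetical advisory schema: attributes a "paper" is expected to have.
ADVISORY_SCHEMA = {"paper": {"expected": ["title", "author"]}}


def advise(kind, obj):
    """Accept the object as given and report hints instead of errors."""
    expected = ADVISORY_SCHEMA.get(kind, {}).get("expected", [])
    suggestions = [f"consider adding '{a}'" for a in expected if a not in obj]
    unlisted = sorted(set(obj) - set(expected))
    return {"accepted": True, "suggestions": suggestions, "unlisted": unlisted}


report = advise("paper", {"title": "Haystack", "mood": "cheerful"})
```

Under this reading the lazy user is never blocked and the creative user's extra attribute (`mood`) is kept, while both still see what the schema would have liked.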

Brief look

Collaboration: Haystack’s Ulterior Motive

Summary
- Rich Data Model
  - Lets user represent all interesting info
  - Supports sophisticated searches
  - Accessible to information agents
- User Interface
  - Extensibly shows rich data model to user
  - Lets them navigate/manipulate it
- Collaboration
  - As system gathers information from one user, share with others
  - Rich data model maximizes useful knowledge transfer

More Info (initial release available for download)

Hidden Knowledge
- People know a lot that they are
  - Willing to share
  - But too lazy to publish
- Haystack passively collects that knowledge
  - Without interfering with user
- Once there, share it!
  - RDF: uniform language for data exchange
- Challenges
  - As people individualize systems, semantics diverge
  - Who is the “expert” on a topic? (collaborative filtering)

Adaptation: Learning from the User over Time

Approach
- Haystack is ideally positioned to adapt to user
- RDF data model provides rich attribute set for learning
- In particular, can record user actions with information (the flexible UI can capture easily)
- Extensive record can be built up over time
- Introspect on that information
- Make Haystack adapt to needs, skills, and preferences of that user

Observe User
- Instrument all interfaces, report user actions to haystack
  - Mail sent, files edited, web pages browsed
- Discover quality
  - What does the user visit often?
- Discover semantic relationships
  - What gets used at the same time?
- Discover search intent
  - Which results were actually used?
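
A minimal sketch of two of these questions, assuming user actions arrive as (timestamp, action, item) records. The log contents, the five-minute window, and the function names are invented for illustration. Visit frequency stands in for "what does the user visit often", and counting items used close together in time stands in for "what gets used at the same time".

```python
from collections import Counter
from itertools import combinations

# (timestamp in seconds, action, item); every value here is made up.
events = [
    (0, "browse", "paper.pdf"),
    (30, "edit", "notes.txt"),
    (45, "browse", "paper.pdf"),
    (4000, "browse", "news.html"),
    (4020, "browse", "paper.pdf"),
]


def visit_counts(events):
    """How often each item was touched, regardless of the action."""
    return Counter(item for _, _, item in events)


def co_use(events, window=300):
    """Count pairs of distinct items used within `window` seconds of each other."""
    pairs = Counter()
    for (t1, _, a), (t2, _, b) in combinations(sorted(events), 2):
        if a != b and abs(t2 - t1) <= window:
            pairs[frozenset((a, b))] += 1
    return pairs


counts = visit_counts(events)
pairs = co_use(events)
```

The pairwise loop is quadratic in the number of events, which is fine for a sketch but would need a sliding window over sorted timestamps for a long-running log.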

Learning from Queries
- Searching involves a dialogue
  - First query doesn’t work
  - So look at the results, change the query
  - Iterate till home in on desired results
- Haystack remembers the dialogue
  - instead of first query attempt, use last one
  - record items user picked as good matches
  - on future, similar searches, have better query plus examples to compare to candidate results
- Use data to modify queries to big search engines, filter results coming back
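
The idea of remembering the dialogue can be sketched as follows, assuming each search session is recorded as its list of successive queries plus the items the user finally picked. The class `QueryMemory`, the word-overlap (Jaccard) similarity, the 0.5 threshold, and the sample queries and item ids are illustrative assumptions, not the method used by Haystack.

```python
def words(text):
    """Lower-cased set of whitespace-separated words."""
    return set(text.lower().split())


def jaccard(a, b):
    """Overlap of two word sets, from 0.0 (disjoint) to 1.0 (identical)."""
    return len(a & b) / len(a | b) if a | b else 0.0


class QueryMemory:
    def __init__(self):
        self.sessions = []  # each entry: (first_query, last_query, picked_items)

    def remember(self, queries, picked):
        """Store where the dialogue started, where it ended, and what was chosen."""
        self.sessions.append((queries[0], queries[-1], list(picked)))

    def suggest(self, query, threshold=0.5):
        """For a query similar to a past first attempt, return the past final
        query and the items picked then; return None if nothing is similar."""
        best, best_score = None, threshold
        for first, last, picked in self.sessions:
            score = jaccard(words(query), words(first))
            if score >= best_score:
                best, best_score = (last, picked), score
        return best


memory = QueryMemory()
memory.remember(
    ["probability data mining", "probabilistic models data mining", "bayesian data mining"],
    picked=["doc-17", "doc-42"],
)
suggestion = memory.suggest("data mining probability")
```

The returned pair gives both the better query to send to a large search engine and the examples against which candidate results could later be compared.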

Example
- I want info on probabilistic models in data mining
- My haystack doesn’t know, but “probability” is in lots of email I got from Tommi Jaakkola
- Tommi told his haystack that “Bayesian” refers to “probability models”
- Tommi has read several papers on Bayesian methods in data mining
- Some are by Daphne Koller
- I read/liked other work by Koller
- My Haystack queries “Daphne Koller Bayes” on Yahoo
- Tommi’s haystack can rank the results for me…

Mediation
- Haystack can be a lens for viewing data from the rest of the world
- Stored content shows what user knows/finds useful
- Selectively spider “good” sites
- Filter results coming back
  - Compare to objects user has found useful in the past
- Can learn over time
- Example: personalized news service
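
Filtering incoming results by comparing them to objects the user found useful could look roughly like this. The sketch swaps in a simple bag-of-words cosine similarity as the comparison; that choice, the `min_score` cutoff, and the sample texts are assumptions for illustration, not something the slides specify.

```python
import math
from collections import Counter


def vector(text):
    """Bag-of-words count vector for a piece of text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two count vectors (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def rerank(results, useful_docs, min_score=0.1):
    """Drop results unlike anything the user found useful; sort the rest."""
    profile = Counter()
    for doc in useful_docs:
        profile.update(vector(doc))
    scored = [(cosine(vector(r), profile), r) for r in results]
    kept = [(s, r) for s, r in scored if s >= min_score]
    return [r for s, r in sorted(kept, key=lambda pair: pair[0], reverse=True)]


useful = ["bayesian networks for data mining", "probabilistic models of text"]
incoming = [
    "celebrity gossip roundup",
    "bayesian models for mining text data",
    "new probabilistic models announced",
]
ranked = rerank(incoming, useful)
```

As the set of useful documents grows, the profile changes with it, which is one simple sense in which such a filter "can learn over time".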