Published by Lesley Harrington. Modified over 8 years ago.
1
Individualized Knowledge Access
David Karger, Lynn Andrea Stein
2
Web Search Tools
- Indices
  - search by keyword
- Taxonomies
  - classify by subject
- Cool site of the day

A lot like libraries...
- Library catalogues
- Dewey Decimal
- New book shelf, suggested reading

Is a universal library enough?
3
Library/Web Limitations
- Huge:
  - too many answers, mostly irrelevant
- Only published material
  - misses info known to few, leading-edge content
- Rigid:
  - everyone gets the same search results
  - even if they come back and try again

The library is the last place we look
4
Bookshelves First
- My data:
  - information gathered personally
  - high quality, easy for me to understand
  - not limited to publicly available content
  - annotations
- My organization:
  - choose own subject arrangement
  - optimize for my kind of searching
- Adapts to my needs
5
Then a Friend
- Leverage
  - they organize information for their access
  - so quickly find things for me
- Personal expertise
  - they know things not in any library
- Trust
  - their recommendations are good
- Shared vocabulary
  - they know me and what I want
6
Last the Library
- Answer usually there
  - but hard to find
  - would be nice to rearrange to my needs
- For hardest problems, need librarian
  - they have broad knowledge of library
  - but not as deep as an expert on the question
7
Lessons
- Individualized access: The best tools adapt to individual ways of organizing and seeking data.
- Individualized knowledge: People know much more than they publish. That knowledge is useful.
8
Haystack: a Tool for Oxygen
- Independent but interacting repositories that adapt to their individual users
- Individualize access
  - My data collection, organization
  - My search tools, with answers for me
- Leverage individual knowledge
  - Collaborative retrieval with others
  - Motivate people to organize their data for their own benefit and thus for others’
9
Example
- Have probabilistic models been used in data mining?
  - My haystack doesn’t know, but “probability” is in lots of mail I got from Tommi Jaakkola
  - Tommi told his haystack that “Bayesian” refers to “probability models”
  - Tommi has read several papers on Bayesian methods in data mining
  - His haystack suggests them to mine
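
The exchange in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Haystack's actual design: the `Haystack` class, its fields, the word-set document representation, and the sample paper title are all hypothetical. A haystack that finds nothing locally asks the haystacks of correspondents whose mail mentions a query term, and each haystack expands query terms with synonyms its owner declared.

```python
class Haystack:
    """Toy personal repository (hypothetical; not the real Haystack API)."""

    def __init__(self, owner: str) -> None:
        self.owner = owner
        self.documents: dict[str, set[str]] = {}   # title -> lowercase words
        self.synonyms: dict[str, set[str]] = {}    # term -> owner-declared equivalents
        self.mail_terms: dict[str, set[str]] = {}  # correspondent -> terms seen in their mail
        self.peers: dict[str, "Haystack"] = {}     # correspondent -> their haystack

    def local_search(self, query: set[str]) -> list[str]:
        def covers(words: set[str], term: str) -> bool:
            # A document covers a term if it holds the term or a declared synonym.
            return bool(({term} | self.synonyms.get(term, set())) & words)

        return sorted(
            title
            for title, words in self.documents.items()
            if all(covers(words, term) for term in query)
        )

    def search(self, query: set[str]) -> list[str]:
        hits = self.local_search(query)
        if hits:
            return hits
        # Nothing local: ask peers whose mail mentions any query term.
        suggestions: set[str] = set()
        for name, terms in self.mail_terms.items():
            if terms & query and name in self.peers:
                suggestions.update(self.peers[name].local_search(query))
        return sorted(suggestions)


# Made-up data mirroring the slide's scenario.
tommi = Haystack("Tommi")
tommi.synonyms["probability"] = {"bayesian"}
tommi.documents["Bayesian methods in data mining"] = {"bayesian", "methods", "data", "mining"}

mine = Haystack("me")
mine.mail_terms["Tommi"] = {"probability"}
mine.peers["Tommi"] = tommi

suggested = mine.search({"probability", "mining"})
```

Here `suggested` holds the title from Tommi's haystack even though it never contains the word "probability": the match comes from the synonym Tommi recorded for his own use.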
10
Research Threads
- Heterogeneous data and metadata
  - archive whatever user wants
- Human-Computer Interaction
  - let user express/use own organizational rules
  - observe user to detect unexpressed knowledge
- Machine learning
  - use gathered data to improve performance
- Collaborative filtering
  - use others’ decisions to help me
11
My data
- Haystack archives anything
  - web pages browsed, email sent and received, documents written, scanned images, home directory, people known, projects worked on
- And any properties, relationships
  - text of object (if it knows how)
  - author, title, color, citations, quotations, annotations, quality, last usage
- Users freely add types, relationships
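
One way to picture "anything, with any properties and relationships" is a store of items with free-form property maps, plus a set of labeled links between them. The sketch below is an assumption for illustration only: `Item`, `Store`, and the sample identifiers are hypothetical names, and the slides do not say how Haystack actually stores data.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Item:
    """One archived object (hypothetical): a web page, an email, a person, ..."""
    item_id: str
    kind: str  # free-form type name, so users can add new types
    properties: dict[str, Any] = field(default_factory=dict)


class Store:
    """Toy in-memory repository; not Haystack's actual storage layer."""

    def __init__(self) -> None:
        self.items: dict[str, Item] = {}
        # Each relationship is a (source id, relation name, target id) triple.
        self.relations: set[tuple[str, str, str]] = set()

    def add(self, item: Item) -> None:
        self.items[item.item_id] = item

    def relate(self, source: str, relation: str, target: str) -> None:
        # Relation names are plain strings, so new kinds need no schema change.
        self.relations.add((source, relation, target))

    def related(self, source: str, relation: str) -> list[str]:
        return sorted(t for s, r, t in self.relations if s == source and r == relation)


# Made-up sample data.
store = Store()
store.add(Item("paper1", "document", {"title": "An example paper", "quality": "high"}))
store.add(Item("mail7", "email", {"author": "a colleague"}))
store.relate("mail7", "cites", "paper1")
store.relate("mail7", "annotates", "paper1")
```

Keeping both types and relation names as open strings is the design choice that matches "users freely add types, relationships": nothing has to be declared in advance.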
12
Gathering My Data
- Active user input
  - interfaces let user add data, note relationships
- Mining data from haystack
  - plug-in services opportunistically extract data
  - e.g., find author/title/text in MSWord document
  - or, detect that one document quotes another
- Observing user
  - plug-ins to other interfaces report user actions
  - web pages browsed, mail sent, queries made
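
The "plug-in services opportunistically extract data" idea could look roughly like the following. This is a minimal sketch under assumptions: the registry, the `register` decorator, and both extractors are hypothetical, and they work on plain text rather than real MSWord files. Each plug-in returns whatever metadata it can find, and a failing or empty plug-in is simply skipped.

```python
from typing import Callable

Extractor = Callable[[str], dict[str, str]]
extractors: list[Extractor] = []  # hypothetical plug-in registry


def register(fn: Extractor) -> Extractor:
    """Add an extractor plug-in to the registry."""
    extractors.append(fn)
    return fn


@register
def title_from_first_line(text: str) -> dict[str, str]:
    lines = text.strip().splitlines()
    first = lines[0].strip() if lines else ""
    return {"title": first} if first else {}


@register
def author_from_byline(text: str) -> dict[str, str]:
    for line in text.splitlines():
        if line.lower().startswith("by "):
            return {"author": line[3:].strip()}
    return {}


def mine_metadata(text: str) -> dict[str, str]:
    """Run every plug-in; keep the first value found for each property."""
    found: dict[str, str] = {}
    for extract in extractors:
        try:
            for key, value in extract(text).items():
                found.setdefault(key, value)
        except Exception:
            # Opportunistic: a plug-in that cannot handle the input is ignored.
            continue
    return found


sample = "Quarterly Report\nby A. Writer\nBody text follows."
metadata = mine_metadata(sample)
```

With this sample, `metadata` contains a `title` and an `author`; a document with no byline would yield only the title.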
13
Adaptation
- Remember user’s attempts to tune a query
  - instead of first query attempt, use last one
  - record items user picked as good matches
  - future similar queries do better right away
- Stored content shows what user knows/likes
  - modify queries to big search engines
  - filter results coming back
  - personalized “cool site of the day”
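
The query-tuning bullets can be sketched as a small memory of past search sessions. This is an illustrative assumption, not the prototype's code: `QueryMemory` and its methods are hypothetical, and "similar queries" is simplified here to an exact match on the first query of an earlier session.

```python
class QueryMemory:
    """Toy record of how the user refined queries and what they picked."""

    def __init__(self) -> None:
        self.refined: dict[str, str] = {}      # first attempt -> last attempt
        self.picks: dict[str, list[str]] = {}  # final query -> items picked as good

    def record_session(self, attempts: list[str], picked: list[str]) -> None:
        if not attempts:
            return
        first, last = attempts[0], attempts[-1]
        self.refined[first] = last
        remembered = self.picks.setdefault(last, [])
        for item in picked:
            if item not in remembered:
                remembered.append(item)

    def rewrite(self, query: str) -> str:
        # Instead of the first query attempt, use the last one.
        return self.refined.get(query, query)

    def rerank(self, query: str, results: list[str]) -> list[str]:
        picks = self.picks.get(self.rewrite(query), [])
        # Stable sort: previously picked items move to the front, order otherwise kept.
        return sorted(results, key=lambda r: r not in picks)


memory = QueryMemory()
memory.record_session(["jaguar", "jaguar cat", "jaguar cat habitat"], ["page_b"])

rewritten = memory.rewrite("jaguar")
ranked = memory.rerank("jaguar", ["page_a", "page_b", "page_c"])
```

After one recorded session, typing the original query again is rewritten to the tuned form, and the item picked last time is listed first.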
14
Collaborative Access
- Leverage others’ work organizing data
  - no need to “publish” expertise
  - exposed automatically
  - self interest helps others
- Privacy/permission concerns
  - allowing exposure easier than publishing
  - much public info: mailing lists, papers read
- Whose opinions matter?
  - people I mail, w/shared data, referrals
  - collaborative filtering techniques
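
As a rough illustration of "whose opinions matter", the sketch below weights each person's picks by how much mail I exchange with them and ranks unseen items by the summed weight. The weighting rule, the function names, and the sample data are assumptions made for this example; the slides name collaborative filtering only in general terms.

```python
def trust_weights(mail_counts: dict[str, int]) -> dict[str, float]:
    """Weight each correspondent by their share of my mail (an assumed heuristic)."""
    total = sum(mail_counts.values())
    if total == 0:
        return {name: 0.0 for name in mail_counts}
    return {name: count / total for name, count in mail_counts.items()}


def recommend(
    mail_counts: dict[str, int],
    picks: dict[str, set[str]],
    already_seen: set[str],
) -> list[str]:
    """Rank items others picked, highest trust-weighted score first."""
    weights = trust_weights(mail_counts)
    scores: dict[str, float] = {}
    for person, items in picks.items():
        weight = weights.get(person, 0.0)  # strangers contribute nothing
        for item in items - already_seen:
            scores[item] = scores.get(item, 0.0) + weight
    # Sort by descending score, breaking ties by item name for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))


# Made-up sample data.
mail_counts = {"ann": 30, "bob": 10}
picks = {"ann": {"p1", "p2"}, "bob": {"p2", "p3"}, "carol": {"p4"}}

ranking = recommend(mail_counts, picks, already_seen={"p1"})
```

Items endorsed by several frequent correspondents rise to the top, while picks from people I never mail stay at the bottom. Referrals or shared data could feed the same weights, but that is left out here.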
15
Conclusion
- Libraries are not enough
- Haystack teases out individual knowledge
- Individualizes information access for user
- Exposes individual knowledge to benefit community
- Current status: individual-user prototype. Some data extraction, observation, adapting. Collaborative version in future.