Surviving the Information Explosion
Jaime Teevan, MIT
with Christine Alvarado, Mark Ackerman and David Karger
Let Me Interview You!
  Web:
  – What’s the last Web page you visited? How did you get there?
  – Have you looked for anything on the Web?
  Email:
  – What’s the last email you read? What did you do with it?
  – Have you gone back to an email you’ve read before?
  Files:
  – What’s the last file you looked at? How did you get to it?
  – Have you looked for a file?
Overview
  Introduction
  Related Work
  Study Methodology
  Results: Search
  Discussion
Intro RW Study Res Disc
The Information Explosion
  You must extract information from:
  – 3 billion Web pages (Google)
  – Dozens of incoming emails daily
  – Hundreds of files on your personal computer
Haystack: Personal Information Storage
  [Diagram: Web pages, files, calendar, and contacts stored in Haystack]
Haystack: Personal Information Storage
  “What was that paper I read last week about Information Retrieval?”
  [Diagram: user querying Haystack]
Haystack: Personal Information Storage
  “Ah yes! Thank you.”
  [Diagram: Haystack returning the paper]
Supporting Information Interaction
  Treat different corpora the same?
  Provide access to meta-data?
  – Keyword search (XP, advanced search)
  – Browse (Hearst)
  We don’t really know …
  Understand access in the wild!
Overview
  Introduction
  Related Work
  – Interaction by corpus
  – How people search
  Study Methodology
  Results: Search
  Discussion
Interaction By Corpus
  Paper documents – [Malone, 1983], [Whittaker & Hirshberg, 2001]
  Files – [Barreau & Nardi, 1995]
  Web – [Abrams, et al. 1998], [Byrne, et al. 1999]
  Email/Calendar – [Whittaker & Snider, 1996], [Bellotti & Smith, 2000]
How People Look for Information
  Focus: Web
  Log analysis
  – [Catledge & Pitkow, 1995], [Tauscher & Greenberg, 1997]
  Controlled tasks/environment
  – [Baldonado & Winograd, 1997], [Spool, 1998]
  Situated navigation
  – Micronesian islanders [Suchman, 1987]
  – Electronic [Marchionini, 1995], [Hearst, 2000]
  – Information scent [Chi, Pirolli, Chen & Pitkow, 2001]
Method
  Subjects
  – 15 MIT CS graduate students (5 women, 10 men)
  Setup
  – 10 short interviews (~5 min.)
  – 1 long interview (~45 min.)
  Topics
  – Web, email, files
Short Interviews
  Modified diary study [Palen, 2002]
  Randomly interrupted participant
  Two question types
  – Last email/file/Web page looked at
  – Last email/file/Web page looked for
  Goal: Discover patterns in searching and browsing
Long Interviews
  “Guided tour” of subject’s Web space, email, and file system
  Goals:
  – Discover organizational patterns
  – Discover problems in organizational structure
  – Relate organization to search/browse behavior
Overview
  Introduction
  Related Work
  Study Methodology
  Results: Search
  – What and how
  – Relating what and how
  – Individual strategies
  Discussion
Complex Information Spaces
  People had complex spaces
  Felt in control
  “That’s an interesting question. I think my email is the worst, because I have so much of it. And there are people on the other end who expect me to reply to it. My file system is pretty well organized. I have to go through it every once in a while, every couple of months and just kind of push things into the right folders and delete the old stuff. The Web just works, usually.”
What People Look For
  Specific Information
  – A small fact
  – E.g., URL, phone number, appointment time
  General Information
  – A broad set of information
  – E.g., good sneakers to buy, info on cancer
  Specific Document
  – The actual document
  – E.g., a file to print, an email to reply to
How People Look For Information
  The last thing you looked for on the Web
  – Did you use a search engine?
  Search is more than just keyword search
  – Browse, use bookmarks, type URLs
  “I was looking to figure out where Glaris was. When I lived in Switzerland there were only a few reasonable mapping places of the country. And so I had bookmarked [the Switzerland map site].”
Strategies for Looking for Information
  Teleporting
  – Traditional search
  – Jump directly to target
  – Specify everything up front
  Orienteering
  – Use local navigation
  – [O’Day and Jeffries, 1993]
  – Could include keyword search
Example: Orienteering
  Interviewer: Have you looked for anything on the Web today?
  Jim: I had to look for the office number of the Harvard professor.
  I: So how did you go about doing that?
  J: I went to the homepage of the Math department at Harvard […]
  J: I knew that she had a very small Web page saying, “I’m here at Harvard. Here’s my contact information.” […]
  I: So you went to the Math department, and then what did you do over there?
  J: It had a place where you can find people and I went to that page and they had a dropdown list of visiting faculty, and so I went to that link and I looked for her name and there it was.
Example: Teleporting
  What if Jim had teleported instead?
  Could have typed into a search engine: “Connie Monroe, office number”
“Keyword Search” and “Browse”
  Teleporting (“Keyword Search”)
  – Traditional search
  – Jump directly to target
  – Specify everything up front
  Orienteering (“Browse”)
  – Use local navigation
  – [O’Day and Jeffries, 1993]
  – Could include keyword search
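The contrast between the two strategies can be sketched in code. This is a minimal illustration, not something taken from the study: the page names, page text, and link structure in `PAGES` are hypothetical stand-ins for the Math department site in Jim's example, and the functions `teleport` and `orienteer` are names invented here.

```python
# Hypothetical toy pages: name -> (visible text, outgoing links).
PAGES = {
    "math-home": ("Harvard Math department homepage", ["people", "courses"]),
    "courses": ("course listings", []),
    "people": ("find people", ["visiting-faculty", "staff"]),
    "staff": ("staff directory", []),
    "visiting-faculty": ("visiting faculty list", ["monroe-contact"]),
    "monroe-contact": ("Connie Monroe contact information office number", []),
}


def teleport(pages, query):
    """Jump straight to targets: every term must be specified up front."""
    terms = query.lower().split()
    return [name for name, (text, _) in pages.items()
            if all(term in text.lower() for term in terms)]


def orienteer(pages, start, cues):
    """Start from a known source and follow one local cue per step."""
    path = [start]
    for cue in cues:
        _, links = pages[path[-1]]
        matches = [link for link in links if cue in link]
        if not matches:
            break  # no link matches the cue; stop at the current page
        path.append(matches[0])
    return path


print(teleport(PAGES, "Connie Monroe office number"))
print(orienteer(PAGES, "math-home", ["people", "visiting", "monroe"]))
```

In this sketch, teleporting needs the whole target description at once, and a query containing a term that is not on the page returns nothing. Orienteering only needs a known starting source and one small decision per step.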
Relating How and What
  People orienteer a lot
  What people look for related to how they look
  Surprise: Orienteer to specific information
  [Chart: Orienteer vs. Teleport for Specific, General, and Document searches]
Why So Much Orienteering?
  Your last search
  – What were you looking for?
  – Did you know what contained that information?
  People look for the information source
  – Specific information searches
  – Document searches
Looking for the Source: Example
  “I was looking to figure out where Glaris was. When I lived in Switzerland there were only a few reasonable mapping places of the country. And so I had bookmarked [the Switzerland map site].”
Looking for the Source: Example
  Interviewer: Have you looked for anything on the Web today?
  Jim: I had to look for the office number of the Harvard professor.
  I: So how did you go about doing that?
  J: I went to the homepage of the Math department at Harvard […]
  J: I knew that she had a very small Web page saying, “I’m here at Harvard. Here’s my contact information.” […]
  I: So you went to the Math department, and then what did you do over there?
  J: It had a place where you can find people and I went to that page and they had a dropdown list of visiting faculty, and so I went to that link and I looked for her name and there it was.
Individual Strategies
  Search strategies varied by individual
  Where was the last email you found?
  – Inbox?
  – Elsewhere?
  Pilers: Pile information
  Filers: File information
File or Pile
  [Figure: Filer vs. Piler]
How Individuals Search For Files
  [Chart: Teleport vs. Orienteer for Filers and Pilers]
Overview
  Introduction
  Related Work
  Study Methodology
  Results
  Discussion
  – Understanding and applying what we learn
  – Future work
Understanding Teleporting v. Orienteering
  Why was orienteering chosen over teleporting?
  – Teleporting doesn’t work
  – Teleporting requires too much cognitive effort
  – Risk of over-specifying target
  – Orienteering gives knowledge of the source
  Teleporting a failure mode
  – Can’t associate information with source
  – Can’t find the information source
Understanding Filers v. Pilers
  Why do filers teleport more than pilers?
  Filers have strictly organized information
  – Are used to defining meta-data for their information
  Pilers loosely organize their information
  – Are used to associative navigating
  Irony: Those with good organization don’t take advantage of it
Haystack: Applying What We Learn
  Using meta-data: Support orienteering
  – Not about having the perfect search interface
  – Need ability to prompt
  Individualized support
  – Pilers/filers
  – Learning individual behaviors
Future Work: Search
  – Previously viewed information
  – Causes of failure
  – Searches across corpus
  – Getting help from others
Future Work: Organization
  – Consistency of organization across corpus
  – Corpora boundaries
  – Context used in organization
  – Organization’s effect on search
Conclusion
  Look at search in the wild
  Strategies: Teleport/Orienteer
  Individual strategies
  Future systems should:
  – Support orienteering
  – Provide individualized support
Questions? To learn more about Haystack: Contact us with comments: - -
Relating How and Corpus
  Email and files: Almost always orienteered
  – Easy to associate information with document
  Web: Teleported much more often
  [Chart: Orienteer vs. Teleport for Email, Files, and Web]
Relating What and Corpus
  Email searches were primarily for specific information
  File searches were primarily for documents
  Web searches were more evenly distributed
  [Table: rows Specific, General, Document; columns Email, Files, Web; surviving values: Specific 39733, General 10730]