Augmenting (personal) IR
  Readings Review
  Evaluation Papers returned & discussed
  Papers and Projects check-in time
Relevance Feedback in IR
  Already in most systems
    - Improved query formulations
    - System evaluation of system
  Works from natural characteristics (NC) in documents
    - More interesting to work from the NC of people
    - Personal Relevance Feedback
  If you don’t know what the document set is, how do you reformulate a query?
    - Browse by query, then search
    - (Bibliometric) chaining
Old School Query Reformulation
  Identify core terms in document database
  Deemphasize (not index?) less core terms
  “We know what’s good for you”
    - Small set of documents
    - Accurate knowledge of users
  Small steps, building to quality documents
  Weights of queries are shifted
    - Preferred terms
    - Partial weights from 0 to 1
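One way to picture "weights of queries are shifted" is a Rocchio-style update. The slide does not name Rocchio, so treat this as an illustrative stand-in: the function name, the dict-of-weights representation, and the `alpha`/`beta`/`gamma` defaults are all assumptions. Weights are clamped to the 0 to 1 range mentioned above.

```python
def shift_query_weights(query, relevant_docs, nonrelevant_docs,
                        alpha=1.0, beta=0.75, gamma=0.15):
    """Rocchio-style sketch: move query weights toward relevant documents.

    query and each document are dicts mapping term -> weight in [0, 1].
    alpha, beta, gamma are illustrative defaults, not values from the slides.
    """
    terms = set(query)
    for doc in relevant_docs + nonrelevant_docs:
        terms.update(doc)

    def mean_weight(term, docs):
        # Average weight of the term across a set of judged documents.
        if not docs:
            return 0.0
        return sum(d.get(term, 0.0) for d in docs) / len(docs)

    updated = {}
    for term in terms:
        weight = (alpha * query.get(term, 0.0)
                  + beta * mean_weight(term, relevant_docs)
                  - gamma * mean_weight(term, nonrelevant_docs))
        weight = min(1.0, max(0.0, weight))  # keep partial weights in [0, 1]
        if weight > 0.0:
            updated[term] = weight
    return updated
```

Terms that appear only in documents judged non-relevant end up at zero and are dropped, which loosely matches "deemphasize less core terms".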
It’s all about vectors
  Remember VectorSpace?
  Documents
  Queries
  How similar is the query to a document?
    - Averages and weights give final set
    - Length and location
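A minimal sketch of "how similar is the query to a document?" in the vector space model, assuming raw term counts as weights and whitespace tokenization (both simplifications; a real system would use something like tf-idf and a proper tokenizer). Cosine similarity divides by vector length, which is one way the "length" point above shows up in practice.

```python
import math
from collections import Counter


def to_vector(text):
    """Turn text into a sparse term-count vector (assumed weighting)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine of the angle between two sparse vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, docs):
    """Return (score, document) pairs, most similar first."""
    q = to_vector(query)
    scored = [(cosine(q, to_vector(d)), d) for d in docs]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

Calling `rank("relevance feedback", docs)` on a small list of strings returns the documents ordered by similarity to the query.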
Probabilistic Relevance Feedback
  Document-based, not term-based
  Ranking documents by their content
    - And tweaked weights
  Depends on variety of documents in database
    - Wider variance = harder to predict
    - More processing power can help
    - Means to average and normalize values
  Ad hoc adjustment, relative weighting
    - Using the found documents as additional queries
  How do you evaluate RFS as doc db changes?
    - Previous retrieval is key, but not with changes
    - Adding common terms may help (in general)
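For a concrete feel of "tweaked weights", here is a sketch using the Robertson/Sparck Jones relevance weight with the usual 0.5 smoothing. The slide does not say which probabilistic formula it has in mind, so this choice, along with the function names, is an assumption for illustration. Documents are then ranked by summing the weights of the query terms they contain.

```python
import math


def rsj_weight(r, n, R, N):
    """Robertson/Sparck Jones relevance weight for one term.

    r: judged-relevant documents containing the term
    n: documents in the collection containing the term
    R: judged-relevant documents in total
    N: documents in the collection
    """
    return math.log(((r + 0.5) * (N - n - R + r + 0.5))
                    / ((n - r + 0.5) * (R - r + 0.5)))


def score(doc_terms, weights):
    """Rank value of a document: sum of weights for terms it contains."""
    return sum(weights.get(term, 0.0) for term in set(doc_terms))
```

A term concentrated in the judged-relevant documents gets a positive weight; a term that is common in the collection but absent from the relevant set gets a negative one. Because `n` and `N` come from the collection, the weights shift whenever the document database changes, which is the evaluation problem raised above.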
IR & Filtering
  Are they the same?
    - Is a filter a proactive search?
    - Does filtering lead to better browsing, which leads to less need for searching?
  Good for lots of changing text (Web)
  Active use
  What about push media with filters?
    - RSS
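One way to read "is a filter a proactive search?": the query stays fixed and the documents flow past it. The sketch below treats a filter as a standing keyword profile applied to a stream of incoming items (feed entries, for example). The names `make_filter` and `filter_stream` and the simple match-count threshold are hypothetical, chosen only to illustrate the idea.

```python
def make_filter(profile_terms, min_matches=1):
    """Build a standing query: accept items sharing enough profile terms."""
    profile = {term.lower() for term in profile_terms}

    def accept(item_text):
        words = set(item_text.lower().split())
        return len(words & profile) >= min_matches

    return accept


def filter_stream(items, accept):
    """Yield only the incoming items the standing query accepts."""
    for item in items:
        if accept(item):
            yield item
```

In ad hoc search the collection is fixed and queries change; here the roles are swapped, which is why filtering suits large amounts of changing text.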
What do we mean by augment?
  Douglas Engelbart’s system
    - GUI
    - Interaction
    - Connectivity
    - Management
  Improve upon
  Extend user capabilities
  Do what you want, but faster
  “Do what I mean, not what I say”
  What are some ways to augment?
What is Personalization?
  In computing?
    - Optimized
    - System specific
  In interfaces?
    - Modes of interaction
    - Appropriate for user level
  For IR?
    - Results
    - Time
    - Mode
    - (Relevance) Feedback
Personalized IR system design
  How would you design a personal IR system?
  Who would use it?
  How would you learn about them?
    - Interests
    - Sources
    - Preferences
  How do you evaluate a personal system?
  Understanding users is the key to personalizing search or search interfaces.
Letizia
  Interleaving browsing with (automated) search
  Augmented browsing = less searching?
  Understanding your usage preferences
    - “Behavior based”
    - Letizia explores for you: “doing concurrent, autonomous exploration of links from the user’s current position” (p. 1)
    - PageRank for individuals?
    - PageRank for the exact situation?
    - Smart crawling based on a profile?
Letizia’s Inferences
  What you do tells the system your interests and habits
  List of keywords about your interests
  Persistence of interest issues
    - Shifts
    - Time to restate interest
  Automated queries, keyword matches
  Doesn’t get in the way (much)
  What about the interface?
  Making Web search better?
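To make the "persistence of interest" issue concrete, here is a toy keyword profile in which older interests fade unless the user's behavior restates them. This is not Letizia's actual mechanism; the class name, the decay factor, and the one-point reinforcement per observed keyword are all assumptions for illustration.

```python
class InterestProfile:
    """Toy behavior-based profile: keyword weights that decay over time."""

    def __init__(self, decay=0.5):
        self.decay = decay   # fraction of weight kept after each new page
        self.weights = {}    # keyword -> current interest weight

    def observe(self, keywords):
        """Record the keywords of a page the user just spent time on."""
        # Fade every existing interest first, then reinforce what was seen.
        self.weights = {k: w * self.decay for k, w in self.weights.items()}
        for keyword in keywords:
            self.weights[keyword] = self.weights.get(keyword, 0.0) + 1.0

    def top(self, n=3):
        """Strongest current interests, usable as an automated query."""
        ranked = sorted(self.weights.items(), key=lambda kv: (-kv[1], kv[0]))
        return [keyword for keyword, _ in ranked[:n]]
```

A low `decay` follows shifts in interest quickly but forgets long-term interests; a high one does the opposite. That trade-off is the "time to restate interest" question.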
Siteseer
  “Personalized navigation for the Web”
  Isn’t this a CF system?
  Bookmarks are key indicators of interest
  Category fits
  Implicit recommendations
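The "isn't this a CF system?" question can be explored with a small collaborative-filtering sketch over bookmark sets. This is only an illustration of the idea that overlapping bookmarks signal shared interests; it is not Siteseer's algorithm, and the function names and the Jaccard overlap measure are assumptions.

```python
def jaccard(a, b):
    """Overlap between two bookmark collections, from 0.0 to 1.0."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(user, bookmarks):
    """Suggest URLs held by the most similar other user but not by `user`.

    bookmarks maps user name -> list of bookmarked URLs.
    """
    mine = set(bookmarks[user])
    others = [u for u in bookmarks if u != user]
    if not others:
        return []
    nearest = max(others, key=lambda u: jaccard(mine, bookmarks[u]))
    return sorted(set(bookmarks[nearest]) - mine)
```

The recommendations are implicit in the sense that nobody rates anything: the act of bookmarking is the only signal used.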
How to personalize the Web: WBI
  Interests are bookmarks or home pages
    - Links
    - Text
  Proxy-like between the user and the Web
  Agent-like functions
    - Monitor: records features
    - Editor: tweaks retrieved information
    - Generator: request to response
    - Autonomous agent: triggers
Outride
  Data mining for personalized search
  Fast model fitting for profiles
  Search keyword augmentation
    - Interests
    - Preferences
  “Contextual Computing”
    - Just in time information
    - Situational
  More than content analysis
  “Author relevancy”
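"Search keyword augmentation" can be sketched as appending a few profile terms to whatever the user typed. This is a deliberately simple illustration, not Outride's method; the function name, the ordered list of interests, and the `max_added` cap are assumptions.

```python
def augment_query(query, interests, max_added=2):
    """Append up to max_added interest terms not already in the query.

    interests is assumed to be ordered from strongest to weakest.
    """
    terms = query.lower().split()
    added = []
    for interest in interests:
        candidate = interest.lower()
        if candidate not in terms and candidate not in added:
            added.append(candidate)
        if len(added) == max_added:
            break
    return " ".join(terms + added)
```

For a user whose profile leads with cars, the ambiguous query "Jaguar" becomes "jaguar cars racing", steering results away from the animal without the user restating the context.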
Personalized Search
  Efficiency
  Contextualization
    - Activity
    - Availability
  Individualization
    - User goals (models of Iseek)
    - (Past) behavior
  Interface Awareness and customization
Personalization vs. Customization
  What’s the difference?
    - For a system, for a user
    - Interaction methods, selection methods
  My.yahoo.com vs. amazon.com
  AskJeeves vs. a Reference Librarian
WIRED System Evaluations
  Install IR software
  Set up documents for indexing
    - What types of documents
    - Sizes, formats, time to index?
  Perform some searches
    - Note search functionality
    - Describe (screen shot?) interface for search
  Examine results
    - Describe (screen shot?) results page/screen
    - Rotate, use subset of documents
    - Note differences in queries
  What model, index, system do you think the system uses (based on class discussions & readings)?
System Evaluation Questions
  What do these systems seem to offer?
  How would you use them?
  How would a group use them?
  Can it affect the way you search?
    - The way you work?
    - The way you store/organize information?
  What’s different than you expected?
    - Better or Worse?
    - From your design ideas?