Collaborative Filtering Shaun Kaasten CPSC CSCW
Outline
What is filtering?
Filtering techniques
Why should we use CF?
Examples of CF systems
Virtual community
CF design goals
Evaluation forms
Active CF
Summary
What is Filtering?
Information overload
Finding desired information
Eliminating undesirable information
’94 Resnick et al.
Filtering Techniques (in and out)
Cognitive (content): text in the item
Economic: costs and benefits; mass mailings (low production costs)
Social: people and judgments; collaborative filtering – subjective evaluations of others
’87 Malone et al.
Why should we use CF?
People are better at subjective evaluations: writing style, clarity, music, cake recipes
Benefit from seeing the history of an object’s use: read/edit wear
’95 Maltz & Ehrlich
Tapestry (1992)
Xerox PARC
Users annotate documents they read
Helped others decide what to read
Failures: not free; not distributed; SQL interface – difficult to browse
Grouplens (1994)
Bellcore Video CF (1994)
Suggested Videos for: John A. Jamus.
Your must-see list with predicted ratings:
7.0 "Alien (1979)"
6.5 "Blade Runner"
6.2 "Close Encounters Of The Third Kind (1977)"
Your video categories with average ratings:
6.7 "Action/Adventure"
6.5 "Science Fiction/Fantasy"
6.3 "Children/Family"
The viewing patterns of 243 viewers were consulted. Patterns of 7 viewers were found to be most similar.
Correlation with target viewer: 0.59 viewer bullert,jane r
Bellcore Community Web Browser (1995)
Movielens (1998?)
Web CF: Amazon Customer Reviews
Web CF: Cnet User Opinions
Web CF: MSDN Article Ratings
Virtual Community ’95 Hill et al.
Influence each other without interacting
Share benefits of collaboration without costs: time – developing personal relationships; privacy; synchronous communication
No intelligent agents (other than people)
CF Design Goals: Bellcore & Grouplens
Common: easy participation; people power, not agents; prediction accuracy increases with user base size
Grouplens: compatibility; privacy; rich recommendations
Bellcore: works for groups, not just individuals; recommendations should include confidence
Evaluation Forms
Explicit: music reviews on Amazon; Grouplens – grading of Usenet messages
Implicit: Grouplens – monitor how long a user reads an article
History-Enriched Digital Objects ’94 Hill et al.
Trade off: Effort vs. Rewards ’95 Hill et al.
Finding Similar Tastes
Compute correlation coefficients between the user’s reviews and those of others
Use them as weights to combine the others’ ratings for the current article
Correlation avoids differences of scale interpretation
’94 Resnick et al.
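The correlation-weighted combination described on this slide can be sketched in a few lines. This is a minimal illustration, not the Grouplens implementation: the function names `pearson` and `predict`, the dictionary layout of the ratings, and the choice to fall back to the target user's own mean when no correlated rater exists are all assumptions made for the example.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation over the items both users rated (0.0 if undefined)."""
    common = [item for item in a if item in b]
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    cov = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    var_a = sum((a[i] - mean_a) ** 2 for i in common)
    var_b = sum((b[i] - mean_b) ** 2 for i in common)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def predict(target, others, item):
    """Predict the target user's rating of `item` from other users' ratings.

    Each other user's deviation from their own mean is weighted by their
    correlation with the target, so differently used scales do not matter.
    """
    target_mean = sum(target.values()) / len(target)
    weighted_sum = 0.0
    weight_total = 0.0
    for ratings in others:
        if item not in ratings:
            continue
        w = pearson(target, ratings)
        other_mean = sum(ratings.values()) / len(ratings)
        weighted_sum += w * (ratings[item] - other_mean)
        weight_total += abs(w)
    if weight_total == 0:
        return target_mean  # assumption: fall back to the user's own average
    return target_mean + weighted_sum / weight_total


# Hypothetical data: the second user rates on a doubled scale but has the same taste.
target = {"a": 1, "b": 2, "c": 3}
same_taste = {"a": 2, "b": 4, "c": 6, "d": 8}
print(predict(target, [same_taste], "d"))  # 5.0
```

In the example the other user's ratings are exactly double the target's, yet the correlation is 1.0, and the prediction is expressed on the target's own scale. That is the sense in which correlation avoids differences of scale interpretation.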
Cold Start Problem
Profile needed to find similar tastes
Training period
No immediate benefit for user (Grudin’s rule)
Restricted from new areas
’95 Maltz & Ehrlich
Active CF
Passive: no direct connection between evaluator & reader
Works for: many documents in a single database
Active: intent to share knowledge with particular people
Works for: distributed systems, where just finding sources is difficult
Benefit increases with the divergence of the documents
’95 Maltz & Ehrlich
Case Study: Computer Support Center
Expectation: workers use on-line or printed documentation to answer problems
Finding: rely on each other
Information mediator: skilled at finding and applying info
’95 Maltz & Ehrlich
Build a system to support…
Collaboration and information sharing amongst colleagues
Information mediators sending out references to, and commentary on, useful documents
’95 Maltz & Ehrlich
What informal methods are missing
Contextual information: name, source, date, sender information
Ease of use: add annotations; return benefits early - no cold start
Flexibility: method of distribution, comments and context; no set roles
’95 Maltz & Ehrlich
The Pointer System
Distribution of Pointers
Private database – bookmarks
Individuals
Subscribe-only mailing lists
Information digests
Pre-designed document – newsletters, reports, etc.
’95 Maltz & Ehrlich
Challenging Common Theories
Theory: comment providers should be anonymous
Finding: knowing something about the commenter is critical to evaluating the usefulness of that document
’95 Maltz & Ehrlich
Challenging Common Theories
Theory: information finders should be freed from addressing and sending mail
Finding: users really do have recipients in mind when they discover information
Irony of Active CF
Recipients are passive
Cannot use system to find reviewed information
’95 Maltz & Ehrlich
Summary
Choice under uncertainty
Benefit from knowledgeable people
Virtual community of experts (?)
Active CF systems help point colleagues to information
Passive CF systems help ‘explorers’ learn from the community