cs Future Direction: Collaborative Filtering
Motivating Observations: Relevance feedback is useful, but expensive
a) Humans often don’t have time to give positive/negative judgments on a long list of returned web pages to improve individual searches
b) Effort is used once, then wasted
Want: pooling and re-use of efforts across individuals
Collaborative Filtering
Motivating Observations (continued): Relevance ≠ Quality
Queries: bootleg CD’s, NAFTA, Medical School Admissions, Simulated Annealing, REM, Alzheimer’s
Many web pages can be “about” a topic (specialized unit)
But there are great differences in quality of presentation, detail, professionalism, substance, etc.
Possible Solution: build a supervised learner for quality, NOT topic matter
Train on examples of each, learn distinguishing properties
One Solution: Supervised Learner for “Quality” of a Page
P(Quality | Features), in addition to topic similarity
Salient features may include:
  # of links
  Size
  How often cited
  Variety of content
  “Top 5th of Web” awards, etc.
  Assessment of usage: counter (hit count)
  Complexity of graphics (quality??)
  Prior quality rating of server
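A minimal sketch of estimating P(Quality | Features), assuming a logistic regression model fit by gradient descent. The slide does not name a learner, so this choice is an assumption, as are the three features used (link count, citation count, hit count, each pre-scaled to [0, 1]) and the made-up training rows.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_quality_model(rows, labels, lr=0.1, epochs=2000):
    """Fit logistic regression weights by batch gradient descent."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            # Prediction error for this page: P(quality | x) minus the label.
            err = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias

def p_quality(model, x):
    """Estimated P(Quality = high | Features = x)."""
    weights, bias = model
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))

# Hypothetical training data, invented for illustration only.
# Columns: link count, citation count, hit count (scaled to [0, 1]).
TRAIN_X = [[0.9, 0.8, 0.7], [0.8, 0.9, 0.6], [0.7, 0.7, 0.9],
           [0.2, 0.1, 0.3], [0.1, 0.2, 0.1], [0.3, 0.1, 0.2]]
TRAIN_Y = [1, 1, 1, 0, 0, 0]  # 1 = judged high quality, 0 = low quality

MODEL = train_quality_model(TRAIN_X, TRAIN_Y)
```

In use, p_quality(MODEL, features) would be combined with a separate topic-similarity score, since the learner is meant to capture quality rather than topic.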
Collaborative Filtering
Problem: Different humans have different profiles of relevance/quality
Query: Alzheimer’s disease
[Figure: documents or web pages marked as relevant (high quality) or appropriate for different user profiles: 6th Grader, Care Giver, Medical Researcher]
One Solution: Pool collective wisdom and compute a weighted average of page rankings across multiple users in an affinity group (taking into account topic relevance, quality, and other intangibles)
Hypothesis: humans have a better idea than machines of what other humans will find interesting
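The weighted average can be sketched as follows, assuming each user in the affinity group has a non-negative affinity weight toward the searcher and has given a numeric rating to the page. The function name, the rating scale, and the example values are hypothetical.

```python
def pooled_score(ratings, affinity):
    """Affinity-weighted average of the ratings given to one page.

    ratings:  {user: rating of the page}
    affinity: {user: non-negative weight of that user for the searcher}
    Returns None when no rater carries any weight.
    """
    rated_by_group = [u for u in ratings if u in affinity]
    total_weight = sum(affinity[u] for u in rated_by_group)
    if total_weight == 0:
        return None
    return sum(affinity[u] * ratings[u] for u in rated_by_group) / total_weight

# Hypothetical example: "ann" is weighted three times as heavily as "bob".
example = pooled_score({"ann": 5, "bob": 1}, {"ann": 3, "bob": 1})
```

Users outside the affinity group are ignored rather than given a default weight, which keeps the score from being diluted by unrelated raters.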
Collaborative Filtering
Idea: instead of trying to model (often intangible) quality judgments, keep a record of previous human relevance and quality judgments
Query: Alzheimer’s
[Table of user rankings of web pages for a query; axes: Web pages and Users; labels: A B C D E F G]
Solution 1: Identify an individual with similar tastes (high Pearson’s coefficient on similar ranking judgments)
instead of: P(relevant to me | Page i content)
compute: P(relevant to me | relevant to you) * P(relevant to you | Page i content)
  P(relevant to me | relevant to you) = my similarity to you
  P(relevant to you | Page i content) = your judgments
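A rough sketch of Solution 1, assuming each user's judgments are stored as a dict from page id to a numeric rating and that similarity is Pearson's coefficient computed over the pages both users have rated. The user names, page ids, and ratings are invented for illustration.

```python
import math

def pearson(a, b):
    """Pearson's coefficient over the pages rated by both users."""
    common = sorted(set(a) & set(b))
    n = len(common)
    if n < 2:
        return 0.0  # not enough overlap to compare tastes
    mean_a = sum(a[p] for p in common) / n
    mean_b = sum(b[p] for p in common) / n
    cov = sum((a[p] - mean_a) * (b[p] - mean_b) for p in common)
    var_a = sum((a[p] - mean_a) ** 2 for p in common)
    var_b = sum((b[p] - mean_b) ** 2 for p in common)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant rater carries no ranking information
    return cov / math.sqrt(var_a * var_b)

def most_similar_user(me, others):
    """Name of the user whose judgments correlate best with mine."""
    return max(others, key=lambda name: pearson(me, others[name]))

# Hypothetical judgments on a 1-5 scale.
ME = {"p1": 5, "p2": 3, "p3": 1}
OTHERS = {
    "you":  {"p1": 4, "p2": 3, "p3": 2, "p4": 5},
    "them": {"p1": 1, "p2": 3, "p3": 5, "p4": 1},
}

best = most_similar_user(ME, OTHERS)
# Use the similar user's judgment on a page I have not rated yet.
suggested_rating = OTHERS[best]["p4"]
```

Here the neighbor's own rating stands in for P(relevant to you | Page i content); scaling it by the similarity value would give the product on the slide.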
Solution 2: Model Group Profiles for relevance judgments (e.g. Junior High School vs. Medical Researchers)
compute: P(relevant to me | relevant to group g) * P(relevant to group g | Page i content)
  P(relevant to me | relevant to group g) = my similarity to the group
  P(relevant to group g | Page i content) = group’s collective (avg) relevance judgments (supervised learning)
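Solution 2 might be sketched like this, assuming a group profile is simply the per-page average of its members' ratings and that my similarity to the group is Pearson's coefficient against that profile, clamped at zero. The slide does not specify either choice, so both are assumptions, and the group names and ratings are made up.

```python
import math

def group_profile(members):
    """Average each page's rating over the members who rated it."""
    totals, counts = {}, {}
    for ratings in members:
        for page, r in ratings.items():
            totals[page] = totals.get(page, 0.0) + r
            counts[page] = counts.get(page, 0) + 1
    return {page: totals[page] / counts[page] for page in totals}

def similarity(a, b):
    """Pearson's coefficient over shared pages, clamped to [0, 1]."""
    common = sorted(set(a) & set(b))
    n = len(common)
    if n < 2:
        return 0.0
    mean_a = sum(a[p] for p in common) / n
    mean_b = sum(b[p] for p in common) / n
    cov = sum((a[p] - mean_a) * (b[p] - mean_b) for p in common)
    var_a = sum((a[p] - mean_a) ** 2 for p in common)
    var_b = sum((b[p] - mean_b) ** 2 for p in common)
    if var_a == 0 or var_b == 0:
        return 0.0
    return max(0.0, cov / math.sqrt(var_a * var_b))

def group_score(me, members, page):
    """(my similarity to the group) * (group's average judgment of page)."""
    profile = group_profile(members)
    if page not in profile:
        return None
    return similarity(me, profile) * profile[page]

# Hypothetical groups and judgments on a 1-5 scale.
JUNIOR_HIGH = [{"p1": 5, "p2": 1, "p3": 4}, {"p1": 3, "p2": 1, "p3": 4}]
RESEARCHERS = [{"p1": 1, "p2": 5, "p3": 2}, {"p1": 1, "p2": 3, "p3": 4}]
ME = {"p1": 1, "p2": 5}

score_researchers = group_score(ME, RESEARCHERS, "p3")
score_junior_high = group_score(ME, JUNIOR_HIGH, "p3")
```

Clamping negative correlations to zero means a group with opposite tastes contributes nothing, rather than producing a negative score.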