Learning Bit by Bit Collaborative Filtering/Recommendation Systems.


1 Learning Bit by Bit Collaborative Filtering/Recommendation Systems

2 Collaborative Filtering

3 Collaborative Filtering: Definition. Traversing a large body of information contributed “collaboratively” by many different people in order to find similarities between users or things, then bootstrapping these similarities to make recommendations to users.

4 Key Components
- tracking user behavior

5 Key Components
- tracking user behavior
- storing this data long term

6 Key Components
- tracking user behavior
- storing this data long term
- mining this data for patterns (similarity)

7 Key Components
- tracking user behavior
- storing this data long term
- mining this data for patterns (similarity)
- predicting future behavior (recommendation)

8 Data!

9 Similarity

10 Similarity is a quantity that reflects the strength of relationship between two objects or two features.

11 Similarity. Ratings: movies, songs, restaurants… The user did the hard part: quantifying their feelings.

12 Similarity
- Use the info to inform others
- Notice trends between users
- Suggest new content, products …

13 Similarity Feature Space

14

15 Cull relevant information from data to create a limited portrait of a person, thing, event, or behavior.

16

17 Euclidean Distance. Distance between point A and point B: $d(A, B) = \sqrt{(A_1 - B_1)^2 + (A_2 - B_2)^2}$

18 Distance between Rose and Seymour: $\sqrt{(3 - 2)^2 + (4 - 2)^2} = \sqrt{5} \approx 2.236$

19 Euclidean Distance: $d(A, B) = \sqrt{(A_1 - B_1)^2 + (A_2 - B_2)^2 + (A_3 - B_3)^2 + (A_4 - B_4)^2 + \dots + (A_n - B_n)^2}$, where $n$ is the number of dimensions, or features, you are looking at.
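The n-dimensional distance on slides 17 to 19 can be sketched in a few lines of Python. This is only an illustration, not the demo code from the deck; the function name `euclidean_distance` is an assumption, and the two points are the Rose and Seymour values from slide 18.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("both points need the same number of features")
    # Sum the squared per-feature differences, then take one square root
    # over the whole sum.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


rose = (3, 4)      # feature values taken from slide 18
seymour = (2, 2)
print(round(euclidean_distance(rose, seymour), 3))  # 2.236
```

The same function works for any number of features, as long as both points list them in the same order.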

20 Similarity as Distance Somewhat reciprocal: distance as a measure of dissimilarity

21 Similarity as Distance. Somewhat reciprocal: distance as a measure of dissimilarity. Naïve similarity = 1 / (1 + distance)
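One common way to turn a distance into a similarity score is 1 / (1 + distance): identical points score 1, and the score falls toward 0 as the points move apart. A minimal sketch, with the hypothetical function name `naive_similarity`:

```python
def naive_similarity(distance):
    """Map a distance in [0, inf) to a similarity in (0, 1]."""
    if distance < 0:
        raise ValueError("distance cannot be negative")
    # Adding 1 in the denominator avoids dividing by zero when the
    # two points are identical (distance 0 gives similarity 1.0).
    return 1.0 / (1.0 + distance)


print(naive_similarity(0.0))              # 1.0
print(round(naive_similarity(2.236), 3))  # 0.309
```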

22 Demo: SimilarityMetrics in iweb2.ch3.collaborative.data

23 Pearson Correlation

24 [Figure: scatter plots labeled A, B, C, D illustrating positive correlation, negative correlation, and no correlation]
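Pearson correlation (slides 23 and 24) measures how closely two users' ratings move together, on a scale from -1 to 1. The sketch below uses the standard textbook formula; the function name `pearson` and the choice to return 0.0 when one side has no variance are assumptions made for this example.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length rating lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # Assumption: a user who gives every item the same rating is
        # treated as uncorrelated with everyone.
        return 0.0
    return covariance / math.sqrt(var_x * var_y)


print(pearson([1, 2, 3], [2, 4, 6]))  # 1.0  (positive correlation)
print(pearson([1, 2, 3], [6, 4, 2]))  # -1.0 (negative correlation)
```

Because the means are subtracted first, a user who rates everything one point higher than another still correlates perfectly with them, which is why Pearson is useful when rating scales need normalizing.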

25 Jaccard Index
Ratio of the intersection to the union of two sets: points in agreement / total points.
- User1: liked Movie1, disliked Movie2, liked Movie3, liked Movie4, disliked Movie5
- User2: liked Movie1, disliked Movie2, disliked Movie3, disliked Movie4, liked Movie5
- Points in agreement: Movie1, Movie2
- Total points: Movie1, Movie2, Movie3, Movie4, Movie5
- Index: 2 / 5 = 0.4
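The slide's example counts the movies on which both users gave the same verdict and divides by all movies rated. The sketch below shows that "points in agreement / total points" calculation next to the usual set-based Jaccard index applied to liked movies only. Both function names are hypothetical, and encoding liked/disliked as `True`/`False` is an assumption.

```python
def jaccard(a, b):
    """Classic Jaccard index: size of intersection / size of union."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # assumption: two empty sets count as no similarity
    return len(a & b) / len(union)


def agreement_ratio(ratings_a, ratings_b):
    """Items both users rated the same way / all items rated by either."""
    items = set(ratings_a) | set(ratings_b)
    if not items:
        return 0.0
    agree = sum(
        1 for item in items
        if item in ratings_a and item in ratings_b
        and ratings_a[item] == ratings_b[item]
    )
    return agree / len(items)


# True = liked, False = disliked (data from slide 25)
user1 = {"Movie1": True, "Movie2": False, "Movie3": True,
         "Movie4": True, "Movie5": False}
user2 = {"Movie1": True, "Movie2": False, "Movie3": False,
         "Movie4": False, "Movie5": True}

print(agreement_ratio(user1, user2))  # 0.4

liked1 = {m for m, liked in user1.items() if liked}
liked2 = {m for m, liked in user2.items() if liked}
print(jaccard(liked1, liked2))        # 0.25
```

The two numbers differ because the set-based version ignores shared dislikes; which one suits a system depends on whether a shared dislike should count as agreement.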

26 Similarity Metrics Summary
- Euclidean Distance: general purpose
- Pearson: normalizing data, finding a relationship
- Jaccard: categorical data

27 User-based vs. Item-based

28 Recommendations: User-based
- Find items similar users liked
- Weighted average to predict a rating for a user
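A similarity-weighted average can be sketched as follows: each other user's rating for the item counts in proportion to how similar that user is to the target user. The function name `predict_rating`, the nested-dict layout, the user names, and the similarity values are all hypothetical, chosen only for illustration.

```python
def predict_rating(target, item, ratings, similarity):
    """Predict target's rating for item as a similarity-weighted average.

    ratings:    dict of user -> dict of item -> rating
    similarity: callable (user_a, user_b) -> float, higher means more alike
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for other, other_ratings in ratings.items():
        if other == target or item not in other_ratings:
            continue
        weight = similarity(target, other)
        if weight <= 0:
            continue  # assumption: ignore dissimilar users entirely
        weighted_sum += weight * other_ratings[item]
        weight_total += weight
    if weight_total == 0:
        return None  # nobody similar has rated this item
    return weighted_sum / weight_total


# Made-up data: cat has not rated X yet.
ratings = {"ann": {"X": 5}, "bob": {"X": 1}, "cat": {}}
sims = {("cat", "ann"): 0.75, ("cat", "bob"): 0.25}
print(predict_rating("cat", "X", ratings,
                     lambda a, b: sims.get((a, b), 0.0)))  # 4.0
```

Here cat is far more similar to ann than to bob, so the prediction lands much closer to ann's 5 than to bob's 1.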

29 Weights

30

31 Recommendations: Item-based. Similarity between items is used instead of similarity between users.
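One way to picture the item-based approach is to flip the ratings table so each item maps to the users who rated it, then compare items the same way users were compared. The sketch below assumes a nested-dict layout and uses a distance-based similarity of 1 / (1 + distance); the function names and sample data are hypothetical.

```python
import math


def invert_ratings(ratings):
    """Turn user -> item -> rating into item -> user -> rating."""
    by_item = {}
    for user, items in ratings.items():
        for item, rating in items.items():
            by_item.setdefault(item, {})[user] = rating
    return by_item


def item_similarity(by_item, item_a, item_b):
    """Similarity of two items over the users who rated both."""
    common = set(by_item[item_a]) & set(by_item[item_b])
    if not common:
        return 0.0  # assumption: no shared raters means no evidence
    distance = math.sqrt(sum(
        (by_item[item_a][u] - by_item[item_b][u]) ** 2 for u in common
    ))
    return 1.0 / (1.0 + distance)


# Made-up data for illustration.
ratings = {
    "u1": {"X": 5, "Y": 5},
    "u2": {"X": 3, "Y": 3, "Z": 1},
}
by_item = invert_ratings(ratings)
print(item_similarity(by_item, "X", "Y"))  # 1.0 (rated identically)
print(item_similarity(by_item, "X", "Z"))  # 1 / (1 + 2)
```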

32 Recommendations: difference

33 User-based

34 Item-based

35 Types of Collaborative Filtering
User-Based
- index of user similarities is stored
- "Other users similar to you liked X"
Item-Based
- index of item similarities is stored
- "Other people who liked X also liked Y"
- faster for large data sets with less overlapping data between users (sparse)

36 Testing
- Almost impossible to intuit accuracy
- Save a portion of the known data for testing
- Example: if you have 100 users with 10 ratings each, randomly spot-check the accuracy of rating prediction on 10% of the data
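A hold-out check along these lines might look like the sketch below. It assumes the 10% refers to individual ratings (100 of the 1,000), uses mean absolute error as the accuracy measure, and stands in a constant predictor for a real recommender; all of these choices, along with the function names and generated data, are assumptions for illustration.

```python
import random


def split_ratings(triples, test_fraction=0.1, seed=0):
    """Randomly split (user, item, rating) triples into train and test."""
    rng = random.Random(seed)  # fixed seed keeps the split repeatable
    shuffled = list(triples)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def mean_absolute_error(test, predict):
    """Average absolute gap between predicted and held-out ratings."""
    errors = [abs(predict(user, item) - rating) for user, item, rating in test]
    return sum(errors) / len(errors) if errors else 0.0


# Made-up data: 100 users with 10 ratings each.
rng = random.Random(42)
data = [(f"user{u}", f"item{i}", rng.randint(1, 5))
        for u in range(100) for i in range(10)]

train, test = split_ratings(data, test_fraction=0.1)
print(len(train), len(test))  # 900 100

# Placeholder predictor: always guess the middle of the 1-5 scale.
print(mean_absolute_error(test, lambda user, item: 3.0))
```

A real test would build the recommender from `train` only and pass its prediction function in place of the placeholder.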

