

1 AWS, HADOOP AND MAHOUT – VIDEO GAME RECOMMENDER
BEN GOODING
UNIVERSITY OF ARKANSAS – DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
PRESENTED APRIL 30, 2015

2 MAHOUT
Pronounced like "trout"
Open-source machine learning platform from Apache
Used Mahout 0.9

3 RECOMMENDER TYPES
Item-Item Based Recommenders
  Based on how similar items are to other items
User Based Recommenders
  Based on the notion of some similarity between users

4 NEIGHBORHOODS
Two types of neighborhoods:
  N-Nearest Neighbor
  Nearest Neighbor Threshold
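The two neighborhood types can be sketched in a few lines of Python. This is an illustration only, not Mahout's implementation: the function names and the input dictionary (other user's ID mapped to a precomputed similarity with the target user) are hypothetical, and treating the threshold as inclusive is an assumption.

```python
from typing import Dict, List


def nearest_n_neighborhood(similarities: Dict[str, float], n: int) -> List[str]:
    """Return the n users most similar to the target user.

    Ties are broken by user ID so the result is deterministic.
    """
    ranked = sorted(similarities, key=lambda user: (-similarities[user], user))
    return ranked[:n]


def threshold_neighborhood(similarities: Dict[str, float], threshold: float) -> List[str]:
    """Return every user whose similarity is at or above the threshold.

    Assumption: the cutoff is inclusive (>=).
    """
    return [user for user, sim in similarities.items() if sim >= threshold]


if __name__ == "__main__":
    sims = {"alice": 0.90, "bob": 0.50, "carol": 0.97}
    print(nearest_n_neighborhood(sims, 2))
    print(threshold_neighborhood(sims, 0.9))
```

A nearest-N neighborhood always has a fixed size (when enough users exist), while a threshold neighborhood can be empty or very large depending on how the similarity scores are distributed.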

5 SIMILARITIES
Euclidean Distance Similarity
  1/(1+d), where d is the distance between two users
Co-occurrence Similarity
  Explained by previous presentations
Tanimoto Coefficient
  Ignores user preference values; only cares that a user has expressed a preference
Log-likelihood Similarity
  Based on the number of items in common, but expresses how unlikely it is that the overlap between two users arose by chance
Pearson Correlation Similarity
  A number between -1 and 1
  Measures the tendency of two series of numbers, paired one-to-one, to move together
  When the series move together, the similarity is close to 1; when they move in opposite directions, it is close to -1
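A minimal Python sketch of three of these measures follows, written directly from the slide's definitions. It is not Mahout's code, and Mahout's own implementations may differ in details such as scaling. The function names and the representation of a user as a dictionary of item ID to rating are assumptions made for illustration; log-likelihood and co-occurrence are left out to keep the sketch short.

```python
import math
from typing import Dict

Ratings = Dict[str, float]  # hypothetical representation: item ID -> rating


def euclidean_similarity(a: Ratings, b: Ratings) -> float:
    """1 / (1 + d), where d is the Euclidean distance over co-rated items."""
    common = a.keys() & b.keys()
    if not common:
        return 0.0  # assumption: no overlap means no similarity
    d = math.sqrt(sum((a[i] - b[i]) ** 2 for i in common))
    return 1.0 / (1.0 + d)


def tanimoto_coefficient(a: Ratings, b: Ratings) -> float:
    """Size of the intersection over size of the union; rating values are ignored."""
    union = a.keys() | b.keys()
    if not union:
        return 0.0
    return len(a.keys() & b.keys()) / len(union)


def pearson_similarity(a: Ratings, b: Ratings) -> float:
    """Pearson correlation over co-rated items; NaN when it is undefined."""
    common = sorted(a.keys() & b.keys())
    if len(common) < 2:
        return math.nan
    xs = [a[i] for i in common]
    ys = [b[i] for i in common]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return math.nan  # a constant series has no defined correlation
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    u1 = {"g1": 5.0, "g2": 3.0, "g3": 1.0}
    u2 = {"g1": 4.0, "g2": 2.0, "g4": 5.0}
    print(euclidean_similarity(u1, u2))
    print(tanimoto_coefficient(u1, u2))
    print(pearson_similarity(u1, u2))
```

Note that Pearson is undefined when two users share fewer than two items or when one user's shared ratings are all identical, which is common in sparse review data.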

6 THE DATASET
228,570 users
21,025 games
463,669 reviews
Dataset contained excess information
  Stanford provided a Python script to parse the data, but it did not parse out enough
  Modified the Python script to parse out everything except User ID, Product ID, and Review Score
  Eliminated unknown user names
  Used gedit to remove some other excess information
  Wrote a C++ program to convert the User and Product IDs into numerical values
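The ID conversion step can be sketched as below. The original was a C++ program that is not shown in the slides, so this Python version is a stand-in under stated assumptions: the function name, the "unknown" marker for missing user names, and the `userID,itemID,score` output layout (the comma-separated form Mahout's file-based data model reads) are illustrative choices, not the author's code.

```python
from typing import Dict, Iterable, List, Tuple


def convert_ids(
    rows: Iterable[Tuple[str, str, str]],
    unknown_user: str = "unknown",  # assumption: marker used for missing user names
) -> Tuple[List[str], Dict[str, int], Dict[str, int]]:
    """Map string user and product IDs to consecutive integers.

    Returns the converted CSV lines plus both ID mappings so that
    recommendations can later be translated back to the original IDs.
    """
    user_ids: Dict[str, int] = {}
    product_ids: Dict[str, int] = {}
    lines: List[str] = []
    for user, product, score in rows:
        if user == unknown_user:
            continue  # drop reviews with no usable user name
        # setdefault assigns the next integer the first time an ID is seen
        uid = user_ids.setdefault(user, len(user_ids) + 1)
        pid = product_ids.setdefault(product, len(product_ids) + 1)
        lines.append(f"{uid},{pid},{score}")
    return lines, user_ids, product_ids


if __name__ == "__main__":
    sample = [
        ("A1", "B1", "5.0"),
        ("unknown", "B2", "3.0"),
        ("A2", "B1", "4.0"),
        ("A1", "B2", "2.0"),
    ]
    converted, users, products = convert_ids(sample)
    print("\n".join(converted))
```

Keeping the two mappings around matters for the display step later: the recommender works with integers, but a person looking at results wants the original product identifiers.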

7 USER BASED NEAREST-N RECOMMENDER EVALUATION
Similarity       n=1    n=2    n=4    n=8    n=16   n=32   n=64   n=128
Euclidean        NaN    0.205  0.284  0.361  0.498  0.542  0.604  0.646
Pearson          NaN    0.799  0.868  0.886  0.878  0.904  0.960  0.989
Log-likelihood   NaN    0.526  0.771  0.769  0.766  0.808  0.784  0.718
Tanimoto         NaN    0.723  0.955  0.826  0.792  0.807  0.822  0.755

8 USER BASED NEIGHBOR THRESHOLD RECOMMENDER EVALUATION
Thresholds: t = 0.95, 0.9, 0.85, 0.8, 0.75, 0.7
Similarity       Scores
Euclidean        0.503  0.504
Pearson          0.689  0.665  0.639  0.629  0.703
Log-likelihood   0.801  0.779  0.791  0.796  0.790  0.796
Tanimoto         NaN

9 ITEM BASED RECOMMENDER EVALUATION
Similarity       Score
Euclidean        0.786
Pearson          0.944
Log-likelihood   0.789
Tanimoto         0.783
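The slides do not say which metric the evaluation tables report. As a rough illustration only, the sketch below assumes a mean-absolute-error style score computed over held-out ratings; the function name and input format are hypothetical. It also shows one way a NaN score could arise, namely when the recommender cannot estimate any held-out rating, though the slides do not confirm that this is what happened here.

```python
import math
from typing import Iterable, Tuple


def mean_absolute_error(pairs: Iterable[Tuple[float, float]]) -> float:
    """Average |estimated - actual| over held-out ratings.

    Pairs whose estimate is NaN (no estimate could be made) are skipped.
    If nothing could be estimated, the score itself is NaN.
    """
    errors = [
        abs(estimated - actual)
        for estimated, actual in pairs
        if not math.isnan(estimated)
    ]
    if not errors:
        return math.nan
    return sum(errors) / len(errors)


if __name__ == "__main__":
    print(mean_absolute_error([(4.0, 5.0), (3.0, 2.5)]))
    print(mean_absolute_error([(math.nan, 5.0)]))
```

Under this assumed metric a lower score would mean better predictions, so the tables should be read with that caveat in mind.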

10 HADOOP
Distributed file system
Difficult to set up without an easy-to-understand tutorial
Got it working on my virtual machine
Couldn’t get Mahout to work with Hadoop as a single-node cluster
  Java ClassNotFoundException

11 AMAZON WEB SERVICES
Provides Elastic MapReduce clusters
Pre-installed with Mahout and Hadoop
Used 1 Master Node and 3 Slaves
Utilized the AWS Command Line Interface

12 AWS RECOMMENDER
Took roughly 10-20 minutes to produce all of the recommendations
Used the item-based recommender
  No distributed Generic User Based Recommender
Generated recommendations for the users
Utilized a Python-based web server to display recommendations
  Input a user ID; it returns that user's recommendations
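The display step might look something like the sketch below. The author's server code is not in the slides, so everything here is an assumption: the function names, the `?user=` query parameter, and the input layout, which is taken to be one line per user of the form `userID<TAB>[itemID:score,itemID:score,...]` as written by Mahout's distributed item-based job. The server is defined but not started.

```python
from http.server import BaseHTTPRequestHandler
from typing import Dict, Iterable, List, Tuple
from urllib.parse import parse_qs, urlparse

Recs = Dict[int, List[Tuple[int, float]]]


def parse_line(line: str) -> Tuple[int, List[Tuple[int, float]]]:
    """Parse 'userID<TAB>[item:score,item:score]' (assumed output layout)."""
    user, _, rest = line.strip().partition("\t")
    body = rest.strip().strip("[]")
    recs: List[Tuple[int, float]] = []
    for pair in body.split(","):
        if not pair:
            continue  # user with an empty recommendation list
        item, _, score = pair.partition(":")
        recs.append((int(item), float(score)))
    return int(user), recs


def load_recommendations(lines: Iterable[str]) -> Recs:
    """Build a user ID -> recommendations lookup from the job output lines."""
    return dict(parse_line(line) for line in lines if line.strip())


def make_handler(recs: Recs):
    """Create a request handler that answers GET /?user=<id> with plain text."""

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self) -> None:
            query = parse_qs(urlparse(self.path).query)
            try:
                user = int(query.get("user", [""])[0])
            except ValueError:
                self.send_error(400, "expected ?user=<numeric id>")
                return
            body = "\n".join(f"{item}\t{score}" for item, score in recs.get(user, []))
            data = body.encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "text/plain; charset=utf-8")
            self.send_header("Content-Length", str(len(data)))
            self.end_headers()
            self.wfile.write(data)

    return Handler
```

To try it locally, one could pass the handler to the standard library's `HTTPServer(("localhost", 8000), make_handler(recs)).serve_forever()` and request `/?user=42` in a browser.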

13 FUTURE WORK
Attempt to use Parallel ALS recommendations
  Should provide more accurate results than the item-based recommender
Code available upon request, along with AWS Command Line commands

