A Hybrid Recommender System Using Link Analysis and Genetic Tuning in the Bipartite Network of BoardGameGeek.com
Brett Boge, CS 765, University of Nevada, Reno
Recap | Data | General Approach | Step 1: Link-analysis | Step 2: Content-based Cascade | Step 3: Genetic tuning
Data (Overview): Users: 400,000+; Games: 55,000+; Ratings: 0–3000 each
Data (Scope): Starting with the top 5,000 games. The list of users consists of those who have rated at least one of the top 5,000 games. Users with no ratings cannot be connected to any component of the graph and can only be evaluated in the most general sense.
Data (Retrieval): Data will be obtained through the BGG XML API2.
Game | Small World, id 40692: thing?id=40692&ratingcomments=1
User | Licinian: user?name=Licinian; collection?name=Licinian &own/played/trade/want/wishlist/etc
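The request strings on this slide can be assembled programmatically. The sketch below only builds the URLs and performs no network access. The base URL and the `own=1` filter value are assumptions not stated on the slide, and `build_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Assumed base URL for the BGG XML API2; the slide gives only the endpoint part.
BASE_URL = "https://boardgamegeek.com/xmlapi2"


def build_url(endpoint, **params):
    """Join an endpoint name and its query parameters into a request URL."""
    return f"{BASE_URL}/{endpoint}?{urlencode(params)}"


# Game example from the slide: Small World, id 40692, with rating comments.
game_url = build_url("thing", id=40692, ratingcomments=1)

# User examples from the slide: profile and collection for "Licinian".
user_url = build_url("user", name="Licinian")
# own=1 is an assumed way to apply the "own" filter listed on the slide.
collection_url = build_url("collection", name="Licinian", own=1)

print(game_url)
print(user_url)
print(collection_url)
```

Keeping URL construction separate from fetching makes it easy to inspect the requests before running them against the live service.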
Data (Sets): Ratings/Ownership Data split into a Teaching Set (70%) and a Testing Set (30%, hopefully most recent)
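One way to realize a 70/30 split in which the testing set holds the most recent records is to sort by time and cut once. This is a minimal sketch; the record layout (a dict with a `timestamp` key) and the function name are illustrative assumptions, not part of the original design.

```python
def chronological_split(ratings, train_percent=70):
    """Split ratings so the oldest train_percent% teach and the newest test.

    `ratings` is assumed to be a list of dicts with a sortable "timestamp" key.
    """
    ordered = sorted(ratings, key=lambda r: r["timestamp"])
    # Integer arithmetic avoids floating-point surprises at the cut point.
    cut = len(ordered) * train_percent // 100
    return ordered[:cut], ordered[cut:]


# Hypothetical sample data: ten ratings with increasing timestamps.
sample = [{"user": "u", "game": g, "rating": 7.0, "timestamp": g} for g in range(10)]
teaching_set, testing_set = chronological_split(sample)
print(len(teaching_set), len(testing_set))
```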
General Approach
Content Based: User & Item profiles, based on content specific to that object (properties)
Collaborative Based: Users & Items similar to those liked/owned in the past; more abstract, only links matter
General Approach: Methods of Hybrid Filtering
Weighted; Switched; Mixed; Feature combination; Cascade
From R. Burke, "Hybrid recommender systems: Survey and experiments"
General Approach: Our Method
Link-analysis: As described by Huang et al. in "A Link analysis approach to recommendation under sparse data"; a PageRank-style analysis of hubs and authorities
Content-based: Refines the previous results; uses information about the items themselves to adjust ranking; will need tuning
Link Analysis Step: Overview
Link Analysis: Consumer-Product Matrix; Consumer Representativeness Matrix; Product Representativeness Matrix
From Z. Huang, et al., "A Link analysis approach to recommendation under sparse data"
Link Analysis Step: Matrix Definitions
Product Representativeness Matrix; Consumer Representativeness Matrix
From Z. Huang, et al., "A Link analysis approach to recommendation under sparse data"
Link Analysis Step: Initialization
Consumer Representativeness Matrix; Product Representativeness Matrix
From Z. Huang, et al., "A Link analysis approach to recommendation under sparse data"
Link Analysis Step: Update Phase
$PR = CR \cdot B$ (Product Representativeness Matrix)
$CR = PR \cdot C^{T} + CR_{0}$ (Consumer Representativeness Matrix)
From Z. Huang, et al., "A Link analysis approach to recommendation under sparse data"
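The two update equations can be iterated as in the sketch below. The slide text does not preserve the definitions of B, C, or CR_0, so every concrete choice here is an assumption made for illustration: B is taken to be the raw consumer-product matrix, C is that matrix with each row divided by its row sum raised to a hypothetical exponent `gamma`, CR_0 is the identity, and CR is row-normalized after each pass to keep values bounded.

```python
import numpy as np


def link_analysis(A, gamma=0.9, iterations=10):
    """Iterate PR = CR . B and CR = PR . C^T + CR_0 on a users x games matrix A.

    All matrix definitions below are illustrative assumptions, not the
    definitions from the original slides.
    """
    A = np.asarray(A, dtype=float)
    n_users = A.shape[0]

    B = A  # assumed: unnormalized consumer-product matrix
    row_sums = A.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0  # avoid dividing by zero for empty rows
    C = A / row_sums ** gamma  # assumed: row-normalized consumer-product matrix

    CR0 = np.eye(n_users)  # assumed: each user starts out representing only themselves
    CR = CR0.copy()
    for _ in range(iterations):
        PR = CR @ B
        CR = PR @ C.T + CR0
        # Assumed normalization step so the scores do not grow without bound.
        CR = CR / CR.sum(axis=1, keepdims=True)

    PR = CR @ B  # final product scores: one row per user, one column per game
    return PR, CR


# Hypothetical toy data: 2 users x 3 games, 1 = rated/owned.
PR, CR = link_analysis([[1, 1, 0], [0, 1, 1]])
print(PR)
```

In the toy example, user 0 never rated game 3 but still receives a positive score for it through the shared link to game 2 with user 1, which is the behavior a link-based method is meant to provide under sparse data.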
Content-based Cascade: Product Representativeness Result
Product Representativeness Matrix (row i is PR_i):
         Game 1   Game 2   Game 3
User A   x        x        x
User B   PR_21    PR_22    PR_23
User C   x        x        x
Content-based Cascade: Additional Data
Property: Description
Subdomain (S): General type of game (Strategy, Family, Party)
Category (C): Genre/specific type of game (Civilization, Territory Building)
Playing Time (P): Publisher provided, in minutes
Mechanic (M): Game mechanics used (Dice Rolling, Variable Powers)
Suggested best Number of players (N): User voted best number of players to play the game
Content-based Cascade: Similarity Measures
Property: Similarity
Subdomain (S): Cosine
Category (C): Cosine
Playing Time (P): Error
Mechanic (M): Cosine
Suggested best Number of players (N): Error
These will need to be normalized on the same scale ( )
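The two kinds of measure named above might look like the following sketch. Treating tag-valued properties (subdomain, category, mechanic) as binary vectors, mapping the numeric error onto [0, 1], and the `max_diff` scaling parameter are all assumptions; the slide does not state the target scale.

```python
import math


def cosine_similarity(tags_a, tags_b):
    """Cosine similarity between two tag sets treated as binary vectors."""
    a, b = set(tags_a), set(tags_b)
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def error_similarity(value_a, value_b, max_diff):
    """Turn an absolute error into a similarity in [0, 1].

    `max_diff` is a hypothetical normalization constant, e.g. the largest
    difference considered meaningful for the property.
    """
    return max(0.0, 1.0 - abs(value_a - value_b) / max_diff)


# Mechanic (M): one shared mechanic out of two and one.
mechanic_sim = cosine_similarity({"Dice Rolling", "Variable Powers"}, {"Dice Rolling"})
# Playing Time (P): 60 vs 90 minutes, assuming a 120-minute maximum difference.
time_sim = error_similarity(60, 90, 120)
print(mechanic_sim, time_sim)
```

Both functions return values in [0, 1], so the five property similarities could be combined without further rescaling under these assumptions.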
Content-based Cascade: Product Similarity Matrix
         S   C   P   M   N
Game …
Content-based Cascade: Refining the Product Ranking
Create PRfinal by refining PR. W is a vector of weights that determines how much a given property should affect the original score.
Genetic Tuning: Determining an Optimal W
W needs to be defined optimally for this given domain
A genetic algorithm will be used to tune W
Chromosome = sequential binary representation of W
Fitness based on Rank Score (from Huang et al.)
8 bits per weight, ranging from to start
Rates of crossover/mutation TBD
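The chromosome encoding described above can be sketched as follows: each weight occupies 8 consecutive bits, and standard single-point crossover and bit-flip mutation operate on the bit list. The weight range is not preserved in the slide text, so `low` and `high` are hypothetical parameters; the operator choices and rates are likewise assumptions, and the fitness evaluation (Rank Score) is left out.

```python
import random

BITS_PER_WEIGHT = 8  # from the slide: 8 bits per weight


def decode(chromosome, low, high):
    """Map a flat list of bits to a weight vector W, one weight per 8 bits.

    `low` and `high` are hypothetical bounds; the slide's range is not given.
    """
    weights = []
    for i in range(0, len(chromosome), BITS_PER_WEIGHT):
        gene = chromosome[i:i + BITS_PER_WEIGHT]
        value = int("".join(str(bit) for bit in gene), 2)
        weights.append(low + (high - low) * value / (2 ** BITS_PER_WEIGHT - 1))
    return weights


def crossover(parent_a, parent_b, rng):
    """Single-point crossover: head of parent_a joined to tail of parent_b."""
    point = rng.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:]


def mutate(chromosome, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in chromosome]


rng = random.Random(0)
# Five properties (S, C, P, M, N) give a 40-bit chromosome.
parent_a = [0] * 40
parent_b = [1] * 40
child = mutate(crossover(parent_a, parent_b, rng), rate=0.01, rng=rng)
print(decode(child, 0.0, 1.0))
```

A fitness function would decode each chromosome to W, refine the ranking with it, and score the result on the testing set; that step is omitted because it depends on the refinement formula.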
Conclusion / Questions