Published by Aubrey Fisher. Modified over 9 years ago.
1
Music Recommendation: A Data Mining Approach
Daniel McEnnis, 2nd year PhD
2
Overview
- High level overview
- Toolkit improvements
- Experiments
- Evaluation
- Algorithms research
- Data
- Future work
3
Project Goals
- Integrate social information
- Make algorithms ‘culturally aware’
- Implement existing algorithms
- Systematic evaluation framework
4
Similarity Algorithms
- Create new relations based on some aspect of similarity
- 6 different varieties of similarity
- Each algorithm can use one of 6 distance functions
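A minimal sketch of the idea on this slide: a distance function is applied to pairs of actors, and pairs that are close enough receive a new relation. The slides do not list the 6 distance functions or show the Graph-RAT API, so cosine distance, the names `cosine_distance` and `similarity_relations`, the threshold, and the sample profiles are all assumptions made for illustration.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * b[k] for k, w in a.items() if k in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat an empty profile as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def similarity_relations(profiles, distance, threshold):
    """Return new (actor, actor, distance) links for pairs within threshold."""
    names = sorted(profiles)
    links = []
    for i, u in enumerate(names):
        for v in names[i + 1:]:
            d = distance(profiles[u], profiles[v])
            if d <= threshold:
                links.append((u, v, d))
    return links


# Hypothetical listener profiles: tag -> weight.
profiles = {
    "alice": {"jazz": 3.0, "swing": 1.0},
    "bob": {"jazz": 6.0, "swing": 2.0},
    "carol": {"metal": 5.0},
}
links = similarity_relations(profiles, cosine_distance, threshold=0.5)
```

Passing the distance function as a parameter mirrors the slide's point that each similarity algorithm can be paired with any one of several distance functions.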
5
Aggregator Algorithms
- Take data from one set of actors and move it to another
- 6 different varieties
- Each variety uses one of 7 aggregator functions
- Basic building block of Graph-RAT applications
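As a rough illustration of what moving data from one set of actors to another could look like, the sketch below copies tag weights from songs onto the users who listened to them and combines them with a pluggable aggregator function. The function `aggregate`, the choice of sum and mean as aggregators, and the sample data are assumptions; the slides name neither the 7 aggregator functions nor the Graph-RAT interfaces.

```python
from collections import defaultdict


def aggregate(links, source_props, combine):
    """Move properties from source actors to destination actors along links.

    links: iterable of (destination, source) pairs
    source_props: {source: {property: value}}
    combine: function that reduces a list of values to a single value
    """
    collected = defaultdict(lambda: defaultdict(list))
    for dest, src in links:
        for prop, value in source_props.get(src, {}).items():
            collected[dest][prop].append(value)
    return {
        dest: {prop: combine(values) for prop, values in props.items()}
        for dest, props in collected.items()
    }


def mean(values):
    return sum(values) / len(values)


# Hypothetical data: tag weights on songs, and who listened to which song.
song_tags = {"song1": {"jazz": 1.0, "swing": 0.5}, "song2": {"jazz": 0.5}}
listens = [("alice", "song1"), ("alice", "song2"), ("bob", "song2")]

by_sum = aggregate(listens, song_tags, sum)
by_mean = aggregate(listens, song_tags, mean)
```

Swapping `sum` for `mean` changes only the `combine` argument, which is one way to read the slide's statement that each variety uses one of several aggregator functions.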
6
Graph Triples Census
- Probable novel algorithm
- Proof of correctness completed
- Proof of time complexity completed
- Literature review in progress
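The slides do not describe the novel algorithm itself, so the sketch below is only a brute-force baseline that shows what a census of triples computes on an undirected graph: every set of three nodes is classified by how many of its three possible edges are present. The function name `triad_census` and the sample graph are assumptions, and this cubic-time enumeration should not be read as the author's method.

```python
from itertools import combinations


def triad_census(nodes, edges):
    """Count node triples by number of edges present (0, 1, 2, or 3).

    Brute force over all triples, so the cost grows with the cube of the
    number of nodes.
    """
    adjacent = {frozenset(edge) for edge in edges}
    counts = [0, 0, 0, 0]
    for a, b, c in combinations(sorted(nodes), 3):
        present = sum(
            frozenset(pair) in adjacent for pair in ((a, b), (a, c), (b, c))
        )
        counts[present] += 1
    return counts


# Hypothetical graph: a triangle a-b-c with a pendant node d attached to c.
census = triad_census("abcd", [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")])
```

Here `census[3]` is the number of closed triangles and `census[2]` the number of open two-edge paths.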
7
SUCCESS!
- Graph-RAT programming language now functioning
- Graph-RAT integrates social, cultural, personal, and audio data into algorithms
- Includes most commercial algorithms
- Contains primitives for existing academic systems
- Evaluation is entirely automated
8
PROBLEMS
9
Evaluation Exploration
- 9 types of music recommendation
- Personalized versus generic
- Open query versus targeted query
- Dynamic versus static data
- New music versus all music
10
Personalized Radio
- Open query with personalized presentation
- Static data vs dynamic data
- New items prediction vs predict anything
11
Targeted Search
- Not personalized
- Similarity queries
- Automatically generating targeted lists for a browsing hierarchy
- New music vs all music
- Static vs dynamic data
12
Personalized Tag Radio
- Create a personalized playlist matching a given query
- New music vs all music
- Static vs dynamic data
13
Excluded Types
- ‘Top 40’ prediction
  - Rendered obsolete by other types
14
Existing Algorithms
- Item-to-item collaborative filtering
  - 7 variations
- User-to-user collaborative filtering
  - 7 variations
- Associative mining collaborative filtering
- Direct machine learning on playlist data
- Direct machine learning on audio data
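For readers unfamiliar with item-to-item collaborative filtering, here is a minimal sketch of one common formulation: items are compared by the cosine similarity of their per-user play counts, and a user is recommended unplayed items that are similar to what they already play. The slides do not say which 7 variations were implemented, so this variant, the function names, and the sample play counts are assumptions.

```python
import math
from collections import defaultdict


def item_similarities(plays):
    """Cosine similarity between items, based on per-user play counts."""
    by_item = defaultdict(dict)
    for user, items in plays.items():
        for item, count in items.items():
            by_item[item][user] = count
    sims = defaultdict(dict)
    names = sorted(by_item)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            dot = sum(
                c * by_item[b][u] for u, c in by_item[a].items() if u in by_item[b]
            )
            if dot == 0:
                continue  # no shared listeners, so no similarity link
            norm_a = math.sqrt(sum(c * c for c in by_item[a].values()))
            norm_b = math.sqrt(sum(c * c for c in by_item[b].values()))
            sims[a][b] = sims[b][a] = dot / (norm_a * norm_b)
    return sims


def recommend(user, plays, sims, top_n=3):
    """Rank items the user has not played by similarity-weighted play counts."""
    scores = defaultdict(float)
    for item, count in plays[user].items():
        for other, similarity in sims.get(item, {}).items():
            if other not in plays[user]:
                scores[other] += similarity * count
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]


# Hypothetical play counts: user -> {item: count}.
plays = {
    "alice": {"A": 2, "B": 1},
    "bob": {"A": 1, "B": 1, "C": 3},
    "carol": {"B": 2, "C": 1},
}
sims = item_similarities(plays)
alice_recs = recommend("alice", plays, sims)
```

A user-to-user variant would follow the same pattern with the roles of users and items swapped.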
15
Novel Algorithms
- Machine learning over profile data
- Machine learning over cultural and profile data
- Machine learning on different concatenations
  - Audio
  - Playlist
  - Profile
  - Cultural
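One way to picture the "different concatenations" on this slide: each data source contributes a feature vector per item, and a learner is trained on each combination of sources joined end to end. The sketch below only enumerates those combinations; the function names, the feature values, and the assumption that every non-empty subset is tried are illustrative and not taken from the slides.

```python
from itertools import combinations


def concatenate(sources, names):
    """Join the named feature vectors end to end, in the order given."""
    vector = []
    for name in names:
        vector.extend(sources[name])
    return vector


def all_concatenations(sources):
    """Yield (source names, combined vector) for every non-empty subset."""
    keys = sorted(sources)
    for size in range(1, len(keys) + 1):
        for combo in combinations(keys, size):
            yield combo, concatenate(sources, combo)


# Hypothetical per-song feature vectors from the four sources on the slide.
song_features = {
    "audio": [0.25, 0.5],
    "playlist": [1.0],
    "profile": [0.0, 1.0],
    "cultural": [0.75],
}
candidates = dict(all_concatenations(song_features))
```

Each entry in `candidates` could then be fed to the same learner so that the contribution of each source can be compared.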
16
Initial Data
- LiveJournal
  - Separating music data is difficult
  - No tag info or audio content
  - Not enough musical data
- LastFM by User
  - No audio content
  - Data cleaning is an issue
17
Current Data
- 40’s Jazz Recordings
  - 1800 annotated recordings from 70 CDs
  - Covers nearly all 40’s popular music
- LastFM by Song
  - Retrieves tag and user info by song
  - Data cleaning on user playcounts needed
18
Data Cleaning Tags
- Polysemy
- Synonymy
- Disjoint
- Hypernymy
- Hyponymy
- Initial algorithms developed
19
Future Work: Programming
- Radically different programming environment
  - SQL
  - LINQ library package in C#
20
Future Work: Scalability
- Distributed SQL database implementation
- Just-in-time compilation
- Event-based recalculation of algorithm results
- Parallel execution of algorithms
- Multi-threaded algorithms