Sophomore Slumpware: Predicting Album Sales with Artificial Neural Networks. Matthew Wirtala, ECE 539
Overview: Record sales have decreased ~30% over the past 4 years. There is no consensus on why. File-sharing? Inferior albums being released?
Overview: Perhaps album sales can be predicted with an MLP network. Such a network may show what factors determine how well an album will sell and indicate which albums deserve a better marketing push.
Feature data: critical acclaim, measured by review scores gathered from 4 sources, including Rolling Stone.
Feature data: hype level, since the amount of press coverage will lead to higher public awareness and possibly higher album sales; previous album sales, which serve as a barometer of how established an artist may be.
Data labelling: Exact album sales are too difficult to predict, so each album is labelled as one of three classes. These are albums that sell fewer than 500,000 copies, Gold albums (500,000 – 1,000,000 copies), and Platinum albums (> 1,000,000 copies sold).
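The labelling rule can be written as a small function. This is a minimal sketch, not the author's code: the function name `sales_class` and the numbering (1 = below Gold, 2 = Gold, 3 = Platinum) are assumptions, chosen to match the order in which the classes are listed. The boundaries follow the slide as written, so exactly 1,000,000 copies counts as Gold.

```python
GOLD_THRESHOLD = 500_000        # lower bound of the Gold class (from the slide)
PLATINUM_THRESHOLD = 1_000_000  # Platinum means strictly more than this


def sales_class(copies_sold: int) -> int:
    """Return 1 (below Gold), 2 (Gold) or 3 (Platinum) for a sales figure.

    The class numbering is an assumption made for this sketch.
    """
    if copies_sold < 0:
        raise ValueError("copies_sold must be non-negative")
    if copies_sold < GOLD_THRESHOLD:
        return 1
    if copies_sold <= PLATINUM_THRESHOLD:
        return 2
    return 3
```

For example, `sales_class(750_000)` falls in the Gold range and returns 2.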
Data preprocessing: Data gathered for 60 albums, 20 from each class, with some albums from the same artist falling into separate classes. Data randomized and split into three partitions. Feature vectors normalized to
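The preprocessing steps (randomize, split into three partitions, normalize) might look like the sketch below. It is illustrative only. The helper names, the synthetic album data, and the min-max scaling method are assumptions. The slide does not state the normalization range, so the `lo=0.0, hi=1.0` passed in the usage lines is an arbitrary placeholder, not the range used in the study.

```python
import random


def shuffle_and_partition(samples, n_parts=3, seed=0):
    """Shuffle a copy of samples and deal them into n_parts near-equal partitions."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::n_parts] for i in range(n_parts)]


def min_max_scale(vectors, lo, hi):
    """Linearly rescale each feature column so its values span [lo, hi]."""
    columns = list(zip(*vectors))
    mins = [min(c) for c in columns]
    maxs = [max(c) for c in columns]
    scaled = []
    for vec in vectors:
        row = []
        for x, mn, mx in zip(vec, mins, maxs):
            if mx == mn:
                row.append(lo)  # constant feature: map to the lower bound
            else:
                row.append(lo + (hi - lo) * (x - mn) / (mx - mn))
        scaled.append(row)
    return scaled


# 60 made-up albums as (feature_vector, class_label) pairs, 20 per class.
albums = [([float(i), float(i % 7)], 1 + i // 20) for i in range(60)]
parts = shuffle_and_partition(albums)
# Placeholder range: the actual target range is not given in the source.
features = min_max_scale([f for f, _ in albums], lo=0.0, hi=1.0)
```

A fixed seed keeps the partitions reproducible between runs, which makes it easier to compare network configurations on the same split.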
The Neural Network: Used Professor Hu’s standard bp.m algorithm. Trialed many different configurations, tested with 3-way cross validation. Optimal configuration: 2 hidden layers (7 neurons in the first layer, 8 in the second), learning rate = 0.267, momentum =
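The 3-way cross-validation loop can be sketched independently of the network itself. The code below does not reproduce bp.m. The trainer `train_majority` is a deliberately trivial stand-in that always predicts the most common training label, so that the loop is runnable. All names and the sample partitions are hypothetical, and `momentum` is left as `None` because its value is not given in the source.

```python
from collections import Counter

# Configuration reported on the slide; momentum is unset because the
# source does not give its value.
CONFIG = {"hidden_layers": (7, 8), "learning_rate": 0.267, "momentum": None}


def train_majority(train_set, config):
    """Placeholder trainer (NOT the MLP): predicts the most common training label.

    `config` is accepted only to show where network settings would be passed.
    """
    most_common = Counter(label for _, label in train_set).most_common(1)[0][0]
    return lambda features: most_common


def cross_validate(partitions, train_fn, config):
    """Hold out each partition once; return the classification rate per fold."""
    rates = []
    for k, held_out in enumerate(partitions):
        train_set = [s for j, p in enumerate(partitions) if j != k for s in p]
        predict = train_fn(train_set, config)
        correct = sum(predict(f) == label for f, label in held_out)
        rates.append(correct / len(held_out))
    return rates


# Three made-up partitions of (feature_vector, label) pairs.
partitions = [
    [([0.1], 1), ([0.5], 2), ([0.9], 3), ([0.2], 1)],
    [([0.4], 2), ([0.8], 3), ([0.6], 2), ([0.1], 1)],
    [([0.9], 3), ([0.7], 3), ([0.3], 1), ([0.5], 2)],
]
fold_rates = cross_validate(partitions, train_majority, CONFIG)
mean_rate = sum(fold_rates) / len(fold_rates)
```

Swapping `train_majority` for a function that trains an MLP with the settings in `CONFIG` would leave the loop unchanged, since each fold only needs a `predict` callable back.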
Results: Highest classification rate was 60%. Class 1 and 3 albums were correctly classified with 80-90% accuracy, but class 2 albums could not be separated, because class 2 featured albums with vectors similar to those of classes 1 and 3. Sample confusion matrix:
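How a confusion matrix and the classification rate are computed can be sketched as follows. The labels below are made up purely for illustration and are not the study's results. The function names are assumptions, and rows are taken to be true classes with columns as predicted classes.

```python
def confusion_matrix(true_labels, predicted_labels, classes=(1, 2, 3)):
    """Count predictions: rows are true classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return matrix


def classification_rate(matrix):
    """Fraction of all samples that lie on the diagonal (correct predictions)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def per_class_accuracy(matrix):
    """Fraction of each true class that was predicted correctly."""
    return [row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(matrix)]


# Made-up labels, NOT the results reported in this study.
true_labels = [1, 1, 1, 2, 2, 2, 3, 3, 3]
predicted = [1, 1, 2, 1, 3, 2, 3, 3, 3]
cm = confusion_matrix(true_labels, predicted)
```

Per-class accuracy is worth reporting next to the overall rate, because a single weak class can pull the overall figure down while the other classes are classified well.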
Future Improvements: Further analysis of feature vectors to determine possible differences in class 2 albums. Possible reduction of labelling to two classes (combining Gold and Platinum). The classification does show that predictions can be made based on the features considered in this study.