Music Classification Using Neural Networks
Craig Dennis, ECE 539
Problem and Motivation
- People have hundreds of MP3s and other digital music files sitting unclassified on their computers
- iTunes and other large digital music stores must classify thousands of files across many different genres
- Different genres sound different, so their frequency content should differ
- Choosing which frequency content to use as features is very difficult
- Goal: classify music by how it sounds, using a neural network
Data Collection
- 3 genres, 30 samples each:
  - Classical (Beethoven, Mozart, etc.)
  - Pop (Coldplay, Madonna, etc.)
  - Classic Rock (Eric Clapton, Led Zeppelin, etc.)
- Each sample is the middle 5 seconds of the song, recorded at 44.1 kHz
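As a rough illustration of the "middle 5 seconds" step, the sketch below slices a centered clip out of a sequence of audio samples. The function name `middle_clip` and the choice to raise an error for short songs are assumptions made for this example; the slides do not say how the clips were actually cut.

```python
SAMPLE_RATE = 44_100  # Hz, as stated on the slide
CLIP_SECONDS = 5      # length of each sample, as stated on the slide


def middle_clip(samples, sample_rate=SAMPLE_RATE, seconds=CLIP_SECONDS):
    """Return the centered `seconds`-long slice of a sequence of samples.

    Hypothetical helper: the slides only say the middle 5 seconds were used.
    """
    clip_len = int(sample_rate * seconds)
    if len(samples) < clip_len:
        raise ValueError("song is shorter than the requested clip")
    # Leave an equal amount of audio before and after the clip.
    start = (len(samples) - clip_len) // 2
    return samples[start:start + clip_len]
```

At 44.1 kHz a 5-second clip holds 220,500 samples per channel.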
Data Collection Continued
Frequency Content Analysis
- Computed the Fast Fourier Transform (FFT) of 50 ms frames to get the frequency content
- Averaged the magnitude in 6 frequency bands over each 250 ms window
- Total of 120 frequency features per song (6 bands × 20 windows of 250 ms in the 5-second sample), spanning both time and frequency
- Also included the length of the song and its tempo
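The frequency analysis described above could be sketched as follows with NumPy. This is a minimal sketch under stated assumptions, not the project's actual code: the band edges in `BAND_EDGES_HZ` are hypothetical (the slides give only the number of bands), no window function is applied to the frames, and the function name `band_features` is made up for this example.

```python
import numpy as np

SAMPLE_RATE = 44_100     # Hz, from the slides
FRAME_SECONDS = 0.050    # FFT frame length, from the slides
WINDOW_SECONDS = 0.250   # averaging window, from the slides
# Hypothetical band edges in Hz; the slides do not list the real ones.
BAND_EDGES_HZ = [0, 200, 400, 800, 1600, 3200, 22050]


def band_features(clip, sample_rate=SAMPLE_RATE, band_edges=BAND_EDGES_HZ):
    """Return band magnitudes averaged per 250 ms window, flattened.

    For a 5-second clip and 6 bands this yields 20 * 6 = 120 values,
    ordered window by window.
    """
    clip = np.asarray(clip, dtype=float)
    frame_len = int(round(sample_rate * FRAME_SECONDS))             # 2205 samples
    frames_per_window = int(round(WINDOW_SECONDS / FRAME_SECONDS))  # 5 frames

    # Split the clip into non-overlapping 50 ms frames and take the FFT magnitude.
    n_frames = len(clip) // frame_len
    frames = clip[:n_frames * frame_len].reshape(n_frames, frame_len)
    spectrum = np.abs(np.fft.rfft(frames, axis=1))
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)

    # Mean magnitude inside each band, for every frame: shape (n_frames, n_bands).
    bands = np.stack(
        [spectrum[:, (freqs >= lo) & (freqs < hi)].mean(axis=1)
         for lo, hi in zip(band_edges[:-1], band_edges[1:])],
        axis=1,
    )

    # Average groups of 5 consecutive frames to get one row per 250 ms window.
    n_windows = n_frames // frames_per_window
    windows = bands[:n_windows * frames_per_window]
    windows = windows.reshape(n_windows, frames_per_window, -1).mean(axis=1)
    return windows.reshape(-1)
```

Song length and tempo, which the slides mention as extra features, would be appended to this vector separately.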
Sample Data
Pop
- Song: The Killers – Mr. Brightside
- Lots of low and high frequencies throughout the entire 5 seconds
- All instruments are playing; the sample comes from the middle of a verse
[Figure: plot with axes labeled Feature and Magnitude]
Sample Data
Classic Rock
- Song: Cream – Sunshine Of Your Love
- More low-frequency content than high-frequency content
- The sample falls mostly during a guitar solo halfway through the song
[Figure: plot with axes labeled Feature and Magnitude]
Sample Data
Classical
- Song: Russian Dance from The Nutcracker
- Short bursts of mid- and high-frequency content
- A rather quiet passage, with some louder parts near the end of the sample
[Figure: plot with axes labeled Feature and Magnitude]
Preliminary Results
- Used K-Nearest-Neighbor with all features
- Trained with 60 songs, tested with 30
- Average classification rate using 3-way cross-validation is 68.88%
- Classical and Pop seem to be classified correctly; however, Classic Rock is confused with Pop
- The multi-layer perceptron seems to assign all test songs to a single genre, giving a classification rate of 33%
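The K-Nearest-Neighbor experiment with 3-way cross-validation could look roughly like the sketch below. Several details are assumptions, since the slides do not state them: Euclidean distance, `k=3`, and an interleaved fold split (every third song held out). With 90 songs ordered by genre, that split gives 60 training and 30 test songs per fold, balanced across genres.

```python
from collections import Counter
import math


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training points nearest to `query`."""
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def cross_validate(features, labels, n_folds=3, k=3):
    """Average accuracy over n_folds; fold f holds out songs with index % n_folds == f."""
    accuracies = []
    for fold in range(n_folds):
        test_idx = [i for i in range(len(features)) if i % n_folds == fold]
        train_idx = [i for i in range(len(features)) if i % n_folds != fold]
        train_x = [features[i] for i in train_idx]
        train_y = [labels[i] for i in train_idx]
        correct = sum(
            knn_predict(train_x, train_y, features[i], k) == labels[i]
            for i in test_idx
        )
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / n_folds
```

Tallying predictions per genre inside the loop, instead of only counting correct ones, would show confusions such as Classic Rock being labeled as Pop.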
Future Work
- Feature reduction: reduce the 120 features to a more manageable 20 or 30
- Try the reduced features on a multi-layer perceptron and other neural networks
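The slides do not name a feature-reduction method. One common option is principal component analysis (PCA), sketched below with NumPy purely as an illustration. The function name `pca_reduce` and the default of 20 components are assumptions chosen to match the "20 or 30 features" target.

```python
import numpy as np


def pca_reduce(features, n_components=20):
    """Project feature vectors onto their first n_components principal directions.

    Returns the reduced data plus the mean and components, so that new
    songs can be projected the same way.
    """
    x = np.asarray(features, dtype=float)
    mean = x.mean(axis=0)
    centered = x - mean
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    return centered @ components.T, mean, components
```

To avoid leaking test information, the mean and components would be computed from the training songs only, and a test song `x_new` projected with `(x_new - mean) @ components.T`.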
Further Improvement
- Increase the number of song samples
- Use more precise frequency bands: break the frequency spectrum into more than 6 pieces
- Find more “important” features from the frequency bands, which is very hard to do