Teaching machines to appreciate music
Classifying songs into three genres using a trusty Multi-Layer Perceptron.
A project by Chad Ostrowski & Curtis Reinking for EE 456 (Neural Networks) in the spring of 2009.
Our genres-to-classify are Post-Rock, Folk, and Hip-Hop
Can you guess which is which?
Let’s feed songs-as-data-arrays into an MLP and let it do its thing!
The sample rate of the music files MATLAB insists on is 44,100 Hz: good ol' .wav (but the right kind of .wav (ACM Waveform, if you're wondering). There are at least 4 kinds.) A lot of our songs are over 6 minutes long. 6 minutes * 60 seconds/minute * 44,100 samples/second = 15,876,000 data points, so we'd need an input layer of that size, with hidden layers of... eh, 2e7 neurons. Worse: that size varies from song to song.
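The arithmetic above can be sanity-checked in a few lines (a Python sketch; the original project used MATLAB):

```python
# Back-of-the-envelope check of the raw input size.
sample_rate_hz = 44_100      # sample rate of an ACM Waveform .wav
song_length_s = 6 * 60       # many of the songs run over 6 minutes
raw_inputs = sample_rate_hz * song_length_s
print(raw_inputs)  # 15876000 raw samples -> far too many input neurons
```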
Let us chop it up. Let us extract features.
[Waveform plots: folk vs. hip hop clips] Features, anyone?
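Chopping a song into equal-length clips might look like this (Python/numpy sketch; the clip count of four matches the "four inputs" on the later features slide, but the exact chopping scheme is an assumption):

```python
import numpy as np

def chop_into_clips(samples, n_clips=4):
    """Split a 1-D array of audio samples into n_clips equal chunks,
    discarding any remainder so every clip has the same length."""
    clip_len = len(samples) // n_clips
    return [samples[i * clip_len:(i + 1) * clip_len] for i in range(n_clips)]

song = np.arange(10)            # stand-in for a decoded .wav
clips = chop_into_clips(song)   # 4 clips of length 2; 2 samples dropped
```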
The FFT (the Fast Fourier Transform, a quick way to compute the Discrete Fourier Transform) ought to be a good way for a computer to learn about the music it listens to. [FFT plots: hip hop vs. folk]
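Taking the magnitude spectrum of a clip is a one-liner in numpy (a sketch; the 440 Hz test tone is an illustrative assumption, not data from the project):

```python
import numpy as np

sample_rate = 44_100
t = np.arange(sample_rate) / sample_rate   # one second of audio
clip = np.sin(2 * np.pi * 440 * t)         # pure A4 tone as a stand-in clip

# rfft keeps the non-redundant half of a real signal's symmetric spectrum.
spectrum = np.abs(np.fft.rfft(clip))
freqs = np.fft.rfftfreq(len(clip), 1 / sample_rate)
peak_hz = freqs[np.argmax(spectrum)]
print(peak_hz)  # 440.0
```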
It turns out double peaks are frightfully common. We mute them.
[FFT plots with double peaks muted: hip hop vs. folk]
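The slides don't spell out what counts as a "double peak"; one guess is a pair of peaks in nearly adjacent FFT bins. A hypothetical sketch that mutes (zeroes) the smaller of any such pair, where both the peak test and the `min_gap` threshold are assumptions:

```python
import numpy as np

def mute_double_peaks(spectrum, min_gap=3):
    """Zero the smaller of any pair of peaks closer than min_gap bins.
    'Double peak' and min_gap are guesses at what the slides describe."""
    spec = spectrum.copy()
    # A peak here is simply a bin larger than both of its neighbours.
    peaks = [i for i in range(1, len(spec) - 1)
             if spec[i] > spec[i - 1] and spec[i] > spec[i + 1]]
    for a, b in zip(peaks, peaks[1:]):
        if b - a < min_gap:
            spec[a if spec[a] < spec[b] else b] = 0.0
    return spec

spec = np.array([0., 5., 0., 4., 0., 1., 0.])
muted = mute_double_peaks(spec)
print(muted)  # the 4.0 at bin 3 is muted
```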
So all of our inputs are:
- Size of the song (1 input)
- Means of the un-FFT'ed clips (4 inputs)
- Means of the FFT'ed clips (4 more)
- The average number of "big" peaks (1 input)
- Locations of the five tallest peaks in the FFT of each clip (20 inputs)
30 total inputs
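Assembling those 30 inputs could look roughly like this (Python/numpy sketch; the "big peak" threshold is a stand-in for whatever the project's MATLAB code actually used):

```python
import numpy as np

def feature_vector(song, n_clips=4, n_peaks=5):
    """Build the 30 inputs: 1 (song size) + 4 clip means + 4 FFT means
    + 1 (avg big-peak count) + 20 peak locations. A rough sketch."""
    clip_len = len(song) // n_clips
    clips = [song[i * clip_len:(i + 1) * clip_len] for i in range(n_clips)]
    ffts = [np.abs(np.fft.rfft(c)) for c in clips]

    features = [len(song)]                           # 1: size of song
    features += [c.mean() for c in clips]            # 4: raw clip means
    features += [f.mean() for f in ffts]             # 4: FFT clip means
    big = [np.sum(f > f.mean() * 10) for f in ffts]  # "big" = assumed cutoff
    features.append(np.mean(big))                    # 1: avg big-peak count
    for f in ffts:                                   # 20: 5 tallest per clip
        features += list(np.argsort(f)[-n_peaks:])
    return np.array(features, dtype=float)

vec = feature_vector(np.random.default_rng(0).normal(size=44_100))
print(len(vec))  # 30
```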
How to output? We recall that output vectors have no boundary problems and give greater accuracy than a single output scalar.
- 100 denotes folk
- 010 denotes post rock
- 001 denotes hip hop
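That target encoding (one-hot output vectors) is simple to build; a minimal Python sketch:

```python
import numpy as np

GENRES = ["folk", "post rock", "hip hop"]  # order matches the slide

def encode(genre):
    """Return the 3-element target vector, e.g. folk -> [1, 0, 0]."""
    target = np.zeros(len(GENRES))
    target[GENRES.index(genre)] = 1.0
    return target

print(encode("post rock"))  # [0. 1. 0.]
```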
Drum-roll please!! We tested various network sizes, settling on 30x100x100x3. It crapped out on us.
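A minimal forward pass through a 30-100-100-3 MLP, sketched in Python/numpy (the project presumably used MATLAB's neural-network tools; the random weights and tanh/linear activation choices here are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
sizes = [30, 100, 100, 3]                 # the 30x100x100x3 network
weights = [rng.normal(scale=0.1, size=(m, n))
           for m, n in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

def forward(x):
    """Tanh hidden layers, linear output layer (activations assumed)."""
    for w, b in zip(weights[:-1], biases[:-1]):
        x = np.tanh(x @ w + b)
    return x @ weights[-1] + biases[-1]

out = forward(rng.normal(size=30))
print(out.shape)  # (3,) -- one score per genre
```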
(and that’s all we got) (so far)