Teaching machines to appreciate music

Slides:



Advertisements
Similar presentations
Face Recognition: A Convolutional Neural Network Approach
Advertisements

Ch. Eick: More on Machine Learning & Neural Networks Different Forms of Learning: –Learning agent receives feedback with respect to its actions (e.g. using.
Entropy and Dynamism Criteria for Voice Quality Classification Applications Authors: Peter D. Kukharchik, Igor E. Kheidorov, Hanna M. Lukashevich, Denis.
SOUND with MATLAB. SOUND INPUT [a, fa, na]= wavread(’mim wav') Sound data Sampling Frequency #bit representation.
Learning Functions and Neural Networks II Lecture 9 Luoting Fu Spring 2012.
1 Chapter 16 Fourier Analysis with MATLAB Fourier analysis is the process of representing a function in terms of sinusoidal components. It is widely employed.
Designing Facial Animation For Speaking Persian Language Hadi Rahimzadeh June 2005.
F 鍾承道 Acoustic Features for Speech Recognition: From Mel-Frequency Cepstrum Coefficients (MFCC) to BottleNeck Features(BNF)
S. Mandayam/ ANN/ECE Dept./Rowan University Artificial Neural Networks / Fall 2004 Shreekanth Mandayam ECE Department Rowan University.
Classification of Music According to Genres Using Neural Networks, Genetic Algorithms and Fuzzy Systems.
Lab8 (Signal & System) Instructor: Anan Osothsilp Date: 17 April 07.
Introduction to Fast Fourier Transform (FFT) Algorithms R.C. Maher ECEN4002/5002 DSP Laboratory Spring 2003.
Presenting: Itai Avron Supervisor: Chen Koren Characterization Presentation Spring 2005 Implementation of Artificial Intelligence System on FPGA.
Classification of Music According to Genres Using Neural Networks, Genetic Algorithms and Fuzzy Systems.
1 Music Classification Using SVM Ming-jen Wang Chia-Jiu Wang.
MUSIC HU A fun ‘musical’ project is due at the end of this unit! (12/6/11) Reminder…
Audio classification Discriminating speech, music and environmental audio Rajas A. Sambhare ECE 539.
1 Chapter 5 Image Transforms. 2 Image Processing for Pattern Recognition Feature Extraction Acquisition Preprocessing Classification Post Processing Scaling.
Classification / Regression Neural Networks 2
Radial Basis Function Networks:
Yang, Luyu.  Postal service for sorting mails by the postal code written on the envelop  Bank system for processing checks by reading the amount of.
Soundscapes James Martin. Overview Problem Statement Proposed Solution Solution Created (Modules, Model, Pics) Testing Looking Back See It in Action Q&A.
Convolution in Matlab The convolution in matlab is accomplished by using “conv” command. If “u” is a vector with length ‘n’ and “v” is a vector with length.
1 Pattern Recognition: Statistical and Neural Lonnie C. Ludeman Lecture 25 Nov 4, 2005 Nanjing University of Science & Technology.
MUSIC HU 300 ~ Seminar 4 ~ PappadakisWelcome!. Any questions before we get started? Reminder: Unit 4 Project is Due June 14 at midnight. Looking ahead…
Music Classification Using Neural Networks Craig Dennis ECE 539.
CSC321 Introduction to Neural Networks and Machine Learning Lecture 3: Learning in multi-layer networks Geoffrey Hinton.
An Artificial Neural Network Approach to Surface Waviness Prediction in Surface Finishing Process by Chi Ngo ECE/ME 539 Class Project.
Biological Inspiration for Artificial Neural Networks Nick Mascola.
Audio Fingerprinting as a New Task for MIREX-2014 Chung-Che Wang Jyh-Shing Roger Jang.
Let’s look at the project first. You’re asked to write a 750 – 1000 word essay on the role that music has played in your life. Let’s talk about that….
Presented by Michael Katic. Main Influence Most of my practices in this project were an attempt to recreate an Audio Search Algorithm developed by Shazam.
ENEE 322: Continuous-Time Fourier Transform (Chapter 4)
Speech Recognition through Neural Networks By Mohammad Usman Afzal Mohammad Waseem.
Automatic Classification of Audio Data by Carlos H. L. Costa, Jaime D. Valle, Ro L. Koerich IEEE International Conference on Systems, Man, and Cybernetics.
Heart Sound Biometrics for Continual User Authentication
Tenacious Deep Learning
Outline Problem Description Data Acquisition Method Overview
Recognition of biological cells – development
Processing Sound Files
Speaker Classification through Deep Learning
Presentation on Artificial Neural Network Based Pathological Voice Classification Using MFCC Features Presenter: Subash Chandra Pakhrin 072MSI616 MSC in.
Other Classification Models: Neural Network
A Support Vector Machine Approach to Sonar Classification
Self organizing networks
Speech Recognition Christian Schulze
Lecture 9 MLP (I): Feed-forward Model
Musical Style Classification
National Conference on Recent Advances in Wireless Communication & Artificial Intelligence (RAWCAI-2014) Organized by Department of Electronics & Communication.
Neural Networks Advantages Criticism
Net 222: Communications and networks fundamentals (Practical Part)
Speech Generation Using a Neural Network
Cache Replacement Scheme based on Back Propagation Neural Networks
Prediction of Wine Grade
LECTURE 18: FAST FOURIER TRANSFORM
Multi-Layer Perceptron
College Football Playoff Composition Prediction using Machine Learning
On Convolutional Neural Network
INTRODUCTION TO SIGNALS & SYSTEMS
Convolutional Neural Networks
ECE 791 Project Proposal Project Title: Developing and Evaluating a Tool for Converting MP3 Audio Files to Staff Music Project Team: Salvatore DeVito.
Phoneme Recognition Using Neural Networks by Albert VanderMeulen
Face Recognition: A Convolutional Neural Network Approach
EE150: Signals and Systems 2016-Spring
LECTURE 18: FAST FOURIER TRANSFORM
Artificial Neural Networks / Spring 2002
Directional Occlusion with Neural Network
ME 123 Computer Applications I Lecture 5: Input and Output 3/17/03
Presentation transcript:

Teaching machines to appreciate music Classifying songs into three genres using a trusty Multi-Layer Perceptron A Project by Chad Ostrowski & Curtis Reinking for EE 456 (Neural Networks) in the spring of 2009

Our genres-to-classify are Post-Rock, Folk, and Hip-Hop Can you guess which is which?

Let’s feed songs-as-data-arrays into an MLP and let it do its thing! The sample bit rate of the type of music files MATLAB insists on is 44100Hz Good ol’ .wav (but the right kind of .wav (ACM Waveform, if you’re wondering). There are at least 4 kinds.) A lot of our songs are >6 minutes in length 6 minutes * 60 seconds/minute * 44100 data points/second = 15,876,000 data points so we’d have an input layer of that size. Hidden layers of…eh, 2e7. worse: this is a variable depending on the song.

Let us chop it up. Let us extract features. folk hip hop Features, anyone?

FFT (the Fast (or Discrete) Fourier Transform) ought to be a good way for a computer to learn about music it listens to Hip hop Folk

It turns out double peaks are frightfully common. We mute them. Hip Hop Folk

So all of our inputs are: Size of song Means of un-fft-ed clips (that’s four inputs) Means of fft-ed clips (that’s four more) The average number of “big” peaks Locations of the five tallest peaks in the fft of each clip (twenty inputs) 30 total inputs

How to output? We recall that output vectors have no boundary problems and give greater accuracy than output scalars. 100 denotes folk 010 denotes post rock 001 denotes hip hop

Drum-roll please!! We tested various network sizes, settling on 30x100x100x3. It crapped on us.

(and that’s all we got) (so far)