Teaching machines to appreciate music

Slides:

Advertisements

Similar presentations

Face Recognition: A Convolutional Neural Network Approach

Advertisements

Ch. Eick: More on Machine Learning & Neural Networks Different Forms of Learning: –Learning agent receives feedback with respect to its actions (e.g. using.

Entropy and Dynamism Criteria for Voice Quality Classification Applications Authors: Peter D. Kukharchik, Igor E. Kheidorov, Hanna M. Lukashevich, Denis.

SOUND with MATLAB. SOUND INPUT [a, fa, na]= wavread(’mim wav') Sound data Sampling Frequency #bit representation.

Learning Functions and Neural Networks II Lecture 9 Luoting Fu Spring 2012.

1 Chapter 16 Fourier Analysis with MATLAB Fourier analysis is the process of representing a function in terms of sinusoidal components. It is widely employed.

Designing Facial Animation For Speaking Persian Language Hadi Rahimzadeh June 2005.

F 鍾承道 Acoustic Features for Speech Recognition: From Mel-Frequency Cepstrum Coefficients (MFCC) to BottleNeck Features(BNF)

S. Mandayam/ ANN/ECE Dept./Rowan University Artificial Neural Networks / Fall 2004 Shreekanth Mandayam ECE Department Rowan University.

Classification of Music According to Genres Using Neural Networks, Genetic Algorithms and Fuzzy Systems.

Lab8 (Signal & System) Instructor: Anan Osothsilp Date: 17 April 07.

Introduction to Fast Fourier Transform (FFT) Algorithms R.C. Maher ECEN4002/5002 DSP Laboratory Spring 2003.

Presenting: Itai Avron Supervisor: Chen Koren Characterization Presentation Spring 2005 Implementation of Artificial Intelligence System on FPGA.

Classification of Music According to Genres Using Neural Networks, Genetic Algorithms and Fuzzy Systems.

1 Music Classification Using SVM Ming-jen Wang Chia-Jiu Wang.

MUSIC HU A fun ‘musical’ project is due at the end of this unit! (12/6/11) Reminder…

Audio classification Discriminating speech, music and environmental audio Rajas A. Sambhare ECE 539.

1 Chapter 5 Image Transforms. 2 Image Processing for Pattern Recognition Feature Extraction Acquisition Preprocessing Classification Post Processing Scaling.

Classification / Regression Neural Networks 2

Radial Basis Function Networks:

Yang, Luyu.  Postal service for sorting mails by the postal code written on the envelop  Bank system for processing checks by reading the amount of.

Soundscapes James Martin. Overview Problem Statement Proposed Solution Solution Created (Modules, Model, Pics) Testing Looking Back See It in Action Q&A.

Convolution in Matlab The convolution in matlab is accomplished by using “conv” command. If “u” is a vector with length ‘n’ and “v” is a vector with length.

1 Pattern Recognition: Statistical and Neural Lonnie C. Ludeman Lecture 25 Nov 4, 2005 Nanjing University of Science & Technology.

MUSIC HU 300 ~ Seminar 4 ~ PappadakisWelcome!. Any questions before we get started? Reminder: Unit 4 Project is Due June 14 at midnight. Looking ahead…

Music Classification Using Neural Networks Craig Dennis ECE 539.

CSC321 Introduction to Neural Networks and Machine Learning Lecture 3: Learning in multi-layer networks Geoffrey Hinton.

An Artificial Neural Network Approach to Surface Waviness Prediction in Surface Finishing Process by Chi Ngo ECE/ME 539 Class Project.

Biological Inspiration for Artificial Neural Networks Nick Mascola.

Audio Fingerprinting as a New Task for MIREX-2014 Chung-Che Wang Jyh-Shing Roger Jang.

Let’s look at the project first. You’re asked to write a 750 – 1000 word essay on the role that music has played in your life. Let’s talk about that….

Presented by Michael Katic. Main Influence Most of my practices in this project were an attempt to recreate an Audio Search Algorithm developed by Shazam.

ENEE 322: Continuous-Time Fourier Transform (Chapter 4)

Speech Recognition through Neural Networks By Mohammad Usman Afzal Mohammad Waseem.

Automatic Classification of Audio Data by Carlos H. L. Costa, Jaime D. Valle, Ro L. Koerich IEEE International Conference on Systems, Man, and Cybernetics.

Heart Sound Biometrics for Continual User Authentication

Tenacious Deep Learning

Outline Problem Description Data Acquisition Method Overview

Recognition of biological cells – development

Processing Sound Files

Speaker Classification through Deep Learning

Presentation on Artificial Neural Network Based Pathological Voice Classification Using MFCC Features Presenter: Subash Chandra Pakhrin 072MSI616 MSC in.

Other Classification Models: Neural Network

A Support Vector Machine Approach to Sonar Classification

Self organizing networks

Speech Recognition Christian Schulze

Lecture 9 MLP (I): Feed-forward Model

Musical Style Classification

National Conference on Recent Advances in Wireless Communication & Artificial Intelligence (RAWCAI-2014) Organized by Department of Electronics & Communication.

Neural Networks Advantages Criticism

Net 222: Communications and networks fundamentals (Practical Part)

Speech Generation Using a Neural Network

Cache Replacement Scheme based on Back Propagation Neural Networks

Prediction of Wine Grade

LECTURE 18: FAST FOURIER TRANSFORM

Multi-Layer Perceptron

College Football Playoff Composition Prediction using Machine Learning

On Convolutional Neural Network

INTRODUCTION TO SIGNALS & SYSTEMS

Convolutional Neural Networks

ECE 791 Project Proposal Project Title: Developing and Evaluating a Tool for Converting MP3 Audio Files to Staff Music Project Team: Salvatore DeVito.

Phoneme Recognition Using Neural Networks by Albert VanderMeulen

Face Recognition: A Convolutional Neural Network Approach

EE150: Signals and Systems 2016-Spring

LECTURE 18: FAST FOURIER TRANSFORM

Artificial Neural Networks / Spring 2002

Directional Occlusion with Neural Network

ME 123 Computer Applications I Lecture 5: Input and Output 3/17/03

Presentation transcript:

Teaching machines to appreciate music Classifying songs into three genres using a trusty Multi-Layer Perceptron A Project by Chad Ostrowski & Curtis Reinking for EE 456 (Neural Networks) in the spring of 2009

Our genres-to-classify are Post-Rock, Folk, and Hip-Hop Can you guess which is which?

Let’s feed songs-as-data-arrays into an MLP and let it do its thing! The sample bit rate of the type of music files MATLAB insists on is 44100Hz Good ol’ .wav (but the right kind of .wav (ACM Waveform, if you’re wondering). There are at least 4 kinds.) A lot of our songs are >6 minutes in length 6 minutes * 60 seconds/minute * 44100 data points/second = 15,876,000 data points so we’d have an input layer of that size. Hidden layers of…eh, 2e7. worse: this is a variable depending on the song.

Let us chop it up. Let us extract features. folk hip hop Features, anyone?

FFT (the Fast (or Discrete) Fourier Transform) ought to be a good way for a computer to learn about music it listens to Hip hop Folk

It turns out double peaks are frightfully common. We mute them. Hip Hop Folk

So all of our inputs are: Size of song Means of un-fft-ed clips (that’s four inputs) Means of fft-ed clips (that’s four more) The average number of “big” peaks Locations of the five tallest peaks in the fft of each clip (twenty inputs) 30 total inputs

How to output? We recall that output vectors have no boundary problems and give greater accuracy than output scalars. 100 denotes folk 010 denotes post rock 001 denotes hip hop

Drum-roll please!! We tested various network sizes, settling on 30x100x100x3. It crapped on us.

(and that’s all we got) (so far)