Urban Sound Classification


Urban Sound Classification
Joseph Chiou

SVM, based on the first 193 features selected using Pearson correlation
- Average accuracy: 16.25%; run time: 1 sec
- Accuracy on Fold 10: 17.29% (highest fold accuracy: Fold 3, at 20.98%)
- Highest-accuracy class: Dog bark (33%)

Random Forest, based on the same 193 features selected using Pearson correlation
- Average accuracy: 20.6%; run time: 4:25
- Accuracy on Fold 10: 24.41% (highest across all folds)
- Highest-accuracy class: Dog bark (57%)
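For context, one common way to arrive at exactly 193 features per clip is the librosa recipe of 40 MFCCs, 12 chroma bins, 128 mel bands, 7 spectral-contrast bands, and 6 tonnetz dimensions, each averaged over time. A minimal sketch assuming that layout (the transcript does not name the extraction library):

```python
# Hedged sketch: one 193-dimensional feature vector per audio clip,
# assuming the common librosa layout (40 MFCC + 12 chroma + 128 mel
# + 7 spectral contrast + 6 tonnetz), each averaged across frames.
import numpy as np
import librosa

def extract_features(path):
    y, sr = librosa.load(path)  # mono, 22050 Hz by default
    stft = np.abs(librosa.stft(y))
    mfcc = np.mean(librosa.feature.mfcc(y=y, sr=sr, n_mfcc=40), axis=1)
    chroma = np.mean(librosa.feature.chroma_stft(S=stft, sr=sr), axis=1)
    mel = np.mean(librosa.feature.melspectrogram(y=y, sr=sr), axis=1)
    contrast = np.mean(librosa.feature.spectral_contrast(S=stft, sr=sr), axis=1)
    tonnetz = np.mean(librosa.feature.tonnetz(
        y=librosa.effects.harmonic(y), sr=sr), axis=1)
    return np.hstack([mfcc, chroma, mel, contrast, tonnetz])  # shape (193,)
```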

Accuracy Overviews

One-layer CNN
- Input: 128 x 128 x 2, with 2 dense layers on top
- Epochs: 20; 90/10 validation split (Fold 10 for testing, Fold 9 for validation)
- 10-fold cross-validation average accuracy: 60.53%
- Mean accuracy across the different test folds: 57.76%
- Most predictive class: Gun shot (100%)
- Least predictive classes: Air conditioner (37%), Siren (44%)
- Run time: 1:02:11
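A minimal Keras sketch of an architecture matching this slide's description: a single convolutional layer over 128 x 128 inputs with 2 channels, followed by two dense layers. The filter count, kernel size, pooling, and optimizer are assumptions, since the transcript only fixes the input shape and the layer counts:

```python
from tensorflow.keras import layers, models

# One conv layer + two dense layers over 128x128x2 spectrogram inputs;
# the 10 outputs correspond to the 10 urban sound classes.
model = models.Sequential([
    layers.Input(shape=(128, 128, 2)),
    layers.Conv2D(32, (3, 3), activation="relu"),  # assumed filter settings
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),          # dense layer 1
    layers.Dense(10, activation="softmax"),        # dense layer 2
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# Assumed training data tensors, as described on the slide:
# model.fit(x_train, y_train, epochs=20, validation_data=(x_val, y_val))
```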

Sample distribution in Fold 10
- Gun shot has only 2 samples being considered (32?)
- To create 128 frames, the required window span is 65,024 samples
- Window size = hop size * (frames - 1) = 512 * 127
- Hop size: the number of samples between successive fast Fourier transforms
- Clips with fewer samples than this are not considered
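The cutoff arithmetic from this slide, spelled out (hop size and frame count as stated above):

```python
# Minimum clip length needed to produce 128 STFT frames at a hop of 512:
hop_size = 512                      # samples between successive FFTs
frames = 128
min_samples = hop_size * (frames - 1)
print(min_samples)                  # 65024; shorter clips are excluded
```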

SVM
- C = 0.01; 10-fold cross-validation, with 90/10 validation on Fold 10
- Accuracy: 62.49%; run time: 2:05
- Most predictive class: Gun shot (85%)
- Average accuracy across all test folds: 55.4% (test folds 2, 3, and 6 fall below 50%; test folds 4, 5, 9, and 10 exceed 60%)
- Gun shot scores high, but it also has significantly fewer samples than the other classes (32)
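A hedged sketch of this evaluation scheme in scikit-learn terms, assuming X (one 193-dimensional row per clip), y (class labels), and folds (the UrbanSound8K fold number of each clip) have already been assembled; the kernel is left at scikit-learn's default since the transcript only fixes C:

```python
import numpy as np
from sklearn.svm import SVC

def leave_one_fold_out(X, y, folds, make_clf):
    """Train on nine folds, test on the held-out fold, repeat for all ten."""
    scores = []
    for f in np.unique(folds):
        train, test = folds != f, folds == f
        clf = make_clf()
        clf.fit(X[train], y[train])
        scores.append(clf.score(X[test], y[test]))
    return scores

# With the assumed X, y, folds arrays in place:
# fold_scores = leave_one_fold_out(X, y, folds, lambda: SVC(C=0.01))
# print(np.mean(fold_scores))      # the slide reports 55.4% on average
```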

Random Forest
- Trees: 500; depth: 6; 90/10 validation on Fold 10
- Accuracy: 61.29%; run time: 4:54
- Most predictive class: Children playing (82%)
- With Dr. Roshan's settings (100 trees, depth 6): average accuracy 58.89% (averaged over 100 runs)
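The two random-forest configurations mentioned on this slide, written out in scikit-learn terms (a sketch; the transcript does not name the library used):

```python
from sklearn.ensemble import RandomForestClassifier

rf_slide = RandomForestClassifier(n_estimators=500, max_depth=6)   # 500 trees, depth 6
rf_roshan = RandomForestClassifier(n_estimators=100, max_depth=6)  # Dr. Roshan's setting
# Fit and score against the same assumed fold split as above:
# rf_slide.fit(X[train], y[train]); rf_slide.score(X[test], y[test])
```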

Thank you

Comparison: CNN vs. RF vs. SVM

Accuracy of each sound type

Sound type         CNN    RF     SVM
Air Conditioner    0.37   0.77   0.61
Car Horn           0.53   0.7
Children Playing   0.75   0.82
Dog Bark           0.71   0.55   0.62
Drilling           0.67   0.46   0.54
Engine Idling      0.73
Gun Shot           1             0.85
Jackhammer         0.47   0.64
Siren              0.44
Street Music       0.66

The CNN performs better at identifying noise-like sounds.

Model accuracy vs. epoch
- Accuracy stays around 0.6 after 10 epochs