Urban Sound Classification


Urban Sound Classification Joseph Chiou

SVM – based on the first 193 features using Pearson correlation
Avg accuracy: 16.25%
Run time: 1 sec
Accuracy on Fold 10: 17.29% (highest accuracy was on Fold 3: 20.98%)
Highest-accuracy class: Dog bark (33%)

RF – based on the first 193 features using Pearson correlation
Avg accuracy: 20.6%
Run time: 4:25
Accuracy on Fold 10: 24.41% (highest accuracy across all folds)
Highest-accuracy class: Dog bark (57%)
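The slides do not say how Pearson correlation was applied to the features. One common reading is to rank each feature by the absolute value of its correlation with the numeric class label; the sketch below illustrates that reading only. The names `pearson` and `rank_features` and the toy data are hypothetical, not taken from the presentation.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y) if spread_x and spread_y else 0.0

def rank_features(X, y):
    """Return column indices of X ordered by |Pearson r| with y, strongest first."""
    n_features = len(X[0])
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)

# Toy data for illustration only: column 0 is constant, column 2 tracks y exactly.
X = [[7, 1, 1], [7, 3, 2], [7, 2, 3], [7, 4, 4]]
y = [1, 2, 3, 4]
ranking = rank_features(X, y)
print(ranking)  # [2, 1, 0]
```

Keeping the first k entries of `ranking` would give a reduced feature set to pass to the SVM or RF.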

Accuracy Overviews

One-layer CNN
128 x 128 x 2
Epochs: 20
90/10 validation: Fold 10 for testing, Fold 9 for validation.
10-fold cross-validation
Avg accuracy: 60.53%
Most predictive class: Gun shot (100%)
Run time: 1:02:11
Least predictive classes: Air conditioner (37%), Siren (44%)
Mean accuracy across the different test folds: 57.76%
2 dense layers
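The fold-based protocol on this slide (one fold held out for testing, one for validation, the rest for training, then averaged over test folds) can be sketched as follows. The function names are hypothetical. The slide only gives the case test = Fold 10, validation = Fold 9; using "the fold just before the test fold" for every other test fold is an assumption.

```python
from statistics import mean

def split_folds(fold_ids, test_fold, n_folds=10):
    """Return (train, val, test) index lists for one choice of test fold.

    Assumption: the validation fold is the one just before the test fold,
    wrapping from fold 1 back to fold n_folds.
    """
    val_fold = test_fold - 1 if test_fold > 1 else n_folds
    train, val, test = [], [], []
    for index, fold in enumerate(fold_ids):
        if fold == test_fold:
            test.append(index)
        elif fold == val_fold:
            val.append(index)
        else:
            train.append(index)
    return train, val, test

def cross_validate(fold_ids, evaluate, n_folds=10):
    """Average evaluate(train, val, test) over every choice of test fold."""
    return mean(evaluate(*split_folds(fold_ids, k, n_folds))
                for k in range(1, n_folds + 1))

# Toy fold assignment: 100 clips spread evenly over folds 1..10.
fold_ids = [1 + i % 10 for i in range(100)]
train, val, test = split_folds(fold_ids, test_fold=10)
print(len(train), len(val), len(test))  # 80 10 10
```

Here `evaluate` stands in for whatever trains a model on `train`, tunes it on `val`, and returns the accuracy on `test`.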

Sample distribution in Fold 10
Gun shot (GU) has only 2 samples considered (out of 32?)
To create 128 frames, the window size is 65024 samples.
Window size = hop size * (frames - 1) = 512 * 127
Hop size: number of samples between successive fast Fourier transforms.
Clips shorter than this window size are not considered.
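The window-size arithmetic above can be checked with a short sketch. The hop size and frame count come from the slide; the helper names and the non-overlapping windowing in `count_windows` are assumptions, not necessarily how the clips were actually segmented.

```python
HOP_SIZE = 512   # samples between successive FFT frames (from the slide)
N_FRAMES = 128   # frames per CNN input (from the slide)

def window_size(hop_size: int, n_frames: int) -> int:
    """Samples spanned by n_frames frame starts spaced hop_size apart."""
    return hop_size * (n_frames - 1)

def count_windows(clip_samples: int, win: int) -> int:
    """Number of full, non-overlapping windows in a clip (assumed scheme)."""
    return clip_samples // win

WIN = window_size(HOP_SIZE, N_FRAMES)
print(WIN)                        # 65024
print(count_windows(40000, WIN))  # 0: a clip this short yields no window
```

If the audio were sampled at 22,050 Hz (an assumption; the slides do not state the rate), 65024 samples would be about 2.95 seconds, which could explain why short clips such as gun shots are mostly excluded.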

SVM
C value = 0.01
10-fold cross-validation.
90/10 validation on Fold 10
Accuracy: 62.49%
Most predictive class: Gun shot (85%)
Run time: 2:05
Avg accuracy across all test folds: 55.4% (test folds 2, 3, and 6 are below 50%; test folds 4, 5, 9, and 10 are above 60%)
Gun shot has a high percentage, but it also has significantly fewer samples than the other classes (32).

Random Forest
Trees: 500
Depth: 6
90/10 validation on Fold 10
Accuracy: 61.29%
Most predictive class: Children playing (82%)
Run time: 4:54
Dr. Roshan’s variables: 100 trees, depth 6
Avg accuracy: 58.89% (average of 100 runs)

Thank you

Comparison CNN RF SVM

Accuracy of each sound type (columns: CNN, RF, SVM)
Air Conditioner: 0.37, 0.77, 0.61
Car Horn: 0.53, 0.7
Children Playing: 0.75, 0.82
Dog Bark: 0.71, 0.55, 0.62
Drilling: 0.67, 0.46, 0.54
Engine Idling: 0.73
Gun Shot: 1, 0.85
Jackhammer: 0.47, 0.64
Siren: 0.44
Street Music: 0.66

CNN performs better at identifying noise sounds.
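Per-class accuracies like those listed above can be computed from true and predicted labels. This is a minimal sketch; the function name `per_class_accuracy` is hypothetical and the labels are toy values, not data from the presentation.

```python
from collections import Counter

def per_class_accuracy(y_true, y_pred):
    """Fraction of samples of each true class that were predicted correctly."""
    total = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: correct[label] / total[label] for label in total}

# Toy labels for illustration only.
y_true = ["gun_shot", "gun_shot", "siren", "siren", "siren", "siren"]
y_pred = ["gun_shot", "gun_shot", "siren", "dog_bark", "siren", "drilling"]
acc = per_class_accuracy(y_true, y_pred)
print(acc)  # {'gun_shot': 1.0, 'siren': 0.5}
```

A class with very few test samples can reach a high per-class accuracy from a handful of correct predictions, which is the caveat the SVM slide raises about Gun shot.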

Model accuracy vs. epoch: accuracy stays around 0.6 after 10 epochs.