Postgraduate Department of Electrical Engineering PPGEE UFPR - Federal University of Paraná Luis Gustavo Weigert Machado

Postgraduate Department of Electrical Engineering (PPGEE)
UFPR - Federal University of Paraná
Luis Gustavo Weigert Machado
Supervisor: Prof. PhD Alessandro Lameiras Koerich
Hierarchical Classifiers Combination for Automatic Musical Information Retrieval

Abstract
The most serious problem in automatic music classification is that the rate of correct classification is considerably low. We present a hierarchical combination of classifiers that aims to make musical style classification more robust by employing different features extracted from the music. The approach builds several classification stages, each using different features extracted from the same music sample. In the first stage, a neural network is trained on the music samples; its output probabilities are evaluated to set thresholds based on the overall result, and a list of confused classes is defined. Next, the confused classes and the thresholds are passed to the second stage, which builds a binary classifier for each confusion using other features extracted from the same music. Finally, a third stage combines the results of the first and second stages.

MSD Dataset
The Million Song Dataset (MSD)
– 1 million contemporary popular music tracks with 280 GB of data.
– Metadata (track ID, artist, date).
– Features (pitches, timbre and loudness) extracted using the Echo Nest API.

TU-WIEN MSD Benchmarks
Audio samples for the MSD tracks, linked through the unique MSD IDs. Mostly 30- or 60-second snippets.
Several feature sets were extracted and split into different datasets.
Ground-truth assignments provided by allmusic.com.
– Genre Dataset (MAGD): 422,714 labels.
– Top Genre Dataset (Top-MAGD): 406,427 labels.
– Style Dataset (MASD): 273,936 labels.
Data split into train (90%, 80%, 66%, 50%) and test sets.
Stratified and non-stratified datasets. Artist, album and time filters avoid having the same characteristic in both the training and test sets.
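To make the idea of a stratified partition concrete, here is a minimal sketch in plain Python. It only illustrates stratification by class label. The function name `stratified_split`, the fixed seed and the rounding rule are assumptions made for illustration, and the artist, album and time filters of the benchmark are not modelled.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.66, seed=0):
    """Return (train_idx, test_idx) so each class keeps roughly the same share in both parts.

    Hypothetical helper for illustration; not the benchmark's own partitioning code.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)                      # random order inside each class
        cut = round(len(members) * train_fraction)
        train_idx.extend(members[:cut])           # first part of the class goes to training
        test_idx.extend(members[cut:])            # the rest goes to testing
    return sorted(train_idx), sorted(test_idx)


labels = ["Punk"] * 100 + ["Reggae"] * 50
train_idx, test_idx = stratified_split(labels)
```

Because the cut is made inside each class, a class that holds two thirds of the data also holds about two thirds of the training part and of the test part.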

TU-WIEN MSD Benchmarks

Style Dataset (MASD)

Genre Name               Number of Songs
Big Band                           3,115
Blues Contemporary                 6,874
Country Traditional               11,164
Dance                             15,114
Electronica                       10,987
Experimental                      12,139
Folk International                 9,849
Gospel                             6,974
Grunge Emo                         6,256
Hip Hop Rap                       16,100
Jazz Classic                      10,024
Metal Alternative                 14,009
Metal Death                        9,851
Metal Heavy                       10,784
Pop Contemporary                  13,624
Pop Indie                         18,138
Pop Latin                          7,699
Punk                               9,610
Reggae                             5,232
RnB Soul                           6,238
Rock Alternative                  12,717
Rock College                      16,575
Rock Contemporary                 16,530
Rock Hard                         13,276
Rock Neo Psychedelia              11,057
Total                            273,936

Features extracted from the MSD samples

 #  Feature Set                                Extractor     Dim   Deriv.
 1  MFCCs                                      MARSYAS        52
 2  Chroma                                     MARSYAS        48
 3  Timbral                                    MARSYAS       124
 4  MFCCs                                      jAudio          …        …
 5  Low-level spectral features                jAudio         16       96
 6  Method of Moments                          jAudio         10       60
 7  Area Method of Moments                     jAudio         20        …
 8  Linear Predictive Coding                   jAudio          …        …
 9  Rhythm Patterns                            rp_extract      …
10  Statistical Spectrum Descriptors           rp_extract    168
11  Rhythm Histograms                          rp_extract     60
12  Modulation Frequency Variance Descriptor   rp_extract    420
13  Temporal Statistical Spectrum Descriptors  rp_extract      …
14  Temporal Rhythm Histograms                 rp_extract    420

Low-level spectral features: Spectral Centroid, Spectral Rolloff Point, Spectral Flux, Compactness, Spectral Variability, Root Mean Square, Zero Crossings, and Fraction of Low Energy Windows.

Alexander Schindler, Rudolf Mayer, and Andreas Rauber. Facilitating Comprehensive Benchmarking Experiments on the Million Song Dataset. ISMIR

Datasets Used
Assignments: MSD Allmusic Guide Style (273,936 patterns).
Partitions: stratified, 66% for training and 33% for testing.
Features:
– First Stage: Statistical Spectrum Descriptors (168 features).
– Second Stage: Area Method of Moments (20 features).

Proposal
Training
– First Stage: Train an MLP neural network with the style assignments as outputs. Calculate a threshold for each class from the output probabilities. Use the confusion matrix to find the most confused classes and build a list of confused classes.
– Second Stage: Train binary SVM classifiers for the list of confused classes using a different feature dataset.
– Third Stage: Train binary classifiers again, this time as 2-class MLP neural networks, with the same configuration as the second stage.
Evaluating
– First Stage: Get the MAX 1 and MAX 2 output probabilities. Compare MAX 1 with the threshold to reject, classify or send the sample to the second stage.
– Second Stage: Get MAX 3. Look up the binary classifier and compare its output with the threshold and MAX 1 to reject, classify or send the sample to the third stage.
– Third Stage: Get MAX 4 and combine its probability with MAX 3. Use the threshold to reject or classify.
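The first-stage evaluation step can be sketched as a small routing function. The slides do not spell out the exact decision rule, so the rule below is an assumption: accept when MAX 1 reaches the per-class threshold, otherwise forward the sample only if the two top classes form a known confused pair, otherwise reject. The names `route_first_stage`, `accept_threshold` and `confused_pairs` are hypothetical.

```python
def route_first_stage(probs, accept_threshold, confused_pairs):
    """Decide what happens to one sample after the first stage (assumed rule).

    probs: dict mapping class name -> first-stage output probability.
    accept_threshold: dict mapping class name -> minimum probability to accept.
    confused_pairs: set of frozensets, each holding two frequently confused classes.
    Returns ("classify", cls), ("second_stage", (cls1, cls2)) or ("reject", None).
    """
    ranked = sorted(probs, key=probs.get, reverse=True)
    max1, max2 = ranked[0], ranked[1]             # classes behind MAX 1 and MAX 2

    if probs[max1] >= accept_threshold[max1]:
        return ("classify", max1)                 # confident enough: accept MAX 1
    if frozenset((max1, max2)) in confused_pairs:
        return ("second_stage", (max1, max2))     # known confusion: ask a binary classifier
    return ("reject", None)                       # neither confident nor a known confusion


probs = {"Punk": 0.5, "Rock Hard": 0.4, "Reggae": 0.1}
pairs = {frozenset(("Punk", "Rock Hard"))}
strict = {"Punk": 0.6, "Rock Hard": 0.6, "Reggae": 0.6}
decision = route_first_stage(probs, strict, pairs)
```

With the strict thresholds above, the sample is not accepted directly, but Punk and Rock Hard form a listed confusion, so it is forwarded to the second stage.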

Training the First Stage
Classifier: MLP neural network with 168 inputs, 100 hidden-layer units and 25 outputs.
Features: Statistical Spectrum Descriptors.
Partition: 66% of the dataset.
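As a rough illustration of the 168-100-25 network shape, the sketch below runs one forward pass in plain Python and counts the trainable parameters. Only the layer sizes come from the slide. The tanh hidden activation, the softmax output, the weight range and all function names are assumptions, and no training is shown.

```python
import math
import random

N_INPUTS, N_HIDDEN, N_OUTPUTS = 168, 100, 25      # sizes taken from the slide


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (assumed initialisation)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, weights, biases):
    """Affine map: one weighted sum plus bias per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def softmax(z):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def forward(x, hidden, output):
    """168 features -> 100 tanh units -> 25 class probabilities."""
    h = [math.tanh(v) for v in dense(x, *hidden)]
    return softmax(dense(h, *output))


def parameter_count():
    """Weights plus biases of both layers."""
    return (N_INPUTS + 1) * N_HIDDEN + (N_HIDDEN + 1) * N_OUTPUTS


rng = random.Random(0)
hidden = init_layer(N_INPUTS, N_HIDDEN, rng)
output = init_layer(N_HIDDEN, N_OUTPUTS, rng)
probs = forward([0.0] * N_INPUTS, hidden, output)
```

The 25 output probabilities are what the later stages consume: the two largest values play the role of MAX 1 and MAX 2.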

Training the First Stage
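One way to build the list of confused classes from the first-stage confusion matrix is sketched below. The slides do not say how the pairs are selected, so the criterion here (sum the mistakes in both directions and keep the `top_k` largest) is an assumption, as are the function name and the toy numbers.

```python
def most_confused_pairs(confusion, class_names, top_k=3):
    """Rank class pairs by how often they are mistaken for each other.

    confusion[i][j] is the number of samples of true class i predicted as class j.
    Assumed criterion: confusions in both directions are added together.
    """
    scores = []
    n = len(class_names)
    for i in range(n):
        for j in range(i + 1, n):
            mistakes = confusion[i][j] + confusion[j][i]
            if mistakes > 0:
                scores.append((mistakes, class_names[i], class_names[j]))
    # Most mistakes first; ties are broken alphabetically to keep the result stable.
    scores.sort(key=lambda item: (-item[0], item[1], item[2]))
    return [(a, b) for _, a, b in scores[:top_k]]


names = ["Punk", "Reggae", "Rock Hard"]
confusion = [
    [50, 2, 30],   # true Punk
    [1, 40, 0],    # true Reggae
    [25, 3, 60],   # true Rock Hard
]
pairs = most_confused_pairs(confusion, names, top_k=2)
```

In this toy matrix Punk and Rock Hard are mistaken for each other 55 times, far more than any other pair, so that pair heads the list.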

Training the Second Stage
Classifier: 2-class SVM, with grid search to estimate the cost and γ parameters.
Features: Area Method of Moments.
Partition: 66% of the dataset.

Training the Second Stage
Train each binary classifier in the list of binary classifiers, one per pair of confused classes.
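Training one binary classifier per confused pair might look like the sketch below. It only shows how the training data is restricted to the two classes of each pair. The actual learner (an SVM tuned by grid search in the slides) is passed in as a caller-supplied function, here the hypothetical `train_binary`, and all other names are assumptions too.

```python
def pair_subset(features, labels, pair):
    """Keep only the samples whose label is one of the two classes in `pair`."""
    first, second = pair
    xs, ys = [], []
    for x, y in zip(features, labels):
        if y == first or y == second:
            xs.append(x)
            ys.append(0 if y == first else 1)     # binary target: 0 = first class, 1 = second
    return xs, ys


def train_pair_classifiers(features, labels, confused_pairs, train_binary):
    """Train one binary model per confused pair with a caller-supplied trainer."""
    models = {}
    for pair in confused_pairs:
        xs, ys = pair_subset(features, labels, pair)
        models[frozenset(pair)] = train_binary(xs, ys)
    return models


features = [[0.0], [1.0], [2.0], [3.0]]
labels = ["Punk", "Reggae", "Rock Hard", "Punk"]
# Stand-in trainer that just records the subset size and the number of positives.
models = train_pair_classifiers(
    features, labels, [("Punk", "Rock Hard")], lambda xs, ys: (len(xs), sum(ys))
)
```

Keying the models by `frozenset` means the lookup at evaluation time does not depend on which of the two classes came out as MAX 1.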

Training the Third Stage
Classifier: 2-class MLP neural network and 2-class SVM, the same one used in the second stage.
Features: Area Method of Moments, the same as in the second stage.
2-class MLP NN: Train each binary classifier in the list, using the same training method adopted in the second stage.

Evaluating the First Stage

Evaluating the Second Stage

Evaluating the Third Stage

Results

First Stage (%)

Class                    Classified         Rejected       Sent to 2nd Stage
                         TP       FP       TP       FP       TP       FP
Big Band              0.000    0.345    0.000    0.332    0.000    0.463
Blues Contemporary    0.128    0.575    0.031    0.854    0.063    0.862
Country Traditional   1.430    0.706    0.188    0.589    0.419    0.742
Dance                 0.481    2.476    0.159    0.655    0.229    1.506
Electronica           0.099    1.648    0.091    0.918    0.105    1.121
Experimental          0.023    1.408    0.013    1.332    0.019    1.623
Folk International    0.011    1.217    0.012    0.879    0.001    1.481
Gospel                0.000    1.211    0.000    0.478    0.000    0.862
Grunge Emo            0.000    1.250    0.000    0.401    0.000    0.630
Hip Hop Rap           4.465    0.289    0.243    0.123    0.514    0.259
Jazz Classic          0.595    0.524    0.356    0.582    0.532    1.070
Metal Alternative     2.075    1.074    0.196    0.565    0.529    0.683
Metal Death           0.964    1.267    0.017    0.304    0.549    0.509
Metal Heavy           0.271    1.937    0.024    0.491    0.094    1.098
Pop Contemporary      0.413    2.308    0.031    0.624    0.203    1.410
Pop Indie             0.838    1.936    0.459    1.124    0.195    2.051
Pop Latin             0.078    1.172    0.019    0.605    0.039    0.897
Punk                  0.491    1.341    0.103    0.557    0.168    0.854
Reggae                0.026    0.973    0.014    0.434    0.012    0.454
RnB Soul              0.000    0.995    0.000    0.449    0.000    0.844
Rock Alternative      0.000    2.209    0.000    0.964    0.000    1.468
Rock College          0.079    2.501    0.004    1.488    0.025    1.949
Rock Contemporary     1.152    1.821    0.143    0.792    0.394    1.730
Rock Hard             0.161    1.798    0.012    1.194    0.111    1.581
Rock Neo Psychedelia  0.000    1.990    0.000    0.796    0.000    1.261
Total                13.780   34.968    2.116   17.529    4.200   27.408

Second Stage (%)

Class                    Classified         Rejected       Sent to 3rd Stage
                         TP       FP       TP       FP       TP       FP
Big Band              0.000    0.155    0.005    0.000    0.303    0.000
Blues Contemporary    0.005    0.263    0.029    0.000    0.627    0.000
Country Traditional   0.026    0.297    0.025    0.000    0.801    0.012
Dance                 0.130    0.325    0.154    0.000    0.699    0.427
Electronica           0.028    0.331    0.097    0.000    0.770    0.000
Experimental          0.009    0.613    0.034    0.000    0.987    0.000
Folk International    0.000    0.454    0.056    0.000    0.972    0.000
Gospel                0.000    0.254    0.038    0.000    0.570    0.000
Grunge Emo            0.000    0.336    0.013    0.000    0.281    0.000
Hip Hop Rap           0.051    0.110    0.000    0.066    0.535    0.011
Jazz Classic          0.151    0.360    0.050    0.000    0.992    0.049
Metal Alternative     0.397    0.177    0.016    0.000    0.548    0.074
Metal Death           0.104    0.314    0.002    0.000    0.631    0.008
Metal Heavy           0.067    0.493    0.009    0.000    0.350    0.274
Pop Contemporary      0.049    0.379    0.108    0.000    0.828    0.249
Pop Indie             0.129    0.666    0.055    0.000    0.946    0.450
Pop Latin             0.000    0.204    0.069    0.000    0.663    0.000
Punk                  0.012    0.519    0.012    0.000    0.458    0.021
Reggae                0.000    0.110    0.041    0.000    0.315    0.000
RnB Soul              0.000    0.239    0.039    0.000    0.566    0.000
Rock Alternative      0.000    0.547    0.028    0.000    0.893    0.000
Rock College          0.009    0.750    0.034    0.000    1.182    0.000
Rock Contemporary     0.278    0.262    0.074    0.000    0.457    1.053
Rock Hard             0.075    0.642    0.042    0.000    0.933    0.000
Rock Neo Psychedelia  0.000    0.563    0.031    0.000    0.666    0.000
Total                 1.518    9.364    1.061    0.066   16.974    2.625

The results are given as percentages of the total number of test patterns.
Classified TP: samples classified correctly.
Classified FP: samples classified incorrectly.
Rejected TP: samples rejected that would have been classified incorrectly.
Rejected FP: samples rejected that would have been classified correctly.
Sent to 2nd Stage TP: samples sent to the second stage that would have been classified incorrectly.
Sent to 2nd Stage FP: samples sent to the second stage that would have been classified correctly.
Sent to 3rd Stage TP: samples sent to the third stage that would have been classified incorrectly.
Sent to 3rd Stage FP: samples sent to the third stage that would have been classified correctly.
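The Total row can be cross-checked with a few lines of arithmetic. The sketch below only re-uses the totals reported above. The variable names and the derived quantities (the share of correct decisions among the samples the first stage classified, and the correct classifications accumulated over two stages) are illustrative additions rather than figures from the slides.

```python
# Totals from the results table, in percent of all test patterns.
first_stage = {
    "classified_tp": 13.780, "classified_fp": 34.968,
    "rejected_tp": 2.116, "rejected_fp": 17.529,
    "sent_tp": 4.200, "sent_fp": 27.408,
}
second_stage = {
    "classified_tp": 1.518, "classified_fp": 9.364,
    "rejected_tp": 1.061, "rejected_fp": 0.066,
    "sent_tp": 16.974, "sent_fp": 2.625,
}

# Every test pattern is classified, rejected or forwarded by the first stage,
# so the six first-stage totals should add up to about 100%.
first_total = sum(first_stage.values())

# The second stage only sees what the first stage forwarded,
# so its six totals should add up to that forwarded share.
sent_to_second = first_stage["sent_tp"] + first_stage["sent_fp"]
second_total = sum(second_stage.values())

# Share of correct decisions among the samples the first stage chose to classify.
classified = first_stage["classified_tp"] + first_stage["classified_fp"]
precision_first = first_stage["classified_tp"] / classified

# Correct classifications accumulated after two stages.
correct_after_two = first_stage["classified_tp"] + second_stage["classified_tp"]
```

The first-stage totals add up to 100.001, which is 100% up to rounding, and the second-stage totals add up to the 31.608% of patterns that the first stage forwarded.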