Sfax University, Tunisia
National Engineering School of Sfax
Advanced Technologies for Medicine and Signals Research Unit

Today I will present my paper entitled "Efficient parameterization for Automatic speaker recognition using Support Vector Machines".

Rania Chakroun, Mondher Frikha, Leila Beltaifa Zouari
Outline
1. Introduction
2. System Overview
3. Experimental Results
4. Conclusions & Future Work

My presentation is organized as follows. First, I will present the context of this work with an introduction to speaker recognition. Second, I will give an overview of the systems developed. Then, I will detail our experimental results. Finally, I will give our conclusions and future work. Let us begin with the introduction.

Chakroun Rania ISDA 2015
Motivation
Individuals need to be identified in sensitive domains and applications such as:
Access control
Searching for and monitoring individuals
Border control
Military intelligence (identification of suspects)
…
That is why biometric technology has been developed to improve security and reduce fraud, instead of passwords, PINs or signatures (which can be stolen, forgotten or forged).
Motivation
There are many biometric modalities: fingerprint, palmprint, hand geometry, face, iris, retina scan, DNA, signatures, voiceprint, …
Voiceprint is the modality adopted in this work.
Extracting Information from Speech
Speech conveys many kinds of information:
Accent Recognition: Where is he/she from?
Language Recognition: What language was spoken?
Speech Recognition: What was spoken?
Emotion Recognition: Positive? Negative? Happy? Sad?
Gender Recognition: Male or female?
Speaker Recognition: Who spoke?
Speaker Recognition applications
The applications of speaker recognition technology are quite varied:
Authentication: banking applications.
Law enforcement: proving the identity of a recorded voice can help to convict a criminal or clear an innocent person in court.
Surveillance: electronic eavesdropping of telephone and radio conversations.
Problems
Speaker recognition applications share many problems:
Inter-speaker variability
Intra-speaker variability
Spoofing attacks
Training and test data duration
… etc.
Objective
In this work, we want to determine who is talking from a set of known voices, knowing that:
No identity claim is made by the user
The data duration is limited
Which information do we use? How can we exploit it?
Let us move on to the System Overview.
Speaker Recognition System
Speaker recognition systems have 2 main parts: feature extraction and a learning (or recognition) algorithm.
[Diagram: Feature Extraction -> Learning algorithm; Feature Extraction -> Recognition algorithm -> Decision]
Which characteristics do these systems use, and on which approaches do they depend?
Characteristics & approaches
We find many characteristics: MFCC, LFCC, LPCC, PLP, …
We find different approaches, of which the most successful are Gaussian Mixture Models (GMM) and Support Vector Machines (SVM).
Database
TIMIT database
64 American speakers from 8 dialect regions ("dr1" to "dr8")
10 utterances per speaker
Average duration of 3.28 s per utterance

Dialect region code    Dialect region
DR1                    New England
DR2                    Northern
DR3                    North Midland
DR4                    South Midland
DR5                    Southern
DR6                    New York City
DR7                    Western
DR8                    Army brat
Speaker Recognition System
[Diagram: Feature Extraction -> Learning algorithm; Feature Extraction -> Recognition algorithm -> Decision]
So, which characteristics do we use? And which approach do we choose?
Characteristics & approaches
We choose the approach based on Support Vector Machines (SVM), and we develop three systems that differ in their characteristics:
First system, based on Mel Frequency Cepstral Coefficients (MFCC): 19 MFCC + delta + double delta (60-dimensional vectors)
Second proposed system, based on reduced MFCC features: 12 MFCC + delta + double delta (39-dimensional vectors)
Third proposed system, based on new cepstral features: Cepstral Mean and Variance Normalization (CMVN), 39-dimensional vectors
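As a rough illustration of the feature pipeline named on this slide, the sketch below appends delta and double-delta coefficients to static cepstral frames and then applies cepstral mean and variance normalization over one utterance. It assumes the 39 and 60 dimensions come from 13 and 20 static coefficients (12 or 19 MFCC plus one energy-like term), which the slides do not state. The function names, the simple neighbour difference used for the deltas, and the toy input are hypothetical and are not the authors' implementation.

```python
import math


def deltas(frames):
    """Differences between neighbouring frames (edges are clamped).

    This is a simplified delta; real front ends often use a regression window.
    """
    n = len(frames)
    out = []
    for t in range(n):
        prev = frames[max(t - 1, 0)]
        nxt = frames[min(t + 1, n - 1)]
        out.append([(b - a) / 2.0 for a, b in zip(prev, nxt)])
    return out


def add_dynamic_features(static):
    """Concatenate static, delta and double-delta coefficients per frame."""
    d1 = deltas(static)
    d2 = deltas(d1)
    return [s + a + b for s, a, b in zip(static, d1, d2)]


def cmvn(frames, eps=1e-10):
    """Normalize each coefficient to zero mean and unit variance over the utterance."""
    n = len(frames)
    dim = len(frames[0])
    means = [sum(f[k] for f in frames) / n for k in range(dim)]
    stds = [
        math.sqrt(sum((f[k] - means[k]) ** 2 for f in frames) / n)
        for k in range(dim)
    ]
    return [
        [(f[k] - means[k]) / (stds[k] + eps) for k in range(dim)]
        for f in frames
    ]


# Toy utterance: 50 frames of 13 made-up static coefficients (assumption:
# 12 MFCC plus one energy-like term, giving 13 * 3 = 39 dimensions).
static = [[math.sin(0.1 * t * (k + 1)) for k in range(13)] for t in range(50)]
features = cmvn(add_dynamic_features(static))
print(len(features), len(features[0]))
```

The same functions applied to 20 static coefficients per frame give 60-dimensional vectors, matching the first system's dimensionality under the same assumption.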
Results
[Figure: Speaker identification rates with SVM-based systems, with 8 utterances for training and 2 utterances for testing, using RBF and linear kernels. Best results shown: 98.43 %, 100 % and 100 %.]
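For readers unfamiliar with the two kernels compared on this slide, here is a minimal sketch of their standard definitions. The gamma value and the example vectors are illustrative assumptions; the slides do not report the kernel parameters used.

```python
import math


def linear_kernel(x, y):
    """Linear kernel: the plain dot product of two feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, gamma=0.5):
    """RBF kernel: exp(-gamma * ||x - y||^2); gamma here is an arbitrary choice."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


x = [1.0, 2.0]
y = [3.0, 4.0]
print(linear_kernel(x, y))  # dot product of the two vectors
print(rbf_kernel(x, x))     # identical vectors give the maximum similarity
print(rbf_kernel(x, y))     # similarity decays with squared distance
```

The linear kernel compares vectors directly in the feature space, while the RBF kernel measures similarity through distance, which lets the SVM draw non-linear decision boundaries between speakers.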
Results
[Figure: Speaker identification rates with SVM-based systems, with 3 utterances for training and 2 utterances for testing, using RBF and linear kernels. Best results shown: 96.88 %, 98.43 % and 100 %.]
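The reported percentages are consistent with simple counting, which the sketch below reproduces. It assumes each test utterance is scored as one trial (64 speakers with 2 test utterances each, so 128 trials) and uses the average utterance duration from the database slide to estimate the training speech per speaker. The slides do not state how the rates were computed, so the error counts are an inference.

```python
SPEAKERS = 64          # from the database slide
TEST_UTTERANCES = 2    # test utterances per speaker
AVG_DURATION_S = 3.28  # average utterance duration in seconds

trials = SPEAKERS * TEST_UTTERANCES  # assumed: one trial per test utterance


def identification_rate(errors):
    """Percentage of correctly identified trials."""
    return 100.0 * (trials - errors) / trials


for errors in (0, 2, 4):
    print(errors, "errors ->", identification_rate(errors), "%")

# Approximate training speech per speaker for the two training conditions.
for n_train in (8, 3):
    print(n_train, "utterances ->", round(n_train * AVG_DURATION_S, 2), "s")
```

Under this assumption, 2 errors give 98.4375 % (shown as 98.43 %) and 4 errors give 96.875 % (shown as 96.88 %). Training speech amounts to roughly 26 s per speaker with 8 utterances and roughly 10 s with 3.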
Conclusions & Future Work
Speaker recognition technology is a viable technique currently available for applications.
A new approach based on SVM using new low-dimensional CMVN feature vectors is proposed and gives significant improvements for speaker identification.
Future work will focus on further experiments to test the new approach with new algorithms and/or features.
Research will focus on using speaker identification for more unconstrained and uncontrolled situations.
Thank you for your attention