3. Applications to Speaker Verification

Presentation transcript:

3. Applications to Speaker Verification 11/19/2018

Outline of the presentation
3. Applications to Speaker Verification
3.1 Feature extraction
3.2 Speaker models
3.3 Scoring normalization
3.4 Video demo

Pre-processing and Feature Extraction. [Block diagram: A/D converter; Hamming windowing; LPC analysis; cepstrum and delta cepstrum coefficients.]
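The chain named on this slide (windowing, LPC analysis, cepstrum and delta cepstrum) can be sketched as follows. This is a minimal illustration, not the presenters' implementation: it assumes the autocorrelation method with the Levinson-Durbin recursion for LPC analysis, the standard LPC-to-cepstrum recursion for an all-pole model, and a simple two-frame difference for the delta coefficients. All function names are hypothetical.

```python
import math

def hamming(n):
    """Hamming window of length n (n >= 2)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def autocorr(x, max_lag):
    """Autocorrelation values r[0..max_lag] of one windowed frame."""
    return [sum(x[i] * x[i + k] for i in range(len(x) - k))
            for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Predictor coefficients a_1..a_order for x[n] ~ sum_k a_k * x[n-k].

    Returns (coefficients, residual prediction error).
    """
    a = [0.0] * (order + 1)          # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err                # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1.0 - k * k)
    return a[1:], err

def lpc_to_cepstrum(a, n_ceps):
    """Cepstral coefficients c_1..c_n_ceps of the all-pole model 1/(1 - sum a_k z^-k)."""
    p = len(a)
    c = [0.0] * (n_ceps + 1)         # c[0] is unused
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]

def delta(frames):
    """Delta coefficients: half the difference between the neighbouring frames.

    Assumption: edge frames reuse themselves as the missing neighbour.
    """
    last = len(frames) - 1
    out = []
    for t in range(len(frames)):
        prev, nxt = frames[max(t - 1, 0)], frames[min(t + 1, last)]
        out.append([(b - a) / 2.0 for a, b in zip(prev, nxt)])
    return out
```

For one frame, the calls would be chained as: multiply the samples by `hamming(len(frame))`, pass the result to `autocorr`, then `levinson_durbin`, then `lpc_to_cepstrum`; `delta` is applied afterwards to the sequence of per-frame cepstral vectors.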

Pre-processing and Feature Extraction: Cepstral Processing. [Figure: spectral envelope reconstructed from different feature parameters. Curves: FFT-based signal spectrum, LP spectrum, spectrum derived from LP-cepstrum. Axes: spectrum amplitude (dB) against frequency (Hz).]

Enrollment (training procedure)
RBF network: feature vectors; K-means gives the function centers; K-nearest neighbor gives the function widths; linear regression gives the output weights.
EBF network: feature vectors; K-means gives the function centers; covariance analysis or EM gives the covariance matrices; linear regression gives the output weights.
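To make the difference between the two network types concrete, the sketch below evaluates a Gaussian basis function in both forms. It is an illustration under stated assumptions: the EBF case is restricted to diagonal covariance matrices for brevity (the slides also use full matrices), the RBF case is treated as the special case with a single width per center, and the function names are hypothetical.

```python
import math

def ebf_activation(x, center, variances):
    """Elliptical basis function with a diagonal covariance matrix.

    Each dimension is scaled by its own variance before the distance is taken.
    """
    d2 = sum((xi - ci) ** 2 / vi for xi, ci, vi in zip(x, center, variances))
    return math.exp(-0.5 * d2)

def rbf_activation(x, center, width):
    """Radial basis function: the same width in every dimension."""
    return ebf_activation(x, center, [width ** 2] * len(x))

def network_output(x, centers, variances, weights, bias):
    """Linear combination of the basis function activations plus a bias term."""
    return bias + sum(w * ebf_activation(x, c, v)
                      for w, c, v in zip(weights, centers, variances))
```

With per-dimension variances the basis function can stretch along feature dimensions that vary more, which a single radial width cannot do.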

Enrollment. [Figure: network structure. Input (feature vectors): x1, x2, ..., xD. Hidden layer: speaker centers and background speakers' centers, plus a bias unit. Output weights connect the hidden layer to the outputs.]

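Verification. [Figure: the feature vector (x1, x2, ..., xD) is fed to the network; the outputs y(x) pass through a softmax, each softmax output is averaged, and the two averages are combined with + and - signs.]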
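One reading of this verification diagram is sketched below: per-frame network outputs go through a softmax, each class is averaged over the utterance, and the difference of the two averages is compared with a threshold. The two-output layout (speaker versus background), the default threshold of 0.0 and the function names are assumptions made for illustration.

```python
import math

def softmax(outputs):
    """Numerically stable softmax over one frame's network outputs."""
    m = max(outputs)
    exps = [math.exp(o - m) for o in outputs]
    total = sum(exps)
    return [e / total for e in exps]

def utterance_score(frame_outputs):
    """Average softmax output for the speaker class minus that of the background class.

    frame_outputs: list of (speaker_output, background_output), one pair per frame.
    """
    probs = [softmax(list(pair)) for pair in frame_outputs]
    n = len(probs)
    avg_speaker = sum(p[0] for p in probs) / n
    avg_background = sum(p[1] for p in probs) / n
    return avg_speaker - avg_background

def verify(frame_outputs, threshold=0.0):
    """Accept the claimant when the utterance score exceeds the threshold."""
    return utterance_score(frame_outputs) > threshold
```

Averaging over all frames of an utterance smooths out frames where the network is uncertain, so the decision rests on the utterance as a whole rather than on any single feature vector.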

Verification. [Figure: distributions of the average network outputs. Panels: RBF, EBF.]

Verification. [Figure: error rates against decision threshold.]
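Curves of error rate against decision threshold are typically obtained by sweeping the threshold over the scores of genuine and impostor trials. The sketch below computes the false acceptance and false rejection rates at a given threshold and a rough equal error rate. The score lists in the usage note are made-up illustration values, not results from the presentation, and the function names are hypothetical.

```python
def error_rates(genuine, impostor, threshold):
    """Return (false acceptance rate, false rejection rate) at one threshold.

    A trial is accepted when its score is strictly greater than the threshold.
    """
    frr = sum(s <= threshold for s in genuine) / len(genuine)
    far = sum(s > threshold for s in impostor) / len(impostor)
    return far, frr

def equal_error_rate(genuine, impostor):
    """Approximate EER: the candidate threshold where FAR and FRR are closest.

    Only observed scores are tried as thresholds, so the result is coarse
    when there are few trials.
    """
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = error_rates(genuine, impostor, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2.0, t)
    return best[1], best[2]
```

For example, with `genuine = [0.9, 0.8, 0.7, 0.3]` and `impostor = [0.1, 0.2, 0.4, 0.6]`, raising the threshold lowers the false acceptance rate and raises the false rejection rate, and the two meet at a threshold of 0.4.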

Verification Results (TIMIT). [Results figure; axis label: number of centers per network.]

Verification Results: Decision Boundaries. [Figure panels: EBF (diagonal covariance matrices), EBF (full covariance matrices).]

Conclusion: EBF networks with full covariance matrices trained with the EM algorithm outperform those whose basis function parameters are estimated by the k-means algorithm and sample covariance. RBF networks are the poorest performers in terms of verification accuracy.

Conclusion: EBF networks with full covariance matrices achieve the lowest error rates when networks with the same number of free parameters are compared.

4. Bonus Materials: Scoring Normalization for Speaker Verification

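Purpose of Scoring Normalization. [Block diagram: speech with a claimed speaker ID goes through feature extraction to give X. X is scored by the speaker model of the claimed ID (score Sc) and by the impostor models (normalization term); the normalization term is subtracted from Sc.]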

Purpose of Scoring Normalization
If log L(X) > Threshold: accept the claimant.
If log L(X) ≤ Threshold: reject the claimant.
[Figure: probability against x, with example points x1 (accept) and x2 (reject).]
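The decision rule above can be sketched in a few lines. The slides do not spell out how the normalization term is computed, so this example assumes it is the log of the average impostor-model likelihood; that choice, the default threshold and the function names are illustrative assumptions only.

```python
import math

def normalized_score(speaker_loglik, impostor_logliks):
    """Speaker-model log-likelihood minus a normalization term.

    Assumption: the normalization term is the log of the mean impostor
    likelihood, computed with the log-sum-exp trick for numerical stability.
    """
    m = max(impostor_logliks)
    mean_exp = sum(math.exp(v - m) for v in impostor_logliks) / len(impostor_logliks)
    normalization = m + math.log(mean_exp)
    return speaker_loglik - normalization

def decide(speaker_loglik, impostor_logliks, threshold=0.0):
    """Accept when the normalized score exceeds the threshold, otherwise reject."""
    score = normalized_score(speaker_loglik, impostor_logliks)
    return "accept" if score > threshold else "reject"
```

Subtracting the normalization term makes the score relative: an utterance that scores poorly under every model (for example because of noise) is not rejected for that reason alone, since the impostor scores drop as well.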

EBFN-based normalization. Speaker models: elliptical basis function networks (EBFN). [Figure labels: speaker centers, anti-speaker centers.]
