Research & development Component Score Weighting for GMM based Text-Independent Speaker Verification Liang Lu SNLP Unit, France Telecom R&D Beijing 2008-01-21.

Presentation transcript:

Component Score Weighting for GMM-based Text-Independent Speaker Verification
Liang Lu
SNLP Unit, France Telecom R&D Beijing

Outline
- Introduction
- Conventional LLR and motivation for detailed score processing
- Component Score Weighting
- Experimental Results
- Conclusion

Introduction
State-of-the-art GMM-UBM framework
- GMM-based model construction
- Log-likelihood ratio (LLR) based decision making
- Score normalisation (TNorm, HNorm, etc.) for robustness
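The LLR-based decision in the framework above can be sketched as follows. This is a minimal illustration, assuming diagonal-covariance GMMs stored as lists of `(weight, mean, var)` triples; the function names and the toy one-dimensional models are hypothetical and do not come from the presentation.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def log_gmm(x, gmm):
    """Log-likelihood of one frame under a GMM (log-sum-exp over components)."""
    logs = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
    top = max(logs)
    return top + math.log(sum(math.exp(l - top) for l in logs))

def average_llr(frames, target, ubm):
    """Frame-averaged log-likelihood ratio between speaker model and UBM."""
    total = sum(log_gmm(x, target) - log_gmm(x, ubm) for x in frames)
    return total / len(frames)

# Toy 1-D models (illustrative values only).
ubm = [(0.5, [0.0], [1.0]), (0.5, [4.0], [1.0])]
target = [(0.5, [0.5], [1.0]), (0.5, [4.5], [1.0])]
frames_target = [[0.6], [4.4], [0.5]]
frames_impostor = [[-0.5], [3.5], [-0.4]]

print(average_llr(frames_target, target, ubm))
print(average_llr(frames_impostor, target, ubm))
```

The trial is accepted when the averaged LLR exceeds a threshold; score normalisation such as TNorm would be applied to this value before thresholding.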

Introduction
Major challenges
- Limited data for speaker model training
- Mismatch between training and testing data

Motivation for Component Score Weighting
Motivation
- Insufficient training data and mismatch between training and testing conditions give the mixtures in a GMM different discriminative capabilities.
- The LLR just sums the score of each mixture without considering its reliability.
- Would it help if the LLR considered the discriminative capability of each mixture?
Question
- If it does, how can we explore the discriminative capabilities of the Gaussian mixture components?

Component Score Weighting
Our method
First, scatter the LLR to each Gaussian mixture, where the k-th mixture is the dominant one for a given frame. This splits each frame's score into two terms, which we call the dominant score and the residual score.
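One way to picture the split into dominant and residual scores is sketched below. It rests on assumptions that the slide does not confirm: the speaker model is taken to share component indices with the UBM (as with MAP adaptation), the dominant mixture is chosen on the UBM, the dominant score is the log-ratio of that single component, and the residual is whatever remains of the frame LLR. All names are hypothetical.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def component_logs(x, gmm):
    """Weighted log-likelihood of frame x under each component."""
    return [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]

def logsumexp(values):
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))

def split_frame_llr(x, target, ubm):
    """Return (k, dominant, residual) for one frame.

    Assumption: dominance is decided on the UBM and the same index k
    is used in the speaker model.
    """
    tar_logs = component_logs(x, target)
    ubm_logs = component_logs(x, ubm)
    k = max(range(len(ubm_logs)), key=ubm_logs.__getitem__)
    dominant = tar_logs[k] - ubm_logs[k]
    frame_llr = logsumexp(tar_logs) - logsumexp(ubm_logs)
    residual = frame_llr - dominant  # remainder, so the split is exact
    return k, dominant, residual

# Toy 1-D models (illustrative values only).
ubm = [(0.5, [0.0], [1.0]), (0.5, [4.0], [1.0])]
target = [(0.5, [0.5], [1.0]), (0.5, [4.5], [1.0])]

for frame in ([0.6], [4.4]):
    print(frame, split_frame_llr(frame, target, ubm))
```

Under these assumptions the residual is small whenever one mixture clearly dominates the frame, which is in line with the later remark that the residual scores matter little.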

Component Score Weighting
Extend the original LLR
After doing this, the original LLR is split into two score series: the dominant score series and the residual score series.
Original: [equation missing]
If we consider the discriminative capacity of each Gaussian mixture, the LLR is extended as follows.
Extended: [equation missing]
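One plausible way to write the original and extended scores is given below. It is an assumption, not the presentation's own notation: $d_t$ and $r_t$ stand for the dominant and residual scores of frame $t$, $k_t$ for the index of the mixture that dominates frame $t$, $K$ for the number of mixtures, $T$ for the number of frames, and $f_k$ for a hypothetical weight attached to mixture $k$.

```latex
\Lambda_{\text{orig}}(X)
  = \frac{1}{T}\sum_{t=1}^{T}\bigl(d_t + r_t\bigr)
  = \frac{1}{T}\sum_{k=1}^{K}\;\sum_{t:\,k_t = k}\bigl(1\cdot d_t + r_t\bigr),
\qquad
\Lambda_{\text{ext}}(X)
  = \frac{1}{T}\sum_{k=1}^{K}\;\sum_{t:\,k_t = k}\bigl(f_k\, d_t + r_t\bigr)
```

In this form the original LLR is the special case $f_k = 1$ for every mixture, so the extension only changes how much each mixture's dominant scores count.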

Component Score Weighting
Now the question is: how can we know the discriminative capability of each Gaussian mixture, and what should the weights be?
Our assumption: high dominant scores have better discriminative capability and should be highlighted.

Component Score Weighting
Why the high dominant scores?
- If the test utterance is from the target speaker, more components in the GMM should get high values compared with the UBM.
- If the utterance is from an impostor, the GMM has hardly any more high-valued components than the UBM.
- If the test utterance is from the target speaker, low-valued components in the GMM arise because the mixtures are not well trained or because of mismatch between training and testing data.

Component Score Weighting
We simply use an exponential function as the weighting function, so that low dominant scores are restrained and high dominant scores are emphasized.
The residual scores have little importance, and we ignore them in the end.
The final LLR score is as follows: [equation missing]
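A rough sketch of exponential weighting with the residual scores dropped is shown below. The exact weighting function is not recoverable from the slide, so the form `exp(alpha * mean_dominant_score)` and the parameter `alpha` are hypothetical choices made only to illustrate the idea of emphasizing high dominant scores and restraining low ones.

```python
import math

def csw_score(dominant_by_frame, num_components, alpha=1.0):
    """Weighted score from per-frame (k, dominant score) pairs.

    Residual scores are ignored. The weight of mixture k is
    exp(alpha * mean dominant score of k), a hypothetical form.
    """
    sums = [0.0] * num_components
    counts = [0] * num_components
    for k, d in dominant_by_frame:
        sums[k] += d
        counts[k] += 1

    total = 0.0
    for s, n in zip(sums, counts):
        if n == 0:
            continue  # mixture never dominant in this utterance
        weight = math.exp(alpha * (s / n))
        total += weight * s
    return total / len(dominant_by_frame)

# Toy dominant scores: mixture 0 scores high, mixture 1 scores low.
scores = [(0, 0.4), (0, 0.6), (1, -0.2), (1, -0.4)]
print(csw_score(scores, num_components=2, alpha=0.0))  # unweighted mean
print(csw_score(scores, num_components=2, alpha=1.0))  # high scores emphasized
```

With `alpha=0.0` every weight is 1 and the result is the plain mean of the dominant scores, so `alpha` controls how far the score departs from the unweighted case.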

Experimental Results
Experiments are performed on the 1conv4w-1conv4w task of the 2006 NIST SRE corpora.
Table: Results for the GMM baseline and GMM with Component Score Weighting (CSW), with and without TNorm
system                  EER (%)   MinDCF (x100)
GMM baseline
GMM with CSW
GMM with TNorm
GMM with CSW & TNorm
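For readers unfamiliar with the EER metric used in the table, the following sketch shows one simple way to estimate it from trial scores. It is a generic approximation that sweeps the threshold over the observed scores, not the evaluation tooling used for these experiments, and the score lists are made up.

```python
def equal_error_rate(target_scores, impostor_scores):
    """Approximate EER: the operating point where miss and false-alarm
    rates are closest, searched over all observed score values."""
    best_gap, eer = None, None
    for thr in sorted(set(target_scores) | set(impostor_scores)):
        miss = sum(s < thr for s in target_scores) / len(target_scores)
        fa = sum(s >= thr for s in impostor_scores) / len(impostor_scores)
        gap = abs(miss - fa)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (miss + fa) / 2.0
    return eer

# Made-up trial scores for illustration.
targets = [2.0, 1.5, 0.2, 1.0]
impostors = [-1.0, 0.5, -0.3, 0.1]
print(equal_error_rate(targets, impostors))
```

A lower EER means that target and impostor scores are better separated, which is the sense in which the systems in the table are compared.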

Conclusion
- Splitting the LLR score and considering the discriminative capacity of the Gaussian mixtures helps cope with insufficient training data and mismatch between training and testing conditions.
- The score weighting function should be consistent with the component score distribution and discriminative capacity.
- The exponential weighting function used in this investigation is not universal and may not be optimal. More work is needed to explore an optimal weighting function.
