Decision Making Based on Cohort Scores for Speaker Verification

Presentation transcript:

Decision Making Based on Cohort Scores for Speaker Verification
  Lantian Li
  CSLT / RIIT, Tsinghua University
  lilt13@mails.tsinghua.edu.cn
  Joint work with Renyu Wang, Caixia Wang and Thomas Fang Zheng
  IEEE APSIPA ASC, Dec. 13-16, 2016
  1/13/2019 APSIPA ASC 2016

Outline
  Introduction
    Single-score decision making
    Solution: multi-score decision making
  Cohort-based decision making framework
    Cohort selection
    Feature design
    Discriminative model training
  Experiments
  Conclusions

Introduction
  Speaker recognition
  Decision making
    The single-score decision is simple and efficient.
    It is quite sensitive to score variation (text content, channel, speaking style).
    Choosing an appropriate threshold is difficult.
    Together these lead to error-prone decisions.
  In a typical GMM-UBM system, the score is often computed as the log-likelihood ratio between the test utterance being generated by the GMM of the claimed speaker and by the UBM.
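As a rough illustration of the single-score decision described on this slide, the sketch below computes an average per-frame log-likelihood ratio between a claimed-speaker GMM and a UBM, then compares it with a fixed threshold. The diagonal-covariance GMMs, the function names (`gmm_loglik`, `llr_score`, `accept`), the threshold of 0 and the toy parameters are all assumptions made for illustration; none of them comes from the slides.

```python
import numpy as np

def gmm_loglik(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM."""
    frames = np.asarray(frames, dtype=float)            # shape (T, D)
    diff = frames[:, None, :] - means[None, :, :]       # shape (T, M, D)
    log_comp = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                       + np.sum(np.log(2 * np.pi * variances), axis=1))
    log_comp = log_comp + np.log(weights)               # add log mixture weights
    # log-sum-exp over the M components, done stably
    peak = log_comp.max(axis=1, keepdims=True)
    frame_ll = peak[:, 0] + np.log(np.exp(log_comp - peak).sum(axis=1))
    return float(frame_ll.mean())

def llr_score(frames, speaker_gmm, ubm):
    """Log-likelihood ratio: claimed-speaker GMM versus UBM."""
    return gmm_loglik(frames, *speaker_gmm) - gmm_loglik(frames, *ubm)

def accept(score, threshold=0.0):
    """Single-score decision: accept the claim if the score reaches the threshold."""
    return score >= threshold

# Toy models (hypothetical): one-component speaker GMM and a broader UBM.
speaker_gmm = (np.array([1.0]), np.array([[1.0, 1.0]]), np.array([[1.0, 1.0]]))
ubm = (np.array([1.0]), np.array([[0.0, 0.0]]), np.array([[4.0, 4.0]]))

target_frames = np.array([[1.0, 1.0], [0.8, 1.2], [1.1, 0.9]])
impostor_frames = np.array([[-3.0, -3.0], [-2.9, -3.1]])

target_score = llr_score(target_frames, speaker_gmm, ubm)
impostor_score = llr_score(impostor_frames, speaker_gmm, ubm)
```

The decision depends on one number and one threshold, which is why any shift in the score (for example from a channel change) translates directly into a shift in the decision.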

Introduction
  Score normalization techniques
    Bayes' theorem (Z-norm, T-norm, etc.)
  Cohort normalization
    The cohort replaces the UBM.
    It represents the alternative hypothesis more accurately.
    The cohort scores are simply averaged to normalize the target score.
  Still a single-score approach
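To make the "still a single-score approach" point concrete, here is a minimal sketch of T-norm and of plain cohort-mean normalization. In both cases the whole set of cohort scores is collapsed into one or two summary statistics before the decision. The function names and the toy scores are assumptions for illustration.

```python
import numpy as np

def t_norm(target_score, cohort_scores):
    """T-norm: standardize the target score with the cohort mean and std."""
    cohort_scores = np.asarray(cohort_scores, dtype=float)
    mu = cohort_scores.mean()
    sigma = cohort_scores.std()
    return float((target_score - mu) / max(sigma, 1e-8))

def cohort_mean_norm(target_score, cohort_scores):
    """Cohort normalization by simple averaging of the cohort scores."""
    return float(target_score - np.mean(cohort_scores))

# Toy example (hypothetical scores of one test utterance against three cohort models).
cohort = [0.0, 1.0, -1.0]
normalized = t_norm(2.0, cohort)
shifted = cohort_mean_norm(2.0, cohort)
```

Either function returns a single scalar, so the final decision is still a comparison of one score against one threshold.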

Motivations
  Our idea
    Cohort normalization is not just a mean average.
    Cohort scores: distributions, ranks, spanning areas, etc.
  A new cohort approach
    Decision on the whole cohort sets
    Employ a powerful discriminative model
    A true and reliable multi-score decision making

Cohort-based decision making framework
  Cohort selection
    How to select a cohort for each claimed speaker.
  Feature design
    How to fully use these cohort scores.
  Discriminative model training
    How to build a more powerful decision model.
  These three parts are added to the typical GMM-UBM system.

Cohort-based decision making framework
  Cohort selection
    Vector quantization (VQ)
    K-L distance
    Minimize the within-class cost
    Stopping criterion
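The slide names the ingredients (VQ, K-L distance, within-class cost, stopping criterion) without giving the procedure, so the following is only one possible reading. It assumes each speaker is summarized by a single diagonal Gaussian, uses the closed-form symmetric K-L divergence as the distance, and swaps in a medoid-based (k-medoids style) clustering in place of whatever VQ update the authors used. All function names and the toy data are hypothetical.

```python
import numpy as np

def sym_kl_diag(mean_a, var_a, mean_b, var_b):
    """Symmetric K-L divergence between two diagonal Gaussians."""
    d2 = (mean_a - mean_b) ** 2
    kl_ab = 0.5 * np.sum(np.log(var_b / var_a) + (var_a + d2) / var_b - 1.0)
    kl_ba = 0.5 * np.sum(np.log(var_a / var_b) + (var_b + d2) / var_a - 1.0)
    return float(kl_ab + kl_ba)

def distance_matrix(means, variances):
    """Pairwise symmetric K-L distances between all speaker models."""
    n = len(means)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            dist[i, j] = dist[j, i] = sym_kl_diag(
                means[i], variances[i], means[j], variances[j])
    return dist

def cluster_speakers(dist, n_clusters, max_iter=50, tol=1e-6):
    """Medoid-based clustering that stops when the within-class cost stalls."""
    n = dist.shape[0]
    medoids = list(range(n_clusters))          # simple deterministic start
    prev_cost = np.inf
    for _ in range(max_iter):
        labels = np.argmin(dist[:, medoids], axis=1)
        cost = float(dist[np.arange(n), np.array(medoids)[labels]].sum())
        if prev_cost - cost < tol:             # stopping criterion
            break
        prev_cost = cost
        for k in range(n_clusters):
            members = np.where(labels == k)[0]
            if len(members) > 0:
                within = dist[np.ix_(members, members)].sum(axis=1)
                medoids[k] = int(members[np.argmin(within)])
    return np.argmin(dist[:, medoids], axis=1)

def cohort_for(speaker, labels):
    """Cohort of a speaker: every other speaker in the same cluster."""
    same = np.where(labels == labels[speaker])[0]
    return [int(i) for i in same if i != speaker]

# Toy data (hypothetical): two well-separated groups of three speakers.
means = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                  [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
variances = np.ones_like(means)
labels = cluster_speakers(distance_matrix(means, variances), n_clusters=2)
cohort = cohort_for(0, labels)
```

A nearest-neighbor variant would skip the clustering and take, for each claimed speaker, the rows of the same distance matrix with the smallest values.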

Cohort-based decision making framework
  Feature design
    Distribution of scores
    Score normalization
    Rank position
    Sorted score differences
    …
  The score distributions of target trials and impostor trials are different.
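The slides list the feature types but not their formulas, so the sketch below uses assumed definitions: the normalized score is a T-norm style standardization, the rank position is the fraction of cohort scores that beat the target score, and the sorted score differences are the target score minus each cohort score after sorting the cohort in descending order. The function name `cohort_features` and the toy numbers are hypothetical.

```python
import numpy as np

def cohort_features(target_score, cohort_scores):
    """Build multi-score features from one target score and its cohort scores."""
    cohort = np.sort(np.asarray(cohort_scores, dtype=float))[::-1]  # descending
    mu, sigma = cohort.mean(), cohort.std()
    norm = (target_score - mu) / max(sigma, 1e-8)
    # Assumed definition: share of cohort scores strictly above the target score.
    r_pos = float(np.sum(cohort > target_score)) / len(cohort)
    # Assumed definition: target score minus each sorted cohort score.
    s_diff = target_score - cohort
    return {"norm": float(norm), "r-pos": r_pos, "s-diff": s_diff}

# Toy trial (hypothetical): target score 2.0 against four cohort scores.
features = cohort_features(2.0, [1.0, 3.0, 0.0, -1.0])
feature_vector = np.concatenate(
    [[features["norm"], features["r-pos"]], features["s-diff"]])
```

Unlike a normalized scalar, the resulting vector keeps information about how the target score sits relative to every cohort score, which is what a downstream discriminative model can exploit.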

Experiments
  Database ('CSLT-DSDB': all recordings are text-prompted digit strings)
    Training set: 200 females and 200 males for UBM training.
    Development set: 145 speakers including 280 enrollment and 2,874 test utterances, used for cohort selection and feature design.
    Evaluation set: 92 speakers including 1,220 target trials and 111,020 non-target trials.
  Experimental setups
    13-dim MFCCs + ∆ + ∆∆
    256 Gaussian components

Experiments
  Two cohort selection criteria
    Global clustering
    Sibling speakers
    Nearest neighbor

Experiments
  Feature design
    Rank position
    Sorted score differences

Experiments
  Discriminative model training
    SVM-based scoring: 1.621 (Baseline)
    NN-based scoring: 1.621 (Baseline)
  ‘norm’ is the standard T-norm method, ‘r-pos’ is the ‘Rank position’ feature, and ‘s-diff’ is the ‘Sorted score differences’ feature.
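The slides do not describe the SVM or NN configuration, so the following is only a sketch of what SVM-based scoring on cohort features could look like. It assumes a linear SVM trained by batch subgradient descent on the regularized hinge loss, with two-dimensional toy feature vectors standing in for (normalized score, rank position). The function names, hyperparameters and data are all hypothetical and do not reproduce the reported results.

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, epochs=500, lr=0.05):
    """Linear SVM via batch subgradient descent; labels y must be -1 or +1."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        active = margins < 1.0                 # samples inside or beyond the margin
        grad_w = lam * w - (y[active, None] * X[active]).sum(axis=0) / len(y)
        grad_b = -y[active].sum() / len(y)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def svm_score(x, w, b):
    """Decision value used as the new trial score (positive favors acceptance)."""
    return float(np.dot(x, w) + b)

# Toy training trials (hypothetical): [normalized score, rank position].
X_train = np.array([[2.5, 0.00], [3.0, 0.05], [2.0, 0.10],     # target trials
                    [-0.5, 0.60], [0.0, 0.50], [-1.0, 0.90]])  # impostor trials
y_train = np.array([1, 1, 1, -1, -1, -1])

w, b = train_linear_svm(X_train, y_train)
new_target = svm_score(np.array([2.8, 0.0]), w, b)
new_impostor = svm_score(np.array([-0.8, 0.8]), w, b)
```

The decision value replaces the raw or normalized score, so the threshold is applied to the output of a model that has seen the full cohort-derived feature vector.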

Conclusions
  Decision making method
    Distribution of cohort scores
    Design of score-level features (‘sorted score differences’)
    Powerful discriminative models
    Stable and better performance than the GMM-UBM baseline
  Future work
    Feature design and cohort selection

Thank you
  lilt.cslt.org
  IEEE APSIPA ASC, Dec. 13-16, 2016, Jeju, Korea