Construction of Discriminative Kernels from Known and Unknown Non-targets for PLDA-SVM Scoring

Wei Rao and Man-Wai Mak
Dept. of Electronic and Information Engineering, The Hong Kong Polytechnic University

Introduction and Motivation
NIST 2012 SRE permits systems to use information about the other target speakers (called known non-targets) in each verification trial.

Methods
We exploit this new protocol to enhance the performance of PLDA-SVM scoring [3], which is an effective way to utilize the multiple enrollment utterances of target speakers. Through empirical likelihood-ratio (LR) kernel maps, the score vectors of both known and unknown non-targets serve as the impostor-class data for training speaker-dependent SVMs; for each target speaker s, the impostor class contains the i-vectors of the competing known non-targets with respect to s. We also apply utterance partitioning (UP-AVR) to alleviate the imbalance between the speaker-class and impostor-class data during SVM training.

Key Findings
Incorporating known non-targets into the training of speaker-dependent PLDA-SVMs, together with utterance partitioning, significantly boosts the performance of i-vector based PLDA systems.

References
[1] P. Kenny, "Bayesian speaker verification with heavy-tailed priors", in Proc. Odyssey: Speaker and Language Recognition Workshop, Brno, Czech Republic, June 2010.
[2] D. Garcia-Romero and C. Y. Espy-Wilson, "Analysis of i-vector length normalization in speaker recognition systems", in Proc. Interspeech 2011, Florence, Italy, Aug. 2011, pp. 249–252.
[3] M. W. Mak and W. Rao, "Likelihood-ratio empirical kernels for i-vector based PLDA-SVM scoring", in Proc. ICASSP 2013, Vancouver, Canada, May 2013.
[4] W. Rao and M. W. Mak, "Boosting the performance of i-vector based speaker verification via utterance partitioning", IEEE Trans. on Audio, Speech and Language Processing, vol. 21, no. 5, May 2013.
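The empirical LR kernel map described above can be sketched as follows. This is a minimal illustration, not the poster's implementation: `plda_llr_stub` (a cosine-similarity stand-in for the true PLDA log-likelihood ratio of [1]) and the anchor-set layout are assumptions introduced here for clarity.

```python
import numpy as np

def plda_llr_stub(enroll_ivecs, test_ivec):
    # Stand-in for the PLDA log-likelihood ratio score between a set of
    # enrollment i-vectors and one test i-vector. A real system would use
    # PLDA scoring [1]; here we use average cosine similarity instead.
    enroll = enroll_ivecs / np.linalg.norm(enroll_ivecs, axis=1, keepdims=True)
    test = test_ivec / np.linalg.norm(test_ivec)
    return float(np.mean(enroll @ test))

def empirical_kernel_map(ivec, anchor_sets):
    # Map one i-vector to a score vector: one scalar score per anchor set.
    # In the poster's setting the anchor sets would be the target speaker's
    # i-vectors plus known/unknown non-target and background i-vectors,
    # and the resulting score vectors train a speaker-dependent SVM.
    return np.array([plda_llr_stub(anchors, ivec) for anchors in anchor_sets])
```

Speaker-class score vectors (from the target's own i-vectors) and impostor-class score vectors (from known and unknown non-targets) would then be fed to a standard linear SVM trainer.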
Results
The results demonstrate the advantage of including known non-targets when training the SVMs, in terms of EER. UP-AVR is very important for SVM scoring: the performance of PLDA-SVM scoring after UP-AVR is much better than that of PLDA scoring.

[Figure: system block diagram. Target-speaker enrollment utterances, background-speaker utterances, and the test utterance pass through the i-vector extractor; UP-AVR (feature extraction, index randomization, and utterance partitioning) generates multiple i-vectors per utterance; PLDA scoring with the empirical kernel map converts the test vector, the speaker-class vectors (target speaker's i-vectors), and the impostor-class vectors (known non-targets', unknown non-targets', and background speakers' i-vectors) into score vectors used for training and scoring the target-speaker SVM.]
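The UP-AVR step (index randomization followed by utterance partitioning) can be sketched as below. The partition counts and round counts are illustrative parameters, not values from the poster:

```python
import numpy as np

def up_avr(frames, n_parts=4, n_rounds=2, seed=0):
    """Utterance partitioning with acoustic vector resampling (UP-AVR).

    frames: (T, D) array of acoustic feature vectors from one utterance.
    Returns the full utterance plus n_rounds * n_parts randomized
    partitions; mapping each subset to its own i-vector enlarges the
    speaker/impostor classes for SVM training.
    """
    rng = np.random.default_rng(seed)
    subsets = [frames]
    for _ in range(n_rounds):
        idx = rng.permutation(len(frames))      # index randomization
        for part in np.array_split(idx, n_parts):  # utterance partitioning
            subsets.append(frames[part])
    return subsets
```

Because each round permutes the frame indices before splitting, the sub-utterances from different rounds cover different frame combinations, yielding diverse i-vectors from a single enrollment utterance.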