An i-Vector PLDA based Gender Identification Approach for Severely Distorted and Multilingual DARPA RATS Data
Shivesh Ranjan, Gang Liu and John H. L. Hansen


{Shivesh.Ranjan, Gang.Liu, John.Hansen}@utdallas.edu
Center for Robust Speech Systems (CRSS), Erik Jonsson School of Engineering & Computer Science, The University of Texas at Dallas, Richardson, Texas 75080-3021, USA

Why do female and male speech differ?
Vocal tract length (14 cm vs 17.5 cm).
Length of vocal folds (ratio of vocal fold lengths is 0.8).
Larynx anatomy (difference in thickness).

Applications of Gender Identification
Improving speech and speaker recognition accuracy.
Accent identification, speaker health identification.
Emotion recognition, surveillance, call-center business applications, human-computer intelligent interaction.

Motivations for i-Vector based Gender ID approach
An i-Vector offers a compact representation of an utterance while preserving the speaker-specific attributes.
Gender is an important speaker-specific attribute.
i-Vector based systems are the current state of the art in Speaker ID and Language ID.
GMM-UBM based Gender ID systems.

Gender ID framework

Fundamentals of i-Vector G-PLDA framework

Gender Separability in the i-Vector Space
Figure: First 2 dimensions of MMI based 3-D projection of 2600 i-vectors from the FE test-set.

Training and Test data-sets
Fisher English (FE) Training Data: 20,652 gender-labeled FE utterances (89% of the total corpus) were used to train the UBM and the T matrix for i-Vector extraction.
Fisher English (FE) Test Data: 2,600 utterances selected randomly from the FE corpus (11% of the total corpus). Smaller test-sets of duration 20 s, 10 s, and 3 s were also created.
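
The Fisher English training and test percentages quoted above can be cross-checked with a few lines of arithmetic. This is only a sketch, and it assumes that the training and test sets together make up the whole corpus, which the presentation implies (89% + 11%) but does not state outright; the variable names are illustrative.

```python
# Utterance counts quoted in the presentation.
train_utts = 20_652
test_utts = 2_600

# Assumption: the two sets partition the corpus, so the total is their sum.
total = train_utts + test_utts

train_share = 100 * train_utts / total
test_share = 100 * test_utts / total

print(f"{total} utterances: {train_share:.1f}% train, {test_share:.1f}% test")
# prints "23252 utterances: 88.8% train, 11.2% test"
```

Under that assumption the shares round to the quoted 89% and 11%.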
DARPA RATS Test Data: 438 test utterances from the different channels (A, B, C, D, E, F, G, H) and the clean (SRC) source, in 5 different languages.
DARPA RATS Unlabeled Development Set: 502 utterances per channel for all channels except H, and 480 utterances for channel H.

Results on FE data
Duration Mismatch Compensation: Retrain the gender ID system with the corresponding shorter-duration segments.

Unsupervised Domain Adaptation
Issues with the RATS test-set: The gender ID system is trained only on FE data, and no gender-labeled data is available for the RATS test-set. 4 of the 5 languages are not present in the FE training-set.
Unsupervised Clustering: Use unsupervised clustering (Label Generating-Max Margin Clustering) to assign labels to the unlabeled RATS development data. Estimate the in-domain PLDA model using the estimated labels.
Out-of-domain PLDA model adaptation

Gender ID results on RATS data

i-Vector based Gender ID: Conclusions
On the FE test-sets, the proposed approach achieves an accuracy of up to 97.62% and an EER as low as 2.31%.
Duration mismatch compensation yields significantly smaller performance degradation on shorter-duration test segments.
On the RATS test-set, the unsupervised domain adaptation strategy offered a 6.8% relative gain (5.25% absolute) in classification accuracy, and a 14.75% relative reduction (3.08% absolute) in EER.
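
Because the RATS improvements are quoted in both relative and absolute terms, the baseline and adapted values they imply can be back-calculated. The sketch below assumes the usual definition, relative change = absolute change / baseline. The function name is illustrative, and the resulting numbers are approximations derived from the rounded figures above, not values reported in the presentation.

```python
def implied_baseline(absolute_change, relative_change):
    """Baseline value implied by an absolute change and its relative counterpart.

    Assumes relative_change = absolute_change / baseline (both positive).
    """
    return absolute_change / relative_change


# Accuracy: 5.25 points absolute gain, 6.8% relative gain.
acc_baseline = implied_baseline(5.25, 0.068)
acc_adapted = acc_baseline + 5.25

# EER: 3.08 points absolute reduction, 14.75% relative reduction.
eer_baseline = implied_baseline(3.08, 0.1475)
eer_adapted = eer_baseline - 3.08

print(f"accuracy: {acc_baseline:.2f}% -> {acc_adapted:.2f}%")
print(f"EER:      {eer_baseline:.2f}% -> {eer_adapted:.2f}%")
```

Under these assumptions the implied RATS accuracy rises from roughly 77.2% to 82.5%, and the implied EER falls from roughly 20.9% to 17.8%.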

