Text independent speaker identification in multilingual environments I. Luengo, E. Navas, I. Sainz, I. Saratxaga, J. Sanchez, I. Odriozola and I. Hernaez.


1 Text independent speaker identification in multilingual environments I. Luengo, E. Navas, I. Sainz, I. Saratxaga, J. Sanchez, I. Odriozola and I. Hernaez

2 Contents
- Introduction
  - SR in language mismatched conditions
  - Existent solutions
  - Proposed solution
- Working database
- Variability measures
- Experimental results
- Conclusions

3 Speaker Recognition System
- TRAIN: Feature Extr. -> Train -> M
- TEST: Feature Extr. -> Score -> Decision
- Language mismatch? Accuracy decreases
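
The train/test flow on this slide can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' system: the slides do not say what kind of model M is, so a single diagonal Gaussian per speaker stands in for it here, and the speaker names, feature values, and function names are all hypothetical.

```python
import math

def train_model(frames):
    """Fit a diagonal Gaussian (per-dimension mean and variance) to feature frames."""
    n = len(frames)
    dims = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dims)]
    variances = [
        max(sum((f[d] - means[d]) ** 2 for f in frames) / n, 1e-6)  # variance floor
        for d in range(dims)
    ]
    return means, variances

def score(frames, model):
    """Average per-frame log-likelihood of the frames under one speaker model."""
    means, variances = model
    total = 0.0
    for f in frames:
        for x, m, v in zip(f, means, variances):
            total += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
    return total / len(frames)

def identify(frames, models):
    """Decision step: pick the speaker whose model scores the test frames highest."""
    return max(models, key=lambda spk: score(frames, models[spk]))

# TRAIN: made-up 2-D feature frames for two hypothetical speakers.
train_data = {
    "spk_a": [[0.0, 1.0], [0.2, 1.1], [-0.1, 0.9], [0.1, 1.2]],
    "spk_b": [[2.0, -1.0], [2.1, -0.8], [1.9, -1.2], [2.2, -1.1]],
}
models = {spk: train_model(frames) for spk, frames in train_data.items()}

# TEST: score unseen frames against every model and decide.
test_frames = [[0.05, 1.05], [0.15, 0.95]]
decision = identify(test_frames, models)
print(decision)
```

In this picture, a language mismatch means the test frames come from a distribution shifted away from the training frames, which lowers the score of the true speaker's model.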

4 Existent solutions
- Multi-language training
  - One model trained with various languages (per speaker)
  - Model learns characteristics of different languages
- Multi-model training
  - One model for each language (per speaker)
  - Language detector

5 Existent solutions
- Drawbacks
  - Possible languages must be known in advance for each speaker
  - Not generalizable for languages not seen during training
  - More recording sessions needed for training
    - + Time
    - + Money
- Desired solution: Language independent
  - Suitable for languages not seen during training
  - Capable of single-language training

6 Proposed solution
- Language-independent features
  - Normalization?
  - New features?
- Short-term intonation and energy values
  - High speaker discrimination capability
  - Global distribution may change little with language
  - Combinable with MFCC
  - Only in voiced frames (intonation)
  - High session variability
  - MVN for inter-session normalization
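
The last two points on this slide (voiced frames only, MVN against session variability) can be illustrated with a small sketch. It assumes that MVN means per-session mean and variance normalization and that an F0 value of 0 marks an unvoiced frame; both conventions, the function names, and the feature values are assumptions for illustration, not details given in the slides.

```python
import math

def voiced_only(frames):
    """Keep frames with a positive F0 (assumption: F0 == 0 marks unvoiced frames)."""
    return [f for f in frames if f[0] > 0.0]

def mvn(frames):
    """Mean and variance normalization: zero mean, unit variance per dimension."""
    n = len(frames)
    dims = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dims)]
    stds = [
        max(math.sqrt(sum((f[d] - means[d]) ** 2 for f in frames) / n), 1e-12)
        for d in range(dims)
    ]
    return [[(f[d] - means[d]) / stds[d] for d in range(dims)] for f in frames]

# Hypothetical [F0 in Hz, energy in dB] frames from two sessions of one speaker.
# Session 2 is session 1 shifted by a constant offset, mimicking a session effect.
session_1 = [[100.0, 60.0], [0.0, 40.0], [120.0, 62.0], [110.0, 64.0]]
session_2 = [[130.0, 70.0], [150.0, 72.0], [0.0, 45.0], [140.0, 74.0]]

norm_1 = mvn(voiced_only(session_1))
norm_2 = mvn(voiced_only(session_2))
print(norm_1)
print(norm_2)
```

Because the two sessions differ only by an offset in this toy data, their normalized frames coincide. Real session effects are not pure offsets, so normalization of this kind reduces the mismatch rather than removing it.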

7 Database
- Bilingual Spanish-Basque speech database
  - 22 speakers (11 male, 11 female)
  - 4 sessions (inter-session variability)
  - 7 numeric sequences (8 digits) per session and language

8 Variability measures
- Adding new features ALWAYS increases separability/variability
  - + Speaker separability -> + discrimination
  - + Language variability -> + model/test mismatch
  - + Session variability -> + model/test mismatch
- Key issue: Does speaker separability increase more than language/session variability?

9 Variability measures
- Kullback-Leibler divergence for variability estimation
- Interesting measures:
  - Inter-speaker variability / Inter-language variability
  - Inter-speaker variability / Inter-session variability
  - Good if new features increase these ratios
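
A minimal sketch of how such a ratio could be computed follows. The slides do not say which distributions are compared or how the divergence is estimated, so this example assumes one-dimensional Gaussian feature distributions, uses the closed-form Kullback-Leibler divergence between them, and symmetrizes it by summing both directions. All means and variances are made up.

```python
import math

def kl_gauss(mu_p, var_p, mu_q, var_q):
    """Closed-form KL(p || q) for univariate Gaussians p and q."""
    return 0.5 * (math.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0)

def sym_kl(p, q):
    """Symmetrized divergence: KL(p || q) + KL(q || p)."""
    return kl_gauss(*p, *q) + kl_gauss(*q, *p)

# Hypothetical (mean, variance) of one feature for each speaker/language pair.
spk_a_spanish = (0.0, 1.0)
spk_a_basque = (0.5, 1.0)
spk_b_spanish = (2.0, 1.0)

inter_speaker = sym_kl(spk_a_spanish, spk_b_spanish)   # different speakers, same language
inter_language = sym_kl(spk_a_spanish, spk_a_basque)   # same speaker, different languages

ratio = inter_speaker / inter_language
print(inter_speaker, inter_language, ratio)
```

A feature set is attractive in this sense when the ratio grows, that is, when it separates speakers more than it separates the languages of a single speaker.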

10 Variability measures
                MFCC  MFCC+P  Gain
Lang      -     4.09  4.61    12%
Spk       S     6.34  8.25    30%
Spk       B     6.82  8.77    29%
Ses       S     3.62  4.81    33%
Ses       B     3.52  4.64    32%
Spk/Lang  S     1.55  1.79    15%
Spk/Lang  B     1.67  1.90    14%
Spk/Ses   S     1.75  1.72    -2%
Spk/Ses   B     1.94  1.89    -3%
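
The Spk/Lang and Spk/Ses rows appear to be plain quotients of the Spk, Lang, and Ses rows. The short check below recomputes them from the values on this slide. The dictionary layout is just an illustrative choice, and S and B are kept as the slide's own labels (presumably the two languages of the database).

```python
# Divergence values copied from the slide, per feature set.
kl = {
    "MFCC":   {"Lang": 4.09, "Spk": {"S": 6.34, "B": 6.82}, "Ses": {"S": 3.62, "B": 3.52}},
    "MFCC+P": {"Lang": 4.61, "Spk": {"S": 8.25, "B": 8.77}, "Ses": {"S": 4.81, "B": 4.64}},
}

def ratios(values):
    """Speaker separability relative to language and session variability."""
    out = {}
    for lang in ("S", "B"):
        out[("Spk/Lang", lang)] = values["Spk"][lang] / values["Lang"]
        out[("Spk/Ses", lang)] = values["Spk"][lang] / values["Ses"][lang]
    return out

mfcc = ratios(kl["MFCC"])
mfcc_p = ratios(kl["MFCC+P"])

for key in mfcc:
    change = (mfcc_p[key] - mfcc[key]) / mfcc[key] * 100.0
    print(key, round(mfcc[key], 2), round(mfcc_p[key], 2), f"{change:+.1f}%")
```

Read this way, adding the new features raises speaker separability relative to language variability, while the ratio to session variability drops slightly.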

11 Experimental results
X-Y: training in X, testing in Y
             S-S    B-B    S-B    B-S    SB-S   SB-B
MFCC (ref)   98.3   97.3   63.6   67.3   96.8   95.6
MFCC (V)     97.6   96.8   62.6   67.0   96.6   95.6
MFCC+P (V)   97.1   96.3   71.0   73.0   96.1   94.4
Gain (V)     -0.5%         13.4%  9.0%   -0.5%  -1.3%
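
The Gain (V) figures are consistent with a relative change of MFCC+P (V) with respect to MFCC (V). The sketch below recomputes them from the accuracies on this slide. The reading of the row as a relative change is an inference from the numbers, and the helper name is hypothetical.

```python
# Accuracies copied from the slide, keyed by "train-test" condition.
mfcc_v = {"S-S": 97.6, "B-B": 96.8, "S-B": 62.6, "B-S": 67.0, "SB-S": 96.6, "SB-B": 95.6}
mfcc_p_v = {"S-S": 97.1, "B-B": 96.3, "S-B": 71.0, "B-S": 73.0, "SB-S": 96.1, "SB-B": 94.4}

def relative_gain(baseline, new):
    """Relative change of `new` with respect to `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0

gain = {cond: relative_gain(mfcc_v[cond], mfcc_p_v[cond]) for cond in mfcc_v}

for cond, value in gain.items():
    print(f"{cond}: {value:+.1f}%")
```

The two language-mismatched conditions (S-B and B-S) are the only ones with a clear gain; the matched and mixed-training conditions lose about one percent or less.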

12 Conclusions
- Short-term intonation and energy values increase language robustness
  - Little accuracy drop on language-matched conditions
- Very useful if test language is unpredictable
- Variability measures predict results reasonably
  - Allows easy selection of features prior to experiments


