© Fraunhofer FKIE Corinna Harwardt 28.10.2009 Automatic Speaker Recognition in Military Environment.


1 © Fraunhofer FKIE Corinna Harwardt 28.10.2009 Automatic Speaker Recognition in Military Environment

2 Overview
• Basics of automatic speaker recognition
• The VerA system
  • GMM-UBM based system
  • Results on militarily relevant audio data
  • High-level features for improved speaker recognition
• The problem addressed in the PhD thesis: different degrees of vocal effort in training and test data

3 Speaker recognition
The goal of speaker recognition is to determine the probability that a given speech signal was uttered by a certain speaker.

4 Speaker identification

5 Speaker verification
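The figures for the identification and verification slides are not part of this transcript. As a rough illustration of the usual distinction, the sketch below assumes that some scoring back end has already produced one similarity score per enrolled speaker. The speaker names, scores and threshold are hypothetical and not taken from the presentation.

```python
from typing import Dict


def identify(scores: Dict[str, float]) -> str:
    """Closed-set identification: pick the enrolled speaker with the highest score."""
    return max(scores, key=scores.get)


def verify(score: float, threshold: float) -> bool:
    """Verification: accept a claimed identity if its score reaches the threshold."""
    return score >= threshold


# Hypothetical scores of one test utterance against three enrolled speakers.
scores = {"spk_a": -1.2, "spk_b": 0.8, "spk_c": 0.1}

best = identify(scores)                             # 1-of-N decision
accepted = verify(scores["spk_c"], threshold=0.5)   # yes/no decision for the claim "spk_c"
```

Identification answers "who is speaking?" among N known speakers, while verification answers "is this the claimed speaker?" and therefore needs a decision threshold.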

6 Typical configuration of a speaker recognition system

7 VerA
VerA – SprecherVerifikation militärisch relevanter Audiodaten (speaker verification on militarily relevant audio data)
• Baseline: MFCC, GMM-UBM based system
• Energy-based VAD (voice activity detection)
• MFCC (mel-frequency cepstral coefficients)
  • Developed for speech recognition applications
  • Acoustic features
  • Calculated on short parts of the signal (20 ms)
• GMM (Gaussian mixture models)
  • Statistical modeling of the features extracted from the signal (e.g. MFCC)
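The slides do not give implementation details for VerA, so the following is only a minimal sketch of two of the listed building blocks under stated assumptions: an energy-based VAD on 20 ms frames and the average log-likelihood ratio commonly used to score a speaker GMM against a UBM. MFCC extraction is not shown; feature vectors are assumed to be given. The relative energy threshold, the non-overlapping frames, the diagonal covariances and all function names are assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple

# One diagonal-covariance Gaussian component: (weight, means, variances).
Component = Tuple[float, Sequence[float], Sequence[float]]


def split_frames(signal: Sequence[float], sample_rate: int,
                 frame_ms: float = 20.0) -> List[Sequence[float]]:
    """Cut the signal into non-overlapping frames of frame_ms milliseconds."""
    n = int(sample_rate * frame_ms / 1000)
    return [signal[i:i + n] for i in range(0, len(signal) - n + 1, n)]


def energy_vad(frames: List[Sequence[float]], ratio: float = 0.1) -> List[bool]:
    """Flag a frame as speech if its mean energy exceeds ratio * max frame energy.

    The relative threshold is an assumption of this sketch.
    """
    if not frames:
        return []
    energies = [sum(x * x for x in f) / len(f) for f in frames]
    limit = ratio * max(energies)
    return [e > limit for e in energies]


def log_likelihood(x: Sequence[float], gmm: List[Component]) -> float:
    """Log-likelihood of one feature vector under a diagonal-covariance GMM."""
    per_component = []
    for weight, means, variances in gmm:
        ll = math.log(weight)
        for xi, m, v in zip(x, means, variances):
            ll -= 0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        per_component.append(ll)
    top = max(per_component)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(l - top) for l in per_component))


def llr_score(features: List[Sequence[float]], speaker_gmm: List[Component],
              ubm: List[Component]) -> float:
    """Average log-likelihood ratio of speaker model versus UBM over all frames."""
    total = sum(log_likelihood(x, speaker_gmm) - log_likelihood(x, ubm)
                for x in features)
    return total / len(features)


# Toy data: one silent 20 ms frame followed by one loud frame at 8 kHz.
signal = [0.0] * 160 + [0.5, -0.5] * 80
speech_flags = energy_vad(split_frames(signal, sample_rate=8000))

# Toy one-dimensional, single-component models standing in for trained GMMs.
speaker_gmm = [(1.0, [1.0], [1.0])]
ubm = [(1.0, [0.0], [1.0])]
score = llr_score([[1.0], [1.2], [0.8]], speaker_gmm, ubm)
```

A positive score means the frames fit the speaker model better than the background model. In a real system only the frames kept by the VAD would be converted to features and scored.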

8 Preliminary results on the Kiel corpus

9 Preliminary results on militarily relevant audio data

10 Comparison to other systems

Militarily relevant data (average EER)
System             EER
SIDSystem5, EU     18.76%
SIDSystem4, FU     14.88%
SIDSystem6 V2, EU  14.97%
SIDSystem6 V2, FU  17.04%
SIDSystem7 V2, EU  18.05%
SIDSystem7 V2, FU  22.48%
SIDSystem1, FU     20.15%
SIDSystem2, FU     23.46%
SIDSystem3, FU     21.33%
VerA               16.67%

Kiel corpus (EER)
System             EER
SIDSystem5, EU     22.82%
SIDSystem4, FU     12.23%
SIDSystem6 V1, EU  31.25%
SIDSystem6 V1, FU  31.25%
SIDSystem6 V2, EU   9.74%
SIDSystem6 V2, FU   9.83%
SIDSystem7 V1, EU  31.25%
SIDSystem7 V1, FU  31.25%
SIDSystem7 V2, EU  10.31%
SIDSystem7 V2, FU  14.11%
SIDSystem1, FU      8.71%
SIDSystem2, FU     44.34%
SIDSystem3, FU     40.38%
VerA                4.71%
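The comparison is reported as equal error rate (EER), the operating point at which the false acceptance rate equals the false rejection rate. The slides do not say how the EER was computed, so the sketch below shows one simple approximation: sweep the threshold over all observed scores and take the point where the two error rates are closest. The function name and the toy scores are hypothetical.

```python
from typing import Sequence


def equal_error_rate(target_scores: Sequence[float],
                     impostor_scores: Sequence[float]) -> float:
    """Approximate the EER by sweeping the threshold over all observed scores."""
    best_gap, eer = float("inf"), 1.0
    for thr in sorted(set(target_scores) | set(impostor_scores)):
        # False rejection: genuine trials scoring below the threshold.
        frr = sum(s < thr for s in target_scores) / len(target_scores)
        # False acceptance: impostor trials scoring at or above the threshold.
        far = sum(s >= thr for s in impostor_scores) / len(impostor_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Hypothetical verification scores for genuine and impostor trials.
targets = [2.0, 1.5, 0.9, 0.4]
impostors = [1.0, 0.3, 0.1, -0.5]
eer = equal_error_rate(targets, impostors)
```

With few trials the two error curves may not cross exactly at an observed threshold, which is why the sketch averages the two rates at the closest point.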

11 High-Level Features I
• … are features relying on linguistic content, or features calculated on parts of the signal longer than the approximately 20 ms normally used in frame-based approaches
• … might, for example, use prosodic, phonetic or idiolectal information
• … provide additional information compared to acoustically motivated features like MFCCs
• … should therefore be used in addition to acoustic features, not instead of them
• … are relatively robust against distortions

12 High-Level Features II
Goal: Pick a high-level feature that does not need a speech recognizer
High-level features under consideration:
• F0 statistics as proposed in (Reynolds et al. 2002) and (Rose 2002)
• Formant statistics (Becker et al. 2008)
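The slides do not specify which F0 statistics are meant. As an illustration only, the sketch below assumes that an F0 track (one value per frame in Hz, 0 for unvoiced frames) is already available from some pitch tracker, and summarizes the voiced frames with mean, standard deviation and median in the log domain. The choice of statistics and the function name are assumptions, not the method of the cited works.

```python
import math
import statistics
from typing import Dict, Sequence


def f0_statistics(f0_track_hz: Sequence[float]) -> Dict[str, float]:
    """Summarize the voiced frames of an F0 track (0 marks unvoiced frames)."""
    log_f0 = [math.log(f) for f in f0_track_hz if f > 0]
    if len(log_f0) < 2:
        raise ValueError("need at least two voiced frames")
    return {
        "mean_log_f0": statistics.mean(log_f0),
        "std_log_f0": statistics.stdev(log_f0),
        "median_f0_hz": math.exp(statistics.median(log_f0)),
    }


# Hypothetical F0 track: two unvoiced frames and four voiced frames.
stats = f0_statistics([0.0, 100.0, 100.0, 0.0, 200.0, 200.0])
```

Such utterance-level summaries need no speech recognizer, which matches the goal stated on the slide.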

13 Different degrees of vocal effort
Problem: Recognition performance degrades for several speech processing tasks if speech with high vocal effort is used without additional training (Becker et al. 2008).
The goal is either:
• to find robust features for speaker recognition with normal and high vocal effort, or
• to find a method to predict the changes in acoustic features due to raised vocal effort.

14 References
D. Reynolds et al.: SuperSID Project Final Report – Exploiting High-Level Information for High-Performance Speaker Recognition. Department of Defense; National Science Foundation, 2002.
P. Rose: Forensic Speaker Identification. Taylor & Francis, 2002.
T. Becker, M. Jessen, and C. Grigoras: Forensic Speaker Verification Using Formant Features and Gaussian Mixture Models. 9th Annual Conference of the International Speech Communication Association, 2008.

