1 Robust HMM classification schemes for speaker recognition using integral decode Marie Roch Florida International University.

2 Who am I?

3 Speaker Recognition Types of speaker recognition: Verification, Identification; Text Dependent, Text Independent

4 Speaker Recognition Why is it hard? Minimal training data Background noise Transducer mismatch Channel distortions People’s voices change over time and under stress Performance

5 Feature Extraction Extract speech Spectral analysis Cepstrum: Cepstral means removal
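Cepstral mean removal can be sketched in a few lines. This is a minimal illustration, assuming the features arrive as a list of per-frame cepstral vectors; the function name `remove_cepstral_mean` is hypothetical and does not come from the talk.

```python
def remove_cepstral_mean(frames):
    """Subtract the per-coefficient mean over the utterance from every frame.

    frames: list of equal-length lists, one cepstral vector per frame.
    """
    if not frames:
        return []
    n = len(frames)
    dim = len(frames[0])
    # Mean of each cepstral coefficient across all frames of the utterance.
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    return [[f[d] - means[d] for d in range(dim)] for f in frames]


# Toy input (made-up numbers, 3 frames of 2 coefficients each).
frames = [[1.0, 2.0], [3.0, 6.0], [5.0, 4.0]]
normalized = remove_cepstral_mean(frames)
```

A stationary convolutional channel appears as an additive constant in the cepstral domain, which is why subtracting the utterance mean reduces its effect.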

6 Hidden Markov Models Statistical pattern recognition State dependent modeling –Distribution/state –Radial basis functions common State sequence unobservable

7 HMM Efficient decoders: Training –EM algorithm –Convergence to local maxima guaranteed
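One standard efficient decoder for scoring an utterance against an HMM is the forward algorithm. The sketch below works in the log domain and assumes the per-frame state likelihoods have already been computed; the slide does not say which decoder the talk used, and the names `forward_log_likelihood` and `log_emit` are illustrative.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_log_likelihood(log_init, log_trans, log_emit):
    """Return log P(O | model) using the forward recursion.

    log_init[j]     : log probability of starting in state j
    log_trans[i][j] : log probability of moving from state i to state j
    log_emit[t][j]  : log likelihood of frame t under state j
    """
    n_states = len(log_init)
    alpha = [log_init[j] + log_emit[0][j] for j in range(n_states)]
    for t in range(1, len(log_emit)):
        alpha = [
            logsumexp([alpha[i] + log_trans[i][j] for i in range(n_states)])
            + log_emit[t][j]
            for j in range(n_states)
        ]
    return logsumexp(alpha)
```

The cost is proportional to the number of frames times the square of the number of states, rather than exponential in the number of frames as a naive sum over all state sequences would be.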

8 Recognition Model for each speaker Maximum a posteriori (MAP) decision rule Arg Max Features Models Scores
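The arg max step can be sketched as follows, assuming each speaker model has already produced a log-likelihood score for the utterance. The function name, the optional `log_priors` argument, and the speaker labels are hypothetical.

```python
def identify_speaker(scores, log_priors=None):
    """Pick the speaker whose model maximizes log-likelihood plus log-prior.

    scores     : dict mapping speaker id -> log-likelihood of the utterance
    log_priors : optional dict mapping speaker id -> log prior probability;
                 when omitted, all speakers are treated as equally likely.
    """
    best_speaker, best_score = None, float("-inf")
    for speaker, log_likelihood in scores.items():
        prior = log_priors[speaker] if log_priors else 0.0
        total = log_likelihood + prior
        if total > best_score:
            best_speaker, best_score = speaker, total
    return best_speaker


# Made-up scores for three hypothetical speakers.
example_scores = {"alice": -120.5, "bob": -118.2, "carol": -131.0}
winner = identify_speaker(example_scores)
```

With equal priors the rule reduces to choosing the model with the highest likelihood.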

9 The MAP decision rule Optimal decision rule provided we have accurate distribution parameters & observations. Problem: –Corruption of feature vectors. –Distribution known to be inaccurate.

10 A case of mistaken identity

11 Integral decode Goal: Include uncorrupted observation ô_t. Problem: ô_t unobservable. Determine a local neighborhood about o_t and use a priori information to weight the likelihood:

12 Integral decode issues Problems approximating the integral –High frame rate * number of models –Non-trivial dimensionality Selection of the neighborhood

13 Approximating the integral Monte Carlo impractical Use simplified cubature technique:
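The idea of replacing a point likelihood with a cheap fixed-point approximation over a neighborhood can be illustrated as below. This is not the cubature rule from the talk, which is not reproduced in the transcript. It is a crude stand-in that averages the state density at the observed vector and at one point on either side of it along each axis. The weights, the `radius` parameter, and the diagonal-Gaussian density are all assumptions.

```python
import math


def diag_gaussian_pdf(x, mean, var):
    """Density of a Gaussian with diagonal covariance at point x."""
    log_p = 0.0
    for xi, mi, vi in zip(x, mean, var):
        log_p += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
    return math.exp(log_p)


def axis_point_average(density, center, radius, center_weight=0.5):
    """Weighted average of density at center and at center +/- radius[d].

    Uses 2 * dim + 1 evaluations, so cost grows linearly with dimension.
    The weights sum to one (assumed, illustrative choice).
    """
    dim = len(center)
    total = center_weight * density(center)
    side_weight = (1.0 - center_weight) / (2 * dim)
    for d in range(dim):
        for sign in (-1.0, 1.0):
            point = list(center)
            point[d] += sign * radius[d]
            total += side_weight * density(point)
    return total
```

For an observation that lies far from a state's mean, the averaged value is larger than the point likelihood, so a single corrupted frame penalizes the correct model less severely.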

14 Neighborhood choice Choosing an appropriate neighborhood: –Upper bound difference neighborhoods [Merhav and Lee 93] –Error source modeling

15 Upper bound difference neighborhoods Arbitrary signal pairs with a few general conditions. PSD Cepstra

16 Taking the upper bound Asymptotic difference between cepstral parameters:

17 Error source modeling Multiple error sources Simplifying assumption of one normal distribution with zero mean Use time series analysis to estimate the noise Trend

18 Error Source Modeling Estimate variance from detrended signal
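Estimating the variance from a detrended signal can be sketched as follows. The transcript does not say how the trend was estimated, so a centered moving average is assumed here purely for illustration; `half_width` is a hypothetical parameter.

```python
def moving_average(signal, half_width):
    """Centered moving average; the window shrinks near the edges."""
    n = len(signal)
    trend = []
    for t in range(n):
        lo = max(0, t - half_width)
        hi = min(n, t + half_width + 1)
        trend.append(sum(signal[lo:hi]) / (hi - lo))
    return trend


def detrended_variance(signal, half_width=2):
    """Sample variance of the residual after removing the trend."""
    trend = moving_average(signal, half_width)
    residual = [s - m for s, m in zip(signal, trend)]
    mean = sum(residual) / len(residual)
    return sum((r - mean) ** 2 for r in residual) / (len(residual) - 1)
```

In this setting the function would be applied to the trajectory of one cepstral coefficient over time, giving one variance estimate per feature dimension.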

19 Error source modeling Problem: – is infinite Solution: –Most of the points are outliers –Set percentage of distribution beyond which points are culled.

20 Complexity of integration Expensive Ways to reduce/cope –Implemented: Top K processing, Principal Component Analysis –Possible: Gaussian Selection, Sub-band Models, SIMD or MIMD parallelism

21 Top K Processing [Figure panels: 1 second, 3 seconds, 5 seconds]
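The transcript does not define Top K processing. One plausible reading, assumed here, is that a cheap baseline pass scores every speaker model and the expensive integral decode is then run only on the K best candidates. The function names and the `expensive_score` callback are hypothetical.

```python
import heapq


def top_k_models(baseline_scores, k):
    """Return the k speaker ids with the highest baseline scores, best first."""
    return heapq.nlargest(k, baseline_scores, key=baseline_scores.get)


def rescore_top_k(baseline_scores, k, expensive_score):
    """Apply the expensive scorer to the top-k candidates only.

    expensive_score: callable taking a speaker id and returning a score.
    """
    candidates = top_k_models(baseline_scores, k)
    rescored = {speaker: expensive_score(speaker) for speaker in candidates}
    return max(rescored, key=rescored.get)
```

Under this reading the expensive pass costs K model evaluations per utterance instead of one per enrolled speaker, at the risk of discarding the true speaker when the baseline ranks it below K.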

22 Principal Component Analysis Choose P most important directions

23 Principal Component Analysis Integrate using new basis set for step function
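Choosing the P most important directions can be sketched with the sample covariance and power iteration. This is a dependency-free illustration, not the implementation used in the talk. A numerical library's symmetric eigensolver would normally be preferred. The function names and the fixed starting vector are assumptions.

```python
import math


def covariance(data):
    """Sample covariance matrix of a list of equal-length feature vectors."""
    n, dim = len(data), len(data[0])
    mean = [sum(row[d] for row in data) / n for d in range(dim)]
    cov = [[0.0] * dim for _ in range(dim)]
    for row in data:
        c = [row[d] - mean[d] for d in range(dim)]
        for i in range(dim):
            for j in range(dim):
                cov[i][j] += c[i] * c[j] / (n - 1)
    return cov


def top_directions(cov, p, iterations=500):
    """Leading p eigenvectors of a symmetric PSD matrix.

    Uses power iteration with deflation. The fixed start vector could in
    principle be orthogonal to an eigenvector; that case is not handled.
    """
    dim = len(cov)
    work = [row[:] for row in cov]
    directions = []
    for _ in range(p):
        v = [1.0 + 0.37 * d for d in range(dim)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iterations):
            w = [sum(work[i][j] * v[j] for j in range(dim)) for i in range(dim)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        eigenvalue = sum(
            v[i] * sum(work[i][j] * v[j] for j in range(dim)) for i in range(dim)
        )
        directions.append(v)
        # Deflate: remove the component just found before the next pass.
        for i in range(dim):
            for j in range(dim):
                work[i][j] -= eigenvalue * v[i] * v[j]
    return directions


def project(x, directions):
    """Coordinates of x in the reduced basis."""
    return [sum(xi * di for xi, di in zip(x, d)) for d in directions]
```

Integrating along P directions instead of every feature dimension reduces the number of density evaluations per frame accordingly.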

24 Speech Corpus King-92 –Used San Diego subset: 26 male speakers, long distance telephone speech, quiet room environment, 5 sessions recorded one week apart –Sessions 1-3 train –Sessions 4-5 partitioned into test segments

25 Baseline performance

26 Integral decode performance [Figure panels: 1 second, 3 seconds, 5 seconds]

27 Integral decode with other conditions Performance on –high quality speech –transducer mismatch

28 Future work Extensions to the integral decode –Automatic parameter selection –Gaussian selection –distributed computation Efficient multiple class preclassifiers

30 Optimal/utterance hyperparameters – 5 seconds: KingNB26, KingWB51, SpidreF18XDR, SpidreM27XDR

31 95% Confidence Intervals Caveat: –Per speaker means –Large granularity
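A 95% confidence interval built from per-speaker means can be sketched as below. The transcript does not say how the intervals were computed, so this assumes a normal approximation with z = 1.96. With only a few dozen speakers, a Student t quantile would give a slightly wider interval.

```python
import math
import statistics


def mean_confidence_interval(per_speaker_error, z=1.96):
    """Normal-approximation confidence interval for the mean error rate.

    per_speaker_error: one mean error rate per speaker (at least two values).
    Returns (lower, upper).
    """
    n = len(per_speaker_error)
    mean = statistics.fmean(per_speaker_error)
    half_width = z * statistics.stdev(per_speaker_error) / math.sqrt(n)
    return mean - half_width, mean + half_width
```

Because each speaker contributes a single value, the interval width depends on the number of speakers rather than on the number of test segments. This is one way to read the "large granularity" caveat.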

32 Pattern Recognition Long term statistics [Bricker et al 71, Markel et al 77] Vector Quantization [Soong et al 87] HMM [Rosenberg et al 90, Tishby 91, Matsui & Furui 92, Reynolds et al 95] Connectionist frameworks Feed forward [Oglesby & Mason 90] Learning vector quantization [He et al 99]

33 Pattern Recognition Contd. Hybrid/Modified HMMs Min Classification Error discriminant [Liu et al 95] Tree structured neural classifiers [Liou & Mammone 95] Trajectory modeling [Russell et al 85, Liu et al 95, Ostendorf et al 96, He et al 99] Sub-band recognition [Besacier & Bonastre 97]
