Robust HMM classification schemes for speaker recognition using integral decode. Marie Roch, Florida International University.

1 Robust HMM classification schemes for speaker recognition using integral decode. Marie Roch, Florida International University

2 Who am I?

3 Speaker Recognition. Types of speaker recognition: verification and identification; text dependent and text independent.

4 Speaker Recognition. Why is it hard? Minimal training data; background noise; transducer mismatch; channel distortions; people's voices change over time and under stress. [Figure: performance]

5 Feature Extraction. Extract speech; spectral analysis; cepstrum; cepstral mean removal.
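
The cepstral mean removal step on this slide can be illustrated with a minimal sketch. It assumes the cepstral frames are already computed and stored as a list of equal-length lists; the function name and the toy data are hypothetical, not taken from the talk.

```python
def cepstral_mean_removal(frames):
    """Subtract the per-coefficient mean over the utterance from every frame."""
    if not frames:
        return []
    n = len(frames)
    dim = len(frames[0])
    # Mean of each cepstral coefficient across all frames of the utterance.
    means = [sum(frame[k] for frame in frames) / n for k in range(dim)]
    return [[frame[k] - means[k] for k in range(dim)] for frame in frames]


# Toy utterance: two frames, two cepstral coefficients each.
frames = [[1.0, 4.0], [3.0, 8.0]]
normalized = cepstral_mean_removal(frames)
```

Because a fixed linear channel adds a roughly constant offset in the cepstral domain, subtracting the utterance mean removes much of that offset.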

6 Hidden Markov Models. Statistical pattern recognition. State-dependent modeling: one distribution per state; radial basis functions are common. The state sequence is unobservable.

7 HMM. Efficient decoders. Training: EM algorithm; convergence to a local maximum is guaranteed.
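
As an example of an efficient decoder, here is a sketch of the standard forward algorithm in the log domain. The slide does not say which decoder the talk used, so this choice is an assumption; the argument layout (`log_emit[t][j]` holding the log output density of state j at frame t) and the toy model are also assumptions made for illustration.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_log_likelihood(log_init, log_trans, log_emit):
    """Return log P(o_1..o_T | model) using the forward recursion."""
    n_states = len(log_init)
    # Initialization: alpha_1(j) = pi_j * b_j(o_1), kept in the log domain.
    alpha = [log_init[j] + log_emit[0][j] for j in range(n_states)]
    for t in range(1, len(log_emit)):
        # Induction: sum over predecessor states, then apply the emission.
        alpha = [
            logsumexp([alpha[i] + log_trans[i][j] for i in range(n_states)])
            + log_emit[t][j]
            for j in range(n_states)
        ]
    return logsumexp(alpha)


# Toy two-state model with uniform start and transition probabilities.
log = math.log
ll = forward_log_likelihood(
    [log(0.5), log(0.5)],
    [[log(0.5), log(0.5)], [log(0.5), log(0.5)]],
    [[log(0.9), log(0.2)], [log(0.1), log(0.8)]],
)
```

The recursion costs on the order of T times the square of the number of states, rather than enumerating every state sequence.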

8 Recognition. Model for each speaker. Maximum a posteriori (MAP) decision rule. [Diagram: features, models, scores, arg max]

9 The MAP decision rule. Optimal decision rule provided we have accurate distribution parameters and observations. Problem: feature vectors are corrupted; the distribution is known to be inaccurate.
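
The decision rule can be sketched as an arg max over per-speaker scores. This is an illustration under assumptions: the scores are log-likelihoods from each speaker's model, and the speaker names and numbers are made up.

```python
import math


def map_decision(log_likelihoods, log_priors=None):
    """Pick the speaker maximizing log P(O | speaker) + log P(speaker)."""
    speakers = list(log_likelihoods)
    if log_priors is None:
        # With equal priors the prior term is constant and drops out.
        log_priors = {s: 0.0 for s in speakers}
    return max(speakers, key=lambda s: log_likelihoods[s] + log_priors[s])


# Hypothetical scores for one test utterance.
scores = {"alice": -120.5, "bob": -118.2, "carol": -131.0}
best = map_decision(scores)

# A strong prior can change the decision.
skewed = {"alice": math.log(0.98), "bob": math.log(0.01), "carol": math.log(0.01)}
best_with_prior = map_decision(scores, skewed)
```

The rule is only as good as the scores fed into it, which is the problem the slide raises: corrupted features or inaccurate distributions shift the arg max.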

10 A case of mistaken identity

11 Integral decode. Goal: include the uncorrupted observation ô_t. Problem: ô_t is unobservable. Determine a local neighborhood about o_t and use a priori information to weight the likelihood. [Equation not preserved in the transcript]
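
One way to write this weighting is sketched below. The slide's own equation did not survive extraction, so the notation is assumed here rather than taken from the talk: \Omega_t for the neighborhood about o_t, b_j for the output density of state j, and p(\hat{o} \mid o_t) for the a priori weight on a candidate uncorrupted observation.

```latex
\tilde{b}_j(o_t) \;=\; \int_{\Omega_t} b_j(\hat{o})\, p(\hat{o} \mid o_t)\, d\hat{o}
```

Under this reading, the conventional decode is recovered when the weight collapses to a point mass at o_t, since the integral then reduces to b_j(o_t).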

12 Integral decode issues. Problems approximating the integral: high frame rate × number of models; non-trivial dimensionality. Selection of the neighborhood.

13 Approximating the integral. Monte Carlo is impractical. Use a simplified cubature technique. [Equation not preserved in the transcript]
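
To show what a cheap cubature rule looks like, here is a sketch of a classical degree-3 rule for the mean of a function over a cube. It is not necessarily the simplified technique used in the talk, whose formula is missing from the transcript; the cube-shaped neighborhood and all names are assumptions.

```python
import math


def cube_mean_degree3(f, center, half_width):
    """Approximate the mean of f over the cube center +/- half_width.

    Uses 2*d equally weighted points placed on the coordinate axes at
    distance half_width * sqrt(d / 3) from the center. The rule is exact
    for polynomials of total degree 3 or less.
    """
    d = len(center)
    step = half_width * math.sqrt(d / 3.0)
    total = 0.0
    for i in range(d):
        for sign in (1.0, -1.0):
            point = list(center)
            point[i] += sign * step
            total += f(point)
    return total / (2 * d)


# Quadratic test function whose exact cube mean is known in closed form.
def quadratic(x):
    return x[0] ** 2 + 2.0 * x[1] + 1.0


approx = cube_mean_degree3(quadratic, [1.0, -2.0], 0.3)
```

The rule needs only 2d function evaluations per frame and model, whereas Monte Carlo needs many samples. One caveat: for d greater than 3 the evaluation points fall outside the cube.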

14 Neighborhood choice. Choosing an appropriate neighborhood: upper bound difference neighborhoods [Merhav and Lee 93]; error source modeling.

15 Upper bound difference neighborhoods. Arbitrary signal pairs with a few general conditions. [Diagram: PSD, cepstra]

16 Taking the upper bound. Asymptotic difference between cepstral parameters. [Equation not preserved in the transcript]

17 Error source modeling. Multiple error sources. Simplifying assumption of one normal distribution with zero mean. Use time series analysis to estimate the noise. [Figure: trend]

18 Error source modeling. Estimate variance from the detrended signal.
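
Estimating variance from a detrended signal can be sketched as follows. The slides do not say which trend estimator was used, so the centered moving average here is an assumption, as are the function names and the default window length.

```python
def moving_average_trend(series, window):
    """Centered moving average; the window shrinks at the edges."""
    half = window // 2
    trend = []
    for t in range(len(series)):
        lo = max(0, t - half)
        hi = min(len(series), t + half + 1)
        trend.append(sum(series[lo:hi]) / (hi - lo))
    return trend


def detrended_variance(series, window=5):
    """Sample variance of the residual left after removing the trend."""
    trend = moving_average_trend(series, window)
    residual = [x - m for x, m in zip(series, trend)]
    mean = sum(residual) / len(residual)
    return sum((r - mean) ** 2 for r in residual) / (len(residual) - 1)


# One cepstral coefficient tracked over time (made-up values).
track = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
noise_variance = detrended_variance(track, window=3)
```

Removing the trend first keeps slow changes in the speech itself from being counted as noise.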

19 Error source modeling. Problem: the neighborhood implied by a normal distribution is infinite. Solution: most of the points are outliers; set a percentage of the distribution beyond which points are culled.
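
The culling step can be sketched by turning a retained percentage into a finite half-width. This assumes the zero-mean normal error model from the earlier slide and a symmetric, per-dimension cutoff; the function name and the 95% default are hypothetical.

```python
from statistics import NormalDist


def neighborhood_radius(sigma, keep_fraction=0.95):
    """Half-width containing keep_fraction of a zero-mean normal with std sigma."""
    if not 0.0 < keep_fraction < 1.0:
        raise ValueError("keep_fraction must lie strictly between 0 and 1")
    # Split the discarded probability mass evenly between the two tails.
    upper_tail = (1.0 - keep_fraction) / 2.0
    return sigma * NormalDist().inv_cdf(1.0 - upper_tail)


radius = neighborhood_radius(1.0, 0.95)
```

Points farther than this radius from the observed value are treated as outliers and left out of the integration region.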

20 Complexity of integration. Expensive. Ways to reduce or cope. Implemented: Top K processing; principal components analysis. Possible: Gaussian selection; sub-band models; SIMD or MIMD parallelism.

21 Top K processing. [Figure panels: 1 second, 3 seconds, 5 seconds]
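
The transcript does not define Top K processing. One plausible reading, sketched here as an assumption, is a two-pass scheme: score every speaker with an inexpensive decode, then run the costly decode only on the K best candidates. All names and scores below are hypothetical.

```python
def top_k_rescore(cheap_scores, expensive_score, k):
    """Rescore only the k best candidates from a cheap first pass.

    cheap_scores: dict mapping speaker to a first-pass score (higher is better).
    expensive_score: callable mapping speaker to the costly second-pass score.
    k: shortlist size, at least 1.
    """
    shortlist = sorted(cheap_scores, key=cheap_scores.get, reverse=True)[:k]
    rescored = {speaker: expensive_score(speaker) for speaker in shortlist}
    winner = max(rescored, key=rescored.get)
    return winner, shortlist


first_pass = {"a": -10.0, "b": -12.0, "c": -30.0, "d": -11.0}
second_pass = {"a": -9.0, "b": -7.0, "c": -1.0, "d": -8.0}
winner, shortlist = top_k_rescore(first_pass, second_pass.get, k=3)
```

The trade-off is visible in the toy data: speaker "c" would win the second pass but never reaches it, so K has to be large enough that the true speaker survives the first pass.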

22 Principal Component Analysis. Choose the P most important directions.

23 Principal Component Analysis. Integrate using the new basis set for the step function.
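
Choosing the P most important directions can be sketched with power iteration and deflation on the sample covariance matrix. This is a pure-Python illustration rather than the method used in the talk; the function names and toy data are assumptions, and a library eigensolver would be the practical choice.

```python
import math


def covariance(data):
    """Sample covariance matrix of rows in data."""
    n, d = len(data), len(data[0])
    mean = [sum(row[j] for row in data) / n for j in range(d)]
    centered = [[row[j] - mean[j] for j in range(d)] for row in data]
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def top_directions(cov, p, iterations=500):
    """Return p (eigenvalue, unit vector) pairs, largest eigenvalue first."""
    d = len(cov)
    cov = [row[:] for row in cov]  # work on a copy; deflation modifies it
    start_norm = math.sqrt(sum((i + 1) ** 2 for i in range(d)))
    found = []
    for _ in range(p):
        # Fixed start vector; it must not be orthogonal to the target direction.
        v = [(i + 1) / start_norm for i in range(d)]
        for _ in range(iterations):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        cv = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        eigenvalue = sum(v[i] * cv[i] for i in range(d))
        found.append((eigenvalue, v))
        # Deflate: remove the component just found before the next pass.
        for i in range(d):
            for j in range(d):
                cov[i][j] -= eigenvalue * v[i] * v[j]
    return found


# Toy data lying along the direction (1, 2).
data = [[-2.0, -4.0], [-1.0, -2.0], [0.0, 0.0], [1.0, 2.0], [2.0, 4.0]]
(eigenvalue, direction), = top_directions(covariance(data), p=1)
```

Integrating only along the P leading directions reduces the number of evaluation points from something that grows with the full feature dimension to something that grows with P.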

24 Speech Corpus. King-92, San Diego subset: 26 male speakers; long distance telephone speech; quiet room environment; 5 sessions recorded one week apart; sessions 1-3 used for training; sessions 4-5 partitioned into test segments.

25 Baseline performance

26 Integral decode performance. [Figure panels: 1 second, 3 seconds, 5 seconds]

27 Integral decode with other conditions. Performance on high quality speech and on transducer mismatch.

28 Future work. Extensions to the integral decode: automatic parameter selection; Gaussian selection; distributed computation. Efficient multiple class preclassifiers.

30 Optimal/utterance hyperparameters, 5 seconds. [Figure panels: KingNB26, KingWB51, SpidreF18XDR, SpidreM27XDR]

31 95% Confidence Intervals. Caveat: per speaker means; large granularity.

32 Pattern Recognition. Long term statistics [Bricker et al 71, Markel et al 77]. Vector quantization [Soong et al 87]. HMM [Rosenberg et al 90, Tishby 91, Matsui & Furui 92, Reynolds et al 95]. Connectionist frameworks: feed forward [Oglesby & Mason 90]; learning vector quantization [He et al 99].

33 Pattern Recognition, contd. Hybrid/modified HMMs: min classification error discriminant [Liu et al 95]; tree structured neural classifiers [Liou & Mammone 95]. Trajectory modeling [Russell et al 85, Liu et al 95, Ostendorf et al 96, He et al 99]. Sub-band recognition [Besacier & Bonastre 97].

