Models of speech dynamics in a segmental-HMM recognizer using intermediate linear representations
Philip Jackson and Martin Russell
Electronic, Electrical and Computer Engineering
http://web.bham.ac.uk/p.jackson/balthasar/

Speech dynamics into ASR INTRODUCTION

Conventional model INTRODUCTION
[Diagram: HMM states, each with an acoustic PDF, generating the acoustic observations.]

Linear-trajectory model INTRODUCTION
[Diagram: segmental HMM, intermediate layer, articulatory-to-acoustic mapping W, acoustic PDF, acoustic observations.]

Multi-level Segmental HMM INTRODUCTION
segmental finite-state process
intermediate “articulatory” layer
–linear trajectories
mapping required
–linear transformation
–radial basis function network

Estimation of linear mapping THEORY
Matched sequences [symbols and equation not preserved in this transcript]
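The slide's formula did not survive transcription. As a rough sketch only, assuming the linear mapping W is fitted by ordinary least squares on matched pairs of articulatory and acoustic frames (no bias term, no weighting), the estimate could be computed as follows. The function names `solve_linear_system` and `estimate_linear_mapping` are illustrative and do not come from the presentation.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting.

    a is an n x n list of lists; b is n x k (several right-hand sides).
    """
    n = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        # Pick the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("normal equations are singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def estimate_linear_mapping(artic, acoustic):
    """Return W (acoustic_dim x artic_dim) minimising sum_t ||y_t - W x_t||^2.

    artic and acoustic are time-aligned lists of frames (assumed matched).
    """
    d_in, d_out = len(artic[0]), len(acoustic[0])
    # Normal equations: (X^T X) W^T = X^T Y
    xtx = [[sum(x[i] * x[j] for x in artic) for j in range(d_in)]
           for i in range(d_in)]
    xty = [[sum(x[i] * y[k] for x, y in zip(artic, acoustic))
            for k in range(d_out)] for i in range(d_in)]
    w_t = solve_linear_system(xtx, xty)  # d_in x d_out, i.e. W transposed
    return [[w_t[i][k] for i in range(d_in)] for k in range(d_out)]


if __name__ == "__main__":
    x_frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
    y_frames = [[2.0], [3.0], [5.0], [7.0]]
    print(estimate_linear_mapping(x_frames, y_frames))
```

Solving the normal equations directly keeps the sketch dependency-free; with real feature dimensions a library least-squares routine would be the more robust choice.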

Linear-trajectory equations THEORY
Defined as: [equation not preserved in this transcript]
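The defining equation is missing from the transcript. One common parameterisation, assumed here rather than taken from the slide, describes each segment by a midpoint vector c and a slope vector m, giving f(t) = c + m (t - t_bar) for frames t = 1..tau with t_bar = (tau + 1) / 2. A minimal sketch under that assumption (the name `linear_trajectory` is illustrative):

```python
def linear_trajectory(midpoint, slope, duration):
    """Generate `duration` frames of a linear trajectory.

    Assumed form: f(t) = midpoint + slope * (t - t_bar), t = 1..duration,
    where t_bar = (duration + 1) / 2 is the segment's temporal centre.
    """
    t_bar = (duration + 1) / 2.0
    return [
        [c + m * (t - t_bar) for c, m in zip(midpoint, slope)]
        for t in range(1, duration + 1)
    ]


if __name__ == "__main__":
    for frame in linear_trajectory([1.0, -2.0], [0.5, 0.0], 3):
        print(frame)
```

Setting every slope to zero reduces this to a constant trajectory, which corresponds to the constant-trajectory case (ID_0) compared against the linear one (ID_1) in the results.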

Training the model parameters THEORY
For optimal least-squares estimates (acoustic domain):
midpoint and slope [equations not preserved in this transcript]
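The estimation formulas are not in the transcript. For a single segment in the acoustic domain, and assuming the trajectory form f(t) = c + m (t - t_bar), the standard least-squares solution is the frame mean for the midpoint and a centred regression coefficient for the slope. The sketch below illustrates that textbook result for one segment; it is not necessarily the authors' exact training recipe, and `fit_linear_trajectory` is a hypothetical name.

```python
def fit_linear_trajectory(frames):
    """Least-squares midpoint and slope for one segment of feature frames.

    midpoint[d] = mean over t of frames[t][d]
    slope[d]    = sum_t (t - t_bar) * frames[t][d] / sum_t (t - t_bar)^2
    """
    tau = len(frames)
    dim = len(frames[0])
    t_bar = (tau + 1) / 2.0
    offsets = [t - t_bar for t in range(1, tau + 1)]
    denom = sum(o * o for o in offsets)
    midpoint = [sum(f[d] for f in frames) / tau for d in range(dim)]
    if denom == 0.0:
        # A one-frame segment carries no slope information.
        slope = [0.0] * dim
    else:
        slope = [
            sum(o * f[d] for o, f in zip(offsets, frames)) / denom
            for d in range(dim)
        ]
    return midpoint, slope


if __name__ == "__main__":
    print(fit_linear_trajectory([[0.5], [1.0], [1.5]]))
```

Because the time offsets are centred on the segment, the midpoint and slope estimates decouple and can be computed independently for each feature dimension.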

Training the model parameters THEORY
For optimal least-squares estimates (articulatory domain):
midpoint and slope [equations not preserved in this transcript]

Training the model parameters THEORY
For optimal maximum-likelihood estimates (articulatory domain):
midpoint and slope [equations not preserved in this transcript]

Tests on MOCHA METHOD
S. British English, at 16 kHz (Wrench, 2000)
–MFCC13 acoustic features, incl. zeroth
–articulatory x- & y-coords from 7 EMA coils
–PCA9+Lx: first nine articulatory modes plus the laryngograph log energy

MOCHA baseline performance RESULTS
[Chart: constant-trajectory SHMM (ID_0) vs. linear-trajectory SHMM (ID_1).]

Performance across mappings RESULTS

Phone categorisation METHOD
Grouping  No.  Description
A         1    all data
B         2    silence; speech
C         6    linguistic categories: silence/stop; vowel; liquid; nasal; fricative; affricate
D         10   as (Deng and Ma, 2000): silence; vowel; liquid; nasal; UV fric; /s,ch/; V fric; /z,jh/; UV stop; V stop
E         10   discrete articulatory regions
F         49   silence; individual phones

Tests on TIMIT METHOD
N. American English, at 8 kHz
–MFCC13 acoustic features, incl. zeroth
a) F1-3: formants F1, F2 and F3, estimated by Holmes formant tracker
b) F1-3+BE5: five band energies added
c) PFS12: synthesiser control parameters

TIMIT baseline performance RESULTS
[Chart: constant-trajectory SHMM (ID_0) vs. linear-trajectory SHMM (ID_1).]

Performance across feature sets RESULTS

Performance across groupings RESULTS

Results across groupings RESULTS

Model visualisation DISCUSSION
[Figure panels: original acoustic data; constant-trajectory model; linear-trajectory model (c,F).]

Conclusions SUMMARY
Developed framework for speech dynamics in an intermediate space
Linear traj. + piecewise linear mapping bounded by performance of linear traj. in acoustic space
Near optimal performance achieved
–For more than 3 formant parameters
–For 6 or more linear mappings
Formants and articulatory parameters gave qualitatively similar results
What next?

Further work SUMMARY
Complete experiments with language model
Include segment duration models
Derive pseudo-articulatory representations by unsupervised (embedded) training
Implement non-linear mapping (i.e., RBF)
Further information:
–here and now
–p.jackson@bham.ac.uk
–web.bham.ac.uk/p.jackson/balthasar
