Advances in WP2 Trento Meeting – January 2007 www.loquendo.com.

2 Activities on WP2 since last meeting
- Focus on WP1 (PEQ), WP3 (mobile platform) and WP4 (assessment)
- Test of adaptation on a project corpus: Hiwire Noisy Non-Native Corpus

3 LHN Adaptation
[Figure: the Speaker Independent MLP (SI-MLP) with the added LHN. Speech signal parameters enter at the input layer and pass through the 1st and 2nd hidden layers to the output layer, which gives the emission probabilities of the acoustic-phonetic units.]

4 LHN Training
- The global SI-MLP+LHN system is trained with speech material from the target speaker.
- The LHN is initialized with an identity matrix.
- LHN weights are trained with error back-propagation through the last layer of weights.
- The original NN weights are kept frozen.
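The training recipe above can be sketched on a toy network. This is a minimal illustration, not the Loquendo implementation: the layer sizes, the random stand-in weights, the toy data, the learning rate, and all names (`W1`, `A`, `c`, `forward`, and so on) are assumptions. It also assumes the LHN sits between the last hidden layer and the output layer, which is how the "back-propagation through the last layer of weights" bullet is read here.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Hypothetical toy sizes; a real acoustic model is far larger.
n_in, n_h1, n_h2, n_out = 8, 16, 12, 5

# Frozen SI-MLP weights (random stand-ins for a trained speaker-independent net).
W1, b1 = rng.normal(0, 0.5, (n_in, n_h1)), np.zeros(n_h1)
W2, b2 = rng.normal(0, 0.5, (n_h1, n_h2)), np.zeros(n_h2)
W3, b3 = rng.normal(0, 0.5, (n_h2, n_out)), np.zeros(n_out)

# LHN: linear layer on the last hidden activations, initialized to the identity.
A, c = np.eye(n_h2), np.zeros(n_h2)

def hidden(x):
    return sigmoid(sigmoid(x @ W1 + b1) @ W2 + b2)

def forward(x, A, c):
    h = hidden(x)
    return h, softmax((h @ A + c) @ W3 + b3)

def cross_entropy(p, y):
    return -np.mean(np.log(p[np.arange(len(y)), y] + 1e-12))

# Toy "adaptation material": random frames with random target units.
x = rng.normal(size=(200, n_in))
y = rng.integers(0, n_out, size=200)

si_only = softmax(hidden(x) @ W3 + b3)   # output of the SI-MLP without LHN
_, p0 = forward(x, A, c)                 # output with the identity-initialized LHN
loss_before = cross_entropy(p0, y)

lr = 0.05
for _ in range(300):
    h, p = forward(x, A, c)
    d_logits = p.copy()
    d_logits[np.arange(len(y)), y] -= 1.0   # softmax + cross-entropy gradient
    d_logits /= len(y)
    d_lhn = d_logits @ W3.T                 # back-propagate through the frozen output weights
    A -= lr * (h.T @ d_lhn)                 # only the LHN parameters are updated
    c -= lr * d_lhn.sum(axis=0)

loss_after = cross_entropy(forward(x, A, c)[1], y)
```

With the identity initialization, `p0` equals `si_only`, so adaptation starts exactly from the speaker-independent model and only moves away from it as the LHN weights are updated.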

5 Hiwire Noisy Corpus
- Recorded in a cockpit simulator with two noise levels
- Microphone array + beamforming (ITC)
- 5 non-native speakers
- Each speaker uttered 1 list of 100 sentences
- Sentences from the Hiwire Fixed-Demo grammar

6 Experimental conditions
- Starting models:
  - Standard Loquendo ASR EN-US
  - Telephone models (8 kHz)
  - Training set: LDC Macrophone
- Adaptation: first 50 utterances of each speaker
- Test: last 50 utterances of each speaker
- LM: Hiwire grammar (134-word vocabulary)
- Signal processing: down-sampling to 8 kHz
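The per-speaker adaptation/test split described above can be written as a short sketch. The function name and the utterance identifiers are hypothetical; the only facts taken from the slide are the 100 sentences per speaker and the 50/50 split in recording order.

```python
def split_adapt_test(utterances, n_adapt=50):
    """Return (adaptation set, test set): the first n_adapt utterances and the rest."""
    return utterances[:n_adapt], utterances[n_adapt:]

# Hypothetical utterance IDs for one speaker, in list order.
utterances = [f"spk0_{i:03d}" for i in range(100)]
adapt, test = split_adapt_test(utterances)
```

Keeping the two halves disjoint means the reported accuracy is measured on sentences the adapted model has not seen.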

7 Results on Hiwire Noisy corpus (High noise)
Recognition model: ANN/HMM
Adaptation model: LIN - LHN

Speaker   Default   Adapted LIN     Adapted LHN     Adapted LIN+LHN
          WA        WA      ER %    WA      ER %    WA      ER %
spk0      33.3      43.5    15.3    56.5    34.8    58.0    37.0
spk1      22.7      29.5     8.8    18.8    -5.0    27.8     6.6
spk2      26.9      29.0     2.9    17.9   -12.3    18.6   -11.3
spk3      50.4      62.0    23.4    64.5    28.4    73.6    46.8
spk4      23.2      35.4    15.9    23.2     0.0    37.0    18.0
Average   31.3      39.9    12.5    36.2     7.1    43.0    17.0
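The slides do not define the ER % column. A reading that reproduces the tabulated values is the relative reduction of the word error (100 - WA) with respect to the default models; the sketch below assumes that definition, and the function name is illustrative.

```python
def error_reduction(wa_default, wa_adapted):
    """Relative word-error reduction in percent, given word accuracies in percent."""
    return 100.0 * (wa_adapted - wa_default) / (100.0 - wa_default)

# High-noise spk0: default WA 33.3, LIN 43.5, LHN 56.5, LIN+LHN 58.0
er_lin = round(error_reduction(33.3, 43.5), 1)      # 15.3, as in the table
er_lhn = round(error_reduction(33.3, 56.5), 1)      # 34.8, as in the table
er_lin_lhn = round(error_reduction(33.3, 58.0), 1)  # 37.0, as in the table
```

A negative ER, as for spk1 and spk2 with LHN, means the adapted model has a lower word accuracy than the default model.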

8 Results on Hiwire Noisy corpus (Low noise)
Recognition model: ANN/HMM
Adaptation model: LIN - LHN

Speaker   Default   Adapted LIN     Adapted LHN     Adapted LIN+LHN
          WA        WA      ER %    WA      ER %    WA      ER %
spk0      58.0      82.6    58.6    79.0    50.0    89.1    74.0
spk1      47.2      69.3    41.9    64.2    32.2    68.2    39.8
spk2      60.7      79.3    47.3    82.1    54.4    85.5    63.1
spk3      71.1      81.0    34.3    70.2    -3.1    85.1    48.4
spk4      37.6      56.4    30.1    58.0    32.7    65.7    45.0
Average   54.9      73.7    41.7    70.7    35.0    78.7    52.8

9 Discussion
- In the case of the Hiwire Noisy DB there are 3 main problems:
  - Noise level
  - Non-native speakers
  - Channel: far-field microphone array + beamforming
- If the WA of the default models is too low (~20-30%), adaptation is unable to improve it, because too many segmentation errors are present in the adaptation material
- If the WA of the default models is acceptable (> 40%), adaptation can improve performance
- On this corpus, where the channel + noise component is preponderant, LIN is in some cases better than LHN
- On average, the combination LIN+LHN is better than either single technique

10 Workplan
- Selection of suitable benchmark databases (m6)
- Baseline set-up for the selected databases (m8)
- LIN adaptation method implemented and experimented on the benchmarks (m12)
- Experimental results on Hiwire database with LIN (m18)
- Innovative NN adaptation methods and algorithms for acoustic modeling and experimental results (m21)
- Further advances on new adaptation methods (m24)
- Unsupervised Adaptation: algorithms and experimentation (m33)
