Published by Jayce Root. Modified over 10 years ago.
1
Non-Native Users in the Let's Go!! Spoken Dialogue System: Dealing with Linguistic Mismatch
Antoine Raux & Maxine Eskenazi
Language Technologies Institute, Carnegie Mellon University
2
Background
Speech-enabled systems use models of the user's language
Such models are tailored for native speech
Great loss of performance for non-native users who don't follow typical native patterns
3
Previous Work on Non-Native Speech Recognition
Assumes knowledge about/data from a specific non-native population
Often based on read speech
Focuses on acoustic mismatch:
  Acoustic adaptation
  Multilingual acoustic models
4
Linguistic Particularities of Non-Native Speakers
Non-native speakers might use different lexical and syntactic constructs
Non-native speakers are in a dynamic process of L2 acquisition
5
Outline of the Talk
Baseline system and data collection
Study of non-native/native mismatch and effect of additional non-native data
Adaptive lexical entrainment
6
The CMU Let's Go!! System: Bus Schedule Information for the Pittsburgh Area
ASR: Sphinx II
Parsing: Phoenix
Dialogue Management: RavenClaw
Speech Synthesis: Festival
Hub: Galaxy
NLG: Rosetta
7
Data Collection
Baseline system accessible since February 2003
Experiments with scenarios
Publicized the phone number inside CMU in Fall 2003
8
Data Collection Web Page
9
Data
Directed experiments: 134 calls
  17 non-native speakers (5 from India, 7 from Japan, 5 others)
Spontaneous: 30 calls
Total: 1768 utterances
Evaluation Data:
  Non-Native: 449 utterances
  Native: 452 utterances
10
Speech Recognition Baseline
Acoustic Models:
  semi-continuous HMMs (codebook size: 256)
  4000 tied states
  trained on CMU Communicator data
Language Model:
  class-based backoff 3-gram
  trained on 3074 utterances from native calls
11
Speech Recognition Results
Word Error Rate:
  Native: 20.4%
  Non-Native: 52.0%
Causes of discrepancy:
  Acoustic mismatch (accent)
  Linguistic mismatch (word choice, syntax)
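For reference, word error rate is conventionally the number of word substitutions, deletions, and insertions needed to turn the recognizer output into the reference transcript, divided by the reference length. The sketch below shows that standard definition for a single utterance; the function name `word_error_rate` is illustrative, and the scoring tool actually used for the figures on this slide is not stated in the talk.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Per-utterance WER: (substitutions + deletions + insertions) / reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference prefix seen so far
    # and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word deleted by the recognizer
                curr[j - 1] + 1,     # extra word inserted by the recognizer
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("i want to go to the airport", "i want to go the airport"))
```

Corpus-level figures are usually obtained by summing the errors over all utterances and dividing by the total number of reference words, rather than by averaging per-utterance rates.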
12
Language Model Performance
Evaluation on transcripts.
Initial model: 3074 native utterances
13
Language Model Performance
Adding non-native data: 3074 native + 1308 non-native utterances
Chart legend: Initial (native) model, Mixed model
14
Natural Language Understanding
Grammar manually written incrementally, as the system was being developed
Initially built with native speakers in mind
Phoenix: robust parser (less sensitive to non-standard expressions)
15
Grammar Coverage
Initial grammar: Manually written for native utterances
16
Grammar Coverage
Grammar designed to accept some non-native patterns:
  reach = arrive
  What is the next bus? = When is the next bus?
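To make the idea of coverage concrete, here is a toy sketch that counts the fraction of utterances accepted by a pattern set before and after adding non-native variants. Everything in it is an assumption for illustration: the regular-expression patterns, the `coverage` function, and the sample utterances are hypothetical, and the real system uses hand-written Phoenix grammars whose coverage metric is not defined on these slides.

```python
import re
from typing import List

# Hypothetical toy patterns, not the Let's Go!! Phoenix grammar.
NATIVE_PATTERNS = [
    r"when is the next bus",
    r"when will the bus arrive",
]
# Same patterns, widened to accept the non-native variants from the slide.
EXTENDED_PATTERNS = [
    r"(when|what) is the next bus",
    r"when will the bus (arrive|reach)",
]


def coverage(utterances: List[str], patterns: List[str]) -> float:
    """Fraction of utterances fully matched by at least one pattern."""
    if not utterances:
        raise ValueError("need at least one utterance")
    compiled = [re.compile(p) for p in patterns]
    covered = 0
    for utterance in utterances:
        text = utterance.lower().strip(" ?!.")  # drop case and edge punctuation
        if any(c.fullmatch(text) for c in compiled):
            covered += 1
    return covered / len(utterances)


if __name__ == "__main__":
    sample = ["When is the next bus?", "What is the next bus?", "When will the bus reach?"]
    print(coverage(sample, NATIVE_PATTERNS), coverage(sample, EXTENDED_PATTERNS))
```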
17
Relative Improvement due to Additional Data
18
Effect of Additional Data on Speech Recognition
19
Adaptive Lexical Entrainment
If you can't adapt the system, adapt the user
System should use the same expressions it expects from the user
But non-native speakers might not master all target expressions
Use expressions that are close to the non-native speaker's language
Use prosody to stress incorrect words
20
Adaptive Lexical Entrainment: Example
U: I want to go the airport
S: Did you mean: I want to go TO the airport?
21
Adaptive Lexical Entrainment: Algorithm
Diagram: Target Prompts, ASR Hypothesis, DP-based Alignment, Prompt Selection, Emphasis, Confirmation Prompt
ASR hypothesis: I want to go the airport
22
Adaptive Lexical Entrainment: Algorithm
Diagram: Target Prompts, ASR Hypothesis, DP-based Alignment, Prompt Selection, Emphasis, Confirmation Prompt
ASR hypothesis: I want to go the airport
Target prompt: I'd like to go to the airport
23
Adaptive Lexical Entrainment: Algorithm
Diagram: Target Prompts, ASR Hypothesis, DP-based Alignment, Prompt Selection, Emphasis, Confirmation Prompt
ASR hypothesis: I want to go the airport
Target prompts:
  I'd like to go to the airport
  I want to go to the airport
26
Adaptive Lexical Entrainment: Algorithm
Diagram: Target Prompts, ASR Hypothesis, DP-based Alignment, Prompt Selection, Emphasis, Confirmation Prompt
ASR hypothesis: I want to go the airport
Target prompts:
  I'd like to go to the airport
  I want to go to the airport
Confirmation prompt: Did you mean: I want to go TO the airport?
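The stages named on these slides can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a word-level Levenshtein alignment with unit costs, picks the target prompt with the smallest distance, and uses upper case as a textual stand-in for prosodic emphasis. The names `align` and `confirmation_prompt` are hypothetical.

```python
from typing import List, Optional, Tuple


def align(hyp: List[str], target: List[str]) -> Tuple[int, List[bool]]:
    """Word-level edit distance plus, for each target word, whether it differs from the hypothesis."""
    n, m = len(hyp), len(target)
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i
    for j in range(1, m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = dist[i - 1][j - 1] + (hyp[i - 1] != target[j - 1])
            dist[i][j] = min(sub, dist[i - 1][j] + 1, dist[i][j - 1] + 1)

    # Backtrace from the end, flagging target words that were substituted or missing.
    flags: List[bool] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + (hyp[i - 1] != target[j - 1]):
            flags.append(hyp[i - 1] != target[j - 1])
            i, j = i - 1, j - 1
        elif j > 0 and dist[i][j] == dist[i][j - 1] + 1:
            flags.append(True)  # target word absent from the hypothesis
            j -= 1
        else:
            i -= 1  # extra hypothesis word, nothing to show in the prompt
    flags.reverse()
    return dist[n][m], flags


def confirmation_prompt(asr_hypothesis: str, target_prompts: List[str]) -> Optional[str]:
    """Return a confirmation prompt, or None when the best target matches the hypothesis exactly."""
    hyp = asr_hypothesis.lower().split()
    best = None
    for prompt in target_prompts:
        words = prompt.split()
        distance, flags = align(hyp, [w.lower() for w in words])
        if best is None or distance < best[0]:
            best = (distance, words, flags)
    if best is None or best[0] == 0:
        return None
    _, words, flags = best
    # Upper case stands in for prosodic stress on the corrected words.
    shown = [w.upper() if changed else w for w, changed in zip(words, flags)]
    return "Did you mean: " + " ".join(shown) + "?"


if __name__ == "__main__":
    targets = ["I'd like to go to the airport", "I want to go to the airport"]
    print(confirmation_prompt("I want to go the airport", targets))
```

With the example from the slides, the second target prompt is one insertion away from the hypothesis, so it is selected and the inserted word is the one emphasized.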
27
Adaptive Lexical Entrainment: Study
24 dialogues, 12.4 turns/dialogue, 2.5 entrainment prompts/dialogue
Trigger entrainment whenever best target does not perfectly match ASR
  Too simple?
  40.7% false positives
  64.4% of prompts are triggered by misrecognitions
28
Conclusion
LM and Grammar perform significantly worse on non-native speech
Including non-native data reduces the gap in linguistic metrics but not WER
Future work:
  Improve Speech Recognition
  More data for LM
  Acoustic/Lexical adaptation
  Integrate adaptive lexical entrainment in the dialog management framework
29
Example Dialogue
S: Welcome to the Let's Go!! bus information system. What can I do for you?
U: Excuse me. I'm in the place which uh where the Forbes Avenue and Murray Avenue are crossing and I want to go to the place mm Forbes where the Forbes Avenue and Craig Street are crossing.
S: I heard SIX. Is that correct?
U: Which bus uh should I take? Uh sorry uh maybe 61C bus I have… What time the next 61C bus will come?
S: The time is currently 5:37 PM. Where are you leaving from?
…
30
Example Dialogue
S: Welcome to the Let's Go!! bus information system. What can I do for you?
U: I want to go to downtown now I'm at Fifth and Bigelow.
S: Sorry, did you say I want to go to downtown?
U: Which number bus can I take and when will the next bus come here?
S: ... What can I do for you?
…