
1 MAI Internship April-May 2002

2 MAI Internship 2002 Slide 2 of 14
What?
The AST Project promotes the development of speech technology for the official languages of South Africa
South African English (SAE), Afrikaans, Zulu, Xhosa, Sesotho
Create reusable databases & software
Prototype hotel booking dialogue system
2000-2003

3 AST dialogue system: basics
Telephone Network
Speech Recognition
Natural Language Understanding
Dialogue Manager
Speech Synthesis
Database
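The component list above can be read as one dialogue turn passing through a chain of modules. The following is a minimal sketch of that reading, not the AST implementation: the class name `DialogueSystem`, the four callables, and the stub components are all hypothetical, and the order of the chain is assumed from the order of the boxes on the slide.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class DialogueSystem:
    """One possible wiring of the components named on the slide (hypothetical)."""
    recognise: Callable[[bytes], str]             # caller audio -> word string
    understand: Callable[[str], Dict[str, str]]   # word string -> meaning
    manage: Callable[[Dict[str, str]], str]       # meaning -> reply text (may consult a database)
    synthesise: Callable[[str], bytes]            # reply text -> audio for the caller

    def turn(self, audio_in: bytes) -> bytes:
        """Run a single caller utterance through the whole chain."""
        words = self.recognise(audio_in)
        meaning = self.understand(words)
        reply = self.manage(meaning)
        return self.synthesise(reply)


# Stub components so the sketch runs without any speech software.
system = DialogueSystem(
    recognise=lambda audio: audio.decode("utf-8"),
    understand=lambda words: {"text": words},
    manage=lambda meaning: "You said: " + meaning["text"],
    synthesise=lambda text: text.encode("utf-8"),
)
```

Keeping each stage behind a plain function makes it possible to swap a stub for a real recogniser or synthesiser without touching the rest of the chain.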

4 AST Speech Database
Use?
→ input ASR: acoustic training
→ output ASR: dictionary
Start from scratch, even for SAE
Telephone data based on SpeechDat
–Datasheet utterances
–Hierarchical recruiting method
Labeling Tool: PRAAT

5
No.  Language Spoken              Code   No. of Speakers
1    English (E)                         1500-2000 (total)
       Speech varieties:
       Mother-tongue English      EE     300-400
       Black English              BE     300-400
       Coloured English           CE     300-400
       Asian English              ASE    300-400
       Afrikaans English          AE     300-400
2    isiXhosa (X)                 XX     300-400
3    Sesotho (S)                  SS     300-400
4    isiZulu (Z)                  ZZ     300-400
5    Afrikaans (A)                       900-1200 (total)
       Speech varieties:
       Mother-tongue Afrikaans    AA     300-400
       Black Afrikaans            BA     300-400
       Coloured Afrikaans         CA     300-400

6 AST Speech Database
Orthographic annotation
Phonemic transcription
Acoustic signal
Phonetic alignment
Manual labour
Rules & dictionary: Patana
Forced alignment: HTK

7 AST Speech Recognition
Difficult:
–Speaker independent, noisy conditions
–Medium-size vocabulary (10,000 words)
–Training data sparse
Not so difficult:
–Dialogue Manager helps
Phoneme-based HMMs → future diphones
Finite-state language model
Pitch & clicks of African languages ignored

8 AST Natural Language Understanding
Same finite-state network as the recogniser's language model →
+: all utterances 'understood'
-: finite-state grammars (FSGs) are limited
Makes no sense to recognise more than we can understand
Semantic labels are activated
Alternative: robust parsing (Phoenix, ATIS)
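The idea of one finite-state network serving as both language model and understanding component can be illustrated with a toy grammar whose arcs carry optional semantic labels. This is a sketch under stated assumptions, not the AST grammar: the states, words, labels, and the function name `understand` are invented for illustration, and the network is assumed to be deterministic (at most one arc per word from each state).

```python
from typing import Dict, List, Optional, Set, Tuple

# state -> list of (word, next state, semantic label or None); all values are made up.
Arc = Tuple[str, int, Optional[str]]
GRAMMAR: Dict[int, List[Arc]] = {
    0: [("a", 1, None)],
    1: [("single", 2, "room=single"), ("double", 2, "room=double")],
    2: [("room", 3, None)],
    3: [("please", 4, None)],
}
FINAL_STATES: Set[int] = {3, 4}


def understand(words: List[str]) -> Optional[List[str]]:
    """Follow the words through the network and collect the labels on the way.

    Returns None when the word sequence is not a complete path through the
    grammar, i.e. when the utterance lies outside what the FSG covers.
    """
    state = 0
    labels: List[str] = []
    for word in words:
        for arc_word, next_state, label in GRAMMAR.get(state, []):
            if arc_word == word:
                state = next_state
                if label is not None:
                    labels.append(label)
                break
        else:
            return None  # no arc for this word: out of grammar
    return labels if state in FINAL_STATES else None
```

Every sequence the network accepts yields its labels directly, which mirrors the "+" point above; anything outside the network is rejected outright, which mirrors the "-" point.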

9 AST Natural Language Understanding
Speech Recognition
NLU
Dialogue Manager
FSG
Recognised utterance
Grammar ID
Meaning

10 AST Natural Language Understanding
Embedded semantic tags:
'drie honderd duisend agt en neëntig' (Afrikaans: 'three hundred thousand and ninety-eight') → 3 0 0 0 9 8
V6=3 V5=0 V4=0 V3=0 V2=9 V1=8
t1=3 t2=0 t3=0
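The digit slots V6 to V1 in the example can be reproduced with a short sketch. Note the swap in technique: the slide describes tags embedded in the grammar, whereas this sketch computes the number arithmetically and then splits it into digits. The word tables cover only a small, assumed subset of Afrikaans number words (no teens), the function names are hypothetical, and the t1 to t3 tags are left out because the slide does not define them.

```python
from typing import Dict

# Partial, illustrative vocabulary; spelling of 'neëntig' follows the slide.
UNITS = {"een": 1, "twee": 2, "drie": 3, "vier": 4, "vyf": 5,
         "ses": 6, "sewe": 7, "agt": 8, "nege": 9}
TENS = {"tien": 10, "twintig": 20, "dertig": 30, "veertig": 40, "vyftig": 50,
        "sestig": 60, "sewentig": 70, "tagtig": 80, "neëntig": 90}


def words_to_value(phrase: str) -> int:
    """Convert a space-separated Afrikaans number phrase to an integer."""
    total = 0   # thousands already completed
    group = 0   # value below 1000 that is currently being built
    for word in phrase.split():
        if word == "en":            # 'and' carries no value
            continue
        if word in UNITS:
            group += UNITS[word]
        elif word in TENS:
            group += TENS[word]
        elif word == "honderd":
            group = max(group, 1) * 100
        elif word == "duisend":
            total += max(group, 1) * 1000
            group = 0
        else:
            raise ValueError(f"word not in the sketch vocabulary: {word}")
    return total + group


def digit_slots(value: int, width: int = 6) -> Dict[str, int]:
    """Split a number into slots named V<width> (leftmost digit) down to V1."""
    if not 0 <= value < 10 ** width:
        raise ValueError("value does not fit in the requested number of slots")
    digits = f"{value:0{width}d}"
    return {f"V{width - i}": int(d) for i, d in enumerate(digits)}
```

Calling `digit_slots(words_to_value("drie honderd duisend agt en neëntig"))` gives the same six slot values as the slide.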

11 AST Dialogue Manager
Trade-off: naturalness ↔ response restriction
System-directed: predictable user utterances, simple dialogues
Mixed-initiative: shorter dialogues, more recognition errors
User-initiative: unpopular

12 AST Dialogue Manager
Design:
Early focus on users and task
Wizard-of-Oz: pay no attention to the man behind the curtain
System-in-the-loop
Finite-state structure because of simplicity and functionality
Possible frame-based approach in future
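A finite-state, system-directed dialogue of the kind described above can be sketched as a fixed sequence of question states. This is only an illustration of the general technique: the state names, prompts, and the class `FiniteStateDialogue` are hypothetical and loosely modelled on a hotel booking task, not taken from the AST system.

```python
from typing import Dict, List, Optional

# Hypothetical prompts for a system-directed booking dialogue.
PROMPTS: Dict[str, str] = {
    "ask_date": "On which date would you like to arrive?",
    "ask_nights": "How many nights would you like to stay?",
    "ask_room": "Would you like a single or a double room?",
}
ORDER: List[str] = ["ask_date", "ask_nights", "ask_room"]
CLOSING = "Thank you, your booking request has been noted."


class FiniteStateDialogue:
    """The system asks one fixed question per state and moves on when answered."""

    def __init__(self) -> None:
        self.index = 0
        self.slots: Dict[str, str] = {}

    @property
    def state(self) -> str:
        return ORDER[self.index] if self.index < len(ORDER) else "done"

    def prompt(self) -> str:
        return CLOSING if self.state == "done" else PROMPTS[self.state]

    def step(self, answer: Optional[str]) -> str:
        """Store an understood answer and advance; None means 'not understood',
        so the dialogue stays in the same state and repeats its question."""
        if self.state != "done" and answer is not None:
            self.slots[self.state] = answer
            self.index += 1
        return self.prompt()
```

Because each state expects one kind of answer, the recogniser can load a small state-specific grammar, which is the predictability benefit of system-directed dialogues; the cost is that the caller cannot volunteer information out of order.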

13 AST Speech Synthesis
Fixed machine utterances: pre-recorded speech
Database queries: limited-domain synthesis (Festival platform)

14 Conclusion
Finite-state approach in
–Recogniser
–NLU component
–Dialogue manager
→ Workable prototype
→ New funding in 2003

