MAI Internship April-May 2002

Slide 2 of 14: What?
The AST Project promotes the development of speech technology for official languages of South Africa:
– South African English, Afrikaans, Zulu, Xhosa, Sesotho
Create reusable databases and software
Prototype: hotel booking dialogue system

Slide 3 of 14: AST dialogue system: basics
Components in the block diagram:
– Telephone Network
– Speech Recognition
– Natural Language Understanding
– Dialogue Manager
– Speech Synthesis
– Database

Slide 4 of 14: AST Speech Database
Use?
– Input ASR: acoustic training
– Output ASR: dictionary
Start from scratch, even for SAE
Telephone data based on SpeechDat
– Datasheet utterances
– Hierarchical recruiting method
Labeling tool: PRAAT

Slide 5 of 14: AST Speech Database

Language spoken                    Code   No. of speakers
English (E), speech varieties:
  Mother-tongue English            EE
  Black English                    BE
  Coloured English                 CE
  Asian English                    ASE
  Afrikaans English                AE
isiXhosa (X)                       XX
Sesotho (S)                        SS
isiZulu (Z)                        ZZ
Afrikaans (A), speech varieties:
  Mother-tongue Afrikaans          AA
  Black Afrikaans                  BA
  Coloured Afrikaans               CA

Slide 6 of 14: AST Speech Database
Data levels:
– Orthographic annotation
– Phonemic transcription
– Acoustic signal
– Phonetic alignment
Methods:
– Manual labour
– Rules & dictionary: Patana
– Forced alignment: HTK

Slide 7 of 14: AST Speech Recognition
Difficult:
– Speaker independent, noisy conditions
– Medium-size vocabulary ( words)
– Training data sparse
Not so difficult:
– Dialogue Manager helps
Phoneme-based HMMs -> diphones in future
Finite-state language model
Pitch and clicks of the African languages are ignored

Slide 8 of 14: AST Natural Language Understanding
Same finite-state network as the recogniser's language model ->
– +: all utterances ‘understood’
– -: finite-state grammars (FSGs) are limited
Makes no sense to recognise more than we can understand
Semantic labels are activated
Alternative: robust parsing (Phoenix, ATIS)

Slide 9 of 14: AST Natural Language Understanding
Components in the diagram: Speech Recognition, NLU, Dialogue Manager, FSG
Data passed between them: recognised utterance, grammar ID, meaning
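
The idea on slides 8 and 9, one finite-state network that both constrains what can be recognised and activates semantic labels, can be sketched roughly as follows. This is only an illustration: the grammar, the state names, the English hotel-booking phrases and the function `understand` are all hypothetical and are not taken from the AST system.

```python
from typing import Optional

# Hypothetical toy grammar (an assumption, not the AST grammar):
# state -> {word: (next_state, semantic label or None)}
GRAMMAR = {
    "start": {"i": ("want", None), "book": ("obj", ("action", "book"))},
    "want": {"want": ("to", None)},
    "to": {"to": ("verb", None)},
    "verb": {
        "book": ("obj", ("action", "book")),
        "cancel": ("obj", ("action", "cancel")),
    },
    "obj": {"a": ("room_type", None)},
    "room_type": {
        "single": ("room", ("room_type", "single")),
        "double": ("room", ("room_type", "double")),
    },
    "room": {"room": ("end", None)},
}
FINAL_STATES = {"end"}


def understand(utterance: str) -> Optional[dict]:
    """Walk the network word by word and collect the labels on the arcs.

    Returns the collected meaning, or None if the utterance is not
    covered by the grammar (the limitation noted on slide 8).
    """
    state = "start"
    meaning = {}
    for word in utterance.lower().split():
        transitions = GRAMMAR.get(state, {})
        if word not in transitions:
            return None  # out-of-grammar word: nothing is understood
        state, label = transitions[word]
        if label is not None:
            key, value = label
            meaning[key] = value  # a semantic label is activated
    return meaning if state in FINAL_STATES else None


print(understand("I want to book a double room"))
print(understand("I want a suite"))
```

Every utterance the network accepts comes with a meaning attached, which is the "+" on slide 8. Anything outside the network is rejected outright, which is the "-".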

Slide 10 of 14: AST Natural Language Understanding
Embedded semantic tags:
‘drie honderd duisend agt en neëntig’ -> V6=3 V5=0 V4=0 V3=0 V2=9 V1=8 t1=3 t2=0 t3=0
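
As a rough illustration of how the digit tags V6 to V1 in the example could be derived, the sketch below converts a small set of Afrikaans number words to an integer and then splits it into one tag per decimal position. The word lists cover only part of the number vocabulary, and the function names `words_to_int` and `digit_tags` are hypothetical. The t1 to t3 tags are left out because the slide does not explain them. The real system attaches the tags inside the grammar; this stand-alone conversion is an assumption made for clarity.

```python
# Partial Afrikaans number vocabulary (units and tens only).
UNITS = {
    "een": 1, "twee": 2, "drie": 3, "vier": 4, "vyf": 5,
    "ses": 6, "sewe": 7, "agt": 8, "nege": 9,
}
TENS = {
    "twintig": 20, "dertig": 30, "veertig": 40, "vyftig": 50,
    "sestig": 60, "sewentig": 70, "tagtig": 80, "neëntig": 90,
}


def words_to_int(phrase: str) -> int:
    """Convert a number phrase such as 'agt en neëntig' to an integer."""
    total = 0    # completed thousands
    current = 0  # value being built below the next multiplier
    for word in phrase.lower().split():
        if word == "en":  # 'and' between units and tens
            continue
        if word in UNITS:
            current += UNITS[word]
        elif word in TENS:
            current += TENS[word]
        elif word == "honderd":
            current = max(current, 1) * 100
        elif word == "duisend":
            total += max(current, 1) * 1000
            current = 0
        else:
            raise ValueError(f"unknown number word: {word}")
    return total + current


def digit_tags(value: int, width: int = 6) -> dict:
    """Split a number into tags V<width>..V1, one per decimal digit."""
    digits = str(value).zfill(width)
    if len(digits) > width:
        raise ValueError("value does not fit in the available tags")
    return {f"V{width - i}": int(d) for i, d in enumerate(digits)}


value = words_to_int("drie honderd duisend agt en neëntig")
print(value)
print(digit_tags(value))
```

For the phrase on the slide this yields 300098, and the resulting tags match the V6 to V1 values shown there.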

Slide 11 of 14: AST Dialogue Manager
Trade-off: naturalness vs. response restriction
System-directed: predictable user utterances, simple dialogues
Mixed-initiative: shorter dialogues, more recognition errors
User-initiative: unpopular

Slide 12 of 14: AST Dialogue Manager
Design:
– Early focus on users and task
– Wizard-of-Oz: pay no attention to the man behind the curtain
– System-in-the-loop
Finite-state structure because of simplicity and functionality
Possible frame-based approach in future
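
A finite-state, system-directed dialogue manager of the kind described on slides 11 and 12 might look roughly like the sketch below. The states, prompts, slot names and allowed answers are invented for illustration and do not come from the AST prototype. User input is passed in as a list of strings, so no recogniser is needed to try it.

```python
# Hypothetical dialogue graph: each state asks one question, accepts a
# restricted set of answers, and names the state that follows.
DIALOGUE = {
    "ask_city": {
        "prompt": "In which city would you like to stay?",
        "slot": "city",
        "allowed": {"stellenbosch", "cape town"},
        "next": "ask_nights",
    },
    "ask_nights": {
        "prompt": "For how many nights?",
        "slot": "nights",
        "allowed": {"1", "2", "3"},
        "next": "done",
    },
}


def run_dialogue(user_turns):
    """Drive the dialogue with simulated user turns.

    Returns the filled slots, the state reached, and the prompts issued.
    An answer outside the allowed set keeps the system in the same
    state, so the same question is asked again.
    """
    state = "ask_city"
    slots = {}
    prompts = []
    turns = iter(user_turns)
    while state != "done":
        node = DIALOGUE[state]
        prompts.append(node["prompt"])
        answer = next(turns, None)
        if answer is None:  # the caller has no more input
            break
        answer = answer.strip().lower()
        if answer in node["allowed"]:
            slots[node["slot"]] = answer
            state = node["next"]
    return slots, state, prompts


slots, state, prompts = run_dialogue(["Paris", "Cape Town", "2"])
print(slots, state)
```

Because the system asks one restricted question at a time, user answers stay predictable, at the cost of a longer and less natural dialogue. This is the trade-off noted on slide 11.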

Slide 13 of 14: AST Speech Synthesis
Fixed machine utterances: pre-recorded speech
Database queries: limited-domain synthesis (Festival platform)

Slide 14 of 14: Conclusion
Finite-state approach in
– Recogniser
– NLU component
– Dialogue manager
-> Workable prototype
-> New funding 2003