1 BILC SEMINAR 2009 Speech Recognition: Is It for Real? Tony Mirabito Defense Language Institute English Language Center (DLIELC) DLIELC.

Presentation transcript:

1 BILC SEMINAR 2009 Speech Recognition: Is It for Real? Tony Mirabito Defense Language Institute English Language Center (DLIELC) DLIELC

2 OVERVIEW
● The technology evolution in language teaching
● What speech recognition is
● How speech recognition works
● Shortcomings of speech recognition
● Conclusions

3 THE TECHNOLOGY EVOLUTION The Classroom

4 WHAT SPEECH RECOGNITION IS
● A formal definition:
   -- A system of spoken input into a computer in which software can “recognize” the input and transform it into digitized signals; that is, the software can “react” in various ways to the spoken input
● Examples:
   -- Speech to text
   -- “Telephony”: airlines, transportation, etc.
   -- Commercial software for learning a language (e.g. Rosetta Stone)

5 WHAT SPEECH RECOGNITION IS
● A computer program that takes verbal input and “matches” it against models: acoustic models and language models
● A computer program that allows speech to be evaluated as “correct” or “acceptable”, or as “incorrect” or “unacceptable”
● A computer program that “talks to” authoring software and allows that software to branch in different directions based on the evaluation

6 WHAT SPEECH RECOGNITION IS
● “Speaker independent” vs. “speaker dependent”
   -- Speaker independent: a speech recognition program that recognizes all speakers (used in language learning)
   -- Speaker dependent: a speech recognition program that is “trained” to recognize a particular speaker (speech to text)

7 WHAT SPEECH RECOGNITION IS
● “Discrete speech input” vs. “continuous speech input”
   -- Discrete speech input: requires a user to pause between words (e.g. “I + want + to + leave.”)
   -- Continuous speech input: blending of sounds between words is allowed (e.g. “next + week” becomes “neksweek”)
● A recognizer cannot “understand” free speech; it “matches” speech input with stored data and pre-determined parameters

8 HOW SPEECH RECOGNITION WORKS
● Speech recognition technology is based on Markov chain theory (in practice, hidden Markov models), a mathematical framework for the probabilities of moving from one state to another
● Most speech recognition engines contain common databases for a particular language:
   -- Grammar
   -- Lexicon (dictionary)
   -- Supra-segmental models (prosody)
   -- Acoustic speech samples
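
To make the Markov chain idea on slide 8 concrete, here is a minimal sketch in Python. The two sound classes and every probability in it are invented for illustration and are not taken from any real engine; actual recognizers use far larger hidden Markov models.

```python
# Hypothetical two-state Markov chain over sound classes.
# All probabilities below are made up for illustration only.
START = {"vowel": 0.4, "consonant": 0.6}
TRANSITIONS = {
    "vowel": {"vowel": 0.2, "consonant": 0.8},
    "consonant": {"vowel": 0.7, "consonant": 0.3},
}


def sequence_probability(states):
    """Return the probability of a state sequence under the chain.

    The result is the start probability of the first state multiplied
    by the transition probability of each consecutive pair of states.
    """
    if not states:
        return 1.0
    prob = START[states[0]]
    for prev, cur in zip(states, states[1:]):
        prob *= TRANSITIONS[prev][cur]
    return prob


if __name__ == "__main__":
    print(sequence_probability(["consonant", "vowel", "consonant"]))
```

A recognizer built on this idea would compare the probabilities of competing sequences and prefer the most likely one.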

9 HOW SPEECH RECOGNITION WORKS
● Speech is input through a microphone, analyzed by databases (“search aligners”), and then scored against a norm
● A developer can determine the score as being “acceptable” or “unacceptable”; appropriate feedback from the authoring software can be given to the user (text, audio, video, etc.)
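
The scoring step on slide 9 can be sketched roughly as follows. This is only an illustration under stated assumptions: the 0-to-1 score scale, the default threshold of 0.7, and the names `Feedback` and `evaluate` are all hypothetical and do not come from the presentation or from any particular engine.

```python
from dataclasses import dataclass


@dataclass
class Feedback:
    accepted: bool
    message: str


def evaluate(score: float, threshold: float = 0.7) -> Feedback:
    """Classify a recognizer score and choose the feedback branch.

    `score` is assumed to lie between 0 and 1; `threshold` is the
    developer-chosen cut-off between "acceptable" and "unacceptable".
    """
    if score >= threshold:
        return Feedback(True, "Acceptable: continue to the next item.")
    return Feedback(False, "Unacceptable: please try again.")


if __name__ == "__main__":
    for s in (0.85, 0.40):
        print(s, evaluate(s))
```

Raising the threshold makes the program stricter (more rejections of good speech), while lowering it makes the program more lenient (more acceptances of poor speech).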

10 HOW SPEECH RECOGNITION WORKS
[Diagram labels: Speech Input, SR Engine, Databases, Authoring Software, Feedback; sample text: “That’s a table.”]
● Communication between the speech engine and the authoring software is essential

11 SHORTCOMINGS OF SPEECH RECOGNITION TECHNOLOGY
● A quiet environment is needed, plus noise-reduction microphones
● General problems with consonants
● Prosody and fluency problems, which require re-engineering the engine
● Perpetual issues: false positives and false negatives
● Most recognizers are effective only in limited domains
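
The false positive and false negative issue on slide 11 can be measured if a human rater also judges each utterance. The sketch below assumes such paired judgments exist; the function name `error_rates` and the data layout are hypothetical.

```python
def error_rates(trials):
    """Compute false positive and false negative rates.

    `trials` is a sequence of (actually_correct, recognizer_accepted)
    boolean pairs, where the first value is a human rater's judgment.
    A false positive is bad speech that was accepted; a false negative
    is good speech that was rejected.
    """
    trials = list(trials)
    fp = sum(1 for truth, accepted in trials if not truth and accepted)
    fn = sum(1 for truth, accepted in trials if truth and not accepted)
    negatives = sum(1 for truth, _ in trials if not truth)
    positives = sum(1 for truth, _ in trials if truth)
    return {
        "false_positive_rate": fp / negatives if negatives else 0.0,
        "false_negative_rate": fn / positives if positives else 0.0,
    }


if __name__ == "__main__":
    sample = [(True, True), (True, False), (False, True),
              (False, False), (False, False)]
    print(error_rates(sample))
```

Tracking both rates separately matters because a setting that lowers one usually raises the other.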

12 SHORTCOMINGS OF SPEECH RECOGNITION TECHNOLOGY
● If used as a diagnostic tool to pinpoint pronunciation problems, it must be carefully re-engineered to do so, and it must be accurate

13 CONCLUSIONS
● Most speech recognizers have a couple of domains where they are effective; the recognizer should be used only in these domains (“low stakes” vs. “high stakes”)
● Speech recognition technology is not a “black box” or a “magic pill”. It’s a tool that has to be used very carefully.
● Research needs to be done in order to use a recognizer effectively
● We must never forget that technology is effective only if it allows people to learn

14 BILC SEMINAR 2009 QUESTIONS? COMMENTS? DLIELC