How to integrate automatic speech recognition (ASR) into CALL applications
Helmer Strik
Department of Linguistics, Centre for Language and Speech Technology (CLST)
Radboud University Nijmegen, The Netherlands
LESLLA, Antwerpen

Overview
- Introduction
- ASR: automatic speech recognition
- ASR-based tutoring
- ASR-based CALL
- ASR-based literacy training
- Conclusions

Introduction
- Students who receive 1-on-1 instruction perform as well as the top two percent of students who receive traditional classroom instruction [Bloom 1984]
- A human tutor for every student is not feasible, hence computer tutors
- For language learning: CALL
- Many CALL systems are text-based
- Include speech: speech-based CALL system

Speech inside
- Many applications with 'speech':
  - Screen readers
  - Reading pen
  - Mobile phone: photo + OCR + TTS
- Some are also useful for CALL

Speech inside (cont'd)
- Many applications with 'speech': screen readers, reading pen, etc.
- Some are also useful for CALL
- However, usually the learner can only listen (TTS: text-to-speech), or can also speak, but
  - there is no assessment, or
  - the learner has to carry out the assessment, e.g. by comparing with examples
- Alternative: use ASR / speech technology
- Is it feasible?

ASR: automatic speech recognition
- What is ASR? Speech-to-text conversion
- Applications:
  - Dictation
  - Command and control
  - Spoken dialogue systems (information)
  - etc.
- ASR is not flawless, and it probably never will be, esp. for non-native speech
- Note: human listeners are not flawless either!

Speech Recognition
[Slide labels: cgn2-s, vb, nn, mii]

ASR-based tutoring
- ITS: Intelligent Tutoring Systems
- Spoken dialogue system for learning
- Subject matter: math, physics, etc.
- Examples:
  - ITSPOKE, Univ. of Pittsburgh, Litman et al. Topic: physics
  - SCoT, Stanford Univ., Peters et al. Topic (SCoT-DC): shipboard damage control
- Communication is by speech; the subject matter doesn't have to be speech

ASR-based CALL
- The subject matter is speech (language)
- Late 1990s:
  - 1998: STiLL, Marholmen (Sweden); 1st time the CALL and speech communities met
  - 1999: Special Issue of CALICO, 'Tutors that Listen', focusing on ASR (mainly 'discrete ASR')

ASR-based literacy training
- What has been done?
- Reading tutors (the learner reads, not the PC):
  - Listen, CMU, Pittsburgh; Mostow et al. (1994)
  - STAR system, UK; Russell et al. (1996)
  - SPACE, KU Leuven; Van hamme, Duchateau, et al.
  - … and many others
- FtL: Foundations to Literacy, Boulder; Cole, Wise, et al.

ASR-based literacy training: Foundations to Literacy
- Interactive Books: teach fluent reading & comprehension
- Foundational Skills Tutors: teach underlying reading skills
- Phonics

ASR-based literacy training (cont'd)
- What has been done?
- Reading tutors:
  - Listen, CMU, Pittsburgh; Mostow et al. (1994)
  - STAR system, UK; Russell et al. (1996)
  - SPACE, KU Leuven; Van hamme, Duchateau, et al.
  - … and many others
- FtL: Foundations to Literacy, Boulder; Cole, Wise, et al.
- Mostly for children
- And for adults?
  - What is needed?
  - What is possible, and what is not?
  - …

ASR-based CALL
- ASR is not flawless, and it probably never will be, esp. for non-native speech
- Be aware of what is (not) possible with ASR technology
- Problematic issues and possible solutions:
  - Noise, esp. background speech: minimize, use head-sets
  - Disfluencies: minimize, improve automatic handling
  - Non-native pronunciation:
    - Recognizing utterances: utterance verification
    - Detecting pronunciation errors: classifiers
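
The utterance verification idea on this slide can be illustrated with a minimal sketch. It assumes a recognizer that returns the recognized words together with a per-word confidence score between 0 and 1; the `RecognizedWord` type, the `verify_utterance` function, and the threshold of 0.6 are hypothetical and chosen only for illustration. They are not taken from Dutch-CAPT or any specific ASR toolkit.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RecognizedWord:
    """One word of a (hypothetical) recognizer output."""
    word: str
    confidence: float  # assumed to lie in [0, 1]


def verify_utterance(target: str,
                     hypothesis: List[RecognizedWord],
                     threshold: float = 0.6) -> bool:
    """Accept the utterance only if the recognized words equal the
    prompted target and the mean confidence reaches the threshold."""
    target_words = target.lower().split()
    hyp_words = [w.word.lower() for w in hypothesis]
    if not hypothesis or hyp_words != target_words:
        return False
    mean_conf = sum(w.confidence for w in hypothesis) / len(hypothesis)
    return mean_conf >= threshold


if __name__ == "__main__":
    attempt = [RecognizedWord("goede", 0.9), RecognizedWord("morgen", 0.7)]
    print(verify_utterance("Goede morgen", attempt))
```

Because the learner is prompted with a known target, the system only has to decide whether that target was spoken, which is an easier task than open recognition of non-native speech. The threshold trades false acceptances against false rejections and would need tuning on real learner data.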

ASR-based CALL: our research
- Non-natives
  - Assessment of oral proficiency
  - Dutch-CAPT: pronunciation
    - ASR / UV: Utterance Verification
    - PED: Pronunciation Error Detection
  - DISCO: pronunciation, morphology, syntax
  - TST-AAP
- People with speech disability, for training & as communication aid (AAC)
  - ASR for dysarthric speech
  - EST: E-learning based Speech Therapy

ASR-based CALL
- Project Dutch-CAPT (Computer Assisted Pronunciation Training)

ASR-based CALL (cont'd)
- Project Dutch-CAPT (CAPT: Computer Assisted Pronunciation Training)
- Exp. group: used the Dutch-CAPT system
- 2 control groups: didn't use Dutch-CAPT
- The reduction in the number of pronunciation errors was significantly larger for the exp. group
- Training: 4 weeks x 1 session of 30-60 minutes

ASR-based CALL (cont'd)
- ASR is not flawless, and it probably never will be, esp. for non-native speech
- Be aware of what is (not) possible with ASR technology
- Problematic issues and possible solutions:
  - Noise, esp. background speech: minimize, use head-sets
  - Disfluencies: minimize, improve automatic handling
  - Non-native pronunciation:
    - Recognizing utterances: utterance verification
    - Detecting pronunciation errors: classifiers
- Mix of expertise needed: ASR technology, language acquisition, pedagogy, design, …

ASR-based literacy training
- Demonstration project TST-AAP
- Existing course
- Add speech technology: detect whether words & sounds were pronounced (correctly)

ASR-based literacy training
- Listening; PC produces speech
  - Text-To-Speech (TTS); is the quality good enough?
  - Recorded speech, concatenation
- Speaking; PC recognizes speech
  - Phonics (see FtL)
  - PC: recognize words, utterances: confidence measures (CMs) for Utt. Ver.
  - PC: recognize sounds: CMs for Phon. Ver. (contrasts)
- Reading (reading tutors)
  - PC: recognize words, utterances
  - PC: pointer in the text ('track' the reader)
  - PC: help when encountering problems
  - PC: change tempo: read faster
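
The 'pointer in the text' function of a reading tutor can be sketched as follows. This is only an illustration under simplifying assumptions: the recognizer is assumed to deliver a plain list of recognized words, and the `track_reader` function with its `lookahead` parameter is hypothetical, not the method used by Listen, SPACE, or any other system named above.

```python
from typing import List


def track_reader(text_words: List[str],
                 recognized: List[str],
                 lookahead: int = 2) -> int:
    """Return the index of the next word the reader is expected to read.

    Each recognized word is compared with the expected word and with up
    to `lookahead` words after it, so that a skipped or misrecognized
    word does not stall the pointer. Unmatched words (hesitations,
    noise) leave the pointer where it is.
    """
    position = 0
    for heard in recognized:
        window = text_words[position:position + lookahead + 1]
        for offset, expected in enumerate(window):
            if heard.lower() == expected.lower():
                position += offset + 1
                break
    return position


if __name__ == "__main__":
    text = "de kat zit op de mat".split()
    # The reader hesitates ("eh") and skips "zit".
    heard = ["de", "kat", "eh", "op"]
    pointer = track_reader(text, heard)
    print(pointer, text[pointer])
```

A pointer that stays on the same word for a long time is one possible signal that the reader has run into a problem and that the tutor could offer help.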

ASR-based CALL
- Advantages of using speech (vs. writing):
  - Self-explanation
  - Extra information:
    - Prosody (stress, accent)
    - Emotions
    - Confidence
- Other useful techniques: VTH

Conclusions
- ASR is not flawless
- ASR-based tutoring is possible (restricted domain):
  - General topics; ITS: ITSPOKE, SCoT
  - CALL; many systems: non-natives, disabled, etc.
  - Literacy training
    - So far mainly for children
    - And for adults!?
- Needed:
  - Mix of expertise: technology, language acquisition, pedagogy, design, …
  - Improved ASR, speech technology
  - Projects, funds

Questions?
- Why are there so few ASR-based CALL / literacy applications for adults?
- What are, in this context, important differences between children & adults?
- What is needed?
  - Listening; PC produces speech
  - Speaking; PC recognizes speech
    - Phonics
  - Reading (reading tutors)
  - What else?