The TBALL Project Data Collection: Making a Young Children's Speech Corpus

Abe Kazemzadeh*, Hong You+, Markus Iseli+, Barbara Jones+, Xiaodong Cui+, Margaret Heritage+, Patti Price^, Elaine Anderson*, Shrikanth Narayanan*, and Abeer Alwan+
* University of Southern California, + University of California Los Angeles, and ^ PPRICE Speech and Language Technology

Project Overview: Technology Based Assessment of Language and Literacy

Project Goals
● Automation of literacy assessment measures using speech and language technology.
● Development of standards and methods for reliable, objective assessment.
● One-on-one interaction with the child, which leaves teachers with more time for teaching.
● Focus on fair assessment that is robust to dialect variation, including nonnative speakers.
● Support for teacher feedback and database records.

Data Collection Motivation
● Establish a corpus for studying child and non-native speech.
● Build speech applications for under-represented populations.
● Analyze pronunciation variation.
● Test bed for our target child-computer interface.
● Test hardware, animations, timing, vocabulary, etc.
● Measure children's ability with respect to grade level and other factors.

Results and Observations

Reading Tactics
● Sounding out words generally helped children.
● Mispronunciations occurred when a subword portion was confused with another word (e.g., once, using).
● Confusion among the different sounds an orthographic symbol may have (e.g., “now” pronounced as “no”).

Age/Grade Effects
● Position in the school year matters (children learn a lot between the beginning and end of the school year).
● Younger children are more timid.
● Less social and reading experience.
● Less exposure to computers.

Pronunciation Variation
● Read speech is slower.
● Long breaks between a fricative and a following stop (e.g., s-tart).
● Long liquids, nasals, and fricatives.
● Syllables spoken slowly (e.g., a-long).
● Final consonants may be delayed or dropped (e.g., par-t or par-).
● Difficulty with “am” and “an” in isolation.
● At times, children speak in an exaggerated voice.
Language Background Effects
● Difficulty associating words with pictures.
● Children who could read in Spanish but not English sometimes read sentences better than individual words.
● Sounding out words with Spanish letter-to-sound rules.

Higher Level Phenomena
● Using “a, an, some” in picture naming, perhaps due to grammatical differences between English and Spanish.
● Verb tense changes when reading sentences.
● Formation of contractions from long forms (but not vice versa).
● Reanalysis of a sentence after the child realizes a mistake he/she has made.

Wizard of Oz Interface
Our recording setup was similar to our target application:
● Two visible operators:
  ● The “techie” controls the presentation of stimuli and monitors recording quality.
  ● The “tester” instructs and guides the child.
● The target application would include just the child and computer.
● 20 min. per child maximum.
● A secure database stores child demographic info and speech recordings:
  ● Age, grade, English level,
  ● native language, language used at home, language used with friends,
  ● parents' native language, parents' birthplaces.
● Accommodations for children.
● ~15 children recorded per day (~1.9 hours of recorded speech per day).
● Stimuli: pictures, colors, alphabet, numbers, words, and sentences.

Transcriptions
● Enhanced ARPABET symbols to represent phenomena peculiar to non-native and children's speech:
  ● Dental stops
  ● Unaspirated voiceless stops
  ● Negative VOT (prevoiced) stops
  ● Lispy /s/
  ● Glottalized /t/
  ● Long frication of /f/
  ● Trill
  ● Syllabic consonants
● A convention to represent the vowel space with respect to English vowels:
  ● Non-native vowels are defined by the two nearest English vowels, with the highest vowel first (e.g., /iyih/).
● 82% phone-level transcriber agreement.
● Transcribers started with 100% overlap (each file transcribed twice), reduced to 25% after agreement was established.
● Sentences are transcribed by word-level alignments, with phone-level detail if there was pronunciation variation.
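The poster reports 82% phone-level transcriber agreement but does not say how that figure was computed. One plausible way to measure it, sketched here in Python under that caveat, is to align two transcribers' phone sequences for the same utterance and count the phones they share. The function name `phone_agreement`, the normalization by the longer sequence, and the example transcriptions are all illustrative assumptions, not the project's actual procedure.

```python
from difflib import SequenceMatcher


def phone_agreement(phones_a, phones_b):
    """Fraction of phones two transcriptions share after alignment.

    Assumption: agreement = aligned matching phones / length of the
    longer transcription. The poster does not specify its own metric.
    """
    if not phones_a and not phones_b:
        return 1.0  # two empty transcriptions agree trivially
    matcher = SequenceMatcher(a=phones_a, b=phones_b, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / max(len(phones_a), len(phones_b))


# Hypothetical example: one transcriber hears the final /t/ of "start",
# the other marks it as dropped.
transcriber_1 = ["s", "t", "aa", "r", "t"]
transcriber_2 = ["s", "t", "aa", "r"]
start_agreement = phone_agreement(transcriber_1, transcriber_2)

# Hypothetical example: disagreement on a non-native vowel symbol.
vowel_agreement = phone_agreement(["b", "iy", "t"], ["b", "iyih", "t"])

print(f"start: {start_agreement:.2f}, vowel: {vowel_agreement:.2f}")
```

A corpus-level figure could then be obtained by summing matches and lengths over all doubly transcribed files rather than averaging per-utterance scores, which avoids giving very short utterances undue weight.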
[Figure labels: “techie”, “tester”, child]

Acknowledgements
The TBALL (Technology Based Assessment of Language and Literacy) project is supported in part by the NSF. In addition, this work would not be possible without the hard work of transcribers Daylen Riggs and Nathan Go; the patience and bilingualism of Kimberly Reynolds and Blanca Martinez; the careful recordings of Erdem Unal, Vivek Rangarajan, Shiva Sundaram, Yirong Yang, Jinjin Ye, and Yijian Bai; and the planning of Larry Casey and Christy Boscardin.

Corpus Statistics
● 256 children recorded.
● ~30,000 utterances.
● 40 hours of speech.
● 13 GB of speech data sampled at 44.1 kHz.

[Figure: Native Language Distribution of Recorded Subjects]
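As a rough consistency check on the corpus statistics, the reported size can be compared with the uncompressed size implied by 40 hours of audio at 44.1 kHz. The poster does not state the sample width or channel count, so the sketch below assumes 16-bit mono PCM; under that assumption the result is about 12.7 GB, in line with the reported 13 GB. The same figures also give approximate per-child and per-utterance averages.

```python
# Figures taken from the poster.
SAMPLE_RATE_HZ = 44_100
HOURS_OF_SPEECH = 40
CHILDREN_RECORDED = 256
UTTERANCES = 30_000  # reported as approximate

# Assumptions (not stated in the poster): 16-bit samples, one channel.
BYTES_PER_SAMPLE = 2
CHANNELS = 1


def uncompressed_size_bytes(hours, sample_rate_hz, bytes_per_sample, channels):
    """Size of raw PCM audio of the given duration, in bytes."""
    return hours * 3600 * sample_rate_hz * bytes_per_sample * channels


size_bytes = uncompressed_size_bytes(
    HOURS_OF_SPEECH, SAMPLE_RATE_HZ, BYTES_PER_SAMPLE, CHANNELS
)
size_gb = size_bytes / 1e9  # decimal gigabytes

minutes_per_child = HOURS_OF_SPEECH * 60 / CHILDREN_RECORDED
seconds_per_utterance = HOURS_OF_SPEECH * 3600 / UTTERANCES

print(f"Implied size: {size_gb:.1f} GB")
print(f"Average speech per child: {minutes_per_child:.1f} min")
print(f"Average utterance length: {seconds_per_utterance:.1f} s")
```

These averages count all recorded audio, including pauses, so they are upper bounds on actual speaking time.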