Detecting Prosody Improvement in Oral Rereading Minh Duong and Jack Mostow Project LISTEN www.cs.cmu.edu/~listen Carnegie Mellon University The research.

Slides:



Advertisements
Similar presentations
Chapter 10 Fluency Instruction
Advertisements

Fluency Assessment Bryan Karazia.
Chapter 9 - Fluency Assessment
The Role of F0 in the Perceived Accentedness of L2 Speech Mary Grantham O’Brien Stephen Winters GLAC-15, Banff, Alberta May 1, 2009.
MAP: Basics Overview Jenny McEvoy and Page Powell.
Reading Fluency Instruction and Its Effect on Student Achievement By: Kelly Shea.
Chapter 9: Fluency Assessment
Fluency Grades 2-5 Planning Session Presentation October 2010.
Toward Automatic Music Audio Summary Generation from Signal Analysis Seminar „Communications Engineering“ 11. December 2007 Patricia Signé.
Model Adequacy Checking in the ANOVA Text reference, Section 3-4, pg
Section IV: Reading Fluency Teaching Reading Sourcebook 2 nd edition.
O RAL R EADING F LUENCY Goal: Help you child be a Superhero Reader! Created and Presented by Diane M. Leja Literacy Coach.
Carnegie Mellon 1. Toward Exploiting EEG Input in a Reading Tutor Jack Mostow, Kai-min Chang, and Jessica Nelson Project LISTEN
Combining Prosodic and Text Features for Segmentation of Mandarin Broadcast News Gina-Anne Levow University of Chicago SIGHAN July 25, 2004.
Mining Data from Randomized Within-Subject Experiments in an Automated Reading Tutor Joseph E. Beck and Jack Mostow Project LISTEN (
Descriptive Statistics Primer
1 Evaluating the Effect of Predicting Oral Reading Miscues Satanjeev Banerjee, Joseph Beck, Jack Mostow Project LISTEN ( Carnegie.
Lessons from generating, scoring, and analyzing questions in a Reading Tutor for children Jack Mostow Project LISTEN (
Today Concepts underlying inferential statistics
How to Generate Cloze Questions from Definitions: a Syntactic Approach Donna Gates, Gregory Aist (Iowa State University), Jack Mostow, Margaret McKeown.
Chapter 9 Fluency Assessment Tina Jensen. What? Fluency Assessment Consists of listening to students read aloud for a given time to collect information.
Robust Recognition of Emotion from Speech Mohammed E. Hoque Mohammed Yeasin Max M. Louwerse {mhoque, myeasin, Institute for Intelligent.
Answering questions about life with statistics ! The results of many investigations in biology are collected as numbers known as _____________________.
® Automatic Scoring of Children's Read-Aloud Text Passages and Word Lists Klaus Zechner, John Sabatini and Lei Chen Educational Testing Service.
[1] Processing the Prosody of Oral Presentations Rebecca Hincks KTH, The Royal Institute of Technology Department of Speech, Music and Hearing The Unit.
Fluency Assessment Ch. 9. What? Fluency should be assessed often! –Listen to students read aloud –Collect information about oral reading accuracy & rate.
Chapter 9 Comparing More than Two Means. Review of Simulation-Based Tests  One proportion:  We created a null distribution by flipping a coin, rolling.
Statistics. Intro to statistics Presentations More on who to do qualitative analysis Tututorial time.
Descriptive Statistics And related matters. Two families of statistics Descriptive statistics – procedures for summarizing, organizing, graphing, and,
On Speaker-Specific Prosodic Models for Automatic Dialog Act Segmentation of Multi-Party Meetings Jáchym Kolář 1,2 Elizabeth Shriberg 1,3 Yang Liu 1,4.
Improving the Help Selection Policy in a Reading Tutor that Listens Cecily Heiner, Joseph E. Beck, Jack Mostow Project LISTEN
A prosodically sensitive diphone synthesis system for Korean Kyuchul Yoon Linguistics Department The Ohio State University.
Chapter 1: Science of Psychology Daily Objective (concept map): Apply basic statistical concepts to explain research findings: - Descriptive Statistics:
The Effects of Repeated Reading Instruction on Oral Reading Fluency By Lana Titus CI 843 Spring 2013 Online.
Trial Group AGroup B Mean P value 2.8E-07 Means of Substances Group.
Background: Speakers use prosody to distinguish between the meanings of ambiguous syntactic structures (Snedeker & Trueswell, 2004). Discourse also has.
Normal Distribution.
One-way ANOVA: - Comparing the means IPS chapter 12.2 © 2006 W.H. Freeman and Company.
Carnegie Mellon Mostow 12/7/2015, p. 1 The Sounds of Silence: Towards Automated Evaluation of Student Learning in a Reading Tutor that Listens Jack Mostow.
A Missing Ingredient: Oral Reading Fluency Timothy Shanahan Timothy Shanahan University of Illinois at Chicago
Prosody Training Jessica Rauth. Training Objectives This training is designed to facilitate understanding of reading prosody and how to effectively measure.
Surface Statistics 1.How can you tell if data sets are “good”? SEM/SD 2.How can you tell if data from a control and an experimental groups are different.
T tests comparing two means t tests comparing two means.
Carnegie Mellon How does the amount of context in which words are practiced affect fluency growth? Experimental results Jack Mostow, Jessica Nelson, Martin.
Results: How to interpret and report statistical findings Today’s agenda: 1)A bit about statistical inference, as it is commonly described in scientific.
FLUENCY INSTRUCTION DEFINITION OF FLUENCY Reading at a just right pace, accurately and with expression Combines rate and accuracy Requires automaticity.
T tests comparing two means t tests comparing two means.
Extensive Reading Interventions in Grades K - 3: From Research to Practice Scammacca, Vaughn, Roberts, Wanzek, & Torgesen (2007)
1 - COURSE 4 - DATA HANDLING AND PRESENTATION UNESCO-IHE Institute for Water Education Online Module Water Quality Assessment.
Prosodic Cues to Disengagement and Uncertainty in Physics Tutorial Dialogues Diane Litman, Heather Friedberg, Kate Forbes-Riley University of Pittsburgh.
Data analysis and basic statistics KSU Fellowship in Clinical Pathology Clinical Biochemistry Unit
A Text-free Approach to Assessing Nonnative Intonation Joseph Tepperman, Abe Kazemzadeh, and Shrikanth Narayanan Signal Analysis and Interpretation Laboratory,
Improving Reading Fluency
Wei Chen, Jack Mostow Gregory Aist Project LISTEN
Towards Emotion Prediction in Spoken Tutoring Dialogues
Discussion and Conclusion
Studying Intonation Julia Hirschberg CS /21/2018.
Micro-analysis of Fluency Gains in a Reading Tutor that Listens:
Detecting Prosody Improvement in Oral Rereading
Memory and Melodic Density : A Model for Melody Segmentation
University of Illinois at Chicago
Fluency.
Independent versus Computer-Guided Oral Reading:
An Embedded Experiment to Evaluate the Effectiveness of Vocabulary Previews in an Automated Reading Tutor Jack Mostow, Joe Beck, Juliet Bey, Andrew Cuneo,
Title of session For Event Plus Presenters 12/5/2018.
Jack Mostow* and Joseph Beck Project LISTEN (
Experimental Design Data Normal Distribution
Reading With Your Child
This material is based upon work supported by the National Science Foundation under Grant #XXXXXX. Any opinions, findings, and conclusions or recommendations.
Automatic Prosodic Event Detection
Presentation transcript:

Detecting Prosody Improvement in Oral Rereading Minh Duong and Jack Mostow Project LISTEN Carnegie Mellon University The research reported here was supported by the Institute of Education Sciences, U.S. Department of Education, through Grants R305A The opinions expressed are those of the authors and do not necessarily represent the views of the Institute or the U.S. Department of Education. 1

Motivation A tutor should be able to detect improvement in: Oral reading rate And prosody (rhythm, intensity, intonation, pauses) When? In reading new texts [our AIED2009 paper] In rereading same text [this SLaTE 2009 paper] o Faster improvement than on new text o Easier to detect because controlling for text reduces variance o Popular, effective form of practice This work: detect prosody improvement in rereading. 2

A real example 3 A second grader’s 1 st reading Adult narration Her 2 nd reading, 5 days later Text: “Don’t you like surprises?”

Prosodic attributes 4 Duration: how long ProductionLatency Mean pitch: how high Mean intensity: how loud Normalized Time

Raw features Raw duration, latency, and production Absolute and normalized by word length Averaged within sentence Pause frequency Percent of words with latency > 10 ms (or rejected) Pitch variation Standard deviation of word pitch within sentence 5

Correlational features Expressive reading reflects comprehension Expressive readers’ prosody resembles adults’ [Schwanenflugel et al. ’04, ’06, ’08; Mostow & Duong AIED ’09] Correlate child and adult prosodic contours For each prosodic attribute (latency, pitch, intensity, …) Represented as its sequence of values For the words in the same sentence E.g. pitch correlation increases from 0.37 to

Data source: Project LISTEN’s Reading Tutor See Videos page at 7

Data set: 29,794 sentence rereadings 164 students in grades 2-4 Used the Reading Tutor at school in Read 77,693 sentences; 38% were rereadings Students averaged 132 rereadings (1-1891) and reread those sentences 1.66 times on average 8

Analyses 1.Which features were most sensitive to gains? 2.Can we detect learning? 3.Which features detected individual gains? 9

1. How to measure feature sensitivity? As median per-student effect size for a sentence-level feature (e.g. correlation with adult pitch) 10 Correl pitch Mary Effect size = ….. 1 st reading 2 nd reading Gain: ……. Student:Feature:Sentence: {+0.53, -0.05, …} {0.08, 0.12, …}  Median effect size = 0.10 Joe  Effect size = mean/SD = 0.08

1. Which features were most sensitive? raw duration raw norm duration raw norm production pause frequency raw production raw norm latency raw latency correl duration correl production correl latency correl norm latency correl norm duration pitch variation correl norm production correl pitch correl intensity Median effect size Median effect size ± 95 % confidence interval

1. Temporal features most sensitive 12 raw duration* raw norm duration * raw norm production * pause frequency * raw production * raw norm latency * raw latency * correl duration * correl production * correl latency * correl norm latency * correl norm duration* pitch variation* correl norm production* correl pitch correl intensity Median effect size * : significantly > 0 at 0.05 level Median effect size ± 95 % confidence interval

2. Can we detect learning? Do gains indicate learning, or just recency effects? We split rereadings into: The same day: could be short-term recency effect A later day: actual “learning” Then computed effect sizes as before. 13

2. Which features detected learning? raw duration raw norm duration raw norm production pause frequency raw production raw norm latency raw latency correl duration correl production correl latency correl norm latency correl norm duration pitch variation correl norm production correl pitch correl intensity

15 2. Temporal features detected learning 15 raw duration* raw norm duration* raw norm production* pause frequency* raw production* raw norm latency* raw latency* correl duration* correl production* correl latency correl norm latency* correl norm duration* pitch variation correl norm production correl pitch correl intensity * : significantly > 0 at 0.05 level

2. How did learning compare with recency effects? 16 raw duration raw norm duration raw norm production pause frequency raw production raw norm latency raw latency correl duration correl production correl latency correl norm latency correl norm duration pitch variation correl norm production correl pitch correl intensity

17 2. Recency ~ 2x learning where differed significantly 17 raw duration* raw norm duration* raw norm production* pause frequency* raw production* raw norm latency raw latency correl duration correl production correl latency correl norm latency correl norm duration pitch variation correl norm production correl pitch correl intensity * : Recency > Learning at 0.05 level

3. Which features detected individual gains? What % of students gained significantly (on a 1- tailed paired T-test) from one reading to the next: Overall When rereading on the same day (“recency effect”) When rereading on a later day (“learning”) 18

3. Which features detected individual gains? Overall 19 raw duration raw norm duration raw norm production pause frequency raw production raw norm latency raw latency correl duration correl production correl latency correl norm latency correl norm duration pitch variation correl norm production correl pitch correl intensity

3. Which features detected individual gains? Same day vs. later day 20 raw duration raw norm duration raw norm production pause frequency raw production raw norm latency raw latency correl duration correl production correl latency correl norm latency correl norm duration pitch variation correl norm production correl pitch correl intensity

3. How does individual gains’ significance depend on amount of data? Raw normalized duration (most sensitive overall) 21

Conclusions 1.Which features were most sensitive to gains? Temporal, especially raw 2.Did we detect learning, not just recency effects? Yes, but recency was ~2x stronger where different 3.Did we reliably detect individual gains? Yes, for ~40% of all students No, for students who reread fewer than 21 sentences Yes, for 60% of students who reread 336+ sentences 22

Limitations and future work Why didn’t pitch or intensity features detect gains? Less sensitive to growth, e.g. than word decoding? Measurement error, e.g. in pitch tracking? Within-student variance, e.g. mic position or noise? Across-sentence variance, e.g. adult narration? Extend to work without adult narrations? Extend to detect gains in reading new text? 23

Thank you! Questions? 24

25 6/11/2015 Long words are slower.