Jack Mostow* and Joseph Beck Project LISTEN (


1 Refined Micro-analysis of Fluency Gains in a Reading Tutor that Listens
Jack Mostow* and Joseph Beck
Project LISTEN, Carnegie Mellon University
* Consultant and Scientific Advisory Board Chair, Soliloquy Learning
Society for the Scientific Study of Reading, 13th Annual Meeting, July 2006
Funding: National Science Foundation, Heinz Endowments
Speaker note: Add SL affiliation

2 Research questions and approach
Guided oral reading builds fluency [NRP 00]
- Typically repeated oral reading
- … but how do its benefits vary?
  - How good is repeated vs. wide reading?
  - How good is massed vs. spaced practice?
  - How do the answers vary with student proficiency?
Approach: micro-analyze oral reading data
- Massive: hundreds of children
- Longitudinal: entire school year
- Fine-grained: word by word

3 Project LISTEN’s Reading Tutor: Rich source of guided oral reading data
- Massive: 650 students, ages 5-14, mostly grades 1-4
- Longitudinal: school year, 55,000 sessions
- Fine-grained: 6.9 million words "heard" by recognizer
Speaker note: Get hours from Joe; say less
Video at

4 Reading speeds up with practice: example
Initial encounter of muttered:
  I'll have to mop up all this (5630 ms) muttered Dennis to himself but how
5 weeks later (different word pair in different sentence):
  Dennis (110 ms) muttered oh I forgot to ask him for the money
Word reading time = latency + production time ≈ 1/fluency
How does word reading time change in general?

5 Learning curve for mean reading time of first 20 encounters, excluding top 50 words
Do some types of encounters help more than others?

6 Four types of word encounters
Each encounter is classified by two questions: New context? (wide vs. reread) and First time today? (spaced vs. massed)
1. Read muttered in a new story: wide, spaced
2. Read muttered in another sentence: wide, massed
3. On a later day, reread sentence 1: reread, spaced
4. Then reread sentence 2: reread, massed
Predict reading time for 770,858 type 1 encounters from prior encounters of all 4 types.
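The four encounter types can be illustrated with a minimal sketch. It assumes each encounter of a word is recorded with a day index and a sentence identifier; the `Encounter` class and both field names are hypothetical, not taken from the Reading Tutor's database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Encounter:
    day: int          # hypothetical day index of the session
    sentence_id: str  # hypothetical identifier of the sentence containing the word


def classify_encounters(encounters):
    """Label one student's encounters of one word, given in chronological order.

    New context?      yes -> "wide",   no -> "reread"
    First time today? yes -> "spaced", no -> "massed"
    """
    seen_sentences = set()
    seen_days = set()
    labels = []
    for enc in encounters:
        context = "reread" if enc.sentence_id in seen_sentences else "wide"
        spacing = "massed" if enc.day in seen_days else "spaced"
        labels.append((context, spacing))
        seen_sentences.add(enc.sentence_id)
        seen_days.add(enc.day)
    return labels


# The four example encounters of "muttered" from the slide.
history = [
    Encounter(day=1, sentence_id="s1"),  # 1. new story
    Encounter(day=1, sentence_id="s2"),  # 2. another sentence, same day
    Encounter(day=2, sentence_id="s1"),  # 3. later day, reread sentence 1
    Encounter(day=2, sentence_id="s2"),  # 4. then reread sentence 2
]
print(classify_encounters(history))
```

Treating "new context" as "a sentence not seen before" is an assumption of this sketch; the slides do not spell out the exact rule.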

7 Predictor variables
Number of word encounters so far of each type
- Wide vs. reread
- Spaced vs. massed
Word difficulty
- # of letters
- # of past help requests (controls for difficulty for that student)
Student proficiency
- WRMT Word Identification grade-equivalent score, e.g. 2.3
- Interpolated for each encounter from pre- and post-test scores
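The slide says proficiency is interpolated for each encounter from pre- and post-test scores but does not give the method. A minimal sketch, assuming simple linear interpolation over time (the function and parameter names are hypothetical):

```python
def interpolate_proficiency(pre_score, post_score, pre_day, post_day, encounter_day):
    """Estimate a grade-equivalent score on the day of an encounter.

    Assumes proficiency grows linearly between the pre-test and post-test;
    this is an illustrative assumption, not the authors' stated method.
    """
    if post_day == pre_day:
        raise ValueError("pre-test and post-test must be on different days")
    fraction = (encounter_day - pre_day) / (post_day - pre_day)
    # Clamp so encounters outside the testing window use the nearest test score.
    fraction = min(1.0, max(0.0, fraction))
    return pre_score + fraction * (post_score - pre_score)


# Hypothetical student: 2.0 at pre-test (day 0), 3.0 at post-test (day 200).
print(interpolate_proficiency(2.0, 3.0, 0, 200, 100))
```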

8 Exponential model of word reading time
\[ \text{word reading time} = L \cdot \text{letters} + (P \cdot \text{proficiency} + A)\, e^{-B \cdot \text{Exposure}} \]
where letters is the # of letters in the word, A is a constant, and B is the learning rate.
Define weights for each type of encounter:
- r for rereading vs. 1 for wide reading
- m for massed vs. 1 for spaced
- h for help requests
Exposure = weighted sum of # of word encounters so far:
\[ \text{Exposure} = 1 \cdot n_{\text{wide,spaced}} + r \cdot n_{\text{reread,spaced}} + m \cdot n_{\text{wide,massed}} + r \cdot m \cdot n_{\text{reread,massed}} + h \cdot n_{\text{help}} \]
where each n is the # of encounters of that type so far and n_help is the # of help requests.
[Beck, J. Using learning decomposition to analyze student fluency development. ITS2006 Educational Data Mining Workshop, Taiwan.]
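The exponential model on this slide can be written out as a short sketch. The function names are hypothetical and the parameter values in the usage example are invented for illustration; they are not the fitted values from the study.

```python
import math


def exposure(wide_spaced, reread_spaced, wide_massed, reread_massed,
             help_requests, r, m, h):
    """Weighted sum of prior encounters; wide, spaced encounters have weight 1."""
    return (1.0 * wide_spaced
            + r * reread_spaced
            + m * wide_massed
            + r * m * reread_massed
            + h * help_requests)


def predicted_reading_time(n_letters, proficiency, exposure_value, L, P, A, B):
    """L * letters + (P * proficiency + A) * exp(-B * Exposure)."""
    return L * n_letters + (P * proficiency + A) * math.exp(-B * exposure_value)


# Invented parameter values, for illustration only.
params = dict(L=0.1, P=-0.1, A=1.0, B=0.2)
e = exposure(wide_spaced=2, reread_spaced=0, wide_massed=3, reread_massed=0,
             help_requests=0, r=1.0, m=0.5, h=0.0)
print(e)
print(predicted_reading_time(n_letters=8, proficiency=2.0, exposure_value=e, **params))
```

With B > 0 and a positive amplitude (P * proficiency + A), predicted reading time falls toward the floor L * letters as exposure accumulates, which is the shape of a learning curve.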

9 Analysis
Use SPSS non-linear regression to fit parameters A, L, P, B, h, r, m
Caveat: 770,858 trials are not independent
So be conservative:
- Split 650 students into 10 groups
- Fit r, m, … for each group
- From the 10 estimates of each parameter, compute:
  - Mean ± standard error
  - Differs significantly from 1?
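The group-level summary step can be sketched as follows. The slides do not say which significance test was used; a one-sample t statistic against 1 is shown here as one plausible choice, and the ten estimates are invented placeholders rather than the study's values.

```python
import math
import statistics


def summarize(estimates, null_value=1.0):
    """Return (mean, standard error, t statistic) for per-group estimates.

    The t statistic tests the mean against null_value with len(estimates) - 1
    degrees of freedom; converting it to a p-value needs a t distribution,
    which is left out to keep this sketch dependency-free.
    """
    n = len(estimates)
    mean = statistics.mean(estimates)
    standard_error = statistics.stdev(estimates) / math.sqrt(n)
    t_statistic = (mean - null_value) / standard_error
    return mean, standard_error, t_statistic


# Invented per-group estimates of one weight (10 student groups).
group_estimates = [0.4, 0.9, 0.6, 1.1, 0.5, 0.7, 0.3, 0.8, 0.6, 0.9]
mean, se, t = summarize(group_estimates)
print(f"{mean:.2f} ± {se:.2f}, t = {t:.2f}")
```

Fitting each group separately and then testing across the 10 estimates treats the group, not the individual trial, as the unit of analysis, which is the conservative choice the slide describes.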

10 Overall results
Wide reading beats rereading
- r < 1 (p = .007)
- 2 new stories ≈ 3 old stories
Spaced beats massed practice
- m = .67 ± .13
- m < 1 (p = .007)
- 2 spaced encounters ≈ 3 massed encounters
Do these results vary by proficiency?
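One way to read the "2 spaced ≈ 3 massed" rule of thumb, assuming each massed encounter contributes m units of exposure where a spaced encounter contributes 1:

```latex
\[
  3 \text{ massed encounters} \;\to\; 3m = 3 \times 0.67 \approx 2.0
  \;=\; 2 \times 1 \;\leftarrow\; 2 \text{ spaced encounters}
\]
```

The same arithmetic would give "2 new stories ≈ 3 old stories" if r were also about 2/3, but the value of r is not shown in this transcript.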

11 Effects of proficiency
              Bottom third    Middle third     Top third
Word ID GE    1.8 (0-2.3)     2.7 (2.3-3.1)    4.5 (3.1-10.2)
Reread (r)    .93 ± .23       .99 ± .23        .79 ± .25
Massed (m)    1.73 ± .46 *    .41 ± .08 **     .41 ± .21 **

When does wide reading beat rereading?
- Maybe only for high readers?
Seeing a word again the same day
- May help low readers more than waiting (p = .058)
- Helps higher readers less than seeing it later (p < .01)
Speaker notes: Say significantly different from 1. The lowest group had a WRMT Word ID score of 0 to 2.3 with a mean of 1.8 and SD of [missing]. Medium was 2.3 to 3.1 with a mean of 2.7 and SD of [missing]. The high proficiency group scores ranged from 3.1 to 10.2, with a mean of 4.5 and SD of 1.38.

12 Conclusion: type of practice matters!
Wide reading beats rereading
- At least for higher readers
Advantage of spaced practice varies with proficiency
- Low readers: seeing a word again the same day may help more
- Higher readers: better to wait
Fluency growth is slow (learning curve is gradual)
- So differences in practice quality are hard to detect
- But possible by micro-analysis of massive, longitudinal, fine-grained data
Future work
- Clarify interaction with proficiency
- Refine model of fluency practice
- Test correlational results experimentally
Thank you! Questions?
See papers & videos at
Speaker notes (to do):
1. Vet revised conclusions. The results are more intuitive now, but how much should we believe them given the shift due to averaging differently?
2. The by-proficiency results contradict the overall result on rereading. What to say?
3. Thanks for "compare asr with transcription.xls". I was disappointed that the correlation was so low and error was so high. How long did the queries take, especially joining lex_view with aligned_word? I'm curious to compare to text-space false alarm rates.
4. lex_view for 2005_2006? Joe doesn't know, but will when done.

13 Predictive models of word reading in text
             SSSR2005                               SSSR2006
Predict      Growth from word encounter i to i+1    Performance at encounter i+1
Outcome      Reading time speedup                   Reading time, errors, help requests
Predictor    Encounter i of word                    Encounters 1..i
Model        Linear                                 Exponential

14 Outcome variable
Combine reading time, errors, help requests
- Cap reading time at 3 seconds (0.1% of data)
- Treat error as 3 seconds
- Treat help request as 3 seconds
The following were excluded:
1. The 50 most frequent words in English.
2. Encounters after the first 20 with a word. There were multiple reasons to exclude these data, including running out of memory (even being clever about which columns were really needed) and altering the shape of the learning curve due to a poor model of prior encounters.
3. The first word of a sentence. Since we define reading time = latency + production, and latency is undefined for the first word of the sentence, we do not model first-word performance.
4. One-character "words", as they might be spelling.
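The outcome coding and exclusion rules above can be sketched as two small functions. All names are hypothetical, and the set of the 50 most frequent English words is passed in by the caller because the slides do not list it.

```python
CAP_SECONDS = 3.0       # cap stated on the slide
MAX_ENCOUNTERS = 20     # only the first 20 encounters with a word are modeled


def outcome_seconds(reading_time_s, is_error, help_requested):
    """Combine reading time, errors, and help requests into one outcome."""
    if is_error or help_requested:
        return CAP_SECONDS                    # errors and help count as 3 seconds
    return min(reading_time_s, CAP_SECONDS)   # cap slow readings at 3 seconds


def is_included(word, encounter_index, position_in_sentence, top50_words):
    """Apply the four exclusion rules.

    encounter_index is 1-based (1 = first encounter with the word);
    position_in_sentence is 0-based (0 = first word of the sentence).
    """
    if word.lower() in top50_words:
        return False    # 1. among the 50 most frequent words
    if encounter_index > MAX_ENCOUNTERS:
        return False    # 2. after the first 20 encounters
    if position_in_sentence == 0:
        return False    # 3. latency is undefined for the first word
    if len(word) == 1:
        return False    # 4. one-character "words" might be spelling
    return True


# Placeholder frequency list: NOT the real top-50 list.
frequent = {"the", "of", "and"}
print(outcome_seconds(5.63, False, False))            # the 5630 ms example, capped
print(is_included("muttered", 1, 6, frequent))
```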

