1
Carnegie Mellon Project LISTEN 7/22/2004
Some Useful Design Tactics for Mining ITS Data
Jack Mostow
Project LISTEN (www.cs.cmu.edu/~listen)
Carnegie Mellon University
Funding: National Science Foundation
ITS 04 Workshop on Analyzing Student-Tutor Interaction Logs to Improve Educational Outcomes, Maceio, Brazil
2
Outline
1. Project LISTEN’s Reading Tutor
2. Modify tutor to get mineable data
3. Map data stream to analyzable data set
4. Mine data set to discover insights
3
Project LISTEN’s Reading Tutor (video)
4
Project LISTEN’s Reading Tutor (video)
John Rubin (2002). The Sounds of Speech (Show 3). On Reading Rockets (Public Television series commissioned by U.S. Department of Education). Washington, DC: WETA. Available at www.cs.cmu.edu/~listen.
5
Thanks to fellow LISTENers
Tutoring:
  Dr. Joseph Beck, mining tutorial data
  Prof. Albert Corbett, cognitive tutors
  Prof. Rollanda O’Connor, reading
  Prof. Kathy Ayres, stories for children
  Joe Valeri, activities and interventions
  Becky Kennedy, linguist
Listening:
  Dr. Mosur Ravishankar, recognizer
  Dr. Evandro Gouvea, acoustic training
  John Helman, transcriber
Programmers:
  Andrew Cuneo, application
  Karen Wong, Teacher Tool
Field staff:
  Dr. Roy Taylor
  Kristin Bagwell
  Julie Sleasman
Grad students:
  Hao Cen, HCI
  Cecily Heiner, MCALL
  Peter Kant, Education
  Shanna Tellerman, ETC
Plus:
  Advisory board
  Research partners: DePaul, UBC, U. Toronto
  Schools
6
Project LISTEN’s Reading Tutor: A rich source of experimental data
2003-2004 database:
  9 schools
  > 200 computers
  > 50,000 sessions
  > 1.5M tutor responses
  > 10M words recognized
Embedded experiments
Randomized trials
7
Modify tutor to get mineable data
Log operations at grain size and level of interest
  Click at time t: motor control
  Click “Goldilocks”: item selection
Reify operations to log them analyzably
  Handwriting or speech → typed input
  Freehand drawing → graphical palette (Geometry Tutor)
  Free-form responses → menu selection (Self 88)
  Natural language → sentence starters (Goodman 03)
Time student and tutor actions
  Time allocation reflects motivation (ITS 02)
  Hasty responses indicate guessing (TICL 04)
  Latency reflects automaticity (TICL 04)
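The logging tactics on this slide can be pictured as a timestamped event record that names each operation at the level of interest (for example, which word was clicked rather than raw click coordinates). The following is a minimal sketch only: the `LogEvent` fields and the `latency` helper are hypothetical names chosen for illustration, not the Reading Tutor's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEvent:
    """One reified, timed operation (hypothetical schema, for illustration)."""
    student: str
    actor: str       # "student" or "tutor"
    operation: str   # e.g. "show_sentence", "click_word"
    target: str      # e.g. the sentence or word involved
    start: float     # seconds since session start
    end: float


def latency(prev: LogEvent, curr: LogEvent) -> float:
    """Delay between the end of one event and the start of the next."""
    return curr.start - prev.end


# Invented example events, not real log data.
prompt = LogEvent("s1", "tutor", "show_sentence", "I love to read stories.", 0.0, 0.5)
click = LogEvent("s1", "student", "click_word", "read", 2.0, 2.1)
print(latency(prompt, click))
```

Recording both start and end times for student and tutor actions is what makes measures such as latency and time allocation computable afterwards.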
8
Modify tutor: add relevant data
Randomize tutorial decisions
  What skill to test, what help to give
Probe skills
  Assess cognitive development (Arroyo 00)
  Test vocabulary words (IJAIE 01)
  Insert automated comprehension questions (TICL 04)
Import student data
  Gender, age, IQ (Shute 96)
  Prior knowledge (Corbett 00)
  Pretest scores (TICL 04)
Hand-label when appropriate
  Transcribe (some) spoken input (FLET 04)
9
Modify tutor: an example
Randomize: explain some new words but not others.
Probe: test each new word the next day.
Did kids do better on explained vs. unexplained words?
  Overall: NO; 38% vs. 36%, N = 3,171 trials (IJAIE 01).
  Rare, 1-sense words tested 1-2 days later: YES! 44% >> 26%, N = 189.
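The randomize-then-probe pattern in this example can be sketched in a few lines. This is an illustrative sketch under stated assumptions: `assign_condition` and `success_rates` are hypothetical helper names, and the trial data below is invented toy data, not results from the study.

```python
import random
from collections import defaultdict


def assign_condition(rng: random.Random) -> str:
    """Randomly decide whether to explain a new word (the randomized decision)."""
    return rng.choice(["explained", "unexplained"])


def success_rates(trials):
    """Return {condition: fraction correct} from (condition, correct) pairs."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for condition, is_correct in trials:
        totals[condition] += 1
        correct[condition] += int(is_correct)
    return {c: correct[c] / totals[c] for c in totals}


# Toy probe results, invented for illustration only.
toy_trials = [
    ("explained", True), ("explained", False),
    ("unexplained", False), ("unexplained", False),
    ("explained", True), ("unexplained", True),
]
rates = success_rates(toy_trials)
print(rates)
```

Because the decision is randomized per word, a difference in probe success between the two conditions can be attributed to the explanation rather than to which words happened to be explained.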
10
Map data stream to data set: structure data into a single type
Data stream: heterogeneous events over time
Data set: elements with the same features
Segment into shorter episodes
  Tutorial action(s) + student response (Beck 00)
Slice into narrower strands
  Successive encounters of a specific word (AMLDP 98)
  Successive instances of a specific skill (learning curves)
Measure aggregated events
  Allocation of time among activities (ITS 02)
Formulate data as experimental trials
  Context where the trial occurred
  Decision made in this trial
  Outcome based on subsequent events
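As one illustration of slicing, the sketch below groups a mixed event stream into per-student, per-word strands ordered by time. The tuple layout `(time, student, word, outcome)` and the function name `slice_into_strands` are assumptions made for this example.

```python
from collections import defaultdict


def slice_into_strands(events):
    """Group (time, student, word, outcome) events into time-ordered
    strands keyed by (student, word)."""
    strands = defaultdict(list)
    for time, student, word, outcome in sorted(events, key=lambda e: e[0]):
        strands[(student, word)].append((time, outcome))
    return dict(strands)


# Invented events, deliberately out of time order.
events = [
    (30.0, "s1", "read", True),
    (10.0, "s1", "read", False),
    (20.0, "s1", "book", True),
]
strands = slice_into_strands(events)
print(strands[("s1", "read")])
```

Each strand has the same shape (a sequence of successive encounters), which is what turns a heterogeneous stream into a data set with uniform elements.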
11
Map data stream to data set: formulate data as experimental trials
Data stream:
  Student is reading a story
  Student needs help on a word
  Tutor chooses what help to give
  Student continues reading
  Student sees word in a later sentence
  Time passes…
Example text: ‘I love to read stories.’ Student clicks ‘read.’ ‘People sit down and …’ ‘… read a book.’
Diagram labels: Context; Decision (randomized); Outcome: read fluently?
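A rough sketch of this formulation: each help event becomes one trial, whose outcome is taken from the student's next encounter of the same word. The event dictionaries, the `Trial` class, and `build_trials` are hypothetical constructs for illustration, and the help-type label is borrowed from the help types named elsewhere in these slides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Trial:
    context: tuple            # (student, word) where help was requested
    decision: str             # help type the tutor chose
    outcome: Optional[bool]   # fluent at next encounter? None if never seen again


def build_trials(stream):
    """stream: time-ordered dicts with keys 'kind', 'student', 'word', plus
    'help_type' when kind == 'help' or 'fluent' when kind == 'encounter'."""
    trials = []
    for i, event in enumerate(stream):
        if event["kind"] != "help":
            continue
        outcome = None
        # Scan forward for the same student's next encounter of the same word.
        for later in stream[i + 1:]:
            if (later["kind"] == "encounter"
                    and later["student"] == event["student"]
                    and later["word"] == event["word"]):
                outcome = later["fluent"]
                break
        trials.append(Trial((event["student"], event["word"]),
                            event["help_type"], outcome))
    return trials


# Invented stream for illustration.
stream = [
    {"kind": "help", "student": "s1", "word": "read", "help_type": "Rhymes With"},
    {"kind": "encounter", "student": "s1", "word": "sit", "fluent": True},
    {"kind": "encounter", "student": "s1", "word": "read", "fluent": True},
]
trials = build_trials(stream)
print(trials)
```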
12
Map data stream to data set: trials
Context:   Decision:   Outcome:
13
Mine data set to make discoveries
Count outcome frequency
  Success rate of each help type (ICALL 04)
Fit a parametric model
  Knowledge tracing (Corbett 95)
Train a model
  Statistics, e.g. regression (TICL 04)
  Machine learning, e.g. decision trees (AIED 01)
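To make "fit a parametric model" concrete, here is a sketch of the per-observation update commonly used in knowledge tracing, with guess, slip, and learning-rate parameters. It shows only the update step, not parameter fitting, and the parameter values are invented for illustration rather than taken from the slides.

```python
def knowledge_tracing_update(p_known, correct, p_guess, p_slip, p_learn):
    """Update the probability that a skill is known after one observed response.

    First condition on the observation (Bayes' rule), then apply the
    chance of learning the skill at this opportunity.
    """
    if correct:
        numerator = p_known * (1 - p_slip)
        denominator = numerator + (1 - p_known) * p_guess
    else:
        numerator = p_known * p_slip
        denominator = numerator + (1 - p_known) * (1 - p_guess)
    posterior = numerator / denominator
    return posterior + (1 - posterior) * p_learn


# Illustrative parameter values only.
p = 0.3
for observed_correct in [True, True, False]:
    p = knowledge_tracing_update(p, observed_correct,
                                 p_guess=0.2, p_slip=0.1, p_learn=0.15)
print(round(p, 3))
```

Fitting the model would mean searching for the guess, slip, learning, and initial-knowledge values that best predict the logged responses; that step is omitted here.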
14
Count outcome frequency: which help types worked best?
                Same day:        Later day:
Grade 1 words:  Say In Context   Say In Context, Onset Rime
Grade 2 words:  Say In Context   Say In Context, Rhymes With
Grade 3 words:  Say In Context   Rhymes With, One Grapheme
Best: Rhymes With 69.2% ± 0.4%
Worst: Recue 55.6% ± 0.4%
Compare within level to control for word difficulty.
Supplying the word helped best in the short term…
But rhyming hints had longer lasting benefits.
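Counting outcome frequency within each difficulty level amounts to a grouped tally. The sketch below assumes trials arrive as `(grade_level, help_type, success)` tuples; the function names and the toy data are hypothetical and do not reproduce the study's numbers.

```python
from collections import defaultdict


def success_by_level_and_help(trials):
    """trials: iterable of (grade_level, help_type, success) tuples.
    Returns {grade_level: {help_type: (success_rate, n)}}."""
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))  # [successes, total]
    for level, help_type, success in trials:
        cell = counts[level][help_type]
        cell[0] += int(success)
        cell[1] += 1
    return {level: {h: (s / n, n) for h, (s, n) in by_help.items()}
            for level, by_help in counts.items()}


def best_help_per_level(table):
    """Pick the help type with the highest success rate within each level."""
    return {level: max(by_help, key=lambda h: by_help[h][0])
            for level, by_help in table.items()}


# Toy data, invented for illustration.
toy = [
    (1, "Say In Context", True), (1, "Say In Context", True),
    (1, "Recue", False), (1, "Recue", True),
    (2, "Rhymes With", True), (2, "Recue", False),
]
table = success_by_level_and_help(toy)
print(best_help_per_level(table))
```

Comparing help types only within a level keeps easy and hard words from being mixed in the same comparison.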
15
Summary: modify, map, mine.
1. Modify tutor to make data mineable.
   Log, reify, time, hand-label, import, probe, randomize.
2. Map data streams to data sets.
   Segment, slice, measure.
3. Mine data set to make discoveries.
   Count, fit, train.
See videos, papers, etc. at www.cs.cmu.edu/~listen.
Thank you! Questions?
16
Modify tutor to get mineable data
word features
17
Structure of Reading Tutor database
                    Student          Reading Tutor
Session             Login            List readers
Story Encounter     Pick stories     List stories
Sentence Encounter  Read sentence    Show one sentence at a time
Word Encounter      Read each word   Listens and helps
18
Map data stream to data set: formulate data as experimental trials
Context            Decision             Outcome
Student is stuck   Prompt or cough?     Next event in dialog      FF 2000
Before a new word  Explain it or not?   Test word next day        IJAIE 01
Click on word      What help to give?   Word read OK next time?   SSSR 04
Context where the trial occurred
Decision made in this trial
Outcome based on subsequent events
19
Learning curves for students’ help requests
Try to predict subset
  Grade 1-2 level
  1-6 prior encounters
Selected data
  53 students
  175,961 words
  29,278 help requests
Train predictive model
  Count help requests 5x
  Predict other kids’ data
  71% accuracy
20
Count outcome frequency (average success rate 66.1%)
Whole word:
  24,841 Say In Context
  56,791 Say Word
Decomposition:
  6,280 Syllabify
  14,223 Onset Rime
  19,677 Sound Out
  22,933 One Grapheme
Analogy:
  13,165 Rhymes With
  13,671 Starts Like
Semantic:
  14,685 Recue
  2,285 Show Picture
  488 Sound Effect
Which types stood out?
  Best: Rhymes With 69.2% ± 0.4%
  Worst: Recue 55.6% ± 0.4%
Example: ‘People sit down and read a book.’
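The slide does not say how the ± figures were computed. As a worked check, the sketch below assumes they are binomial standard errors and that the counts listed for each help type are the number of trials behind each rate; under those assumptions both values come out near 0.4%, matching the slide.

```python
import math


def binomial_standard_error(rate: float, n: int) -> float:
    """Standard error of a proportion: sqrt(p * (1 - p) / n)."""
    return math.sqrt(rate * (1 - rate) / n)


# Rates and counts as listed on the slide; treating each count as N is an assumption.
se_rhymes_with = binomial_standard_error(0.692, 13_165)
se_recue = binomial_standard_error(0.556, 14_685)
print(f"Rhymes With: {se_rhymes_with:.2%}, Recue: {se_recue:.2%}")
```

With standard errors this small, the gap between 69.2% and 55.6% is many standard errors wide, which is why these two help types stand out from the 66.1% average.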