If I Have a Hammer: Computational Linguistics in a Reading Tutor that Listens. Jack Mostow, Project LISTEN (www.cs.cmu.edu/~listen), Carnegie Mellon, 7/22/2004.


2 Carnegie Mellon, Project LISTEN, 7/22/2004
If I Have a Hammer: Computational Linguistics in a Reading Tutor that Listens
Jack Mostow, Project LISTEN (www.cs.cmu.edu/~listen), Carnegie Mellon University
“To a man with a hammer, everything looks like a nail.” – Mark Twain
Funding: National Science Foundation
Keynote at 42nd Annual Meeting of the Association for Computational Linguistics, Barcelona, Spain

3 If I had a hammer… [Hays & Seeger]
If I had a hammer,
I’d hammer in the morning,
I’d hammer in the evening,
All over this land.
I’d hammer out danger,
I’d hammer out a warning,
I’d hammer out love between my brothers and my sisters,
All over this land.

4 Outline
1. Project LISTEN’s Reading Tutor
2. Roles of computational linguistics in the tutor
3. So… Conclusions

5 Project LISTEN’s Reading Tutor (video)

6 Project LISTEN’s Reading Tutor (video)
John Rubin (2002). The Sounds of Speech (Show 3). On Reading Rockets (Public Television series commissioned by U.S. Department of Education). Washington, DC: WETA. Available at www.cs.cmu.edu/~listen.

7 Thanks to fellow LISTENers
Tutoring:
- Dr. Joseph Beck, mining tutorial data
- Prof. Albert Corbett, cognitive tutors
- Prof. Rollanda O’Connor, reading
- Prof. Kathy Ayres, stories for children
- Joe Valeri, activities and interventions
- Becky Kennedy, linguist
Listening:
- Dr. Mosur Ravishankar, recognizer
- Dr. Evandro Gouvea, acoustic training
- John Helman, transcriber
Programmers:
- Andrew Cuneo, application
- Karen Wong, Teacher Tool
Field staff:
- Dr. Roy Taylor
- Kristin Bagwell
- Julie Sleasman
Grad students:
- Hao Cen, HCI
- Cecily Heiner, MCALL
- Peter Kant, Education
- Shanna Tellerman, ETC
Plus:
- Advisory board
- Research partners
  - DePaul
  - UBC
  - U. Toronto
- Schools

8 Computational linguistics models in an intelligent tutor
Language models predict word sequences for a task.
- E.g. expect ‘once upon a time…’
Domain models describe skills to learn.
- E.g. pronounce ‘c’ as /k/.
Production models describe student behavior.
- E.g. which mistakes do students make?
Student models estimate a student’s skills.
- E.g. which words will a student need help on?
Pedagogical models guide tutorial decisions.
- E.g. which types of help work best?
Theme: use data to train models automatically.

9 Language model of oral reading [Mostow, Roth, Hauptmann, & Kane AAAI94]
Problem: which word sequences to expect?
Language model specifies word transition probabilities
- Given sentence text (e.g. ‘Once upon a time…’)
- Expect correct reading
- But allow for deviations
- With heuristic probabilities
Result:
- Accepted 96% of correctly read words.
- Detected about half the serious mistakes.
[Figure: word transition diagram for ‘once upon a…’, with transitions labeled PrCorrect, PrRepeat, PrJump, … PrTruncate]
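The transition idea on this slide can be sketched in a few lines of Python. This is only an illustration, not the tutor's actual language model: the function name `oral_reading_transitions`, the probability values, and the even split of jump probability are all assumptions, and truncated words (PrTruncate) are left out to keep the sketch short.

```python
def oral_reading_transitions(words, pr_correct=0.9, pr_repeat=0.05, pr_jump=0.05):
    """Build per-word transition probabilities for one sentence.

    Positions 0..n-1 are the words; position n is an end-of-sentence state.
    The default probabilities are made-up placeholders, not published values.
    """
    n = len(words)
    model = {}
    for i in range(n):
        row = {i + 1: pr_correct,   # correct reading: go on to the next word
               i: pr_repeat}        # deviation: repeat the current word
        # Deviation: jump to any other word in the sentence, split evenly.
        others = [j for j in range(n) if j not in (i, i + 1)]
        for j in others:
            row[j] = pr_jump / len(others)
        # Normalize so each row is a probability distribution.
        total = sum(row.values())
        model[i] = {j: p / total for j, p in row.items()}
    return model


model = oral_reading_transitions("once upon a time".split())
# From 'once' (position 0), the most likely next position is 'upon' (position 1).
most_likely_after_once = max(model[0], key=model[0].get)
```

Correct reading gets most of the probability mass, while repeats and jumps keep a small share so the recognizer can still follow a reader who deviates from the text.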

10 Using ASR errors to tune a language model [Banerjee, Mostow, Beck, & Tam ICAAI03]
Training data: 3,421 oral reading utterances
- Spoken by 50 children aged 6-10
- Recognized (imperfectly) by speech recognizer
- Transcribed by hand
Method: learn to classify language model transitions
- Reward good transitions that match transcript
- Penalize bad transitions that cause recognizer errors
- Generalize from features (kid age, text length, word type, …)
Result: reduced tracking error by 24% relative to baseline

11 Domain model of pronunciation
Problem: what should students learn?
Data: pronunciation dictionary for children’s text
- ‘teach’ → /T IY CH/
Method: align spelling against pronunciation
- ‘t’ → /T/, ‘ea’ → /IY/, ‘ch’ → /CH/
How frequent is each grapheme-phoneme mapping?
- ‘t’ → /T/ occurred 622 times in 9776 mappings
- ‘z’ → /S/ occurred once (in ‘quartz’)
How consistently is each grapheme pronounced?
- ‘v’ → /V/ always
- ‘e’ → /EH/ (‘bed’), /AH/ (‘the’), /IY/ (‘be’), /IH/ (‘destroy’)
- + ‘ea’, ‘eau’, ‘ed’, ‘ee’, ‘ei’, ‘eigh’, ‘eo’, ‘er’, ‘ere’, ‘eu’, …
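The two questions on this slide (how frequent is each mapping, and how consistently is each grapheme pronounced) can be sketched as simple counting over words that have already been aligned. The four-word dictionary below, the function names, and the definition of consistency as the share of a grapheme's most common phoneme are all illustrative assumptions; the alignment step itself is not shown.

```python
from collections import Counter, defaultdict

# Hypothetical, hand-aligned toy data: word -> list of (grapheme, phoneme) pairs.
ALIGNED = {
    "teach": [("t", "T"), ("ea", "IY"), ("ch", "CH")],
    "bed":   [("b", "B"), ("e", "EH"), ("d", "D")],
    "the":   [("th", "DH"), ("e", "AH")],
    "be":    [("b", "B"), ("e", "IY")],
}


def mapping_counts(aligned):
    """Count how often each grapheme -> phoneme mapping occurs."""
    counts = Counter()
    for pairs in aligned.values():
        counts.update(pairs)
    return counts


def consistency(counts):
    """For each grapheme, the share of its most frequent phoneme (1.0 = always the same)."""
    by_grapheme = defaultdict(list)
    for (grapheme, _phoneme), n in counts.items():
        by_grapheme[grapheme].append(n)
    return {g: max(ns) / sum(ns) for g, ns in by_grapheme.items()}


counts = mapping_counts(ALIGNED)
scores = consistency(counts)
# In this toy data 'b' always maps to /B/, while 'e' maps to three different phonemes.
```

On real dictionary data the same counts would yield figures like the slide's 622 occurrences of ‘t’ → /T/ out of 9776 mappings.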

12 Production model of pronunciation [Fogarty, Dabbish, Steck, & Mostow AIED2001]
Problem: Which mistakes to expect?
Data: U. Colorado database of oral reading mistakes
- ‘bed’ → /B IY D/
Method: train G → P → P’ malrules for decoding
- ‘e’ → /EH/ → /IY/

13 Top five G → P → P’ decoding errors
G     P     P’    Example    Error
‘s’   /S/   //    ‘plants’   Drop ‘s’
‘s’   /Z/   //    ‘arms’     Drop ‘s’
‘’    //    /N/   ‘ha_d’     Add ‘n’
‘’    //    /Z/   ‘car_’     Add ‘s’
‘n’   /N/   //    ‘land’     Drop ‘n’
Result: predicted mistakes in unseen test data
- Context-sensitive rules improved accuracy.
Later work: predict real-word mistakes
- [Mostow, Beck, Winter, Wang, & Tobin ICSLP2002]
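A malrule of the form grapheme, correct phoneme, produced phoneme can be applied to an aligned word to predict a mispronunciation. The sketch below is a hypothetical illustration: the function `apply_malrule` and the hand-written alignments are assumptions, context-sensitive rules are not modeled, and insertion rules (empty grapheme, as in ‘ha_d’) are deliberately left out.

```python
def apply_malrule(aligned_word, rule):
    """Return the mispronunciations a malrule predicts for one aligned word.

    aligned_word: list of (grapheme, phoneme) pairs.
    rule: (grapheme, correct_phoneme, produced_phoneme); an empty produced
          phoneme means the sound is dropped.
    One prediction is returned per position where the rule matches.
    """
    grapheme, correct, produced = rule
    predictions = []
    for i, (g, p) in enumerate(aligned_word):
        if g == grapheme and p == correct:
            phonemes = [ph for _, ph in aligned_word]
            phonemes[i] = produced
            # Skip empty strings so dropped phonemes vanish from the output.
            predictions.append(" ".join(ph for ph in phonemes if ph))
    return predictions


bed = [("b", "B"), ("e", "EH"), ("d", "D")]
plants = [("p", "P"), ("l", "L"), ("a", "AE"), ("n", "N"), ("t", "T"), ("s", "S")]

substituted = apply_malrule(bed, ("e", "EH", "IY"))   # the 'bed' example
dropped = apply_malrule(plants, ("s", "S", ""))       # the Drop 's' rule
```

A rule that does not match any position simply returns an empty list, so a set of malrules can be tried against every word in a sentence.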

14 Student model of help requests [Beck, Jia, Sison, & Mostow UM2003]
Problem: when will a student request help on a word?
Data: 7 months of Reading Tutor use by 87 students
- Average ~20 hours per student
- Transactions logged in detail
- Help request rate excluding common words: 0.5%–54%
Method: train classifier using word, student, history
Result: predict words that unseen students click on

15 Learning curves for students’ help requests
Try to predict subset
- Grade 1-2 level
- 1-6 prior encounters
Selected data
- 53 students
- 175,961 words
- 29,278 help requests
Train predictive model
- Count help requests 5x
- Predict other kids’ data
- 71% accuracy
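The slide does not say why help requests are counted 5x. One plausible reading, offered here only as an assumption, is that the weight offsets the imbalance between words with and without a help request. The arithmetic below uses the slide's own counts to check how close a 5x weight comes to balancing the two classes.

```python
# Counts taken from the slide.
words = 175_961
help_requests = 29_278
weight = 5  # "Count help requests 5x"

no_help = words - help_requests      # words read without a help request
base_rate = help_requests / words    # unweighted share of help requests

# Share of total weight carried by help requests once each counts 5 times.
weighted_help = weight * help_requests
weighted_share = weighted_help / (weighted_help + no_help)

print(f"help-request rate: {base_rate:.1%}")
print(f"weighted share of help requests: {weighted_share:.1%}")
```

Under this reading the two classes carry nearly equal weight, so a trivial classifier that always predicts "no help request" would score about 50% on the weighted data.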

16 Features used
Information about the student
- Help request rate, overall reading proficiency, etc.
Information about the word
- Word length, position in sentence, etc.
Student’s history with reading word
- Percent of times accepted by Reading Tutor, time to read, etc.
Student’s prior help on this word
- Was the word helped previously? Earlier today?
How to get all this data??

17 Data collection and translation word features

18 Structure of Reading Tutor database
Level                 Student               Reading Tutor
Session               Login                 List readers
Story Encounter       Pick stories          List stories
Sentence Encounter    Read sentence         Show one sentence at a time
Word Encounter        Read each word        Listens and helps
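The nesting of sessions, story encounters, sentence encounters, and word encounters can be sketched as plain data classes. This is not the Reading Tutor's actual schema: every field name below is a hypothetical stand-in, chosen only to show how a per-student feature such as help request rate could be computed by walking the hierarchy.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class WordEncounter:
    word: str
    accepted: bool                 # did the tutor accept the word as read?
    help_requested: bool = False   # did the student click for help?


@dataclass
class SentenceEncounter:
    text: str
    words: List[WordEncounter] = field(default_factory=list)


@dataclass
class StoryEncounter:
    title: str
    sentences: List[SentenceEncounter] = field(default_factory=list)


@dataclass
class Session:
    student_id: str
    stories: List[StoryEncounter] = field(default_factory=list)


def help_request_rate(session: Session) -> float:
    """Fraction of word encounters in a session where the student asked for help."""
    words = [w for story in session.stories
             for sentence in story.sentences
             for w in sentence.words]
    if not words:
        return 0.0
    return sum(w.help_requested for w in words) / len(words)
```

Other features from the talk, such as the percent of times a word was accepted, would be computed by the same kind of traversal.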

19 Project LISTEN’s Reading Tutor: A rich source of experimental data
The Reading Tutor beats independent practice…
- Effect sizes up to 1.3
- [Mostow SSSR02, Poulsen 04]
…but how? Use embedded experiments to investigate!
2003-2004 database:
- 9 schools
- > 200 computers
- > 50,000 sessions
- > 1.5M tutor responses
- > 10M words recognized
- Embedded experiments
- Randomized trials

20 Pedagogical model of help on decoding [Mostow, Beck, & Heiner SSSR2004]
Problem: Which types of help work best?
Data: 270 students’ assisted reading in the Reading Tutor
Method: randomize choice of help and analyze its effects
Result: detected significant differences in effectiveness

21 Within-subject experiment design: 270 students, 180,909 randomized trials
Outcome: success = ASR accepts word as read fluently
(How) does the type of help affect the next encounter?
Timeline:
- Student is reading a story: ‘I love to read stories.’
- Student needs help on a word: student clicks ‘read.’
- Tutor chooses what help to give: randomized choice among feasible types
- Student continues reading
- Time passes…
- Student sees word in a later sentence: ‘People sit down and … read a book.’
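The trial structure on this slide (randomize the help type, then score the student's next encounter with the word) can be sketched as follows. The function names and the trial format are assumptions made for illustration; the tutor's real logging and its rules for which help types are feasible are not described in the talk.

```python
import random
from collections import defaultdict


def choose_help(feasible_types, rng):
    """Pick one help type uniformly at random among those feasible for the word."""
    return rng.choice(feasible_types)


def success_rate_by_help(trials):
    """Aggregate trial outcomes per help type.

    trials: iterable of (help_type, success) pairs, where success means the
            recognizer accepted the word at the student's next encounter.
    """
    totals = defaultdict(int)
    successes = defaultdict(int)
    for help_type, success in trials:
        totals[help_type] += 1
        successes[help_type] += int(success)
    return {h: successes[h] / totals[h] for h in totals}


rng = random.Random(0)  # seeded only so the sketch is repeatable
given = choose_help(["Say Word", "Rhymes With", "Sound Out"], rng)
rates = success_rate_by_help([
    ("Say Word", True), ("Say Word", False), ("Rhymes With", True),
])
```

Because the help type is chosen at random, a difference in success rates between types is easier to attribute to the help itself than to the difficulty of the words that happened to receive it.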

22 180,909 word hints (average success rate 66.1%)
Whole word:
- 24,841 Say In Context
- 56,791 Say Word
Decomposition:
- 6,280 Syllabify
- 14,223 Onset Rime
- 19,677 Sound Out
- 22,933 One Grapheme
Analogy:
- 13,165 Rhymes With
- 13,671 Starts Like
Semantic:
- 14,685 Recue
- 2,285 Show Picture
- 488 Sound Effect
Which types stood out?
- Best: Rhymes With 69.2% ± 0.4%
- Worst: Recue 55.6% ± 0.4%
Example: ‘People sit down and read a book.’
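The slide does not say how the ± 0.4% figures were computed. As a rough check, and assuming they are one binomial standard error with every trial treated as independent (a simplification for a within-subject design), the margins can be recomputed from the success rates and trial counts shown on the slide.

```python
import math


def binomial_standard_error(p, n):
    """Standard error of a proportion p estimated from n independent trials."""
    return math.sqrt(p * (1 - p) / n)


# Success rates and trial counts as reported on the slide.
rhymes_with_se = binomial_standard_error(0.692, 13_165)
recue_se = binomial_standard_error(0.556, 14_685)

print(f"Rhymes With: 69.2% ± {rhymes_with_se:.1%}")
print(f"Recue:       55.6% ± {recue_se:.1%}")
```

Both values come out at about 0.4 percentage points, which is consistent with the slide, though it does not confirm that this is the method the authors used.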

23 What helped which words best?
                 Same day:         Later day:
Grade 1 words:   Say In Context    Say In Context, Onset Rime
Grade 2 words:   Say In Context    Say In Context, Rhymes With
Grade 3 words:   Say In Context    Rhymes With, One Grapheme
Compare within level to control for word difficulty.
Supplying the word helped best in the short term…
But rhyming hints had longer lasting benefits.

24 So… what can your computational linguistics model in an intelligent tutor?
What problem is important to solve?
- Language models predict word sequences for a task.
- Domain models describe skills to learn.
- Production models describe student behavior.
- Student models estimate a student’s skills.
- Pedagogical models guide tutorial decisions.
- …
What data is available to train on?
What method is suitable to apply?
What result is appropriate to evaluate?

25 …Well I got a hammer
Well I got a hammer,
And I got a bell,
And I got a song to sing, all over this land.
It’s the hammer of Justice,
It’s the bell of Freedom,
It’s the song about Love between my brothers and my sisters,
All over this land.

26 Conclusions…
See papers & videos at www.cs.cmu.edu/~listen.
Muchas gracias, Molto grazie, Obrigado, Merci beaucoup, Danke schön, Dank U well, Spaseeba, Blagodaria, Tak, Todah rabah, Shukra, Efcharisto, Xeh-xeh, Arigato gozaymas, Kop-kun krap, Thank you!
Questions?
Thanks

