A Tutorial Dialogue System that Adapts to Student Uncertainty Diane Litman Computer Science Department & Intelligent Systems Program & Learning Research and Development Center

Outline
• Motivation
• The ITSPOKE System and Corpora
• Detecting and Adapting to Student Uncertainty (joint work with Kate Forbes-Riley)
  – Uncertainty Detection and Adaptation
  – Experimental Evaluation
    » Wizard-of-Oz
    » Fully-Automated
• Summing Up

Tutorial Dialogue Systems
• Why is one-on-one tutoring so effective?
  “...there is something about discourse and natural language (as opposed to sophisticated pedagogical strategies) that explains the effectiveness of unaccomplished human [tutors].” [Graesser, Person et al. 2001]
• Goal: improve Intelligent Tutoring Systems using Natural Language Processing

More generally... Natural Language Processing and Tools for Learning
• Learning Language (reading, writing, speaking)
  – Tutors
  – Scoring
  – Readability
• Using Language (to teach everything else)
  – Conversational Tutors / Peers
  – CSCL
• Processing Language
  – Discourse Coding
  – Lecture Retrieval
  – Questioning & Answering

ITSPOKE: Intelligent Tutoring Spoken Dialogue System
• Back-end is Why2-Atlas [VanLehn, Jordan, Rose et al. 2002]
• Speech Enhanced
  – Sphinx2 speech recognition
  – Cepstral text-to-speech
• Reimplemented, other changes

ITSPOKE Corpora
• Wizard Tutoring (ITSPOKE-WOZ)
  – 81 students / 405 dialogues
  – human performs speech recognition, semantic analysis
  – computer performs dialogue management
• Computer Tutoring (ITSPOKE-AUTO)
  – 72 students / 360 dialogues

Experimental Procedure
• College students without physics
  – Read a small background document
  – Took a multiple-choice Pretest
  – Worked 5 problems (dialogues) with ITSPOKE
  – Took an isomorphic Posttest
• Goal was to optimize Learning Gain
  – e.g., Posttest – Pretest
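As a minimal sketch of the Learning Gain measure named above, the snippet below computes Posttest minus Pretest per student and averages it. It assumes scores are expressed as fractions correct; the function name and the sample scores are illustrative only and are not data from the study.

```python
from statistics import mean


def learning_gain(pretest: float, posttest: float) -> float:
    """Raw learning gain: posttest score minus pretest score."""
    return posttest - pretest


# Hypothetical (pretest, posttest) scores as fractions correct; NOT study data.
students = [(0.50, 0.75), (0.40, 0.50), (0.60, 0.90)]

gains = [learning_gain(pre, post) for pre, post in students]
mean_gain = mean(gains)  # average gain for this made-up group
```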

Why Uncertainty?
• Most frequent student state in our dialogue corpora [Litman and Forbes-Riley 2004]
• Focus of other learning sciences, speech and language processing, and psycholinguistic studies [Craig et al. 2004; Liscombe et al. 2005; Pon-Barry et al. 2006; Dijkstra et al. 2006]
• .73 Kappa [Forbes-Riley et al. 2008]

Corpus-Based Detection Methodology
• Learn detection models from training corpora
  – Use spoken language processing to automatically extract features from user turns
  – Use extracted features (e.g., prosodic, lexical) to predict uncertainty annotations
• Evaluate learned models on testing corpora
  – Significant reduction of error compared to baselines [Litman and Forbes-Riley 2006; Litman et al. 2007]
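One way to read "reduction of error compared to baselines" is as a relative error reduction over a majority-class baseline. The sketch below shows that arithmetic. The choice of a majority-class baseline, the function names, and the label counts are all assumptions for illustration; the slides do not say which baselines were used.

```python
from collections import Counter


def majority_baseline_error(labels):
    """Error rate of always predicting the most frequent label (assumed baseline)."""
    most_common_count = Counter(labels).most_common(1)[0][1]
    return 1 - most_common_count / len(labels)


def relative_error_reduction(baseline_error, model_error):
    """Fraction of the baseline's error that the learned model removes."""
    if baseline_error == 0:
        raise ValueError("baseline makes no errors; reduction is undefined")
    return (baseline_error - model_error) / baseline_error


# Hypothetical test-set annotations: 80 non-uncertain turns, 20 uncertain turns.
test_labels = ["nonU"] * 80 + ["U"] * 20
baseline = majority_baseline_error(test_labels)

# Hypothetical learned-model error rate on the same test set.
reduction = relative_error_reduction(baseline, model_error=0.15)
```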

System Adaptation: How to Respond?
• Theory-based
  – [VanLehn et al. 2003; Craig et al. 2004]
• Corpus-based
  – How do humans respond? e.g. [Forbes-Riley, Rotaru, Litman, and Tetreault 2007] *
  – What are optimal responses? e.g. [Chi, VanLehn and Litman 2010] *
* Best paper awards

Theory-Based Adaptation: Uncertainty as Learning Opportunity
• Uncertainty represents one type of learning impasse, and is also associated with cognitive disequilibrium
  – An impasse motivates a student to take an active role in constructing a better understanding of the principle. [VanLehn et al. 2003]
  – A state of failed expectations causing deliberation aimed at restoring equilibrium. [Craig et al. 2004]
• Hypothesis: The system should adapt to uncertainty in the same way it responds to other impasses (e.g., incorrectness)

Adaptation to Student Uncertainty in ITSPOKE
• Most systems respond only to (in)correctness
• Literature suggests uncertain as well as incorrect student answers signal learning impasses
• Experimentally manipulate tutor responses to student uncertainty, over and above correctness, and investigate impact on learning
  – Platform: Adaptive version(s) of ITSPOKE

Normal (non-adaptive) ITSPOKE
• System Initiative Dialogue Format:
  – Tutor Question – Student Answer – Tutor Response
• Tutor Response Types:
  – to Corrects (C): positive feedback (e.g. “Fine”)
  – to Incorrects (I): negative feedback (e.g. “Well…”) and
    » Bottom Out: correct answer with reasoning
    » Subdialogue: questions walk through reasoning

Adaptive ITSPOKE(s)
• Our Prior Work: Rank correctness (C, I) + uncertainty (U, nonU) states in terms of impasse severity

  State:     I+nonU   I+U    C+U     C+nonU
  Severity:  most     less   least   none

• Adaptation Hypothesis:
  – ITSPOKE already resolves I impasses (I+nonU, I+U), but it ignores one type of U impasse (C+U)
  – Performance improvement if ITSPOKE provides additional content to resolve all impasses

Two Uncertainty Adaptations
• Simple Adaptation
  – Same response for all 3 impasses
  – Feedback on only (in)correctness
• Complex Adaptation
  – Different responses for the 3 impasses
  – Feedback on both uncertainty and (in)correctness

Simple Adaptation Example: C+U
TUTOR1: By the same reasoning that we used for the car, what’s the overall net force on the truck equal to?
STUDENT1: The force of the car hitting it?? [C+U]
TUTOR2: Fine. [FEEDBACK] We can derive the net force on the truck by summing the individual forces on it, just like we did for the car. First, what horizontal force is exerted on the truck during the collision? [SUBDIALOGUE]
• Same TUTOR2 subdialogue if student was I+U or I+nonU

Experiment 1: ITSPOKE-WOZ
• Wizard of Oz version of ITSPOKE
  – Human recognizes speech, annotates correctness and uncertainty
  – Provides upper-bound language performance
• Conditions
  – Simple Adaptation: used same response for all impasses
  – Complex Adaptation: used different responses for each impasse
  – Normal Control: used original system (no adaptation)
  – Random Control: gave Simple Adaptation to random 20% of correct answers (to control for additional tutoring)
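The Normal, Simple Adaptation, and Random Control conditions can be sketched as a small response-selection function. This is only an illustration of the logic described on the slides, not the ITSPOKE implementation: the function name, the string labels, and the single "remediation" content type (standing in for either a bottom-out or a subdialogue) are assumptions, and the Complex condition is left out.

```python
import random


def tutor_response(condition, correct, uncertain, rng=random):
    """Return (feedback, content) for one student answer.

    condition: "normal", "simple", or "random" (hypothetical labels).
    content is "remediation" (bottom-out or subdialogue) or None.
    """
    feedback = "positive" if correct else "negative"
    if not correct:
        # I+U and I+nonU: every condition already remediates incorrect answers.
        return feedback, "remediation"
    if condition == "simple" and uncertain:
        # C+U is treated as an impasse and gets the same content as an incorrect answer.
        return feedback, "remediation"
    if condition == "random" and rng.random() < 0.20:
        # Random Control: extra content after a random 20% of correct answers.
        return feedback, "remediation"
    return feedback, None


# A correct-but-uncertain answer under each deterministic condition.
normal_cu = tutor_response("normal", correct=True, uncertain=True)
simple_cu = tutor_response("simple", correct=True, uncertain=True)
```

Passing a seeded `random.Random` instance as `rng` makes the Random Control behaviour reproducible when testing.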

Results I: Learning

Metric: Learning Gain (Posttest – Pretest)

  Condition            N    Mean   Diff                  p
  Normal Control       21   .183   < Simple Adaptation   .03
  Random Control
  Simple Adaptation
  Complex Adaptation

F(3, 77) = 3.275, p = 0.02

• Simple Adaptation yields more student learning than Normal Control (original ITSPOKE) [Forbes-Riley and Litman 2010]
• Similar results for learning efficiency [Forbes-Riley and Litman 2009]

Additional Evaluations - Metacognition
• Do metacognitive performance measures differ across experimental conditions?
  – e.g., Monitoring Accuracy [Nietfeld et al. 2006]
• Do metacognitive and cognitive performance measures (i.e. learning) correlate?

Metacognitive Results
• Simple (and random) increased monitoring accuracy compared to normal (p < .06 in paired contrasts)
• Monitoring Accuracy is positively correlated with learning [Litman and Forbes-Riley 2009]

Experiment 2: ITSPOKE-AUTO
• Fully automated ITSPOKE
  – Sphinx2 speech recognizer / TuTalk semantic analyzer
    » Correctness Accuracy of 85%
  – Weka uncertainty model
    » Logistic regression (includes lexical, prosodic, dialogue features)
    » Uncertainty Accuracy of 80%
• Only 3 Conditions
  – Simple Adaptation
  – Normal Control
  – Random Control

Preliminary Results: ITSPOKE-AUTO
• Simple Adaptation yields more student learning than Normal and Random Controls
• Differences only significant for a subset of students
• Noisy uncertainty detection is the system bottleneck
• 3 of the 4 metacognitive metrics remain correlated with learning [Forbes-Riley and Litman, 2010]

Current and Future Research
• More sophisticated ITSPOKE adaptations
  – User modeling (domain knowledge, gender)
  – Multiple student states (disengagement)
  – Motivation [Ward 2010]
• Remediate metacognition, not just domain content

Summing Up
• Spoken dialogue contributes to the success of human tutors
• Using presently available technology, successful tutorial dialogue systems can also be built
• Adapting to uncertainty can further improve performance
  – Learning gains, efficiency, metacognition
• Tutors can serve as platforms for learning science studies

Related Projects
Natural Language Processing and Tools for Learning
• Learning Language (reading, writing, speaking)
• Using Language (to teach everything else)
  – Conversational Tutors
• Processing Language

• Tutor Abstraction and Specialization during Reflective Conversation [Katz/Jordan/Litman poster]
• Semantic Class Acquisition via Web-Learning [Lipschultz/Litman poster]

Related Projects
Natural Language Processing and Tools for Learning
• Learning Language (reading, writing, speaking)
• Using Language (to teach everything else)
• Processing Language

• Computer-Supported Peer Review for Writing [Xiong/Litman/Schunn poster]

Acknowledgements
• ITSPOKE group past and present
  – Hua Ai, Min Chi, Joanna Drummond, Kate Forbes-Riley, Heather Friedberg, Alison Huettner, Michael Lipschultz, Beatriz Maeireizo-Tokeshi, Greg Nicholas, Amruta Purandare, Mihai Rotaru, Scott Silliman, Joel Tetreault, Art Ward, Wenting Xiong
• Jan Wiebe, Rebecca Hwa, Wendy Chapman
• Why2-Atlas and Human Tutoring groups
  – Kurt VanLehn, Pamela Jordan, Carolyn Rose
  – Micki Chi, Scotty Craig, Bob Hausmann, Margueritte Roy, Sandra Katz

Thank You!
• Questions?
• Further Information –

The End

Example Student States in ITSPOKE
ITSPOKE: What else do you need to know to find the box's acceleration?
Student: the direction [UNCERTAIN]
ITSPOKE: If you see a body accelerate, what caused that acceleration?
Student: force [CERTAIN]
ITSPOKE: Good job. Say there is only one force acting on the box. How is this force, the box's mass, and its acceleration related?
Student: velocity [UNCERTAIN]
ITSPOKE: Could you please repeat that?
Student: velocity [ANNOYED]

WOZ-TUT Screenshot

Bigram Dependency Analysis
“Student Certainness – Tutor Positive Feedback” Bigrams
[Table: EXPECTED and OBSERVED counts of Tutor Includes Pos / Tutor Omits Pos after neutral, certain, uncertain, and mixed student turns]
Critical χ² value at p = .001 is 16.27

Bigram Dependency Analysis (cont.)
[Table: EXPECTED and OBSERVED counts of Includes Pos / Omits Pos after neutral, certain, uncertain, and mixed student turns]
• Less Tutor Positive Feedback after Student Neutral turns
• More Tutor Positive Feedback after “Emotional” turns

Survey Tutoring Uncertainty Spoken Dialogue

Learning Efficiency Results

Metric: Normalized learning gain / total tutoring time in minutes

  Condition            N    Mean   Diff             p
  Normal Control       21   .010   < Simple Adapt   .004
  Random Control
  Simple Adaptation
  Complex Adaptation   20   .011   < Simple Adapt   .013

F(3, 77) = 3.56, p = 0.02

• Given same amount of tutoring time, Simple Adaptation yields more student learning than either Normal Control or Complex Adaptation
• Results also hold using raw learning gain, and total number of student turns
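The efficiency metric above divides normalized learning gain by tutoring time. The sketch below works through that arithmetic. The slide does not define normalized learning gain, so the common (posttest - pretest) / (1 - pretest) form is used here as an assumption, with scores as fractions correct; function names and example numbers are illustrative.

```python
def normalized_learning_gain(pretest, posttest):
    """ASSUMED definition: (posttest - pretest) / (1 - pretest), scores in [0, 1]."""
    if pretest >= 1.0:
        raise ValueError("normalized gain is undefined for a perfect pretest")
    return (posttest - pretest) / (1.0 - pretest)


def learning_efficiency(pretest, posttest, tutoring_minutes):
    """Normalized learning gain per minute of total tutoring time."""
    if tutoring_minutes <= 0:
        raise ValueError("tutoring time must be positive")
    return normalized_learning_gain(pretest, posttest) / tutoring_minutes


# Hypothetical student: pretest 0.50, posttest 0.75, 50 minutes of tutoring.
efficiency = learning_efficiency(0.50, 0.75, tutoring_minutes=50)
```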

Bias

                 Correct   Incorrect
  NonUncertain   CnonU     InonU
  Uncertain      CU        IU

• Bias scores greater than and less than zero indicate over-confidence and under-confidence, respectively, with zero indicating best performance

Discrimination

                 Correct   Incorrect
  NonUncertain   CnonU     InonU
  Uncertain      CU        IU

• Discrimination scores greater than zero indicate higher metacognitive performance, in terms of certainty for correct responses and uncertainty for incorrect responses

Results I: Means across Conditions
[Table: means of Average Impasse Severity, Monitoring Accuracy, Bias, and Discrimination for Complex Adaptation (20), Simple Adaptation (20), Random Control (20), and Normal Control (21)]
• No statistically significant differences or trends for bias
• Trend for discrimination differences overall (p = .09)
• However, contrary to our predictions, complex reduced discrimination ability, compared to random and simple (p < .04 in paired contrasts)

Intelligent Tutoring

Corpus-Based Adaptation: How Do Human Tutors Respond?
• An empirical method for designing dialogue systems adaptive to student state
  – extraction of “dialogue bigrams” from annotated human tutoring corpora
  – χ² analysis to identify dependent bigrams
  – generalizable to any domain with corpora labeled for user state and system response
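The bigram method above can be sketched in a few lines: count (student state, tutor response) bigrams into a contingency table, then compute a Pearson χ² statistic against the counts expected under independence. The counts below are made up for illustration and are not the study's data; the function names are hypothetical. The 16.27 threshold is the critical value at p = .001 quoted in the bigram analysis slides, which matches 3 degrees of freedom.

```python
from collections import Counter


def bigram_table(bigrams, states):
    """Rows: student states. Columns: [tutor includes pos. feedback, tutor omits it]."""
    counts = Counter(bigrams)
    return [[counts[(state, True)], counts[(state, False)]] for state in states]


def chi_square(observed):
    """Pearson chi-square statistic and degrees of freedom for a contingency table.

    Assumes every row and column total is nonzero.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (obs - expected) ** 2 / expected
    dof = (len(observed) - 1) * (len(observed[0]) - 1)
    return stat, dof


STATES = ["neutral", "certain", "uncertain", "mixed"]

# Hypothetical bigrams: (student state, next tutor turn includes positive feedback?).
bigrams = (
    [("neutral", True)] * 30 + [("neutral", False)] * 70
    + [("certain", True)] * 40 + [("certain", False)] * 10
    + [("uncertain", True)] * 35 + [("uncertain", False)] * 15
    + [("mixed", True)] * 10 + [("mixed", False)] * 5
)

observed = bigram_table(bigrams, STATES)
stat, dof = chi_square(observed)
dependent = stat > 16.27  # critical value at p = .001 for 3 degrees of freedom
```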

Example Human Tutoring Excerpt
S: So the- when you throw it up the acceleration will stay the same? [Uncertain]
T: Acceleration uh will always be the same because there is- that is being caused by force of gravity which is not changing. [Restatement, Expansion]
S: mm-k. [Neutral]
T: Acceleration is– it is in- what is the direction uh of this acceleration- acceleration due to gravity? [Short Answer Question]
S: It’s- the direction- it’s downward. [Certain]
T: Yes, it’s vertically down. [Positive Feedback, Restatement]

Findings
• Statistically significant dependencies exist between students’ state of certainty and the responses of an expert human tutor
  – After uncertain, tutor Bottoms Out and avoids expansions
  – After certain, tutor Restates
  – After any emotion, tutor increases Feedback
• Dependencies suggest adaptive strategies for implementation in our computer tutor [Forbes-Riley and Litman 2010]

Complex Adaptation Example: C+U
TUTOR2: That’s exactly right, but you seem unsure, so let’s sum up. [FEEDBACK] The net force on the truck is equal to the impact force on it. We can prove this just like we did for the car. First, we know that gravity and the normal force on the truck must cancel each other, otherwise the truck would not be at rest vertically. Second, we know that the impact force is the only horizontal force exerted on the truck. [LONG BOTTOM OUT]

Complex Adaptation Example: I+U
TUTOR1: By the same reasoning that we used for the car, what’s the overall net force on the truck equal to?
STUDENT1: Zero?? [I+U]
TUTOR2: That’s not correct, but don’t worry. You seem to be aware of your mistake. Let’s resolve it. [FEEDBACK] The net force on the truck is equal to the impact force on it. Let’s walk through this answer step by step. [SHORT BOTTOM OUT] We can derive the net force on the truck by summing the individual forces on it, just like we did for the car. First, what horizontal force is exerted on the truck during the collision? [SUBDIALOGUE]

Complex Adaptation Example: I+nonU
TUTOR1: By the same reasoning that we used for the car, what’s the overall net force on the truck equal to?
STUDENT1: The force of gravity. [I+nonU]
TUTOR2: I’m sorry, but I see an error in your answer. Let’s fix it. [FEEDBACK] We can derive the net force on the truck by summing the individual forces on it, just like we did for the car. First, what horizontal force is exerted on the truck during the collision? [SUBDIALOGUE]

Discussion
• Predictions versus results:
  – Complex Adaptation > Simple Adaptation > Random Control > Normal Control
• Why didn’t Simple Adaptation and Complex Adaptation outperform Random Control?
  – Random Control adapted to some C+U, diminishing differences
  – Adapting to C+nonU may increase certainty
• Why didn’t Complex Adaptation outperform Simple Adaptation?
  – Complex Adaptation’s human-based content responses were based on frequency, not effectiveness

Complex Adaptation to Uncertainty
• Depending on whether the answer is C+U, I+U, or I+nonU:
  – ITSPOKE gives same content but varies dialogue act
    » Based on human tutor responses significantly associated with C+U, I+U, I+nonU answers
  – ITSPOKE gives complex feedback on uncertainty and (in)correctness
    » Based on empathetic computer tutor literature (Wang et al., 2005; Hall et al., 2004; Burleson et al., 2004)

Impasse Severity
• Use the scalar value associated with each student turn to compute an average impasse severity, per student

  Nominal State:  I+nonU   I+U    C+U     C+nonU
  Scalar State:
  Severity:       most     less   least   none
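A per-student average impasse severity can be sketched as below. The scalar values did not survive in the slide text, so the 3-to-0 mapping here is an assumption chosen only to preserve the stated ordering (most severe to none); the names are illustrative.

```python
# ASSUMED scalar values: only the ordering (most > less > least > none)
# comes from the slide; the numbers 3..0 are a placeholder choice.
SEVERITY = {
    ("I", "nonU"): 3,  # most severe
    ("I", "U"): 2,     # less
    ("C", "U"): 1,     # least
    ("C", "nonU"): 0,  # none
}


def average_impasse_severity(turns):
    """Mean severity over one student's turns, each a (correctness, uncertainty) pair."""
    if not turns:
        raise ValueError("student has no annotated turns")
    return sum(SEVERITY[turn] for turn in turns) / len(turns)


# Hypothetical annotated turns for one student.
student_turns = [("C", "nonU"), ("C", "U"), ("I", "U"), ("I", "nonU")]
avg_severity = average_impasse_severity(student_turns)  # (0 + 1 + 2 + 3) / 4
```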

Results II
• Correlations of Metacognitive Measures with Posttest, after controlling for Pretest
[Table: R and p for Average Impasse Severity and Monitoring Accuracy (n=81)]
• Average Impasse Severity (where smaller is better) is negatively correlated with learning [Litman and Forbes-Riley 2009]
• Monitoring Accuracy (where higher is better) is positively correlated with learning [Litman and Forbes-Riley 2009]
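One standard way to correlate a measure with Posttest "after controlling for Pretest" is a first-order partial correlation. The slides do not state which procedure was actually used, so the sketch below is an assumption, and the scores are invented for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def partial_correlation(x, y, z):
    """Correlation of x and y with the linear effect of z removed from both."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical per-student values; NOT data from the study.
monitoring = [0.2, 0.5, 0.4, 0.8, 0.6]
posttest = [0.5, 0.7, 0.6, 0.9, 0.8]
pretest = [0.4, 0.5, 0.5, 0.6, 0.7]

r_partial = partial_correlation(monitoring, posttest, pretest)
```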

Preliminary Results: ITSPOKE-AUTO
[Table: R and p for Average Impasse Severity and Monitoring Accuracy in the WOZ and AUTO corpora]
• Impasse Severity and Monitoring Accuracy remain correlated with learning in ITSPOKE-AUTO corpus [Forbes-Riley and Litman, submitted]

Monitoring Accuracy

                 Correct   Incorrect
  NonUncertain   CnonU     InonU
  Uncertain      CU        IU

• The wizard's annotations for each student are first represented in an array, where each cell represents a mutually exclusive option motivated by Feeling of (Another’s) Knowing [Smith and Clark 1993; Brennan and Williams 1995], which is closely related to uncertainty [Dijkstra et al. 2006]
• The array is then used to compute monitoring accuracy
• Ranges from -1 (no monitoring accuracy) to 1 (perfect monitoring accuracy)

Metacognitive Performance Metrics
• Knowledge monitoring accuracy (HC) (Nietfeld et al., 2006)
• Monitoring one’s own knowledge ≈ one’s Certainty level ≈ one’s Feeling of Knowing (FOK)
  – HC has been used to measure FOK accuracy (Smith & Clark, 1993): the accuracy with which one’s certainty corresponds to correctness
• Feeling of Another’s Knowing (FOAK): inferring the FOK of someone else (Brennan & Williams, 1995)
  – We use HC to measure FOAK accuracy (our uncertainty is inferred)

\[
\mathrm{HC} = \frac{(\mathrm{COR\_CER} + \mathrm{INC\_UNC}) - (\mathrm{INC\_CER} + \mathrm{COR\_UNC})}{(\mathrm{COR\_CER} + \mathrm{INC\_UNC}) + (\mathrm{INC\_CER} + \mathrm{COR\_UNC})}
\]

• COR_CER + INC_UNC: cases where (un)certainty and (in)correctness agree
• INC_CER + COR_UNC: cases where (un)certainty and (in)correctness are at odds
• Denominator sums over all cases
• Scores range from -1 (no accuracy) to 1 (perfect accuracy)
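The HC formula translates directly into code. The sketch below takes the four cell counts for one student and returns HC; the function name, argument names, and example counts are illustrative, and the zero-turn check is an added safeguard rather than something the slides specify.

```python
def monitoring_accuracy(cor_cer, inc_unc, inc_cer, cor_unc):
    """HC: (agreeing cases - conflicting cases) / all cases, in [-1, 1]."""
    agree = cor_cer + inc_unc    # certainty matches correctness
    at_odds = inc_cer + cor_unc  # certainty conflicts with correctness
    total = agree + at_odds
    if total == 0:
        raise ValueError("no annotated turns for this student")
    return (agree - at_odds) / total


# Hypothetical counts for one student's turns:
# 12 correct+certain, 3 incorrect+uncertain, 2 incorrect+certain, 3 correct+uncertain.
hc = monitoring_accuracy(cor_cer=12, inc_unc=3, inc_cer=2, cor_unc=3)  # (15 - 5) / 20
```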