(Speech and Affect in Intelligent Tutoring) Spoken Dialogue Systems
Diane Litman
Computer Science Department and Learning Research and Development Center

Outline  Introduction  The ITSPOKE System and Corpora  Spoken versus Typed Dialogue Tutoring  Recognizing and Adapting to Student State  Current Directions and Summary

Natural Language Processing
The field of Natural Language Processing (NLP), or Computational Linguistics (CL), or Human Language Technology (HLT), is primarily concerned with the creation of computer programs that perform useful and interesting tasks with human languages.
–enable computers to interact with humans using natural language (Prof. Litman)
–serve as useful adjuncts to humans in tasks involving language by providing services such as automatic translation (Prof. Hwa), summarization and question-answering (Prof. Wiebe)
The foundations of the field are in computer science, artificial intelligence, linguistics, mathematics and statistics, electrical engineering, and psychology.
Studying NLP involves studying natural languages, formal representations, and algorithms for their manipulation.
See nlp.cs.pitt.edu for further details on NLP at Pitt

Spoken Dialogue Research Group   Current Projects –Monitoring Student State in Tutorial Spoken Dialogue –Adding Speech to a Text-Based Dialogue Tutor –Tutoring Scientific Explanations via Natural Language Dialogue –TuTalk: Infrastructure for authoring and experimenting with natural language dialogue in tutoring systems and learning research –Natural Language Processing Technology for Guided Study of Bioinformatics

Spoken Dialogue Research Group (cont.)  PhD Students (CS and ISP) –Hua Ai (simulated users for reinforcement learning) –Amruta Purandare (unsupervised clustering for topic tracking) –Mihai Rotaru (machine learning, speech analysis, affective dialogue systems) –Art Ward (dialogue coherency and learning)  Also 1 Undergraduate, 2 Postdocs, and Programmer  Alumni –Beatriz Maeireizo-Tokeshi (Computer Science MS Project: Applying Co-training for Predicting Student Emotions with Spoken Dialogue Data, 2005)

Spoken Dialogue Tutoring: Motivation  Working hypothesis regarding learning gains –Human Dialogue > Computer Dialogue > Text  Most human tutoring involves face-to-face spoken interaction, while most computer dialogue tutors are text-based  Can the effectiveness of dialogue tutorial systems be further increased by using spoken interactions?

Potential Benefits of Speech  Self-explanation correlates with learning and occurs more in speech  Speech contains prosodic information, providing new sources of information for dialogue adaptation  Spoken computational environments may prime a more social interpretation that enhances learning  Potential for hands-free interaction

Spoken Tutorial Dialogue Systems  Recent tutoring systems have begun to add spoken language capabilities  However, little empirical analysis of the learning ramifications of using speech

ITSPOKE: Intelligent Tutoring SPOKEn Dialogue System  Back-end is text-based Why2-Atlas tutorial dialogue system (VanLehn et al., 2002)  Student speech digitized from microphone input; Sphinx2 speech recognizer  Tutor speech played via headphones/speakers; Cepstral text-to-speech synthesizer

Spoken Tutoring Corpora  Human-Human Corpus –14 students, 128 physics problems (dialogues), 5948 student turns, 5505 tutor turns –Student and tutor turns were manually transcribed for content and segmented for boundaries

Spoken Tutoring Corpora (cont.)  Computer-Human Corpus –20 students, 100 physics problems (dialogues), 2445 student turns, 2967 tutor turns –Noisy student turn boundaries and transcriptions extractable from ITSPOKE logs –Content of student turns also manually transcribed

ITSPOKE Corpora Comparison

Human-Human
…1.3 minutes into session…
Student: Can I ask you questions?
Tutor: Yes
Student: Is that ok?
Tutor: Mm-hm
Student: Um from what we learned from the last problem they're gonna have the same- the keys and you have the same acceleration right?
Tutor: Yes
Student: So that means they'll both hit the bottom at the same time. But I don't understand where- I don't understand if you're in the elevator-
Tutor: You see
Student: Where are you going to-?
Tutor: The uh let me uh the key uh- the person holds the key in front of-
Student: Their face yeah-
Tutor: Uh his or her face and then lets it uh uh it is let go so the the the question relates to the relative position of the key and the face uh-
Student: So the key and the face-
Tutor: Uh how will they- as they go- as they both fall what will be their relative position? That is the question.
Student: (sigh)

Human-Computer
…3.5 minutes into session…
Tutor: I didn't understand. What force acts on all objects within close proximity to earth?
Student: uh the force of gravity
Tutor: Fine. Besides the Earth's gravitational force, are there any other forces acting on the person, keys, and elevator after the release?
Student: no
Tutor: Yep. Thus, keep in mind that after the release, the only force on the person, keys, and elevator is the force of gravity. If gravity is the only force acting on an object, will it be moving or staying still?
Student: it will be moving
Tutor: Fine. Not only are the person, keys, and elevator moving, they have only gravitational forces on them. When an object is falling and has only gravitational force on it, it is said to be in what?
Student: freefall
Tutor: Good. Thus, after the release the only force on the keys and man is the downward force of earth's gravity, so they are in freefall. Before the release, what is the relationship between the man's velocity and the keys' velocity?
Student: they're equal
Tutor: Correct. We know that after the release, the keys, the person, and the elevator are in freefall. Is freefall acceleration near earth a constant? Or does it vary depending upon which object is in freefall?
Student: it's a constant

Spoken Versus Typed Human and Computer Dialogue Tutoring
Diane Litman, Carolyn Penstein Rosé, Kate Forbes-Riley, Kurt VanLehn, Dumisizwe Bhembe, and Scott Silliman
Proceedings of the Seventh International Conference on Intelligent Tutoring Systems (2004)
International Journal of Artificial Intelligence in Education (to appear)
Project: Adding Spoken Language to a Text-Based Dialogue Tutor

Research Questions  Given that natural language tutoring systems are becoming more common, is it worth the extra effort to develop spoken rather than text-based systems?  Given the current limitations of speech and natural language processing technologies, how do computer tutors compare to the upper bound performance of human tutors?

Common Experimental Aspects  Students take a physics pretest  Students read background material  Students use web interface to work through up to 10 problems with either a computer or a human tutor  Students take a posttest –40 multiple choice questions, isomorphic to pretest

Human Tutoring: Experiment 1  Same human tutor, subject pool, physics problems, web interface, and experimental procedure across two conditions  Typed dialogue condition (20 students, 171 dialogues/physics problems) –Strict turn-taking enforced  Spoken dialogue condition (14 students, 128 dialogues/physics problems) –Interruptions and overlapping speech permitted –Dialogue history box remains empty

Typed versus Spoken Tutoring: Overview of Analyses  Tutoring and Dialogue Evaluation Measures –learning gains –efficiency  Correlation of Dialogue Characteristics and Learning –do dialogue means differ across conditions? –which dialogue aspects correlate with learning in each condition?

Learning and Training Time
Table (numeric values not preserved in this transcript)
Columns: Dependent Measure; Human Spoken (14); Human Typed (20)
Rows: Pretest Mean; Adj. Posttest Mean; Dialogue Time
Key: statistical trend; statistically significant

Discussion  Students in both conditions learned during tutoring (p=0.000)  The adjusted posttest scores suggest that students learned more in the spoken condition (p=0.053)  Students in the spoken condition completed their tutoring in less than half the time (p=0.000)

Dialogue Characteristics Examined  Motivated by previous learning correlations with student language production and interactivity (Core et al., 2003; Rosé et al.; Katz et al., 2003) –Average length of turns (in words) –Total number of words and turns –Initial values and rate of change –Ratios of student and tutor words and turns –Interruption behavior (in speech)
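
Most of the measures listed on this slide can be computed directly from a turn-segmented transcript. The following is a minimal Python sketch, not the authors' tooling: the `(speaker, text)` turn format, the function names, whitespace tokenization, and the choice to fit "initial value" and "rate of change" as the intercept and slope of student words per turn against the turn index are all assumptions made for illustration. The sample turns are invented, and interruption behavior is left out because it needs timing information.

```python
from statistics import mean


def fit_line(ys):
    """Least-squares slope and intercept of ys against the turn index 0..n-1."""
    xs = list(range(len(ys)))
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx  # needs at least two turns, otherwise sxx is zero
    return slope, y_bar - slope * x_bar


def dialogue_measures(turns):
    """turns: list of (speaker, text) pairs; speaker is "student" or "tutor"."""
    stud = [len(text.split()) for who, text in turns if who == "student"]
    tut = [len(text.split()) for who, text in turns if who == "tutor"]
    slope, intercept = fit_line(stud)
    return {
        "tot_stud_words": sum(stud),
        "tot_stud_turns": len(stud),
        "ave_stud_words_per_turn": mean(stud),
        "slope_stud_words_per_turn": slope,          # rate of change
        "intercept_stud_words_per_turn": intercept,  # initial value
        "tot_tut_words": sum(tut),
        "tot_tut_turns": len(tut),
        "ave_tut_words_per_turn": mean(tut),
        "stud_tut_tot_words_ratio": sum(stud) / sum(tut),
        "stud_tut_words_per_turn_ratio": mean(stud) / mean(tut),
    }


# Invented toy dialogue (hypothetical, not taken from the corpora).
toy_turns = [
    ("tutor", "What force acts on the keys"),
    ("student", "gravity"),
    ("tutor", "Good and then"),
    ("student", "they fall together"),
    ("tutor", "Right"),
    ("student", "so the acceleration is constant"),
]
measures = dialogue_measures(toy_turns)
for name, value in measures.items():
    print(name, value)
```

Fitting against the turn index is only one reading of "initial values and rate of change"; fitting against elapsed time would be an equally plausible choice.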

Human Tutoring Dialogue Characteristics (means)
Table (numeric values not preserved in this transcript)
Columns: Dependent Measure; Spoken (14); Typed (20); p
Rows: Tot. Stud. Words; Tot. Stud. Turns; Ave. Stud. Words/Turn; Slope: Stud. Words/Turn; Intercept: Stud. Words/Turn; Tot. Tut. Words; Tot. Tut. Turns; Ave. Tut. Words/Turn; Stud-Tut Tot. Words Ratio; Stud-Tut Words/Turn Ratio

Discussion  For every measure examined, the means across conditions are significantly different –Students and the tutor take more turns in speech, and use more total words –Spoken turns are on average shorter –The ratio of student to tutor language production is higher in text

Learning Correlations after Controlling for Pretest
Table (numeric values not preserved in this transcript)
Columns: Dependent Measure; Human Spoken (14): R, p; Human Typed (20): R, p
Rows: Ave. Stud. Words/Turn; Intercept: Stud. Words/Turn; Ave. Tut. Words/Turn
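
The slides do not say how pretest was controlled for. One common way to correlate a dialogue measure with posttest score while controlling for pretest is a first-order partial correlation, sketched below in Python. The function names and the toy numbers are hypothetical, and this is an illustration of the general technique rather than the authors' analysis.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def partial_corr(measure, posttest, pretest):
    """Correlation of measure with posttest, with pretest partialled out."""
    r_mp = pearson(measure, posttest)
    r_mc = pearson(measure, pretest)
    r_pc = pearson(posttest, pretest)
    return (r_mp - r_mc * r_pc) / sqrt((1 - r_mc ** 2) * (1 - r_pc ** 2))


# Invented data: posttest is exactly pretest plus the dialogue measure,
# and the measure is uncorrelated with pretest.
pretest = [1, 2, 3, 4]
measure = [1, -1, -1, 1]
posttest = [p + m for p, m in zip(pretest, measure)]

raw_r = pearson(measure, posttest)
controlled_r = partial_corr(measure, posttest, pretest)
print(raw_r, controlled_r)
```

In this constructed example the raw correlation understates the relationship, because pretest accounts for much of the variance in posttest; once pretest is partialled out, the measure explains everything that remains.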

Discussion  Measures correlating with learning in the typed condition do not correlate in the spoken condition –Typed results suggest that students who give longer answers, or who are inherently verbose, learn more  Deeper analyses needed (requires manual coding) –e.g., do longer student turns reveal more explanation? –results need to be further examined for student question types, substantive contributions, etc.

Computer Tutoring: Experiment 2  Same as Experiment 1; however –only 5 problems (dialogues) per student –pretest taken after background reading –strict turn taking enforced in both conditions  Typed dialogue condition (23 students, 115 dialogues) –Why2-Atlas  Spoken dialogue condition (20 students, 100 dialogues) –ITSPOKE –(noisy) speech recognition output rather than actual student utterances

Results: Learning and Training Time  Students in both conditions learned during tutoring (p=0.000)  Students learned the same in both conditions (p=0.950)  Students in the typed condition completed their tutoring in less time than in the spoken condition (p=0.004)

Results: Dialogue Characteristics and Correlations with Learning  Means across conditions are no longer significantly different for many measures –total words produced by students –average length of student turns and initial verbosity –ratios of student to tutor language production  Different measures again correlate with learning –Speech: student language production –Text: fewer subdialogues/KCD –Degradation due to speech does not correlate!

Recap  Human Tutoring: spoken dialogue yielded significant performance improvements –Greater learning gains –Reduced dialogue time –Many differences in superficial dialogue characteristics  Computer Tutoring: spoken dialogue made little difference –No change in learning –Increased dialogue time –Fewer dialogue differences

Predicting Student Emotions in Computer-Human Tutoring Dialogues Diane J. Litman and Kate Forbes-Riley  Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (2004)  Speech Communication (to appear) Project: Monitoring Student State in Tutorial Spoken Dialogue

Monitoring Student State (motivation)
Tutor26: Let me ask my question in a more general way. What happens to the velocity of a body when there is no force acting on it?
Student27: dammit (ASR: it is)
Tutor28: Could you please repeat that?
Student29: same (ASR: i same)
Tutor30: Right. And what happens to the HORIZONTAL velocity of a body when there is no HORIZONTAL force acting on it?
Student31: zero (ASR: the zero)
Tutor32: Well... OK, let's consider how forces in the vertical direction affect the motion of a body in a horizontal direction in a different scenario
Student33: oh gravity you already said this (ASR: beats gravity you're exert less)
Tutor34: Fine. Are there any other forces acting on the apple as it falls?
Student35: no why are you doing this again (ASR: no y and to it yes)
Tutor36: Correct. The only force acting on the apple as it falls to the earth is the earth's gravitational force. In which direction does gravitational force act?
Student37: downward you computer (ASR: downward you computer)

Methodology
Emotion Annotation
Machine Learning Experiments
–extract linguistic features from student turns
–use different feature sets to predict emotions
»significant reduction of baseline error

Emotion Annotation Scheme
‘Emotion’: emotions/attitudes that may impact learning
Annotation of Student Turns
Emotion Classes
–negative: e.g. uncertain, bored, irritated, confused, sad
–positive: e.g. confident, enthusiastic
–neutral: no weak or strong expression of negative or positive emotion

Example Annotated Excerpt
ITSPOKE: What happens to the velocity of a body when there is no force acting on it?
Student: dammit (NEGATIVE)
ASR: it is
ITSPOKE: Could you please repeat that?
Student: same (NEUTRAL)
ASR: i same

Feature Extraction per Student Turn
Three feature types
1. Acoustic-prosodic
2. Lexical
3. Identifiers
Research questions
–Relative predictive utility of acoustic-prosodic, lexical and identifier features
–Impact of speech recognition
–Comparison across computer and human tutoring

Feature Types (1)
Acoustic-Prosodic Features
4 pitch (f0): max, min, mean, standard dev.
4 energy (RMS): max, min, mean, standard dev.
4 temporal:
–turn duration (seconds)
–pause length preceding turn (seconds)
–tempo (syllables/second)
–internal silence in turn (zero f0 frames)
available to ITSPOKE in real time
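
The twelve acoustic-prosodic features can be sketched as a single function over per-frame pitch and energy tracks. This is a hedged Python illustration, not ITSPOKE's implementation: the function name, the input format (a list of per-frame f0 values with 0 marking unvoiced frames, and a list of per-frame RMS values), the use of the population standard deviation, the restriction of pitch statistics to voiced frames, and the expression of internal silence as a fraction of zero-f0 frames are all assumptions. The sample values are invented.

```python
from statistics import mean, pstdev


def summarize(values):
    """Max, min, mean, and (population) standard deviation of a track."""
    return {
        "max": max(values),
        "min": min(values),
        "mean": mean(values),
        "std": pstdev(values),
    }


def acoustic_prosodic_features(f0, rms, duration_s, prior_pause_s, n_syllables):
    """Return the 12 per-turn features: 4 pitch, 4 energy, 4 temporal."""
    voiced = [v for v in f0 if v > 0]  # assumption: pitch stats over voiced frames
    feats = {}
    for prefix, track in (("f0", voiced), ("rms", rms)):
        for stat, value in summarize(track).items():
            feats[f"{prefix}_{stat}"] = value
    feats["turn_duration_s"] = duration_s
    feats["prior_pause_s"] = prior_pause_s
    feats["tempo_syl_per_s"] = n_syllables / duration_s
    # assumption: internal silence as the fraction of frames with zero f0
    feats["internal_silence"] = sum(1 for v in f0 if v <= 0) / len(f0)
    return feats


# Invented frame values for one short student turn.
turn_feats = acoustic_prosodic_features(
    f0=[0.0, 200.0, 220.0, 0.0, 180.0],
    rms=[0.1, 0.5, 0.6, 0.2, 0.4],
    duration_s=1.0,
    prior_pause_s=0.5,
    n_syllables=3,
)
for name, value in turn_feats.items():
    print(name, value)
```

All inputs here are quantities a pitch tracker and the recognizer's timing output could supply without a transcript, which is consistent with the slide's note that these features are available in real time.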

Feature Types (2) Word Occurrence Vectors  Human-transcribed lexical items in the turn  ITSPOKE-recognized lexical items
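
A word occurrence vector can be built per turn from either source of lexical items. The Python sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, tokenization is by whitespace, and occurrence is recorded as 0/1 rather than as a count. The example turns are the two student turns from the annotated excerpt, once as human-transcribed and once as recognized by ITSPOKE.

```python
def build_vocabulary(turns):
    """Sorted list of all distinct lower-cased words seen in the turns."""
    return sorted({word for turn in turns for word in turn.lower().split()})


def occurrence_vector(turn, vocabulary):
    """1 if the vocabulary word occurs in the turn, else 0."""
    words = set(turn.lower().split())
    return [1 if word in words else 0 for word in vocabulary]


# Student turns from the annotated excerpt, in both versions.
transcribed_turns = ["dammit", "same"]
recognized_turns = ["it is", "i same"]

transcribed_vocab = build_vocabulary(transcribed_turns)
recognized_vocab = build_vocabulary(recognized_turns)

transcribed_vectors = [occurrence_vector(t, transcribed_vocab) for t in transcribed_turns]
recognized_vectors = [occurrence_vector(t, recognized_vocab) for t in recognized_turns]
print(transcribed_vocab, transcribed_vectors)
print(recognized_vocab, recognized_vectors)
```

Keeping the two vocabularies separate makes the effect of recognition errors visible: the word "dammit" exists only in the transcribed feature space, so a classifier trained on recognized text never sees it.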

Feature Types (3) Identifier Features  student number  student gender  problem number

Summary of Results (Computer Tutoring)

Comparison with Human Tutoring - In human tutoring dialogues, emotion prediction (and annotation) is more accurate and based on somewhat different features

Recap  Recognition of annotated student emotions in spoken computer and human tutoring dialogues, using multiple knowledge sources  Significant improvements in predictive accuracy compared to majority class baselines  A first step towards implementing emotion prediction and adaptation in ITSPOKE
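
To make the comparison against a majority class baseline concrete, the Python sketch below computes the baseline accuracy for a labeled set of turns and the relative reduction in error achieved by a classifier. The function names, the label distribution, and the classifier accuracy are all invented for illustration; they are not results from the study.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy of always predicting the most frequent class."""
    most_common_count = Counter(labels).most_common(1)[0][1]
    return most_common_count / len(labels)


def relative_error_reduction(baseline_acc, model_acc):
    """Fraction of the baseline's error that the model removes."""
    baseline_err = 1 - baseline_acc
    model_err = 1 - model_acc
    return (baseline_err - model_err) / baseline_err


# Hypothetical annotated turns: 6 neutral, 3 negative, 1 positive.
toy_labels = ["neutral"] * 6 + ["negative"] * 3 + ["positive"]
baseline = majority_baseline_accuracy(toy_labels)

# Hypothetical classifier accuracy, chosen only to illustrate the arithmetic.
hypothetical_model_acc = 0.8
reduction = relative_error_reduction(baseline, hypothetical_model_acc)
print(baseline, reduction)
```

With these made-up numbers the baseline is right on 60% of turns, so a classifier at 80% accuracy removes half of the baseline's error.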

Recent Directions  Manual coding of “deeper” dialogue phenomena –Proceedings Artificial Intelligence in Education (2005)  Analysis beyond the turn level –Natural Language Engineering (to appear)  Learning, emotion, and speech recognition –Proceedings of Interspeech (2005)  System adaptation to user emotion –Proceedings Discourse and Dialogue (2005)  Pre-recorded (human) versus synthesized (machine) voice –submitted

Summary  Goal: an empirically-based understanding of the implications of adding speech and affective computing to dialogue tutors  Accomplishments –ITSPOKE –Collection and analysis of two spoken tutoring corpora –Comparisons of typed and spoken tutorial dialogues –Models for emotion prediction  Results will impact the design of future systems incorporating speech, by highlighting the performance gains that can be expected, and the requirements for their achievement

Thank You! Questions? Interested? Take my seminar this spring.

Architecture (diagram; layout not recoverable from this transcript)
Components: Cepstral; www server; www browser; java; ITSpoke; Text Manager; Spoken Dialogue Manager; Speech Analysis (Sphinx); Essay Analysis (Carmel, Tacitus-lite+); Content Dialogue Manager (Ape, Carmel); Why2
Data-flow labels: essay; dialogue; student text (xml); tutor turn (xml); html; xml; text; dialogue repair goals; tutorial goals

Speech Recognition: Sphinx2 (CMU)  Probabilistic language models for different dialogue states  Initial training data –typed student utterances from Why2-Atlas corpora  Later training data –spoken utterances obtained during development and pilot testing of ITSPOKE  Total vocabulary – 1240 unique words  “Semantic Accuracy” Rate = 92.4%

Corpus Transcription & Annotation

Spoken Computer Tutoring Excerpt
ITSPOKE: What else do you need to know to find the box's acceleration?
Student: the direction
ASR: add directions
ITSPOKE: If you see a body accelerate, what caused that acceleration?
Student: force
ITSPOKE: Good job. Say there is only one force acting on the box. How is this force, the box's mass, and its acceleration related?
Student: velocity
ITSPOKE: Could you please repeat that?
ASR: REJECT
Student: velocity