Adding Spoken Dialogue to a Text-Based Tutorial Dialogue System
Diane J. Litman
Learning Research and Development Center & Computer Science Department
University of Pittsburgh
Pittsburgh, PA USA

Why Spoken Dialogue Tutoring?
Motivation
– Promote learning gains by enhancing communication richness
Benefits for Intelligent Tutoring Systems
– Naturalness and ease of use
– New sources of evidence regarding dialogue and pedagogy
– Complement to current talking heads
– Hands-free aspect can support multimodal interaction
Benefits for Spoken Dialogue Systems
– Tutoring is a challenging and largely unexplored application

Research Questions
What are the advantages – and disadvantages – of using speech over text?
Can prosody be used to infer pedagogically significant information?
Can the tutoring system make use of such inferences?
Testbed: Why2, an existing University of Pittsburgh text-based dialogue tutoring system

Why2-Atlas screen shot
4. A rock is thrown straight upward with a velocity v. What is its acceleration at the top of its path? What is its velocity when it comes back to the starting point? Why?
At the very top of the path, where the velocity is equal to 0, the acceleration will be -9.8 m/(s*s), the acceleration of gravity. It will no longer have any upward acceleration. When it comes back the its starting point (assuming that it is the ground) its velocity will be equal to 0 because its downward progress will be stopped by the ground
Tutor: What is the velocity just before it hits the ground?
Student: That will depend on the time that the ball is in the air/distance ball traveled.
Tutor: How will it compare with the velocity with which it was thrown up?
Student: They will be equal. The height the ball reaches will depend on the initial velocity

Recall Previous Work
Prediction of dialogue and utterance level problems
– Automatically computable features (prosody, ASR, history, efficiency, quality, lexical) can accurately predict annotated turn classes (misrecognitions - 93%; aware sites - 89%; corrections - 84%)
– Prosodic features include pitch, energy, duration, pause, tempo, and silence
Adaptation
– Task success improved from 23% to 65% by detecting and reacting to problematic subdialogues
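
As a rough illustration of what "automatically computable" prosodic features might look like, the Python sketch below derives three of the feature types named above (duration, energy, silence) from the audio samples of a single turn. The function name, the 20 ms frame size, and the silence threshold are illustrative assumptions, not the feature extraction used in the work summarized on this slide. Pitch, pause, and tempo are left out because they need a pitch tracker or word-level timing.

```python
import math


def prosodic_features(samples, sample_rate, frame_ms=20, silence_threshold=0.02):
    """Compute simple turn-level features from mono samples scaled to [-1, 1].

    frame_ms and silence_threshold are hypothetical defaults chosen for
    illustration only; they are not taken from the original study.
    """
    # Split the turn into fixed-length, non-overlapping frames.
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    frames = [samples[i:i + frame_len] for i in range(0, len(samples), frame_len)]

    # Root-mean-square energy of each frame.
    rms = [math.sqrt(sum(x * x for x in frame) / len(frame)) for frame in frames]

    # A frame counts as silent when its energy falls below the threshold.
    silent_frames = sum(1 for value in rms if value < silence_threshold)

    return {
        "duration_s": len(samples) / sample_rate,
        "mean_rms_energy": sum(rms) / len(rms) if rms else 0.0,
        "silence_fraction": silent_frames / len(rms) if rms else 0.0,
    }


# Usage: 0.1 s of silence followed by 0.1 s of constant signal at 8 kHz.
turn = [0.0] * 800 + [0.5] * 800
features = prosodic_features(turn, sample_rate=8000)
```

Feature dictionaries of this kind, one per turn, would then be combined with the other feature types (ASR, history, lexical) and passed to a learned classifier of turn classes.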

Example with Annotations
S: Hi, this is AT&T Amtrak schedule system. This is TOOT. How may I help you?
U: I'd like to find a train from Ch- to Chicago from Baltimore. MISRECOGNITION
S: Sorry, I can't understand you. Please repeat your utterance.
U: I'd like to find a train from Baltimore to Chicago. AWARE, CORRECTION

Implications for Tutorial Dialogue
Hypothesis
– Similar features will prove useful for predicting and adapting to problematic pedagogical situations
Challenge
– What pedagogical situations should (and can) be annotated?
Starting Points
– “Tutorial-level misrecognitions” (e.g., a correct student answer is understood to be incorrect by the system)
– Problematic affective states (e.g., confusion, boredom, anger, frustration [Evens 2002])
– Off-talk, subjective language
– Initiative (e.g., papers in this session)
– Dialogue acts / discourse structure

Emotion and Prosody (Shriberg et al. 2001)
Neutral
– July 30
– Yes
Disappointed/tired
– No
Amused/surprised
– No
Annoyed
– Yes
– Late morning (HYP)
Frustrated
– Yes
– No
– No, I am … (HYP)
– There is no Manila

Status
About to start a “wizard” study to collect a corpus of human-human, spoken tutoring dialogues
Implementation of human-computer system in progress
Lots of opportunities for manual and automated dialogue annotation, analysis via machine learning, and incorporation of insights back into the system

Summary
Adding spoken dialogue to tutoring systems provides both opportunities and challenges
Expected Contributions
– Empirical comparisons with text-based tutoring dialogue systems
– Annotation schemes for dialogue states of potential pedagogical interest
– Use of prosodic and other features to predict such states
– Exploitation of such predictions by the tutoring system

Example with Annotations
S: Hi, this is AT&T Amtrak schedule system. This is TOOT. How may I help you?
U: I'd like to find a train from Ch- to Chicago from Baltimore. MISRECOGNITION
S: Sorry, I can't understand you. Please repeat your utterance.
U: I'd like to find a train from Baltimore to Chicago. MISRECOGNITION, AWARE, CORRECTION
S: I heard you say “leave today around 5 thirty in the morning.” Which city do you want to go to?
U: I would like to leave Saturday at eight in the morning. MISRECOGNITION, AWARE, CORRECTION
S: You can say the arrival city name, such as “New York City.”
U: Baltimore to Chicago. AWARE, CORRECTION
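
One way to make turn-level annotations like those in the example above machine-readable is to pair each user turn with its set of labels. The Python sketch below does this for the four user turns of the example and tallies how often each label occurs. The variable and function names are hypothetical; the slides do not specify a storage format.

```python
from collections import Counter

# User turns and labels transcribed from the annotated TOOT example above.
# The (text, labels) tuple layout is an assumption made for illustration.
ANNOTATED_USER_TURNS = [
    ("I'd like to find a train from Ch- to Chicago from Baltimore.",
     {"MISRECOGNITION"}),
    ("I'd like to find a train from Baltimore to Chicago.",
     {"MISRECOGNITION", "AWARE", "CORRECTION"}),
    ("I would like to leave Saturday at eight in the morning.",
     {"MISRECOGNITION", "AWARE", "CORRECTION"}),
    ("Baltimore to Chicago.",
     {"AWARE", "CORRECTION"}),
]


def label_counts(turns):
    """Count how many turns carry each annotation label."""
    counts = Counter()
    for _text, labels in turns:
        counts.update(labels)
    return counts


counts = label_counts(ANNOTATED_USER_TURNS)
```

Pairing each labeled turn with automatically computed features would give the training examples needed to learn predictors of these turn classes.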