Dialogue-Learning Correlations in Spoken Dialogue Tutoring Kate Forbes-Riley, Diane Litman, Alison Huettner, and Arthur Ward Learning Research and Development Center University of Pittsburgh Pittsburgh, PA USA
Outline Introduction Dialogue Data and Coding Correlations with Learning Current Directions and Summary
Motivation An empirical basis for optimizing dialogue behaviors in spoken tutorial dialogue systems What aspects of dialogue correlate with learning? Student behaviors Tutor behaviors Interacting student and tutor behaviors Do correlations generalize across tutoring situations? Human-human tutoring Human-computer tutoring
Approach Initial: learning correlations with superficial dialogue characteristics [Litman et al., Intelligent Tutoring Systems Conf., 2004] Easy to compute automatically and in real-time, but… Correlations in the literature did not generalize to our spoken or human-computer corpora Results were difficult to interpret e.g., do longer student turns contain more explanations? Current: learning correlations with deeper “dialogue act” codings
Back-end is Why2-Atlas system (VanLehn et al., 2002) Sphinx2 speech recognition and Cepstral text-to-speech
Two Spoken Tutoring Corpora Human-Human Corpus 14 students 128 physics problems (dialogues) 5948 student turns, 5505 tutor turns Computer-Human Corpus 20 students 100 physics problems (dialogues) 2445 student turns, 2967 tutor turns
Dialogue Acts Dialogue Acts represent intentions behind utterances used in prior studies of correlations with learning e.g., tutor acts in AutoTutor (Jackson et al., 2004), dialogue acts in human tutoring (Chi et al., 2001) ITSPOKE Study Student and tutor dialogue acts Human and computer tutoring Spoken input and output
Tagset (1): (Graesser and Person, 1994) • Tutor and Student Question Acts Short Answer Question: basic quantitative relationships Long Answer Question: definition/interpretation of concepts Deep Answer Question: reasoning about causes/effects
Tagset (2): inspired by (Graesser et al., 1995) • Tutor Feedback Acts Positive Feedback: overt positive response Negative Feedback: overt negative response • Tutor State Acts Restatement: repetitions and rewordings Recap: restating earlier-established points Request/Directive: directions for argument Bottom Out: complete answer after problematic response Hint: partial answer after problematic response Expansion: novel details
Tagset (3): inspired by (Chi et al., 2001) • Student Answer Acts Deep Answer: at least 2 concepts with reasoning Novel/Single Answer: one new concept Shallow Answer: one given concept Assertion: answers such as “I don’t know” • Tutor and Student Non-Substantive Acts: do not contribute to physics discussion
Correlations with Learning For each student, and each student and tutor dialogue act tag, compute Tag Total: number of turns containing the tag Tag Percentage: (tag total) / (turn total) Tag Ratio: (tag total) / (turns containing tag of that type) Correlate measures with posttest, after regressing out pretest
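As a concrete illustration of these measures (not the authors' actual analysis code), the sketch below computes Tag Total, Tag Percentage, and Tag Ratio per student, and correlates a measure with posttest after regressing pretest out of posttest; the column names ('student', 'tags', 'types') and function names are illustrative assumptions.

```python
# Illustrative only: per-student dialogue act measures and their correlation
# with posttest after regressing out pretest. Column and function names are
# assumptions, not part of the ITSPOKE code base.
import numpy as np
import pandas as pd
from scipy import stats

def tag_measures(turns: pd.DataFrame, tag: str, tag_type: str) -> pd.DataFrame:
    """turns: one row per turn, with columns 'student', 'tags' (set of dialogue
    act labels on the turn) and 'types' (set of coarse act types, e.g. 'Answer')."""
    rows = []
    for student, grp in turns.groupby("student"):
        total = grp["tags"].apply(lambda t: tag in t).sum()            # Tag Total
        pct = total / len(grp)                                         # Tag Percentage
        same_type = grp["types"].apply(lambda t: tag_type in t).sum()
        ratio = total / same_type if same_type else np.nan             # Tag Ratio
        rows.append({"student": student, "total": total, "pct": pct, "ratio": ratio})
    return pd.DataFrame(rows).set_index("student")

def corr_with_posttest(measure, pretest, posttest):
    """Pearson correlation of a measure with posttest residuals after
    regressing posttest on pretest; returns (R, p)."""
    slope, intercept, *_ = stats.linregress(pretest, posttest)
    residuals = np.asarray(posttest) - (slope * np.asarray(pretest) + intercept)
    return stats.pearsonr(measure, residuals)
```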
Human-Computer Results (20 students)
Student Dialogue Acts                Mean     R      p
# Deep Answer                        11.90    .48    .04
Human-Computer Results (continued)
Tutor Dialogue Acts                  Mean     R      p
# Deep Answer Question               9.59     .41    .08
% Deep Answer Question               6.27%    .45    .05
% Question Act                       76.89%   .57    .01
(Short Answer Question)/Question     .88     -.47    .04
(Deep Answer Question)/Question               .42    .07
# Positive Feedback                  76.10    .38    .10
Discussion Computer Tutoring: knowledge construction Positive correlations Student answers displaying reasoning Tutor questions requiring reasoning
Human-Human Results (14 students)
Student Dialogue Acts                Mean     R      p
# Novel/Single Answer                19.29    .49    .09
# Deep Answer                        68.50   -.49
(Novel/Single Answer)/Answer         .14      .47    .10
(Short Answer Question)/Question     .91      .56    .05
(Long Answer Question)/Question      .03     -.57    .04
Human-Human Results (continued)
Tutor Dialogue Acts                  Mean     R      p
# Request/Directive                  19.86   -.71    .01
% Request/Directive                  5.65%   -.61    .03
# Restatement                        79.14   -.56    .05
# Negative Feedback                  14.50   -.60
Discussion Human Tutoring: more complex Positive correlations Student utterances introducing a new concept Mostly negative correlations Student attempts at deeper reasoning Tutor attempts to direct the dialogue Despite mostly negative correlations, students are learning!
Current Directions “Correctness” annotation Are more Deep Answers “correct” in the human-computer corpus? Do correct answers positively correlate with learning? Beyond the turn level Learning correlations with dialogue act sequences Computation and use of hierarchical discourse structure
Current Directions (continued) Online dialogue act annotation during computer tutoring Tutor acts can be authored Student acts need to be recognized Other types of learning correlations speech recognition and text-to-speech performance student affect and attitude
Summary Many dialogue act correlations positive correlations with deep reasoning and questioning in computer tutoring correlations in human tutoring more complex student, tutor (and interactive) perspectives all useful Stay tuned … New dialogue act patterns and “correctness” analysis
Thank You! Questions? Further information: http://www.cs.pitt.edu/~litman/itspoke.html
Annotated Human-Human Excerpt T: Which one will be faster? [Short Answer Question] S: The feathers. [Novel/Single Answer] T: The feathers - why? [Restatement, Deep Answer Question] S: Because there’s less matter. [Deep Answer] All turns in both corpora were manually coded for dialogue acts (Kappa > .6)
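A minimal sketch of the agreement statistic behind the "Kappa > .6" figure, assuming two annotators assign one dialogue act label per turn; the helper name and input format are illustrative, not the project's tooling.

```python
# Illustrative Cohen's kappa for per-turn dialogue act labels (not the
# authors' annotation tools); labels1/labels2 are the two annotators' tags,
# aligned turn by turn.
from collections import Counter

def cohens_kappa(labels1, labels2):
    assert len(labels1) == len(labels2)
    n = len(labels1)
    observed = sum(a == b for a, b in zip(labels1, labels2)) / n       # raw agreement
    c1, c2 = Counter(labels1), Counter(labels2)
    expected = sum(c1[k] * c2[k] for k in c1) / (n * n)                # chance agreement
    return (observed - expected) / (1 - expected)
```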
Bigram Results Many bigrams incorporate, as either the first or second element, a dialogue act corresponding to one of the unigram results, e.g. [Student Deep Answer – Tutor Deep Answer Question] [Tutor Recap - Student Deep Answer] Other dialogue acts only correlate with learning as part of a larger dialogue pattern, e.g. [Student Shallow Answer - Tutor Restatement] [Tutor Restatement – Student Shallow Answer]
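For illustration, a minimal sketch of how adjacent dialogue act bigrams like those above could be counted for one dialogue; the (speaker, tag) representation and function name are assumptions, not the authors' code. The per-student bigram counts can then be fed to the same posttest correlation procedure used for the unigram measures.

```python
# Sketch of dialogue act bigram counting (illustrative only):
# acts is a list of (speaker, tag) pairs in the order they occur in one dialogue.
from collections import Counter

def act_bigrams(acts):
    """Count adjacent dialogue act pairs, e.g.
    ('Tutor Recap', 'Student Deep Answer')."""
    labels = [f"{speaker} {tag}" for speaker, tag in acts]
    return Counter(zip(labels, labels[1:]))
```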
Architecture [system diagram] ITSPOKE wraps the Why2 back-end: a www browser and www server front end; the Why2 Text Manager (java/xml); Essay Analysis (Carmel, Tacitus-lite+); Speech Analysis (Sphinx); the Content Dialogue Manager (Ape, Carmel); a Spoken Dialogue Manager; and Cepstral text-to-speech. Data exchanged includes student text and essays (xml), tutorial and repair goals, dialogue text, and tutor turns (xml).
Speech Recognition: Sphinx2 (CMU) Probabilistic language models for different dialogue states Initial training data typed student utterances from Why2-Atlas corpora Later training data spoken utterances obtained during development and pilot testing of ITSPOKE Total vocabulary 1240 unique words “Semantic Accuracy” Rate = 92.4%
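"Semantic accuracy" is typically scored by asking whether the recognized string leads to the same interpretation as the human transcript; below is a hedged sketch under that assumption (the interpret callable stands in for the dialogue manager's semantic analysis and is not a real ITSPOKE API).

```python
# Rough sketch of a semantic accuracy score (assumption: a recognized
# utterance counts as correct if the system's interpretation of the ASR
# hypothesis matches its interpretation of the human transcript).
def semantic_accuracy(pairs, interpret):
    """pairs: list of (asr_hypothesis, human_transcript) strings."""
    correct = sum(interpret(hyp) == interpret(ref) for hyp, ref in pairs)
    return correct / len(pairs)
```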
Common Experimental Aspects Students take a physics pretest Students read background material Students use web interface to work through up to 10 problems with either a computer or a human tutor Students take a posttest 40 multiple choice questions, isomorphic to pretest
ITSPOKE Corpora Comparison
Human-Computer excerpt (…3.5 minutes into session…)
Tutor: I didn't understand. What force acts on all objects within close proximity to earth?
Student: uh the force of gravity
Tutor: Fine. Besides the Earth's gravitational force, are there any other forces acting on the person, keys, and elevator after the release?
Student: no
Tutor: Yep. Thus, keep in mind that after the release, the only force on the person, keys, and elevator is the force of gravity. If gravity is the only force acting on an object, will it be moving or staying still?
Student: it will be moving
Tutor: Fine. Not only are the person, keys, and elevator moving, they have only gravitational forces on them. When an object is falling and has only gravitational force on it, it is said to be in what?
Human-Human excerpt (…1.3 minutes into session…)
Student: Can I ask you questions?
Tutor: Yes
Student: Is that ok?
Tutor: Mm-hm
Student: Um from what we learned from the last problem they're gonna have the same- the keys and you have the same acceleration right?
Student: So that means they'll both hit the bottom at the same time. But I don't understand where- I don't understand if you're in the elevator-
Tutor: You see
Student: Where are you going to-?
Tutor: The uh let me uh the key uh- the person holds the key in front of-
[Audio excerpts: S11p4hhtranscript.wav (human-human), p2-s2-58hcex.wav (human-computer)]