
1 Learning the Structure of Task-Oriented Conversations from the Corpus of In-Domain Dialogs. Ph.D. Thesis Defense, Ananlada Chotimongkol, Carnegie Mellon University, 18 December 2007. Thesis Committee: Alexander Rudnicky (Chair), William Cohen, Carolyn Penstein Rosé, Gokhan Tur (SRI International)

2 Outline: Introduction; Structure of task-oriented conversations; Machine learning approaches; Conclusion

3 A spoken dialog system. Architecture diagram: Speech Recognizer -> Natural Language Understanding -> Dialog Manager -> Natural Language Generator -> Speech Synthesizer, with the Dialog Manager drawing on Domain Knowledge (tasks, steps, domain keywords). Example exchange: User: "I would like to fly to Seattle tomorrow." System: "When would you like to leave?"
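To make the data flow concrete, here is a minimal sketch of one user turn through this pipeline. Every class and method name is an illustrative placeholder, not code from the thesis or from any particular toolkit.

```python
# A minimal sketch of one user turn flowing through the pipeline above.
# All names here are illustrative placeholders.

def dialog_turn(audio, recognizer, nlu, manager, nlg, synthesizer):
    text = recognizer.transcribe(audio)      # "I would like to fly to Seattle tomorrow."
    semantics = nlu.parse(text)              # e.g. {"ArriveCity": "Seattle", "DepartDate": "tomorrow"}
    # The dialog manager consults the domain knowledge (tasks, steps,
    # domain keywords) to pick the next action, e.g. request a missing slot.
    action = manager.next_action(semantics)
    reply = nlg.generate(action)             # "When would you like to leave?"
    return synthesizer.speak(reply)
```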

4 Problems in acquiring domain knowledge. Deriving the Domain Knowledge (tasks, steps, domain keywords) from example dialogs by hand has several problems: it requires domain expertise, is subjective, may miss some cases (Yankelovich, 1997), and is time consuming (Bangalore et al., 2006).

5 Task-oriented dialog
Client: I'D LIKE TO FLY TO HOUSTON TEXAS
Agent: AND DEPARTING PITTSBURGH ON WHAT DATE?
Client: DEPARTING ON FEBRUARY TWENTIETH... [step 1: reserve a flight]
Agent: DO YOU NEED A CAR?
Client: YEAH
Agent: THE LEAST EXPENSIVE RATE I HAVE WOULD BE WITH THRIFTY RENTAL CAR FOR TWENTY THREE NINETY A DAY
Client: OKAY
Agent: WOULD YOU LIKE ME TO BOOK THAT CAR FOR YOU?
Client: YES... [step 2: reserve a car]
Agent: OKAY AND WOULD YOU NEED A HOTEL WHILE YOU'RE IN HOUSTON?
Client: YES
Agent: AND WHERE AT IN HOUSTON?
Client: /UM/ DOWNTOWN
Agent: OKAY
Agent: DID YOU HAVE A HOTEL PREFERENCE?... [step 3: reserve a hotel]
The dialog has an observable structure (steps) that reflects domain information. Observable -> learnable?

6 Proposed solution. Learn the Domain Knowledge (tasks, steps, domain keywords) from example dialogs, use it to build the dialog system, and have a human revise the learned knowledge.

7 Learning system output. Input: air travel dialogs. Output Domain Knowledge: task = create a travel itinerary; steps = reserve a flight, reserve a hotel, reserve a car; keywords = airline, city name, date.

8 Thesis statement. Investigate how to infer domain-specific information required to build a task-oriented dialog system from a corpus of in-domain conversations through an unsupervised learning approach.

9 Thesis scope (1). What to learn: domain-specific information in a task-oriented dialog: a list of tasks and their decompositions (travel reservation: flight, car, hotel), and domain keywords (airline, city name, date).

10 Thesis scope (2). Resources: a corpus of in-domain conversations; recorded human-human conversations are already available.

11 Thesis scope (3). Learning approach: unsupervised learning, since no training data is available for a new domain and annotating data is time consuming.

12 Proposed approach. Two research problems: 1. Specify a suitable domain-specific information representation. 2. Develop a learning approach that infers the domain information captured by this representation from human-human dialogs.

13 Outline: Introduction; Structure of task-oriented conversations (properties of a suitable dialog structure, form-based dialog structure representation, evaluation); Machine learning approaches; Conclusion

14 Properties of a desired dialog structure. Sufficiency: captures all domain-specific information required to build a task-oriented dialog system. Generality (domain independence): able to describe task-oriented dialogs in dissimilar domains and of dissimilar types. Learnability: can be identified by an unsupervised machine learning algorithm.

15 Domain-specific information in task-oriented dialogs. A list of tasks and their decompositions (e.g., travel reservation = flight + car + hotel) -> a compositional structure of a dialog based on the characteristics of a task. Domain keywords (e.g., airline, city name, date) -> the actual content of a dialog.

16 Existing discourse structures, rated on the three criteria (sufficiency / generality / learnability):
- Segmented Discourse Representation Theory (Asher, 1993): sufficiency limited (focuses on meaning, not actual entities); generality ?; learnability ?
- Grosz and Sidner's theory (Grosz and Sidner, 1986): sufficiency limited (doesn't model domain keywords); learnability: unsupervised?
- DAMSL extension (Hardy et al., 2003): sufficiency limited (doesn't model a compositional structure); generality ?; learnability: unsupervised?
- A plan-based model (Cohen and Perrault, 1979): learnability: unsupervised?

17 Form-based dialog structure representation. Based on the notion of a form (Ferrieux and Sadek, 1994), the data representation used in the form-based dialog system architecture. It focuses only on concrete information that can be observed directly in in-domain conversations.

18 Form-based representation components. Consists of 3 components: 1. Task, 2. Sub-task, 3. Concept.

19 Form-based representation components. 1. Task: a subset of a dialog that has a specific goal, e.g., make a travel reservation (the whole air-travel dialog from slide 5).

20 Form-based representation components. 2. Sub-task: a step in a task that contributes toward the goal and contains sufficient information to execute a domain action. In the slide-5 dialog: reserve a flight, reserve a car, reserve a hotel.

21 Form-based representation components. 3. Concept (domain keywords): a piece of information required to perform an action, e.g., HOUSTON, FEBRUARY TWENTIETH, and DOWNTOWN in the slide-5 dialog.

22 Data representation. Represented by a form: a repository of related pieces of information necessary for performing an action.

23 Data representation. Form = a repository of related pieces of information. A sub-task contains sufficient information to execute a domain action -> a form. Example: the reserve-a-flight sub-task of the slide-5 dialog maps to Form: flight query.

24 Data representation. Form = a repository of related pieces of information. A task is a subset of a dialog that has a specific goal -> a set of forms. Example: the slide-5 dialog maps to Form: flight query, Form: car query, Form: hotel query.

25 Data representation. Form = a repository of related pieces of information. A concept is a piece of information required to perform an action -> a slot. Example from the slide-5 dialog: Form: flight query, with slots DepartCity: Pittsburgh, ArriveCity: Houston, ArriveState: Texas, DepartDate: February twentieth.
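To make the mapping concrete, here is a minimal sketch of a form realized as a data structure, using the slot names from this slide; the classes and methods are illustrative, not the thesis's actual implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# A form is a named repository of slots (concepts); a task corresponds
# to a set of such forms. Illustrative sketch only.

@dataclass
class Form:
    name: str                                    # e.g. "flight query"
    slots: Dict[str, Optional[str]] = field(default_factory=dict)

    def fill(self, concept: str, value: str) -> None:
        self.slots[concept] = value

    def is_complete(self, required: List[str]) -> bool:
        return all(self.slots.get(c) for c in required)

flight = Form("flight query")
flight.fill("DepartCity", "Pittsburgh")
flight.fill("ArriveCity", "Houston")
flight.fill("DepartDate", "February twentieth")
print(flight.is_complete(["DepartCity", "ArriveCity", "DepartDate"]))  # True
```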

26 Form-based representation properties. Sufficiency: the form is already used in form-based dialog systems, e.g., the Philips train timetable system (Aust et al., 1995) and the CMU Communicator system (Rudnicky et al., 1999). Generality (domain independence): a broader interpretation of the form is provided, verified by the analysis of six dissimilar domains. Learnability: components are observable directly in a dialog, measured by annotation scheme reliability (human) and by the accuracy of the domain information learned by the proposed approaches (machine).

27 Outline: Introduction; Structure of task-oriented conversations (properties of a suitable dialog structure, form-based dialog structure representation, evaluation: dialog structure analysis for generality, annotation experiment for human learnability); Machine learning approaches; Conclusion

28 Dialog structure analysis. Goal: verify that the form-based representation can be applied to dissimilar domains. Approach: analyze 6 task-oriented domains: air travel planning (information-accessing task); bus schedule inquiry (information-accessing task); map reading (problem-solving task); UAV flight simulation (command-and-control task); meeting (personnel resource management); tutoring (physics essay revising).

29 Map reading domain: a route giver and a route follower, each with a map (figure).

30 Map reading domain (problem-solving task). Task: draw a route on a map. Sub-task: draw a segment of a route; concepts: StartLocation = {White_Mountain, Machete, ...}, Direction = {down, left, ...}, Distance = {a couple of centimeters, an inch, ...}. Sub-task: ground a landmark; concepts: LandmarkName = {White_Mountain, Machete, ...}, Location = {below the start, ...}.

31 Dialog structure analysis (map reading domain)
GIVER 1: okay... ehm... right, you have the start?
FOLLOWER 2: yeah. (action: (implicit) define_a_landmark)
GIVER 3: right, below the start do you have... er like a missionary camp?
FOLLOWER 4: yeah. (action: define_a_landmark) [Form: grounding. LandmarkName: missionary camp; Location: below the start]
GIVER 5: okay, well... if you take it from the start just run... horizontally.
FOLLOWER 6: uh-huh.
GIVER 7: eh to the left for about an inch.
FOLLOWER 8: right. (action: draw_a_segment) [Form: segment description. StartLocation: start; Direction: left; Distance: an inch; Path: (empty); EndLocation: (empty)]
GIVER 9: and then go down along the side of the missionary camp.
FOLLOWER 10: uh-huh.
GIVER 11: 'til you're about an inch... above the bottom of the map.
FOLLOWER 12: right.
GIVER 13: then you need to go straight along for about 'til about...

32 UAV flight simulation domain (command-and-control task). Task: take photos of the targets. Sub-task: take a photo of each target. Sub-subtask: control a plane; concepts: Altitude = {2700, 3300, ...}, Speed = {50 knots, 200 knots, ...}, Destination = {H-area, SSTE, ...}. Sub-subtask: ground a landmark; concepts: LandmarkName = {H-area, SSTE, ...}, LandmarkType = {target, waypoint}.

33 Meeting domain. Task: manage resources for a new employee. Sub-task: get a computer; concepts: Type = {desktop, laptop, ...}, Brand = {IBM, Dell, ...}. Sub-task: get office space. Sub-task: create an action item; concepts: Description = {have a space, ...}, Person = {Hardware Expert, Building Expert, ...}, StartDate = {today, ...}, EndDate = {the fourteenth of December, ...}.

34 Characteristics of the form-based representation. It focuses only on concrete information that is observable directly in in-domain conversations, and it describes a dialog with a simple model. Pros: it can be learned by an unsupervised learning approach. Cons: it can't capture information that is not clearly expressed in a dialog (e.g., omitted concept values); nevertheless, 93% of dialog content can be accounted for. It also can't model a complex dialog that has a dynamic structure (e.g., the tutoring domain), but it is good enough for many real-world applications.

35 Form-based representation properties (revisited). Sufficiency: the form is already used in form-based dialog systems; can account for 93% of dialog content. Generality (domain independence): a broader interpretation of the form representation is provided; can represent 5 out of 6 disparate domains. Learnability: components are observable directly in a dialog; measured by annotation scheme reliability (human) and by the accuracy of the domain information learned by the proposed approaches (machine).

36 Annotation experiment. Goal: verify that the form-based representation can be understood and applied by other annotators. Approach: conduct an annotation experiment with non-expert annotators. Evaluation: similarity between annotations; accuracy of annotations.

37 Challenges in annotation comparison. Different tagsets may be used, since annotators have to design their own tagsets; some differences are acceptable if they conform to the guideline. Moreover, different dialog structure designs can generate dialog systems with the same functionalities (Annotator 1 and Annotator 2 may structure the same dialog differently yet both be right).

38 Cross-annotator correction. Each annotator creates his/her own tagset and then annotates the dialogs. Because the tagsets differ, the original annotations cannot be compared directly; instead, each annotator critiques and corrects another annotator's work, and the original annotation of a dialog is compared with its corrected version (cross-annotator comparison).

39 Annotation experiment. 2 domains: air travel planning (information-accessing task) and map reading (problem-solving task). 4 subjects in each domain: people who are likely to use the form-based representation in the future. Each subject has to design a tagset and annotate the structure of dialogs, then critique the other subjects' annotations on the same set of dialogs.

40 Evaluation metrics. Annotation similarity: acceptability is the degree to which an original annotation is acceptable to a corrector. Annotation accuracy: accuracy is the degree to which a subject's annotation is acceptable to an expert.

41 Annotation results. High acceptability and accuracy, except task/sub-task accuracy in the map reading domain.
Concept annotation: acceptability 0.96 (air travel), 0.95 (map reading); accuracy 0.97 (air travel), 0.89 (map reading).
Task/sub-task annotation: acceptability 0.81 (air travel), 0.84 (map reading); accuracy 0.90 (air travel), 0.65 (map reading).
Concepts can be annotated more reliably than tasks and sub-tasks: they are smaller units, whereas tasks and sub-tasks have to be communicated clearly.

42 Form-based representation properties (revisited). Sufficiency: the form is already used in form-based dialog systems; can account for 93% of dialog content. Generality (domain independence): a broader interpretation of the form representation is provided; can represent 5 out of 6 disparate domains. Learnability: components are observable directly in a dialog; can be applied reliably by other annotators in most cases; (machine) measured by the accuracy of the domain information learned by the proposed approaches.

43 Outline: Introduction; Structure of task-oriented conversations; Machine learning approaches; Conclusion

44 Overview of learning approaches. Divide the problem into 2 sub-problems: 1. Concept identification: what are the concepts, and what are their members? 2. Form identification: what are the forms, and what are the slots (concepts) in each form? Use unsupervised learning approaches; this is an acquisition (not recognition) problem.

45 Learning example. From the slide-5 air-travel dialog, the system should learn:
Form: flight query. DepartCity: Pittsburgh; ArriveCity: Houston; ArriveState: Texas; ArriveAirport: Intercontinental
Form: car query. Pick up location: Houston; Pickup Time: (empty); Return Time: (empty)
Form: hotel query. City: Houston; Area: Downtown; HotelName: (empty)

46 Outline: Introduction; Structure of task-oriented conversations; Machine learning approaches (concept identification, form identification); Conclusion

47 Concept identification. Goal: identify domain concepts and their members, e.g., City = {Pittsburgh, Boston, Austin, ...}, Month = {January, February, March, ...}. Approach: a word clustering algorithm that identifies concept words and groups similar ones into the same cluster.

48 Word clustering algorithms. Use word co-occurrence statistics: mutual information (MI-based) or Kullback-Leibler distance (KL-based). These iterative algorithms need a stopping criterion, based on information that is available during the clustering process: mutual information (MI-based), distance between clusters (KL-based), or the number of clusters.
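The MI-based variant can be sketched as a greedy agglomerative procedure: merge the pair of clusters whose merge loses the least mutual information of the cluster-bigram model, and stop when the cheapest merge would lose too much. The toy implementation below recomputes MI from scratch and only scales to tiny vocabularies; it illustrates the idea under those assumptions rather than reproducing the thesis's algorithm.

```python
import math
from collections import Counter
from itertools import combinations

def cluster_bigram_mi(tokens, assign):
    """Average mutual information of the cluster-bigram distribution."""
    pairs = Counter((assign[a], assign[b]) for a, b in zip(tokens, tokens[1:]))
    units = Counter(assign[t] for t in tokens)
    n_pairs, n_units = sum(pairs.values()), sum(units.values())
    mi = 0.0
    for (c1, c2), n in pairs.items():
        p12 = n / n_pairs
        mi += p12 * math.log(p12 / ((units[c1] / n_units) * (units[c2] / n_units)))
    return mi

def mi_cluster(tokens, max_mi_loss=0.05):
    assign = {w: w for w in set(tokens)}            # one cluster per word
    while len(set(assign.values())) > 1:
        base = cluster_bigram_mi(tokens, assign)
        best_loss, best_assign = None, None
        for c1, c2 in combinations(sorted(set(assign.values())), 2):
            trial = {w: (c1 if c == c2 else c) for w, c in assign.items()}
            loss = base - cluster_bigram_mi(tokens, trial)
            if best_loss is None or loss < best_loss:
                best_loss, best_assign = loss, trial
        if best_loss > max_mi_loss:                 # stopping criterion: MI drop too large
            break
        assign = best_assign
    return assign

words = "fly to boston fly to austin drive to boston".split()
print(mi_cluster(words))    # words in similar contexts tend to share a cluster
```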

49 Clustering evaluation. Allow more than one cluster to represent a concept, to discover as many concept words as possible; however, a clustering result that doesn't contain split concepts is preferred. Quality score (QS) = the harmonic mean of precision (purity), recall (completeness), and singularity score (SS); SS is defined per concept (formula shown on the slide).
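Computing QS itself is straightforward once the three component scores are known; the sketch below takes SS as a precomputed input, since its per-concept formula appears only as an image in the transcript. As a sanity check, the harmonic mean of the MI-based row on the next slide (precision 0.78, recall 0.43, SS 0.77) reproduces its reported QS of 0.61.

```python
# Sketch of the cluster-quality score: the harmonic mean of precision
# (purity), recall (completeness), and singularity score (SS). SS is
# taken as a precomputed input here.

def quality_score(precision, recall, ss):
    parts = (precision, recall, ss)
    if not all(parts):
        return 0.0
    return len(parts) / sum(1.0 / p for p in parts)

print(round(quality_score(0.78, 0.43, 0.77), 2))  # 0.61, matching the MI-based row
```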

50 Concept clustering results.
MI-based: precision 0.78, recall 0.43, SS 0.77, QS 0.61, MaxQS 0.68.
KL-based: precision 0.86, recall 0.60, SS 0.70, QS 0.70, MaxQS 0.71.
Domain concepts can be identified with acceptable accuracy. Example clusters: {GATWICK, CINCINNATI, PHILADELPHIA, L.A., ATLANTA}, {HERTZ, BUDGET, THRIFTY}. Recall is low for infrequent concepts. An automatic stopping criterion yields close to optimal results.

51 Outline: Introduction; Structure of task-oriented conversations; Machine learning approaches (concept identification, form identification); Conclusion

52 Form identification. Goal: determine the different types of forms and their associated slots. Approach: 1. Segment a dialog into a sequence of sub-tasks (dialog segmentation). 2. Group the sub-tasks associated with the same form type into a cluster (sub-task clustering). 3. Identify the set of slots associated with each form type (slot extraction).

53 Step 1: dialog segmentation. Goal: segment a dialog into a sequence of sub-tasks, which is equivalent to identifying sub-task boundaries. Approaches: the TextTiling algorithm (Hearst, 1997), based on the lexical cohesion assumption (local context); and an HMM-based segmentation algorithm based on recurring patterns (global context), where HMM states = topics (sub-tasks), transition probabilities = probabilities of topic shifts, and emission probabilities = state-specific language models.
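A compact sketch of the TextTiling idea under the lexical cohesion assumption: compare bag-of-words blocks on either side of each candidate boundary and place boundaries where similarity dips deeply. The parameter values and the simplified depth score are illustrative; the thesis further augments this baseline (see slide 55).

```python
import math
from collections import Counter

def cosine(c1, c2):
    dot = sum(c1[w] * c2[w] for w in c1 if w in c2)
    norm = math.sqrt(sum(v * v for v in c1.values())) * \
           math.sqrt(sum(v * v for v in c2.values()))
    return dot / norm if norm else 0.0

def texttile(utterances, block=4, threshold=0.1):
    """Return indices i such that a sub-task boundary falls before utterances[i]."""
    bags = [Counter(u.lower().split()) for u in utterances]
    sims = []
    for gap in range(1, len(bags)):          # sims[gap-1]: cohesion across this gap
        left = sum(bags[max(0, gap - block):gap], Counter())
        right = sum(bags[gap:gap + block], Counter())
        sims.append(cosine(left, right))
    boundaries = []
    for i, s in enumerate(sims):
        # depth score: how far this valley sits below its highest
        # neighboring peaks on both sides
        depth = (max(sims[:i + 1]) - s) + (max(sims[i:]) - s)
        if depth > threshold:
            boundaries.append(i + 1)
    return boundaries
```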

54 Modeling HMM states. HMM states = topics (sub-tasks). They can be induced by clustering reference topics (Tür et al., 2001), but that needs annotated data; or with an utterance-based HMM (Barzilay and Lee, 2004), but some utterances are very short. Here, states are induced by clustering the segments predicted by TextTiling.
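The HMM view can be sketched as Viterbi decoding over utterances with one unigram language model per topic state; a sub-task boundary falls wherever the decoded state changes. The state language models are assumed precomputed here (in the thesis they are induced by clustering TextTiling segments), and the uniform topic-shift probability is an illustrative choice.

```python
import math

def viterbi_segment(utterances, state_lms, switch_prob=0.1, floor=1e-6):
    """state_lms: one dict (word -> probability) per topic state."""
    n = len(state_lms)
    stay = math.log(1.0 - switch_prob)
    switch = math.log(switch_prob / max(1, n - 1))

    def emit(s, utt):
        return sum(math.log(state_lms[s].get(w, floor)) for w in utt.split())

    score = [emit(s, utterances[0]) for s in range(n)]
    back = []
    for utt in utterances[1:]:
        prev, score, ptr = score, [], []
        for s in range(n):
            best, arg = max((prev[p] + (stay if p == s else switch), p)
                            for p in range(n))
            score.append(best + emit(s, utt))
            ptr.append(arg)
        back.append(ptr)
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):               # trace back the best state sequence
        state = ptr[state]
        path.append(state)
    return list(reversed(path))

lms = [{"flight": 0.5, "depart": 0.5}, {"car": 0.5, "rate": 0.5}]
print(viterbi_segment(["flight depart", "depart flight", "car rate"], lms))
# [0, 0, 1]: the state change marks a sub-task boundary
```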

55 Modifications for fine-grained segments in spoken dialogs. Average segment length: air travel domain = 84 words, map reading domain = 55 words (vs. WSJ = 428, Broadcast News = 996). Modifications include a data-driven stop word list that reflects the characteristics of spoken dialogs, and a distance weight that gives higher weight to context closer to the candidate boundary.

56 Dialog segmentation experiment. Evaluation metrics: Pk (Beeferman et al., 1999), a probabilistic error metric that is sensitive to the value of k; and concept-based F-measure (C. F-1), where F-measure (F-1) is the harmonic mean of precision and recall, and a near miss counts as a match if there is no concept in between. Concept information is incorporated into the word token representation as a concept label plus its value (e.g., [Airline]:northwest) or as a concept label alone (e.g., [Airline]).
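A sketch of the Pk computation: slide a window of width k over the positions and count how often the reference and the hypothesis disagree about whether the two window endpoints lie in the same segment; k is conventionally half the mean reference segment length.

```python
def pk(reference, hypothesis, k=None):
    """reference/hypothesis: a segment label per position, e.g. [0,0,1,1,2]."""
    n = len(reference)
    if k is None:                       # half the mean reference segment length
        k = max(1, round(n / (len(set(reference)) * 2)))
    errors = sum(
        (reference[i] == reference[i + k]) != (hypothesis[i] == hypothesis[i + k])
        for i in range(n - k)
    )
    return errors / (n - k)

ref = [0] * 5 + [1] * 5
hyp = [0] * 3 + [1] * 7                 # boundary placed two positions early
print(round(pk(ref, hyp), 2))           # 0.5
```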

57 TextTiling results. Augmented TextTiling is significantly better than the baseline.
Air travel: TextTiling (baseline) Pk 0.387, C. F-1 0.621; TextTiling (augmented) Pk 0.371, C. F-1 0.712.
Map reading: TextTiling (baseline) Pk 0.412, C. F-1 0.396; TextTiling (augmented) Pk 0.384, C. F-1 0.464.

58 HMM-based segmentation results. Inducing HMM states from predicted segments is better than inducing them from utterances. The abstract concept representation yields better results, especially in the map reading domain. HMM-based segmentation is significantly better than TextTiling in the map reading domain.
Air travel: HMM-based (utterance) Pk 0.398, C. F-1 0.624; HMM-based (segment) Pk 0.385, C. F-1 0.698; HMM-based (segment + label) Pk 0.386, C. F-1 0.706; TextTiling (augmented) Pk 0.371, C. F-1 0.712.
Map reading: HMM-based (utterance) Pk 0.392, C. F-1 0.436; HMM-based (segment) Pk 0.355, C. F-1 0.507; HMM-based (segment + label) Pk 0.250, C. F-1 0.686; TextTiling (augmented) Pk 0.384, C. F-1 0.464.

59 Segmentation error analysis. The TextTiling algorithm performs better on consecutive sub-tasks of the same type; the HMM-based algorithm performs better on very fine-grained segments (only 2-3 utterances long), as found in the map reading domain.

60 Step 2: sub-task clustering. Approach: the bisecting K-means clustering algorithm, incorporating concept information in the word token representation. Evaluation metrics: similar to concept clustering.
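A sketch of bisecting K-means on vectorized segments: repeatedly take the largest cluster and split it in two with plain 2-means until the target number of clusters is reached. The dense-vector representation and all parameters are illustrative; the thesis's feature representation uses concept labels as described above.

```python
import random

def two_means(vectors, iters=10, seed=0):
    """Plain K-means with k=2 on dense vectors (lists of floats)."""
    rng = random.Random(seed)
    centers = rng.sample(vectors, 2)
    groups = [[], []]
    for _ in range(iters):
        groups = [[], []]
        for v in vectors:
            d = [sum((a - b) ** 2 for a, b in zip(v, c)) for c in centers]
            groups[d[1] < d[0]].append(v)     # assign to the nearer center
        centers = [[sum(col) / len(g) for col in zip(*g)] if g else c
                   for g, c in zip(groups, centers)]
    return groups

def bisecting_kmeans(vectors, k):
    clusters = [list(vectors)]
    while len(clusters) < k:
        clusters.sort(key=len)
        largest = clusters.pop()              # always split the largest cluster
        if len(largest) < 2:
            clusters.append(largest)
            break
        halves = [g for g in two_means(largest) if g]
        if len(halves) < 2:                   # degenerate split; stop
            clusters.append(largest)
            break
        clusters.extend(halves)
    return clusters

segs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
print(bisecting_kmeans(segs, 2))
```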

61 Sub-task clustering results. Inaccurate segment boundaries affect clustering performance, but don't affect frequent sub-tasks much; missing boundaries are more problematic than false alarms. The abstract concept representation yields better results, with more improvement in the map reading domain, where it is even better than using reference segments: an appropriate feature representation matters more than accurate segment boundaries.
QS by concept word representation: concept label + value (oracle segments): air travel 0.738, map reading 0.791; concept label + value: air travel 0.577, map reading 0.675; concept label: air travel 0.601, map reading 0.823.

62 Step 3: slot extraction. Goal: identify the set of slots associated with each form type. Approach: analyze the concepts contained in each cluster.
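The analysis can be sketched as simple frequency counting: for each sub-task cluster, tally the concept labels occurring in its member segments and keep the frequent ones as the slots of that form. The segment encoding and frequency cutoff below are illustrative assumptions.

```python
from collections import Counter

def extract_slots(clusters, min_count=2):
    """clusters: {form_name: list of segments}; a segment is a list of
    (word, concept_label_or_None) tokens."""
    forms = {}
    for name, segments in clusters.items():
        counts = Counter(label for seg in segments
                         for _, label in seg if label)
        forms[name] = [(label, n) for label, n in counts.most_common()
                       if n >= min_count]
    return forms

segments = [[("to", None), ("houston", "ArriveCity"), ("twentieth", "DepartDate")],
            [("from", None), ("pittsburgh", "DepartCity"), ("houston", "ArriveCity")]]
print(extract_slots({"flight query": segments}, min_count=1))
```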

63 Slot extraction results (concepts sorted by frequency).
Form: flight query. Airline (79), ArriveTimeMin (46), DepartTimeHour (40), DepartTimeMin (39), ArriveTimeHour (36), ArriveCity (27), FlightNumber (15), ArriveAirport (13), DepartCity (13), DepartTimePeriod (11)
Form: flight fare query. Fare (257), City (27), CarRentalCompany (17), HotelName (15), ArriveCity (14), AirlineCompany (11)
Form: hotel query. Fare (75), City (36), HotelName (33), Area (28), ArriveDateMonth (14)
Form: car query. car_type (13), city (3), state (1)

64 Outline: Introduction; Structure of task-oriented conversations; Machine learning approaches (concept identification and clustering, form identification); Conclusion

65 Form-based dialog structure representation. Forms are a suitable domain-specific information representation according to these criteria. Sufficiency: can account for 93% of dialog content. Generality (domain independence): a broader interpretation of the form representation is provided; can represent 5 out of 6 disparate domains. Learnability: (human) can be applied reliably by other annotators in most cases; (machine) can be identified with acceptable accuracy using unsupervised machine learning approaches.

66 Unsupervised learning approaches for inferring domain information. They require some modifications in order to learn the structure of a spoken dialog, and they can identify the components of the form-based representation with acceptable accuracy: concept accuracy QS = 0.70; sub-task boundary accuracy F-1 = 0.71 (air travel) and 0.69 (map reading); form type accuracy QS = 0.60 (air travel) and 0.82 (map reading). They can also learn from inaccurate information, provided the number of errors is moderate; propagated errors don't affect frequent components much. Dialog structure acquisition therefore doesn't require high learning accuracy.

67 Conclusion. To represent a dialog for learning purposes, we based our representation on an observable structure. This observable representation can be generalized to various types of task-oriented dialogs, can be understood and applied by different annotators, and can be learned by unsupervised learning approaches. The results of this investigation can be applied to acquiring domain knowledge in a new task and to exploring the structure of a dialog, and could potentially reduce human effort when developing a new dialog system.

68 Thank you. Questions & comments.

69 References (1)
N. Asher. 1993. Reference to Abstract Objects in Discourse. Dordrecht, the Netherlands: Kluwer Academic Publishers.
H. Aust, M. Oerder, F. Seide, and V. Steinbiss. 1995. The Philips automatic train timetable information system. Speech Communication, 17(3-4):249-262.
S. Bangalore, G. D. Fabbrizio, and A. Stent. 2006. Learning the Structure of Task-Driven Human-Human Dialogs. In Proceedings of COLING/ACL 2006. Sydney, Australia.
R. Barzilay and L. Lee. 2004. Catching the Drift: Probabilistic Content Models, with Applications to Generation and Summarization. In HLT-NAACL 2004: Proceedings of the Main Conference, pp. 113-120. Boston, MA.
D. Beeferman, A. Berger, and J. Lafferty. 1999. Statistical Models for Text Segmentation. Machine Learning, 34(1-3):177-210.
P. R. Cohen and C. R. Perrault. 1979. Elements of a plan-based theory of speech acts. Cognitive Science, 3:177-212.
A. Ferrieux and M. D. Sadek. 1994. An Efficient Data-Driven Model for Cooperative Spoken Dialogue. In Proceedings of ICSLP 1994. Yokohama, Japan.
B. J. Grosz and C. L. Sidner. 1986. Attention, intentions, and the structure of discourse. Computational Linguistics, 12(3):175-204.

70 References (2)
H. Hardy, K. Baker, H. Bonneau-Maynard, L. Devillers, S. Rosset, and T. Strzalkowski. 2003. Semantic and Dialogic Annotation for Automated Multilingual Customer Service. In Proceedings of Eurospeech 2003. Geneva, Switzerland.
M. A. Hearst. 1997. TextTiling: segmenting text into multi-paragraph subtopic passages. Computational Linguistics, 23(1):33-64.
W. C. Mann and S. A. Thompson. 1988. Rhetorical Structure Theory: Toward a functional theory of text organization. Text, 8(3):243-281.
L. Polanyi. 1996. The Linguistic Structure of Discourse. Technical Report CSLI-96-200. Stanford, CA: Center for the Study of Language and Information, Stanford University.
A. I. Rudnicky, E. Thayer, P. Constantinides, C. Tchou, R. Shern, K. Lenzo, X. W., and A. Oh. 1999. Creating natural dialogs in the Carnegie Mellon Communicator system. In Proceedings of Eurospeech 1999. Budapest, Hungary.
J. M. Sinclair and M. Coulthard. 1975. Towards an Analysis of Discourse: The English Used by Teachers and Pupils. Oxford University Press.
G. Tür, A. Stolcke, D. Hakkani-Tür, and E. Shriberg. 2001. Integrating prosodic and lexical cues for automatic topic segmentation. Computational Linguistics, 27(1):31-57.
N. Yankelovich. 1997. Using Natural Dialogs as the Basis for Speech Interface Design. In Susann Luperfoy (Ed.), Automated Spoken Dialog Systems. Cambridge, MA: MIT Press.

