Dialog Reading Group December 3 rd, 2004 Learning the Structure of Task-Oriented Conversations from the Corpus Ananlada Chotimongkol Language Technologies.

Presentation transcript:

Dialog Reading Group, December 3rd, 2004
Learning the Structure of Task-Oriented Conversations from the Corpus
Ananlada Chotimongkol
Language Technologies Institute, School of Computer Science, Carnegie Mellon University

Outline
- Introduction
- Form-based dialog structure
  - Task structure
  - Dialog mechanisms
- Dialog structure learning
  - Concept identification and clustering
  - Form identification
  - Operation classification


Building a new dialog system
[Diagram of dialog system components: Speech Recognizer, Natural Language Understanding, Dialog Manager, Domain Knowledge, Natural Language Generator, Speech Synthesizer]
Example exchange: “I would like to fly to Seattle tomorrow.” / “When would you like to leave?”

Domain knowledge
- Steps in the task
  - Specify the desired flight
  - Search for flights that match the criteria
  - Negotiate the flights
  - Make a reservation
- Important information, keywords
  - Destination, date, time, airlines, etc.
- Domain language: how do people talk

What is the problem?
[Diagram of dialog system components: Speech Recognizer, Natural Language Understanding, Dialog Manager, Domain Knowledge, Natural Language Generator, Speech Synthesizer]
Example exchange: “I would like to fly to Seattle tomorrow.” / “When would you like to leave?”
Domain Knowledge:
- Can't reuse
- Time consuming
- May need an expert

Research goal
Reduce the human effort of acquiring domain knowledge when creating a dialog system in a new domain
=> By learning the domain knowledge from data

Observations
- Task-oriented conversations have a clear structure
  - Reflects domain information, e.g. a task is divided into sub-tasks
  - Has recurring patterns that are observable through the language

The solutions
To learn domain knowledge from data:
1. Specify the structure of task-oriented conversations
   - Capture sufficient domain knowledge
   - Domain-independent
   - Learnable
2. Learn the structure from a corpus of human-human conversations

Dialogue structure
- Task structure (data representation)
  - Necessary information for achieving a task goal
    - Steps in the task
    - Domain keywords
- Dialog mechanism (operations)
  - The ways that the participants communicate and perform the task


Existing dialog structures: Theory-oriented
- Examples:
  - Theory of Discourse Structure (Grosz and Sidner, 1986)
  - Discourse Representation Theory (DRT) (Kamp and Reyle, 1993)
- Focus on developing a theory that helps interpret discourse meaning
- Might be too complex to be implemented in a dialog system
- Use hand-written rules to recognize the structure

Existing dialog structures: Engineering-oriented
- Examples:
  - Plan-based theory (Allen and Perrault, 1980)
  - The theory of Conversation Acts (Traum and Hinkelman, 1992)
- Focus on practical issues:
  - Predictability of each dialog component
  - The implementation of the structure in a dialog system

What is missing?
- Existing structures don't describe the key domain information that the participants communicate in a dialog
  - e.g. the role of city names in a travel domain
- It is not clear how to apply the structure in a dialog system
  - The relations between dialog structure components and dialog system components
  - How a dialog manager should treat each component

Form-based dialog structure
- Describe a dialog structure with an existing dialog manager framework
  - Has a concrete mapping between dialog structure components and dialog system components
  - A form-based architecture has been used successfully in many dialog systems
- A form-based structure consists of:
  - A task structure (forms and slots)
  - Dialogue mechanisms (form operators) that advance the dialog


Task Structure
Three levels of organization:
1. Task: a subset of conversations that has a specific goal
2. Sub-task: a step in a task that contributes toward a task goal => form
3. Concept: key information => slot

Task Structure: Bus schedule enquiry domain
1. Task (multiple tasks):
   - Which bus runs between A and B?
   - When will the bus X arrive?
2. Sub-tasks: no further decomposition
3. Concepts:
   - Bus Number = {61C, 28X, …}
   - Location = {CMU, airport, …}

Departure time query form
F: Query_Departure_Time
  Depart_Location: carnegie_mellon
  Arrive_Location: the airport
  Arrive_Time:
    Hour: four
    Minute: thirty
  Bus_Number: 28X
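As a rough illustration of how a form and its slots might be held in memory, the sketch below represents the departure time query form as a small Python data class. The `Form` class and its `unfilled` helper are hypothetical names chosen for this example and are not taken from the talk; only the form name, slot names, and slot values come from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Form:
    """A form (sub-task) holding named slots (concepts); None marks an unfilled slot."""
    name: str
    slots: dict = field(default_factory=dict)

    def unfilled(self):
        # Names of the slots that still have no value, in declaration order.
        return [slot for slot, value in self.slots.items() if value is None]


# The filled form shown on the slide.
query = Form(
    name="Query_Departure_Time",
    slots={
        "Depart_Location": "carnegie_mellon",
        "Arrive_Location": "the airport",
        "Arrive_Time": {"Hour": "four", "Minute": "thirty"},
        "Bus_Number": "28X",
    },
)

# The same form type before any slot has been filled.
empty = Form("Query_Departure_Time", {slot: None for slot in query.slots})
```

A dialog manager could, for instance, use `empty.unfilled()` to decide which slot to ask about next.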

Task Structure: Travel planning domain
1. Task: create travel itinerary
2. Sub-tasks:
   - Flight reservation
   - Hotel reservation
   - Car rental reservation
3. Concepts:
   - airlines = {Continental, US-Airways, …}
   - hotel = {Hilton, Marriott, …}

Task Structure: Map reading domain
1. Task: draw a line (a route)
2. Sub-tasks: draw a segment of a line
3. Concepts:
   - Landmark = {white_mountain, Machete, …}
   - Orientation = {down, left, …}
   - Distance = {a couple of centimeters, an inch, …}


Dialogue mechanisms
- Operations that the participants use to advance the dialog toward the goal
- Task-oriented operations
  - Manipulate a form (data structure)
  - Examples: init_form, fill_form
- Discourse-oriented operations
  - Manage the flow of a conversation
  - Examples: acknowledgement, greeting

Dialogue mechanisms (2)
- Each operation has a unique consequence on the state of the conversation
  - init_form causes a system to create a new form
- Domain-independent; only the operation parameters differ
  - Fill city_name in flight_information form
  - Fill bus_number in bus_information form
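The point that the operations stay the same across domains while only their parameters change can be sketched as follows. This is a minimal illustration, not the talk's implementation: the state dictionary layout, the function signatures, and the `date` slot are assumptions made for the example, while the operation names, form names, and the `city_name` and `bus_number` slots come from the slides.

```python
def init_form(state, form_type, slot_names):
    """Create a new, empty form of the given type and make it the current form."""
    form = {"type": form_type, "slots": {name: None for name in slot_names}}
    state["forms"].append(form)
    state["current"] = form
    return form


def fill_form(state, slot, value):
    """Fill one slot of the current form; the operation itself is domain-independent."""
    form = state["current"]
    if slot not in form["slots"]:
        raise KeyError(f"{slot!r} is not a slot of {form['type']!r}")
    form["slots"][slot] = value
    return form


# Travel domain: the same two operations, with travel-specific parameters.
flight_state = {"forms": [], "current": None}
init_form(flight_state, "flight_information", ["city_name", "date"])  # "date" is hypothetical
fill_form(flight_state, "city_name", "Seattle")

# Bus domain: identical code, different parameters.
bus_state = {"forms": [], "current": None}
init_form(bus_state, "bus_information", ["bus_number"])
fill_form(bus_state, "bus_number", "28X")
```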

Air travel-planning domain

PT8: request_form_info: WHAT TIME WOULD YOU LIKE TO DEPART DepLoc:[PITTSBURGH]
X9: fill_form_info: /UM/ EARLY DepT:[MORNING] NOT BEFORE DepT:[H:[SEVEN]]
PT10: acknowledge: OKAY
      access_DB
      inform_result: U.S. AIRWAYS HAS A NON-STOP …

1st leg Form (Dept_Time empty)
  Dept_Loc:
    City: PITTSBURGH
  Dept_Date:
    Month: FEBRUARY
    Date: TWENTIETH
  Dept_Time:
  Flight_ref:
  Arr_Loc:
    City: HOUSTON
    State: TEXAS
    Airport: INTERCONTINENTAL
  Arr_Date:
  Arr_Time:
  Airline_company:

1st leg Form (Dept_Time filled)
  Dept_Loc:
    City: PITTSBURGH
  Dept_Date:
    Month: FEBRUARY
    Date: TWENTIETH
  Dept_Time: EARLY TimeP: MORNING NOT BEFORE Hour: SEVEN
  Flight_ref:
  Arr_Loc:
    City: HOUSTON
    State: TEXAS
    Airport: INTERCONTINENTAL
  Arr_Date:
  Arr_Time:
  Airline_company:

Bus schedule enquiry domain

U2: fill_form_info: i wanted to take the 28X bus from /um/ DepLoc:[forbes avenue] to ArLoc:[the airport]

Empty form
F: Query_Departure_Time
  Depart_Location:
  Arrive_Location:
  Arrive_Time:
  Bus_Number:

Filled form
F: Query_Departure_Time
  Depart_Location: forbes avenue
  Arrive_Location: the airport
  Arrive_Time:
  Bus_Number: 28X


Learning framework
- Goal: minimize human effort
- Use unsupervised learning when possible
  - Incorporate information from existing knowledge sources
- If additional knowledge from a human is required
  - Train an initial model with a small amount of annotated data
  - Use unsupervised learning or active learning to selectively explore un-annotated data
  - A human can correct a mistake

Dialog structure components
- Domain-dependent -> have to be learned in every domain
  - Task structure (forms, slots)
  - Expressions for task-oriented operations
- Domain-independent -> infrastructure, or have to be learned only once
  - List of operations
  - Expressions for discourse-oriented operations


Concept identification and clustering
- Goal: identify concept members and cluster together the ones that belong to the same concept
  - City = {Pittsburgh, Boston, Austin, …}
- Assumption: word boundaries, including compound word boundaries, are given

Concept identification steps
1. Identify potential concept members
   - Filter out noise, function words
2. Cluster similar words together
   - Statistical-based clustering: mutual information-based and Kullback-Leibler-based
   - Knowledge-based clustering: WordNet
3. Select clusters that represent domain concepts
   - Use the same criteria as (1), but work on a cluster level
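One way to picture the Kullback-Leibler-based option in step 2 is to compare the distributions of words that appear next to each candidate word. The sketch below is only an illustration under stated assumptions: the slides do not say how the distributions are built, so the left/right neighbour features, the additive smoothing constant `alpha`, the symmetric form of the divergence, and the toy corpus are all choices made for this example.

```python
import math
from collections import Counter, defaultdict


def context_distributions(sentences, alpha=0.1):
    """Map each word to a smoothed distribution over its left and right neighbours."""
    counts = defaultdict(Counter)
    for sent in sentences:
        padded = ["<s>"] + sent + ["</s>"]
        for i in range(1, len(padded) - 1):
            counts[padded[i]]["L:" + padded[i - 1]] += 1
            counts[padded[i]]["R:" + padded[i + 1]] += 1
    features = sorted({f for c in counts.values() for f in c})
    dists = {}
    for word, c in counts.items():
        total = sum(c.values()) + alpha * len(features)
        # Additive smoothing keeps every probability positive, so the KL terms are finite.
        dists[word] = [(c[f] + alpha) / total for f in features]
    return dists


def symmetric_kl(p, q):
    """KL(p || q) + KL(q || p) for two distributions over the same features."""
    return sum(a * math.log(a / b) + b * math.log(b / a) for a, b in zip(p, q))


# Toy corpus (assumed for illustration only).
corpus = [
    "i want to fly to pittsburgh tomorrow".split(),
    "i want to fly to boston tomorrow".split(),
    "i need a flight to austin today".split(),
    "a flight to boston today please".split(),
]
dists = context_distributions(corpus)
d_city = symmetric_kl(dists["pittsburgh"], dists["boston"])
d_other = symmetric_kl(dists["pittsburgh"], dists["fly"])
```

In this toy corpus the two city names share most of their contexts, so `d_city` comes out smaller than `d_other`; a clustering step would merge the closest pairs first.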


Form Identification
- Goal: determine the different types of forms that occur in the domain
- Assumption: a dialog may be annotated with concept labels

Approach
- Segment a dialog into a sequence of sub-tasks (form boundary identification)
  - Train a classifier on lexical cohesion (Hearst, 1994) and prosodic features
- Group together the sub-tasks that belong to the same form type
  - Use unsupervised clustering based on cosine similarity
- Identify the set of slots that are associated with each form type
  - Analyze a cluster of similar form instances
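The grouping step can be sketched with cosine similarity over bags of concept labels. The greedy single-pass clustering, the 0.5 threshold, and several of the concept labels (`Hotel_Name`, `City`, `Checkin_Date`, `Room_Type`, `Airline`) are assumptions made for this illustration; the slides name only cosine similarity and unsupervised clustering, not a specific algorithm.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse count vectors stored as Counters."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def greedy_cluster(segments, threshold=0.5):
    """Put each segment into its most similar cluster, or start a new one."""
    clusters = []  # each cluster: {"centroid": Counter, "members": [segment indices]}
    for i, seg in enumerate(segments):
        vec = Counter(seg)
        best = max(clusters, key=lambda c: cosine(vec, c["centroid"]), default=None)
        if best is not None and cosine(vec, best["centroid"]) >= threshold:
            best["members"].append(i)
            best["centroid"].update(vec)  # centroid is the summed counts of its members
        else:
            clusters.append({"centroid": vec, "members": [i]})
    return [c["members"] for c in clusters]


# Each sub-task segment is reduced to the concept labels it mentions (toy data).
segments = [
    ["Dept_Loc", "Arr_Loc", "Dept_Date", "Airline"],
    ["Hotel_Name", "City", "Checkin_Date"],
    ["Dept_Loc", "Arr_Loc", "Dept_Time"],
    ["Hotel_Name", "Checkin_Date", "Room_Type"],
]
groups = greedy_cluster(segments)
```

The union of concept labels inside each resulting group then gives a candidate slot set for that form type.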


Operation Classification
- Goal: learn the expressions that are associated with each operation
  => By classifying an utterance into a pre-defined set of operations
- Assumptions
  - A dialog may be annotated with concept labels
  - The list of operation types is given
  - Operation boundaries are known

Supervised classification
- Use a Markov model (Woszczyna and Waibel, 1994)
  - States = operation types
  - Transition probability = dependency between operation types
  - Emission probability = P(W|operation_type)
- Enhanced models
  - Use domain concepts as word classes to reduce the data sparseness problem
  - Add prosodic features
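A minimal sketch of such a Markov model follows: states are operation types, transitions are estimated from labeled operation sequences, and P(W|operation_type) is approximated by a unigram word model. The unigram emission, the add-one smoothing, the Viterbi decoding, and the toy training dialogs are assumptions made for this example and are not details given in the slides or confirmed from the cited paper.

```python
import math
from collections import Counter, defaultdict


def train(dialogs):
    """dialogs: list of [(operation_type, [words]), ...]; returns count tables."""
    trans = defaultdict(Counter)  # previous operation -> next operation counts
    emit = defaultdict(Counter)   # operation -> word counts
    vocab = set()
    for dialog in dialogs:
        prev = "<start>"
        for op, words in dialog:
            trans[prev][op] += 1
            emit[op].update(words)
            vocab.update(words)
            prev = op
    return trans, emit, vocab


def log_prob(counter, key, n_outcomes):
    # Add-one smoothed log probability of `key` under the counts in `counter`.
    return math.log((counter[key] + 1) / (sum(counter.values()) + n_outcomes))


def viterbi(utterances, trans, emit, vocab):
    """Most likely operation sequence for a list of tokenized utterances."""
    ops = sorted(emit)
    n_words = len(vocab) + 1  # one extra outcome reserved for unseen words

    def emission(op, words):
        return sum(log_prob(emit[op], w, n_words) for w in words)

    scores = {op: log_prob(trans["<start>"], op, len(ops)) + emission(op, utterances[0])
              for op in ops}
    back = []
    for words in utterances[1:]:
        new_scores, pointers = {}, {}
        for op in ops:
            prev = max(ops, key=lambda p: scores[p] + log_prob(trans[p], op, len(ops)))
            new_scores[op] = (scores[prev] + log_prob(trans[prev], op, len(ops))
                              + emission(op, words))
            pointers[op] = prev
        scores = new_scores
        back.append(pointers)
    path = [max(ops, key=scores.get)]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Toy labeled dialogs (assumed), using operation names that appear in the slides.
training = [
    [("request_form_info", "what time would you like to depart".split()),
     ("fill_form_info", "early morning not before seven".split()),
     ("acknowledge", ["okay"])],
    [("request_form_info", "where would you like to go".split()),
     ("fill_form_info", "to houston texas".split()),
     ("acknowledge", ["okay"])],
]
trans, emit, vocab = train(training)
test_dialog = ["what time would you like to leave".split(),
               "seven in the morning".split(),
               ["okay"]]
predicted = viterbi(test_dialog, trans, emit, vocab)
```

Replacing words with their concept labels before counting would be one way to realize the "domain concepts as word classes" enhancement, since it pools counts across all members of a concept.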

Unsupervised learning and active learning
1. Train an initial classifier from human-labeled data
2. Apply the current classifier to an unlabeled operation
   - (Unsupervised learning) If the confidence is high, add this instance and the predicted label to the training set
   - (Active learning) If the confidence is low, ask a human to label this instance and then add it to the training set
3. Train a new classifier on all labeled data (both machine-labeled and human-labeled)
Steps 2-3 can be iterated
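The three steps above can be sketched as a small loop. To keep the example short, a naive Bayes classifier stands in for the Markov model, confidence is taken to be the gap between the two most probable labels, and the human annotator is simulated by a lookup table; the function names, the 0.3 threshold, the batch size, and the toy data are all assumptions made for this illustration.

```python
import math
from collections import Counter, defaultdict


def train_nb(examples):
    """examples: list of (words, label). Returns a function words -> {label: probability}."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for words, label in examples:
        label_counts[label] += 1
        word_counts[label].update(words)
        vocab.update(words)

    def predict(words):
        logs = {}
        for label, n in label_counts.items():
            denom = sum(word_counts[label].values()) + len(vocab) + 1
            logs[label] = math.log(n / len(examples)) + sum(
                math.log((word_counts[label][w] + 1) / denom) for w in words)
        top = max(logs.values())
        exps = {label: math.exp(v - top) for label, v in logs.items()}
        z = sum(exps.values())
        return {label: v / z for label, v in exps.items()}

    return predict


def bootstrap(labeled, unlabeled, ask_human, threshold=0.3, batch_size=2):
    """Grow the training set from unlabeled utterances, asking a human only when unsure."""
    labeled = list(labeled)
    asked = 0
    for start in range(0, len(unlabeled), batch_size):
        predict = train_nb(labeled)  # step 1 on the first pass, step 3 afterwards
        for words in unlabeled[start:start + batch_size]:  # step 2
            ranked = sorted(predict(words).items(), key=lambda kv: kv[1], reverse=True)
            margin = ranked[0][1] - (ranked[1][1] if len(ranked) > 1 else 0.0)
            if margin >= threshold:
                labeled.append((words, ranked[0][0]))      # confident: keep machine label
            else:
                labeled.append((words, ask_human(words)))  # unsure: ask the annotator
                asked += 1
    return train_nb(labeled), labeled, asked


seed = [
    ("what time would you like to depart".split(), "request_form_info"),
    ("early morning not before seven".split(), "fill_form_info"),
    (["okay"], "acknowledge"),
]
unlabeled = [["okay"], "where would you like to go".split(), "seven thirty".split(), ["hmm"]]
# Simulated annotator (stand-in for a human).
oracle = {
    ("okay",): "acknowledge",
    ("where", "would", "you", "like", "to", "go"): "request_form_info",
    ("seven", "thirty"): "fill_form_info",
    ("hmm",): "acknowledge",
}
classifier, all_labeled, n_asked = bootstrap(seed, unlabeled, lambda w: oracle[tuple(w)])
```

Lowering the threshold shifts work from the annotator to the machine at the cost of more machine-labeling errors entering the training set.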

Classifier confidence score
1. Difference in probability between the first-ranked and the second-ranked output
2. The entropy of the classifier output
   - High entropy = low confidence
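Both scores are easy to compute from a classifier's output distribution. The sketch below assumes that output is a list of class probabilities summing to one; the function names and the two example distributions are illustrative only.

```python
import math


def margin_confidence(probs):
    """Difference between the highest and second-highest probability."""
    ranked = sorted(probs, reverse=True)
    return ranked[0] - ranked[1] if len(ranked) > 1 else ranked[0]


def entropy(probs):
    """Shannon entropy in bits; higher entropy means lower confidence."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


peaked = [0.9, 0.05, 0.05]   # classifier strongly prefers one operation
flat = [0.4, 0.35, 0.25]     # classifier is unsure

peaked_margin, flat_margin = margin_confidence(peaked), margin_confidence(flat)
peaked_entropy, flat_entropy = entropy(peaked), entropy(flat)
```

For the peaked distribution the margin is large and the entropy small; for the flat one the reverse holds, so either score would route the second instance to a human.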

Suggestions?