Ashish Vaswani Speech acts for Dialogue agents, Coding schemes and dialogue act taxonomies.

Speech acts for dialogue agents (Traum)
– Discusses the role of speech acts in allowing an agent to participate in dialogue with another agent.
– A dialogue agent is one that can interact and communicate with other agents in a coherent manner, not just with one-shot messages but with a sequence of related messages, all on the same topic or in the service of an overall goal.
– In studying speech acts, the focus is on pragmatics rather than semantics, i.e., on how language is used by agents rather than on what the sentences mean.

Foundational philosophical speech act work
Began with philosophers of language interested in issues in natural language pragmatics.
Austin:
– Utterances are used to do things.
– Under favorable conditions, utterances can change the mental and interactional state of the participants.
– Speaking is acting.
– Three main divisions of speech acts:
  Locutionary act: the act of saying something.
  Illocutionary act: the act performed in saying something (e.g., informing, warning).
  – Composed of illocutionary force and propositional content.
  – Indirect speech acts ("Could you please pass the salt?")
  Perlocutionary act: the effect of the utterance on the hearer (e.g., persuasion, surprise).
– Classified illocutionary acts into several categories based on illocutionary force (verdictives, exercitives, commissives, expositives, and behabitives).

Speech act work continued
Searle:
– Extended and refined Austin's work on illocutionary acts.
– There is no necessary correspondence between illocutionary acts and the illocutionary verbs that a language chooses to describe these acts.
– Pointed out 12 different dimensions along which speech acts can vary, and suggested an alternative taxonomy based mainly on purpose (his first dimension).
– Searle's taxonomy:
  Representatives
  Directives
  Commissives
  Expressives
  Declarations

AI models of speech acts
– The problem with early speech act work was that there were no formal accounts of actions and mental states that could be used to give more precise definitions of speech acts.
Bruce: the first to give an account of speech act theory in terms of actions and plans (AI).
– Natural language generation is social action (beliefs, desires, and wants).
– Inform and request can be used in achieving intentions to change states of belief.

AI models of speech acts
Cohen and Perrault:
– Defined speech acts as plan operators that change the beliefs of the speaker and hearer.
– Enumerated goals for an account of speech acts.
– A plan-based theory of speech acts should specify a planning system and a definition of speech acts as operators in that system.
– Mental state consists of beliefs and wants.
– They used a modified version of the STRIPS planning system.
– Operators have cando preconditions and want preconditions.
– They modeled REQUEST and INFORM within their system.
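The idea of a speech act as a plan operator can be sketched in a few lines of Python. This is only an illustration, not Cohen and Perrault's actual formalization: the tuple encoding of beliefs and wants, the `Operator` class, and the exact preconditions and effects chosen for `inform` are simplifying assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operator:
    """A STRIPS-style operator with separate cando and want preconditions."""
    name: str
    cando: frozenset    # facts that must hold for the act to be possible
    want: frozenset     # facts stating that the speaker wants to perform the act
    effects: frozenset  # facts added to the state when the act is performed

    def applicable(self, state):
        # Both kinds of precondition must be satisfied in the current state.
        return self.cando <= state and self.want <= state

    def apply(self, state):
        if not self.applicable(state):
            raise ValueError(f"preconditions of {self.name} do not hold")
        return state | self.effects


def inform(speaker, hearer, prop):
    """Hypothetical, simplified INFORM operator (an assumption, not the original definition)."""
    return Operator(
        name=f"INFORM({speaker}, {hearer}, {prop})",
        cando=frozenset({("believe", speaker, prop)}),
        want=frozenset({("want", speaker, ("inform", speaker, hearer, prop))}),
        effects=frozenset({("believe", hearer, ("believe", speaker, prop))}),
    )


# The speaker S believes the door is open and wants to inform H of it.
state = {
    ("believe", "S", "door_open"),
    ("want", "S", ("inform", "S", "H", "door_open")),
}
act = inform("S", "H", "door_open")
new_state = act.apply(state)
```

After the act, the state additionally records that H believes that S believes the door is open. Whether H goes on to adopt the belief is deliberately left out of this sketch.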

AI models of speech acts
Allen and Perrault:
– Used the same formalism as Cohen and Perrault.
– Recognizing other agents' plans is important for interpreting utterances.
Hinkelman: uses linguistic cues to build partial speech act templates, and plan inference over the utterance hypotheses.

AI models
Perrault (non-monotonic theory of speech acts):
– The utterance itself is insufficient to determine the effects of a speech act; the effects depend on the prior context and the mental state of the agent as well as the actual utterance.
– Stated the effects in terms of default logic.
Dynamic logic approaches:
– Cohen and Levesque showed how the effects of illocutionary acts can be derived from general principles of rational cooperative interaction (sincerity and helpfulness).
– Recognizing the illocutionary force of an utterance is not necessary, only cooperation.
– Sadek uses a similar logic of rational action.

Extending speech acts to dialogue (dialogue function as action)
Litman and Allen:
– Extend Allen and Perrault's work to include dialogues and a hierarchy of plans.
– Domain plans and discourse plans (meta-plans).
Carberry and Lambert add problem-solving plans to domain plans and discourse plans.
Cohen and Levesque extend their work into a theory of joint intention and multi-agent action.
– Explains why confirmations appear in dialogue (belief of object of intention).

Multiple levels of interaction
– Attempts to model different kinds of dialogue phenomena at different strata (from the sentence level upwards).
– One early classification:
  – (transactions (exchanges (moves (acts))))
  – Moves: speech acts towards a particular purpose.
  – The exchange structure was also called a dialogue game.
– In Traum and Hinkelman, there were levels of acts rather than ranks.

Speech act based communicative languages
– A language based on speech acts would itself be a good agent communication language.
– KQML (Knowledge Query and Manipulation Language):
  – Each message has an identifier (the kind of action) and other parameters specifying content.
  – Based on Austin's performatives.
  – Problems with hidden speech acts.
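The message layout described above (an identifier naming the kind of action, followed by parameters) can be sketched as follows. The helper `kqml_message`, the agent names, and the content expression are hypothetical; `ask-one` and the `:sender`, `:receiver`, `:content`, and `:reply-with` parameters follow KQML's keyword style, but this is a string-formatting illustration, not a KQML implementation.

```python
def kqml_message(performative, **params):
    """Render a KQML-style message as an s-expression string."""
    parts = [performative]
    for key, value in params.items():
        # Python keyword names use underscores; KQML parameter names use hyphens.
        parts.append(f":{key.replace('_', '-')} {value}")
    return "(" + " ".join(parts) + ")"


# Hypothetical query from agent-a to agent-b about the price of a widget.
msg = kqml_message(
    "ask-one",
    sender="agent-a",
    receiver="agent-b",
    content="(price widget ?p)",
    reply_with="q1",
)
print(msg)
```

The performative (`ask-one`) plays the role of the speech act type, while the propositional content travels separately in `:content`.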

Speech acts in multi-agent action theory
– The main effects of speech acts are on the mental and interactional states of the participants (BDI attitudes).
Social attitudes
– We must also consider social attitudes (question: are social attitudes basic?).
– Mutual belief (Harman): a group of people have mutual knowledge of p if each knows p and "we know this", where "this" refers to the whole fact known.
– Mutual belief is achieved through the process of grounding (Clark and Schaefer).
– Obligations are necessary for modeling social situations (e.g., a hearer is obligated to answer a question when asked one). They describe what an agent should do.
  – Problem: how do you decide social norms?
  – Obligations might conflict with the agent's goals, and the agent might choose to violate them (e.g., in an interrogation).
– Another social attitude is joint intention or shared plan. Coordinated team activity depends on more than individual intentions and beliefs (how do shared intentions guide individual action?).

Speech acts in multi-agent action theory
Defining speech acts
– How can one give precise definitions of speech acts using mental state and action?
– How can one recognize whether such an act has been performed? (Because mental states are involved, an observer might not be able to tell.)
– How can agents plan to use speech acts to accomplish their goals?
– Traum: plan recipe for communication.

Speech acts in multi-agent action theory (continued)
Planning speech acts
– Acts can be planned as games or as single moves.
– How far ahead should an agent plan?
– Predictions of other agents' future actions are unreliable.
– Negotiations and arguments (more planning), casual conversation (no planning).
Recognizing speech acts
– The input utterance is combined with aspects of the current context to decide what acts have been performed (for example, the current context may indicate that an INFORM act is impending).
– Should the agent recognize just the acts, or the intentions as well? (The latter might be necessary for interpreting indirect speech acts.)
– How much of the plan should be inferred? Deep intention recognition might not be necessary; considering all possible actions and their immediate effects may be sufficient when combined with a facility to repair erroneous conclusions (default logic?) (McRoy).
– Grounding relaxes the need for intention recognition: because the speaker is easily accessible, grounding can help in working out the speaker's motivations.

The reliability of a dialogue structure coding scheme (Carletta et al.)
– The paper introduces a scheme of dialogue coding distinctions for the Map Task corpus and describes its reliability.
– In the Map Task, two participants have slightly different versions of a simple map with approximately fifteen landmarks on it. One participant's map has a route printed on it; the task is for the other participant to duplicate the route.
– The moves introduced are independent of the task.
– They also attempt to classify dialogue structure at higher levels (transactions and games).
– The dialogue structure can be used with codings of many other dialogue phenomena.

The dialogue structure coding
Transactions:
– Highest level.
– Subdialogues that accomplish one major step in the participants' plan for achieving the task.
– Size and shape depend on the task.
Conversational games (dialogue games):
– A conversational game is a set of utterances starting with an initiation and encompassing all utterances up until the purpose of the game has been either fulfilled (e.g., the requested information has been transferred) or abandoned.
– Games can nest within each other.
– Games are made up of conversational moves, which are different kinds of initiations and responses.
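The three levels described above (transactions containing games, games containing moves or nested games) map naturally onto a small recursive data structure. The sketch below is an assumption made for illustration, not the representation used by the Map Task coders; the class names are hypothetical, and the sample dialogue reuses utterances quoted in these slides in an invented arrangement.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Union


@dataclass
class Move:
    speaker: str  # "G" (route giver) or "F" (route follower)
    label: str    # move category, e.g. "INSTRUCT" or "QUERY-YN"
    text: str


@dataclass
class Game:
    purpose: str  # games are coded by the purpose of their initiating move
    items: List[Union[Move, "Game"]] = field(default_factory=list)

    def moves(self) -> Iterator[Move]:
        """Yield all moves in order, descending into embedded games."""
        for item in self.items:
            if isinstance(item, Game):
                yield from item.moves()
            else:
                yield item


@dataclass
class Transaction:
    kind: str  # e.g. "NORMAL", "REVIEW", "OVERVIEW", "IRRELEVANT"
    games: List[Game] = field(default_factory=list)


# Invented arrangement: a yes-no question game embedded in an instruction game.
embedded = Game("QUERY-YN", [
    Move("G", "QUERY-YN", "Do you have the west lake, down to your left?"),
    Move("F", "REPLY-N", "No."),
])
top = Game("INSTRUCT", [
    Move("G", "INSTRUCT", "Go right round, ehm, until you get to just above them."),
    embedded,
    Move("F", "ACKNOWLEDGE", "Mmhmm."),
])
transaction = Transaction("NORMAL", [top])
labels = [m.label for g in transaction.games for m in g.moves()]
```

Flattening with `moves()` recovers the linear sequence of move labels, which is the form needed when comparing two coders' classifications.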

The move coding scheme (moves)
Instruct move:
– Commands the partner to carry out an action.
– The expected response could be performance of the action, if the participant knows the action.
– Example:
  G: Go right round, ehm, until you get to just above them.
Explain move:
– States information that has not been directly elicited by the partner.
– Facts about the domain, the state of the plan or task, including facts that help establish what is mutually known.
– Example:
  G: Where the dead tree is on the other side of the stream there's farmed land.

Move coding scheme
Check move:
– Requests the partner to confirm information that the speaker has some reason to believe, but is not entirely sure about.

Move coding scheme
Align move:
– Checks the partner's attention, agreement, or readiness for the next move.
– In the most common type of ALIGN move, the transferer checks that the information has been successfully transferred, so that they can close that part of the dialogue and move on.

Move coding scheme
Query-YN move:
– Asks the partner any question that takes a yes or no answer and does not count as a CHECK or an ALIGN.
– These questions are most often about what the partner has on the map.
– Example:
  F: I've got Dutch Elm.
  G: Dutch Elm. Is it written underneath the tree?

Move coding scheme
Query-W move:
– Any query not covered by the other categories.
– Most moves classified as QUERY-W are wh-questions.

Move coding scheme (response moves)
Response moves are used within games after an initiation and try to fulfill the expectations set up in the game.
Acknowledge move:
– A verbal response that minimally shows that the speaker has heard the move to which it responds, and often also demonstrates that the move was understood and accepted.
– Of Clark and Schaefer's types of evidence for acknowledgment, only the last three count as ACKNOWLEDGE moves in this coding scheme.
– Example:
  G: Ehm, if you... you're heading southwards.
  F: Mmhmm.

Move coding scheme
Reply-Y move:
– Any reply to any query with a yes-no surface form that means "yes", however that is expressed.
– Normally appears only after QUERY-YN, ALIGN, and CHECK moves.
– Example:
  G: See the third seagull along?
  F: Yeah.
Reply-N move:
– A reply to a query with a yes-no surface form that means "no".
– Example:
  G: Do you have the west lake, down to your left?
  F: No.

Move coding scheme
Reply-W move:
– Any reply to any type of query that doesn't simply mean "yes" or "no".
– Example:
  G: And then below that, what've you got?
  F: A forest stream.
Clarify move:
– A reply to some kind of question in which the speaker tells the partner something over and above what was strictly asked.
– Route givers tend to make CLARIFY moves when the route follower seems unsure of what to do, but there isn't a specific problem on the agenda.

Move coding scheme
Other possible responses:
– Utterances where the responder refuses to share the same goal as the initiator ("No, let's talk about...").
– ACKNOWLEDGE moves with a negative slant.
– These are sufficiently rare in the corpora that no separate categories were introduced for them.
Ready move:
– Moves that occur after the close of a dialogue game and prepare the conversation for a new game to be initiated.
– Example:
  G: Okay. Now go straight down.
– Possible confusion: the "Okay" could also have been an ACKNOWLEDGE move.

Coding continued
Game coding scheme:
– Beginnings of new games are coded by purpose.
– Places where games end or are abandoned are marked.
– Games are marked as either occurring at top level or being embedded in the game structure.
Transaction coding scheme:
– Four transaction types:
  NORMAL: a transaction serving a subtask, e.g., a route segment on the map.
  REVIEW: transactions created when participants return to parts of the route that have already been completed.
  OVERVIEW: overviewing an upcoming segment in order to provide a context for the partner.
  IRRELEVANT: subdialogues not relevant to the route (e.g., about the experimental setup).
– Coding involves marking where in the dialogue each transaction starts, except for IRRELEVANT transactions.
– Ends of transactions are not coded.

Reliability of coding scheme
Tests of reliability:
– Krippendorff's tests of reliability:
  Stability
  Reproducibility
  Accuracy
– Agreement by coders on segmentation.
– The kappa coefficient was used for reliability of classification.

Reliability of coding
Reliability of move coding:
– Four coders.
– Each coder had access to the speech as well as the transcripts.
– All coders interacted verbally with the developers.
Reliability of move segmentation:
– Kappa = 0.92 using word boundaries as units.
– Pairwise percent agreement on locations where any coder had marked a boundary was 89%.
– Number of units = 4079. Number of boundaries = 796.
– Most disagreements concerned whether to mark a READY as a separate move or include it in the move that followed, and whether to mark a reply as a single move or split it into a reply plus an EXPLAIN, CLARIFY, etc.

Reliability of coding
Reliability of move classification:
– Since the reliability of segmentation was good, it gave a good foundation for move classification.
– Move classification was evaluated only over move segments where the boundaries were agreed.
– Kappa for move coding = 0.83.
– Largest confusions were between:
  CHECK and QUERY-YN
  INSTRUCT and CLARIFY
  ACKNOWLEDGE, READY, and REPLY-Y
– K = 0.89 when coding only whether an initiation is a command, a statement, or a question.

Reliability of coding
Reliability of move classification from written instructions:
– K = 0.69
Reliability of move coding in another domain:
– Transcribed conversation between a hi-fi sales assistant and a married couple intending to purchase an amplifier.
– K = 0.95 for move segmentation.
– K = 0.81 for move classification.
Reliability of game coding:
– Pairwise agreement on game beginnings = 70%.
Reliability of transaction coding:
– Done from written instructions.
– K = 0.59

Coding Dialogues with the DAMSL Annotation Scheme (Mark Core and James F. Allen)
DAMSL (Dialogue Act Markup in Several Layers)
– Automatic analysis of dialogue is needed for:
  – a computer acting as a participant in dialogue with users
  – a computer acting as an observer interpreting human speech
– DAMSL allows multiple labels in multiple layers to be applied to an utterance.
– The communicative actions described here are high level.

DAMSL annotation scheme
Forward communicative functions:
– Speech acts that affect the future of the dialogue.
– These categories are independent.
– Divided into:
  Representatives (statements): making claims about the world.
  – Assert: the speaker is trying to affect the beliefs of the hearer.
  – Reassert: repeating information for emphasis or acknowledgement.
  Influencing-Addressee-Future-Action: all utterances that discuss potential actions of the addressee.
  – Directives:
    1. Info-Request: questions and requests for information ("Tell me the time").
    2. Action-Directive: requests for action ("Please take out the trash").
  – Open-Option: the speaker gives a potential course of action but does not show a preference towards it.
  Commissives (Committing-Speaker-Future-Action):
  – Offers
  – Commitments
  Performative category: utterances that make a fact true in virtue of their content ("You are fired").
  Other forward functions

DAMSL annotation scheme
Backward communicative functions:
– The speech act categories related to responses.
– The classes are independent.
– Agreement: Accept, Accept-Part, Maybe, Reject-Part, Reject, Hold.
– Understanding: did the listener understand the speaker? The listener may use:
  – Signal-Non-Understanding
  – Signal-Understanding (Acknowledgement, Repeat-Rephrase, Completion)
  – Correct-Misspeaking
– Answer: supplying information explicitly requested by a previous Info-Request act.
– Information relations: describe how the information in the current utterance relates to previous utterances.

Utterance features:
– Information Level:
  Task (utterance about the task)
  Task Management (utterance about the planning and monitoring of the task)
  Communication Management (physical requirements of the dialogue)
  Other
– Communicative Status:
  Abandoned
  Uninterpretable
– Syntactic Features:
  Conventional form ("Hello", "How may I help you?")
  Exclamatory form ("Wow")
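Because DAMSL lets a single utterance carry labels in several layers at once, an annotation is naturally a record with one field per layer. The sketch below assumes a simplified label inventory taken from the slides above; the label spellings, the `Annotation` class, and the labels given to the example utterance are illustrative assumptions, not the official DAMSL tag set or tooling.

```python
from dataclasses import dataclass, field

# Simplified label inventories (spellings are assumptions based on the slides).
FORWARD = {"Assert", "Reassert", "Info-Request", "Action-Directive",
           "Open-Option", "Offer", "Commit", "Performative", "Other"}
BACKWARD = {"Accept", "Accept-Part", "Maybe", "Reject-Part", "Reject", "Hold",
            "Signal-Non-Understanding", "Signal-Understanding",
            "Correct-Misspeaking", "Answer"}
INFO_LEVEL = {"Task", "Task-Management", "Communication-Management", "Other"}


@dataclass
class Annotation:
    text: str
    info_level: str
    forward: set = field(default_factory=set)   # zero or more forward functions
    backward: set = field(default_factory=set)  # zero or more backward functions

    def __post_init__(self):
        # Reject labels outside the (simplified) inventories.
        unknown = (self.forward - FORWARD) | (self.backward - BACKWARD)
        if self.info_level not in INFO_LEVEL:
            unknown.add(self.info_level)
        if unknown:
            raise ValueError(f"unknown label(s): {sorted(unknown)}")


# One utterance, several simultaneous functions (an illustrative labeling).
ann = Annotation(
    text="Okay, I'll take out the trash.",
    info_level="Task",
    forward={"Commit"},
    backward={"Accept"},
)
```

Keeping each layer as its own set is what distinguishes this multi-label style from schemes that bundle all functions of an utterance into a single tag.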

Experiments
– Used test dialogues from the TRAINS 91-93 dialogues.
– One person was given a problem to solve (e.g., shipping boxcars to a city), and another person was instructed to act as a problem-solving system.

Results
Three statistics were used to measure interannotator reliability:
– PA: percent pairwise agreement
– PE: expected pairwise agreement
– Kappa = (PA - PE) / (1 - PE)
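The three statistics can be computed directly from two coders' label sequences. This is a minimal two-coder sketch: it assumes that expected agreement PE is estimated from the pooled label distribution of both coders, which is one common convention and not necessarily the exact procedure used in the paper. The function names and the toy labels are made up for the example.

```python
from collections import Counter


def pairwise_agreement(a, b):
    """PA: fraction of units on which the two coders chose the same label."""
    if len(a) != len(b) or not a:
        raise ValueError("need two non-empty label sequences of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def expected_agreement(a, b):
    """PE: chance agreement, assuming both coders draw from the pooled label distribution."""
    pooled = Counter(a) + Counter(b)
    total = sum(pooled.values())
    return sum((n / total) ** 2 for n in pooled.values())


def kappa(a, b):
    """Kappa = (PA - PE) / (1 - PE)."""
    pa = pairwise_agreement(a, b)
    pe = expected_agreement(a, b)
    if pe == 1.0:
        raise ValueError("kappa is undefined when expected agreement is 1")
    return (pa - pe) / (1 - pe)


# Toy example: two coders label four utterances and disagree on the last one.
coder_1 = ["Assert", "Assert", "Info-Request", "Info-Request"]
coder_2 = ["Assert", "Assert", "Info-Request", "Assert"]
k = kappa(coder_1, coder_2)
```

In the toy example PA is 0.75 and PE is 0.53125, so kappa is 7/15, roughly 0.47: raw agreement looks high, but much of it is what chance alone would produce with only two frequent labels.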

An empirical investigation of proposals in collaborative dialogues (Di Eugenio et al.)
– They use a slight modification of the DRI (Discourse Resource Initiative) scheme.
– Task (will be read out).
The DRI coding scheme
– Similar to, and simpler than, the DAMSL scheme discussed before.
– Forward-looking functions: this dimension characterizes the potential effect that an utterance Ui has on the subsequent dialogue.
  Statement: makes claims about the world.
  – Assert (the speaker is trying to change the hearer's beliefs)
  – Reassert (if the claim has already been made before)
  Influence on Hearer (I-on-H): influences H's future action.
  – Open-Option
  – Info-Request
  – Action-Directive
  Influence on Speaker (I-on-S): commits S to some future course of action.
  – Offer
  – Commit

DRI coding scheme
Backward-looking functions:
– Describe how Ui responds to previous utterances.
  Answer
  Agreement:
  – Accept/Reject
  – Hold
Certain refinements were made to the core features by adding heuristics for tagging Statements, I-on-H, and I-on-S.

Coding results
– Their results on forward functions were better than Core and Allen's (1997).
– Very low kappa value for agreement.

Twenty questions for dialogue act taxonomies (Traum)
Defining dialogue acts
Question 1. Which is most important: fit to intuitions or formal rigor?
– It is difficult to precisely formulate complex intuitions using available formal techniques.
– Sacrifice intuition for formal rigor, or vice versa?
– The answer will depend on the purpose of the concept (experimentation or verification).

Questions 2 and 3
Is the definition of a dialogue act an issue of lexical semantics or of the ontology of action?
– Is defining a dialogue act a matter of giving an account of when someone might be justified in describing an action with a sentence headed by a particular verb (inform, request), or of providing a technical vocabulary to compactly describe various types of occurrences (the speech acts in the third paper)?
Under what conditions may an action be said to have occurred?
– Allwood uses 4 criteria:
  Intention of the performer
  Form of the behavior (e.g., linguistic form; question 2?)
  Achieved result
  Context in which the behavior occurs
– Avoid defining dialogue acts according to one criterion (say, a certain set of results holding) and then identifying instances of these acts using another (say, linguistic form). This would lead to coding difficulties.

Questions 4 and 5
What is the role of speaker intention?
– Some would define dialogue acts on the basis of the intention behind them.
– Some would define them in terms of the recognition of this intention (illocutionary acts).
What is the role of addressee uptake?
– Many dialogue act definitions require some change to the addressee, based on understanding the utterance in a particular way.

Question 6
What view should be taken regarding the performance of acts?
– The speaker's or the listener's view.
– The view of the speaker-addressee team, or a normative, conventional point of view.
– Is one allowed to consider subsequent utterances before deciding on performance?
– This has implications for coding.

Dialogue act components (questions 7 and 8)
How are actions used in a logic?
What is context?
– What aspects of the situation are relevant as potential conditions for defining types of dialogue act performance, and what aspects are (directly) affected?
– Special sorts of information used for conditions and effects of dialogue acts:
  Dialogue state (precondition: the dialogue is in a particular state; effect: transition to a new dialogue state)
  Mental states (effect: newly adopted beliefs)
  Social obligations and commitments

Questions 9 and 10
What kinds of preconditions are appropriate?
– Most convenient dialogue acts have few, if any, actual preconditions.
How should an unsuccessful act be distinguished from a failed attempt to perform an act?
– Difference between the success and the satisfaction of a speech act.

Relationships and complex acts (questions 11 and 12)
What is the relationship between dialogue acts and other (e.g., physical) acts?
– Different theories maintain a crisp or a more blurred distinction between dialogue acts and non-communicative acts.
What is the relationship between dialogue acts and dialogue structure?
– Wholly dependent on dialogue structure (grammar-based approaches).
– Dialogue structure is primarily constructed from the activity that the participants are engaged in.
– Dialogue structure is also used as context for the performance of dialogue acts (question 8).

Questions 13 and 14
Are there multi-agent dialogue acts?
– Some researchers view the performance of most illocutionary acts as a collective performance of multiple agents, in virtue of the grounding process.
– Games, exchanges, and collaborative completions.
– Problems with tagging.
Can dialogue acts be "composed" of more primitive acts?
– Could a multiple-strata dialogue act taxonomy have levels or ranks?

Question 15
Can multiple dialogue acts occur at the same time (performed through the same utterance)?
– Since utterances have multiple functions, yes.
– It is a problem if the logical theory does not support simultaneous action.
– It complicates tagging.

Taxonomic considerations (question 16)
Can the same taxonomy be used for different kinds of activities?
– People have been designing taxonomies for different dialogue activities.
– A general theory might better allow one to use act distributions to identify activities or genres of activities, as well as episodes within an activity.

Percentage distributions of dialogue acts in Corpus Coding

Questions 17 and 18
Can the same taxonomy be used for different kinds of agents?
– Could the same taxonomy cover communicative activities between:
  Human and human
  Human and machine
  Humans and animals, etc.
– The modality of communication also matters.
How detailed should a dialogue act taxonomy be?
– How many distinctions in speech act verbs should be captured within a dialogue act taxonomy (e.g., state, assert, inform)?
– Trade-off between proposing many acts for subtle differences and the reliability of coding.

Questions 19 and 20
How should complexity be realized in a coding taxonomy?
– How can the multiplicity of functions be captured in a taxonomy?
  Multiple labels for each utterance, one for each function (DRI, Allen and Core)
  Bundle dialogue functions into one label (Verbmobil, Jekat et al.)
  Intermediate approach (DAMSL)
Can a taxonomy used for tagging dialogue corpora be given a formal semantics and/or be used in a dialogue system?
– The hope is "yes".
