A preliminary classification of dialogue genres or Correlating properties of activities with properties of dialogue systems Staffan Larsson Dept. of linguistics Göteborg University
Overview Introduction Previous classifications of dialogue Dimensions of classification Possible additional activity dimensions Using the classification: decision graphs and libraries Summary & future work
Introduction Goals –A classification of dialogue genres (types, kinds, …), relevant for development of dialogue systems –Correlating properties of activities with properties of dialogue systems –Investigate how this classification can be used in the development of dialogue systems and applications Background: GoDiS –An issue-based dialogue system implemented using TrindiKit (Larsson 2002) –This talk is done with GoDiS in mind, but the ideas presented are intended to be more general
Dahlbäck (1997) Modality: spoken/written Kinds of agents: human/computer Interaction: dialogue/monologue Context: spatial, temporal Number & type of tasks –Simultaneous? Dialogue-task distance –Similarity of dialogue structure and task structure Kinds of shared knowledge exploited –Perceptual, linguistic, cultural
Discussion: Dahlbäck Several dimensions, some relevant but some not –We currently assume spoken human-computer dialogue –Dialogue-task distance perhaps too abstract –Context, kinds of shared knowledge used, and number of tasks relevant, but not yet included in our classification –Type of task similar to our concept of activity
Allen et al. (2001)
technique used | example task | dialogue phenomena handled
finite-state script | long-distance dialing | user answers questions
frame-based | getting train timetable info | user asks questions, simple clarifications by system
sets of contexts | travel booking agent | shifts between predetermined topics
plan-based models | kitchen design consultant | dynamically generated topic structures, collaborative negotiation subdialogues
agent-based models | disaster relief management | different modalities (e.g. planned world and actual world)
Task complexity: rows run from least complex (top) to most complex (bottom)
Discussion: Allen et al. Relates properties of system to properties of activity, BUT Based on technologies, not properties of activities –Dialogue phenomena don’t necessarily come in lumps Focus on information seeking and collaborative planning; some types of dialogue not included –Tutorial, Explanatory, Instructional…
Desiderata for a classification of dialogue Based on multiple independent properties of (dialogue in) different activities Relating properties of activity to properties of system, formulated in the Information State approach Covering not only information seeking and collaborative planning dialogue
Background: Information State Approach Information State (IS) –an abstract data structure (record, DRS, set, stack etc.) –accessed by dialogue system modules using conditions and operations Dialogue Moves –utterance function (ask, answer, request etc.) Update rules –Modify IS based on observed moves –Select moves to be performed IS Approach implemented in TrindiKit
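The three ingredients above can be illustrated with a minimal Python sketch. This is not TrindiKit code: the names `InfoState`, `UpdateRule`, `integrate_ask`, `select_answer` and `apply_rules` are hypothetical, and the rule-application strategy (first applicable rule, repeated until none applies) is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Move = Tuple[str, str]  # (move type, content), e.g. ("ask", "price?")


@dataclass
class InfoState:
    """A toy information state: a record holding a stack and two move lists."""
    questions: List[str] = field(default_factory=list)     # used as a stack
    latest_moves: List[Move] = field(default_factory=list)  # observed moves
    next_moves: List[Move] = field(default_factory=list)    # moves to perform


@dataclass
class UpdateRule:
    """A rule applies its operation when its condition holds on the state."""
    name: str
    condition: Callable[[InfoState], bool]
    operation: Callable[[InfoState], None]


def integrate_ask(state: InfoState) -> None:
    # Consume the observed ask move and push its question onto the stack.
    _, question = state.latest_moves.pop(0)
    state.questions.append(question)


def select_answer(state: InfoState) -> None:
    # Select an answer move addressing the topmost question.
    state.next_moves.append(("answer", state.questions[-1]))


RULES = [
    UpdateRule(
        "integrate_ask",
        lambda s: bool(s.latest_moves) and s.latest_moves[0][0] == "ask",
        integrate_ask,
    ),
    UpdateRule(
        "select_answer",
        lambda s: bool(s.questions) and not s.next_moves,
        select_answer,
    ),
]


def apply_rules(state: InfoState) -> List[str]:
    """Apply the first applicable rule repeatedly until no rule applies."""
    fired = []
    while True:
        for rule in RULES:
            if rule.condition(state):
                rule.operation(state)
                fired.append(rule.name)
                break
        else:
            return fired


state = InfoState(latest_moves=[("ask", "price?")])
fired = apply_rules(state)
```

Here one rule modifies the IS based on an observed move and the other selects a move to be performed, matching the two roles of update rules listed on the slide.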
Dialogue classification & IS approach We want to relate our classification to components of the IS approach: –IS type –Dialogue moves –Update rules In this talk, rather informally –For GoDiS, we have more formal descriptions
Some initial dimensions of classification Inquiry-oriented vs. Action-oriented dialogue Type of result: simple/complex Type of external process: active/passive Distribution of decision rights: shared/disjoint
Inquiry-oriented vs. action-oriented dialogue IOD: raising and addressing issues –E.g. database search AOD: introduces (non-communicative) actions to be performed (requests) –E.g. programming a Video Recorder
Dialogue genre | Moves/rules | Information State components
Inquiry-Oriented (IOD) | ask, answer | Question stack
Action-Oriented (AOD) | request, confirm | Action stack
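As a rough Python sketch of the genre-specific components, the four move types can be mapped onto a question stack and an action stack. The class `GenreState` and function `integrate` are hypothetical names, and two simplifications are assumed: an answer always resolves the topmost question, and a confirm means the topmost action has been carried out.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Move = Tuple[str, str]  # (move type, content)


@dataclass
class GenreState:
    question_stack: List[str] = field(default_factory=list)  # IOD component
    action_stack: List[str] = field(default_factory=list)    # AOD component


def integrate(state: GenreState, move: Move) -> None:
    """Update the stack that corresponds to the genre of the move."""
    kind, content = move
    if kind == "ask":        # IOD: raise an issue
        state.question_stack.append(content)
    elif kind == "answer":   # IOD: address the topmost issue (assumption)
        if state.question_stack:
            state.question_stack.pop()
    elif kind == "request":  # AOD: introduce an action to be performed
        state.action_stack.append(content)
    elif kind == "confirm":  # AOD: topmost action is done (assumption)
        if state.action_stack:
            state.action_stack.pop()
    else:
        raise ValueError(f"unknown move type: {kind}")


state = GenreState()
for move in [
    ("request", "record channel 4"),
    ("ask", "which day?"),
    ("answer", "monday"),
]:
    integrate(state, move)
```

After these three moves the question has been addressed, while the requested action stays on the action stack until a confirm move is integrated. A system handling both genres simply carries both stacks.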
Result type Is the primary result of the dialogue a simple or a complex information object? –Simple: proposition, action –Complex: plan, proof, explanation Complex results require update rules and information state components (e.g. a tree) enabling incremental construction Example: offline planning –U: Get me coffee –R: How do I do that? –U: First, go to the kitchen. –R: OK. And then? –U: Go to the coffee machine. –…
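The incremental construction of a complex result can be sketched in Python using the coffee example. The `PlanNode` class and its methods are hypothetical; the sketch assumes that every instruction from the user becomes a new child step of the plan's root.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PlanNode:
    """A node in a plan tree: an action plus its ordered sub-steps."""
    action: str
    steps: List["PlanNode"] = field(default_factory=list)

    def add_step(self, action: str) -> "PlanNode":
        # Extend the plan by one step and return the new node,
        # so that later turns can refine it with sub-steps of its own.
        node = PlanNode(action)
        self.steps.append(node)
        return node

    def render(self, depth: int = 0) -> List[str]:
        # Flatten the tree into indented lines, one per action.
        lines = ["  " * depth + self.action]
        for step in self.steps:
            lines.extend(step.render(depth + 1))
        return lines


# Each user turn in the example dialogue adds one step to the plan.
plan = PlanNode("get coffee")
for instruction in ["go to the kitchen", "go to the coffee machine"]:
    plan.add_step(instruction)

rendered = plan.render()
```

A simple result (a single proposition or action) needs no such structure; the tree is only required because the result grows over several turns.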
Proactivity of external process Passive: database, simple device (e.g. Video Recorder) (Pro)active: device, e.g. robot, burglar alarm –May need to interrupt current dialogue, perhaps even interrupt user utterances This dimension correlates with –the way the system is connected to the device Is the device interface a resource (passive) or a module (active)? –System initiative and turntaking mechanisms
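The resource/module distinction can be sketched in Python as follows. All names (`PassiveDatabase`, `ActiveAlarm`, `next_system_turn`) are hypothetical, and the use of an event queue to model an active device is an assumption for illustration, not a description of how any actual system connects to devices.

```python
import queue
from typing import Dict, Optional


class PassiveDatabase:
    """Passive process: responds only when the dialogue system queries it."""

    def __init__(self, prices: Dict[str, int]):
        self._prices = dict(prices)

    def lookup(self, destination: str) -> Optional[int]:
        return self._prices.get(destination)


class ActiveAlarm:
    """Active process: pushes events that the system did not ask for."""

    def __init__(self, events: "queue.Queue[str]"):
        self._events = events

    def trigger(self, message: str) -> None:
        self._events.put(message)


def next_system_turn(events: "queue.Queue[str]", planned: str) -> str:
    """Let a pending device event pre-empt the planned system utterance."""
    try:
        return "ALERT: " + events.get_nowait()
    except queue.Empty:
        return planned


events: "queue.Queue[str]" = queue.Queue()
alarm = ActiveAlarm(events)
quiet_turn = next_system_turn(events, "Which channel?")
alarm.trigger("door opened")
interrupted_turn = next_system_turn(events, "Which channel?")
```

The passive database only needs a query interface, whereas the active alarm forces the turn-taking logic to check for pending events before each system utterance.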
Distribution of decision rights Disjoint: each question is directed to a specific DP (dialogue participant); this DP decides on the answer and does not need to negotiate Shared: some question(s) should be answered jointly; negotiation may be needed Dialogue system requirements for negotiation: –Dialogue move: propose –Information state component: a stack of pairs, each consisting of an issue under negotiation and the alternative solutions/answers to this issue N.B.: we here refer to collaborative negotiation (non-conflicting goals) –E.g. SunDial furniture selection task
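The negotiation component described above can be sketched in Python as a stack of (issue, alternatives) pairs. The class `NegotiationState` and the methods `raise_issue` and `accept` are hypothetical; only the propose move and the stack of pairs come from the slide, and the furniture example values are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class NegotiationState:
    # Stack of (issue, alternatives) pairs; the last element is the top.
    under_negotiation: List[Tuple[str, List[str]]] = field(default_factory=list)
    agreed: Dict[str, str] = field(default_factory=dict)

    def raise_issue(self, issue: str) -> None:
        # Open a new issue with no alternatives proposed yet.
        self.under_negotiation.append((issue, []))

    def propose(self, alternative: str) -> None:
        # The propose move adds an alternative to the topmost issue.
        _, alternatives = self.under_negotiation[-1]
        if alternative not in alternatives:
            alternatives.append(alternative)

    def accept(self, alternative: str) -> None:
        # Hypothetical closing step: settle the topmost issue jointly.
        issue, alternatives = self.under_negotiation[-1]
        if alternative not in alternatives:
            raise ValueError(f"{alternative!r} was never proposed")
        self.under_negotiation.pop()
        self.agreed[issue] = alternative


state = NegotiationState()
state.raise_issue("which sofa?")
state.propose("red sofa")
state.propose("blue sofa")
open_issues = list(state.under_negotiation)
state.accept("blue sofa")
```

With disjoint decision rights none of this is needed: the addressed participant simply answers, so the question stack alone suffices.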
activity | IOD/AOD | result type | external process | decision rights
database search | IOD | simple: price etc.; complex: itinerary | passive (database) | disjoint
ticket booking | AOD+IOD | simple: flight | passive (database) | disjoint
simple device control | AOD+IOD | simple: actions | passive or active | disjoint
instructional (sys instructs usr) | AOD+IOD | simple: actions | passive (manual) | disjoint
offline planning, incl. itinerary planning, complex device control | AOD | complex: plan(s) | passive (planner) | shared
online planning, e.g. TRIPS | AOD+IOD | complex: plan | active (device+planner) | shared
explanation | IOD | complex: proof or explanation | passive (inference engine) | shared
tutorial | IOD/AOD | complex? | passive (planner) | disjoint
narration | IOD | complex: narrative | passive | disjoint
Possible additional activity-related factors Distribution of information –Symmetric: DPs have same kind of information –Asymmetric: DPs have different kinds of information –Relation to distribution of decision rights? Shared or conflicting goals –Conflicting goals may lead to non-collaborative negotiation, which would require argumentation acts, including rhetorical acts Number of simultaneous tasks (one or several) –But probably very few activities with just one task …
Comments What we really are classifying are activities –Table shows a classification of activities according to features of a dialogue system needed to participate in dialogues in these activities How specific should our activities, or activity types, be? –Action oriented dialogue? Device control? VCR control? Dialogue with Panasonic VCR 4500? Is ”genre” still a useful term? –Could perhaps be reserved for very basic properties, such as IOD/AOD –Or have genres like ”AOD for active devices and collaborative negotiation and asymmetric distribution of information”
How can this classification be used? Make decision graphs … –… which, based on properties of the activity, including dialogue properties, … –… lead to dialogue genres, or to desired properties of the system. Based on the output of the decision graph, –select the variant of the system closest to the requirements –E.g. GoDiS for AOD with passive devices and disjoint decision rights
Sample decision graph (partial, and assumes disjoint decision rights)
[Figure: decision graph. Question nodes: "Does the dialogue involve requests for actions?"; "Is the goal of the dialogue to control a device?"; "Is the goal of the dialogue to retrieve information from a database?"; "Is the device active?". Outcome nodes: AOD-Passive; IOD.]
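A decision graph of this kind can be written as a small Python function. The branching order below is an assumption, since the edges of the figure are not recoverable from the slide text; the `Activity` record, the `classify` function and the outcome labels "AOD-Active", "AOD" and "unclassified" are hypothetical, while "AOD-Passive" and "IOD" are the outcomes named on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    """Answers to the four questions in the sample graph."""
    involves_action_requests: bool
    controls_device: bool = False
    device_is_active: bool = False
    retrieves_from_database: bool = False


def classify(activity: Activity) -> str:
    """Walk the (assumed) graph; disjoint decision rights are presupposed."""
    if activity.involves_action_requests:
        if activity.controls_device:
            # "AOD-Active" is a hypothetical label for the other branch.
            return "AOD-Active" if activity.device_is_active else "AOD-Passive"
        return "AOD"  # hypothetical: action requests without a device
    if activity.retrieves_from_database:
        return "IOD"
    return "unclassified"  # hypothetical: the graph is only partial


vcr = Activity(involves_action_requests=True, controls_device=True)
timetable = Activity(involves_action_requests=False, retrieves_from_database=True)
vcr_genre = classify(vcr)
timetable_genre = classify(timetable)
```

The returned label would then be used to pick the closest system variant, for example a variant built for AOD with passive devices.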
Libraries? Disadvantages of ”system variants” approach –Large number of system variants –Same code represented in several system variants Ideally, –system properties should correlate with modular libraries of moves, rules, and IS components; –These libraries can be combined into a system suitable for dialogue in the activity. Libraries e.g. for –AOD, IOD –Simple results, complex results –Negotiation
Independent ”decision graphs” for libraries: examples Does the dialogue involve questions and answers? –Yes -> use ”IOD” library Does the dialogue involve requests for actions? –Yes -> use ”AOD” library Does the dialogue involve an active external process? –Yes -> use ”ActiveDevice” library –No -> use the ”PassiveDevice” library Are there issues with shared decision rights? –Yes -> use ”Negotiation” library
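Because the four questions are independent, library selection can be sketched in Python as a sequence of separate checks. The library names are those on the slide; the record `ActivityProperties` and the function `select_libraries` are hypothetical, and the property values for the two example activities are read off the classification table.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ActivityProperties:
    questions_and_answers: bool
    action_requests: bool
    active_external_process: bool
    shared_decision_rights: bool


def select_libraries(props: ActivityProperties) -> List[str]:
    """Answer each question on its own and collect the matching libraries."""
    libraries = []
    if props.questions_and_answers:
        libraries.append("IOD")
    if props.action_requests:
        libraries.append("AOD")
    # This question has a library on both branches.
    if props.active_external_process:
        libraries.append("ActiveDevice")
    else:
        libraries.append("PassiveDevice")
    if props.shared_decision_rights:
        libraries.append("Negotiation")
    return libraries


# Online planning: AOD+IOD, active external process, shared decision rights.
online_planning = ActivityProperties(True, True, True, True)
# Ticket booking: AOD+IOD, passive database, disjoint decision rights.
ticket_booking = ActivityProperties(True, True, False, False)

planning_libs = select_libraries(online_planning)
booking_libs = select_libraries(ticket_booking)
```

Unlike a single decision graph that ends in one system variant, each check contributes at most one library, so the number of combinations does not have to be enumerated in advance.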
Libraries, cont’d Libraries would also simplify implementation: –Enables upgrading a library without having to change anything else –E.g. plug in a new analysis of grounding –Allows reuse of the same rules etc. in multiple genres However, it may be difficult to achieve the required degree of modularity
Summary By –relating properties of (dialogue in) activities to properties of dialogue systems, we can –determine which variant of a system (or which combination of libraries) to use for a system in a given activity We provided a first attempt at such a classification, –and discussed how it could be used
Future work Extend the number of dimensions of classification –More activity-related factors –Add modality-related factors? Explore the idea of libraries –May be difficult to implement (Extend capabilities of GoDiS –Currently, IOD and AOD for passive devices, disjoint decision rights, asymmetric distribution of information, shared goals, multiple simultaneous tasks)
?
More thoughts Rule libraries come with infostate extensions/requirements, and with additional moves –Requirements not only on structure, but also on how it’s to be used, e.g. What does the order of a queue mean?
Interactive Communication Management The presence of ICM may be independent of activity –… but not the form of ICM –Have different ICM grammars for different kinds of activity –Which factors determine genre-specific ICM? Written/spoken Noisiness Available modalities How important to be right? AOD->higher requirements on recognition, more checks? Negotiation (in ”alternatives” sense) not really directly correlated with shared decision rights
Modality-related properties Written Spoken –Not noisy –Noisy Determines choice of feedback mechanisms To some extent activity-related
Allwood’s activity-based pragmatics Levels of activity/context –Physical: artifacts etc. –Biological –Psychological: beliefs, desires, intentions, … –Social: incl. rights & obligations, communicative and task- related How do these fit with the proposed activity-related factors? –Distribution of decision rights: social –Proactivity of external process: Physical (Biological? Psychological?) –Result type: Psychological? –Information state components: Psychological and social
Cutouts…
GoDiS: an issue-based dialogue system Built using TrindiKit –Toolkit for implementing and experimenting with dialogue systems based on the information state approach Explores and implements Issue-based Dialogue Management (IBDM) Extends theory to more flexible dialogue –Multiple tasks, information sharing between tasks –Interactive Communication Management (ICM), including feedback, and grounding –Question accommodation –Negotiation of alternatives –Menu based action oriented dialogue
[Figure: GoDiS system architecture. Modules: input, interpret, update, select, generate, output, control. Resources: database, lexicon, domain knowledge. Other labels: TIS, DME.]
[Figure: layers from general to specific. IS approach: TrindiKit. IBDM: GoDiS. Genre-specific: GoDiS-IOD, GoDiS-AOD. Activity-specific: Travel Agency, Autoroute, Xerox manual, VCR manager, home device manager.]
General dialogue phenomena - may appear in any activity We assume grounding & accommodation are probably present in all spoken H-H dialogue –However, grounding works very differently in noisy environments, and of course in written dialogue We don’t use these factors to distinguish activities
Feature | Moves/rules | Infostate components
ICM & grounding | ICM moves | Temporary storage, grounding issues
Question accommodation | Accommodation rules | -
Added AOD/IOD - complicated cases –Web search: IOD/AOD; what is a non-communicative action? –Offline planning (should be IOD, unless DP requested to carry out the plan) Distinguish different kinds of computer DPs –Robots vs. stationary devices, etc.
Additional dimension –Pronoun resolution needed? Or can it be ignored? How to determine this by looking at dialogue? Turntaking related to –Grounding (modality) –Passive/active device –…?
Why not use all libraries (maximal variants)? –Because it means more work when adapting to new domains