The Information State approach to dialogue modelling Staffan Larsson Dundee, Jan 2001
Overview Dialogue modelling The information state approach TrindiKit – a dialogue system toolkit GoDiS – a system implemented in TrindiKit Demo Can this be used for your purposes, and if so, how?
Dialogue modelling Theoretical motivations –find structure of dialogue –explain structure –relate dialogue structure to informational and intentional structure Practical motivations –build dialogue systems to enable natural human-computer interaction –speech-to-speech translation –...
Informal approaches to dialogue modelling speech act theory (Austin, Searle,...) –utterances are actions –illocutionary acts: ask, assert, instruct etc. discourse analysis (Schegloff, Sacks,...) –turn-taking, pre-sequences etc. dialogue games (Sinclair & Coulthard,...) –structure of dialogue segments (rather than separate utterances) –can e.g. be encoded as regular expressions or finite automata qna-game -> question qna-game* answer
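The game rule on this slide, qna-game -> question qna-game* answer, refers to itself, so the minimal sketch below uses a small recursive recognizer rather than a flat regular expression. The move labels "question" and "answer" and the function names are assumptions made for illustration; they are not taken from any dialogue-game toolkit.

```python
from typing import List


def parse_qna_game(moves: List[str], start: int = 0) -> int:
    """Return the index just past one qna-game beginning at `start`, or -1."""
    if start >= len(moves) or moves[start] != "question":
        return -1
    pos = start + 1
    # Zero or more embedded qna-games (e.g. clarification subdialogues).
    while pos < len(moves) and moves[pos] == "question":
        nxt = parse_qna_game(moves, pos)
        if nxt == -1:
            return -1
        pos = nxt
    # The game is closed by the answer to the opening question.
    if pos < len(moves) and moves[pos] == "answer":
        return pos + 1
    return -1


def is_qna_game(moves: List[str]) -> bool:
    """True if the whole move sequence is exactly one qna-game."""
    return parse_qna_game(moves, 0) == len(moves)
```

A sequence such as question, question, answer, answer is accepted as a game with one embedded subgame, while question, question, answer is rejected because the outer question is never answered.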
Computational approaches implemented in systems and toolkits finite state automata (CSLU toolkit, Nuance) frame-based (Philips, SpeechWorks) plan-based (TRAINS, Allen, Cohen, Grosz, Sidner,...) general reasoning (Sadek,...) information states (TRINDI: Traum, Bos,...)
Why build dialogue systems? theoretical: test theories –e.g. what kind of information does the system need to keep track of? –problem: complex system with many components practical: natural language interfaces –databases (train timetables etc) –electronic devices (mobile phones,...) –instructional/helpdesk systems –booking flights etc –tutorial systems
What does a system need to be able to do? speech recognition parsing, syntactic and semantic interpretation –resolve ambiguities –anaphora and ellipsis resolution, etc... dialogue management –how does an utterance change the state of the dialogue? –given the current state of the dialogue, what should the system do? natural language generation speech synthesis
Why spoken dialogue? Spoken dialogue is the natural way for people to communicate –computers should adapt to humans rather than the other way around It is important to enable system and user to communicate in a natural (human-like) way –mixed initiative –turn-taking, feedback, barge-in –handle embedded subdialogues –...
What’s happening with dialogue systems Beginning to be used commercially Limited domains –need to encode domain-specific knowledge; a general system would require general world knowledge –speech recognition is harder with a large lexicon Simple dialogue types –mostly information-seeking Need to bridge the gap between dialogue theory and working systems
The information state approach – key concepts Information states represent information available to dialogue participants, at any given stage of the dialogue Dialogue moves trigger information state updates, formalised as information state update rules Update rules consist of conditions and operations on the information state Dialogue move engine updates the information state based on observed moves, and decides on next move(s)
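As a rough illustration of these concepts, the sketch below models an update rule as a list of conditions plus a list of operations, and an update step that keeps applying the first applicable rule until none applies. It is a Python approximation under stated assumptions, not TrindiKit's rule language: the names UpdateRule, update and GREET_RULE, and the dictionary-based state, are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

State = Dict[str, Any]  # hypothetical: the information state as a plain dict


@dataclass
class UpdateRule:
    name: str
    conditions: List[Callable[[State], bool]]  # all must hold on the state
    operations: List[Callable[[State], None]]  # applied in order if they do

    def apply(self, state: State) -> bool:
        """Run the operations if every condition holds; report whether it fired."""
        if not all(cond(state) for cond in self.conditions):
            return False
        for op in self.operations:
            op(state)
        return True


def update(state: State, rules: List[UpdateRule]) -> List[str]:
    """Apply the first applicable rule repeatedly; return the names that fired."""
    fired = []
    while True:
        for rule in rules:
            if rule.apply(state):
                fired.append(rule.name)
                break
        else:  # no rule was applicable in this pass
            return fired


# Hypothetical rule: integrate an observed greeting exactly once.
GREET_RULE = UpdateRule(
    name="integrateGreet",
    conditions=[lambda s: s["latest_move"] == "greet",
                lambda s: not s["integrated"]],
    operations=[lambda s: s["agenda"].append("greet"),
                lambda s: s.update(integrated=True)],
)
```

Calling update on a state whose latest move is a greeting fires the rule once; the second condition then fails, so the loop terminates.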
TrindiKit A toolkit for building and experimenting with dialogue move engines and systems, based on the information state approach
TrindiKit architecture
The information state is an abstract data structure (record, DRS, set, stack etc.)
Modules (dialogue move engine, input, interpretation, generation, output etc.) access the information state
DME (Dialogue Move Engine): module or group of modules responsible for integrating and generating dialogue moves
Resources (device interface, lexicons, domain knowledge etc.) are hooked up to the information state
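[Figure: TrindiKit architecture. The modules input, interpret, update, select, generate and output, together with a control module, access the Information State; update and select are labelled DME; the resources shown are the lexicon and domain knowledge.]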
TrindiKit components
A library of datatype definitions (records, DRSs, sets, stacks etc.)
A language for writing information state update rules
Methods and tools for visualising the information state
Debugging facilities
TrindiKit components (cont’d)
A language for defining update algorithms used by TrindiKit modules to coordinate update rule application
A language for defining basic control structure, to coordinate modules
A library of basic ready-made modules for input/output, interpretation, generation etc.
A library of ready-made resource interfaces, e.g. to hook up databases, domain knowledge etc.
Building a system
[Figure: layered diagram. TrindiKit provides the software engineering layer (basic types, control flow); adding a dialogue theory (IS, rules, moves etc.) gives a domain-independent DME; adding domain knowledge (resources) gives a domain-specific system.]
Building a domain-independent DME
Starting from a theory of dialogue management, decide on
–Type of information state (DRS, record, set of propositions, frame,...)
–A set of dialogue moves
–Information state update rules, including rules for integrating and selecting moves
–DME module algorithm(s) and basic control algorithm
The DME is domain independent, given a certain type of dialogue
–information-seeking
–instructional
–negotiative
–...
Domain-specific system Build or select from existing components: Resources, e.g. –domain (device/database) interface –dialog-related domain knowledge, e.g. plan libraries etc. –grammars, lexicons Modules, e.g. –input –interpretation –generation –output
TrindiKit features
Explicit information state datastructure makes systems more transparent
Update rules provide an intuitive way of formalising theories in a way which can be used by a system
Domain knowledge encoded in resources; the rest of the system is domain independent
Features, cont’d Allows both serial and asynchronous systems Interfaces to OAA (only available for UNIX) Generic WWW interface Runs on UNIX, Windows, Linux etc. Needs SICStus Prolog Version 2.0 is available, next version expected early 2001 (SIRIDUS)
Extensions Modules for speech input and output, for using off-the-shelf products (SIRIDUS project) GUI for increased usability and overview, including tools for building systems Extend libraries of ready-made modules and resources Use in new tasks? –previously, the main focus has been on dialogue management –other tasks may require additional components
Systems developed using TrindiKit
GoDiS and IMDiS – information state based on Questions Under Discussion
MIDAS – DRS information state, first-order reasoning
EDIS – information state based on PTT
Autoroute – information state based on Conversational Game Theory
GoDiS: an experimental dialogue system built using TrindiKit
GoDiS features
Information-seeking dialogue
Information state based on Ginzburg’s notion of Questions Under Discussion (QUD)
Dialogue plans to drive dialogue
Simpler than general reasoning and planning
More versatile than frame-filling and finite automata
GoDiS & TrindiKit
[Figure: layered diagram. TrindiKit implements the information state approach; adding a QUD-based dialogue theory (IS, rules,...) gives the generic GoDiS system; adding domain & language resources gives a domain-specific GoDiS system.]
GoDiS dialogue moves Moves are determined by the relation of the content to the domain –utterance U is an answer if the content A of U is a relevant answer to a question Q in the domain –moves are not necessarily speech acts! Moves –ask(Q) –answer(A) –request repetition –greeting, quit
Sample GoDiS information state
PRIVATE =
  PLAN = findout(? x.month(x)), findout(? x.class(x)), respond(? x.price(x))
  AGENDA = { findout(?return) }
  BEL = { }
  TMP = (same structure as SHARED)
SHARED =
  COM = dest(paris), transport(plane), task(get_price_info)
  QUD =
  LM = { ask(sys, x.origin(x)) }
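A record-structured information state of this kind could be sketched with nested dataclasses, as below. This is only an illustration in Python: the class names, the choice of lists versus sets for each field, and the simplified string encoding of questions and propositions are assumptions, and the contents of QUD are left empty because the slide text does not show them.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Shared:
    com: Set[str] = field(default_factory=set)    # shared commitments
    qud: List[str] = field(default_factory=list)  # stack; top is the last item
    lm: Set[str] = field(default_factory=set)     # latest move(s)


@dataclass
class Private:
    plan: List[str] = field(default_factory=list)
    agenda: List[str] = field(default_factory=list)
    bel: Set[str] = field(default_factory=set)
    tmp: Shared = field(default_factory=Shared)   # same structure as SHARED


@dataclass
class InfoState:
    private: Private = field(default_factory=Private)
    shared: Shared = field(default_factory=Shared)


def sample_state() -> InfoState:
    """Build a state resembling the sample, with simplified string contents."""
    state = InfoState()
    state.private.plan = ["findout(month)", "findout(class)", "respond(price)"]
    state.private.agenda = ["findout(return)"]
    state.shared.com = {"dest(paris)", "transport(plane)", "task(get_price_info)"}
    state.shared.lm = {"ask(sys, origin)"}
    return state
```

Fields are then addressed by path, for example sample_state().shared.com, which mirrors the SHARED.COM notation used in the update rules.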
Sample update rule: integrateAnswer
Before an answer can be integrated by the system, it must be matched to a question on QUD
pre:
  in( SHARED.LM, answer(usr, A))
  fst( SHARED.QUD, Q)
  relevant_answer(Q, A)
eff:
  pop( SHARED.QUD )
  reduce(Q, A, P)
  add( SHARED.COM, P)
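The integrateAnswer rule can be approximated in Python as follows. This is a sketch under simplifying assumptions, not the GoDiS implementation: the state is a plain dictionary, questions are attribute names such as "dest", and the DOMAIN table, relevant_answer and reduce_to_proposition are hypothetical stand-ins for the domain resource and for reduce(Q, A, P).

```python
# Hypothetical domain knowledge: which values are relevant answers to which question.
DOMAIN = {
    "dest": {"paris", "london"},
    "transport": {"plane", "train"},
}


def relevant_answer(question: str, answer: str) -> bool:
    """True if `answer` is a possible value for `question` in the toy domain."""
    return answer in DOMAIN.get(question, set())


def reduce_to_proposition(question: str, answer: str) -> str:
    """Combine a question and a short answer into a proposition string."""
    return f"{question}({answer})"


def integrate_answer(state: dict) -> bool:
    """Apply the rule if its preconditions hold; return whether it fired."""
    shared = state["shared"]
    if not shared["qud"]:
        return False
    question = shared["qud"][-1]  # fst(SHARED.QUD, Q): topmost question
    for move in shared["lm"]:     # in(SHARED.LM, answer(usr, A))
        kind, speaker, content = move
        if kind == "answer" and speaker == "usr" and relevant_answer(question, content):
            shared["qud"].pop()   # pop(SHARED.QUD)
            shared["com"].add(reduce_to_proposition(question, content))
            return True
    return False
```

With "dest" on top of QUD and a latest move ("answer", "usr", "paris"), the rule fires and adds dest(paris) to the shared commitments; an answer that does not match the topmost question leaves the state untouched.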
Typical human-computer dialogue
S: Hello, how can I help you?
U: I want price information please
S: Where do you want to go?
U: Paris
S: How do you want to travel?
U: A flight please
S: When do you want to travel?
U: April
S: What class did you have in mind?
…
S: The price is $123
Dialogue plans for information-seeking dialogue
Find out how user wants to travel
Find out where user wants to go to
Find out where user wants to travel from
Find out when user wants to travel
…
Lookup database
Tell user the price
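One way to read such a plan is as an ordered list of actions from which the system picks the first one that is still open. The sketch below assumes a hypothetical encoding of plan steps as (action, argument) pairs and a set of already answered questions; it is not how GoDiS stores or executes plans.

```python
from typing import Optional, Sequence, Set, Tuple

Action = Tuple[str, str]  # hypothetical encoding: (action name, argument)


def next_action(plan: Sequence[Action], answered: Set[str]) -> Optional[Action]:
    """Return the first plan step that still needs doing, or None if none is left."""
    for action, arg in plan:
        # A findout step is done once its question has been answered.
        if action == "findout" and arg in answered:
            continue
        return (action, arg)
    return None


# Hypothetical price-information plan mirroring the steps listed above.
PRICE_PLAN = [
    ("findout", "transport"),
    ("findout", "dest"),
    ("findout", "origin"),
    ("findout", "month"),
    ("lookup", "price"),
    ("respond", "price"),
]
```

Because answered questions are skipped rather than asked in fixed order, the same plan also covers a user who volunteers several answers at once.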
Typical human-human dialogue
S(alesman), C(ustomer)
S: hi
C: flights to paris
S: when do you want to travel?
C: april, as cheap as possible
...
Accommodation Lewis (1979): If someone says something at t which requires X to be in the conversational scoreboard, and X is not in the scoreboard at t, then (under certain conditions) X will become part of the scoreboard at t Has been applied to referents and propositions, as parts of the conversational scoreboard / information state
Question accommodation If questions are part of the information state, they too can be accommodated If the latest move was an answer, and there is an action in the plan to ask a matching question, put that question on QUD Requires that the number of possible matching questions is not too large (or can be narrowed down by asking a clarification question)
Update rule for question accommodation: QuAcc
pre:
  in( SHARED.LM, answer(usr, A))
  in( PRIVATE.PLAN, findout(Q))
  relevant_answer(Q, A)
eff:
  delete( PRIVATE.PLAN, findout(Q))
  push( SHARED.QUD, Q)
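The QuAcc rule might look roughly like this in Python. As before, this is a sketch under assumptions rather than the GoDiS code: the state is a dictionary, plan steps are (action, question) pairs, and DOMAIN and relevant_answer are hypothetical stand-ins for the domain resource.

```python
# Hypothetical domain knowledge: which values are relevant answers to which question.
DOMAIN = {
    "dest": {"paris", "london"},
    "transport": {"plane", "train"},
    "month": {"april", "may"},
}


def relevant_answer(question: str, answer: str) -> bool:
    """True if `answer` is a possible value for `question` in the toy domain."""
    return answer in DOMAIN.get(question, set())


def accommodate_question(state: dict) -> bool:
    """Move a planned question onto QUD if the latest user answer matches it."""
    plan = state["private"]["plan"]
    for kind, speaker, content in state["shared"]["lm"]:
        if kind != "answer" or speaker != "usr":
            continue                                  # in(SHARED.LM, answer(usr, A))
        for step in plan:
            action, question = step                   # in(PRIVATE.PLAN, findout(Q))
            if action == "findout" and relevant_answer(question, content):
                plan.remove(step)                     # delete(PRIVATE.PLAN, findout(Q))
                state["shared"]["qud"].append(question)  # push(SHARED.QUD, Q)
                return True
    return False
```

After accommodation the question is topmost on QUD, so an answer-integration rule that looks for a matching question there can process the user's answer as if the system had asked it.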
Question and task accommodation in information-seeking dialogue
S: hi
U: flights to paris
–system finds plan containing appropriate questions, and loads it into the plan field in the information state
–system accommodates questions: how does user want to travel + where does user want to go, and integrates the answers “flight” and “to paris”
–system proceeds to next question on plan
S: when do you want to travel?
Optimistic approach to grounding and acceptance Dialogue participants (DPs) assume their utterances are accepted (and integrated into SHARED) –If A asks a question with content Q, A will put Q topmost on SHARED.QUD If the addressee indicates rejection, backtrack –using the PRIVATE.TMP field No need to indicate acceptance explicitly; it is assumed The alternative is a pessimistic approach –If A asks a question with content Q, A will wait for an acceptance (implicit or explicit) before putting Q on top of QUD
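The optimistic strategy can be sketched as "update SHARED immediately, but keep a copy to fall back on". The Python below assumes a dictionary-based state and the hypothetical function names optimistic_ask and backtrack; it only illustrates the role of the PRIVATE.TMP field and is not the GoDiS grounding code.

```python
import copy


def optimistic_ask(state: dict, question: str) -> None:
    """Assume the question is accepted: save SHARED to TMP, then update SHARED."""
    state["private"]["tmp"] = copy.deepcopy(state["shared"])
    state["shared"]["qud"].append(question)            # Q topmost on SHARED.QUD
    state["shared"]["lm"] = [("ask", "sys", question)]


def backtrack(state: dict) -> None:
    """On rejection, restore SHARED from the copy stored in PRIVATE.TMP."""
    state["shared"] = copy.deepcopy(state["private"]["tmp"])
```

A pessimistic variant would instead hold the question back and push it onto QUD only after an implicit or explicit acceptance has been observed.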
GoDiS features (cont’d)
Adapted for travel agency and autoroute domains, as well as acting as interface to handheld computer or mobile phone
Question and task accommodation to enable mixed initiative
Simple “optimistic” grounding strategy
Focus intonation based on information state contents
Has been extended to handle instructional dialogue (IMDiS)
Also being extended to handle negotiative dialogue (SIRIDUS)
How can you use this? use the information state approach? use TrindiKit? –tracking dialogue instead of actively participating –predicting dialogue moves problem: –everyday conversation is very complex –requires world knowledge and semantic interpretation but: –tracking dialogue is easier than participating –may not require as complex representations –infostate approach is very general