TrindiKit: A Toolkit for Flexible Dialogue Systems AI course, spring 2003 Staffan Larsson.


1 TrindiKit: A Toolkit for Flexible Dialogue Systems AI course, spring 2003 Staffan Larsson

2 This lecture: dialogue modelling in general; architecture & concepts; what’s in TrindiKit?; building a system; extending TrindiKit; features and advantages of TrindiKit; a sample system: GoDiS

3 Dialogue modelling Theoretical motivations –find structure of dialogue –explain structure –relate dialogue structure to informational and intentional structure Practical motivations –build dialogue systems to enable natural human-computer interaction –speech-to-speech translation –...

4 Informal approaches to dialogue modelling speech act theory (Austin, Searle,...) –utterances are actions –illocutionary acts: ask, assert, instruct etc. discourse analysis (Schegloff, Sacks,...) –turn-taking, pre-sequences etc. dialogue games (Sinclair & Coulthard,...) –structure of dialogue segments (rather than separate utterances) –can e.g. be encoded as regular expressions or finite automata qna-game -> question qna-game* answer
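The game rule at the end of this slide, qna-game -> question qna-game* answer, can be checked with a few lines of recursive descent. The following is a minimal Python sketch, not part of TrindiKit; the function names and the encoding of a dialogue as a list of move labels are assumptions made for illustration.

```python
from typing import List, Optional


def parse_qna_game(moves: List[str], start: int = 0) -> Optional[int]:
    """Try to read one qna-game beginning at index `start`.

    Grammar (from the slide): qna-game -> question qna-game* answer
    Returns the index just past the game, or None if no game starts here.
    """
    if start >= len(moves) or moves[start] != "question":
        return None
    pos = start + 1
    # Zero or more embedded sub-games, e.g. a clarification question
    # that must be answered before the outer question is answered.
    while True:
        after_subgame = parse_qna_game(moves, pos)
        if after_subgame is None:
            break
        pos = after_subgame
    if pos < len(moves) and moves[pos] == "answer":
        return pos + 1
    return None


def is_qna_game(moves: List[str]) -> bool:
    """True if the whole move sequence is exactly one qna-game."""
    return parse_qna_game(moves) == len(moves)
```

Because the rule refers to itself, it also accepts nested question-answer pairs, which is how an embedded subdialogue would show up in this encoding.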

5 Computational approaches implemented in systems and toolkits finite state automata (CSLU toolkit, Nuance) frame-based (Philips, SpeechWorks) plan-based (TRAINS, Allen, Cohen, Grosz, Sidner,...) general reasoning (Sadek,...) information states (TRINDI: Traum, Bos,...)

6 Why build dialogue systems? theoretical: test theories –e.g. what kind of information does the system need to keep track of? –problem: complex system with many components practical: natural language interfaces –databases (train timetables etc) –electronic devices (mobile phones,...) –instructional/helpdesk systems –booking flights etc –tutorial systems

7 What does a system need to be able to do? speech recognition parsing, syntactic and semantic interpretation –resolve ambiguities –anaphora and ellipsis resolution, etc... dialogue management –how does an utterance change the state of the dialogue? –given the current state of the dialogue, what should the system do? natural language generation speech synthesis

8 Why spoken dialogue? Spoken dialogue is the natural way for people to communicate –computers should adapt to humans rather than the other way around important to enable system and user to communicate in a natural (human-like) way –mixed initiative –turntaking, feedback, barge-in –handle embedded subdialogues –...

9 What’s happening with dialogue systems Beginning to be used commercially Limited domains –need to encode domain-specific knowledge; a general system would require general world knowledge –speech recognition is harder with large lexicon Simple dialogue types –mostly information-seeking Need to bridge gap between dialogue theory and working systems

10 What is TrindiKit? a toolkit for – building and experimenting with dialogue move engines and systems, – based on the information state approach not a dialogue system in itself

11 Architecture & concepts

12 [Architecture diagram. Labels: module 1 … module n, with a group of modules marked DME; control; Total Information State (TIS), containing the Information State proper (IS), the Module Interface Variables and the Resource Interface Variables; resource 1 … resource m.]

13 Information State (IS) an abstract data structure (record, DRS, set, stack etc.) accessed by modules using conditions and operations the Total Information State (TIS) includes –Information State proper (IS) –Module Interface variables –Resource Interface variables
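A minimal Python sketch of the layout described on this slide, assuming plain dictionaries for the three parts of the TIS. TrindiKit itself is written in Prolog; the class, method and variable names below are hypothetical and only illustrate the split between conditions (checks) and operations (updates).

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class TotalInformationState:
    """Hypothetical container mirroring the three parts of the TIS."""
    information_state: Dict[str, Any] = field(default_factory=dict)
    module_interface: Dict[str, Any] = field(default_factory=dict)
    resource_interface: Dict[str, Any] = field(default_factory=dict)

    def check(self, part: str, key: str, expected: Any) -> bool:
        """A condition: inspect a value without changing the state."""
        return getattr(self, part).get(key) == expected

    def apply(self, part: str, key: str, value: Any) -> None:
        """An operation: change the state."""
        getattr(self, part)[key] = value


tis = TotalInformationState()
tis.apply("module_interface", "latest_speaker", "usr")
```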

14 Dialogue Move Engine (DME) module or group of modules responsible for –updating the IS based on observed moves –selecting moves to be performed dialogue moves are associated with IS updates using IS update rules –there are also update rules not directly associated with any move (e.g. for reasoning and planning) update rules: rules for updating the TIS –rule name and class –precondition list: conditions on TIS –effect list: operations on TIS update rules are coordinated by update algorithms
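The four parts of an update rule listed on this slide (name, class, precondition list, effect list) can be sketched in Python as follows. This is an illustration under stated assumptions, not TrindiKit's implementation: the state is a plain dictionary, and the example rule and its names are hypothetical. Real TrindiKit preconditions can also bind Prolog variables, which this sketch leaves out.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

State = Dict[str, Any]


@dataclass
class UpdateRule:
    name: str
    rule_class: str
    preconditions: List[Callable[[State], bool]]  # conditions on the state
    effects: List[Callable[[State], None]]        # operations on the state

    def apply(self, state: State) -> bool:
        """Run the effects only if every precondition holds."""
        if all(condition(state) for condition in self.preconditions):
            for effect in self.effects:
                effect(state)
            return True
        return False


def agenda_is_nonempty(state: State) -> bool:
    return bool(state["agenda"])


def pop_agenda(state: State) -> None:
    state["agenda"].pop(0)


# Hypothetical rule: remove the first item from the agenda.
downdate = UpdateRule("downdateAgenda", "downdate_agenda",
                      [agenda_is_nonempty], [pop_agenda])
```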

15 Modules and resources Modules (dialogue move engine, input, interpretation, generation, output etc.) –access the information state –no direct communication between modules: they communicate only via module interface variables in the TIS, so modules don’t have to know anything about other modules, which increases modularity, reusability, reconfigurability –may interact with user or external processes Resources (device interface, lexicons, domain knowledge etc.) –hooked up to the information state (TIS) –accessed by modules –defined as object of some type (e.g. "lexicon")

16 What’s in TrindiKit?

17 What does TrindiKit provide? High-level formalism and interpreter for implementing dialogue systems –promotes transparency, reusability, plug-and-play, etc. –allows implementation and comparison of dialogue theories –hides low-level software engineering issues GUI, WWW-demo Ready-made modules and resources –speech –interfaces to databases, devices, etc. –reasoning, planning

18 TrindiKit contents (1) a library of datatype definitions (records, DRSs, sets, stacks etc.) –user extendible a language for writing information state update rules GUI: methods and tools for visualising the information state debugging facilities –typechecking –logs of communication between modules and the TIS –etc.

19 TrindiKit contents (2) A language for defining update algorithms, used by TrindiKit modules to coordinate update rule application A language for defining basic control structure, to coordinate modules A library of basic ready-made modules for input/output, interpretation, generation etc. A library of ready-made resources and resource interfaces, e.g. to hook up databases, domain knowledge, devices etc.

20 Special modules and resources included with TrindiKit OAA interface resource –enables interaction with existing software and languages other than Prolog Speech recognition and synthesis modules –TrindiKit shells for off-the-shelf products, e.g. Nuance Possible future modules: –planning and reasoning modules –multimodal input and output

21 Asynchronous TrindiKit Internal communication uses either –OAA (Open Agent Architecture) from SRI, or –AE (Agent Environment), a stripped-down version of OAA, implemented for TrindiKit enables asynchronous dialogue management –e.g.: system can listen and interpret, plan the dialogue, and talk at the same time

22 How to build a system

23 How to use TrindiKit We start from TrindiKit –Implements the information state approach –Takes care of low-level programming: dataflow, datastructures etc. [Diagram labels: TrindiKit; information state approach]

24 How to build a basic system Formulate a basic dialogue theory –Information state –Dialogue moves –Update rules Add appropriate modules (speech recognition etc) [Diagram labels: TrindiKit; information state approach; basic dialogue theory; basic system]

25 How to build a genre-specific system Add genre-dependent IS components, moves and rules [Diagram labels: TrindiKit; information state approach; basic dialogue theory; basic system; genre-specific theory additions; genre-specific system]

26 How to build an application Add application-specific resources [Diagram labels: TrindiKit; information state approach; basic dialogue theory; basic system; genre-specific theory additions; genre-specific system; domain & language resources; application]

27 Building a domain-independent Dialogue Move Engine Come up with a nice theory of dialogue Formalise the theory, i.e. decide on –Type of information state (DRS, record, set of propositions, frame,...) –A set of dialogue moves –Information state update rules, including rules for integrating and selecting moves –DME Module algorithm(s) and basic control algorithm –any extra datatypes (e.g. for semantics: proposition, question, etc.)

28 Specifying Infostate type the Total Information State contains a number of Information State Variables –IS, the Information State "proper" –Interface Variables used for communication between modules –Resource Variables used for hooking up resources to the TIS, thus making them accessible to modules use prespecified or new datatypes

29 Example: GoDiS infostate PRIVATE : PLAN : OpenStack( Action ) AGENDA : OpenQueue( Action ) SHARED : BEL : Set( Prop ) COM : Set( Prop ) QUD : OpenStack( Question ) LU : SPEAKER : Speaker MOVES : OpenQueue( Move ) ISSUES : OpenStack( Question ) Notes: QUD is local (questions available for ellipsis resolution); ISSUES is global (questions which have been raised but not yet resolved)
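The record type on this slide can be approximated with nested Python dataclasses. This is only a sketch: Python lists and sets stand in for TrindiKit's OpenStack, OpenQueue and Set datatypes, strings stand in for Action, Prop, Question and Move, and all class names are hypothetical. The flattened slide text does not show unambiguously which sub-record BEL belongs to; placing it under the private part is an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class LatestUtterance:
    speaker: str = "sys"
    moves: List[str] = field(default_factory=list)   # OpenQueue(Move)


@dataclass
class Private:
    plan: List[str] = field(default_factory=list)    # OpenStack(Action)
    agenda: List[str] = field(default_factory=list)  # OpenQueue(Action)
    bel: Set[str] = field(default_factory=set)       # Set(Prop); placement assumed


@dataclass
class Shared:
    com: Set[str] = field(default_factory=set)       # Set(Prop)
    qud: List[str] = field(default_factory=list)     # OpenStack(Question), local
    issues: List[str] = field(default_factory=list)  # OpenStack(Question), global
    lu: LatestUtterance = field(default_factory=LatestUtterance)


@dataclass
class InfoState:
    private: Private = field(default_factory=Private)
    shared: Shared = field(default_factory=Shared)


state = InfoState()
state.shared.qud.insert(0, "?x.channel(x)")  # push a question onto QUD
```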

30 Specifying a set of moves amounts to specifying objects of type move (a reserved type) –there may be type constraints on the arguments of moves Example: GoDiS dialogue moves –Ask(Q), Q is a question –Answer(A), A is an answer (proposition or fragment) –Request(α), α is an action –Confirm(α) –Greet –Quit

31 Writing rules rule = conditions + updates –if the rule is applied to the IS and its conditions are true, the operations will be applied to the IS –conditions may bind variables with scope over the rule (prolog variables, with unification and backtracking)

32 Example: a rule from GoDiS
rule( integrateUsrAnswer,
      [ $/shared/lu/speaker = usr,
        assoc( $/shared/lu/moves, answer(R), false ),
        fst( $/shared/qud, Q ),
        $domain : relevant_answer( Q, R ),
        $domain : reduce( Q, R, P ) ],
      [ set_assoc( /shared/lu/moves, answer(R), true ),
        /shared/qud := $$pop( $/shared/qud ),
        add( /shared/com, P ) ] ).
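For readers who do not know the rule syntax, the GoDiS rule above can be read as the following Python function. This is a loose paraphrase, not a translation TrindiKit would accept: the state is a nested dictionary, the QUD stack is a list with its top at index 0, and the DOMAIN table is a hypothetical stand-in that folds the domain resource's relevant_answer and reduce relations into one lookup.

```python
# Hypothetical domain knowledge: (question, short answer) -> proposition.
DOMAIN = {
    ("?x.dest_city(x)", "paris"): "dest_city(paris)",
}


def integrate_usr_answer(state: dict) -> bool:
    """Integrate one not-yet-integrated user answer to the topmost QUD question."""
    lu = state["shared"]["lu"]
    qud = state["shared"]["qud"]
    # Conditions: the user spoke last and there is a question under discussion.
    if lu["speaker"] != "usr" or not qud:
        return False
    question = qud[0]                                # fst( $/shared/qud, Q )
    for move, integrated in lu["moves"].items():
        kind, content = move
        if kind != "answer" or integrated:           # assoc( ..., answer(R), false )
            continue
        proposition = DOMAIN.get((question, content))  # relevant_answer + reduce
        if proposition is None:
            continue
        # Effects: mark the move as integrated, pop QUD, add to commitments.
        lu["moves"][move] = True
        qud.pop(0)
        state["shared"]["com"].add(proposition)
        return True
    return False


state = {"shared": {"lu": {"speaker": "usr",
                           "moves": {("answer", "paris"): False}},
                    "qud": ["?x.dest_city(x)"],
                    "com": set()}}
```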

33 Building modules Algorithm –For DME modules: coordinate update rules –For control modules: coordinate other modules TrindiKit includes a language for writing algorithms –For DME modules: basic imperative programming constructs –For control module: basic imperative constructs plus asynchronous triggers

34 Sample update algorithm
grounding,
if $latest_speaker == sys then
    try integrate,
    try database,
    repeat downdate_agenda,
    store
else
    repeat integrate
        orelse accommodate
        orelse find_plan
        orelse if empty( $/private/agenda ) then manage_plan else downdate_agenda
    repeat downdate_agenda
    if empty( $/private/agenda ) then repeat manage_plan
    repeat refill_agenda
    repeat store_nim
    try downdate_qud
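The constructs try, repeat and orelse used in this algorithm can be sketched as Python combinators. The semantics assumed here are: a step returns True when some rule applied; try always succeeds; repeat applies a step until it fails; orelse falls through to the next alternative on failure. These readings, the iteration limit and the toy rule class are assumptions for illustration, not TrindiKit's definitions.

```python
from typing import Any, Callable, Dict

State = Dict[str, Any]
Step = Callable[[State], bool]  # True if the step applied


def try_(step: Step) -> Step:
    """Apply the step once; succeed whether or not it applied."""
    def run(state: State) -> bool:
        step(state)
        return True
    return run


def repeat(step: Step, limit: int = 1000) -> Step:
    """Apply the step until it fails (or a safety limit is reached)."""
    def run(state: State) -> bool:
        applied = False
        for _ in range(limit):
            if not step(state):
                break
            applied = True
        return applied
    return run


def orelse(*steps: Step) -> Step:
    """Try each alternative in order; stop at the first that applies."""
    def run(state: State) -> bool:
        return any(step(state) for step in steps)
    return run


def downdate_agenda(state: State) -> bool:
    """Toy rule class: drop the first agenda item if there is one."""
    if state["agenda"]:
        state["agenda"].pop(0)
        return True
    return False
```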

35 Sample control algorithm (2)
input: {
    init => input:display_prompt,
    new_data(user_input) => input
} |
interpretation: {
    import interpret,
    condition(is_set(input)) => [ interpret, print_state ]
} |
dme: {
    import update,
    import select,
    init => [ select ],
    condition(not empty(latest_moves)) => [ update, if $latest_speaker == usr then select ]
} |
generation: {
    condition(is_set(next_moves)) => generate
} |
output: {
    condition(is_set(output)) => output
} )).
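The condition => action triggers in this control algorithm can be imitated with a small loop. The sketch below is serial and much simpler than the asynchronous version the slide describes; the TIS is a plain dictionary, and the module functions and their outputs are hypothetical placeholders.

```python
def interpret(tis):
    tis["latest_moves"] = [("greet",)]
    tis["input"] = None


def update_and_select(tis):
    tis["next_moves"] = [("greet",)]
    tis["latest_moves"] = []


def generate(tis):
    tis["output"] = "Hello!"
    tis["next_moves"] = []


def output(tis):
    tis["spoken"].append(tis["output"])
    tis["output"] = None


# (module name, trigger condition on the TIS, action to run)
TRIGGERS = [
    ("interpretation", lambda t: t["input"] is not None, interpret),
    ("dme", lambda t: bool(t["latest_moves"]), update_and_select),
    ("generation", lambda t: bool(t["next_moves"]), generate),
    ("output", lambda t: t["output"] is not None, output),
]


def run_control(tis, triggers, max_steps=20):
    """Fire the first enabled trigger, repeatedly, until none is enabled."""
    trace = []
    for _ in range(max_steps):
        for name, condition, action in triggers:
            if condition(tis):
                action(tis)
                trace.append(name)
                break
        else:
            return trace  # no trigger fired: the system is idle
    return trace


tis = {"input": "hello", "latest_moves": [], "next_moves": [],
       "output": None, "spoken": []}
trace = run_control(tis, TRIGGERS)
```

Each module only reads and writes TIS variables, so the order in which modules run falls out of the data rather than being hard-coded.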

36 From DME to dialogue system Build or select from existing components: Modules, e.g. –input –interpretation –generation –output Still domain independent the choice of modules determines e.g. the format of the grammar and lexicon

37 Domain-specific system Build or select from existing components: Resources, e.g. –domain (device/database) interface –dialog-related domain knowledge, e.g. plan libraries etc. –grammars, lexicons Example resources: GoDiS VCR control –VCR interface –Domain knowledge –Lexicon

38 Extending TrindiKit

39 You can add Datatypes –Whatever you need Modules –e.g. general interfaces to speech recognizers and synthesizers Resources –e.g. general interfaces to (passive) devices It is important that everything added is reasonably general, so that it can be reused in other systems

40 Datatype definitions relations –relations between objects; true or false functions –functions from arguments to result selectors –selects an object embedded in another object Operations –Changes the information state

41 Building modules DME modules –Specific to a certain theory of dialogue management –Best implemented using rules and algorithms Other modules –Should be more general, less specific to a certain theory of dialogue management –May be easier to implement directly in Prolog or another language. The TrindiKit algorithm language currently only covers checking and updating the infostate. These modules may also need to interact with other programs or devices.

42 Building resources Resource –the resource itself; exports a set of predicates Resource interface –defines the resource as a datatype T, i.e. in terms of relations, functions and operations Resource interface variable –a TIS variable whose value is an object of the type T By changing the value of the variable, resources can be switched dynamically –change language –change domain
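Switching a resource by reassigning a resource interface variable can be sketched as follows. The Lexicon class, the phrase tables and the say function are hypothetical; only the idea that modules reach the resource through a TIS variable comes from the slide.

```python
class Lexicon:
    """Hypothetical resource: maps dialogue moves to surface strings."""

    def __init__(self, language, phrases):
        self.language = language
        self.phrases = phrases

    def generate(self, move):
        return self.phrases[move]


ENGLISH = Lexicon("english", {"icm:acc*pos": "Okay."})
SWEDISH = Lexicon("swedish", {"icm:acc*pos": "Okej."})

# The resource interface variable: modules only ever look here.
tis = {"lexicon": ENGLISH}


def say(tis, move):
    """A generation module that knows the variable, not the concrete resource."""
    return tis["lexicon"].generate(move)


before = say(tis, "icm:acc*pos")
tis["lexicon"] = SWEDISH          # switch language at run time
after = say(tis, "icm:acc*pos")
```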

43 Features and advantages of TrindiKit

44 TrindiKit features explicit information state datastructure –makes systems more transparent –enables e.g. context sensitive interpretation, distributed decision making, asynchronous interaction update rules –provide an intuitive way of formalising theories in a way which can be used by a system –represent domain-independent dialogue management strategies

45 TrindiKit features cont’d resources –represent domain-specific knowledge –can be switched dynamically e.g. switching language on-line in GoDiS modular architecture promotes reuse –basic system -> genre-specific systems –genre-specific system -> applications

46 Theoretical advantages of TrindiKit theory-independent –allows implementation and comparison of competing theories –promotes exploration of middle ground between simplistic and very complex theories of dialogue intuitive formalisation and implementation of dialogue theories –the implementation is close to the theory

47 Practical advantages of TrindiKit promotes reuse and reconfigurability on multiple levels general solutions to general phenomena enables rapid prototyping of applications allows dealing with more complex dialogue phenomena not handled by current commercial systems

48 technical features interfaces to OAA (but can also run without it) –allows connecting systems to external software system modules can run either serially or in parallel wrappers for off-the-shelf recognizers and synthesizers runs on UNIX, Windows, Linux currently uses SICStus Prolog –but considering moving to shareware Prolog –possibly reimplement in other language –or make it independent of programming language (compilers for several languages)

49 availability TrindiKit website –www.ling.gu.se/projects/trindi/trindikit SourceForge project –development versions available –developer community? licensed under GPL more info in –Larsson & Traum: NLE Special Issue on Best Practice in Dialogue Systems Design, 2000 –TrindiKit manual (available from website)

50 Systems developed using TrindiKit GoDiS – information state based on Questions Under Discussion (Larsson et al 2000) –currently being reimplemented for thesis MIDAS – DRS information state, first-order reasoning (Bos & Gabsdil, 2000) EDIS – information state based on PTT (Matheson et al 2000) –extended to handle tutorial dialogue by Moore, Zinn, Core et al SRI Autoroute – information state based on Conversational Game Theory (Lewin 2000); robust interpretation (Milward 2000)

51 Recent work D’Homme (EU 2001) –Dialogues in the Home Environment –GoDiS, SRI system Instruction Based Learning for mobile robots (U Edinburgh) –MIDAS Tutoring Dialogue (U Edinburgh) –BEETLE (based on EDIS) Student projects (Gothenburg) adapting GoDiS to various domains


53 Research goals with GoDiS explore and implement issue-based dialogue management –starting from Ginzburg’s theory of dialogue semantics based on notion of QUD (Questions Under Discussion) –adapt to dialogue system (GoDiS) and implement –extend theory coverage, taking in relevant theories general theory of dialogue –minimize effort for adapting dialogue system to new domains incrementally extending system to handle increasingly complex types of dialogue –clarifies relation between dialogue genres –promotes reuse of update rules Larsson (2002): Issue-based Dialogue Management (PhD Thesis)

54 GoDiS: an issue-based dialogue system Built using TrindiKit –Toolkit for implementing and experimenting with dialogue systems based on the information state approach Explores and implements issue-based dialogue management 1. Menu-based dialogue –Action-oriented dialogue, VCR application 2. Multiple tasks, information sharing between tasks 3. Feedback and grounding 4. Accommodation, re-raising, clarification 5. Multi-linguality & multiple domains

55 [Diagram. Labels: TrindiKit; GoDiS; GoDiS-I; GoDiS-A; IBDM; Travel Agency; Autoroute; Xerox manual; VCR manager; home device manager. Levels: IS approach; genre-specific; application-specific.]

56 Issue-based dialogue management in GoDiS GoDiS: the basic system GoDiS-I: inquiry-oriented dialogue –typically, database search –implemented application: Travel Agency dialogue as raising and addressing issues dialogue plans to drive dialogue –each plan associated with a "task question" deals with multiple simultaneous issues enables information sharing between plans

57 [Architecture diagram. Modules: input, interpret, control, update, select, generate, output; DME label. TIS: IS proper, 5 module interface variables, 3 resource interface variables (DATABASE, LEXICON, DOMAIN). Resources: database, lexicon, domain knowledge.]

58 1. Action-oriented dialogue based on menus GoDiS-A: adapted for the genre of action-oriented dialogue each plan now associated with an action or a question semi-automatic conversion of menus to dialogue plans sample domain: menu-based dialogue for VCR

59 [Architecture diagram. Modules: input, interpret, control, update, select, generate, output; DME label. TIS: IS proper, 5 module interface variables, 3 resource interface variables (DEVICE, LEXICON, DOMAIN). Resources: VCR interface, lexicon, domain knowledge. External device: VCR.]

60 VCR menu structure fragment change play status –play, stop etc. change channel –channel: _ timer recording –add program channel:_ date:_ start-time:_ end_time:_ display added program –delete program display existing programs delete program:_ settings –clock, etc.

61 Simple Menu Traversal S> Lets see. What can I do for you? U> add program S> Okay. Add a program... Lets see. What channel do you want? U> channel one S> Okay. channel one... What date? U> today S> Okay. Today... What time do you want to start recording? U> six thirty S> Okay. From six thirty... What time do you want to stop recording? U> seven thirty. S> Okay. Until seven thirty... The program has been added. (dialog1_edit)

62 2. Multiple tasks and information sharing User can switch freely between any number of tasks –e.g. set the clock while programming the VCR Information collected while doing task A can be used in task B –For example, information where the user wants to travel can be used for travel reservation, visa enquiries, hotel reservation etc. This is a consequence of keeping a global information state, rather than task-local states (as in e.g. VoiceXML) Reraising previous task or issue is explicitly indicated (more later)
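Why a global information state gives information sharing for free can be shown in a few lines. In this Python sketch, assuming a travel-agency domain, each task plan is just a list of questions, and answers are stored once in a shared dictionary; the plan contents and slot names are hypothetical.

```python
# Hypothetical plans: each task lists the questions it needs answered.
PLANS = {
    "price": ["how", "dest_city", "dept_city"],
    "visa": ["dest_city", "citizenship"],
}


def next_question(plan, commitments):
    """Return the first question of the plan not yet answered globally."""
    for slot in PLANS[plan]:
        if slot not in commitments:
            return slot
    return None


# Answers collected so far while working on the price task.
commitments = {"how": "flight", "dest_city": "paris"}

price_next = next_question("price", commitments)
visa_next = next_question("visa", commitments)  # destination is not asked again
```

With task-local states, the visa task would start from an empty store and would have to ask for the destination a second time.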

63 Multiple tasks in VCR S> Lets see. What can I do for you? U> add program S> Okay. Do you want to add a program? U> yes S> Lets see. What channel do you want? U> channel five S> Okay. Channel five… What date? U> set the clock to five thirty S> Okay. set the clock… Lets see. five thirty. U> yes S> The clock was set. S> Returning to add a program. What date? U> today S> Okay. today. (dialog2_edit) [Annotations: second task introduced; explicit signalling of task sequencing (ICM)]

64 Information sharing in TA application S> Welcome to the travel agency! U> price information S> (…) Lets see. How do you want to travel? U> by flight S> (…) What city do you want to go to? U> paris S> (…) What city do you want to go from? U> do I need a visa ? S> (…) Lets see. What country are you from? U> sweden S> Okay. Yes, you need a Visa. S> Returning to the issue of price. Lets see. What city do you want to go from? Destination information used both in getting price information and getting visa information

65 3. Grounding and feedback Grounding –making sure that the participants are perceiving, understanding, and accepting each other’s utterances –dealing with problematic situations where e.g. an utterance is not perceived Feedback moves –(short) utterances which signal grounding status of previous utterance Sequencing moves –utterances which signal switching task ("returning to…"), reraising questions ("so…") etc.

66 Grounding moves We want a typology of grounding moves –Both feedback and sequencing Some parameters: –Action levels in dialogue (Allwood, Clark) contact perception Understanding –Word meanings –Contextual meaning acceptance –Polarity Positive Negative Interrogative (asking for confirmation)

67 Some grounding moves in GoDiS Formal representation –icm:Level/Type{*Polarity}{:Content} Feedback moves –icm:und*neg – ”I don’t understand” –icm:und*pos:P – ”To Paris.” –icm:acc*neg:Q – ”Sorry, I can’t …” –icm:acc*pos – ”Okay” Feedback type selected depending on –Quality of recognised speech –Whether system can find a (relevant) interpretation –Whether system can accept what’s been said Sequencing moves –icm:reraise:Q – ”Returning to the issue of Q” –icm:loadplan – ”Let’s see…”
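The move notation on this slide is regular enough to parse with one regular expression. The Python sketch below covers only the examples listed here; the field names are hypothetical, and the abbreviation "int" for interrogative polarity is an assumption, since the slide spells out only pos and neg.

```python
import re
from typing import NamedTuple, Optional


class ICM(NamedTuple):
    kind: str                 # action level (und, acc, ...) or sequencing type
    polarity: Optional[str]   # pos, neg, or (assumed) int
    content: Optional[str]


_ICM = re.compile(
    r"icm:(?P<kind>[a-z]+)"
    r"(?:\*(?P<polarity>pos|neg|int))?"
    r"(?::(?P<content>.+))?"
)


def parse_icm(text: str) -> ICM:
    """Split an ICM move string into kind, polarity and content."""
    match = _ICM.fullmatch(text)
    if match is None:
        raise ValueError(f"not an ICM move: {text!r}")
    return ICM(match["kind"], match["polarity"], match["content"])
```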

68 Grounding on multiple levels S> Lets see. What can I do for you? U> (mumble mumble) S> Sorry, I didnt hear what you said. What can I do for you? U> siridus S> I heard you say siridus. Sorry, I dont understand. What can I do for you? U> channel one S> channel one. I dont quite understand. Do you want to add a program or change channel? (dialog3a_edit) [Annotations: negative perception feedback; positive perception; negative understanding; positive understanding (word meaning); negative understanding (relevance); task clarification]

69 U> turn on the t v S> turn on the TV… Sorry, I cannot control the TV set. Do you want to add a program or change channel? U> pardon S> Sorry, I cannot control the TV set. Do you want to add a program or change channel? U> change channel S> Okay. (dialog3b_edit) [Annotations: rejection (negative acceptance feedback); positive understanding; user negative perception; system repetition; positive acceptance]

70 Correcting the system S> Lets see. What can I do for you? U> add program S> Okay. Add a program… S> Lets see. What channel do you want? U> channel six S> channel five? U> no S> not channel five. So, What channel do you want? U> channel six S> Okay. channel six. (dialog4_edit) [Annotations: interrogative feedback; negative answer to feedback; signals reraising (sequencing ICM)]

71 4. Question Accommodation To deal with –User giving more/less/other information than requested –Guessing what the user wants to do Basic idea: –Move questions to QUD or ISSUES to adapt to user utterances –E.g. short answers where question can be found in the context –Automatically generates clarification question if necessary
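The basic idea can be sketched in Python: given a short answer, look for a plan containing a question it could answer; if exactly one plan fits, push that question onto ISSUES and adopt the task, and if several fit, ask a clarification question. The plans, slot names and return values are hypothetical, and the sketch ignores QUD and everything else GoDiS tracks.

```python
# Hypothetical dialogue plans: task -> questions the task raises.
PLANS = {
    "add_program": ["channel", "date", "start_time", "end_time"],
    "change_channel": ["channel"],
}


def accommodate(answer_slot, issues):
    """Decide what to do with an answer nobody asked for.

    answer_slot: which question the user's short answer would resolve.
    issues: the ISSUES stack (list, top at index 0); modified in place.
    """
    candidates = [task for task, slots in PLANS.items() if answer_slot in slots]
    if len(candidates) == 1:
        issues.insert(0, answer_slot)        # accommodate the question
        return ("accommodated", candidates[0])
    if candidates:
        return ("clarify", candidates)       # several tasks fit: ask the user
    return ("not_understood", None)
```

A bare "channel five" fits both plans, which matches the clarification question "Do you want to add a program or change channel?" seen in the sample dialogues.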

72 Task and question accommodation S> Lets see. What can I do for you? U> channel five today S> Alright. add a program… Lets see. Okay. channel five… today… What time do you want to start recording? U> from six thirty to seven thirty S> Okay. from six thirty… to seven thirty… The program has been added. (dialog6_edit) [Annotation: signals accommodation (sequencing ICM)]

73 Task clarification S> What can I do for you? U> channel five S> channel five. I dont quite understand. Do you want to add a program or change channel? U> change channel S> Okay. change channel. S> Lets see. channel five? U> yes S> The channel was changed (dialog7_edit)

74 Revision by reaccommodation and reraising S> What can I do for you? U> add program channel five S> Okay. Do you want to add a program? U> yes S> Lets see. channel five? U> yes S> What date? U> channel six S> Okay. channel six… What date?

75 5. Multilinguality and multiple domains Language and domain resources can be switched on the fly Does not affect the rest of the system Thanks to modularity!

76 Multilinguality S> Lets see. What can I do for you? U> add program S> Okay. Do you want to add a program? U> yes S> Lets see. What channel do you want? U> svenska S> Okej. S> Gå tillbaks till lägg till ett program. Vilken kanal vill du spela in? U> kanal ett S> kanal ett, är det korrekt? S> Så, Vilken kanal vill du spela in? U> english (dialog5_edit) S> Okay...

77 Multiple domains S> Welcome to the VCR manager! S> Lets see. What can I do for you? U> change domain S> Okay. change domain… Lets see. What domain do you want? U> telephone S> Okay. This is your automatic telephone operator… Lets see. Please specify a function. U> call S> Okay. make a phone call. S> Lets see. Please specify the destination of the call. U> luis S> Okay. Luis… Calling

78 Summary: TrindiKit a toolkit for dialogue systems R&D freely available to researchers close the gap between theory and practice of dialogue systems theory-independent promotes reuse and reconfigurability on multiple levels Enables implementation of flexible dialogue management –e.g. Issue-based dialogue management in GoDiS


81 TrindiKit and VoiceXML VoiceXML –industry standard –form-based dialogue manager –web/telephony infrastructure –requires scripting dialogues in detail Theory-specific? –VoiceXML implements a specific theory of dialogue –TrindiKit allows implementation of several different theories of dialogue –More complex dialogue phenomena hard to deal with in VoiceXML

82 TrindiKit and VoiceXML, cont’d Combine VoiceXML with TrindiKit? –future research area –support connecting TrindiKit to VoiceXML infrastructure –use TrindiKit system as VoiceXML server, dynamically building VoiceXML scripts –convert VoiceXML specifications to e.g. GoDiS dialogue plans

