Published by Suhendra Iskandar. Modified over 6 years ago.
Developing Spoken Dialogue Systems in the Communicator / RavenClaw Framework
Sphinx Lunch Talk
Carnegie Mellon University, October 2004
Presented by: Dan Bohus
Special appearances: Antoine Raux, Jahanzeb Sherwani, Thomas Harris
Examples
RoomLine: conference room reservations within SCS; the system can access the schedules of 13 conference rooms in Wean Hall and NSH
Let’s Go! Bus Information System: bus schedule information system for Port Authority buses in Oakland and Squirrel Hill [Let’s Go! Project]
Sublime: personalized information management system
TeamTalk: an investigation into human and multi-robot spoken language communication in unstructured environments
More Systems
LARRI: multimodal system that assists F/A-18 aircraft maintenance personnel throughout the execution of procedural tasks [Symphony]
Madeleine: text-based prototype of a medical diagnosis system [MITRE workshop]
Eureka: dialogue interface to the Vivisimo web search engine
The Communicator / RavenClaw Spoken Dialogue Systems Framework
Examples
Overall Architecture
System Development
Components & Resources
Miscellaneous
Current Research
examples : architecture : development : components : miscellaneous : research
Overall Architecture
Classical pipeline architecture:
  Recognition (SPHINX)
  Lang. Understanding (PHOENIX/HELIOS)
  Dialog Management (RAVENCLAW), connected to the Back-end (various)
  Lang. Generation (ROSETTA)
  Synthesis (THETA)
Galaxy HUB
Generic, centralized, message-passing communication architecture
Developed at MIT, used in the Communicator program
Competitor: OAA
Components connected through the Galaxy HUB:
  Recognition (SPHINX)
  Lang. Understanding (PHOENIX/HELIOS)
  Dialog Management (RAVENCLAW)
  Back-end (various)
  Lang. Generation (ROSETTA)
  Synthesis (THETA)
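The centralized message-passing idea can be illustrated with a small sketch. This is not the Galaxy Communicator API: the `Hub` class, the stage names, and the stand-in servers are all hypothetical, and the only point is that components talk to the hub rather than to each other.

```python
from typing import Callable, Dict, List

Message = Dict[str, object]
Handler = Callable[[Message], Message]


class Hub:
    """Toy central hub: servers never call each other, only the hub."""

    def __init__(self) -> None:
        self._servers: Dict[str, Handler] = {}
        self.log: List[str] = []  # order in which servers were invoked

    def register(self, name: str, handler: Handler) -> None:
        self._servers[name] = handler

    def send(self, name: str, message: Message) -> Message:
        self.log.append(name)
        return self._servers[name](message)


def run_turn(hub: Hub, user_input: str) -> str:
    """Route one user turn through the pipeline stages, always via the hub."""
    message: Message = {"input": user_input}
    for stage in ("recognition", "understanding", "dialog", "generation"):
        message = hub.send(stage, message)
    return str(message["text"])


def build_demo_hub() -> Hub:
    """Register stand-in servers (hypothetical; real servers are separate processes)."""
    hub = Hub()
    hub.register("recognition", lambda m: {"hyp": str(m["input"]).lower()})
    hub.register("understanding", lambda m: {"wants_room": "room" in str(m["hyp"])})
    hub.register("dialog", lambda m: {"act": "ask_date" if m["wants_room"] else "ask_repeat"})
    hub.register(
        "generation",
        lambda m: {"text": "For what day?" if m["act"] == "ask_date"
                   else "Sorry, could you repeat that?"},
    )
    return hub
```

Because every message passes through `Hub.send`, swapping one component only means registering a different handler under the same name.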
Getting Even Closer
Everything connects through the HUB:
  Recognition Server: multiple, parallel decoders (SPHINX, SPHINX, SPHINX)
  Inputs from other modalities
  Text I/O: TTYServer
  Lang. Understanding: parsing (PHOENIX), confidence (HELIOS)
  Dialog Management: RAVENCLAW
  Domain agents: DateTime, other domain agents
  Back-end: Galaxy stub + actual Perl back-end
  Lang. Generation: Galaxy stub + ROSETTA (Perl)
  Synthesis: THETA
PROCESS MONITOR
Building a Spoken Dialogue System
Components and their resources:
  Recognition (SPHINX): language, acoustic, lexical models
  Lang. Understanding (PHOENIX/HELIOS): grammar
  Dialog Management (RAVENCLAW): RavenClaw dialog task specification
  Back-end (Perl)
  Lang. Generation (ROSETTA): templates
  Synthesis (THETA): (limited domain) voice
So How Long Will It Take?
MITRE Workshop on Dialogue Management (Fall 2003)
  Develop a text-based SDS for medical diagnosis (back-end provided)
  Madeleine (22 hours)
Okay, How Long Will It Really Take?
To get a system running with reasonable performance [poll among 3 RavenClaw developers]:
  1 month to get a working system up and running
  1 month to fine-tune performance
Further iterative improvements continue as more data accumulates
Components & Resources
  Recognition (SPHINX): language and acoustic models
  Lang. Understanding (PHOENIX/HELIOS): grammar
  Dialog Management (RAVENCLAW): RavenClaw dialog task specification
  Back-end (Perl)
  Lang. Generation (ROSETTA): templates
  Synthesis (THETA): limited-domain voice
SPHINX II
Semi-continuous acoustic models
  Off-the-shelf 8kHz, kHz, 16kHz models
  Scripts for building your own
  PLSA adapted models perform better
Language models
  2-gram & 3-gram models
  CMU-Cambridge SLM Toolkit
  Generate from Phoenix grammar
  Finite state grammar
  Sphinx supports state-specific LMs
Dictionary (lexical models)
  CMU Dictionary
Sphinx II - continued
Multiple parallel decoders [e.g., male + female]
  Multiple hypotheses forwarded, selection done later
Typical WER: 15-30%
  With pronounced differences between native and non-native speakers
  Lowered by retuning acoustic and language models to the domain
Migration to SPHINX 3.x in the near future
  Expected: big improvement in WER
  Concern: real-time performance
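For readers unfamiliar with the WER figures quoted above, here is a minimal sketch of how word error rate is usually computed: the word-level edit distance (substitutions, insertions, deletions) between a reference transcript and the recognizer hypothesis, divided by the reference length. The function name is an illustrative choice, and no text normalization is applied beyond whitespace splitting.

```python
from typing import List


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref: List[str] = reference.split()
    hyp: List[str] = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

Note that WER can exceed 100% when the hypothesis contains many inserted words, since the denominator is the reference length only.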
Phoenix Parser / Grammar
Phoenix: robust parser
CFG grammar
  Manually-generated domain-specific grammar rules
  Reusable, generic sub-grammars: [Yes], [No], [Number], [DateTime], [Help], [Repeat], [Suspend], etc…
Parses all incoming hypotheses and passes all parses along…
Grammar excerpt:
  [room_size_spec]
      ([rss_large])
      ([rss_small])
      ([rss_larger])
      ([rss_smaller])
      ([rss_smallest])
      ([rss_largest])
  ;
  [rss_large]
      (large)
      (big)
      (huge)
  [rss_larger]
      (*the larger)
      (*the bigger)
      (too small)
  [rss_largest]
      (*the largest)
      (*the biggest)
  [rss_small]
      (small)
      (little)
Example input:
  DO YOU HAVE SOMETHING A BIT LARGER?
Resulting parse:
  [NeedRoom] ( [_i_want] (DO YOU HAVE SOMETHING) )
  [RoomSizeSpec] ( [room_size_spec] ( [rss_larger] (LARGER)))
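The grammar excerpt above can be mimicked with a toy phrase matcher. This is only a sketch, not Phoenix's parsing algorithm: it assumes that a leading `*` marks an optional word, it spots phrases anywhere in the utterance and ignores the rest, and the names `GRAMMAR`, `expand`, and `match_nets` are hypothetical.

```python
import re
from typing import Dict, List

# Leaf nets copied from the grammar excerpt; "*word" is treated as optional.
GRAMMAR: Dict[str, List[str]] = {
    "rss_large": ["large", "big", "huge"],
    "rss_larger": ["*the larger", "*the bigger", "too small"],
    "rss_largest": ["*the largest", "*the biggest"],
    "rss_small": ["small", "little"],
}


def expand(pattern: str) -> List[List[str]]:
    """Expand optional tokens into every concrete word sequence."""
    variants: List[List[str]] = [[]]
    for token in pattern.split():
        if token.startswith("*"):
            variants = variants + [v + [token[1:]] for v in variants]
        else:
            variants = [v + [token] for v in variants]
    return variants


def contains(words: List[str], phrase: List[str]) -> bool:
    """True if phrase occurs as a contiguous run inside words."""
    n = len(phrase)
    return any(words[i:i + n] == phrase for i in range(len(words) - n + 1))


def match_nets(utterance: str) -> Dict[str, str]:
    """Return each matching net with its longest matched phrase."""
    words = re.findall(r"[a-z']+", utterance.lower())
    best: Dict[str, List[str]] = {}
    for net, patterns in GRAMMAR.items():
        for pattern in patterns:
            for phrase in expand(pattern):
                if contains(words, phrase) and len(phrase) > len(best.get(net, [])):
                    best[net] = phrase
    return {net: " ".join(phrase) for net, phrase in best.items()}
```

Skipping unmatched words ("DO YOU HAVE SOMETHING A BIT") is what makes this style of parsing tolerant of recognition errors and disfluencies.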
Helios / Confidence Annotation
Builds accurate confidence scores using features from 3 sources of knowledge:
  Speech recognition
  Language understanding
  Dialogue management
Selects the hypothesis with the maximum confidence score
Research in progress on hypothesis selection and transferability across domains
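A minimal sketch of confidence-based hypothesis selection follows. The slide does not specify the model, so a logistic combination of one feature per knowledge source is assumed here; the feature names and weights are made up for illustration and would normally be learned from labeled data.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Hypothesis:
    text: str
    acoustic_score: float       # speech recognition feature, in [0, 1]
    parse_coverage: float       # language understanding feature, in [0, 1]
    matches_expectation: bool   # dialogue management feature

# Hypothetical weights, chosen by hand for this example only.
BIAS = -2.0
W_ACOUSTIC = 1.5
W_COVERAGE = 2.0
W_EXPECTATION = 1.0


def confidence(h: Hypothesis) -> float:
    """Combine the three feature sources into a score in (0, 1)."""
    z = (BIAS
         + W_ACOUSTIC * h.acoustic_score
         + W_COVERAGE * h.parse_coverage
         + W_EXPECTATION * (1.0 if h.matches_expectation else 0.0))
    return 1.0 / (1.0 + math.exp(-z))


def select_hypothesis(hypotheses: List[Hypothesis]) -> Hypothesis:
    """Pick the hypothesis with the maximum confidence score."""
    if not hypotheses:
        raise ValueError("need at least one hypothesis")
    return max(hypotheses, key=confidence)
```

With parallel decoders, each decoder contributes one `Hypothesis`; a hypothesis with a slightly worse acoustic score can still win if it parses fully and fits what the dialogue manager expects.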
RavenClaw Architecture
Dialog Task (Specification)
  Captures all domain-specific dialog (task) logic using a hierarchical description
  The authoring effort is focused entirely here
Domain-independent Dialog Engine
  Manages the dialog by executing the dialog task specification
  Provides a large number of domain-independent conversational strategies
RavenClaw: Dialogue Task Specification
Tree of dialog agents
  Terminals: Inform, Request, Expect, Execute
  Non-terminals / dialog agencies: plan the execution of child nodes
Basically a hierarchical task execution network; each agent has:
  Preconditions
  Success & failure criteria
  Trigger (focus) criteria
  Effects
Example task tree (concepts in brackets):
  Madeleine [diagnostic]
    I:Welcome
    E:LoadSymptoms
    GeneralFeel [general_feeling]
      R:HowAreYou?
      I:Glad
      I:Sorry
    Diagnose
      Fever [have_fever]
        R:AskFever
        E:MeasureTemp
        I:InformFever
      Travel
Sample DTS Code
Subtree shown: GeneralFeel (concept: general_feeling) with children R:HowAreYou?, I:Glad, I:Sorry

// /Madeleine/GeneralFeel
DEFINE_AGENCY(CGeneralFeel,
    DEFINE_CONCEPTS(
        STRING_USER_CONCEPT(general_feeling, none))
    DEFINE_SUBAGENTS(
        SUBAGENT(HowAreYou, CHowAreYou)
        SUBAGENT(Glad, CGlad)
        SUBAGENT(Sorry, CSorry))
    SUCCEEDS_WHEN(COMPLETED(Glad) || COMPLETED(Sorry)))

// /Madeleine/GeneralFeel/HowAreYou
DEFINE_REQUEST_AGENT(CHowAreYou,
    REQUEST_CONCEPT(general_feeling)
    GRAMMAR_MAPPING("![Yes]>good, ![FeelingGood]>good, "
                    "![FeelingSoSo]>soso, ![FeelingBad]>bad"))

// /Madeleine/GeneralFeel/Glad
DEFINE_INFORM_AGENT(CGlad,
    PRECONDITION(C("general_feeling") == CString("good"))
    PROMPT("inform glad_youre_good")
    ON_COMPLETION(FINISH(/Madeleine)))

// /Madeleine/GeneralFeel/Sorry
DEFINE_INFORM_AGENT(CSorry,
    PRECONDITION(C("general_feeling") != CString("good"))
    PROMPT("inform sorry_youre_bad"))
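To make the semantics of the DTS macros more concrete, here is a rough Python analogue of the GeneralFeel agency. It is a sketch under assumptions, not RavenClaw's engine: children are tried in declaration order, an agent runs when its precondition holds and it has not completed, and the agency stops once Glad or Sorry completes. The `Agent` class and the request prompt name `request how_are_you` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Concepts = Dict[str, Optional[str]]

# Mirrors GRAMMAR_MAPPING: grammar slot -> value stored in general_feeling.
GRAMMAR_MAPPING = {"Yes": "good", "FeelingGood": "good",
                   "FeelingSoSo": "soso", "FeelingBad": "bad"}


@dataclass
class Agent:
    name: str
    kind: str                      # "request" or "inform"
    prompt: str
    precondition: Optional[Callable[[Concepts], bool]] = None
    completed: bool = False

    def eligible(self, concepts: Concepts) -> bool:
        ok = self.precondition is None or self.precondition(concepts)
        return ok and not self.completed


def build_general_feel() -> List[Agent]:
    """Children of GeneralFeel, in the order given by DEFINE_SUBAGENTS."""
    return [
        Agent("HowAreYou", "request", "request how_are_you"),  # hypothetical prompt name
        Agent("Glad", "inform", "inform glad_youre_good",
              lambda c: c["general_feeling"] == "good"),
        Agent("Sorry", "inform", "inform sorry_youre_bad",
              lambda c: c["general_feeling"] != "good"),
    ]


def run_general_feel(user_slot: str) -> List[str]:
    """Run the agency until Glad or Sorry completes; return the prompts issued."""
    concepts: Concepts = {"general_feeling": None}
    agents = build_general_feel()
    by_name = {a.name: a for a in agents}
    prompts: List[str] = []
    # SUCCEEDS_WHEN(COMPLETED(Glad) || COMPLETED(Sorry))
    while not (by_name["Glad"].completed or by_name["Sorry"].completed):
        agent = next(a for a in agents if a.eligible(concepts))
        prompts.append(agent.prompt)
        if agent.kind == "request":
            # Bind the concept from the user's parsed answer.
            concepts["general_feeling"] = GRAMMAR_MAPPING[user_slot]
        agent.completed = True
    return prompts
```

For example, `run_general_feel("FeelingSoSo")` issues the request prompt and then the Sorry prompt, since `soso` fails Glad's precondition.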
RavenClaw Execution
Dialog Stack (top first) and Expectation Agenda while executing the Madeleine task tree (concepts: chart, diagnostic, general_feeling, have_fever):
Stack: Madeleine
Stack: Welcome, Madeleine
  System: Hi, this is Madeleine, the automated…
Stack: Madeleine
Stack: LoadSymptoms, Madeleine
  The tree gains request agents (R:Headache, R:, R:, R:) and the concept headache
Stack: Madeleine
Stack: GeneralFeel, Madeleine
RavenClaw Execution / Input Pass
Stack: HowAreYou, GeneralFeel, Madeleine
  System: How are you feeling today?
  Expectation Agenda:
    HowAreYou: general_feeling: [good], [bad], [soso]
    GeneralFeel: general_feeling: [good], [bad], [soso]
    Madeleine: general_feeling: [good], [bad], [soso]; have_fever: [fever], ![yes], ![no]; headache: [headache], ![yes], ![no]; cough: [cough], ![yes], ![no]; …
  User: Not so good, I think I have a fever
  Parse: [soso](not so good) [fever](I think I have a fever)
Stack: GeneralFeel, Madeleine
Stack: Sorry, GeneralFeel, Madeleine
  System: Oh, I’m sorry to hear that… Let me take your temperature…
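The input pass can be sketched as follows, assuming the expectation agenda has one level per agent on the dialog stack and that each parsed slot binds to the first level, from the top of the stack down, that expects it. The data structures and function names are hypothetical, and the focus-restricted `![yes]` / `![no]` expectations are left out for brevity.

```python
from typing import Dict, List, Set, Tuple

Expectations = Dict[str, Set[str]]        # concept -> grammar slots it accepts
Agenda = List[Tuple[str, Expectations]]   # one level per agent on the stack

# Expectations per agent, following the Madeleine example.
EXPECTATIONS: Dict[str, Expectations] = {
    "HowAreYou": {"general_feeling": {"good", "bad", "soso"}},
    "GeneralFeel": {"general_feeling": {"good", "bad", "soso"}},
    "Madeleine": {
        "general_feeling": {"good", "bad", "soso"},
        "have_fever": {"fever"},
        "headache": {"headache"},
        "cough": {"cough"},
    },
}


def build_agenda(stack: List[str]) -> Agenda:
    """Build agenda levels from the dialog stack (stack[0] is the top)."""
    return [(agent, EXPECTATIONS.get(agent, {})) for agent in stack]


def bind_concepts(agenda: Agenda,
                  parse: List[Tuple[str, str]]) -> Dict[str, Tuple[str, str]]:
    """Map each parsed slot to (slot, agent level) for the first level expecting it."""
    bound: Dict[str, Tuple[str, str]] = {}
    for slot, _words in parse:
        for agent, expected in agenda:
            concept = next((c for c, slots in expected.items() if slot in slots), None)
            if concept is not None:
                bound[concept] = (slot, agent)
                break
    return bound
```

In the example turn, `[soso]` binds at the top level (HowAreYou) while `[fever]` is only expected further down the stack, which is how the system can pick up information the user volunteers beyond the current question.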
RavenClaw – Other features
Dialogue Engine transparently provides a set of conversational skills
  Universal dialogue mechanisms: Repeat, Suspend / Resume, Quit
  Help: Help!, Where are we?, What can I say?
  Error handling:
    Explicit and implicit confirmations
    Strategies for recovering from non-understandings
Dynamic dialogue task generation
Dynamic dialogue control policy
Backend & Domain Agents
Various problem-specific solutions
  RoomLine: connects to a static Perl database or to the CMU CorporateTime server
  Let’s Go! Bus Information System: connects to a PostgreSQL database
  Sublime: connects to a MySQL database; also functions as a web server; DTW search domain agent
Basically, build your own; we provide a stub for interfacing with the Galaxy Hub
Rosetta Language Generation
Template- and stochastic-based language generation
Input: (act, object, {slot=value})
Output: text (tagged with concepts)

# welcome to the system
"welcome" => "Welcome to RoomLine, the automated conference room ".
             "reservation system.",

# greet user
"greet_user" => ["Hi, <user_name>.",
                 "Hi, <user_name>, good to hear from you again."],

# inform the user that the system has misunderstood the times (order)
"wrong_time_order" => sub {
    my %args = @_;
    my $time_interval_as_string =
        get_wrong_time_interval_as_string(\%args, "room_query.date_time.time");
    my $answer = "I'm sorry, I must have misunderstood the ".
                 "time you needed the room. ";
    $answer .= "I heard $time_interval_as_string. ";
    return ["$answer So, let's see ... ",
            "$answer So, let's try this again ... ",
            "$answer So, let's try this once more ... "];
},
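The three template shapes above (a fixed string, a list of alternatives, and a function that computes alternatives) can be sketched in Python as below. This is an analogue for illustration, not Rosetta itself: it assumes that one alternative is picked at random and that `<slot>` markers are replaced with slot values; `generate` and the `time_interval` slot name are hypothetical.

```python
import random
import re
from typing import Callable, Dict, List, Union

Slots = Dict[str, str]
Template = Union[str, List[str], Callable[[Slots], List[str]]]


def wrong_time_order(slots: Slots) -> List[str]:
    """Computed template: builds its alternatives from the slot values."""
    answer = ("I'm sorry, I must have misunderstood the time you needed "
              f"the room. I heard {slots['time_interval']}. ")
    return [answer + "So, let's see ... ",
            answer + "So, let's try this again ... "]


TEMPLATES: Dict[str, Template] = {
    "welcome": "Welcome to RoomLine, the automated conference room "
               "reservation system.",
    "greet_user": ["Hi, <user_name>.",
                   "Hi, <user_name>, good to hear from you again."],
    "wrong_time_order": wrong_time_order,
}


def generate(name: str, slots: Slots, rng: random.Random) -> str:
    """Resolve a template to text: compute, pick an alternative, fill slots."""
    entry = TEMPLATES[name]
    if callable(entry):
        entry = entry(slots)
    if isinstance(entry, str):
        entry = [entry]
    text = rng.choice(entry)
    return re.sub(r"<(\w+)>", lambda m: slots[m.group(1)], text)
```

Passing the random generator in explicitly keeps the variation between alternatives reproducible when testing prompts.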
Synthesis
Cepstral Theta synthesis
  Open-domain unit-selection synthesis
  SSML tags
  [Currently working on barge-in location]
Festival synthesis
  Diphone synthesis; open-domain and limited-domain unit-selection synthesis
  SABLE tags
  Server running separately on a Linux box
Miscellaneous – Documentation
Transmitted largely by oral tradition :)
A bit of documentation available
  Research papers, slides
  WIKI: mostly for developers; postings of updates and recent developments; hopefully more introductory materials soon
  Tutorials: 2 available, but a bit outdated
More in the works
Miscellaneous – Portability
Current systems work on PC Windows platforms
  Galaxy has a Linux version
  Components are written in C, C++ (Visual Studio 6.0, Visual Studio .NET), and Perl
How about using different input / output components?
  Modify the RavenClaw DMInterface class
  Has been done for the Gemini parser / language generator
Miscellaneous – Research Platform
The Communicator / RavenClaw framework is a research platform!
  Constantly evolving
  Modular
  Easy to change, develop and test new technologies
Research on a variety of topics in a real-world, full-blown system:
  Recognition, language understanding, dialogue management, language generation, synthesis
Your work can be evaluated / reused easily across multiple existing systems
Miscellaneous - Download
Download a version of RoomLine
An installation script can seed your own project from this RoomLine version
Miscellaneous – RavenClaw Team
Dan Bohus, Antoine Raux, Jahanzeb Sherwani, Thomas Harris, Satanjeev Banerjee, Brian Langner
More users / developers / documentation writers are always welcome!!
Dialogs on Dialogs Reading Group
Error awareness and recovery
Problem: lack of robustness when faced with understanding errors
Solution: build mechanisms for acting robustly at the dialogue management level
  Error awareness: building better confidence annotators, hypothesis selection; transference across domains
  Error recovery strategies: recovery from non-understandings
  Error handling decision process: scalable, adaptable, task-independent architecture for making error handling decisions
Let’s Go! Research
Speech recognition: acoustic adaptation on non-native speech
  WER: 50% -> 30%
Speech synthesis: flexible and natural F0 modeling (F0 unit selection)
  Emphasis on erroneous/uncertain words for utterance confirmation
Sublime
Interface for personalized information management
  Narrow functionality in unrestricted domains
  Currently: handles information without understanding it
  Eventually: learns relationships and a shallow ontology
That’s all, folks! THANK YOU!