A Newbie Experience of Dialogue System Construction Using the Ravenclaw Framework Arthur Chan.

Introduction
Did you know?
  – Arthur Chan actually takes classes at CMU!
A course he took this year:
  – “Project Course: Dialogue Systems”
The course required the use of Ravenclaw/Olympus
A journal was kept on what I learned in the process
  – Requested by gang members such as Dan and Thomas

Speaker’s Bio
Mainly a speech recognition guy
  – i.e. the part that transforms speech into text
Not very experienced in dialogue systems
  – Only worked on a directed dialogue system
    Speechwork 6.5, i.e. an all-in-one dialogue system + speech recognizer
    Dialogues are modularized
  – E.g. Digits, Alphabets, ZipCode

What did we do this year?
3 systems by 3 groups
  – RoadFinder: Aaron, Dave and Wen
  – ICSLPInfo: Arthur, Lingyun and Rohit
  – Extension of Vera: Mohit, Kaimin and ?
The actual situation
  – Dave did most of the stunts
  – Each group had one person just to take care of development kick-start and system issues
  – The mailing list became the means of collaboration

New Development
Sphinx3_Engine
  – With Sphinx 3.6 RCI
  – With powerful wideband models (CALO) and narrowband models (Communicator)
LM training scripts
  – With tools newly built in Project L (CMU-Cambridge LM Toolkit “V3”)
IAX_Server
  – Allows systems to be used with an Asterisk server(?)

This talk
Mini case study of ICSLPInfo
  – Try to learn what information we could give to users about a conference
    The type of information is unknown
Two perspectives
  – From a new user’s perspective
  – From a developer’s perspective

The New User’s Perspective
Generally, as a new user, is it easy to learn Ravenclaw?
Related questions
  – Do I hate Dan? (Forever? Or even for a moment?)
  – Is it scary to use Ravenclaw?
  – What do we know / not know at a certain stage?
  – What is the general comment on the software?

The Developer’s Perspective
From a developer’s standpoint, what are the issues in development?
  – Issues in speech recognition?
  – Issues in dialogue system development?
  – Issues in general application development?
  – Issues in multi-developer development?
  – When should we work on SR/RP/DS/BE?

The Development Process
Stage 0: Planning, drawing diagrams and such
Stage 1: Making some existing systems run
Stage 2: Making simple systems run
  – Making SR work without the backend
  – Making the backend work without SR
Stage 3: Making the first end-to-end system run
(Not covered today)
Stage 4: Final adjustment and final demo

Stage 0: Planning (2-3 weeks)
Major issue
  – The type of useful information could be unknown
    Author? Session? Title? Venue?
  – We actually didn’t know what was most useful at Stage 0

Stage 1: Making some existing systems run (1 month)
A wide variety of pre-built systems use Ravenclaw
  – Path 1: Starting from ConvertProj
    ConvertProj is a very simple project
  – Path 2: Starting from RoomLine
  – Path 3: Starting from scratch
Path 1 was chosen first so that everyone could get an initial system

Notes on Stage 1
Not everyone had an easy time getting the initial setup running (1-2 weeks)
  – Forgot to install ActivePerl and miscellaneous tools
  – At the beginning, didn’t know where to start debugging
The synthesizer turned out not to be pre-built (1-2 weeks)
The speech recognizer was not running yet
  – Didn’t know why at that point

If we start from ConvertProj…
How do we write the first system then?
  – ConvertProj is very simple, but we didn’t know what it does
  – We didn’t understand how Phoenix/Ravenclaw works
Rohit: Let us start from RoomLine then.
  – Turned out to be a very good idea
  – Why?
    RoomLine is complicated, but the learner can learn from the code
    There are also a couple of patterns that could be reused, e.g. for-loop, if-then-else

Note:
We had already got hold of the “Description of Ravenclaw Agent Description Language”
  – Not a tutorial, no examples
  – We didn’t know how to start based on it
That’s why a template was needed
  – We ended up tracing the whole RoomLine system

Stage 2a: Making a system with working SR
Our biggest problem: name recognition
  – Recognizing 1000 names
  – Many of them are Asian names
  – No training data
  – Dave hadn’t built the LM building script yet
The type of information was not yet set
  – Should we handle names?

Stage 2a: Making a system with working SR (cont.)
Our first bootstrapping system
  – Used Sphinx3_Engine + CALO model
    Probably the strongest SR we could use
  – Used the RoomLine language model
  – Just tweaked the grammar a little bit
  – Added a lot of compound words into classes
  – Also, only session chairs (180 names) are in the grammar

The First System (No BE)
[Figure: dialogue task tree rooted at icslpinfo; node labels: Reset, DateTime, Welcome, Logout, Task, Request, Satisfied, Inform, Logout, HMIHY]

Note at Stage 2a
Finally got something running
But the system did nothing
We were still very vague on
  – how messages are passed in Galaxy, and
  – how results are transferred from SR to RP to DM

Stage 2b: Making the backend work without SR
The backend was finally built at this stage
The backend/DM/RP worked, and so did text console mode
DM now gives the abstract when asked about the author
But this time SR failed because
  – the grammar accepted too much
  – the RoomLine LM was still being used

Note at Stage 2b
Another difficult issue showed up
  – SR/RP/DM are very tightly coupled with each other
Other problems
  – Occasionally, “” is shown in the prompts
  – Because some prompts weren’t filled in
Good part:
  – The first type of information we would handle was finally decided
  – This constrains SR
  – We started to feel time was running short

Stage 3: Making the first end-to-end system run
Speech recognition
  – Retrained the LM using faked corpora
  – Significantly trimmed down the number of authors to recognize (from 200 to 30)
  – Still, only a few author names are easily recognized
  – The lucky ones
    Alan Black
    Arthur Toth
    Julia Hirschberg
    Andrew Rosenberg
  – (Alex is not very happy about this. His name is confused with “context key”)
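One way to picture "retraining the LM using faked corpora" is to expand a handful of carrier phrases with every author name and feed the resulting sentences to the LM training scripts. The sketch below only illustrates that idea: the templates and the function name fake_corpus are assumptions, not the ones used in ICSLPInfo, and any toolkit-specific markup (sentence markers, class tags) is left out.

```python
import itertools

# Hypothetical carrier phrases; the real ICSLPInfo templates are not given in the slides.
TEMPLATES = [
    "papers by {name}",
    "what did {name} write",
    "tell me about {name}",
]

# Author names taken from the slide above.
NAMES = ["Alan Black", "Arthur Toth", "Julia Hirschberg", "Andrew Rosenberg"]


def fake_corpus(templates, names):
    """Expand every template with every name, one upper-cased sentence per entry."""
    return [t.format(name=n).upper() for t, n in itertools.product(templates, names)]


corpus = fake_corpus(TEMPLATES, NAMES)
for sentence in corpus:
    print(sentence)
```

Trimming the author list, as described on the slide, then amounts to passing a shorter names list and regenerating the corpus.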

Note at this point
Started to realize that SR couldn’t be improved quickly
The problems of DM started to be glaring
  – No disambiguation
  – When multiple results are returned, no strategy to handle them
Also, SR kept failing to recognize things in the grammar
  – A lot of ++GARBAGE++ is recognized
  – See a lot of “On Alan Black”

DM
Allow disambiguation using author name and session name
Take care of different result scenarios
  – If there are no results
    Say sorry and restart
  – If there is one result
    Present the details of the paper
    Then ask whether to present the abstract of the paper
  – If there are 5 or fewer results
    Tell the user the number of papers found
    Then ask whether to present a summary of the papers
  – (List of the paper titles)
  – If there are more than 5 results
    Say sorry
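The result-handling strategy above boils down to a branch on the number of papers returned. The following is a minimal Python sketch of that branching only; it is not Ravenclaw code, and the Paper class, the next_prompt function and all prompt wordings are hypothetical names made up for illustration.

```python
from dataclasses import dataclass

MAX_LISTED = 5  # threshold from the slide: up to 5 results are summarized


@dataclass
class Paper:
    title: str
    authors: str
    session: str


def next_prompt(results):
    """Return (system_prompt, follow_up_question or None) for a list of papers."""
    n = len(results)
    if n == 0:
        # No results: apologize; the dialogue then restarts.
        return ("Sorry, I could not find any paper.", None)
    if n == 1:
        # One result: present the details, then offer the abstract.
        p = results[0]
        return (f"{p.title}, by {p.authors}, in session {p.session}.",
                "Would you like to hear the abstract?")
    if n <= MAX_LISTED:
        # A few results: report the count, then offer the list of titles.
        return (f"I found {n} papers.", "Would you like to hear the titles?")
    # Too many results: apologize.
    return ("Sorry, too many papers match.", None)


demo = [Paper("Paper A", "Author One", "Session X"),
        Paper("Paper B", "Author One", "Session Y")]
print(next_prompt(demo))
```

Checking the cases in this order means the "5 or fewer" branch only ever sees 2 to 5 results, since 0 and 1 are handled first.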

Other small things
We hacked out the confidence of the recognizer
  – The Audio Server is hacked so that we are always “confident” about the results
Annoying restarting issue
  – Commented out the restarting routine in Windows

Backend and NLG
Backend (maybe for this demo only)
  – SQL-based
  – Can do author search and session-name search
NLG
  – Fill in all sorts of prompts
  – A lot of implicit confirmation and explicit confirmation prompts are missing
    That caused a lot of “” in the system
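As a rough illustration of an SQL-based backend with author search and session-name search, here is a small sketch using Python's sqlite3 module. The table layout, column names, function names and the placeholder rows are all assumptions; the slides do not describe the real ICSLPInfo schema or database engine.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical one-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE papers (title TEXT, author TEXT, session TEXT, abstract TEXT)"
    )
    # Placeholder rows, not real conference data.
    conn.executemany(
        "INSERT INTO papers VALUES (?, ?, ?, ?)",
        [
            ("Paper A", "Author One", "Session X", "Abstract of paper A."),
            ("Paper B", "Author One", "Session Y", "Abstract of paper B."),
            ("Paper C", "Author Two", "Session X", "Abstract of paper C."),
        ],
    )
    return conn


def search_by_author(conn, author):
    """Return the titles of all papers by the given author, sorted by title."""
    rows = conn.execute(
        "SELECT title FROM papers WHERE author = ? ORDER BY title", (author,)
    )
    return [row[0] for row in rows]


def search_by_session(conn, session):
    """Return the titles of all papers in the given session, sorted by title."""
    rows = conn.execute(
        "SELECT title FROM papers WHERE session = ? ORDER BY title", (session,)
    )
    return [row[0] for row in rows]


conn = build_demo_db()
print(search_by_author(conn, "Author One"))
print(search_by_session(conn, "Session X"))
```

The length of the returned list is what a dialogue manager would branch on when deciding how to present the results.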

Demo
Scenario
  – A user wants information on the papers written by Alan Black, Julia Hirschberg and Andrew Rosenberg
What it shows
  – How bad recognition is handled now
  – What happens when the query returns one answer or several

Note: Rohit Kummar and Lingyun Gao actually hold the latest and greatest system. This system only shows how we built up from ground zero.

Summary: 3 Difficult Issues in the Task
1, Tight coupling of SR/RP/DM
  – When one part is right, the others could fail
2, SR issues
  – The SR task could be affected by different constraints
  – The first system is hard to get up and running
  – Compounded by 1
3, Lack of documentation for DM
  – The current documentation base is not strong enough
  – The read-and-implement approach doesn’t work yet
  – Some concepts are difficult to understand
    Say COMPLETE/SUCCEED/FAILED
    GRAMMAR_MAPPING

Lessons learned
Iteratively develop the system by bootstrapping each part with simple systems
  – This would greatly reduce the pain of coupling
SR issue
  – The first system could be completed with some smaller grammars first
In some tasks, SR shouldn’t be the focus at a certain point
  – Aligned with common observation
DM development
  – A good working template is necessary
  – What we need: for-loop, if-then-else templates

The bright side 1: birthday gift for Dave
Once understood, pretty easy to program
  – E.g. birthday celebration system
Sample dialogue:
  – S: Do you want to know what’s going on?
  – U: Yes (or No)
  – S: No matter whether you say yes or no, I will have to tell you. Begin message.
  – Hmm-hm. Today is Mr David Huggines Daines’ Birthday. Because everyone is too shy to sing the birthday song for him. Me, Frank, will have to sing it. Here you go. Happy Birthday to you, Happy Birthday to you. Happy Birthday to David. Happy Birthday to you. This message is brought to you by ……
  – End message

Bright Side 2
Compared to a directed dialogue system, the current system can give unexpected results. Why?
  – Several sub-systems of the dialogue system are working together
    Built-in libraries
    Grounding
    Focuses
    Developer-defined libraries
It is delightful to use in general

Bright Side 3
Source code has a consistent coding style
  – Development problems will mainly stem from
    1, Lack of automatic regression tests
    2, Lack of a central manager
    Not a bad thing in a dialogue system if developer/system = 1

Conclusion
Summarized how the first end-to-end ICSLPInfo system was developed
Discussed several issues, including
  – Coupling of systems
  – SR
  – DM development
Overall
  – Thrilled when getting the system running and working