1 Galatea: Open-Source Software for Developing Anthropomorphic Spoken Dialog Agents S. Kawamoto, et al. October 27, 2004.

Presentation transcript:

1 Galatea: Open-Source Software for Developing Anthropomorphic Spoken Dialog Agents S. Kawamoto, et al. October 27, 2004

2 Agenda
Introduction
Toolkit Design and Outline
– Speech recognition module
– Speech synthesis module
– Facial image synthesis module
– Agent manager
– Virtual machine model
– Task manager
– Prototyping tools
Prototype Systems
Conclusions

3 Introduction
An anthropomorphic spoken dialog agent (ASDA) is one of the next-generation human-computer interfaces
Many ASDA systems have been developed, but building a high-quality ASDA system is still challenging
Ideally, it should be possible to create an unlimited number of life-like agent characters with different faces and voices, just as humans differ
For this reason, Galatea has been developed as a platform for building next-generation ASDA systems

4 Features of the Toolkit
Easy customization
– Model-based approaches
  Once the model parameters are trained, facial expressions and voice quality can be controlled easily
Key techniques for natural spoken dialog
  Incremental speech recognition, synchronization between speech and facial animation, etc.
Modularity of functional units
– Simple architecture to manage each functional unit
  Users can develop, improve, debug, etc. each unit
Open-source free software

5 Toolkit Design and Outline
The agent manager works as an inter-module communication manager
Devices are directly managed by the modules that use them
A new function is added by writing a new module for it and connecting that module to the agent manager

6 Speech Recognition Module (SRM)
Major interfaces of the SRM are as follows:
– Outputs
  Recognition result (XML format)
  Engine status (“busy”, “waiting”, ...)
– Control commands
  Reload the grammar, change the settings of the speech recognition engine
– Grammar representation
  Transforms the XML grammar into a format that is accepted by the speech recognition engine
[Diagram: Command Interpreter, Grammar Transformer, Speech Recognition Engine; flows labeled Speech input, Grammar, Request, Response]
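The slide says only that the SRM emits recognition results in XML together with an engine status. As a rough illustration of how a client module might consume such output, here is a minimal Python sketch. The element and attribute names (`result`, `hypothesis`, `word`, `status`, `score`) and the helper `best_hypothesis` are assumptions made up for this example, not Galatea's actual schema or API.

```python
import xml.etree.ElementTree as ET

# Hypothetical recognition result. The tag and attribute names are
# illustrative assumptions, not the format Galatea really produces.
SAMPLE_RESULT = """
<result status="waiting">
  <hypothesis rank="1" score="0.92">
    <word>hello</word>
    <word>galatea</word>
  </hypothesis>
  <hypothesis rank="2" score="0.41">
    <word>hello</word>
    <word>galaxy</word>
  </hypothesis>
</result>
"""


def best_hypothesis(xml_text):
    """Return (engine_status, text, score) for the highest-scoring hypothesis."""
    root = ET.fromstring(xml_text)
    status = root.get("status")
    # Pick the hypothesis with the largest score attribute.
    best = max(root.findall("hypothesis"), key=lambda h: float(h.get("score")))
    text = " ".join(word.text for word in best.findall("word"))
    return status, text, float(best.get("score"))


status, text, score = best_hypothesis(SAMPLE_RESULT)
print(status, text, score)
```

Keeping the parsing in one small function means a change in the real result format would only touch that function.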

7 Speech Synthesis Module (SSM)
Accepts arbitrary Japanese text
Synthesizes speech with a human voice
– An HMM-based speech synthesis method is employed
Synchronizes lip movement with speech
Can interrupt speech output to cope with interruptions by the user
[Diagram: Command Interpreter, Dictionary, Acoustic Models, Speech Output, Text Analyzer, Waveform Generation Engine]

8 Facial Image Synthesis Module (FSM)
Supports high-quality facial image synthesis, animation control, and precise lip-sync with the voice
Provides a GUI for fitting a generic face wire-frame model onto a full-face snapshot image
Facial action control
– Mouth shape
– Facial expression

9 Agent Manager (AM)
Integrates all the modules of the ASDA system
Plays a central role in communication between modules
Manages synchronization between the SSM and FSM to achieve precise lip-sync
[Diagram: Dispatcher, Macro-command interpreter]
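The hub role described on this slide can be sketched in a few lines of Python. This is a toy under stated assumptions, not Galatea's implementation: the class `AgentManagerSketch`, the command tuples (`prepare`, `set_lip_timing`, `start`), and the two stand-in modules are all hypothetical. The `speak` method plays the part of a macro-command that expands into several module commands so that speech and lip movement share the same timing data.

```python
class AgentManagerSketch:
    """Toy hub: modules register a handler and talk only through the hub."""

    def __init__(self):
        self.handlers = {}
        self.log = []  # (sender, receiver, command) for every dispatched message

    def register(self, name, handler):
        self.handlers[name] = handler

    def dispatch(self, sender, receiver, command):
        # Record the message, then hand it to the receiving module.
        self.log.append((sender, receiver, command))
        return self.handlers[receiver](command)

    def speak(self, text):
        # Hypothetical macro-command: fetch timing from the synthesizer,
        # pass it to the face module, then start both modules.
        timing = self.dispatch("AM", "SSM", ("prepare", text))
        self.dispatch("AM", "FSM", ("set_lip_timing", timing))
        self.dispatch("AM", "SSM", ("start",))
        self.dispatch("AM", "FSM", ("start",))
        return timing


def fake_ssm(command):
    # Stand-in synthesizer: one (symbol, duration_ms) pair per letter.
    if command[0] == "prepare":
        return [(ch, 80) for ch in command[1] if ch.isalpha()]
    return "ok"


def fake_fsm(command):
    # Stand-in face module: accepts every command.
    return "ok"


am = AgentManagerSketch()
am.register("SSM", fake_ssm)
am.register("FSM", fake_fsm)
timing = am.speak("hi")
print(timing)
print([receiver for _, receiver, _ in am.log])
```

Because every message passes through `dispatch`, adding a module only requires one more `register` call, which mirrors the modularity argument made on the slides.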

10 Virtual Machine Model
Each module interface is modeled as a machine with slots
– Each slot indicates the machine status
Slot values are changed by a common command set
“set Speak = now” means starting voice synthesis of the given text immediately
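A minimal Python sketch of the slot idea follows. Only the command form `set Speak = now` comes from the slide; the class `SlotMachineSketch`, the `Text` slot, and the behavior of recording the spoken text in a list are assumptions chosen for illustration.

```python
import re


class SlotMachineSketch:
    """Toy module interface: named slots changed through 'set NAME = VALUE'."""

    def __init__(self, slots):
        self.slots = dict(slots)
        self.spoken = []  # texts whose synthesis was triggered

    def execute(self, line):
        match = re.fullmatch(r"\s*set\s+(\w+)\s*=\s*(.*?)\s*", line)
        if match is None:
            raise ValueError("unsupported command: " + line)
        name, value = match.group(1), match.group(2)
        if name not in self.slots:
            raise KeyError(name)
        self.slots[name] = value
        # Assumed side effect: setting Speak to "now" speaks the Text slot.
        if name == "Speak" and value == "now":
            self.spoken.append(self.slots["Text"])


machine = SlotMachineSketch({"Text": "", "Speak": "idle"})
machine.execute("set Text = hello world")
machine.execute("set Speak = now")
print(machine.spoken)
```

Using one command grammar for every module is what lets the agent manager forward commands without knowing each module's internals.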

11 Task Manager (TM)
A dialog is defined as a set of interactions that can be represented by a dialog description language
The goal in developing the TM is for the system to support several types of dialog description languages
– VoiceXML
  High-level language: task-oriented information and the intentions of the participants
– PDOC (primitive dialog operation commands)
  Low-level language: device events and sequence control

12 Prototyping Tools
“Galatea Interaction Builder (IB)”
[Diagram: Application Developer, Interaction Builder, Galatea MMI System, XISL File, web site; labels: Create XISL Document, Download and Execute XISL, Check, Design Scenario]

13 Prototype Systems

14 Echo-back task

15 Conclusions
A human-like spoken dialog agent is one of the promising man-machine interfaces for the next generation
Galatea is a software toolkit for developing human-like spoken dialog agents
Because of its high modularity and simple communication architecture, it will speed up ASDA-based research and application development