W3C Multimodal Interaction Activities Deborah A. Dahl August 9, 2006

Topics
–Multimodal Architecture
–EMMA

Multimodal Architecture

Multimodal Architecture and Interfaces
–A loosely-coupled, event-based architecture for integrating multiple modalities into applications
–All communication is event-based
–Based on a set of standard life-cycle events
–Components can also expose other events as required
–Encapsulation protects component data
–Encapsulation enhances extensibility to new modalities
–Can be used outside a web environment

MMI Architecture
Defines five basic components of an MMI system:
–Runtime Framework or Browser: initializes the application and runs markup
–Interaction Manager: coordinates modality components and provides application flow
–Modality Components: provide modality capabilities such as speech, pen, keyboard, mouse
–Data Model: handles shared data
–DCI (Delivery Context and Interfaces): device properties and user preferences

Generic MMI Architecture
[Diagram: the Runtime Framework hosts the Interaction Manager, the Data Model, and DCI. On the input side it connects to modality components for speech input, pen, keyboard, and pointing; on the output side, to audio, speech, video, graphical, and image output, plus a combined speech interaction component.]

MMI Architecture Principles
–The Runtime Framework communicates with Modality Components through asynchronous events
–Modality Components don't communicate directly with each other, only indirectly through the Runtime Framework
–Components must implement the basic life-cycle events and may expose others
–Modality components can be nested (e.g., a Voice Dialog component such as a VoiceXML dialog)
–Components need not be markup-based
–EMMA communicates users' inputs to the Interaction Manager (IM)

Life Cycle Events

Event               From               To                 Purpose
NewContextRequest   Modality           Runtime Framework  Request new context
NewContextResponse  Runtime Framework  Modality           Send new context id
Prepare             Runtime Framework  Modality           Pre-load markup
PrepareResponse     Modality           Runtime Framework  Acknowledge Prepare
Start               Runtime Framework  Modality           Run markup
StartResponse       Modality           Runtime Framework  Acknowledge Start
Done                Modality           Runtime Framework  Finished running
Cancel              Runtime Framework  Modality           Stop processing
CancelResponse      Modality           Runtime Framework  Acknowledge Cancel
Pause               Runtime Framework  Modality           Suspend processing
PauseResponse       Modality           Runtime Framework  Acknowledge Pause
Resume              Runtime Framework  Modality           Resume processing
ResumeResponse      Modality           Runtime Framework  Acknowledge Resume
Data                either             either             Send data values
ClearContext        Runtime Framework  Modality           Deactivate context
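The concrete syntax for these events was still being worked out at the time of this presentation, so the sketch below is only a hypothetical XML rendering of a Start/StartResponse exchange; the element names, attributes (source, target, context, requestID, status), and namespace are invented for illustration and are not taken from the W3C documents.

<!-- Runtime Framework asks a modality component to run its markup
     (illustrative syntax only) -->
<mmi:Start xmlns:mmi="http://www.example.org/mmi-events"
           source="RuntimeFramework"
           target="VoiceModality"
           context="ctx-1"
           requestID="req-42">
  <mmi:ContentURL href="http://example.com/dialog.vxml"/>
</mmi:Start>

<!-- The modality component acknowledges that it has started -->
<mmi:StartResponse xmlns:mmi="http://www.example.org/mmi-events"
                   source="VoiceModality"
                   target="RuntimeFramework"
                   context="ctx-1"
                   requestID="req-42"
                   status="success"/>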

Example of MMI Architecture
[Diagram: a Runtime Framework coordinating a speech recognition modality, a face recognition modality, and TTS, with a control to view the life-cycle event traffic.]

Example Modality Components
[Diagram: three example modality components behind the MMI API, namely a TTS component, a JSAPI-based ASR that returns its results as EMMA, and a face recognizer that also returns EMMA.]
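A recognizer component like the ones above hands its result back to the Runtime Framework as EMMA carried in the data of a life-cycle event. The event wrapper below uses the same invented syntax as the earlier sketch; the emma:* markup uses the namespace and vocabulary of the EMMA draft, while the attribute values and the <destination> payload element are made up for illustration.

<!-- Hypothetical Done event from the speech recognition modality,
     carrying its EMMA result as the data payload -->
<mmi:Done xmlns:mmi="http://www.example.org/mmi-events"
          xmlns:emma="http://www.w3.org/2003/04/emma"
          source="SpeechRecognitionModality"
          target="RuntimeFramework"
          context="ctx-1"
          status="success">
  <mmi:Data>
    <emma:emma version="1.0">
      <emma:interpretation id="int1"
                           emma:medium="acoustic"
                           emma:mode="voice"
                           emma:confidence="0.87">
        <destination>Boston</destination>
      </emma:interpretation>
    </emma:emma>
  </mmi:Data>
</mmi:Done>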

Startup

Face Recognition

Speech Recognition

Failure Responses
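The event traces shown on these demonstration slides are not captured in the transcript. As a purely hypothetical illustration, using the same invented event syntax as above, a failure in a modality component might surface as a response event with a failure status:

<!-- Hypothetical failure report from a modality component
     (illustrative syntax only) -->
<mmi:StartResponse xmlns:mmi="http://www.example.org/mmi-events"
                   source="FaceRecognitionModality"
                   target="RuntimeFramework"
                   context="ctx-1"
                   requestID="req-43"
                   status="failure">
  <mmi:StatusInfo>camera not available</mmi:StatusInfo>
</mmi:StartResponse>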

EMMA

EMMA (Extensible MultiModal Annotation)
–XML format that represents the results of processing user input
–Includes annotations adding information about the input (confidence, timestamp, tokens, language, etc.)
–Can be used for input from speech, ink, camera, keyboard…
–Forms the contents of the Data property (e.g., of a Data or Done event)
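The slide's example markup does not appear in the transcript, so here is a minimal sketch of an EMMA 1.0 document showing the annotations listed above. The emma namespace and annotation attributes are those defined by the EMMA specification; the attribute values and the application-specific <destination> element are invented.

<emma:emma version="1.0"
           xmlns:emma="http://www.w3.org/2003/04/emma">
  <!-- one interpretation of the user's input, with standard annotations;
       emma:start and emma:end are absolute timestamps in milliseconds -->
  <emma:interpretation id="int1"
                       emma:medium="acoustic"
                       emma:mode="voice"
                       emma:confidence="0.87"
                       emma:start="1155081600000"
                       emma:end="1155081601500"
                       emma:tokens="flights to boston"
                       emma:lang="en-US">
    <!-- application-specific semantics produced by the recognizer -->
    <destination>Boston</destination>
  </emma:interpretation>
</emma:emma>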

Advantages of EMMA
–Vendor-neutral, uniform representation of semantic results across modalities
–Rich annotation of input context
–XML format enables leveraging of XML tools
–Designed to take VoiceXML requirements into account

Applications of EMMA
–Represent speech or other modality input for a dialog manager
  –VoiceXML
  –SCXML
  –scripts
–Processing stages after recognition
  –classification
  –reference resolution
–Logging and storing in a database
  –OA&M
  –debugging
  –diagnosing dialog problems

Example Modalities
–speech recognition with a rule grammar
–speech recognition with a dictation grammar
–face recognition
–typed input

EMMA Result for Speech in a Dialog
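The markup from the original slide is not captured in the transcript. A plausible reconstruction for a rule-grammar recognition in a travel dialog is sketched below; the emma:one-of (n-best list), emma:grammar, and emma:grammar-ref constructs come from the EMMA draft, while the grammar URI, token strings, confidence values, and <destination> element are invented.

<emma:emma version="1.0"
           xmlns:emma="http://www.w3.org/2003/04/emma">
  <emma:grammar id="gram1" ref="http://example.com/trip.grxml"/>
  <!-- n-best recognition results against a rule grammar -->
  <emma:one-of id="nbest1" emma:medium="acoustic" emma:mode="voice">
    <emma:interpretation id="int1"
                         emma:confidence="0.82"
                         emma:tokens="a flight to boston"
                         emma:grammar-ref="gram1">
      <destination>Boston</destination>
    </emma:interpretation>
    <emma:interpretation id="int2"
                         emma:confidence="0.41"
                         emma:tokens="a flight to austin"
                         emma:grammar-ref="gram1">
      <destination>Austin</destination>
    </emma:interpretation>
  </emma:one-of>
</emma:emma>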

EMMA Result for Dictation (recognized string: "I like to go on a trip")
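Only the recognized string survives from this slide. A sketch of how that dictation result might be represented, using EMMA's emma:literal element for input that carries no application semantics (the confidence and other annotation values are invented):

<emma:emma version="1.0"
           xmlns:emma="http://www.w3.org/2003/04/emma">
  <!-- dictation has no application semantics, so the raw recognized
       string is returned as a literal -->
  <emma:interpretation id="int1"
                       emma:medium="acoustic"
                       emma:mode="voice"
                       emma:confidence="0.75"
                       emma:tokens="I like to go on a trip">
    <emma:literal>I like to go on a trip</emma:literal>
  </emma:interpretation>
</emma:emma>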

EMMA for Face Recognition
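No markup survives from this slide. A sketch of how a face recognition result might be expressed in EMMA; the medium and mode values, the confidence, and the <userid> payload element are illustrative assumptions rather than content from the original slide:

<emma:emma version="1.0"
           xmlns:emma="http://www.w3.org/2003/04/emma">
  <!-- the recognized identity is carried as application-specific payload -->
  <emma:interpretation id="int1"
                       emma:medium="visual"
                       emma:mode="photograph"
                       emma:confidence="0.93">
    <userid>user1</userid>
  </emma:interpretation>
</emma:emma>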

EMMA Result for Text
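A sketch of how typed input might be represented; the medium and mode values and the <destination> payload element are again illustrative assumptions:

<emma:emma version="1.0"
           xmlns:emma="http://www.w3.org/2003/04/emma">
  <!-- typed input is deterministic, so confidence is 1.0 -->
  <emma:interpretation id="int1"
                       emma:medium="tactile"
                       emma:mode="keys"
                       emma:confidence="1.0"
                       emma:tokens="boston">
    <destination>Boston</destination>
  </emma:interpretation>
</emma:emma>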

Summary
–The MMI Architecture provides a general, clean interface to a wide range of modality components
–EMMA provides a standard and general way of representing user inputs
–New modalities are very easy to integrate
–Loose coupling and lack of access to internal modality data improve security
