Mobile Multimodal Applications. Dr. Roman Englert, Gregor Glass. March 23rd, 2006.

Slide 1. Agenda:
- Motivation
- Potentials through Multimodality
- Use Cases: Map & Sound Logo
- Components & Modules for Mobile Multimodal Interaction
- User Perspective
- Challenges

Slide 2. Multimodality. Motivation.
- Moore's Law (growth of technology): technical capability will double approximately every 18 months.
- Buxton's Law (growth of functionality): technology designers promise functionality proportional to Moore's Law.
- God's Law (complexity barrier; growth of human capability, after Bill Buxton): human capacity is limited and does not increase over time!
The challenge is how to deliver more functionality without breaking through the complexity barrier and making the systems so cumbersome as to be completely unusable.

Slide 3. Multimodality. Potentials through MMI.

Slide 4. Multimodality – New User Interfaces. Composite usage scenario: Map.
Example: the user selects a point of interest by clicking on it with a stylus while speaking, in order to focus it: "Zoom in here."

Slide 5. Multimodality – New User Interfaces. Composite usage scenario: SoundLogo (a personalized call-connect signal).
Example: the user selects a sound logo by clicking on its title with a stylus while speaking, in order to hear it: "Play this sound logo."

Slide 6. Multimodality – New User Interfaces. Components of a multimodal end-to-end connection.
(Architecture diagram; layers: User – Client – Server (Internet / Services) – Content/Back-End.)
- User interface input: voice, stylus, gesture, …
- User interface output: voice, text, graphics, video, …
- Client–server channels: voice and data.
- Server components: dialog management, synchronisation management, media resource management (ASR/TTS).
- Types of multimodality: sequential, parallel.

Slide 7. Multimodality – New User Interfaces. Main modules for parallel interaction.
(Module diagram:)
- Recognition: speech (grammar-based), ink, system-generated input, etc.
- Interpretation: semantic interpretation of the recognised inputs (mouse/keyboard events enter here directly); results are encoded in EMMA.
- Integration: an integration processor and an interaction manager combine the EMMA-encoded inputs, drawing on the system and environment, the application functions, and the session component.

Slide 8. Multimodality – New User Interfaces. User Perspective.
Feedback in a nutshell from diverse previous innovation projects: "Give us speech control."
- Composite interaction with a full prototype implementation for customer self-service: two campaigns (SMS & Personalized Call Connect Signal).
- The possibilities and advantages of the new multimodal interaction paradigm need to be communicated to users actively.
- Real appreciation of speech control and good acceptance of the "push-to-talk" mode.
- Expectation: symmetry and consistency between the interaction modes. BUT: How do users really want to speak to the machine? How should feedback be provided, and how are input errors corrected?
- Great for context-dependent service interaction. BUT: Which mode is most suitable for which task? For whom? Under which circumstances?

Slide 9. Multimodality – New User Interfaces. Challenges.
- Sequential vs. parallel I/O
- Unique interpretation of multimodal hypotheses
- Discourse phenomena such as anaphora resolution and generation
- Input correction loops
- Encapsulation of I/O tools to achieve a generic front end
- Model-Driven Architecture

Slide 10. Thank you for your attention!

Slide 11. Multimodality – New User Interfaces. Sequential and Parallel Input.
Sequential input:
- Multimodal applications may let the user choose between different input modalities, e.g. to speak or to click a button.
- Only one input channel is interpreted at a time, i.e. the user may either speak or click a button.
- Multiple input channels are interpreted sequentially, as defined by the application.
- Example: select a field and then speak "My number is …"; then click only on a button; afterwards, navigate "Back to main menu".
Parallel input (also known as composite input):
- Multimodal applications allow multiple input modes to be used at nearly the same time, e.g. the user may speak and tap on the screen.
- The multimodal application combines the multiple inputs and interprets them jointly.
- Example: the user navigates a map and speaks "zoom in here".
⇒ Parallel input needs additional platform or application capabilities in order to combine (integrate) and interpret the multiple inputs; a sketch of such a combined input follows below.
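For illustration, EMMA 1.0 can group near-simultaneous unimodal inputs that still await integration. A minimal sketch, assuming the standard EMMA 1.0 container elements and annotation attributes; the payload elements (point, x, y) and the coordinate values are hypothetical application markup, not part of EMMA:

<emma:emma version="1.0" xmlns:emma="http://www.w3.org/2003/04/emma">
  <!-- two inputs produced at nearly the same time, not yet integrated -->
  <emma:group>
    <emma:interpretation id="speech1" emma:medium="acoustic" emma:mode="voice">
      <emma:literal>zoom in here</emma:literal>
    </emma:interpretation>
    <emma:interpretation id="ink1" emma:medium="tactile" emma:mode="ink">
      <point><x>17</x><y>54</y></point>
    </emma:interpretation>
  </emma:group>
</emma:emma>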

Slide 12. Multimodality – New User Interfaces. Example: composite input for voice and stylus.
The user speaks and clicks on the screen: "Zoom in."
(Pipeline diagram: Recognition (grammar; speech, ink) → Interpretation (semantic interpretation) → Integration (integration processor, interaction manager), with results exchanged as EMMA.)
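At the recognition stage, the speech channel might come back as follows; a sketch assuming EMMA 1.0 elements, with the id and the confidence value invented for illustration:

<emma:emma version="1.0" xmlns:emma="http://www.w3.org/2003/04/emma">
  <!-- raw speech recognition result, prior to semantic interpretation -->
  <emma:interpretation id="speech1" emma:medium="acoustic" emma:mode="voice"
                       emma:confidence="0.9">
    <emma:literal>zoom in</emma:literal>
  </emma:interpretation>
</emma:emma>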

Slide 13. Multimodality – New User Interfaces. Example: composite input for voice and stylus.
Interpretation: zoom_in.
Semantic interpretation: action = zoom_in; location = x, y to be taken from the stylus.
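The semantic interpretation step maps the utterance onto application semantics, roughly as sketched below; the payload element name (action) is hypothetical, since EMMA standardises only the container:

<emma:emma version="1.0" xmlns:emma="http://www.w3.org/2003/04/emma">
  <!-- semantic interpretation of "zoom in"; the location is still open -->
  <emma:interpretation id="speech1-sem" emma:medium="acoustic" emma:mode="voice">
    <action>zoom_in</action>
    <!-- location (x, y) to be filled in from the stylus input during integration -->
  </emma:interpretation>
</emma:emma>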

Slide 14. Multimodality – New User Interfaces. Example: composite input for voice and stylus.
The user clicks on the map while speaking: x = 17, y = 54.
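The ink channel's recognition result could be captured analogously; the coordinates are those from the slide, while the payload element names (x, y) are again hypothetical:

<emma:emma version="1.0" xmlns:emma="http://www.w3.org/2003/04/emma">
  <!-- raw stylus click, prior to interpretation -->
  <emma:interpretation id="ink-raw1" emma:medium="tactile" emma:mode="ink">
    <x>17</x>
    <y>54</y>
  </emma:interpretation>
</emma:emma>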

Slide 15. Multimodality – New User Interfaces. Example: composite input for voice and stylus.
Interpretation: point.
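Interpreting the click as a point might then look like this; emma:derived-from is an EMMA 1.0 element for linking an interpretation to its source, while point/x/y remain hypothetical application markup:

<emma:emma version="1.0" xmlns:emma="http://www.w3.org/2003/04/emma">
  <!-- the raw click is interpreted as a point on the map -->
  <emma:interpretation id="ink1" emma:medium="tactile" emma:mode="ink">
    <emma:derived-from resource="#ink-raw1"/>
    <point><x>17</x><y>54</y></point>
  </emma:interpretation>
</emma:emma>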

Slide 16. Multimodality – New User Interfaces. Example: composite input for voice and stylus.
Integration:
<emma:emma version="1.0" xmlns:emma="http://www.w3.org/2003/04/emma">
  <emma:interpretation>
    <action>zoom_in</action>
    <point>
      <x>17</x>
      <y>54</y>
    </point>
  </emma:interpretation>
</emma:emma>
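EMMA 1.0 also lets an integrated result reference the unimodal interpretations it was built from; a sketch reusing the ids assumed in the previous steps:

<emma:emma version="1.0" xmlns:emma="http://www.w3.org/2003/04/emma">
  <!-- integrated result pointing back at its sources -->
  <emma:interpretation id="comp1">
    <emma:derived-from resource="#speech1-sem" composite="true"/>
    <emma:derived-from resource="#ink1" composite="true"/>
    <action>zoom_in</action>
    <point><x>17</x><y>54</y></point>
  </emma:interpretation>
</emma:emma>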

Slide 17. Multimodality – New User Interfaces. Methods and functionalities: the interaction manager.
Interaction manager (application-specific tasks):
- Checking the input data: integrated input? speech only? ink/stylus only?
- Checking the suitability of the integration results: are the input data compatible? (e.g. does the actual number of stylus inputs, say two, match the expected value?)
- Mapping recognition results from the different modalities, e.g.: speech recognition error but stylus correct; speech recognition OK but stylus incorrect; confidence OK and stylus OK.
- Deciding on the error-handling output: graphical, audio, prompt, TTS.
- Handling redundant information and creating the related user reaction: prioritisation of the input modalities.