Wolfgang Wahlster
Language Technologies for the Mobile Internet Era
German Research Center for Artificial Intelligence, DFKI GmbH
Stuhlsatzenhausweg 3, 66123 Saarbruecken, Germany
phone: (+49 681) 302-5252/4162

© W. Wahlster Multimodal Interfaces to 3G Mobile Services
Market studies (May 2002) on multimodal UMTS systems predict:
- Cumulative revenues of almost 1 trillion € from launch until 2010
- Non-voice service revenues will dominate voice revenues by year 3 and comprise 66% of 3G service revenues by 2010
- In 2010 the average 3G subscriber will spend about 30 € per month on 3G data services

© W. Wahlster Multimodal UMTS Systems: Intelligent Interaction with Mobile Internet Services
- Access to web content and web services anywhere and anytime
- Access to corporate networks and virtual private networks from any device
- Access to edutainment and infotainment services
- Access to all messages (voice, e-mail, multimedia, MMS) from any single device
- Personalization
- Localization

© W. Wahlster Mobile Messaging Services Evolution: From SMS to MMS
- Infrastructure: SS7/SMSC (SMS) to MMS relays and servers, UMTS, IP/MPLS protocols
- Terminals: standard phones to EMS phones to MMS phones with integrated image capture and smart phones
- Customer expectation: from ubiquity and youth focus to personalized services, location-based services, and emotional experience
- Applications: from text to enhanced text, pictures, audio, video, multimedia, and enhanced message creation
Language technologies for MMS: speech synthesis (with affect), multimodal authoring interface, speech-based retrieval of media objects

© W. Wahlster Outline of the Talk
1. Using all Human Senses for Intuitive Interfaces
2. Media Fusion and Fission in the SmartKom System
3. Client/Server Architectures for Mobile Multimodal Dialogue Systems
4. Added-Value Mobile Services
5. Research Roadmaps for Multimodal Interfaces
6. Conclusions

© W. Wahlster From Spoken Dialogue to Multimodal Dialogue
- Verbmobil: today's cell phone, speech only
- SmartKom: third-generation UMTS phone with speech, graphics and gesture

© W. Wahlster Merging Various User Interface Paradigms
Spoken dialogue, graphical user interfaces, gestural interaction, facial expressions, and haptic input merge into multimodal interaction.

© W. Wahlster Using All Human Senses for Intuitive Interaction: Code, Media and Modalities
- CODE (systems of symbols): language, graphics, gesture, facial expression
- MEDIA (physical information carriers): system input and output channels, storage (HD drive, DVD)
- MODALITIES (human senses): visual, auditory, tactile, haptic

© W. Wahlster Symbolic and Subsymbolic Fusion of Multiple Modes
- Input recognizers: speech recognition, gesture recognition, prosody recognition, facial expression recognition, lip reading
- Subsymbolic fusion: neural networks, hidden Markov models
- Symbolic fusion: graph unification, Bayesian networks
- Both feed reference resolution and disambiguation, yielding a semantic representation
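The graph-unification style of symbolic fusion named on this slide can be illustrated with a toy sketch: a speech hypothesis leaves a deictic slot open, and a gesture hypothesis fills it via unification of feature structures. All names, structures, and values here are invented for illustration and are not SmartKom APIs.

```python
def unify(a, b):
    """Unify two feature structures (nested dicts); return None on conflict."""
    result = dict(a)
    for key, value in b.items():
        if key not in result:
            result[key] = value
        elif isinstance(result[key], dict) and isinstance(value, dict):
            sub = unify(result[key], value)
            if sub is None:
                return None
            result[key] = sub
        elif result[key] != value:
            return None  # conflicting atomic values: unification fails
    return result

# Speech: "reserve a seat there" leaves the deictic location unresolved
speech_hyp = {"act": "reserve", "object": {"type": "seat"}}
# A pointing gesture on the seating map supplies the missing referent
gesture_hyp = {"object": {"location": "row3-seat5"}}

merged = unify(speech_hyp, gesture_hyp)
print(merged)  # {'act': 'reserve', 'object': {'type': 'seat', 'location': 'row3-seat5'}}
```

If the two modes had supplied conflicting values for the same slot, unification would fail and the fusion component could fall back to the next-best hypothesis pair.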

© W. Wahlster Mutual Disambiguation of Multiple Input Modes
The combination of speech and vision analysis increases the robustness and understanding capabilities of multimodal user interfaces:
- Speech recognition + lip reading: increased robustness in noisy environments
- Speech recognition + gesture recognition (XTRA, SmartKom): referential disambiguation and focus control
- Speech recognition + facial expression recognition (SmartKom): recognition of irony and sarcasm, scope disambiguation
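Mutual disambiguation is often realized by jointly re-ranking the n-best lists of the individual recognizers: a hypothesis ranked low by one mode can win overall when the other mode supports it. The following sketch uses invented utterances, gesture labels, and confidence scores purely for illustration.

```python
# Each recognizer delivers an n-best list of (hypothesis, confidence) pairs.
speech_nbest = [("switch to CNN", 0.6), ("switch to NTV", 0.4)]
gesture_nbest = [("point:NTV", 0.9), ("point:CNN", 0.1)]

def compatible(utterance, gesture):
    """Semantic filter: the pointed-at channel must occur in the utterance."""
    channel = gesture.split(":")[1]
    return channel in utterance

# Score all semantically compatible cross-modal pairs and keep the best one.
best = max(
    ((s, g, s_conf * g_conf)
     for s, s_conf in speech_nbest
     for g, g_conf in gesture_nbest
     if compatible(s, g)),
    key=lambda triple: triple[2],
)
print(best[0])  # the confident gesture overrides the speech ranking: switch to NTV
```

Here the top speech hypothesis ("CNN") loses because the gesture recognizer is far more confident about "NTV", which is exactly the re-ranking effect the slide describes.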

© W. Wahlster SmartKom: A Transportable Interface Agent
- SmartKom-Public: a multimodal communication kiosk
- SmartKom-Mobile: a handheld communication assistant
- SmartKom-Home/Office: a multimodal portal to information services
Kernel of the SmartKom interface agent: media analysis, interaction management, application management, media design

© W. Wahlster SmartKom: Intuitive Multimodal Interaction
The SmartKom Consortium (main contractor: DFKI Saarbruecken): MediaInterface, European Media Lab, Univ. of Munich, Univ. of Stuttgart, Univ. of Erlangen, with sites in Saarbruecken, Aachen, Dresden, Berkeley, Stuttgart, Munich, Heidelberg, and Ulm
Project budget: € 25.5 million, funded by BMBF (Dr. Reuse) and industry
Project duration: 4 years (September 1999 to September 2003)

© W. Wahlster SmartKom's SDDP Interaction Metaphor
SDDP = Situated Delegation-oriented Dialogue Paradigm: the user specifies a goal and delegates the task to a personalized interaction agent; user and agent cooperate on problems, the agent asks questions and presents results, drawing on web services (Service 1, Service 2, Service 3).
See: Wahlster et al. 2001, Eurospeech

© W. Wahlster Multimodal Input and Output in the SmartKom System Where would you like to sit?

© W. Wahlster Personalized Interaction with WebTVs via SmartKom (DFKI with Sony, Philips, Siemens)
Example: multimodal access to electronic program guides for TV
User: Switch on the TV.
Smartakus: Okay, the TV is on.
User: Which channels are presenting the latest news right now?
Smartakus: CNN and NTV are presenting news.
User: Please record this news channel on a videotape.
Smartakus: Okay, the VCR is now recording the selected program.

© W. Wahlster Using Facial Expression Recognition for Affective Personalization
Processing ironic or sarcastic comments:
(1) Smartakus: Here you see the CNN program for tonight.
(2) User: That's great.
(3) Smartakus: I'll show you the program of another channel for tonight.
(2') User: That's great.
(3') Smartakus: Which of these features do you want to see?
Depending on the recognized facial expression, the same utterance (2) leads to different system reactions.

© W. Wahlster The SmartKom Demonstrator System
Camera for gestural input, microphone, multimodal control of TV set and VCR/DVD player

© W. Wahlster A Demonstration of SmartKom’s Multimodal Interface for the German President Dr. Rau

© W. Wahlster Salient Characteristics of SmartKom
- Seamless integration and mutual disambiguation of multimodal input and output on semantic and pragmatic levels
- Situated understanding of possibly imprecise, ambiguous, or incomplete multimodal input
- Context-sensitive interpretation of dialog interaction on the basis of dynamic discourse and context models
- Adaptive generation of coordinated, cohesive and coherent multimodal presentations
- Semi- or fully automatic completion of user-delegated tasks through the integration of information services
- Intuitive personification of the system through a presentation agent

© W. Wahlster Multimodal Input and Output in SmartKom: Fusion and Fission of Multiple Modalities
Input by the user and output by the presentation agent: speech, gesture, facial expressions

© W. Wahlster The Need for Personalization: Adaptive Interaction with Mobile Devices
Display capabilities range from e.g. 60 x 90 pixel b/w screens to e.g. 1024 x 768 pixel 24-bit color screens.
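The gap between a 60 x 90 b/w display and a 1024 x 768 24-bit display can be bridged by a capability check that picks a presentation variant per device. This is a minimal sketch; the thresholds and variant names are invented assumptions, not part of SmartKom.

```python
def choose_presentation(width, height, bits_per_pixel):
    """Pick a presentation variant that fits the device's display."""
    if width * height < 320 * 240 or bits_per_pixel == 1:
        return "text+icons"        # e.g. a 60 x 90 pixel b/w handheld display
    if width * height < 1024 * 768:
        return "scaled-graphics"   # mid-range color displays
    return "full-multimedia"       # e.g. 1024 x 768 pixel, 24-bit color

assert choose_presentation(60, 90, 1) == "text+icons"
assert choose_presentation(1024, 768, 24) == "full-multimedia"
```

In a real system such a check would sit in the media design component, so that one content representation is rendered differently per device rather than authored twice.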

© W. Wahlster PEACH: "Beaming" a Life-Like Character from a Large Public Display to a Mobile Personal Device
PEACH: Personalized Edutainment in Museums (IRST and DFKI)

© W. Wahlster A "Web of Meaning" has more Personalization Potential than a "Web of Links"
Three layers of webpage annotations and their personalization potential:
- Content (OWL, DAML+OIL): high
- Structure (XML): medium
- Layout (HTML): low
cf.: Dieter Fensel, James Hendler, Henry Liebermann, Wolfgang Wahlster (eds.): Spinning the Semantic Web, MIT Press, November 2002

© W. Wahlster Personalization: Mapping Web Content Onto a Variety of Structures and Layouts
From the "one-size-fits-all" approach of static webpages to the "perfect personal fit" approach of adaptive webpages: one content representation (OWL) maps onto several structures (XML 1 ... XML n), and each structure maps onto several layouts (HTML 11 ... HTML 1m, HTML 21 ... HTML 2o, ...).
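The content-to-structure-to-layout pipeline can be sketched in a few lines: one content object (standing in for the OWL layer) is filtered through a structure choice and then rendered into one of several layouts. The item, profiles, and templates below are invented examples, not the actual annotation languages.

```python
# One content object, many structures, many layouts.
content = {"title": "St. Peter's Church", "type": "sight"}

def to_structure(item, profile):
    """Structure layer: the user profile selects which fields are exposed."""
    fields = ["title"] if profile == "brief" else ["title", "type"]
    return {f: item[f] for f in fields}

def to_layout(structure, device):
    """Layout layer: a phone gets terse text, a desktop gets HTML headings."""
    if device == "phone":
        return " / ".join(structure.values())
    return "".join(f"<h2>{v}</h2>" for v in structure.values())

print(to_layout(to_structure(content, "brief"), "phone"))  # St. Peter's Church
```

The point of the layering is that personalization decisions attach to the content and structure layers once, while any number of device-specific layouts can be derived automatically.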

© W. Wahlster SmartKom: Towards Multimodal and Mobile Dialogue Systems for Indoor and Outdoor Navigation
- Seamless integration of various positioning technologies: GSM/UMTS cells, GPS, infrared, WaveLAN, Bluetooth
- Using the same device for driving and walking directions
- Speech and gesture input; graphics and speech output
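Seamless positioning usually means preferring the most precise technology that is currently available, e.g. GPS outdoors and infrared beacons indoors. The sketch below is illustrative only; the accuracy figures are rough assumptions, not measured values.

```python
# Approximate error radius per positioning source, in meters (assumed values).
ACCURACY_M = {"gps": 10, "wavelan": 30, "bluetooth": 10, "infrared": 5, "gsm_cell": 500}

def best_fix(available):
    """Return the available positioning source with the smallest error radius."""
    return min(available, key=lambda src: ACCURACY_M[src])

# Outdoors in a car GPS wins; indoors, infrared beacons take over.
assert best_fix(["gsm_cell", "gps"]) == "gps"
assert best_fix(["gsm_cell", "infrared", "bluetooth"]) == "infrared"
```

A production system would also smooth between sources (e.g. with a filter over successive fixes) so that the transition from driving to walking directions does not cause position jumps.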

© W. Wahlster Spoken Dialogues with the Car Navigation System: SENECA Product Announcement for E-Class Mercedes: End of 2002

© W. Wahlster Getting Driving and Walking Directions via SmartKom
SmartKom can be used for multimodal navigation dialogues in a car:
User: I want to drive to Heidelberg.
Smartakus: Do you want to take the fastest or the shortest route?
User: The fastest.
Smartakus: Here you see a map with your route from Saarbrücken to Heidelberg.

© W. Wahlster Getting Driving and Walking Directions via SmartKom
Smartakus: You are now in Heidelberg. Here is a sightseeing map of Heidelberg.
User: I would like to know more about this church!
Smartakus: Here is some information about St. Peter's Church.
User: Could you please give me walking directions to this church?
Smartakus: In this map, I have highlighted your walking route.

© W. Wahlster SmartKom: Multimodal Dialogues with a Hybrid Navigation System

© W. Wahlster Spoken Navigation Dialogues with SmartKom
User: SmartKom, please look for the nearest parking lot.
SmartKom: The parking garage at the main station provides 300 slots. Opening hours are from 6 am to 11 pm. Do you want to get there?
User: No, please tell me about the next parking option.
SmartKom: The Market parking lot provides 150 slots. It is open 24 hours a day. Do you want to get there?
User: Yes, please.
SmartKom: I'll bring you to the Market parking lot.

© W. Wahlster The High-Level Control Flow of SmartKom

© W. Wahlster A Spectrum of Client/Server Architectures for Mobile Multimodal Systems: From Thin to Fat Clients
- Remote speech understanding: Java-based voice streaming to a server-side speech understanding system
- Distributed speech understanding: Aurora speech features sent to a speech understanding system with a feature interface
- Embedded speech understanding: recognition on the device, with web services for content access (e.g. map updates)

© W. Wahlster M3I: A Mobile, Multimodal, and Modular Interface of DFKI
Platforms: IBM Embedded ViaVoice on an iPAQ (C++), Jornada (Embedded Java), Java-based voice streaming to SmartKom's multimodal dialogue engine
1. Hybrid speech understanding = embedded (small vocabulary) + remote/distributed (large vocabulary, topic detection) speech understanding
2. Resource-adaptive speech processing: the availability of a server improves coverage and quality
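The hybrid idea can be sketched as a routing decision: stay on the embedded small-vocabulary recognizer when it suffices or when no server is reachable, and stream to the remote large-vocabulary system otherwise. Everything below is an invented illustration; in a real system the decision precedes recognition (e.g. via an embedded confidence score), while this sketch keys on already-tokenized words purely to stay testable.

```python
# Assumed on-device command vocabulary (illustrative).
EMBEDDED_VOCAB = {"yes", "no", "stop", "left", "right", "map"}

def route_utterance(words, server_available):
    """Decide where an utterance should be recognized."""
    if all(w in EMBEDDED_VOCAB for w in words):
        return "embedded"            # small vocabulary, handled on the device
    if server_available:
        return "remote"              # large vocabulary via voice streaming
    return "embedded-fallback"       # degraded coverage, but still responsive

assert route_utterance(["stop"], server_available=False) == "embedded"
assert route_utterance(["find", "parking"], server_available=True) == "remote"
```

This is exactly the resource-adaptive property the slide claims: the interface keeps working offline, and quality and coverage improve whenever a server is in reach.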

© W. Wahlster Example of an Embedded Multimodal Dialogue System: M3I for Pedestrian Navigation (DFKI)
Spoken and gestural input combined with graphics and speech output on an iPAQ

© W. Wahlster Java-Based Voice Streaming for Hybrid Speech Understanding in M3I (DFKI)

© W. Wahlster SmartKom's Added-Value Mobile Service ActiveList
"Please let me know when I pass a shop selling batteries."
SmartKom sends a note to the user or activates an alarm as soon as the user approaches an exhibit that matches the specification of an item on the ActiveList.
ActiveList's spatial alarm can be combined with:
- route planning and navigation
- temporal and spatial optimization of a visit
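The ActiveList spatial alarm amounts to a geofencing check: alert when the user comes within some radius of a location matching a list item. The shops, coordinates, and radius in this sketch are invented for illustration.

```python
import math

# Hypothetical points of interest with what they offer.
shops = [
    {"name": "ElectroMart", "sells": {"batteries"}, "pos": (49.235, 6.996)},
    {"name": "BookCorner", "sells": {"books"}, "pos": (49.240, 7.010)},
]

def near(p, q, radius_m=100):
    """Small-area distance check: one degree of latitude is roughly 111 km."""
    dy = (p[0] - q[0]) * 111_000
    dx = (p[1] - q[1]) * 111_000 * math.cos(math.radians(p[0]))
    return math.hypot(dx, dy) <= radius_m

def active_list_alerts(user_pos, wanted):
    """Names of nearby shops matching any item on the user's ActiveList."""
    return [s["name"] for s in shops if wanted & s["sells"] and near(user_pos, s["pos"])]

print(active_list_alerts((49.2351, 6.9962), {"batteries"}))  # ['ElectroMart']
```

Combining this check with the positioning layer and the route planner yields the described behaviour: the alarm fires while walking past, and the system can immediately offer directions to the shop.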

© W. Wahlster SmartKom's Added-Value Mobile Service SpotInspector
"What's going on at the castle right now?"
SmartKom allows the user remote visual access to various interesting spots via a selection of webcams, showing current waiting queues, special events and activities.
SpotInspector can be combined with:
- multimedia presentations of the expected program for these spots
- route planning and navigation to these spots

© W. Wahlster SmartKom's Added-Value Mobile Service PartnerRadar
"Where are Lisa and Tom? What are they looking at?"
SmartKom helps to locate and to bring together members of the same party.
Involved technologies:
- navigation and tour instructions
- monitoring of group activity
- additional information on exhibits that are interesting for the whole party

© W. Wahlster Ultimate Simplicity: One-Button Mobile Devices
Components: reflectors, photo detector, speaker, command button, microphone, fingerprint recognizer
Examples: 8hertz technologies (Germany), CARC Cyber Assist Research Center (Japan)

UMTS-Doit: The First Test and Evaluation Center for UMTS-based Multimodal Speech Services in Germany
Network setup: UMTS Node B at DFKI Saarbruecken, RNC in Munich (Gigastream, E1/ATM link), UMTS-Doit server and navigation switch, with connections to the mobile network, Internet content providers, and the PSTN/telephone system

© W. Wahlster UMTS Applications in a Mercedes: Webcam Providing a Look-Ahead of the Traffic Situation

© W. Wahlster UMTS Application in a Mercedes: Language-based Music Download DFKI Spin-off: Natural Language Music Search

© W. Wahlster Personalized Car Entertainment (DFKI for Bosch): MP3 Music Files from the Web
Rist & Herzog for Blaupunkt

© W. Wahlster Research Roadmap of Multimodality (Dagstuhl Seminar "Fusion and Coordination in Multimodal Interaction", 2 Nov, edited by W. Wahlster)
Tracks: empirical and data-driven models of multimodality; advanced methods for multimodal communication; computational models of multimodality. Goal: mobile, human-centered, and intelligent multimodal interfaces.
Milestones include: adequate corpora for MM research; a multimodal interface toolkit; XML-encoded MM human-human and human-machine corpora; mobile multimodal interaction tools; standards for the annotation of MM training corpora; examples of the added value of multimodality; multimodal barge-in; markup languages for multimodal dialogue semantics; models for effective and trustworthy MM HCI; a collection of the hardest and most frequent/relevant phenomena; task-, situation- and user-aware multimodal interaction; plug-and-play infrastructure; toolkits for multimodal systems; situated and task-specific MM corpora; a common representation of multimodal content; decision-theoretic, symbolic and hybrid modules for MM input fusion; reusable components for multimodal analysis and generation; corpora with multimodal artefacts and new multimodal input devices; models of MM mutual disambiguation; multiparty MM interaction; a multimodal toolkit for universal access

© W. Wahlster Research Roadmap of Multimodality, continued (Dagstuhl Seminar "Fusion and Coordination in Multimodal Interaction", 2 Nov, edited by W. Wahlster)
Tracks: empirical and data-driven models of multimodality; advanced methods for multimodal communication; toolkits for multimodal systems. Goal: ecological multimodal interfaces.
Milestones include: usability evaluation methods for MM systems; multimodal feedback and grounding; tailored and adaptive MM interaction; incremental feedback between modalities during generation; models of MM collaboration; parametrized models of multimodal behaviour; demonstration of performance advances through multimodal interaction; real-time localization and motion/eye tracking technology; multimodality in VR and AR environments; resource-bounded multimodal interaction; users' theories of the system's multimodal capabilities; multicultural adaptation of multimodal presentations; affective MM communication; test suites and benchmarks for multimodal interaction; multimodal models of engagement and floor management; non-monotonic MM input interpretation; computational models of the acquisition of MM communication skills; non-intrusive and invisible MM input sensors; biologically-inspired intersensory coordination models

© W. Wahlster Burning Issues in Multimodal Interaction
- Multimodality: from alternate modes of interaction towards mutual disambiguation and synergistic combinations
- Discourse models: from information-seeking dialogues towards argumentative dialogues and negotiations
- Domain models: from closed-world assumptions towards the open world of web services
- Dialogue behaviour: from automata models towards a combination of probabilistic and plan-based models

© W. Wahlster Conclusions
- Multimodal interfaces increase the robustness of interaction, enable mutual disambiguation, and lead to intuitive and efficient dialogues
- Hybrid and resource-adaptive client/server architectures improve the quality and coverage of mobile multimodal interfaces
- The combination of indoor and outdoor navigation for drivers and pedestrians on a single device across various wireless technologies is one of the "killer apps" for UMTS services

URL of this Presentation:

© W. Wahlster Thank you very much for your attention
© 2002 DFKI, Design by R.O.