German Research Center for Artificial Intelligence DFKI GmbH 66123 Saarbruecken, Germany WWW:http://www.dfki.de/~wahlster Eurospeech.

Presentation transcript:

SmartKom: Multimodal Communication with a Life-like Character
Wolfgang Wahlster, Anselm Blocher, Norbert Reithinger
German Research Center for Artificial Intelligence, DFKI GmbH, Saarbruecken, Germany
WWW: http://www.dfki.de/~wahlster
Eurospeech Scandinavia, Dialog Systems - Project Descriptions II, Aalborg, 6 September 2001

© W. Wahlster
From Spoken Dialogue to Multimodal Dialogue
Verbmobil: today's cell phone, speech only
SmartKom: third-generation UMTS phone, speech, graphics and gesture

Merging Various User Interface Paradigms
Spoken dialogue, graphical user interfaces and gestural interaction merge into multimodal interaction

Multimodal Interaction with a Life-like Character
User (input: speech and gesture): "I'd like to reserve tickets for this movie."
Smartakus (output: speech, gesture and facial expressions): "Where would you like to sit?"
User (input: speech and gesture): "I'd like these two seats."

SmartKom: Multimodal Dialogs with a Life-like Character

SmartKom: Intuitive Multimodal Interaction
The SmartKom Consortium
Main contractor: DFKI Saarbrücken
Partners named on the slide: MediaInterface, European Media Lab, Univ. of Munich, Univ. of Stuttgart, Univ. of Erlangen
Sites: Saarbrücken, Aachen, Dresden, Berkeley, Stuttgart, Munich, Heidelberg, Ulm
Project budget: € 25.5 million
Project duration: 4 years (September 1999 – September 2003)

Salient Characteristics of SmartKom
Seamless integration and mutual disambiguation of multimodal input and output on semantic and pragmatic levels
Situated understanding of possibly imprecise, ambiguous, or incomplete multimodal input
Context-sensitive interpretation of dialog interaction on the basis of dynamic discourse and context models
Adaptive generation of coordinated, cohesive and coherent multimodal presentations
Semi- or fully automatic completion of user-delegated tasks through the integration of information services
Intuitive personification of the system through a presentation agent

SmartKom: A Transportable Interface Agent
Kernel of the SmartKom interface agent: media analysis, interaction management, application management, media design
SmartKom-Home/Office: Multimodal Portal to Information Services
SmartKom-Public: A Multimodal Communication Kiosk
SmartKom-Mobile: A Handheld Communication Assistant

SmartKom-Home on a Portable Webpad
Hardware: Fujitsu Stylistic™ 3500X, 500 MHz Intel® Celeron™, 10.4" XGA TFT (1024x768 pixels), 256 MB SDRAM, 15 GB shock-mounted
Provides electronic program guides (EPG) for TV, controls consumer electronics like VCRs, and accesses standard applications like phone and [...]
Lean-forward mode: coordinated speech and gesture input
Lean-backward mode: voice input alone

SmartKom-Mobile
Can be added to a car navigation system or carried by a pedestrian.
Additional services like route planning and interactive navigation through a city can be accessed via GPS and GSM/UMTS connectivity.

SmartKom's SDDP Interaction Metaphor
SDDP = Situated Delegation-oriented Dialogue Paradigm
User: specifies goal, delegates task
User and Personalized Interaction Agent: cooperate on problems
Personalized Interaction Agent: asks questions, presents results
IT Services: Service 1, Service 2, Service 3
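
The delegation idea behind SDDP can be illustrated with a small sketch. It is not SmartKom code: the class `InteractionAgent`, the method `delegate`, the service `cinema_service` and the slot names are all hypothetical, and the only behaviour taken from the slide is that the user delegates a task, the agent asks a question when something is missing, and the agent presents the results it obtains from a service.

```python
from typing import Callable, Dict, List

# A service takes the task parameters and returns result strings (assumed shape).
Service = Callable[[Dict[str, str]], List[str]]


class InteractionAgent:
    """Hypothetical stand-in for the personalized interaction agent."""

    def __init__(self, services: Dict[str, Service], required: Dict[str, List[str]]):
        self.services = services    # goal name -> service that can fulfil it
        self.required = required    # goal name -> parameters the service needs

    def delegate(self, goal: str, params: Dict[str, str]) -> str:
        # Ask a question if the delegated task is still underspecified.
        missing = [slot for slot in self.required.get(goal, []) if slot not in params]
        if missing:
            return f"Question: which {missing[0]}?"
        # Otherwise call the service and present its results.
        results = self.services[goal](params)
        return "Results: " + ", ".join(results)


def cinema_service(params: Dict[str, str]) -> List[str]:
    # Illustrative service; a real one would contact an external system.
    return [f"{params['seats']} seats reserved at {params['cinema']}"]


agent = InteractionAgent({"reserve": cinema_service}, {"reserve": ["cinema", "seats"]})
first = agent.delegate("reserve", {"cinema": "Europa"})
second = agent.delegate("reserve", {"cinema": "Europa", "seats": "2"})
```

In this sketch the first call yields a clarification question and the second a result presentation, which mirrors the "asks questions" and "presents results" arrows on the slide.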

Visual Support for SDDP
Adaptation to the user's viewing angle
Reduction of the association between screen and computer (no background)
Spotlights guide and control the user's attention

The Perspective of the User
Classic fixed isometric perspective
Completely variable
User-adaptive perspective with limited variability

Decomposition of Behavioural Schemata: Phases of Gestures
Preparation, Stroke, Retraction
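
The three phases can be read as a fixed sequence that every gesture passes through. The sketch below shows that reading as a tiny state sequence; the names `GesturePhase`, `next_phase` and `run_gesture` are hypothetical, and the assumption that the phases always occur in this order without repetition comes from the order on the slide, not from a stated rule.

```python
from enum import Enum
from typing import List, Optional


class GesturePhase(Enum):
    PREPARATION = 1   # move into position
    STROKE = 2        # the meaningful part of the gesture
    RETRACTION = 3    # return to rest


def next_phase(phase: GesturePhase) -> Optional[GesturePhase]:
    """Return the phase that follows `phase`, or None after retraction."""
    members = list(GesturePhase)
    index = members.index(phase)
    return members[index + 1] if index + 1 < len(members) else None


def run_gesture() -> List[str]:
    """Walk through all phases once and record their names."""
    trace = []
    phase: Optional[GesturePhase] = GesturePhase.PREPARATION
    while phase is not None:
        trace.append(phase.name)
        phase = next_phase(phase)
    return trace
```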

Some Complex Behavioural Patterns of the Interaction Agent Smartakus
Examples of complex motion patterns, pointing gestures and co-speech gestures: enumerate five points, go in a circle, jump on the spot
The "i" shape of Smartakus is reminiscent of the "i" sign at information kiosks.

Multimodal Input and Output in SmartKom
Input by the user and output by the presentation agent: speech, gesture, facial expressions

Modality-Specific Representation Languages as an Intermediate Representation before Media Fusion
Speech input: parsing, Semantic Representation Language
Gestures: gesture analysis and gesture generation, Gesture Description Language
Facial expressions: face analysis and facial expression generation, Face Description Language
Knowledge sources: ontologies; Knowledge Representation Language with inference component; DBMS/KBMS/WWW
M3L based on XML

SmartKom's Data Collection of Multimodal Dialogs
Recording setup as labelled on the slide: user side-view camera; face-tracking camera with microphone; environmental noise; microphone array; screen with projected webpage; loudspeaker; user bird's-eye camera; LCD beamer; SIVIT camera
See: Talk by U. Türk on the SmartKom Data Collection at [...] am in Session D11, ESE 11, Next Generation Speech Resources

Mobile Presentation Unit for SmartKom-Public
2 Sony DSR-PD100AP video cameras
LCD beamer ASK C5
SIVIT gesture recognition unit with infrared camera
Microphones (microphone array)
Speakers
3 dual Pentium III, 500
Visit the SmartKom demo booth. Next demos: 10 am, 12, 2 pm and 4 pm

The SmartKom Control GUI

Three Levels of Mark-up Languages for the Web
WWW document: content (M3L), structure (XML), form (HTML)
Content : Structure : Form = 1 : n : m

M3L Representation of the Multimodal Discourse Context
Blackboard with Presentation Context of the Previous Dialogue Turn
M3L excerpt (element values only): [...] cinema_17a Europa [...] pid1234 [...]

M3L Representation of the Word Lattice Produced by the Speech Recognizer for "There [pointing gesture] I would like to get a reservation."
M3L excerpt (element values only): T13:44:37.900Z shortPause [...] 5 7 gern PT0.57S PT0.84S 5 7 gerne PT0.57S PT0.84S [...]
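
A word lattice like the one on this slide can be modelled as a set of edges between numbered nodes, each carrying a word hypothesis and a time span. The sketch below is a minimal illustration, not M3L itself: the names `LatticeEdge`, `parse_edge` and `alternatives` are hypothetical, and reading `5 7 gern PT0.57S PT0.84S` as start node, end node, word, begin time and end time is an assumption based on the order of the values. The parser handles only the seconds-only ISO 8601 duration form that appears in the excerpt.

```python
import re
from dataclasses import dataclass
from typing import List


def parse_seconds(duration: str) -> float:
    """Parse a seconds-only ISO 8601 duration such as 'PT0.57S'."""
    match = re.fullmatch(r"PT(\d+(?:\.\d+)?)S", duration)
    if match is None:
        raise ValueError(f"unsupported duration: {duration!r}")
    return float(match.group(1))


@dataclass(frozen=True)
class LatticeEdge:
    start_node: int   # assumed meaning of the first number
    end_node: int     # assumed meaning of the second number
    word: str         # recognizer hypothesis for this span
    begin: float      # seconds, assumed meaning of the first duration
    end: float        # seconds, assumed meaning of the second duration


def parse_edge(fields: str) -> LatticeEdge:
    """Build an edge from five whitespace-separated values."""
    start_node, end_node, word, begin, end = fields.split()
    return LatticeEdge(int(start_node), int(end_node), word,
                       parse_seconds(begin), parse_seconds(end))


def alternatives(edges: List[LatticeEdge], start_node: int, end_node: int) -> List[str]:
    """Return all competing word hypotheses between two nodes."""
    return [e.word for e in edges
            if (e.start_node, e.end_node) == (start_node, end_node)]


# The two competing hypotheses visible in the excerpt.
edges = [parse_edge("5 7 gern PT0.57S PT0.84S"),
         parse_edge("5 7 gerne PT0.57S PT0.84S")]
```

Keeping both `gern` and `gerne` on the same node pair is what lets later components choose between them instead of committing to one recognizer output early.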

Gesture Recognition and Gesture Analysis
Utterance: "There [pointing gesture] I would like to get a reservation."
Gesture lattice as result of gesture recognition (element values only): T14:45: PT0.040S T14:45: PT0.040S tarrying dynamic
Result of gesture analysis (element values only): [...] tarrying dynStructId30 1 dynStructId28 2 [...] cinema_17a Europa [...]

Language Analysis and Media Fusion
Turn 8: "There [pointing gesture] I would like to get a reservation."
M3L excerpt (element values only): [...] acoustic understanding reserve cinema_17a Europa [...]
Annotations: confidence in the speech recognition result (acoustic); confidence in the speech understanding result (understanding); planning act (reserve); object reference (cinema_17a Europa)
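
The fusion step on this slide combines a spoken request that contains the deictic "there" with the object selected by the gesture. The sketch below shows one simple way such a step could work. It is an assumption-laden illustration, not SmartKom's algorithm: the types `Intention` and `GestureCandidate`, the function `fuse`, the numeric confidences and the choice to multiply confidences into a single score are all hypothetical. Only the values `reserve`, `cinema_17a` and `Europa` come from the slide.

```python
from dataclasses import dataclass, replace
from typing import Optional, Sequence


@dataclass(frozen=True)
class GestureCandidate:
    object_id: str      # e.g. "cinema_17a" from the slide
    name: str           # e.g. "Europa" from the slide
    confidence: float   # illustrative value, not from the slide


@dataclass(frozen=True)
class Intention:
    act: str                    # planning act, e.g. "reserve"
    object_id: Optional[str]    # None while the deictic "there" is unresolved
    acoustic: float             # confidence in the speech recognition result
    understanding: float        # confidence in the speech understanding result
    score: float = 0.0          # combined score, filled in by fuse()


def fuse(speech: Intention, gestures: Sequence[GestureCandidate]) -> Intention:
    """Fill an open object reference with the most confident gesture candidate."""
    base = speech.acoustic * speech.understanding
    if speech.object_id is not None or not gestures:
        # Nothing to resolve, or no gesture available: keep the speech reading.
        return replace(speech, score=base)
    best = max(gestures, key=lambda g: g.confidence)
    return replace(speech, object_id=best.object_id, score=base * best.confidence)


# Illustrative confidences; the slide does not give numeric values.
speech = Intention(act="reserve", object_id=None, acoustic=0.9, understanding=0.8)
gestures = [GestureCandidate("cinema_17a", "Europa", 0.7)]
fused = fuse(speech, gestures)
```

Here `fused.object_id` becomes `cinema_17a`, so the request to reserve is bound to the object the user pointed at.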

Result of the Action Planner: Presentation Tasks and Presentation Results
M3L excerpt (element values only): list add [...] 20:00 [...]

Output Synchronization: Speech, Gesture, Graphics, Animation
M3L excerpt (element values only): 11 declarative [...] eine Übersicht [...]

SmartKom uses a Combination of Concept-to-Speech and Text-to-Speech Technologies
Tone labels from the figure: L*H H%L*HH*LH*L%

Classification of Facial Expressions (U. Erlangen)
Processing steps: localization, then classification (SVM, Eigenfaces)
Classes: annoyance, no annoyance

The GUI of the Second SmartKom Prototype

URL of this Presentation: