Wolfgang Wahlster
SmartKom: Fusion and Fission of Speech, Gestures, and Facial Expressions
International Workshop on Man-Machine Symbiotic Systems, Kyoto, 26 November 2002
German Research Center for Artificial Intelligence, DFKI GmbH, Stuhlsatzenhausweg 3, 66123 Saarbruecken, Germany, phone: (+49 681) 302-5252/4162

© W. Wahlster
SmartKom: Merging Various User Interface Paradigms
Spoken dialogue, graphical user interfaces, gestural interaction, facial expressions, and biometrics are merged into multimodal interaction.

© W. Wahlster
Symbolic and Subsymbolic Fusion of Multiple Modes
Input channels: speech recognition, gesture recognition, prosody recognition, facial expression recognition, lip reading.
Subsymbolic fusion: neural networks, hidden Markov models.
Symbolic fusion: graph unification, Bayesian networks.
Both feed reference resolution and disambiguation, yielding a modality-free semantic representation.
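To make the graph-unification step concrete, here is a minimal sketch (in Python, not SmartKom's actual code) of unification over feature structures encoded as nested dicts; all feature names are illustrative:

    def unify(a, b):
        """Monotonic unification of two feature structures (nested dicts);
        returns the merged structure, or None on a feature clash."""
        if not (isinstance(a, dict) and isinstance(b, dict)):
            return a if a == b else None            # atomic values must match
        result = dict(a)
        for key, value in b.items():
            if key in result:
                merged = unify(result[key], value)
                if merged is None:                  # conflict: unification fails
                    return None
                result[key] = merged
            else:
                result[key] = value                 # inherit non-conflicting info
        return result

    # Speech contributes the action, gesture contributes the referent:
    speech = {"act": "reserve", "object": {"type": "seat"}}
    gesture = {"object": {"type": "seat", "id": "seat_42"}}
    print(unify(speech, gesture))
    # {'act': 'reserve', 'object': {'type': 'seat', 'id': 'seat_42'}}

The modality-free result is what downstream reference resolution and disambiguation operate on.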

© W. Wahlster
Outline of the Talk
1. Using all Human Senses for Symbiotic Man-Machine Interaction
2. SmartKom: Multimodal, Multilingual and Multidomain Dialogues
3. Modality Fusion in SmartKom
4. Multimodal Discourse Processing
5. Plan-based Modality Fission in SmartKom
6. Conclusions

© W. Wahlster
SmartKom: A Highly Portable Multimodal Dialogue System
A shared multimodal dialogue back-bone serves three application layers:
- SmartKom-Home/Office: consumer electronics, EPG
- SmartKom-Public: cinema, phone, fax, mail, biometrics
- SmartKom-Mobile: car and pedestrian navigation

© W. Wahlster
SmartKom: Intuitive Multimodal Interaction
The SmartKom Consortium: main contractor DFKI Saarbrücken (scientific director: W. Wahlster), with partners including MediaInterface, European Media Lab, Univ. of Munich, Univ. of Stuttgart, and Univ. of Erlangen, at sites in Saarbrücken, Aachen, Dresden, Berkeley, Stuttgart, Munich, Heidelberg, and Ulm.
Project budget: €25.5 million, funded by BMBF (Dr. Reuse) and industry
Project duration: 4 years (September 1999 – September 2003)

© W. Wahlster
SmartKom's SDDP Interaction Metaphor
SDDP = Situated Delegation-oriented Dialogue Paradigm: the anthropomorphic interface is a dialogue partner. The user specifies a goal and delegates the task to a personalized interaction agent; user and agent cooperate on problems, and the agent asks questions and presents results, drawing on web services (Service 1, Service 2, Service 3).
See: Wahlster et al. 2001, Eurospeech

© W. Wahlster
Multimodal Input and Output in the SmartKom System
[Screenshot: the system asks "Where would you like to sit?"]

© W. Wahlster
Symbiotic Interaction with a Life-like Character
User (speech, gesture, and facial expressions): "I'd like to reserve tickets for this performance."
Smartakus (speech, gesture, and facial expressions): "Where would you like to sit?"
User (speech, gesture, and facial expressions): "I'd like these two seats."

© W. Wahlster
Multimodal Input and Output in SmartKom: Fusion and Fission of Multiple Modalities
Input by the user: speech, gesture, facial expressions.
Output by the presentation agent: speech, gesture, facial expressions.

© W. Wahlster
SmartKom's Data Collection of Multimodal Dialogs
Recording setup around the user: side-view camera, face-tracking camera with microphone, bird's-eye camera, SIVIT camera for gesture tracking, environmental-noise microphone, microphone array, loudspeaker, LCD beamer, and a screen with a projected webpage.

© W. Wahlster
Personalized Interaction with WebTVs via SmartKom (DFKI with Sony, Philips, Siemens)
Example: Multimodal Access to Electronic Program Guides for TV
User: Switch on the TV.
Smartakus: Okay, the TV is on.
User: Which channels are presenting the latest news right now?
Smartakus: CNN and NTV are presenting news.
User: Please record this news channel on a videotape.
Smartakus: Okay, the VCR is now recording the selected program.

© W. Wahlster
Using Facial Expression Recognition for Affective Personalization
Processing ironic or sarcastic comments:
(1) Smartakus: Here you see the CNN program for tonight.
(2) User (negative facial expression): That's great.
(3) Smartakus: I'll show you the program of another channel for tonight.
(2') User (positive facial expression): That's great.
(3') Smartakus: Which of these features do you want to see?
The same utterance "That's great." is read as sarcastic or as genuine depending on the recognized facial expression, leading to different system responses.

© W. Wahlster
Recognizing Affect: A Negative Facial Expression of the User
[Image: classifier output contrasting a "negative" with a "neutral" facial expression]

© W. Wahlster
The SmartKom Demonstrator System
Components: camera for gestural input, camera for facial analysis, microphone, and multimodal control of a TV set and a VCR/DVD player.

© W. Wahlster
Combination of Speech and Gesture in SmartKom
User (speech combined with a pointing gesture): "This one I would like to see. Where is it shown?"

© W. Wahlster
Multimodal Input and Output in SmartKom
[Screenshot: Smartakus asks "Please show me where you would like to be seated."]

© W. Wahlster
Getting Driving and Walking Directions via SmartKom
SmartKom can be used for multimodal navigation dialogues in a car:
User: I want to drive to Heidelberg.
Smartakus: Do you want to take the fastest or the shortest route?
User: The fastest.
Smartakus: Here you see a map with your route from Saarbrücken to Heidelberg.

© W. Wahlster
Getting Driving and Walking Directions via SmartKom
Smartakus: You are now in Heidelberg. Here is a sightseeing map of Heidelberg.
User: I would like to know more about this church!
Smartakus: Here is some information about St. Peter's Church.
User: Could you please give me walking directions to this church?
Smartakus: In this map, I have highlighted your walking route.

© W. Wahlster SmartKom: Multimodal Dialogues with a Hybrid Navigation System

© W. Wahlster
Salient Characteristics of SmartKom
- Seamless integration and mutual disambiguation of multimodal input and output on semantic and pragmatic levels
- Situated understanding of possibly imprecise, ambiguous, or incomplete multimodal input
- Context-sensitive interpretation of dialog interaction on the basis of dynamic discourse and context models
- Adaptive generation of coordinated, cohesive and coherent multimodal presentations
- Semi- or fully automatic completion of user-delegated tasks through the integration of information services
- Intuitive personification of the system through a presentation agent

© W. Wahlster The High-Level Control Flow of SmartKom

© W. Wahlster
SmartKom's Multimodal Dialogue Back-Bone
Analyzers for speech, gestures, and facial expressions feed the dialogue manager, which comprises modality fusion, discourse modeling, action planning, and modality fission and connects to external services; generators produce speech, graphics, and gestures. All components communicate over blackboards, with data flow and context dependencies between them.
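As a rough illustration of the blackboard-based back-bone (a sketch under assumed semantics, not SmartKom's actual middleware), modules can publish to and subscribe on named communication blackboards like this:

    from collections import defaultdict

    class Blackboard:
        """Minimal publish/subscribe hub, loosely mimicking the
        communication blackboards that connect SmartKom's modules."""
        def __init__(self):
            self._subscribers = defaultdict(list)

        def subscribe(self, topic, handler):
            self._subscribers[topic].append(handler)

        def publish(self, topic, message):
            for handler in self._subscribers[topic]:
                handler(message)

    bb = Blackboard()
    # Modality fusion listens to the analyzers and posts a joint intention;
    # action planning listens for intentions:
    bb.subscribe("speech.analyzed",
                 lambda m: bb.publish("intention", {"source": "fusion", **m}))
    bb.subscribe("intention", lambda m: print("action planning received:", m))
    bb.publish("speech.analyzed", {"act": "reserve", "object": "ticket"})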

© W. Wahlster
Unification of Scored Hypothesis Graphs for Modality Fusion in SmartKom
Inputs to modality fusion: a word hypothesis graph with acoustic scores, clause and sentence boundaries with prosodic scores, a gesture hypothesis graph with scores of potential reference objects, and scored hypotheses about the user's emotional state.
Modality fusion unifies them into an intention hypothesis graph; mutual disambiguation reduces the overall uncertainty, and the intention recognizer selects the most likely interpretation.
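A toy version of this fusion-and-selection step (illustrative only; the real system unifies lattices rather than flat hypothesis lists): unify each scored speech hypothesis with each scored gesture hypothesis, multiply the scores, and let the intention recognizer pick the best consistent pair.

    # Each hypothesis is (score, feature_dict); unify is the function
    # sketched in the symbolic-fusion example above.
    speech_hyps = [
        (0.8, {"act": "reserve", "object": {"type": "ticket"}}),
        (0.6, {"act": "retrieve", "object": {"type": "info"}}),
    ]
    gesture_hyps = [
        (0.9, {"object": {"type": "ticket", "id": "perf_7"}}),
        (0.4, {"object": {"type": "poster", "id": "img_3"}}),
    ]

    candidates = [
        (s1 * s2, fused)
        for s1, h1 in speech_hyps
        for s2, h2 in gesture_hyps
        if (fused := unify(h1, h2)) is not None    # mutual disambiguation
    ]
    print(max(candidates, key=lambda c: c[0]))
    # (0.72, {'act': 'reserve', 'object': {'type': 'ticket', 'id': 'perf_7'}})

Inconsistent pairs are filtered out by unification itself, which is how fusion reduces the overall uncertainty.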

© W. Wahlster
SmartKom's Computational Mechanisms for Modality Fusion and Fission
All mechanisms operate on M3L, the modality-free semantic representation. Modality fusion uses unification, overlay operations, and ontological inferences; modality fission uses planning and constraint propagation.

© W. Wahlster
The Overlay Operation Versus the Unification Operation
Unification: inherit (non-conflicting) background information.
Overlay: a nonmonotonic and noncommutative unification-like operation. Two sources of conflicts:
- conflicting atomic values: overwrite background (old) with covering (new)
- type clash: assimilate background to the type of covering; recursion
cf. J. Alexandersson, T. Becker 2001
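A minimal sketch of overlay on nested dicts (illustrative; the operation of Alexandersson and Becker works on typed feature structures and also computes a score, omitted here). Unlike unification, overlay never fails: the covering (new) structure wins every conflict, while non-conflicting background information is inherited.

    def overlay(covering, background):
        """Nonmonotonic, noncommutative overlay: lay the covering (new)
        structure over the background (old); inherit non-conflicting
        background features, let covering win all conflicts."""
        if isinstance(covering, dict) and isinstance(background, dict):
            out = dict(background)                          # inherit old information
            for key, value in covering.items():
                out[key] = overlay(value, background.get(key))  # recurse; new wins
            return out
        return covering if covering is not None else background

    old = {"act": "watch",
           "object": {"medium": "tv", "genre": "film", "time": "tonight"}}
    new = {"object": {"medium": "cinema"}}
    print(overlay(new, old))
    # {'act': 'watch',
    #  'object': {'medium': 'cinema', 'genre': 'film', 'time': 'tonight'}}

This anticipates the movies example two slides below: the medium is overwritten, while "tonight" and the film genre are inherited from the discourse context.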

© W. Wahlster
Overlay Operations Using the Discourse Model
Augmentation and validation: each hypothesis from the intention hypothesis lattice is compared with a number of previous discourse states, consistent information is filled in, and a score is computed. For each hypothesis-background pair, Overlay(covering, background) is applied, with the new hypothesis as covering and the old discourse state as background; the output is the selected augmented hypothesis sequence.
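In sketch form, the augmentation loop might look as follows (the toy score simply counts preserved background features; the published overlay score also weighs type clashes and overwritten values):

    def score(background, result):
        """Toy score: fraction of old (background) features that
        survive unchanged into the augmented result."""
        if not background:
            return 0.0
        kept = sum(1 for k, v in background.items() if result.get(k) == v)
        return kept / len(background)

    def augment(intention_hypotheses, discourse_states):
        """Overlay each hypothesis (covering) on each previous discourse
        state (background) and return the best-scoring augmented result;
        overlay is the function sketched on the previous slide."""
        candidates = [
            (score(state, overlay(hyp, state)), overlay(hyp, state))
            for hyp in intention_hypotheses
            for state in discourse_states
        ]
        return max(candidates, key=lambda c: c[0])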

© W. Wahlster
An Example of the Overlay Operation: Generalisation and Specialisation
U: What films are shown on TV tonight? ...
U: I'd rather go to the movies.
Overlaying the new request (going to the movies) on the old discourse state (films on TV tonight) generalises away from TV broadcasts and specialises to cinema performances, while consistent context such as "tonight" is inherited.

© W. Wahlster
SmartKom's Three-Tiered Discourse Model
[Diagram for the exchange: System: "This [deictic gesture] is a list of films showing in Heidelberg." / User: "Please reserve a ticket for the first one." Linguistic objects (LO2, LO3, ...), gestural objects (GO1, ...), and visual objects (VO1) on the modality layer are linked to discourse objects (DO1, DO2, DO3, ...) on the discourse layer, which are in turn linked to domain objects (e.g. the Heidelberg film list and the ticket reservation) on the domain layer.]
DO = Discourse Object, LO = Linguistic Object, GO = Gestural Object, VO = Visual Object
cf. M. Löckelt et al. 2002, N. Pfleger 2002
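The three tiers could be encoded roughly like this (a schematic data model with made-up identifiers, not SmartKom's implementation):

    from dataclasses import dataclass, field
    from typing import Optional

    @dataclass
    class DiscourseObject:
        """Discourse-layer node linking modality-layer realizations
        (LO/GO/VO) to a domain-layer referent."""
        do_id: str
        modality_objects: list = field(default_factory=list)
        domain_object: Optional[str] = None

    # System: "This [points] is a list of films showing in Heidelberg."
    do_list = DiscourseObject("DO1", ["LO2", "GO1", "VO1"], "film_list_heidelberg")
    # User: "Please reserve a ticket for the first one." -- "the first one"
    # is resolved on the discourse layer via DO1's domain referent:
    do_first = DiscourseObject("DO11", ["LO5"], "film_list_heidelberg[0]")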

© W. Wahlster The High-Level Control Flow of SmartKom

© W. Wahlster
Smartakus is a Self-Animated Interface Agent
Smartakus uses body language to notify the user that it is waiting for input, that it is listening, that it has problems understanding the input, or that it is trying hard to find an answer to a question.
Behaviour categories: idle time, navigation, presentation, system state.

© W. Wahlster Some Complex Behavioural Patterns of the Interaction Agent Smartakus

© W. Wahlster
M3L Representation of the Multimodal Discourse Context
[M3L fragment, shown twice on the slide, from the blackboard with the presentation context of the previous dialogue turn; it references a cinema discourse object (cinema_17a, "Europa") and a presentation id (pid1234).]

© W. Wahlster
M3L Specification of a Presentation Task
[M3L fragment: a presentation goal (id APGOAL3000) for an electronic program guide entry, the broadcast "Sport News" on EuroSport from T14:00:00 to T15:00:00, genre sport, with presentation style "leanForward" and generatorAction "GraphicsAndSpeech".]
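Because the slide's markup lost its tags in transcription, here is a schematic reconstruction of such a presentation goal, built with Python's ElementTree; all element and attribute names are guesses for illustration, not the real M3L schema:

    import xml.etree.ElementTree as ET

    # Hypothetical element names; the real M3L schema differs.
    goal = ET.Element("presentationGoal", id="APGOAL3000")
    task = ET.SubElement(goal, "presentationTask",
                         style="leanForward", generatorAction="GraphicsAndSpeech")
    broadcast = ET.SubElement(task, "broadcast", channel="EuroSport", genre="sport")
    ET.SubElement(broadcast, "title").text = "Sport News"
    ET.SubElement(broadcast, "begin").text = "T14:00:00"
    ET.SubElement(broadcast, "end").text = "T15:00:00"

    print(ET.tostring(goal, encoding="unicode"))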

© W. Wahlster
SmartKom's Presentation Planner
The Presentation Planner generates a presentation plan by applying a set of presentation strategies to the presentation goal. Strategies such as GlobalPresent, Present, AddSmartakus, DoLayout, EvaluatePersonaNode, Inform, TryToPresentTVOverview, ShowTVOverview, and SetLayoutData are expanded top-down into layout generation (SendScreenCommand), Smartakus actions (PersonaAction), and text generation and speech output (GenerateText, ..., Speak).
cf. J. Müller, P. Poller, V. Tschernomas 2002
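A toy top-down decomposition in this spirit (strategy names are taken from the slide; the decomposition table itself is invented for illustration):

    # Each presentation strategy expands into substrategies;
    # anything without an entry is a leaf action.
    STRATEGIES = {
        "GlobalPresent": ["Present"],
        "Present": ["AddSmartakus", "DoLayout", "EvaluatePersonaNode"],
        "Inform": ["TryToPresentTVOverview"],
        "TryToPresentTVOverview": ["ShowTVOverview", "SetLayoutData"],
    }

    def plan(goal):
        """Expand a presentation goal depth-first into leaf actions."""
        if goal not in STRATEGIES:
            return [goal]                  # leaf: executable action
        actions = []
        for sub in STRATEGIES[goal]:
            actions.extend(plan(sub))
        return actions

    print(plan("GlobalPresent"))  # ['AddSmartakus', 'DoLayout', 'EvaluatePersonaNode']
    print(plan("Inform"))         # ['ShowTVOverview', 'SetLayoutData']

In the real system, such leaf actions would drive the screen layout, the Smartakus persona, and the speech generator.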

© W. Wahlster
SmartKom's Use of Semantic Web Technology
Three layers of annotations for a personalized presentation:
- M3L: content (high level of annotation)
- XML: structure (medium)
- HTML: layout (low)
cf.: Dieter Fensel, James Hendler, Henry Lieberman, Wolfgang Wahlster (eds.): Spinning the Semantic Web, MIT Press, November 2002

© W. Wahlster
Conclusions
Various types of unification, overlay, constraint processing, planning and ontological inferences are the fundamental processes involved in SmartKom's modality fusion and fission components.
The key function of modality fusion is the reduction of the overall uncertainty and the mutual disambiguation of the various analysis results, based on a three-tiered representation of multimodal discourse.
We have shown that a multimodal dialogue system must not only understand and represent the user's input, but also its own multimodal output.

© W. Wahlster
First International Conference on Perceptive & Multimodal User Interfaces (PMUI'03)
November 5-7, 2003, Delta Pinnacle Hotel, Vancouver, B.C., Canada
Conference Chair: Sharon Oviatt, Oregon Health & Science Univ., USA
Program Chairs: Wolfgang Wahlster, DFKI, Germany; Mark Maybury, MITRE, USA
PMUI'03 is sponsored by ACM and will be co-located in Vancouver with ACM's UIST'03. This meeting follows three successful Perceptive User Interface Workshops (with PUI'01 held in Florida) and three International Multimodal Interface Conferences initiated in Asia (with ICMI'02 held in Pittsburgh).