German Research Center for Artificial Intelligence DFKI GmbH, Stuhlsatzenhausweg 3, 66123 Saarbruecken, Germany. Phone: (+49 681) 302-5252/4162

German Research Center for Artificial Intelligence DFKI GmbH, Stuhlsatzenhausweg 3, 66123 Saarbruecken, Germany

Wolfgang Wahlster: Towards Symmetric Multimodality: Fusion and Fission of Speech, Gesture and Facial Expression. 26th Annual German Conference on Artificial Intelligence (KI 2003), 16 September 2003, Hamburg

© W. Wahlster SmartKom: Merging Various User Interface Paradigms: spoken dialogue, graphical user interfaces, gestural interaction, facial expressions, and biometrics combine into multimodal interaction.

© W. Wahlster The Need for Mobile Multimodal Dialogue Systems: Broadband mobile Internet access technologies via UMTS or mobile hotspots pave the way for a wide spectrum of added-value web services, but the user must input more and more complex commands to specify his information needs, and PDAs and smartphones with tiny keyboards and mice are useless in mobile settings.

© W. Wahlster The Fusion of Multimodal Input. Uncertainty of signal interpretation in perceptive user interfaces: speech recognition, prosody recognition, gesture recognition, and facial expression recognition each deliver uncertain results, so multiple modalities increase the uncertainty of interpretation.

© W. Wahlster The Fusion of Multimodal Input: Multiple modalities increase the uncertainty of interpretation, but the semantic fusion of multiple modalities in the dialog context ensures an unambiguous interpretation. Fusing the speech, prosody, gesture, and facial expression recognition results against the dialog context mutually reduces their uncertainties by excluding nonsensical combinations.
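The exclusion step can be sketched in a few lines. This is a minimal illustration, not SmartKom's actual fusion module: the hypothesis labels (select_movie, point_at_movie, ...) and the compatibility predicate are hypothetical, and joint scores are combined with a naive independence assumption.

```python
from itertools import product

def fuse(speech_hyps, gesture_hyps, compatible):
    """Cross-product fusion of scored hypotheses from two modalities:
    keep only combinations the dialog context judges semantically
    compatible, and rank the survivors by their joint score."""
    joint = [((s, g), ps * pg)
             for (s, ps), (g, pg) in product(speech_hyps, gesture_hyps)
             if compatible(s, g)]          # exclude nonsensical pairs
    return sorted(joint, key=lambda pair: -pair[1])

# Toy example: ambiguous speech plus an ambiguous pointing gesture.
speech  = [("select_movie", 0.6), ("reserve_seat", 0.4)]
gesture = [("point_at_movie", 0.7), ("point_at_seat", 0.3)]
ok = lambda s, g: (s, g) in {("select_movie", "point_at_movie"),
                             ("reserve_seat", "point_at_seat")}
best, score = fuse(speech, gesture, ok)[0]
```

Of the four possible pairings, only the two semantically coherent ones survive, and the ambiguity is resolved by the joint ranking.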

© W. Wahlster The SmartKom Consortium: Main contractor DFKI Saarbrücken; partners include MediaInterface, European Media Lab, Univ. of Munich, Univ. of Stuttgart, and Univ. of Erlangen, with sites in Aachen, Berkeley, Dresden, Heidelberg, Munich, Saarbrücken, Stuttgart, and Ulm. Project duration: September 1999 to September 2003. Final presentation focusing on the mobile version: 5th September, Stuttgart.

© W. Wahlster SmartKom's Major Scientific Goals: Explore and design new symbolic and statistical methods for the seamless fusion and mutual disambiguation of multimodal input on semantic and pragmatic levels. Generalize advanced discourse models for spoken dialogue systems so that they can capture a broad spectrum of multimodal discourse phenomena. Explore and design new constraint-based and plan-based methods for multimodal fission and adaptive presentation layout. Integrate all these multimodal capabilities in a reusable, efficient and robust dialogue shell that guarantees flexible configuration, domain independence and plug-and-play functionality.

© W. Wahlster Outline of the Talk: 1. Towards Symmetric Multimodality 2. SmartKom: A Flexible and Adaptive Multimodal Dialogue Shell 3. Perception and Action under Multimodal Conditions 4. Multimodal Fusion and Fission in SmartKom 5. Ontological Inferences and the Three-Tiered Discourse Model of SmartKom 6. The Economic and Scientific Impact of SmartKom 7. Conclusions

© W. Wahlster SmartKom Provides Full Symmetric Multimodality: Symmetric multimodality means that all input modes (speech, gesture, facial expression) are also available for output, and vice versa: the user's speech, gestures and facial expressions pass through multimodal fusion, and the system's speech, gestures and facial expressions are produced by multimodal fission. The modality fission component provides the inverse functionality of the modality fusion component. Challenge: A dialogue system with symmetric multimodality must not only understand and represent the user's multimodal input, but also its own multimodal output.

© W. Wahlster SmartKom Covers the Full Spectrum of Multimodal Discourse Phenomena: mutual disambiguation of modalities; multimodal deixis resolution and generation; crossmodal reference resolution and generation; multimodal turn-taking and backchannelling; multimodal ellipsis resolution and generation; multimodal anaphora resolution and generation. Symmetric multimodality is a prerequisite for a principled study of these discourse phenomena.

© W. Wahlster SmartKom's Multimodal Input and Output Devices: infrared camera for gestural input, tilting CCD camera for scanning, video projector, microphone, camera for facial analysis, projection surface, and speakers for speech output; multimodal control of the TV set and the VCR/DVD player; 3 dual Xeon 2.8 GHz processors with 1.5 GB main memory.

© W. Wahlster SmartKom: A Flexible and Adaptive Shell for Multimodal Dialogues. A multimodal dialogue backbone carries three application layers: SmartKom-Home/Office, an infotainment companion that helps select media content (consumer electronics, EPG); SmartKom-Public, a communication companion that helps with phone, fax, e-mail, and authentication (cinema, phone, fax, mail, biometrics); and SmartKom-Mobile, a travel companion that helps with car and pedestrian navigation.

© W. Wahlster Here is a map with movie theatres. Generating Maps, Animations and Information Displays on the Fly

© W. Wahlster I would like to see this movie. Reference Resolution is based on a Symbolic Representation of the Smart Graphics Output

© W. Wahlster The route from Palais Moraß to Kino im Karlstor is marked on the map. Synchronization of Map Update and Character Behaviour

© W. Wahlster Please place your hand with spread fingers on the marked area. Interactive Biometric Authentication by Hand Contour Recognition

© W. Wahlster SmartKom-Home as an Infotainment Companion that helps select media content and runs on a tablet PC

© W. Wahlster SmartKom-Public Biometric Authentication, Telephony and Document Scanning and Forwarding in a Multimodal Dialogue

© W. Wahlster SmartKom-Mobile as a Travel Companion in the Car

© W. Wahlster SmartKom-Mobile as a Travel Companion for Pedestrians

© W. Wahlster The High-Level Control Flow of SmartKom

© W. Wahlster SmartKom's Language Model and Lexicon is Augmented on the Fly with Named Entities. SmartKom's basic vocabulary: 5,500 words. Cinema info (movie titles, actor names): all cinemas in one city add > 200 new words. TV info (names of TV features, actor names): the TV program of one day adds > 200 new words. Geographic info (street names, names of points of interest): one city adds more than 500 new names. After a short dialogue sequence the lexicon has grown far beyond the basic vocabulary.
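The mechanism can be sketched as a set-based lexicon update. This is an illustrative sketch only; the helper name and the tokenization by whitespace are assumptions, not SmartKom's actual recognizer interface.

```python
def augment_lexicon(lexicon, named_entities):
    """Add the word forms of freshly retrieved named entities (movie
    titles, street names, ...) to the recognizer lexicon on the fly;
    returns how many genuinely new word forms were added."""
    new_words = {word.lower()
                 for entity in named_entities
                 for word in entity.split()} - lexicon
    lexicon |= new_words
    return len(new_words)

# Toy base vocabulary plus two named entities from the cinema domain.
base = {"i", "would", "like", "to", "see", "the", "movie"}
added = augment_lexicon(base, ["Enemy of the State", "Kino im Karlstor"])
```

Note that already-known word forms ("the") are not counted again, which matters when many entity names share common function words.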

© W. Wahlster Unification of Scored Hypothesis Graphs for Modality Fusion in SmartKom: a word hypothesis graph with acoustic scores, clause and sentence boundaries with prosodic scores, scored hypotheses about the user's emotional state, and a gesture hypothesis graph with scores of potential reference objects are merged into an intention hypothesis graph. The intention recognizer then selects the most likely interpretation; modality fusion yields mutual disambiguation and a reduction of uncertainty.
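The final selection step can be illustrated as follows. The hypothesis structure, the knowledge-source names, and the weighted log-linear combination are all assumptions for illustration; the slide does not specify how SmartKom combines the per-source scores.

```python
import math

def best_intention(hypotheses, weights):
    """Pick the most likely joint interpretation: each hypothesis
    carries per-knowledge-source confidences (acoustic, gesture, ...);
    a weighted log-linear combination is one common way to merge them."""
    def score(hyp):
        return sum(w * math.log(hyp["scores"][source])
                   for source, w in weights.items())
    return max(hypotheses, key=score)

# Two competing intention hypotheses with per-source confidences.
hyps = [
    {"act": "epg_info",   "scores": {"acoustic": 0.9, "gesture": 0.80}},
    {"act": "epg_record", "scores": {"acoustic": 0.5, "gesture": 0.95}},
]
winner = best_intention(hyps, {"acoustic": 1.0, "gesture": 1.0})
```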

© W. Wahlster M3L Representation of an Intention Lattice Fragment for the input "I would like to know more about this": the fragment records the confidence in the speech recognition result, the confidence in the gesture recognition result, the planning act (set epg_info), the object reference (the feature film Enemy of the State), and the confidence in the speech understanding result.

© W. Wahlster Please reserve these three seats. SmartKom Understands Complex Encircling Gestures

© W. Wahlster Using Facial Expression Recognition for Affective Personalization. Processing ironic or sarcastic comments: (1) Smartakus: Here you see the CNN program for tonight. (2) User: That's great. [negative facial expression] (3) Smartakus: I'll show you the program of another channel for tonight. (2') User: That's great. [positive facial expression] (3') Smartakus: Which of these features do you want to see?

© W. Wahlster Fusing Symbolic and Statistical Information in SmartKom: Early Fusion on the Signal Processing Level. Multiple recognizers for a single modality produce time-stamped and scored hypotheses: the face camera feeds facial expression recognition, while the microphone's speech signal feeds boundary prosody, emotional prosody (anger, joy), and speech recognition; facial expressions and emotional prosody are fused into the affective user state.
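A minimal sketch of such early fusion over time-stamped, scored hypotheses follows. The data layout (timestamp, label, score), the skew window, and the score averaging are assumptions for illustration, not SmartKom's actual signal-level algorithm.

```python
def fuse_affect(face_hyps, prosody_hyps, max_skew=0.5):
    """Early fusion of two recognizers that both estimate the affective
    user state: pair time-stamped, scored hypotheses whose labels agree
    and whose time stamps lie within max_skew seconds, averaging their
    scores per emotion label; return the best-supported state."""
    fused = {}
    for t1, label1, s1 in face_hyps:
        for t2, label2, s2 in prosody_hyps:
            if label1 == label2 and abs(t1 - t2) <= max_skew:
                fused[label1] = max(fused.get(label1, 0.0), (s1 + s2) / 2)
    return max(fused.items(), key=lambda kv: kv[1]) if fused else None

# Facial analysis and emotional prosody nearly agree on "anger".
face    = [(10.2, "anger", 0.7), (10.2, "joy", 0.3)]
prosody = [(10.4, "anger", 0.6), (10.4, "joy", 0.5)]
state, confidence = fuse_affect(face, prosody)
```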

© W. Wahlster SmartKom's Computational Mechanisms for Modality Fusion and Fission: modality fusion relies on unification, overlay operations and ontological inferences; modality fission relies on planning and constraint propagation; M3L serves as the modality-free semantic representation between them.

© W. Wahlster The Markup Language Layer Model of SmartKom: M3L (MultiModal Markup Language) and OIL (Ontology Inference Layer) build on XMLS (XML Schema) and RDFS (Resource Description Framework Schema), which in turn build on XML (eXtensible Markup Language), RDF (Resource Description Framework), and HTML (Hypertext Markup Language).

© W. Wahlster Spinning the Semantic Web. Edited by Dieter Fensel, James A. Hendler, Henry Lieberman and Wolfgang Wahlster; foreword by Tim Berners-Lee. March 2003, 392 pp., 98 illus., $40.00/£26.95 (cloth).

© W. Wahlster Personalization: Mapping Digital Content Onto a Variety of Structures and Layouts. From the "one-size-fits-all" approach of static presentations to the "perfect personal fit" approach of adaptive multimodal presentations: content in M3L is mapped onto alternative structures (XML 1 ... XML n), which are rendered into alternative layouts (HTML 11 ... HTML 3p).

© W. Wahlster The Role of the Semantic Web Language M3L: M3L (Multimodal Markup Language) defines the data exchange formats used for communication between all modules of SmartKom. M3L is partitioned into 40 XML schema definitions covering SmartKom's discourse domains. The XML schema event.xsd captures the semantic representation of concepts and processes in SmartKom's multimodal dialogs.

© W. Wahlster OIL2XSD: Using XSLT Stylesheets to Convert an OIL Ontology to an XML Schema

© W. Wahlster Using Ontologies to Extract Information from the Web: metadata from heterogeneous sources such as Film.de-Movie (:title, :description) and Kinopolis.de-Movie (:name, :critics, :o-title, :main actor) are mapped onto a shared ontology with MyOnto-Movie (:title, :description, :actors) and MyOnto-Person (:name, :birthday, :director).

© W. Wahlster M3L as a Meaning Representation Language for the User's Input: "I would like to send an e-mail to Dr. Reuse."

© W. Wahlster Exploiting Ontological Knowledge to Understand and Answer the User's Queries: for "Which movies with Schwarzenegger are shown on the Pro7 channel?", the M3L representation includes a time stamp, the actor name Schwarzenegger, and the channel Pro7.

© W. Wahlster SmartKom's Multimodal Dialogue Backbone: analyzers for speech, gestures and facial expressions feed modality fusion; the dialogue manager comprises discourse modeling and action planning with access to external services; modality fission drives generators for speech, graphics and gestures. The modules communicate via blackboards, with explicit data flow and context dependencies.

© W. Wahlster A Fragment of a Presentation Goal, as Specified in M3L: an epg_browse list of the current TV broadcasts, each with channel (e.g. ARD), title (e.g. Today's Stock News), and begin and end time stamps.

© W. Wahlster A Dynamically Generated Multimodal Presentation Based on a Presentation Goal. Smartakus: "Here is a listing of tonight's TV broadcasts." The display shows Today's Stock News, Everybody Loves Raymond, The King of Queens, Evening News, Still Standing, Yes, Dear, Crossing Jordan, Bonanza, Passions, Mr. Personality, Down to Earth, and Weather Forecast Today.

© W. Wahlster An Excerpt from SmartKom's Three-Tiered Multimodal Discourse Model. Domain layer: OO1 (TV broadcasts on 20/3/2003) and OO2 (broadcast of "The King of Queens" on 20/3/2003). Discourse layer: DO1 with partitions DO11, DO12, DO13, plus DO2, DO3, DO4, DO5. Modality layer: linguistic objects LO1 "listing", LO2 "tonight", LO3 "TV broadcast", LO4 "tape", LO5 "third one"; visual object VO1; gestural object GO1 "here (pointing)".
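The three tiers can be encoded as linked dictionaries. This is a toy rendering of the slide's excerpt, with the layer names and the resolve helper as assumed, illustrative structures: a spoken expression ("third one") and a visual object on the display both reach the same domain object through the discourse layer, which is the essence of crossmodal reference resolution.

```python
# Modality objects point to discourse objects, which point to domain
# objects; identifiers follow the slide's excerpt.
domain_layer    = {"OO2": "Broadcast of 'The King of Queens' on 20/3/2003"}
discourse_layer = {"DO13": "OO2"}          # third partition of listing DO1
modality_layer  = {
    "LO5": ("third one", "DO13"),          # linguistic object (speech)
    "VO1": ("list entry", "DO13"),         # visual object (display)
}

def resolve(modality_object):
    """Follow a modality object through the discourse layer down to
    the domain object it denotes (crossmodal reference resolution)."""
    _, discourse_object = modality_layer[modality_object]
    return domain_layer[discourse_layer[discourse_object]]
```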

© W. Wahlster Overlay Operations Using the Discourse Model. Augmentation and validation: compare the intention hypothesis lattice with a number of previous discourse states, fill in consistent information, and compute a score for each covering/background pair via Overlay(covering, background); the result is the selected augmented hypothesis sequence.

© W. Wahlster The Overlay Operation Versus the Unification Operation: Overlay is a nonmonotonic and noncommutative unification-like operation that inherits (non-conflicting) background information. There are two sources of conflicts: conflicting atomic values, where the background (old) is overwritten by the covering (new), and type clashes, where the background is assimilated to the type of the covering and the operation recurses. (cf. J. Alexandersson, T. Becker 2001)
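A simplified sketch of overlay on nested feature structures, assuming dictionaries as the structure representation: background information is inherited where the covering is silent, and the covering wins on atomic conflicts. The published operation additionally assimilates the background to the covering's type on a type clash; this sketch simply lets the new value win there, which is a deliberate simplification.

```python
def overlay(covering, background):
    """Overlay: nonmonotonic, noncommutative, unification-like merge of
    feature structures. Non-conflicting background (old) information is
    inherited; on conflicting atomic values the covering (new) wins.
    (Type-clash assimilation is simplified to plain overwriting here.)"""
    if isinstance(covering, dict) and isinstance(background, dict):
        merged = dict(background)                    # inherit old info
        for key, value in covering.items():
            merged[key] = (overlay(value, background[key])
                           if key in background else value)
        return merged
    return covering                                  # new overwrites old

# The talk's example: "I'd rather go to the movies" overlaid on
# "films on TV tonight" inherits the time constraint.
background = {"goal": "watch_tv", "time": "tonight"}
covering   = {"goal": "go_to_movies"}
result = overlay(covering, background)
```

Swapping the arguments gives a different result, which is exactly the noncommutativity the slide points out.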

© W. Wahlster Example for Overlay. User: "What films are on TV tonight?" System: [presents list of films] User: "That's a boring program, I'd rather go to the movies." How do we inherit "tonight"?

© W. Wahlster Overlay Simulation: the covering "go to the movies" is overlaid on the background "films on TV tonight"; assimilation carries the time constraint over to the new goal.

© W. Wahlster Overlay Scoring. Four fundamental scoring parameters: the number of features from the covering (co), the number of features from the background (bg), the number of type clashes (tc), and the number of conflicting atomic values (cv). The codomain is [-1, 1]; a higher score indicates a better fit, and a score of 1 means overlay(c, b) coincides with unify(c, b).
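One plausible instantiation of such a score, assumed here rather than taken from the published work, is a normalized difference: reward the features contributed by covering and inherited from background, penalize type clashes and value conflicts, and divide by the total so the result stays in [-1, 1].

```python
def overlay_score(co, bg, tc, cv):
    """A plausible (assumed, not the published) scoring function over
    the four parameters: features from covering (co) and background
    (bg) count positively, type clashes (tc) and conflicting atomic
    values (cv) count negatively; codomain is [-1, 1]."""
    total = co + bg + tc + cv
    return (co + bg - tc - cv) / total if total else 1.0
```

With this definition, a conflict-free merge scores 1 (overlay coincides with unification) and a merge consisting only of clashes and conflicts scores -1, matching the slide's stated codomain and ordering.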

© W. Wahlster SmartKom's Presentation Planner: The presentation planner generates a presentation plan by applying a set of presentation strategies to the presentation goal, e.g. GlobalPresent expands via EvaluatePersonaNode, PresentAddSmartakus and DoLayout; Inform leads to TryToPresentTVOverview with ShowTVOverview and SetLayoutData; layout generation, Smartakus actions (PersonaAction, SendScreenCommand) and text generation (GenerateText, Speak) produce the final output. (cf. J. Müller, P. Poller, V. Tschernomas 2002)
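Strategy application of this kind can be sketched as recursive goal expansion. The strategy table below is a hypothetical fragment modeled on the node names shown on the slide, not SmartKom's actual strategy set.

```python
# Hypothetical strategy table: each strategy decomposes a presentation
# goal into subgoals; goals without an entry are primitive actions.
STRATEGIES = {
    "GlobalPresent": ["EvaluatePersonaNode", "PresentAddSmartakus",
                      "DoLayout"],
    "PresentAddSmartakus": ["Inform"],
    "Inform": ["TryToPresentTVOverview"],
    "TryToPresentTVOverview": ["ShowTVOverview", "SetLayoutData"],
}

def expand(goal):
    """Depth-first expansion of a presentation goal into the sequence
    of primitive presentation actions (the leaves of the plan tree)."""
    subgoals = STRATEGIES.get(goal)
    if subgoals is None:
        return [goal]                      # primitive action
    return [leaf for sub in subgoals for leaf in expand(sub)]

plan = expand("GlobalPresent")
```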

© W. Wahlster Adaptive Layout and Plan-Based Animation in SmartKom‘s Multimodal Presentation Generator

© W. Wahlster Salient Characteristics of SmartKom: seamless integration and mutual disambiguation of multimodal input and output on semantic and pragmatic levels; situated understanding of possibly imprecise, ambiguous, or incomplete multimodal input; context-sensitive interpretation of dialog interaction on the basis of dynamic discourse and context models; adaptive generation of coordinated, cohesive and coherent multimodal presentations; semi- or fully automatic completion of user-delegated tasks through the integration of information services; intuitive personification of the system through a presentation agent.

© W. Wahlster The Economic and Scientific Impact of SmartKom. Economic impact: 51 patents and 29 spin-off products (13 speech recognition, 10 dialogue management, 6 biometrics, 3 video-based interaction, 2 multimodal interfaces, 2 emotion recognition). Scientific impact: 246 publications, 117 keynotes and invited talks, 66 masters and doctoral theses, 27 new projects using the results, 5 tenured professors, 10 TV features, 81 press articles.

© W. Wahlster An Example of Technology Transfer: The Virtual Mouse. The virtual mouse has been installed in a cell phone with a camera. When the user holds a normal pen about 30 cm in front of the camera, the system recognizes the tip of the pen as a mouse pointer, and a red point appears at the tip on the display.

© W. Wahlster Former Employees of DFKI and Researchers from the SmartKom Consortium have Founded Five Start-up Companies: Eyeled, CoolMuseum GmbH, Mineway GmbH, Sonicson GmbH, and Quadox AG, working on location-aware mobile information systems, multimodal systems for music retrieval, and agent-based middleware.

© W. Wahlster SmartKom's Impact on International Standardization: SmartKom's Multimodal Markup Language M3L feeds into the ISO TC37/SC4 Standard for Multimodal Content Representation Scheme and the W3C Standard for Natural Markup Language (w3.org/TR/nl-spec).

© W. Wahlster SmartKom's Impact on Software Tools and Resources for Research on Multimodality: the MULTIPLATFORM software framework is used at sites all over Europe, e.g. in COMIC (EU FP5, Conversational Multimodal Interaction with Computers); 1.6 terabytes of data from 448 Wizard-of-Oz sessions, with audio transcripts and gesture and emotion labeling, are distributed via BAS (Germany), ELRA (Europe), and LDC (world).

© W. Wahlster Burning Research Issues in Multimodal Dialogue Systems. Multimodality: from alternate modes of interaction towards mutual disambiguation and synergistic combinations. Discourse models: from information-seeking dialogs towards argumentative dialogs and negotiations. Domain models: from closed world assumptions towards the open world of web services. Dialog behaviour: from automata models towards a combination of probabilistic and plan-based models.

© W. Wahlster Conclusions: Various types of unification, overlay, constraint processing, planning and ontological inferences are the fundamental processes involved in SmartKom's modality fusion and fission components. The key function of modality fusion is the reduction of the overall uncertainty and the mutual disambiguation of the various analysis results, based on a three-tiered representation of multimodal discourse. We have shown that a multimodal dialogue system must not only understand and represent the user's input, but also its own multimodal output.

Further presentations about SmartKom at KI 2003: Adelhard, Shi, Frank, Zeißler, Batliner, Nöth, Niemann: Multimodal User State Recognition in a Modern Dialogue System; Müller, Poller, Tschernomas: A Multimodal Fission Approach with a Presentation Agent in the Dialog System SmartKom.

© W. Wahlster Thank you very much for your attention. © 2003 DFKI, design by R.O.