German Research Center for Artificial Intelligence, DFKI GmbH Stuhlsatzenhausweg 3 66123 Saarbruecken, Germany phone: (+49 681) 302-5252/4162 fax: (+49.

Presentation transcript:

Deep Processing of Shallow Structures: The Robust Integration of Speech, Language and Translation Technology for Intelligent Interface Agents
Wolfgang Wahlster
German Research Center for Artificial Intelligence, DFKI GmbH, Stuhlsatzenhausweg 3, 66123 Saarbruecken, Germany, phone: (+49 681) 302-5252/4162
Ninth Conference of the European Chapter of the Association for Computational Linguistics (EACL'99), Bergen, June 10, 1999

Outline
1. Speech-to-Speech Translation: Challenges for Language Technology
2. A Multi-Blackboard Architecture for the Integration of Deep and Shallow Processing
3. Integrating the Results of Multiple Deep and Shallow Parsers
4. Packed Chart Structures for Partial Semantic Representations
5. Robust Semantic Processing: Merging and Completing Discourse Representations
6. Combining the Results of Deep and Shallow Translation Threads
7. The Impact of Verbmobil on German Language Industry
8. SmartKom: Integrating Verbmobil Technology Into an Intelligent Interface Agent
9. Conclusion

Challenges for Language Engineering
© W. Wahlster, DFKI
Increasing complexity along four dimensions (the last entry in each row is the Verbmobil setting):
Input Conditions: Close-Speaking Microphone/Headset, Push-to-talk; Telephone, Pause-based Segmentation; Open Microphone, GSM Quality
Naturalness: Isolated Words; Read Continuous Speech; Spontaneous Speech
Adaptability: Speaker Independent; Speaker Dependent; Speaker adaptive
Dialog Capabilities: Monolog Dictation; Information-seeking Dialog; Multiparty Negotiation

Context-Sensitive Speech-to-Speech Translation
Verbmobil Server
German: Wann fährt der nächste Zug nach Hamburg ab?
English: When does the next train to Hamburg depart?
German: Wo befindet sich das nächste Hotel?
English: Where is the nearest hotel?
Final Verbmobil Demos: World Expo-2000 (Hannover), CeBIT-2000 (Hannover), COLING-2000 (Saarbrücken)

Dialog Translation 1
German: Wenn ich den Zug um 14 Uhr bekomme, bin ich um 4 in Frankfurt.
English: If I get the train at 2 o'clock I am in Frankfurt at 4 o'clock.
German: Am Flughafen könnten wir uns treffen.
English: We could meet at the airport.

Dialog Translation 2
German: Abends könnten wir Essen gehen.
English: We could go out for dinner in the evening.
English: What time in the evening?
German: Wann denn am Abend?

Dialog Translation 3
German: Ich könnte für 8 Uhr einen Tisch reservieren.
English: I could reserve a table for 8 o'clock.

Verbmobil II: Three Domains of Discourse
Scenario 1: Appointment Scheduling. Questions: When? Focus on temporal expressions. Vocabulary Size: 2500/6000
Scenario 2: Travel Planning & Hotel Reservation. Questions: When? Where? How? Focus on temporal and spatial expressions. Vocabulary Size: 7000/10000
Scenario 3: PC-Maintenance Hotline. Questions: What? When? Where? How? Integration of special sublanguage lexica. Vocabulary Size: 15000/30000

Verbmobil Partners (Phase 2)
Universität des Saarlandes
Ruhr-Universität Bochum
Universität Hamburg
Universität Karlsruhe
Universität Bielefeld
Technische Universität München
Friedrich-Alexander-Universität Erlangen-Nürnberg
Universität Stuttgart
Rheinische Friedrich-Wilhelms-Universität Bonn
Ludwig-Maximilians-Universität München
TU Braunschweig
Eberhard-Karls-Universität Tübingen
DaimlerChrysler

The Control Panel of Verbmobil


From a Multi-Agent Architecture to a Multi-Blackboard Architecture

Verbmobil I: Multi-Agent Architecture (modules M1 to M6 connected directly)
  Each module must know which module produces what data
  Direct communication between modules
  Each module has only one instance
  Heavy data traffic for moving copies around
  Multiparty and telecooperation applications are impossible
  Software: ICE and ICE Master
  Basic Platform: PVM

Verbmobil II: Multi-Blackboard Architecture (modules M1 to M6 connected through blackboards BB 1 to BB 3)
  All modules can register for each blackboard dynamically
  No direct communication between modules
  Each module can have several instances
  No copies of representation structures (word lattice, VIT chart)
  Multiparty and telecooperation applications are possible
  Software: PCA and Module Manager
  Basic Platform: PVM
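The contrast above can be illustrated with a minimal sketch. It is not Verbmobil's PCA or Module Manager: the `Blackboard` class, `make_parser`, and the module labels are hypothetical names chosen for illustration. It only shows the three properties the slide names: dynamic registration, no direct module-to-module calls, and several instances of one module sharing a single representation structure instead of copies.

```python
from typing import Callable, List


class Blackboard:
    """Shared data pool: modules subscribe to it instead of calling each other."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.entries: List[object] = []
        self._subscribers: List[Callable[[object], None]] = []

    def register(self, callback: Callable[[object], None]) -> None:
        """Modules can register at any time (dynamic registration)."""
        self._subscribers.append(callback)

    def post(self, item: object) -> None:
        """Store the item once and notify subscribers with a reference to it."""
        self.entries.append(item)
        for callback in list(self._subscribers):
            callback(item)


def make_parser(label: str, out_board: Blackboard) -> Callable[[object], None]:
    """Hypothetical module instance: reads a lattice, posts a tagged result."""
    def on_lattice(lattice: object) -> None:
        out_board.post((label, lattice))
    return on_lattice


word_lattice = Blackboard("word_lattice")
vit_chart = Blackboard("vit_chart")

# A consumer of the VIT chart; it never talks to the parsers directly.
collected: List[object] = []
vit_chart.register(collected.append)

# One chunk parser and two instances of the same (hypothetical) HPSG parser.
for label in ("chunk_parser", "hpsg_parser#1", "hpsg_parser#2"):
    word_lattice.register(make_parser(label, vit_chart))

word_lattice.post(["wir", "treffen", "uns"])
print([label for label, _ in collected])
```

All three results refer to the same lattice object stored on the blackboard, which is the point of the "no copies" line in the comparison.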

A Multi-Blackboard Architecture for the Combination of Results from Deep and Shallow Processing Modules
Blackboards: Audio Data; Word Hypothesis Graph with Prosodic Labels; VITs, Underspecified Discourse Representations
Modules: Command Recognizer, Spontaneous Speech Recognizer, Channel/Speaker Adaptation, Prosodic Analysis, Statistical Parser, Dialog Act Recognition, Chunk Parser, HPSG Parser, Semantic Construction, Robust Dialog Semantics, Semantic Transfer, Generation

Integrating Shallow and Deep Analysis Components in a Multi-Blackboard Architecture
1. Augmented Word Lattice
2. Chunk Parser, Statistical Parser and HPSG Parser each produce partial VITs
3. Chart with a combination of partial VITs
4. Robust Dialog Semantics: combination and knowledge-based reconstruction of complete VITs
5. Complete and Spanning VITs

Extracting Statistical Properties from Large Corpora
Corpora: Transcribed Speech Data; Segmented Speech with Prosodic Labels; Annotated Dialogs with Dialog Acts; Treebanks & Predicate-Argument Structures; Aligned Bilingual Corpora
Machine Learning for the Integration of Statistical Properties into Symbolic Models for Speech Recognition, Parsing, Dialog Processing, Translation
Resulting models: Hidden Markov Models; Neural Nets, Multilayered Perceptrons; Probabilistic Automata; Probabilistic Grammars; Probabilistic Transfer Rules

VHG: A Packed Chart Representation of Partial Semantic Representations
Parsers feeding Semantic Construction:
  Chart Parser using cascaded finite-state transducers (Abney, Hinrichs)
  Statistical LR parser trained on treebank (Block, Ruland)
  Very fast HPSG parser (see two papers at ACL99, Kiefer, Krieger et al.)
Properties of the chart:
  Incremental chart construction and anytime processing
  Rule-based combination and transformation of partial UDRS coded as VITs
  Selection of a spanning analysis using a bigram model for VITs (trained on a treebank of 24 k VITs)

Robust Dialog Semantics: Deep Processing of Shallow Structures
Goals of robust semantic processing (Pinkal, Worm, Rupp):
  Combination of unrelated analysis fragments
  Completion of incomplete analysis results
  Skipping of irrelevant fragments
Method: Transformation rules on the VIT Hypothesis Graph:
  Conditions on VIT structures -> Operations on VIT structures
The rules are based on various knowledge sources:
  lattice of semantic types
  domain ontology
  sortal restrictions
  semantic constraints
Results: 20% of the analyses are improved, 0.6% get worse

Semantic Correction of Recognition Errors
German: Wir treffen uns Kaiserslautern. (We are meeting Kaiserslautern.)
English: We are meeting in Kaiserslautern.

Robust Dialog Semantics: Combining and Completing Partial Representations
Recognized: Let us meet the late afternoon to catch the train to Frankfurt
Completed: Let us meet (in) the late afternoon to catch the train to Frankfurt
The preposition 'in' is missing in all paths through the word hypothesis graph.
A temporal NP is transformed into a temporal modifier using an underspecified temporal relation:
  [temporal_np(V1)] -> [typeraise_to_mod(V1, V2)] & V2
The modifier is applied to a proposition:
  [type(V1, prop), type(V2, mod)] -> [apply(V2, V1, V3)] & V3
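As a rough illustration of how two such condition/operation rules could chain, here is a small sketch. The `Fragment` class, the string encoding of semantic content and the `complete` driver are assumptions made for this example; real VITs are far richer structures, and the source does not show how the rules are scheduled.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Fragment:
    """Toy stand-in for a partial analysis: a semantic type plus content."""
    sem_type: str   # e.g. "prop", "temporal_np", "mod"
    content: str


def typeraise_to_mod(frag: Fragment) -> Optional[Fragment]:
    """Rule 1: a temporal NP becomes a modifier via an underspecified relation."""
    if frag.sem_type != "temporal_np":
        return None
    return Fragment("mod", f"temp_rel({frag.content})")


def apply_mod(mod: Fragment, prop: Fragment) -> Optional[Fragment]:
    """Rule 2: a modifier is applied to a proposition."""
    if mod.sem_type != "mod" or prop.sem_type != "prop":
        return None
    return Fragment("prop", f"{prop.content} & {mod.content}")


def complete(fragments: List[Fragment]) -> List[Fragment]:
    """Apply rule 1 wherever its condition holds, then fold modifiers into
    the first proposition with rule 2. Other fragments pass through."""
    raised = [typeraise_to_mod(f) or f for f in fragments]
    props = [f for f in raised if f.sem_type == "prop"]
    mods = [f for f in raised if f.sem_type == "mod"]
    others = [f for f in raised if f.sem_type not in ("prop", "mod")]
    if not props:
        return raised
    result = props[0]
    for mod in mods:
        result = apply_mod(mod, result)
    return [result] + props[1:] + others


fragments = [Fragment("prop", "meet(we)"), Fragment("temporal_np", "late_afternoon")]
print(complete(fragments))
```

The two unrelated fragments end up as one proposition, mirroring how the missing preposition is compensated for on the semantic level rather than in the word string.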

The Understanding of Spontaneous Speech Repairs
Original Utterance: I need a car next Tuesday oops Monday
  Reparandum: Tuesday
  Editing Phase (Hesitation): oops
  Repair Phase (Reparans): Monday
Recognition of Substitutions, Transformation of the Word Hypothesis Graph
Result: I need a car next Monday
Verbmobil Technology: understands speech repairs and extracts the intended meaning.
Dictation systems like ViaVoice, VoiceXpress, FreeSpeech and Naturally Speaking cannot deal with spontaneous speech and transcribe the corrupted utterances.
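The reparandum / hesitation / reparans pattern can be sketched with a simple heuristic on a flat token list. This is only a stand-in: Verbmobil operates on the word hypothesis graph, whereas the function below, the `EDITING_TERMS` set and the look-back `window` are assumptions for illustration. The heuristic is: when an editing term is found, look back a few tokens for the word that starts the repair; if it occurs, replace from there, otherwise replace just the last token.

```python
from typing import List

# Hypothetical list of editing terms (hesitations) that signal a repair.
EDITING_TERMS = {"oops", "äh", "uh"}


def repair(tokens: List[str], window: int = 4) -> List[str]:
    """Remove the reparandum and the hesitation, keep the reparans."""
    out: List[str] = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok.lower() in EDITING_TERMS and out and i + 1 < len(tokens):
            anchor = tokens[i + 1].lower()   # first word of the reparans
            start = len(out) - 1             # default: replace one token
            lowest = max(len(out) - window, 0)
            for j in range(len(out) - 1, lowest - 1, -1):
                if out[j].lower() == anchor:
                    start = j                # repair restarts at a repeated word
                    break
            del out[start:]                  # drop the reparandum
            i += 1                           # skip the hesitation itself
            continue
        out.append(tok)
        i += 1
    return out


print(" ".join(repair("I need a car next Tuesday oops Monday".split())))
print(" ".join(repair("Wir treffen uns in Mannheim äh in Saarbrücken".split())))
```

In the first utterance no repeated word is found, so only "Tuesday" is replaced; in the second the repeated "in" marks where the reparandum begins.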

Automatic Understanding and Correction of Speech Repairs in Spontaneous Telephone Dialogs
German: Wir treffen uns in Mannheim, äh, in Saarbrücken. (We are meeting in Mannheim, oops, in Saarbruecken.)
English: We are meeting in Saarbruecken.

Integrating a Deep HPSG-based Analysis with Probabilistic Dialog Act Recognition for Semantic Transfer
Components: Probabilistic Analysis of Dialog Acts (HMM); Recognition of Dialog Plans (Plan Operators); HPSG Analysis; Robust Dialog Semantics; Semantic Transfer
Exchanged information: Dialog Act Type; Dialog Phase; VIT

The Dialog Act Hierarchy used for Planning, Prediction, Translation and Generation
Dialog Act
  CONTROL_DIALOG
    GREETING
      GREETING_BEGIN
      GREETING_END
    INTRODUCE
    POLITENESS_FORMULA
    THANK
    DELIBERATE
    BACKCHANNEL
  MANAGE_TASK
    INIT
    DEFER
    CLOSE
  PROMOTE_TASK
    REQUEST
      REQUEST_SUGGEST
      REQUEST_CLARIFY
      REQUEST_COMMENT
      REQUEST_COMMIT
    SUGGEST
    INFORM
      DIGRESS
        DEVIATE_SCENARIO
        REFER_TO_SETTING
      EXCLUDE
      CLARIFY
        CLARIFY_ANSWER
      GIVE_REASON
    FEEDBACK
      FEEDBACK_NEGATIVE
        REJECT
          EXPLAINED_REJECT
      FEEDBACK_POSITIVE
        ACCEPT
        CONFIRM
    COMMIT

Combining Statistical and Symbolic Processing for Dialog Processing
Dialog Module: Statistical Prediction; Context Evaluation; Plan Recognition; Dialog Memory
Connected components: Dialog-Act based Translation; Transfer by Rules; Generation of Minutes
Exchanged information: Main Propositional Content; Dialog Act; Dialog Act Predictions; Dialog Phase; Focus
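The statistical prediction component can be pictured with a very small bigram model over dialog act sequences. This is a deliberately simplified sketch, not the model used in Verbmobil: the training dialogs below are invented, and the function names are hypothetical. It estimates which dialog act is likely to follow the previous one from annotated sequences.

```python
from collections import Counter, defaultdict
from typing import DefaultDict, List, Tuple


def train_bigrams(dialogs: List[List[str]]) -> DefaultDict[str, Counter]:
    """Count how often each dialog act follows another in annotated dialogs."""
    counts: DefaultDict[str, Counter] = defaultdict(Counter)
    for acts in dialogs:
        for prev, nxt in zip(acts, acts[1:]):
            counts[prev][nxt] += 1
    return counts


def predict(counts: DefaultDict[str, Counter], prev: str, k: int = 2) -> List[Tuple[str, float]]:
    """Return the k most likely next dialog acts with relative frequencies."""
    total = sum(counts[prev].values())
    if total == 0:
        return []
    return [(act, n / total) for act, n in counts[prev].most_common(k)]


# Invented toy corpus using dialog act names from the hierarchy.
dialogs = [
    ["GREETING_BEGIN", "INIT", "SUGGEST", "REJECT", "SUGGEST", "ACCEPT", "CONFIRM", "GREETING_END"],
    ["GREETING_BEGIN", "SUGGEST", "ACCEPT", "GREETING_END"],
]
counts = train_bigrams(dialogs)
print(predict(counts, "SUGGEST"))
```

Such predictions are the kind of statistical evidence that a symbolic plan recognizer can then confirm or override.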

Learning of Probabilistic Plan Operators from Annotated Corpora

( OPERATOR-s
  goal [IN-TURN confirm-s ?SLASH-3314 ?SLASH-3316]
  subgoals (sequence [IN-TURN confirm-s ?SLASH-3314 ?SLASH-3315]
                     [IN-TURN confirm-s ?SLASH-3315 ?SLASH-3316])
  PROB 0.72)

( OPERATOR-s
  goal [IN-TURN confirm-s ?SLASH-3321 ?SLASH-3322]
  subgoals (sequence [DOMAIN-DEPENDENT accept ?SLASH-3321 ?SLASH-3322])
  PROB 0.95)

( OPERATOR-s
  goal [IN-TURN confirm-s ?SLASH-3325 ?SLASH-3326]
  subgoals (sequence [DOMAIN-DEPENDENT confirm ?SLASH-3325 ?SLASH-3326])
  PROB 0.83)

Automatic Generation of Multilingual Protocols of Telephone Conversations
Dialog Translation by Verbmobil between a German Dialog Partner and an American Dialog Partner
Multilingual Generation of Protocols
HTML-Document in English, transferred by Internet or Fax

Automatic Generation of Minutes
A and B greet each other.
A: (INIT_DATE, SUGGEST_SUPPORT_DATE, REQUEST_COMMENT_DATE) I would like to make a date. How about the seventeenth? Is that ok with you?
B: (REJECT_DATE, ACCEPT_DATE) The seventeenth does not suit me. I'm free for one hour at three o'clock.
A: (SUGGEST_SUPPORT_DATE) How about the sixteenth in the afternoon?
B: (CLARIFY_QUERY, ACCEPT_DATE, CONFIRM) The sixteenth at two o'clock? That suits me. Ok.
A and B say goodbye.
Minutes generated automatically on 23 May :35:18 h


Integrating Deep and Shallow Processing: Combining Results from Concurrent Translation Threads
Input:
  Segment 1: Wenn wir den Termin vorziehen,
  Segment 2: das würde mir gut passen.
Concurrent translation threads: Statistical Translation; Dialog-Act Based Translation; Semantic Transfer; Case-Based Translation
The threads deliver Alternative Translations with Confidence Values to the Selection Module.
Selection:
  Segment 1: Translated by Semantic Transfer
  Segment 2: Translated by Case-Based Translation
Output:
  Segment 1: If you prefer another hotel,
  Segment 2: please let me know.

A Context-Free Approach to the Selection of the Best Translation Result
SEQ := set of all translation sequences for a turn
Seq ∈ SEQ := sequence of translation segments s_1, s_2, ..., s_n
Input: Each translation thread provides for every segment an online confidence value confidence(thread.segment)
Task: Compute normalized confidence values for translated Seq

$$\mathrm{CONF}(Seq) = \sum_{segment \,\in\, Seq} \mathrm{Length}(segment) \cdot \big(\alpha(thread) + \beta(thread) \cdot \mathrm{confidence}(thread.segment)\big)$$

Output:

$$\mathrm{Best}(SEQ) = \{\, Seq \in SEQ \mid Seq \text{ is a maximal element in } (SEQ, \le_{\mathrm{CONF}}) \,\}$$
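The selection step can be sketched directly from the CONF definition. The data layout below (dictionaries keyed by thread name and segment index) and all numeric values are assumptions for illustration, not Verbmobil's actual interfaces or parameters. The sketch enumerates every thread assignment for a turn, scores it with CONF, and returns the maximal sequences.

```python
from itertools import product
from typing import Dict, List, Sequence, Tuple


def conf(seq: Sequence[str], lengths: List[int], alpha: Dict[str, float],
         beta: Dict[str, float], confidence: Dict[Tuple[str, int], float]) -> float:
    """CONF(Seq): length-weighted sum of normalized thread confidences."""
    return sum(
        lengths[i] * (alpha[t] + beta[t] * confidence[(t, i)])
        for i, t in enumerate(seq)
    )


def best_sequences(threads: List[str], lengths: List[int], alpha: Dict[str, float],
                   beta: Dict[str, float],
                   confidence: Dict[Tuple[str, int], float]) -> List[Tuple[str, ...]]:
    """Best(SEQ): all thread assignments with maximal CONF."""
    candidates = list(product(threads, repeat=len(lengths)))
    scores = {s: conf(s, lengths, alpha, beta, confidence) for s in candidates}
    top = max(scores.values())
    return [s for s in candidates if scores[s] == top]


# Invented example: two segments, two threads.
threads = ["STAT", "SEMT"]
lengths = [5, 4]
alpha = {"STAT": 0.0, "SEMT": 0.1}
beta = {"STAT": 1.0, "SEMT": 1.0}
confidence = {("STAT", 0): 0.9, ("SEMT", 0): 0.5,
              ("STAT", 1): 0.25, ("SEMT", 1): 0.75}

best = best_sequences(threads, lengths, alpha, beta, confidence)
print(best)
```

Because CONF is a plain sum over segments, the same result could be obtained by picking the best thread per segment; the exhaustive enumeration is kept here only because it mirrors the set-based definition.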

Learning the Normalizing Factors Alpha and Beta from an Annotated Corpus
Turn := segment_1, segment_2, ..., segment_n
For each turn in a training corpus, all segments translated by one of the four translation threads are manually annotated with a score for translation quality.
For the sequence of n segments resulting in the best overall translation score, at most 4^n linear inequations are generated, so that the selected sequence is better than all alternative translation sequences.
From the set of inequations for spanning analyses (≤ 4^n), the values of alpha and beta can be determined offline by solving the constraint system.
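A minimal sketch of how such inequations could be generated for one annotated turn follows. The representation (one pair of coefficient dictionaries per inequation, over the unknowns alpha(thread) and beta(thread)) and the sample numbers are assumptions for illustration; solving the resulting system, for instance with a linear-programming package, is not shown.

```python
from itertools import product
from typing import Dict, List, Sequence, Tuple

Coeffs = Tuple[Dict[str, float], Dict[str, float]]


def coefficients(seq: Sequence[str], lengths: List[int],
                 confidence: Dict[Tuple[str, int], float],
                 threads: List[str]) -> Coeffs:
    """Coefficients of alpha(t) and beta(t) in CONF(seq)."""
    a = {t: 0.0 for t in threads}
    b = {t: 0.0 for t in threads}
    for i, t in enumerate(seq):
        a[t] += lengths[i]
        b[t] += lengths[i] * confidence[(t, i)]
    return a, b


def inequations(best: Tuple[str, ...], threads: List[str], lengths: List[int],
                confidence: Dict[Tuple[str, int], float]) -> List[Coeffs]:
    """One row per alternative: CONF(best) - CONF(alternative) > 0."""
    best_a, best_b = coefficients(best, lengths, confidence, threads)
    rows = []
    for alt in product(threads, repeat=len(best)):
        if alt == best:
            continue
        alt_a, alt_b = coefficients(alt, lengths, confidence, threads)
        rows.append(({t: best_a[t] - alt_a[t] for t in threads},
                     {t: best_b[t] - alt_b[t] for t in threads}))
    return rows


def satisfied(rows: List[Coeffs], alpha: Dict[str, float], beta: Dict[str, float]) -> bool:
    """Check whether a candidate (alpha, beta) fulfils every inequation."""
    return all(
        sum(da[t] * alpha[t] + db[t] * beta[t] for t in da) > 0
        for da, db in rows
    )


threads = ["STAT", "CASE", "DIAL", "SEMT"]
lengths = [4, 5, 3]
best = ("CASE", "SEMT", "STAT")          # annotated as the best sequence
# Invented confidences: 0.9 for the annotated choice, 0.5 otherwise.
confidence = {(t, i): (0.9 if best[i] == t else 0.5)
              for t in threads for i in range(len(lengths))}

rows = inequations(best, threads, lengths, confidence)
print(len(rows))
```

For three segments and four threads this yields 4^3 - 1 = 63 rows, in line with the "at most 4^n" bound stated above.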

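Example of a Linear Inequation Used for Offline Learning
Turn := Segment_1 Segment_2 Segment_3
Statistical Translation = STAT
Case-based Translation = CASE
Dialog-Act Based Translation = DIAL
Semantic Transfer = SEMT
If quality(CASE, Segment_1), quality(SEMT, Segment_2), quality(STAT, Segment_3) is optimal, then for the alternative that uses DIAL for all three segments:

$$
\begin{aligned}
&\mathrm{Length}(\mathrm{Segment}_1) \cdot \big(\alpha(\mathrm{CASE}) + \beta(\mathrm{CASE}) \cdot \mathrm{confidence}(\mathrm{CASE}, \mathrm{Segment}_1)\big) \\
{}+{} &\mathrm{Length}(\mathrm{Segment}_2) \cdot \big(\alpha(\mathrm{SEMT}) + \beta(\mathrm{SEMT}) \cdot \mathrm{confidence}(\mathrm{SEMT}, \mathrm{Segment}_2)\big) \\
{}+{} &\mathrm{Length}(\mathrm{Segment}_3) \cdot \big(\alpha(\mathrm{STAT}) + \beta(\mathrm{STAT}) \cdot \mathrm{confidence}(\mathrm{STAT}, \mathrm{Segment}_3)\big) \\
{}>{} &\mathrm{Length}(\mathrm{Segment}_1) \cdot \big(\alpha(\mathrm{DIAL}) + \beta(\mathrm{DIAL}) \cdot \mathrm{confidence}(\mathrm{DIAL}, \mathrm{Segment}_1)\big) \\
{}+{} &\mathrm{Length}(\mathrm{Segment}_2) \cdot \big(\alpha(\mathrm{DIAL}) + \beta(\mathrm{DIAL}) \cdot \mathrm{confidence}(\mathrm{DIAL}, \mathrm{Segment}_2)\big) \\
{}+{} &\mathrm{Length}(\mathrm{Segment}_3) \cdot \big(\alpha(\mathrm{DIAL}) + \beta(\mathrm{DIAL}) \cdot \mathrm{confidence}(\mathrm{DIAL}, \mathrm{Segment}_3)\big)
\end{aligned}
$$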

The Context-Sensitive Selection of the Best Translation
Using probabilities of dialog acts in the normalization process:

$$\mathrm{CONF}(Seq) = \sum_{segment \,\in\, Seq} \mathrm{Length}(segment) \cdot \big(\alpha(thread) + \text{dialog-act}(thread, segment) + \beta(thread) \cdot \mathrm{confidence}(thread, segment)\big)$$

e.g. Greet(Statistical_Translation, Segment) > Greet(Semantic_Transfer, Segment)
     Suggest(Semantic_Transfer, Segment) > Suggest(Case_based_Translation, Segment)
Exploiting meta-knowledge:
If the number of disambiguation tasks generated by the semantic transfer reaches a threshold x, then increase the alpha and beta values for semantic transfer.
e.g. einen Termin vorziehen -> prefer / give priority to / bring forward
Observation: Even on the meta-control level (selection module) a hybrid approach is advantageous.
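To make the effect of the dialog-act term concrete, the following sketch scores a single greeting segment with and without it. All numbers, the `dialog_act_bonus` table and the function names are invented for illustration; only the shape of the formula is taken from the slide.

```python
from typing import Dict, List, Optional, Tuple


def segment_score(thread: str, length: int, act: str, alpha: Dict[str, float],
                  beta: Dict[str, float], confidence: float,
                  dialog_act_bonus: Optional[Dict[Tuple[str, str], float]]) -> float:
    """Length * (alpha + dialog-act term + beta * confidence) for one segment."""
    bonus = 0.0
    if dialog_act_bonus is not None:
        bonus = dialog_act_bonus.get((thread, act), 0.0)
    return length * (alpha[thread] + bonus + beta[thread] * confidence)


def pick_thread(threads: List[str], length: int, act: str, alpha: Dict[str, float],
                beta: Dict[str, float], confidences: Dict[str, float],
                dialog_act_bonus: Optional[Dict[Tuple[str, str], float]] = None) -> str:
    """Choose the thread with the highest score for one segment."""
    return max(
        threads,
        key=lambda t: segment_score(t, length, act, alpha, beta,
                                    confidences[t], dialog_act_bonus),
    )


threads = ["STAT", "SEMT"]
alpha = {"STAT": 0.0, "SEMT": 0.0}
beta = {"STAT": 1.0, "SEMT": 1.0}
confidences = {"STAT": 0.5, "SEMT": 0.6}
# Hypothetical values encoding Greet(STAT, segment) > Greet(SEMT, segment).
dialog_act_bonus = {("STAT", "GREET"): 0.3, ("SEMT", "GREET"): 0.1}

without_context = pick_thread(threads, 2, "GREET", alpha, beta, confidences)
with_context = pick_thread(threads, 2, "GREET", alpha, beta, confidences, dialog_act_bonus)
print(without_context, with_context)
```

Without the dialog-act term the slightly more confident semantic transfer wins; with it, the statistical translation is preferred for the greeting, which is the behavior the "Greet" example describes.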

Verbmobil: Long-Term, Large-Scale Funding and Its Impact
Funding by the German Ministry for Education and Research BMBF
  Phase I ( ): $ 33 M
  Phase II ( ): $ 28 M
60% Industrial funding according to shared cost model: $ 17 M
Additional R&D investments of industrial partners: $ 11 M
Total: $ 89 M
Impact:
  > 400 Publications (> 250 refereed)
  Many Patents
  > 10 Commercial Spin-off Products
  Many new Spin-off Companies
  > 100 New jobs in German Language Industry
  > 50 Academics transferred to Industry
Philips, DaimlerChrysler and Siemens are leaders in Spoken Dialog Applications

Spoken Dialogs about Schedules
Fielded applications:
  Train schedules (German Railway System, DB): TABA (Philips), OSCAR (DaimlerChrysler)
  Flight schedules (Lufthansa): ALF (Philips)
Technical Challenges: phone-based dialogs, many proper names, clarification subdialogs

Linguatronic: Spoken Dialogs with Mercedes-Benz
Hardware: Microphone, Push-to-talk Switch
Example utterances:
  Please call Doris Wahlster.
  Open the left window in the back.
  I want to hear the weather channel.
  When will I reach the next gas station?
  Where is the next parking lot?
Speech control of: cellular phone, radio, windows / AC, route guidance system
Option for S-, C-, and E-Class of Mercedes and BMW
Speaker-independent; garbage models for non-speech (blinker, AC, wheels)

The Architecture of the SmartKom Agent (cf. Maybury/Wahlster 1998)
User(s)
Input Processing: Media Analysis (Language, Graphics, Gesture, Biometrics); Media Fusion
Interaction Management: Intention Recognition; Discourse Modeling; User Modeling; Presentation Design
Output Rendering: Media Design (Language, Graphics, Gesture, Animated Presentation Agent)
Representation and Inference: User Model; Discourse Model; Domain Model; Task Model; Media Models
Application Interface: Information; Applications; People

SmartKom: A Transportable and Transmutable Interface Agent
Kernel of SmartKom Interface Agent: Media Analysis; Interaction Management; Application Management; Media Design
Variants:
  SmartKom-Home/Office: A Versatile Agent-based Interface
  SmartKom-Public: A Multimodal Communication Booth
  SmartKom-Mobile: A Handheld Communication Assistant

SmartKom-Public: A Multimodal Communication Booth
  Smartcard / Credit Card for authentication and billing
  Docking station for PDA / Notebook / Camcorder
  High-speed and broad-bandwidth Internet connectivity
  High-resolution scanner
  Loudspeaker
  Room microphone
  Face-tracking camera
  Virtual touchscreen protected against vandalism
  Multipoint video conferencing

SmartKom-Mobile: A Handheld Communication Assistant
  Camera
  GPS
  Microphone
  Loudspeaker
  Stylus-Activated Sketch Pad
  Wearable Compute Server
  Docking Station for Car PC
  Biosensor for Authentication & Emotional Feedback
  GSM for Telephone, Fax, Internet Connectivity

SmartKom-Home/Office: A Versatile Agent-based Interface
  SpeechMike
  Virtual Touchscreen
  Natural Gesture Recognition

SmartKom: Intuitive Multimodal Interaction
The SmartKom Consortium
Main Contractor, Project Management, Testbed, Software Integration: DFKI Saarbrücken
Partners named on the slide: MediaInterface; European Media Lab; Univ. of Munich; Univ. of Stuttgart; Univ. of Erlangen; DaimlerChrysler
Sites named on the slide: Saarbrücken, Aachen, Dresden, Berkeley, Stuttgart, Munich, Heidelberg, Ulm
Project Budget: $ 34 M
Project Duration: 4 years

Verbmobil: Translation of Spontaneous Speech
Japanese: hatsuka no gogo wa ii desu
German: Am Zwanzigsten, am Morgen wäre in Ordnung.
Speaker-independent, robust speech recognition over analog phone, ISDN, and GSM mobile phone

Conclusion
Real-world problems in language technology, like the understanding of spoken dialogs, speech-to-speech translation and multimodal dialog systems, can only be cracked by the combined muscle of deep and shallow processing approaches.
In a multi-blackboard architecture based on packed representations on all processing levels (speech recognition, parsing, semantic processing, translation, generation), using charts with underspecified representations (e.g. UDRS), the results of concurrent processing threads can be combined in an incremental fashion.

Conclusion (continued)
All results of concurrent processing modules should come with a confidence value, so that a selection module can choose the most promising result at each processing stage.
Packed representations together with formalisms for underspecification capture the uncertainties in each processing phase, so that the uncertainties can be reduced by linguistic, discourse and domain constraints as soon as they become applicable.

Conclusion (continued)
Deep processing can be used for merging, completing and repairing the results of shallow processing strategies.
Shallow methods can be used to guide the search in deep processing.
Statistical methods must be augmented by symbolic models (e.g. class-based language modelling, word order normalization as part of statistical translation).
Statistical methods can be used to learn operators or selection strategies for symbolic processes.
It is much more than a balancing act... (see Klavans and Resnik 1996)

Verbmobil's Scientific Advisory Board
Harry Bunt, Ron Kay, Stephan Euler, Martin Kay, Susan Armstrong, Dieter Huber, Herbert Reininger

URL of this Presentation: