Prof. Wolfgang Wahlster German Research Center for Artificial Intelligence, DFKI GmbH Stuhlsatzenhausweg 3 66123 Saarbruecken, Germany phone: (+49 681)

Presentation transcript:

Computers that read, hear and understand
Prof. Wolfgang Wahlster, German Research Center for Artificial Intelligence, DFKI GmbH, Stuhlsatzenhausweg, Saarbruecken, Germany
Brain and Communication, Mainz, Friday, 24 November 2000

© Wolfgang Wahlster, DFKI
Pervasive Speech and Language Technology
Speech-controlled coffee machine: "A cappuccino in 10 minutes, please!"
Dictation: "Send the following to Mark Maybury: Hi Mark, please forward the following agenda to your project partners!"
Speech-based car navigation: "Let's go to Baker Street in Berkeley!"
Speech-enabled music selection: "I would like to hear Mozart's piano concerto No. 3!"

Pervasive Speech and Language Technology
Application areas: information on demand, audio mining, speech-to-speech translation.
Example requests:
"Show me all CNN news of the last 3 months that feature Bill Clinton discussing health care!"
"I would like to make an appointment with Dr. Kuremastu in Kyoto next week!"
"What has Jim Hendler said about DAML during our recent Dagstuhl seminar?"

Three Levels of Language Processing
Speech input passes through three levels, each of which reduces uncertainty.
Speech recognition (acoustic language models, word lists): What has the speaker said? 100 alternatives.
Speech analysis (grammar, lexical meaning): What has the speaker meant? 10 alternatives.
Speech understanding (discourse context, knowledge about the domain of discourse): What does the speaker want? Unambiguous understanding in the dialog context.
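The narrowing of alternatives across the three levels can be pictured with a small Python sketch. It is only a toy under stated assumptions: the candidate sentences, the `VOCABULARY` set, and the `cities_in_context` parameter are all hypothetical, and the two filters stand in for the much richer grammar, lexicon, and discourse knowledge a real system would use.

```python
# Toy illustration of narrowing hypotheses across three levels.
# All candidate sentences, the vocabulary, and the context set are
# hypothetical; they are not taken from any real recognizer.

VOCABULARY = {"when", "does", "the", "next", "train", "to",
              "hamburg", "homburg", "depart"}

def speech_analysis(hypotheses):
    """Level 2: keep hypotheses whose words are all in the lexicon."""
    return [h for h in hypotheses
            if all(w in VOCABULARY for w in h.lower().split())]

def speech_understanding(readings, cities_in_context):
    """Level 3: keep readings that mention a city from the dialog so far."""
    return [r for r in readings
            if any(w in cities_in_context for w in r.lower().split())]

# Level 1 output: what the recognizer thinks the speaker may have said.
recognized = [
    "when does the next train to Hamburg depart",
    "when does the next rain to Hamburg depart",
    "when does the next train to Homburg depart",
    "wind does the next train to Hamburg depart",
]

readings = speech_analysis(recognized)
intent = speech_understanding(readings, cities_in_context={"hamburg"})
print(len(recognized), len(readings), len(intent))
print(intent[0])
```

Each level only removes candidates, so the list shrinks from several recognizer outputs to a single reading once the dialog context is taken into account.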

Challenges for Language Engineering
Four dimensions, each listed in order of increasing complexity:
Input conditions: close-speaking microphone/headset, push-to-talk; telephone, pause-based segmentation; open microphone, GSM quality.
Naturalness: isolated words; read continuous speech; spontaneous speech.
Adaptability: speaker dependent; speaker independent; speaker adaptive.
Dialog capabilities: monolog dictation; information-seeking dialog; multiparty negotiation.

Context-Sensitive Speech-to-Speech Translation
Verbmobil Server
German: "Wann fährt der nächste Zug nach Hamburg ab?" English: "When does the next train to Hamburg depart?"
German: "Wo befindet sich das nächste Hotel?" English: "Where is the nearest hotel?"

Mobile Speech-to-Speech Translation of Spontaneous Dialogs
Verbmobil Speech Translation Server
Solution: a conference call. The Verbmobil Speech Translation Server is accessed by GSM mobile phones.

Speech-to-Speech Translation

The Control Panel of Verbmobil

General Speech Recognition Task
The audio signal is passed to recognizers for German, English, and Japanese, which produce a word hypotheses graph.

Extracting Statistical Properties from Large Corpora
Machine learning for the integration of statistical properties into symbolic models for speech recognition, parsing, dialog processing, and translation.
Corpora: transcribed speech data; segmented speech with prosodic labels; annotated dialogs with dialog acts; treebanks and predicate-argument structures; aligned bilingual corpora.
Models: hidden Markov models; neural nets, multilayered perceptrons; probabilistic automata; probabilistic grammars; probabilistic transfer rules.

The Use of Prosodic Information at All Processing Stages
Input to the Multilingual Prosody Module: speech signal and word hypotheses graph.
Prosodic feature vector: duration, pitch, energy, pause.
Information supplied to later stages: boundary information, sentence mood, accented words.
Uses by processing stage: parsing (search space restriction), dialog understanding (dialog act segmentation and recognition), translation (constraints for transfer), generation (lexical choice), speech synthesis (speaker adaptation).

The Understanding of Spontaneous Speech Repairs
Original utterance: "I need a car next Tuesday oops Monday"
Parts of the utterance: reparandum ("Tuesday"), hesitation ("oops", editing phase), reparans ("Monday", repair phase).
Processing: recognition of substitutions and transformation of the word hypothesis graph, yielding "I need a car next Monday".
Verbmobil technology understands speech repairs and extracts the intended meaning.
Dictation systems such as ViaVoice, VoiceXpress, FreeSpeech, and Naturally Speaking cannot deal with spontaneous speech and transcribe the corrupted utterances.

Automatic Understanding and Correction of Speech Repairs in Spontaneous Telephone Dialogs
German: "Wir treffen uns in Mannheim, äh, in Saarbrücken." (We are meeting in Mannheim, oops, in Saarbruecken.)
English: "We are meeting in Saarbruecken."
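The repair examples on these slides can be mimicked with a very small Python heuristic. This is a sketch under stated assumptions, not Verbmobil's method, which the slides describe as working on the word hypothesis graph: here the input is a plain token list, the `EDITING_TERMS` set is hypothetical, and the reparandum is guessed by looking backwards for the word with which the speaker restarts.

```python
# Toy repair heuristic on a flat token list. The editing terms and the
# backward-search rule are illustrative assumptions only.
EDITING_TERMS = {"oops", "uh", "äh"}

def normalize(token):
    """Lowercase a token and drop surrounding commas and periods."""
    return token.strip(",.").lower()

def repair(tokens):
    """Delete the reparandum and the editing term, keeping the reparans."""
    out = []
    i = 0
    while i < len(tokens):
        is_edit = normalize(tokens[i]) in EDITING_TERMS
        if is_edit and out and i + 1 < len(tokens):
            restart = normalize(tokens[i + 1])
            cut = len(out) - 1  # default: only the last word is replaced
            # If the speaker restarts with a word already said, cut from there.
            for j in range(len(out) - 1, -1, -1):
                if normalize(out[j]) == restart:
                    cut = j
                    break
            del out[cut:]
            i += 1  # skip the editing term; the reparans is appended next
        else:
            out.append(tokens[i])
            i += 1
    return out

print(" ".join(repair("I need a car next Tuesday oops Monday".split())))
print(" ".join(repair("Wir treffen uns in Mannheim, äh, in Saarbrücken.".split())))
```

The backward search is deliberately naive: it can cut too much when the restart word also occurs earlier in the utterance, which is one reason real systems rely on acoustic and prosodic cues as well.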

Spoken Dialogs about Schedules
Fielded applications:
Train schedules (German Railway System, DB): TABA (Philips), OSCAR (DaimlerChrysler).
Flight schedules (Lufthansa): ALF (Philips).
Technical challenges: phone-based dialogs, many proper names, clarification subdialogs.

Linguatronic: Spoken Dialogs with Mercedes-Benz
Microphone and push-to-talk switch.
Example commands: "Please call Doris Wahlster." "Open the left window in the back." "I want to hear the weather channel." "When will I reach the next gas station?" "Where is the next parking lot?"
Speech control of: cellular phone, radio, windows / AC, route guidance system.
Option for S-, C-, and E-Class of Mercedes and BMW.
Speaker-independent; garbage models for non-speech (blinker, AC, wheels).

Speech-based Interaction with an Organizer on a WAP Phone (Voice In - WML out)
"With Maier on 25 October, with Tetzlaff, and with Streit too. Oops, not with Streit. From 2 to 3. Okay!"

Augmented Reality: Combining Speech, Gestures and Graphics for Mobile Access to a Digital Library
Mobile Dialog with a Virtual Tourist Guide for the Heidelberg Castle
Location-adaptive Query Interpretation

Augmented Reality: Combining Speech, Gestures and Graphics for Mobile Access to a Digital Library
Multimodal Route Description
Mobile Speech Translation and Multilingual Information Access

Augmented Reality: Combining Speech, Gestures and Graphics for Mobile Access to a Digital Library
Speech-based Access to 3D Virtual Views
Multimodal Output from a Digital Library and Speech-based Access to Internet Content

International Research Trends in Multilingual Systems
Multilingual and mobile communication assistants: multimodal interfaces (SmartKom).
Speech-based Web access to multilingual Web pages: WAP phones, WebTV.
Multilingual audio retrieval and audio mining: discussions, lecture notes, organizers.
Multilingual indexing and annotation of videos: video archives, news archives.
Verbmobil dialog translation: call centers, e-commerce, mobile travel assistance, telephone translations.
Multilingual language technology: speech recognition, language understanding, language generation, and speech synthesis.
Spontaneous speech, robust processing and translation, semantic and pragmatic understanding.

Open Problems for the Next Decade
Problems with current machine learning approaches: expensive data collection, cognitively unrealistic training data, data sparseness.
Problems with current hand-crafted knowledge sources: brittleness, domain dependence, limited scalability.

A Speculative Conclusion (+50 years)
Oral society, then textual society, then oral society again:
-500 years (oral society): news and knowledge are passed orally; no mass storage, no automatic processing, no automatic retrieval.
Today (textual society): news and knowledge are passed textually; mass storage of texts, text processing, text retrieval.
+50 years (oral society): news and knowledge are passed orally; mass storage of speech, speech processing, audio retrieval.