Tight Coupling between ASR and MT in Speech-to-Speech Translation Arthur Chan Prepared for Advanced Machine Translation Seminar

This Seminar
- Introduction (4 slides)

A Conceptual Model of Speech-to-Speech Translation
waveforms → Speech Recognizer → decoding result(s) → Machine Translator → translation → Speech Synthesizer → waveforms

Motivation for Tight Coupling between ASR and MT
- The 1-best ASR output could be wrong.
- MT could benefit from the wide range of supplementary information ASR can provide:
  - N-best list
  - Lattice
  - Sentence/word-based confidence scores, e.g. word posterior probabilities
  - Confusion network, or "consensus decoding" (Mangu 1999)
- Some have observed that MT quality depends on WER.
(A sketch of word posteriors computed from an N-best list follows below.)
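To make the "supplementary information" concrete, here is a minimal sketch, not from the original slides, of one common way to approximate word posterior probabilities from an N-best list. The score format, the scaling heuristic, and the toy hypotheses are assumptions for illustration, not any particular toolkit's API.

import math
from collections import defaultdict

def word_posteriors(nbest, scale=0.1):
    """Approximate word posterior probabilities from an N-best list.

    `nbest` is a hypothetical list of (log_score, words) pairs from an
    ASR decoder; `scale` flattens the combined acoustic/LM log-scores
    before normalization (a common heuristic).
    """
    # Turn hypothesis scores into normalized posterior weights.
    max_score = max(s for s, _ in nbest)
    weights = [math.exp(scale * (s - max_score)) for s, _ in nbest]
    total = sum(weights)

    # Accumulate the weight of every hypothesis that contains the word.
    posteriors = defaultdict(float)
    for w_hyp, (_, words) in zip(weights, nbest):
        for word in set(words):
            posteriors[word] += w_hyp / total
    return dict(posteriors)

nbest = [(-120.0, ["tight", "coupling", "helps"]),
         (-121.5, ["type", "coupling", "helps"]),
         (-123.0, ["tight", "coupling", "held"])]
print(word_posteriors(nbest))

Words shared by many strong hypotheses (here "coupling") end up with posteriors near 1, which is exactly the kind of confidence signal an MT component could exploit.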

Scope of this talk
waveforms → Speech Recognizer → ??? → Machine Translator → translation → Speech Synthesizer → waveforms
What should ASR pass to MT: the 1-best? An N-best list? A lattice? A confusion network?
1. Should we combine the two?
2. How tight should the coupling be?
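A minimal sketch of the loosely coupled pipeline on this slide, assuming placeholder components: `recognize`, `translate`, and `synthesize` are hypothetical stand-ins, not a real toolkit's API.

from typing import List

def recognize(waveform: bytes) -> List[str]:
    """Hypothetical ASR stage: return an N-best list of transcripts."""
    return ["tight coupling helps", "type coupling helps"]

def translate(hypotheses: List[str]) -> str:
    """Hypothetical MT stage: here it only consumes the 1-best.
    A tighter coupling would pass the whole N-best list, a lattice,
    or a confusion network instead."""
    return f"<translation of: {hypotheses[0]}>"

def synthesize(text: str) -> bytes:
    """Hypothetical TTS stage: return output waveforms."""
    return text.encode("utf-8")

def speech_to_speech(waveform: bytes) -> bytes:
    # One-way, module-to-module communication = loose coupling.
    return synthesize(translate(recognize(waveform)))

print(speech_to_speech(b"\x00\x01"))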

Topics Covered Today
- The concept of coupling
- The "tightness" of coupling between ASR and X (Ringger 95)
- Interfaces between ASR and MT in loose coupling
  - What could ASR provide?
  - What could MT use?
- Very tight coupling
  - Ney's formulae
  - AT&T approach
  - Combination of ASR and MT features
  - Direct modeling

The Concept of Coupling

Classification of Coupling between ASR and Natural Language Understanding (NLU)
Proposed in Ringger 95 and Harper 94. Three dimensions of ASR/NLU coupling:
- Complexity of the search algorithm: a simple N-gram?
- Incrementality of the coupling: on-line? Left-to-right?
- Tightness of the coupling: tight? Loose? Semi-tight?
(A small data-structure sketch of these dimensions follows below.)
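As a way of keeping the taxonomy straight, here is a small data-structure sketch (mine, not from Ringger 95 or Harper 94) that records the three dimensions for a given system; the field names and the example values are assumptions.

from dataclasses import dataclass
from enum import Enum

class Tightness(Enum):
    LOOSE = "loose"
    SEMI_TIGHT = "semi-tight"
    TIGHT = "tight"

@dataclass
class CouplingProfile:
    """Hypothetical record of the three ASR/NLU coupling dimensions."""
    search_complexity: str   # e.g. "simple N-gram", "FST composition"
    incremental: bool        # on-line / left-to-right coupling?
    tightness: Tightness

# Example: a rough characterization of an FST-based tightly coupled system.
att_fst_system = CouplingProfile("FST composition", incremental=False,
                                 tightness=Tightness.TIGHT)
print(att_fst_system)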

Tightness of Coupling: a spectrum from loose, through semi-tight, to tight.

Summary of Coupling between ASR and NLU

Implications for ASR/MT coupling: the classification generalizes to many systems.
- Loose coupling: any system that uses the 1-best, an N-best list, or a lattice for one-way module-to-module communication
- Tight coupling: the AT&T FST-based system
- Semi-tight coupling: [Filled in a quote here]

Interfaces in Loose Coupling

Perspectives
- What output could an ASR system generate? Not all of it is used today, but it could mean opportunities in the future.
- What algorithms could MT use, given certain inputs? On-line algorithms are a focus.

Decoding of HMM-based ASR
- Searching for the best path in a huge lattice of HMM states
- The 1-best ASR result is the best path found by backtracking
- State lattice (next page)
(A toy Viterbi-with-backtracking sketch follows below.)
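To make the "best path plus backtracking" idea concrete, here is a minimal Viterbi sketch over a toy two-state HMM. The states, transition table, and emission table are made up for illustration; a real LVCSR decoder searches a far larger, dynamically constructed state lattice.

import math

def viterbi(observations, states, log_trans, log_emit, log_init):
    """Return the best state path via dynamic programming + backtracking."""
    # delta[t][s]: best log-score of any path ending in state s at time t
    delta = [{s: log_init[s] + log_emit[s][observations[0]] for s in states}]
    backptr = [{}]
    for t in range(1, len(observations)):
        delta.append({})
        backptr.append({})
        for s in states:
            best_prev = max(states, key=lambda p: delta[t-1][p] + log_trans[p][s])
            delta[t][s] = (delta[t-1][best_prev] + log_trans[best_prev][s]
                           + log_emit[s][observations[t]])
            backptr[t][s] = best_prev
    # Backtrack from the best final state to recover the 1-best path.
    last = max(states, key=lambda s: delta[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(backptr[t][path[-1]])
    return list(reversed(path)), delta[-1][last]

# Toy example with made-up probabilities.
states = ["sil", "speech"]
log_init = {"sil": math.log(0.6), "speech": math.log(0.4)}
log_trans = {"sil": {"sil": math.log(0.7), "speech": math.log(0.3)},
             "speech": {"sil": math.log(0.4), "speech": math.log(0.6)}}
log_emit = {"sil": {"lo": math.log(0.8), "hi": math.log(0.2)},
            "speech": {"lo": math.log(0.3), "hi": math.log(0.7)}}
print(viterbi(["lo", "hi", "hi"], states, log_trans, log_emit, log_init))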

Things one could extract from the state lattice
- From the backtracking information:
  - N-best list: the N best decoding results from the state lattice
  - Lattice: a word-level lattice of the decoding
- From the lattice:
  - N-best list
  - Confusion network, or "consensus decoding" (Mangu 99)
(A brute-force N-best extraction sketch follows below.)
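Here is a small sketch of extracting an N-best list from a word lattice, assuming a toy DAG representation I made up ({node: [(next_node, word, log_score), ...]}). A real decoder would use A* or exact lattice algorithms; this best-first enumeration is only meant to show what an N-best list contains.

import heapq

def nbest_from_lattice(lattice, start, end, n=3):
    """Enumerate the N best start-to-end paths in a word lattice."""
    # Max-heap on total path log-score (negated for heapq's min-heap).
    heap = [(-0.0, start, [])]
    results = []
    while heap and len(results) < n:
        neg_score, node, words = heapq.heappop(heap)
        if node == end:
            results.append((-neg_score, words))
            continue
        for nxt, word, log_p in lattice.get(node, []):
            heapq.heappush(heap, (neg_score - log_p, nxt, words + [word]))
    return results

toy_lattice = {
    0: [(1, "tight", -0.2), (1, "type", -1.6)],
    1: [(2, "coupling", -0.1)],
    2: [(3, "helps", -0.3), (3, "held", -1.2)],
}
for score, words in nbest_from_lattice(toy_lattice, 0, 3, n=3):
    print(score, words)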

Other things one could extract from the decoder
- Word begin and end times: useful in time-sensitive applications, e.g. multi-modal applications
- Sentence/word-based confidence scores: found to be quite useful on many other occasions
(A sketch of a per-word output record follows below.)
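A small sketch of the kind of per-word record a decoder could expose; the field names and frame convention are assumptions, not any particular recognizer's output format.

from dataclasses import dataclass

@dataclass
class WordHypothesis:
    """Hypothetical per-word output record from an ASR decoder."""
    word: str
    begin_frame: int      # start time, in 10 ms frames
    end_frame: int        # end time, in 10 ms frames
    confidence: float     # e.g. a word posterior probability in [0, 1]

hyp = [WordHypothesis("tight", 0, 32, 0.91),
       WordHypothesis("coupling", 32, 78, 0.87),
       WordHypothesis("helps", 78, 110, 0.64)]

# Time-sensitive applications (e.g. multi-modal interfaces) can align
# low-confidence words with other input modalities using the time stamps.
suspicious = [w for w in hyp if w.confidence < 0.7]
print(suspicious)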

Experimental Results

How does MT use the ASR output? Which decoding algorithms are in use?

Tight Coupling

Literature
- Eric K. Ringger, "A Robust Loose Coupling for Speech Recognition and Natural Language Understanding", Technical Report 592, Computer Science Department, University of Rochester, 1995
- [The AT&T paper]