CNTS LTG (UA) (i) Phoneme-to-Grapheme (ii) Transcription-to-Subtitles Bart Decadt Erik Tjong Kim Sang Walter Daelemans.

Presentation transcript:


Machine Learning of Phoneme-to-Grapheme Conversion For Out-of-Vocabulary handling in Speech Recognition

CNTS-Atranos 3: Out-of-Vocabulary word problem
– Proper names
– Domain terminology
– Complex morphology (compounds)
Example: gespreksonderwerp (topic of conversation) → gesprek zonder werk (conversation without work)

CNTS-Atranos 4: Architecture
– Speech Recognizer (ESAT): input: speech, output: text
– Confidence threshold → Suspected OOV
– Phoneme Recognizer (ESAT) → Phoneme string
– P2G Converter (TIMBL), built from Training Data → Spelling
– Spelling correction with large vocabulary
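The routing step in this architecture can be sketched as follows. This is a minimal illustration, not the ATraNoS implementation: the `RecognizedWord` record, the `transcribe` function, and the `p2g` and `correct` callables are hypothetical names, and the slide does not state the threshold value or how confidence is scored.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class RecognizedWord:
    text: str          # word hypothesis from the speech recognizer
    confidence: float  # recognizer confidence, assumed to lie in [0, 1]
    phonemes: str      # phoneme recognizer output for the same stretch of speech


def transcribe(
    words: List[RecognizedWord],
    threshold: float,
    p2g: Callable[[str], str],
    correct: Callable[[str], str],
) -> List[str]:
    """Keep confident words; send suspected OOVs through P2G and spelling correction."""
    output = []
    for word in words:
        if word.confidence < threshold:
            # Suspected OOV: spell the phoneme string, then try to repair the spelling.
            output.append(correct(p2g(word.phonemes)))
        else:
            output.append(word.text)
    return output
```

Passing the converter and the corrector in as functions keeps the routing logic independent of how either component is built.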

CNTS-Atranos 5: Memory-Based Learning
– Classification-based (alignment): instance =,=,k,A,s,t,= with class a
– Similarity-based
– Parameter optimization:
  – MBL algorithm (ib1, igtree)
  – Number of nearest neighbors
  – Feature weighting method
  – Class distance weighting
– Timbl (1998, 2002)
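The instance on this slide (=,=,k,A,s,t,= with class a) looks like a seven-phoneme window whose middle phoneme is mapped to one grapheme. The sketch below builds such windows and classifies them with a plain nearest-neighbor vote. It assumes a one-to-one phoneme-grapheme alignment, three phonemes of context on each side, and an unweighted overlap distance. It is a simplification for illustration and does not use TiMBL or reproduce its weighting options; all function names are hypothetical.

```python
from collections import Counter
from typing import List, Sequence, Tuple

PAD = "="  # padding symbol, as in the slide's example

Instance = Tuple[Tuple[str, ...], str]


def make_instances(phonemes: Sequence[str], graphemes: Sequence[str],
                   context: int = 3) -> List[Instance]:
    """One instance per phoneme: a padded window plus the aligned grapheme as class."""
    if len(phonemes) != len(graphemes):
        raise ValueError("this sketch assumes a one-to-one alignment")
    padded = [PAD] * context + list(phonemes) + [PAD] * context
    width = 2 * context + 1
    return [(tuple(padded[i:i + width]), graphemes[i])
            for i in range(len(phonemes))]


def overlap_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Number of feature positions at which two windows differ."""
    return sum(x != y for x, y in zip(a, b))


def classify(memory: List[Instance], features: Sequence[str], k: int = 1) -> str:
    """Majority class among the k stored instances closest to `features`."""
    ranked = sorted(memory, key=lambda inst: overlap_distance(inst[0], features))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Store the word /kAst/ spelled "kast", then query an unseen window.
memory = make_instances("kAst", "kast")
print(memory[1])
print(classify(memory, ("=", "=", "m", "A", "s", "t", "=")))
```

The second instance in `memory` is the window shown on the slide. An unseen window that differs in one position still finds it as its nearest neighbor, which is the generalization behavior memory-based learning relies on.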

CNTS-Atranos 6: Experiment
Training data (129k words – 9k OOVs):
– from ESAT’s phoneme recognizer
– error rate = ~29% (substitutions + insertions + deletions)
– phoneme deletions are problematic
Baselines:
– Near-perfect phoneme data (CELEX): 99.1 (grapheme), 91.4 (word)
– Probabilistic: 70.5 (grapheme), 60.2 (word); OOV only: 30.0 (grapheme), 3.0 (word)
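An error rate defined as substitutions + insertions + deletions is commonly computed as the minimum edit distance between the reference and the recognized phoneme sequence, divided by the reference length. The slide does not say how its ~29% figure was obtained, so the sketch below only illustrates that usual definition; the function names are hypothetical.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Minimum number of substitutions, insertions and deletions turning ref into hyp."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cur.append(min(
                prev[j] + 1,             # deletion of r
                cur[j - 1] + 1,          # insertion of h
                prev[j - 1] + (r != h),  # match or substitution
            ))
        prev = cur
    return prev[-1]


def error_rate(ref: Sequence[str], hyp: Sequence[str]) -> float:
    """Edit distance normalised by the reference length."""
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# One deleted phoneme out of four reference phonemes.
print(error_rate(list("kAst"), list("kAt")))
```

A deleted phoneme leaves the converter with no window for the missing grapheme, which may be why the slide singles out deletions as problematic.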

CNTS-Atranos 7: Results
– Performance: table of grapheme-level and word-level scores for all words and for OOVs (values not preserved in the transcript)
– Spelling correction: Net effect: 8.6 (OOVs)
– (Simulated) interaction with speech recognizer: increases WER, but improves readability

CNTS-Atranos 8: Examples
– gespreksonderwerp
  speech recognizer → gesprek zonder werk
  P2G-converter → gespreksonberwerp
– speelgoedmitrailleur /sperGutnitrKj-yr/
  speech recognizer → speelgoed moet hier
  P2G-converter → spergoetmietrijer

Automatic subtitling (normalization) Data collection and alignment

CNTS-Atranos 10: Architecture
– News autocues and subtitles: (semi-)automatic data capture
– (Semi-)automatic alignment
– Linguistic annotation
– Training data → Machine learner → Classifier
– Classifier: autocues → subtitles

CNTS-Atranos 11: Status (March 02)
– Teletext subtitle data capture hardware and software
– Software for VRT autocue file processing
– Software for alignment of autocues with subtitles
– Autocue-subtitle alignment
– Similar procedure for VRT soap series “Thuis” data

CNTS-Atranos 12: Statistical Subtitle Prediction
Baseline experiment
– 8000 words soap (Thuis)
– actor scenario word-aligned with subtitles
– classification task (memory-based learning): predict deletion, substitution, copy
– Features: focus word + 8 words context + POS tags
– Feature selection (hill-climbing) selects only focus word
Results (10-fold CV)
– 71.7% (copy all: 67.3%)
– Most frequent replacement: {ge, gij, u, uw} → je
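With feature selection keeping only the focus word, the classifier effectively reduces to a lookup from each scenario word to its most frequent action. The sketch below illustrates that reduced setting next to the "copy all" baseline. The action labels (`copy`, `delete`, `sub:je`) and the toy word pairs are invented for illustration and are not taken from the Thuis data; the function names are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple

Pair = Tuple[str, str]  # (focus word, action)


def train_focus_word(pairs: Iterable[Pair]) -> Dict[str, str]:
    """Remember the most frequent action seen for each focus word."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for word, action in pairs:
        counts[word][action] += 1
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}


def predict(model: Dict[str, str], word: str, default: str = "copy") -> str:
    """Unseen words fall back to copying, the majority action on the slide."""
    return model.get(word, default)


def accuracy(model: Dict[str, str], pairs: List[Pair]) -> float:
    """Fraction of pairs whose action is predicted correctly."""
    return sum(predict(model, w) == a for w, a in pairs) / len(pairs)


# Toy data, invented for illustration only.
train = [("gij", "sub:je"), ("ge", "sub:je"), ("komt", "copy"), ("gij", "sub:je")]
test = [("gij", "sub:je"), ("komt", "copy"), ("morgen", "copy"), ("ge", "sub:je")]

model = train_focus_word(train)
print(accuracy({}, test))     # copy-all baseline: an empty model always predicts "copy"
print(accuracy(model, test))  # focus-word lookup
```

An empty model doubles as the copy-all baseline, so both figures come from the same evaluation code.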