Recent Papers of Md. Akmal Haidar: Meeting before ICASSP 2013. Presenter: 郝柏翰, 2013/05/23

Outline
– "Novel Weighting Scheme for Unsupervised Language Model Adaptation Using Latent Dirichlet Allocation", 2010
– "Unsupervised Language Model Adaptation Using Latent Dirichlet Allocation and Dynamic Marginals", 2011
– "Topic N-gram Count Language Model Adaptation for Speech Recognition", 2012
– "Comparison of a Bigram PLSA and a Novel Context-Based PLSA Language Model for Speech Recognition"

Novel Weighting Scheme for Unsupervised Language Model Adaptation Using Latent Dirichlet Allocation Md. Akmal Haidar and Douglas O’Shaughnessy INTERSPEECH 2010

Introduction
Adaptation is required when the styles, domains, or topics of the test data are mismatched with the training data. It is also important because natural language is highly variable and topic information is highly non-stationary.
The idea of unsupervised LM adaptation is to extract the latent topics from the training set, adapt the topic-specific LMs with proper mixture weights, and finally interpolate the adapted topic model with the generic n-gram LM.
In this paper, we propose generating the weights of the topic models from the word counts of the topics produced by a hard-clustering method.

Proposed Method
Adaptation:
– We can create a dynamically adapted topic model by using a mixture of LMs from different topics, as sketched below.
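The equation itself was an image in the original slide. A minimal sketch of the mixture, assuming K topic LMs P_k(w | h) and weights gamma_k taken proportional to the word counts C_k of each hard-clustered topic (the count-based weighting is the paper's contribution; the exact form here is an assumption):

P_{\mathrm{topic}}(w \mid h) = \sum_{k=1}^{K} \gamma_k \, P_k(w \mid h),
\qquad
\gamma_k = \frac{C_k}{\sum_{j=1}^{K} C_j}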

Proposed Method
The adapted topic model is then interpolated with the generic LM, as sketched below.
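A sketch of the standard linear interpolation, assuming a generic background model P_B and an interpolation weight lambda tuned on held-out data:

P_{\mathrm{adapt}}(w \mid h) = \lambda \, P_{\mathrm{B}}(w \mid h) + (1 - \lambda) \, P_{\mathrm{topic}}(w \mid h)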

Experiment Setup
We evaluated the LM adaptation approach using the Brown corpus and the WSJ1 transcription text data.

Experiments (results not preserved in the transcript)

Unsupervised Language Model Adaptation Using Latent Dirichlet Allocation and Dynamic Marginals Md. Akmal Haidar and Douglas O'Shaughnessy, 2011

Introduction
To overcome the mismatch problem, we introduce an unsupervised language model adaptation approach using latent Dirichlet allocation (LDA) and dynamic marginals: locally estimated (smoothed) unigram probabilities from in-domain text data.
We extend our previous work to find an adapted model by using minimum discrimination information (MDI), which uses the KL divergence as the distance measure between probability distributions.
The final adapted model is formed by minimizing the KL divergence between it and the LDA-adapted topic model.
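For reference, the KL divergence between distributions P and Q that MDI minimizes:

D_{\mathrm{KL}}(P \,\|\, Q) = \sum_{w} P(w) \log \frac{P(w)}{Q(w)}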

Proposed Method
– Topic clustering
– LDA adapted topic mixture model generation

Proposed Method
Adaptation using dynamic marginals:
– The adapted model using dynamic marginals is obtained by minimizing the KL divergence between the adapted model and the background model, subject to the marginalization constraint for each word w in the vocabulary.
– The constrained optimization problem has a close connection to the maximum entropy approach, which implies that the adapted model is a rescaled version of the background model, as sketched below.
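The constraint and the rescaled form were images in the original slides. A sketch of the standard MDI solution, assuming target unigram marginals \hat{P}(w) (the dynamic marginals), a background model P_B, a scaling exponent beta, and a history-dependent normalizer Z(h):

\sum_{h} P(h) \, P_{\mathrm{A}}(w \mid h) = \hat{P}(w) \quad \text{for each word } w

P_{\mathrm{A}}(w \mid h) = \frac{\alpha(w)}{Z(h)} \, P_{\mathrm{B}}(w \mid h),
\qquad
\alpha(w) = \left( \frac{\hat{P}(w)}{P_{\mathrm{B}}(w)} \right)^{\beta},
\qquad
Z(h) = \sum_{w'} \alpha(w') \, P_{\mathrm{B}}(w' \mid h)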

Proposed Method
Because the background model and the LDA-adapted topic model have the standard back-off structure and satisfy the above constraint, the adapted LM has a recursive formula, sketched below.
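A sketch of such a recursive back-off form, assuming h' denotes the history h with its earliest word dropped, c(h, w) the n-gram count in the background model, and bo(h) a back-off weight chosen so the distribution normalizes:

P_{\mathrm{A}}(w \mid h) =
\begin{cases}
\dfrac{\alpha(w)}{Z(h)} \, P_{\mathrm{B}}(w \mid h) & \text{if } c(h, w) > 0 \\[4pt]
\mathrm{bo}(h) \, P_{\mathrm{A}}(w \mid h') & \text{otherwise}
\end{cases}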

Experiments (results not preserved in the transcript)

Topic N-gram Count Language Model Adaptation for Speech Recognition Md. Akmal Haidar and Douglas O'Shaughnessy, 2012

Introduction

Proposed Method
Using these features of the LDA model, we propose two confidence measures to compute the topic mixture weights for each n-gram; one possible form is sketched below.
The topic n-gram language models, called TNCLM, are then generated from the topic n-gram counts.
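The two measures themselves were images in the original slides. One plausible word-level form, stated here as an assumption, uses the LDA word-topic posteriors P(k | w) to split each global n-gram count C(w_{i-n+1}, ..., w_i) into topic-specific counts:

\gamma_k(w_{i-n+1}^{i}) = \frac{\prod_{j=i-n+1}^{i} P(k \mid w_j)}{\sum_{k'=1}^{K} \prod_{j=i-n+1}^{i} P(k' \mid w_j)},
\qquad
C_k(w_{i-n+1}^{i}) = \gamma_k(w_{i-n+1}^{i}) \; C(w_{i-n+1}^{i})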

Proposed Method
The ANCLM are then interpolated with the background n-gram model to capture the local constraints, using linear interpolation as sketched below.
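A sketch of the interpolation, assuming a weight mu tuned on held-out data:

P_{\mathrm{final}}(w \mid h) = \mu \, P_{\mathrm{B}}(w \mid h) + (1 - \mu) \, P_{\mathrm{ANCLM}}(w \mid h)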

Experiments (results not preserved in the transcript)