An Adaptive Learning with an Application to Chinese Homophone Disambiguation, by Yue-shi Lee, International Journal of Computer Processing of Oriental Languages.


An Adaptive Learning with an Application to Chinese Homophone Disambiguation. Yue-shi Lee, International Journal of Computer Processing of Oriental Languages, Vol. 15, No. 3 (2002), 245–260.

Introduction. Contextual language processing: given a specified corpus, find a plausible probabilistic model that translates the perceived syllable sequence into the correct characters, words, or sentences (in the maximum-likelihood sense). Two main errors affect correctness: Modeling error: a weakness of the language model itself; addressed by enhancing or refining the weak model. Estimation error: caused by a small training corpus and by a varying run-time context domain.

Introduction (cont.) Estimation error: Small training corpus: smoothing, class-based, and similarity-based models STATICALLY adjust the unreliable probabilities. Varying run-time context domain: the cache-based model captures short-term fluctuations in word frequencies and is effective for frequent words or repeated sentences; the multiple-language-model approach defines a "current-time context" and imports several similar texts from distinct fields for comparison.

Introduction (cont.) Author's work: Goal: decrease the estimation error under varying-context conditions. Method: develop an adaptive algorithm that adjusts statistical information as the context environment varies. Application: disambiguation of Chinese homophones.

Language Model. Chinese facts: about 1,300 syllables, 13,094 commonly used characters, and roughly 100,000 words. Disambiguation is important because many different characters share the same syllable.

Language Model (cont.) Disambiguation definition: translate a sequence of syllables into the sequence of characters (a sentence) with the correct meaning. Modeling:
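The slide's formula is not reproduced in this transcript. As an assumption, the standard maximum-likelihood formulation for syllable-to-character conversion would be:

```latex
\hat{C} = \arg\max_{C} P(C \mid S)
        = \arg\max_{C} \frac{P(S \mid C)\, P(C)}{P(S)}
        = \arg\max_{C} P(S \mid C)\, P(C)
```

where $S = s_1 \dots s_n$ is the perceived syllable sequence and $C = c_1 \dots c_n$ a candidate character sequence; $P(S)$ is constant over candidates and can be dropped.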

Language Model (cont.) Simplification:
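The simplified model is also not reproduced here; a hedged reconstruction, assuming the usual per-character independence and bigram approximations:

```latex
P(S \mid C) \approx \prod_{i=1}^{n} P(s_i \mid c_i), \qquad
P(C) \approx \prod_{i=1}^{n} P(c_i \mid c_{i-1})
```

This would make the search tractable with only pairwise character statistics.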

Language Model (cont.) Stored context statistical information: a 2-D matrix and a 1-D vector. Contribution of the author's work: adapt the context statistical information according to feedback on estimation errors under varying-context conditions.
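A minimal sketch of what the stored statistics could look like, assuming the 2-D matrix holds bigram counts and the 1-D vector holds unigram counts (the character inventory and corpus below are hypothetical toys, not the paper's data):

```python
import numpy as np

# Hypothetical toy character inventory; the paper's inventory is ~13,094 characters.
chars = ["的", "地", "得"]
idx = {c: i for i, c in enumerate(chars)}

# 1-D vector of unigram counts and 2-D matrix of bigram counts.
unigram = np.zeros(len(chars))
bigram = np.zeros((len(chars), len(chars)))

def train(corpus):
    """Accumulate unigram and bigram counts from a list of character strings."""
    for sent in corpus:
        for i, c in enumerate(sent):
            unigram[idx[c]] += 1
            if i > 0:
                bigram[idx[sent[i - 1]], idx[c]] += 1

def bigram_prob(prev, cur, alpha=1.0):
    """P(cur | prev) with add-alpha smoothing to avoid zero probabilities."""
    return (bigram[idx[prev], idx[cur]] + alpha) / (unigram[idx[prev]] + alpha * len(chars))

train(["的的地", "的得"])
p = bigram_prob("的", "地")  # counts: bigram(的,地)=1, unigram(的)=3 → (1+1)/(3+3)
```

Smoothing is shown because the Introduction names it as one of the static fixes for small-corpus estimation error.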

Adaptive Learning Model. Detecting the components that need to be adjusted.

Adaptive Learning Model (cont.) Detecting the components that need to be adjusted (cont.)

Adaptive Learning Model (cont.) Modeling and Learning algorithm:
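The learning algorithm itself is not reproduced in this transcript. As a loose, hypothetical illustration of error-driven adaptation (an assumption, not the paper's actual update rule), one could adjust the stored bigram counts whenever feedback reveals a wrong character choice:

```python
def adapt(bigram_counts, prev_char, predicted, correct, rate=1.0):
    """Hypothetical feedback update after an estimation error.

    bigram_counts: dict mapping (prev, cur) character pairs to counts
    predicted:     character the model chose
    correct:       character the reference says is right
    """
    if predicted == correct:
        return  # no estimation error, nothing to adjust
    key_good = (prev_char, correct)
    key_bad = (prev_char, predicted)
    # Raise the count of the correct pair, lower the wrongly chosen one.
    bigram_counts[key_good] = bigram_counts.get(key_good, 0.0) + rate
    bigram_counts[key_bad] = max(bigram_counts.get(key_bad, 0.0) - rate, 0.0)

counts = {("他", "的"): 2.0, ("他", "地"): 1.0}
adapt(counts, "他", predicted="的", correct="地")
```

Because each update touches only the affected counts, a scheme like this would also support the incremental training the Conclusion mentions.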

Experimental Results. Two parts of data come from two different contexts: Part 1 has 123 sentences and 1,057 characters; Part 2 has 123 sentences and 977 characters. Two stages of experiments: Stage 1: Part 1 as learning data, Part 2 as testing data. Stage 2: Part 2 as learning data, Part 1 as testing data.

Experimental Results

Experimental Results (cont.)

Conclusion. The paper proposes an adaptive learning algorithm for task adaptation, fitting the run-time context domain in the application of Chinese homophone disambiguation; the algorithm is also suitable for incremental training. Personal comment (1): the author proposed an MLP-based method, but the learning problem as modeled is essentially linear, so the MLP seems unnecessary. Personal comment (2): the experimental results are not convincing, but the idea of adaptive learning in natural language processing is interesting. Link: from the Inspec database, available on the MSU library website.