Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 Presenter : Chien-Hsing Chen Author: Jong-Hoon Oh Key-Sun.


1 Intelligent Database Systems Lab, 國立雲林科技大學 (National Yunlin University of Science and Technology). Presenter: Chien-Hsing Chen. Authors: Jong-Hoon Oh, Key-Sun Choi, Hitoshi Isahara. A machine transliteration model based on correspondence between graphemes and phonemes. ACM Transactions on Asian Language Information Processing (TALIP), 2007.

2 Intelligent Database Systems Lab, N.Y.U.S.T. I.M. Outline: Motivation, Objective, Previous work, Method, Experiments, Conclusions, Opinion.

3 Motivation. Machine transliteration (MT) automatically converts words in one language into phonetically equivalent ones in another language, such as from English to Korean, Japanese, or Chinese. A special case of CLIR (cross-language information retrieval); it is useful for query translation … Grapheme-based: source graphemes (G) → target G. Phoneme-based: source G → source phonemes (P) → target G. Hybrid: linear interpolation. Dynamically handle source graphemes and phonemes. Example: data (English) → deiteo (Korean); deta (Japanese); pronunciation [`det#].

4 Objective. Correspondence-based: use the correspondence between source G and P, and dynamically handle source G and P based on the context. An example: neomycin (G + P). Example: data (English) → deiteo (Korean); deta (Japanese); pronunciation [`det#].

5 Previous work: Grapheme-Based 1/4. G-based transliteration models are classified into: statistical translation, decision trees, transliteration networks, and joint source-channel models. board (/B AO R D/); b, oa, r, d are PUs (pronunciation units), segmented by syllable. E_i = epu_i1, …, epu_in [1998, 1999]; K_i = kpu_i1, …, kpu_in. E = b:oar:d, b:oa:r:d, b:o:a:r:d; K = b:o:deu, b:o:reu:deu, b:o:a:reu:deu. Problems: erroneous PUs, incorrect PUs for each word, time.
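The candidate segmentations listed for "board" can be enumerated mechanically. The following is a minimal sketch, assuming a PU is any contiguous run of at most three letters; the function name `segmentations` and the length limit are illustrative assumptions, not the procedure from the cited work.

```python
from typing import Iterator, List


def segmentations(word: str, max_unit_len: int = 3) -> Iterator[List[str]]:
    """Yield every split of `word` into contiguous units of at most `max_unit_len` letters."""
    if not word:
        yield []
        return
    for size in range(1, min(max_unit_len, len(word)) + 1):
        head, tail = word[:size], word[size:]
        for rest in segmentations(tail, max_unit_len):
            yield [head] + rest


# Join units with ':' to match the notation used on the slide.
candidates = [":".join(units) for units in segmentations("board")]
```

The three segmentations shown on the slide (b:oar:d, b:oa:r:d, b:o:a:r:d) are among the candidates produced. The number of candidates grows quickly with word length, which is one way to read the "time" problem noted above.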

6 Previous work: Grapheme-Based 2/4. Decision trees [2000; 2001]: English grapheme to Korean grapheme conversion. Does not consider the phonetic aspect of transliteration.

7 Previous work: Grapheme-Based 3/4. Transliteration network [2000]. Each node is composed of more than one English grapheme and the corresponding Korean graphemes. Each arc represents a possible link between nodes. The optimal path is the one with the highest total weight, found with the Viterbi and tree-trellis algorithms. Example nodes: ca → ka, ca → ki.
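The idea of picking the highest-weight path can be sketched with a small Viterbi-style dynamic program. This is a simplified illustration, not the cited network: weights sit on nodes only (no arc weights), the node inventory and all weights are invented, and only the single best path is returned (no tree-trellis n-best search).

```python
from typing import Dict, List, Tuple

# Hypothetical node inventory: English chunk -> [(target chunk, weight)].
# The weights are made up for illustration; they are not from the paper.
NODES: Dict[str, List[Tuple[str, float]]] = {
    "c": [("k", 0.5)],
    "a": [("a", 0.6), ("i", 0.2)],
    "ca": [("ka", 1.3), ("ki", 0.4)],
    "t": [("t", 0.7)],
}


def best_path(word: str, nodes: Dict[str, List[Tuple[str, float]]]):
    """Return (total weight, [(source chunk, target chunk), ...]) of the best path."""
    # best[i] holds the best-scoring path that covers word[:i].
    best: Dict[int, Tuple[float, List[Tuple[str, str]]]] = {0: (0.0, [])}
    for end in range(1, len(word) + 1):
        for start in range(end):
            if start not in best:
                continue
            chunk = word[start:end]
            for target, weight in nodes.get(chunk, []):
                score = best[start][0] + weight
                if end not in best or score > best[end][0]:
                    best[end] = (score, best[start][1] + [(chunk, target)])
    if len(word) not in best:
        raise ValueError(f"no path covers {word!r}")
    return best[len(word)]


score, path = best_path("cat", NODES)
```

With these invented weights, the path through the two-letter node ca → ka outscores the letter-by-letter path.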

8 Previous work: Grapheme-Based 4/4. Network [2003]. English: actinium; Japanese: a ku chi ni u mu.

9 Previous work: Phoneme-Based 1/3. Source-language word → pronunciation → target language. Weighted finite-state transducers (WFSTs): word sequence; word to English sound; English sound to Japanese sound; Japanese sound to katakana; katakana to OCR. A basic framework for phoneme-based transliteration.

10 Previous work: Phoneme-Based 2/3. Two-step procedure: (1) English PUs → English phonemes [statistical translation model]; (2) English phonemes → Korean PUs [EKSCRs, standard conversion rules]. Two problems: error propagation (the English PU → English phoneme step is often wrong) and the limitations of EKSCRs.

11 Previous work: Phoneme-Based 3/3. Decision trees: phoneme-based English → Korean transliteration. Depends on a pronunciation dictionary.

12 Previous work: Hybrid Transliteration 1/1. Combined through linear interpolation: 0.4 G-based + 0.6 P-based. Does not consider the dependence between the source graphemes and phonemes in the combining process.
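The linear interpolation above amounts to a weighted sum of the two models' scores for each candidate. A minimal sketch follows, using the 0.4 / 0.6 weights from the slide; the candidate strings and their per-model scores are invented for illustration.

```python
from typing import Dict


def interpolate(g_scores: Dict[str, float],
                p_scores: Dict[str, float],
                g_weight: float = 0.4) -> Dict[str, float]:
    """Combine grapheme-based and phoneme-based scores by linear interpolation."""
    p_weight = 1.0 - g_weight
    candidates = set(g_scores) | set(p_scores)
    return {
        c: g_weight * g_scores.get(c, 0.0) + p_weight * p_scores.get(c, 0.0)
        for c in candidates
    }


# Hypothetical scores for two candidate transliterations of "data".
g_scores = {"deiteo": 0.3, "data": 0.7}   # grapheme-based model (made up)
p_scores = {"deiteo": 0.8, "data": 0.2}   # phoneme-based model (made up)

combined = interpolate(g_scores, p_scores)
best = max(combined, key=combined.get)
```

The weights are fixed for every word and every position, which is the limitation the slide points out: the combination cannot react to the relationship between a particular source grapheme and its phoneme.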

13 Summary. G-based: source grapheme → target grapheme. P-based: source grapheme → source phoneme → target grapheme. Correspondence-based: minimizes errors caused by error propagation by using the source grapheme corresponding to each source phoneme; dynamically uses source graphemes and source phonemes depending on context to produce transliterations effectively. Example: data (English) → deiteo (Korean); deta (Japanese); pronunciation [`det#].

14 C-based

15 C-based

16 C-based: Producing Pronunciation. The most relevant source phoneme of b, /B/, can be produced by means of the context features f_s, f_Stype, and f_p at positions L1-L3, C0, and R1-R3.
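The context window L1-L3, C0, R1-R3 can be illustrated with a small feature extractor. This is a sketch under stated assumptions: the padding symbol, the consonant/vowel definition of the grapheme type, and the left-to-right treatment of phonemes (positions not yet produced are padded) are illustrative choices, and the names `window` and `context_features` are hypothetical.

```python
from typing import Dict, List, Sequence

PAD = "$"  # hypothetical padding symbol for positions outside the word


def window(seq: Sequence[str], i: int, width: int = 3) -> List[str]:
    """Return the items at L3..L1, C0, R1..R3 around index i, padded at the edges."""
    return [seq[j] if 0 <= j < len(seq) else PAD
            for j in range(i - width, i + width + 1)]


def grapheme_type(g: str) -> str:
    """Assumed type feature: 'V' for vowel letters, 'C' otherwise."""
    return "V" if g in "aeiou" else "C"


def context_features(graphemes: Sequence[str],
                     phonemes_so_far: Sequence[str],
                     i: int) -> Dict[str, List[str]]:
    """Collect f_s, f_Stype, and f_p for the grapheme at position i."""
    types = [grapheme_type(g) for g in graphemes]
    # Phonemes not yet produced are represented by padding.
    phonemes = list(phonemes_so_far) + [PAD] * (len(graphemes) - len(phonemes_so_far))
    return {
        "f_s": window(graphemes, i),
        "f_Stype": window(types, i),
        "f_p": window(phonemes, i),
    }


# Features available when producing the phoneme for 'b' in "board".
feats = context_features(list("board"), [], 0)
```

A classifier trained on such feature vectors would then predict the phoneme at C0 (here /B/).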

17 C-based: Producing Target Graphemes

18 C-based: Maximum Entropy Model 1/3 (part 1/2)

19 C-based: Maximum Entropy Model 1/3 (part 2/2)

20 C-based: Decision Tree 2/3

21 C-based: Memory-Based Learning 3/3. k-nearest neighbor algorithm.
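Memory-based learning stores training instances and classifies a new one by its nearest stored neighbors. The sketch below assumes a plain overlap distance over symbolic features and unweighted majority voting; the stored instances (features L1, C0, R1 mapped to a target chunk) are invented, and the actual distance metric and feature weighting used in the paper are not given in these slides.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Instance = Tuple[Tuple[str, ...], str]  # (feature vector, target label)


def overlap_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Count the feature positions where the two vectors disagree."""
    return sum(x != y for x, y in zip(a, b))


def knn_classify(memory: List[Instance], query: Sequence[str], k: int = 3) -> str:
    """Return the majority label among the k stored instances closest to `query`."""
    nearest = sorted(memory, key=lambda item: overlap_distance(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical memory: (L1, C0, R1) -> target chunk; '$' marks a word boundary.
memory: List[Instance] = [
    (("$", "b", "o"), "b"),
    (("a", "b", "e"), "b"),
    (("$", "d", "a"), "d"),
    (("a", "d", "$"), "deu"),
    (("r", "d", "$"), "deu"),
]

prediction = knn_classify(memory, ("o", "d", "$"))
```

For the word-final d in this toy memory, the two closest instances both map to "deu", so that label wins the vote.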

22 Experiments 1/2. P-based, G-based, C-based.

23 Experiments 2/2

24 Discussion

25 Conclusion. The authors plan to apply the transliteration model to English-to-Chinese transliteration.

27 Opinion. Advantage: combines graphemes and phonemes. Drawback: lacks dynamic alignment. Applications: machine translation, CLIR, IR, and other NLP applications.

