English-Korean Machine Translation System
Sung-Dong Kim
Dept. of Computer System Engineering, Hansung University
Dept. of CSE, Hansung Univ.
Contents
- History
- Methodologies
- System Structure
- Capabilities of EKMT System
History (1)
- KSHALT (Korean System for Human Assisted Language Translation)
  - Late 1980s
  - Implemented in LISP
  - Ran on an IBM 3090 mainframe
- Enkor
  - 1991 to 1995
  - Implemented in C
History (2)
- ETran
  - Several versions: 98, 99, 2000, 2003
  - Enhanced version of Enkor
  - Text translation and internet translation in the browser
- SmarTran
  - 2004
  - Provides more user-friendly capabilities
  - Improves dictionary access time
Methodologies (1)
- Rule-based method
- Transfer method
- Idiom-based analysis and translation
- Statistical method
Methodologies (2)
- Rule-based method
  - Context-free grammar
  - Phrase, clause, and sentence structures
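As an illustration of context-free rules over phrase and sentence structures, here is a minimal sketch using CYK recognition. The grammar and lexicon are toy assumptions in Chomsky normal form and are not taken from the EKMT rule set; the function name `recognize` is also hypothetical.

```python
from itertools import product

# Toy grammar in Chomsky normal form (hypothetical; not the EKMT rule set).
BINARY_RULES = {
    ("NP", "VP"): "S",
    ("DET", "N"): "NP",
    ("V", "NP"): "VP",
}
LEXICON = {
    "the": {"DET"},
    "a": {"DET"},
    "student": {"N"},
    "book": {"N"},
    "reads": {"V"},
}

def recognize(words):
    """CYK recognition: True if the words can be analysed as a sentence (S)."""
    n = len(words)
    if n == 0:
        return False
    # table[i][j] holds the categories that span words[i:j]
    table = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        table[i][i + 1] = set(LEXICON.get(word, ()))
    for width in range(2, n + 1):
        for start in range(n - width + 1):
            end = start + width
            for mid in range(start + 1, end):
                # Combine every pair of categories found on the two sub-spans.
                for left, right in product(table[start][mid], table[mid][end]):
                    parent = BINARY_RULES.get((left, right))
                    if parent:
                        table[start][end].add(parent)
    return "S" in table[0][n]
```

With this toy grammar, `recognize("the student reads a book".split())` succeeds, while a fragment such as `"reads the"` is rejected.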
Methodologies (3)
- Transfer method
  - Uses transfer rules (about 70 rules)
  - Structural transfer bridges the differences between English and Korean
    - “little”: carries a negative meaning in Korean
    - It ~ to-INF: real and pseudo subject, object relation
    - …
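A structural transfer rule for the "It ~ to-INF" case might look like the sketch below. The flat dictionary representation and the function name `transfer_it_to_inf` are assumptions made for illustration; the slides do not show how the roughly 70 transfer rules are encoded.

```python
def transfer_it_to_inf(clause):
    """Promote the to-infinitive to subject when 'it' is only a placeholder.

    `clause` is a hypothetical flat analysis with the keys
    'subject', 'predicate', and optionally 'to_inf'.
    """
    if clause.get("subject") == "it" and "to_inf" in clause:
        return {
            "subject": clause["to_inf"],      # the real subject
            "predicate": clause["predicate"],
        }
    return dict(clause)  # rule does not apply; return an unchanged copy
```

For "It is difficult to learn Korean", the pseudo subject "it" is dropped and "to learn Korean" becomes the subject passed on to Korean generation.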
Methodologies (4)
- Idiom-based analysis and translation
  - Reduces analysis complexity
  - Enhances translation quality
- bread and butter
  - Means not "bread and butter" but "bread spread with butter"
  - Takes one edge instead of three in the chart
- provide him with money
  - Matches the idiom pattern "provide A with B"
  - Equivalent to "give him money"
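The "one edge instead of three" point can be sketched as a pre-parsing pass that merges a recognised idiom into a single unit. The dictionary contents, the merged unit name, and the function `merge_idioms` are hypothetical; patterns with variable slots such as "provide A with B" would need a richer matcher than this fixed-string version.

```python
# Hypothetical idiom entries; the real EK idiom dictionary is not shown in the slides.
IDIOMS = {
    ("bread", "and", "butter"): "bread_and_butter",
}
MAX_IDIOM_LEN = max(len(key) for key in IDIOMS)

def merge_idioms(tokens):
    """Replace each idiom occurrence with a single unit, longest match first."""
    result = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate first; single words are never idioms here.
        for length in range(min(MAX_IDIOM_LEN, len(tokens) - i), 1, -1):
            unit = IDIOMS.get(tuple(tokens[i:i + length]))
            if unit:
                result.append(unit)
                i += length
                break
        else:
            # No idiom starts at this position; keep the word as it is.
            result.append(tokens[i])
            i += 1
    return result
```

After merging, "he eats bread and butter" reaches the parser as three units rather than five, so the chart holds a single edge for the idiom.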
Methodologies (5)
- Statistical method: intra-sentence segmentation based on maximum entropy
  - Partial parsing: reduce parsing complexity
  - Long sentence analysis
  - Maximum entropy probability model
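A rough sketch of how a maximum entropy model could score candidate segmentation points follows. The features, the hand-set weights, and the 0.5 threshold are all assumptions chosen for illustration; the slides do not describe the actual feature set, and a real model would estimate its weights from annotated sentences.

```python
import math

# Hypothetical feature weights (not learned, not from the slides).
WEIGHTS = {
    ("word", "that"): 1.2,
    ("word", "and"): 0.8,
    ("prev_is_comma", True): 1.5,
    ("bias", True): -2.0,
}

def features(tokens, i):
    """Binary features describing a candidate boundary before tokens[i]."""
    return [
        ("word", tokens[i].lower()),
        ("prev_is_comma", tokens[i - 1] == ","),
        ("bias", True),
    ]

def boundary_probability(tokens, i):
    """p(boundary | context) under a two-class maximum entropy model."""
    score = sum(WEIGHTS.get(f, 0.0) for f in features(tokens, i))
    # The 'no boundary' class has score 0, so the normaliser is exp(score) + 1.
    return math.exp(score) / (math.exp(score) + 1.0)

def segment(tokens, threshold=0.5):
    """Split tokens wherever the boundary probability exceeds the threshold."""
    if not tokens:
        return []
    segments, current = [], [tokens[0]]
    for i in range(1, len(tokens)):
        if boundary_probability(tokens, i) > threshold:
            segments.append(current)
            current = []
        current.append(tokens[i])
    segments.append(current)
    return segments
```

Each resulting segment can then be parsed on its own, which is where the reduction in parsing complexity for long sentences comes from.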
Structure (1)
English sentences
-> Lexical Analysis (Lexical Rules, Lexical Dictionary)
-> Syntactic Analysis (Syntactic Rules)
-> Transfer (EK Transfer Dictionary, EK Transfer Rules)
-> Korean Generation (Korean Dictionary, Korean Generation Rules)
-> Korean sentences
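The four stages can be sketched as a chain of functions. Everything below is a toy assumption: the three-word dictionaries, the function names, and the restriction to subject-verb-object sentences only illustrate the data flow and are not the system's resources.

```python
# Toy resources (hypothetical; the real dictionaries and rules are far larger).
LEXICAL_DICT = {"i": "PRON", "you": "PRON", "love": "VERB"}
TRANSFER_DICT = {"i": "나", "you": "너", "love": "사랑하다"}
GENERATION_FORMS = {"사랑하다": "사랑한다"}  # plain declarative ending

def lexical_analysis(sentence):
    """Tokenize and tag each word with a part of speech."""
    return [(w, LEXICAL_DICT[w]) for w in sentence.lower().split()]

def syntactic_analysis(tagged):
    """Assign subject, verb, and object for a three-word SVO sentence only."""
    (subj, _), (verb, verb_pos), (obj, _) = tagged
    if verb_pos != "VERB":
        raise ValueError("expected subject-verb-object order")
    return {"subject": subj, "verb": verb, "object": obj}

def transfer(tree):
    """Map English lexemes to Korean ones, keeping the grammatical roles."""
    return {role: TRANSFER_DICT[word] for role, word in tree.items()}

def korean_generation(tree):
    """Emit subject-object-verb order with topic and object particles.

    The particles used here fit vowel-final nouns only.
    """
    return " ".join([
        tree["subject"] + "는",
        tree["object"] + "를",
        GENERATION_FORMS[tree["verb"]],
    ])

def translate(sentence):
    """Run the four stages in the order shown in the structure diagram."""
    return korean_generation(transfer(syntactic_analysis(lexical_analysis(sentence))))
```

Calling `translate("I love you")` walks through all four stages and reorders the English subject-verb-object sentence into Korean subject-object-verb order.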
Structure (2)
- System modules
  - Lexical analysis
  - Syntactic analysis: parser
    - Chart-based parsing
    - Idiom-based analysis: idiom recognition before parsing
    - Partial parsing using intra-sentence segmentation
  - Transfer
  - Korean generation
[Diagram: Parser]
Result of lexical analysis
-> Intra-sentence segmentation (Segment 1, Segment 2, …, Segment n)
-> Idiom recognition & translation (EK Idiom Dictionary)
-> Segment Analysis (Grammar)
-> Global structuring
-> Tree selection
-> Parsing tree
Structure (4)
- Dictionaries
  - Lexical dictionaries
    - Word usage dictionary
      - About 70,000 entries
      - Information for POS probability calculation
    - Word information dictionary
      - About 83,000 entries
      - Information about POS, …
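The slides do not say how POS probabilities are derived from the word usage dictionary. One simple possibility, assumed here purely for illustration, is a relative-frequency estimate over per-word usage counts; the counts and the name `pos_probabilities` are hypothetical.

```python
# Hypothetical usage counts per word and POS (not from the actual dictionary).
USAGE_COUNTS = {
    "book": {"NOUN": 90, "VERB": 10},
}

def pos_probabilities(word):
    """Relative-frequency estimate of P(POS | word); empty if the word is unknown."""
    counts = USAGE_COUNTS.get(word, {})
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pos: count / total for pos, count in counts.items()}
```

Such probabilities would let lexical analysis rank the candidate tags of an ambiguous word before parsing.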
Structure (5)
- EK transfer dictionary
  - Structure
    - Default meaning
    - Collocations
    - Idioms
  - General dictionary
    - For each POS: noun, adjective, adverb, verb, …
  - Domain dictionary
    - 14 types: military, economy, politics, computer, medical, …
  - User dictionary
    - Facility for users to build their own dictionaries
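One way to combine the general, domain, and user dictionaries is layered lookup, sketched below with Python's `collections.ChainMap`. The precedence order (user over domain over general) is an assumption, as are the sample entries and the name `build_lookup`; the slides list the three dictionaries without stating how conflicts are resolved.

```python
from collections import ChainMap

# Hypothetical sample entries.
GENERAL = {"mouse": "쥐", "bank": "은행"}
DOMAINS = {"computer": {"mouse": "마우스"}}

def build_lookup(user_dict, domain=None):
    """Layer the dictionaries so that earlier layers shadow later ones.

    The user > domain > general precedence is an assumption.
    """
    layers = [user_dict]
    if domain is not None:
        layers.append(DOMAINS[domain])
    layers.append(GENERAL)
    return ChainMap(*layers)
```

With the computer domain selected, "mouse" resolves to the domain sense; without a domain it falls back to the general entry, and a user entry overrides both.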
Structure (6)
- Statistics of transfer dictionary
  - Number of entries
    - General dictionary: about 68,000
      - Noun: 40,546
      - Adjective: 16,255
      - Adverb: 2,671
      - Verb: 8,612
    - Domain dictionary: about 36,000
  - Number of idioms: 42,600
  - Number of collocations: 6,150
Structure (7)
- Rules
  - Lexical rules
  - Syntactic rules
    - About 600 syntactic rules
  - Transfer rules
  - Korean generation rules
Capabilities of EKMT system
- Text translation
- Internet translation in the browser
- MS Office document translation