
ETAP-3: State of the Art, Options, and Prospects of Development. Leonid Iomdin, Institute for Information Transmission Problems, Russian Academy of Sciences.


1 ETAP-3: State of the Art, Options, and Prospects of Development
Leonid Iomdin, Institute for Information Transmission Problems, Russian Academy of Sciences, iomdin@iitp.ru

2 Prague, May 12, 2008. Theoretical Background
Igor Mel’čuk: «Meaning ⇔ Text» theory
Jurij Apresjan: Integrated Theory of Linguistic Description and Systemic Lexicography

3 ETAP-3 Options
Machine translation
SynTagRus: the tagged corpus of Russian texts
Generation from and to UNL
(Quasi)synonymous paraphrasing
Computer-aided language learning tool

4 Machine Translation
Russian-English: 120,000-strong morphological dictionaries, 95,000-strong combinatorial dictionaries
Russian-German prototype
Russian-French prototype
Russian-Korean prototype
Russian-Spanish prototype
Arabic-English prototype

5 Major Features of the ETAP Environment
Rule-based approach
Stratificational approach
Syntactic dependencies
Lexicalistic approach
Self-tuning
Maximum reusability of linguistic resources

6 General Layout of Translation Process [figure]

7 Dependency Syntactic Structure [figure]
Example sentence: They made a general remark that it was true.

8 Self-Tuning: Grammar vs. Dictionary
General regularities: general rules that apply to very large classes of words and occur very often. Example: Adj + N agreement.
Restricted-scope regularities: specific rules that apply to restricted classes of words and have limited occurrence. Example: compound numerals.

9 Multiple Translation
They made a general remark that …
(a) ‘they remarked in a general way that…’
(b) ‘they forced a general to remark that…’

10 Synonymous Paraphrasing
The director ordered John to write a report.
The director gave John an order to write a report.
John was ordered by the director to write a report.
John received an order from the director to write a report.

11 Lexical Functions
Substitute LFs: synonyms, antonyms, converse terms, derivatives
Collocate LFs: MAGN = 'a high degree of what is denoted by X'; OPER/FUNC ...

12 Lexical Functions
MAGN(disease) = grave
MAGN(fog) = heavy
MAGN(control) = strict

13 Oper / Func Family of LFs [figure]

14 Examples of LF Oper
Oper1(invitation) = issue
Oper2(invitation) = receive
Oper1(defeat) = suffer
Oper2(resistance) = encounter
Oper2(respect) = enjoy

15 Examples of LF Func
Func1(fear) = possess
Func2(decision) = concern
Func1(responsibility) = rest (with)
Func2(vengeance) = fall (upon)

16 General Properties of Lexical Functions
Universality
Intralinguistic idiomaticity: grave disease, heavy fog, but not *heavy disease, *grave fog
Cross-linguistic idiomaticity: Rus. тяжелая болезнь, literally ‘heavy disease’; Rus. густой туман, literally ‘dense fog’
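
The idiomaticity point on this slide can be illustrated with a minimal Python sketch: the adjective is never translated word for word; instead the collocation is reduced to its lexical function (MAGN), and the target-language value of that function is looked up for the translated noun. The LF values come from the slides; the dictionary layout, the function names, and the small bilingual noun lexicon are assumptions made for illustration and are not ETAP-3's actual data format.

```python
# LF values as given on the slides, keyed by (LF name, keyword).
EN_LF = {
    ("MAGN", "disease"): "grave",
    ("MAGN", "fog"): "heavy",
    ("MAGN", "control"): "strict",
}
RU_LF = {
    ("MAGN", "болезнь"): "тяжелая",
    ("MAGN", "туман"): "густой",
}
# Noun equivalents implied by the glosses on the slide (hypothetical mini-lexicon).
EN_TO_RU = {"disease": "болезнь", "fog": "туман"}


def find_lf(collocate, keyword, lf_dict):
    """Return the LF name under which `collocate` is listed for `keyword`, or None."""
    for (lf, kw), value in lf_dict.items():
        if kw == keyword and value == collocate:
            return lf
    return None


def translate_collocation(collocate, keyword):
    """Translate an English Adj + N collocation through its lexical function."""
    lf = find_lf(collocate, keyword, EN_LF)
    if lf is None:
        return None  # not a recorded LF collocation (e.g. *heavy disease)
    ru_keyword = EN_TO_RU.get(keyword)
    ru_collocate = RU_LF.get((lf, ru_keyword))
    if ru_collocate is None:
        return None  # no Russian LF value in this toy dictionary
    return f"{ru_collocate} {ru_keyword}"


print(translate_collocation("heavy", "fog"))
print(translate_collocation("grave", "disease"))
```

With these toy dictionaries, "heavy fog" maps to густой туман rather than to a literal rendering of "heavy", and an unrecorded pair such as "heavy disease" yields no result.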

17 General Properties of Lexical Functions (cont.)
Paraphrasing potential:
He respects [X] his teachers.
He has [OPER1(S0(X))] respect [S0(X)] for his teachers.
He treats [LABOR12(S0(X))] his teachers with respect.
His teachers enjoy [OPER2(S0(X))] his respect.
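
A rough Python sketch of how the paraphrases on this slide could be generated from LF values. The values (S0, OPER1, OPER2, LABOR12 for "respect") are those shown in the slides; the string templates, the toy inflection helper `third_sg`, and the function names are assumptions for illustration only. A real system would operate on syntactic structures, not strings.

```python
# LF values from the slides, keyed by (LF name, keyword).
LF = {
    ("S0", "respect"): "respect",      # noun derived from the verb
    ("OPER1", "respect"): "have",
    ("OPER2", "respect"): "enjoy",
    ("LABOR12", "respect"): "treat",
}


def third_sg(verb):
    """Toy 3rd-person-singular inflection (an assumption, not real morphology)."""
    return {"have": "has"}.get(verb, verb + "s")


def paraphrases(verb, subj, poss, obj):
    """Build LF-based paraphrases of '<subj> <verb>s <poss> <obj>'.

    Assumes a singular subject and a plural object, as in the slide's example.
    """
    noun = LF[("S0", verb)]
    oper1 = LF[("OPER1", noun)]
    oper2 = LF[("OPER2", noun)]
    labor12 = LF[("LABOR12", noun)]
    return [
        f"{subj} {third_sg(oper1)} {noun} for {poss} {obj}",
        f"{subj} {third_sg(labor12)} {poss} {obj} with {noun}",
        f"{poss.capitalize()} {obj} {oper2} {poss} {noun}",
    ]


for sentence in paraphrases("respect", "He", "his", "teachers"):
    print(sentence)
```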

18 LFs in Practical Applications
Syntactic and lexical ambiguity resolution in parsers
Idiomatic translation of a large class of set expressions in machine translation
Sentence paraphrasing

19 Lexical Ambiguity Resolution
to draw a distinction - provodit' razlichie
Both verbs are extremely ambiguous: draw has more than 50 meanings, provodit' more than 10.

20 Syntactic Ambiguity Resolution
support of the parliament: 'support by the parliament' or 'support (given) to the parliament'
The president had [Y = OPER2(X)] the support [X] of the parliament.
The fear [X] of his wife possessed [Y = FUNC1(X)] Peter.
The fears of his wife infected Peter.
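
The reasoning behind these examples can be sketched in Python. When the verb is an LF value of the noun, the verb already realizes one of the noun's actants (the subject of an Oper2 verb is the noun's second actant; the object of a Func1 verb is its first actant), so the "of"-phrase must fill the other one. The two LF values are from the slides; the table `TAKEN_BY_VERB`, the function name, and the restriction to two-actant nouns are simplifying assumptions.

```python
# LF values from the slides, keyed by (LF name, noun) -> verb lemma.
LF = {
    ("OPER2", "support"): "have",
    ("FUNC1", "fear"): "possess",
}

# Which actant of the noun is realized by the verb's own subject or object
# (follows from the LF definitions; the noun is assumed to have two actants).
TAKEN_BY_VERB = {"OPER1": 1, "OPER2": 2, "FUNC1": 1, "FUNC2": 2}


def of_phrase_actant(noun, verb):
    """Return the actant number (1 or 2) filled by the 'of'-phrase,
    or None if the verb is not a recorded LF value of the noun."""
    for (lf, keyword), value in LF.items():
        if keyword == noun and value == verb:
            taken = TAKEN_BY_VERB[lf]
            return 2 if taken == 1 else 1
    return None


# 'had the support of the parliament': the parliament is actant 1 (the supporter).
print(of_phrase_actant("support", "have"))
# 'The fear of his wife possessed Peter': his wife is actant 2 (what is feared).
print(of_phrase_actant("fear", "possess"))
# 'The fears of his wife infected Peter': no LF link, so no decision from this rule.
print(of_phrase_actant("fear", "infect"))
```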

21 Idiomatic Translation: LF Temp
March: in - март: в2
Tuesday: on - вторник: в1
dawn: at - рассвет: на2
moment: at - момент: в1
Easter: at - пасха: на1
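
A small Python sketch of how the Temp values on this slide could drive preposition choice in translation: the English preposition is recognized as the Temp value of its noun, and the Russian preposition is then taken from the Temp value of the translated noun, never from the English preposition itself. The values are copied from the slide, including its numeric lexeme indices (в1, в2, на1, на2), whose exact meaning is not stated there. The dictionary layout and the function name are assumptions.

```python
# Temp values from the slide: noun -> preposition lexeme.
TEMP_EN = {"March": "in", "Tuesday": "on", "dawn": "at", "moment": "at", "Easter": "at"}
TEMP_RU = {"март": "в2", "вторник": "в1", "рассвет": "на2", "момент": "в1", "пасха": "на1"}
# Noun pairs as aligned on the slide.
NOUN_EN_TO_RU = {
    "March": "март",
    "Tuesday": "вторник",
    "dawn": "рассвет",
    "moment": "момент",
    "Easter": "пасха",
}


def translate_temporal_pp(prep, noun):
    """Map an English temporal phrase (prep, noun) to (Russian prep lexeme, Russian noun).

    Returns None when `prep` is not the recorded Temp value of `noun`.
    Case government of the Russian preposition is not modeled here.
    """
    if TEMP_EN.get(noun) != prep:
        return None
    ru_noun = NOUN_EN_TO_RU[noun]
    return TEMP_RU[ru_noun], ru_noun


print(translate_temporal_pp("in", "March"))
print(translate_temporal_pp("at", "dawn"))
```

Note that English "at" corresponds to three different Russian lexemes here (на2, в1, на1), which is why a preposition-to-preposition table would not work.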

22 Sentence Paraphrasing
X = CONV12(X): This group consists of 20 persons – Twenty persons comprise this group
X + Y = ANTI1(X) + ANTI2(Y): He began to observe the rules – He stopped violating the rules
X = LABOR12 + S0(X): He respects his parents – He treats his parents with respect

23 Sample Dictionary Entry (Excerpt): CHANCE
CHANCE1
POR:S
SYNT:COUNT,PREDTO,PREDTHAT
DES:'FACT','ABSTRACT'

24 CHANCE
D1.1:OF,'PERSON'
D2.1:OF,'FACT'
D2.2:TO2
D2.3:THAT1

25 CHANCE
SYN1: OPPORTUNITY
MAGN: GOOD1, FAIR1, EXCELLENT
ANTIMAGN: SLIGHT, SLIM, POOR, LITTLE1, SMALL
OPER1: HAVE, STAND1
REAL1-M: TAKE

26 CHANCE
ANTIREAL1-M: MISS1
INCEPOPER1: GET
FINOPER1: LOSE
CAUSFUNC1: GIVE
ZONE:R
TRANS:ШАНС/СЛУЧАЙ

27 CHANCE
REG:TRADUCT2.00
TAKE:X
LOC:R
R:COMPOS/MODIF/POSSES
CHECK
1.1 DEP-LEXA(X,Z,PREPOS,BY1)
N:01
CHECK
1.1 DOM(X,*,R)
DO
1 ZAMRUZ:Z(PO1)
2 ZAMRUZ:X(SLUCHAJNOST’)

28 CHANCE
N:02
CHECK
2.1 DOM(X,*,*)
DO
1 ZAMRUZ:Z(SLUCHAJNO)
2 STERUZ:X
TRAF:RA-EXPANS.16
LA:THAT1
TRAF:RA-EXPANS.22
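
As an illustration of how the lexical-function fields of the CHANCE entry (slides 25 and 26) might be read into a program, here is a minimal Python sketch. The field names and values are copied from the slides; the one-field-per-line layout and the parser itself are assumptions, since the transcript does not show the actual ETAP-3 dictionary file format.

```python
# LF fields of the CHANCE entry as they appear on slides 25-26,
# laid out one field per line (this layout is an assumption).
CHANCE_LF_ZONE = """\
SYN1: OPPORTUNITY
MAGN: GOOD1, FAIR1, EXCELLENT
ANTIMAGN: SLIGHT, SLIM, POOR, LITTLE1, SMALL
OPER1: HAVE, STAND1
REAL1-M: TAKE
ANTIREAL1-M: MISS1
INCEPOPER1: GET
FINOPER1: LOSE
CAUSFUNC1: GIVE
"""


def parse_lf_zone(text):
    """Parse 'NAME: VALUE1, VALUE2' lines into {name: [values]}."""
    zone = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, _, values = line.partition(":")
        zone[name.strip()] = [v.strip() for v in values.split(",")]
    return zone


chance = parse_lf_zone(CHANCE_LF_ZONE)
print(chance["MAGN"])
print(chance["OPER1"])
```

The numeric suffixes (GOOD1, STAND1, MISS1) are kept as part of the value because they identify a specific lexeme of the collocate.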

29 What is UNL?
UNL is a formal language for meaning representation.
The minimal unit of UNL is a UNL expression.
A UNL expression corresponds to a natural language sentence in the amount of information conveyed.

30 Internet UNL Architecture [diagram; labels: French People, Hindu People, Spanish People, Chinese People; French, Chinese, Spanish, Hindi; UNL System]

31 How is UNL made?
UNL is a formal language of meaning representation.
The minimal unit of UNL is a UNL graph.
The amount of meaning rendered by a UNL graph corresponds to that of a natural language sentence.

32 Two MT Architectures: Transfer vs. Interlingua [diagram; labels: Source text, Target text, Interlingua, Transfer]

33 UNL Approach to Lexical Design
The semantic units of UNL (universal words, UWs) are designed on the basis of natural language (English) words, which can be semantically modified if need be.

34 UNL Strategy
Lexical meanings of the natural language are represented by UWs. Three cases arise:
1. The lexical meaning coincides with the meaning of an unambiguous English word.
2. The lexical meaning coincides with one of the senses of an ambiguous English word.
3. The lexical meaning does not coincide with any of the lexical meanings of English.

35 Disambiguation of Natural Word Senses
Coach: bus, trainer, train, drill, ...
coach(icl>bus>transport)
coach(icl>person,obj>sportsman)
coach(icl>do,obj>sportsman)
coach(icl>do,obj>student)
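
The UW notation on this slide (an English headword followed by parenthesized restrictions of the form relation>value) is regular enough to be split mechanically. The following Python sketch does that for the four "coach" UWs; it is an assumption-laden illustration based only on the shapes shown on the slide, not a parser for the full UNL specification.

```python
def parse_uw(uw):
    """Split a UW such as 'coach(icl>person,obj>sportsman)' into
    (headword, {relation: [value, ...]}).

    A chain like 'icl>bus>transport' is stored as {'icl': ['bus', 'transport']}.
    """
    head, sep, rest = uw.strip().partition("(")
    if not sep:
        return head, {}  # bare headword, no restrictions
    if not rest.endswith(")"):
        raise ValueError(f"unbalanced parentheses in UW: {uw!r}")
    restrictions = {}
    for item in rest[:-1].split(","):
        relation, *chain = item.strip().split(">")
        restrictions[relation] = chain
    return head, restrictions


COACH_SENSES = [
    "coach(icl>bus>transport)",
    "coach(icl>person,obj>sportsman)",
    "coach(icl>do,obj>sportsman)",
    "coach(icl>do,obj>student)",
]

for uw in COACH_SENSES:
    print(parse_uw(uw))
```

All four share the headword "coach"; the senses are told apart only by their restrictions, for example icl>person for the trainer and icl>do for the verbal senses.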

36 Formation of New UWs
прибежать ('to come running') → come(met>run)
прилететь ('to come by air') → come(met>plane)
приплыть ('to come swimming') → come(met>swim)
приползти ('to come crawling') → come(met>crawl)

37 Formation of New UWs
жениться ('to marry', of a man) → marry(agt>man)
выходить замуж ('to marry', of a woman) → marry(agt>woman)
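
The two "Formation of new UWs" slides show the same mechanism: when English has no single word for a Russian lexical meaning, a UW is built from a broader English headword plus a restriction. A minimal Python sketch of that construction follows; the helper `make_uw` and the lookup table are hypothetical, while the Russian words and the resulting UWs are those given on the slides.

```python
def make_uw(headword, **restrictions):
    """Build a UW string from an English headword and relation=value restrictions."""
    if not restrictions:
        return headword
    body = ",".join(f"{rel}>{val}" for rel, val in restrictions.items())
    return f"{headword}({body})"


# Russian lexemes and the UWs assigned to them on the slides.
RU_TO_UW = {
    "прибежать": make_uw("come", met="run"),
    "прилететь": make_uw("come", met="plane"),
    "приплыть": make_uw("come", met="swim"),
    "приползти": make_uw("come", met="crawl"),
    "жениться": make_uw("marry", agt="man"),
    "выходить замуж": make_uw("marry", agt="woman"),
}

for russian, uw in RU_TO_UW.items():
    print(f"{russian} -> {uw}")
```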

