Machine Translation: An Introduction and Overview
Alon Lavie
Language Technologies Institute, Carnegie Mellon University
JHU Summer School, June 28, 2006
Machine Translation: History
MT started in the 1940s as one of the first conceived applications of computers
Promising "toy" demonstrations in the 1950s failed miserably to scale up to "real" systems
ALPAC Report (1966): MT recognized as an extremely difficult, "AI-complete" problem
MT revival started in earnest in the 1980s (US, Japan)
Field dominated by rule-based approaches, requiring hundreds of person-years of manual development
Economic incentive for developing MT systems for a small number of language pairs (mostly European languages)
Machine Translation: Where are we today?
Age of Internet and Globalization – great demand for MT:
–Multiple official languages of UN, EU, Canada, etc.
–Documentation dissemination for large manufacturers (Microsoft, IBM, Caterpillar)
Economic incentive is still primarily within a small number of language pairs
Some fairly good commercial products in the market for these language pairs
–Primarily a product of rule-based systems after many years of development
Pervasive MT between most language pairs is still non-existent and not on the immediate horizon
Best Current General-purpose MT
PAHO's Spanam system:
Source: Mediante petición recibida por la Comisión Interamericana de Derechos Humanos (en adelante …) el 6 de octubre de 1997, el señor Lino César Oviedo (en adelante …) denunció que la República del Paraguay (en adelante …) violó en su perjuicio los derechos a las garantías judiciales … en su contra.
Output: Through petition received by the "Inter-American Commission on Human Rights" (hereinafter …) on 6 October 1997, Mr. Linen César Oviedo (hereinafter "the petitioner") denounced that the Republic of Paraguay (hereinafter …) violated to his detriment the rights to the judicial guarantees, to the political participation, to equal protection and to the honor and dignity consecrated in articles 8, 23, 24 and 11, respectively, of the "American Convention on Human Rights" (hereinafter …), as a consequence of judgments initiated against it.
Core Challenges of MT
Ambiguity:
–Human languages are highly ambiguous, and differently so in different languages
–Ambiguity at all "levels": lexical, syntactic, semantic, language-specific constructions and idioms
Amount of required knowledge:
–At least several hundred thousand words, at least as many phrases, plus syntactic and semantic knowledge about both languages. How do you acquire and construct a knowledge base that big that is (even mostly) correct and consistent?
How to Tackle the Core Challenges
Manual Labor: 1000s of person-years of human experts developing large word and phrase translation lexicons and translation rules. Example: Systran's RBMT systems.
Lots of Parallel Data: data-driven approaches for finding word and phrase correspondences automatically from large amounts of sentence-aligned parallel texts. Example: Statistical MT systems.
Learning Approaches: learn translation rules automatically from word-aligned parallel data. Example: CMU's XFER approach.
Simplify the Problem: build systems that are limited-domain or constrained in other ways.
State-of-the-Art in MT
What users want:
–General purpose (GP): any text
–High quality (HQ): human level
–Fully automatic (FA): no user intervention
We can meet any 2 of these 3 goals today, but not all three at once:
–FA + HQ: Knowledge-Based MT (KBMT)
–FA + GP: Corpus-Based (Example-Based) MT
–GP + HQ: Human-in-the-loop (efficiency tool)
Types of MT Applications
Assimilation: multiple source languages, uncontrolled style/topic. General purpose MT, no semantic analysis. (GP FA or GP HQ)
Dissemination: one source language, controlled style, single topic/domain. Special purpose MT, full semantic analysis. (FA HQ)
Communication: lower quality may be okay, but degraded input; real-time required.
Approaches to MT: Vauquois MT Triangle
(Diagram: the Vauquois triangle – Direct translation at the base, Transfer above it, Interlingua at the apex, with Analysis up the source side and Generation down the target side.)
Example: Mi chiamo Alon Lavie → My name is Alon Lavie
Interlingua: Give-information+personal-data (name=alon_lavie)
SL structure: [s [vp accusative_pronoun "chiamare" proper_name]]
TL structure: [s [np [possessive_pronoun "name"]] [vp "be" proper_name]]
Analysis and Generation: Main Steps
Analysis:
–Morphological analysis (word-level) and POS tagging
–Syntactic analysis and disambiguation (produce syntactic parse-tree)
–Semantic analysis and disambiguation (produce symbolic frames or logical-form representation)
–Map to language-independent Interlingua
Generation:
–Generate semantic representation in the TL
–Sentence planning: generate syntactic structure and lexical selections for concepts
–Surface-form realization: generate correct forms of words
Direct Approaches
No intermediate stage in the translation
First MT systems developed in the 1950s-60s (assembly-code programs)
–Morphology, bilingual dictionary lookup, local reordering rules
–"Word-for-word, with some local word-order adjustments"
Modern approaches: EBMT and SMT
Statistical MT (SMT)
Proposed by IBM in the early 1990s: a direct, purely statistical model for MT
Statistical translation models are trained on a sentence-aligned translation corpus
–Train word-level alignment models
–Extract phrase-to-phrase correspondences
–Apply them at runtime on source input and "decode"
Attractive: completely automatic, no manual rules, much reduced manual labor
Main drawbacks:
–Effective only with large volumes (several mega-words) of parallel text
–Broad domain, but domain-sensitive
–Still viable only for a small number of language pairs!
Impressive progress in the last 5 years
–Large DARPA funding program (TIDES)
–Lots of research in this direction
–Tools: GIZA++, Pharaoh, Cairo
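To make the word-alignment training step concrete, here is a minimal sketch of IBM Model 1 expectation-maximization on a toy corpus; the toy data and variable names are illustrative, and real SMT systems use tools like GIZA++ rather than code this simple.

```python
from collections import defaultdict

# Toy sentence-aligned corpus: (source, target) pairs, pre-tokenized.
corpus = [("das haus".split(), "the house".split()),
          ("das buch".split(), "the book".split()),
          ("ein buch".split(), "a book".split())]

# Uniform initialization of t(target_word | source_word).
t = defaultdict(lambda: 1.0)

for _ in range(10):                     # a few EM iterations
    count = defaultdict(float)          # expected co-occurrence counts
    total = defaultdict(float)          # normalizers per source word
    for src, tgt in corpus:
        for e in tgt:
            # E-step: distribute each target word's mass over source words.
            z = sum(t[(e, f)] for f in src)
            for f in src:
                c = t[(e, f)] / z
                count[(e, f)] += c
                total[f] += c
    # M-step: re-estimate translation probabilities.
    for (e, f), c in count.items():
        t[(e, f)] = c / total[f]

print(round(t[("house", "haus")], 3))   # converges toward 1.0
```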
EBMT Paradigm
New sentence (source): Yesterday, 200 delegates met with President Clinton.
Matches to source found:
–Yesterday, 200 delegates met behind closed doors… / Gestern trafen sich 200 Abgeordnete hinter verschlossenen…
–Difficulties with President Clinton… / Schwierigkeiten mit Praesident Clinton…
Alignment (sub-sentential): reusable fragments are identified and aligned within the matched pairs
Translated sentence (target): Gestern trafen sich 200 Abgeordnete mit Praesident Clinton.
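A minimal sketch of the matching step of the EBMT paradigm, assuming a tiny translation memory; the function name and greedy longest-match strategy are my own illustration, and a real system would also align and recombine the target-side fragments:

```python
def find_fragments(sentence, memory):
    """Collect word n-grams of the input that occur in some source-side
    memory entry, longest first (a toy stand-in for EBMT matching)."""
    words = sentence.split()
    found, covered = [], set()
    for n in range(len(words), 0, -1):
        for i in range(len(words) - n + 1):
            frag = " ".join(words[i:i + n])
            if any(frag in src for src, _ in memory) \
               and not (set(range(i, i + n)) & covered):
                found.append(frag)
                covered |= set(range(i, i + n))
    return found

memory = [("Yesterday , 200 delegates met behind closed doors",
           "Gestern trafen sich 200 Abgeordnete hinter verschlossenen Tueren"),
          ("Difficulties with President Clinton",
           "Schwierigkeiten mit Praesident Clinton")]

print(find_fragments("Yesterday , 200 delegates met with President Clinton",
                     memory))
# → ['Yesterday , 200 delegates met', 'with President Clinton']
```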
Transfer Approaches
Syntactic Transfer:
–Analyze the SL input sentence into its syntactic structure (parse tree)
–Transfer the SL parse-tree to a TL parse-tree (various formalisms for specifying mappings)
–Generate the TL sentence from the TL parse-tree
Semantic Transfer:
–Analyze SL input to a language-specific semantic representation (e.g., Case Frames, Logical Form)
–Transfer the SL semantic representation to a TL semantic representation
–Generate syntactic structure and then the surface sentence in the TL
Transfer Approaches: Main Advantages and Disadvantages
Syntactic Transfer:
–No need for semantic analysis and generation
–Syntactic structures are general, not domain specific → less domain dependent, can handle open domains
–Requires a word translation lexicon
Semantic Transfer:
–Requires deeper analysis and generation; symbolic representation of concepts and predicates is difficult to construct for open or unlimited domains
–Can better handle non-compositional meaning structures → can be more accurate
–No word translation lexicon – generate in the TL from symbolic concepts
Knowledge-based Interlingual MT
The classic "deep" Artificial Intelligence approach:
–Analyze the source language into a detailed symbolic representation of its meaning
–Generate this meaning in the target language
"Interlingua": one single meaning representation for all languages
–Nice in theory, but extremely difficult in practice
Interlingua versus Transfer
With interlingua, need only N parsers/generators instead of N² transfer systems
(Diagram: six languages L1–L6 pairwise connected by transfer arcs, versus the same six languages each connected only to a central interlingua.)
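A quick worked count behind this claim (the slide rounds the directed pair count N(N-1) up to N²):

```python
def components(n):
    # Directed transfer: one system per ordered language pair.
    transfer = n * (n - 1)          # the slide rounds this to N^2
    # Interlingua: one analyzer plus one generator per language.
    interlingua = 2 * n
    return transfer, interlingua

for n in (3, 6, 20):
    t, i = components(n)
    print(f"N={n}: transfer={t}, interlingua={i}")
# N=6: transfer=30, interlingua=12 – the gap grows quadratically.
```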
Multi-Engine MT
Apply several MT engines to each input in parallel
Create a combined translation from the individual translations
Goal is to combine strengths and avoid weaknesses, along all dimensions: domain limits, quality, development time/cost, run-time speed, etc.
Speech-to-Speech MT
Speech just makes MT (much) more difficult:
–Spoken language is messier: false starts, filled pauses, repetitions, out-of-vocabulary words; lack of punctuation and explicit sentence boundaries
–Current speech technology is far from perfect: need for speech recognition and synthesis in foreign languages
Robustness: MT quality degradation should be proportional to SR quality
Tight Integration: rather than separate sequential tasks, can SR + MT be integrated in ways that improve end-to-end performance?
Major Sources of Translation Problems
Lexical Differences:
–Multiple possible translations for an SL word, or difficulty expressing the SL word's meaning in a single TL word
Structural Differences:
–Syntax of the SL differs from syntax of the TL: word order, sentence and constituent structure
Differences in Mappings of Syntax to Semantics:
–Meaning in the TL is conveyed using a different syntactic structure than in the SL
Idioms and Constructions
Lexical Differences
SL word has several different meanings that translate differently into the TL
–Ex: financial bank vs. river bank
Lexical Gaps: SL word reflects a unique meaning that cannot be expressed by a single word in the TL
–Ex: English snub doesn't have a corresponding verb in French or German
TL has finer distinctions than SL → SL word should be translated differently in different contexts
–Ex: English wall can be German Wand (internal) or Mauer (external)
Lexical Differences
Lexical gaps – examples with no direct single-word equivalent in English:
–gratiner (v., French, "to cook with a cheese coating")
–ōtosanrin (n., Japanese, "three-wheeled truck or van")
Lexical Differences
(Figure: examples of cross-language lexical mismatch, from Hutchins & Somers.)
MT Handling of Lexical Differences
Direct MT and Syntactic Transfer:
–Lexical Transfer stage uses a bilingual lexicon
–An SL word can have multiple translation entries, possibly augmented with disambiguation features or probabilities
–Lexical Transfer can involve use of limited context (on the SL side, TL side, or both)
–Lexical Gaps can partly be addressed via phrasal lexicons
Semantic Transfer:
–Ambiguity of the SL word must be resolved during analysis → correct symbolic representation at the semantic level
–TL Generation must select the appropriate word or structure for correctly conveying the concept in the TL
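A minimal sketch of lexical transfer with context-based disambiguation features, as described above; the lexicon entries and feature sets are toy illustrations, not an actual system lexicon:

```python
# Each SL entry lists TL candidates with context features (word forms
# expected nearby) – a toy stand-in for a bilingual transfer lexicon.
lexicon = {
    "bank": [("Bank",  {"money", "account", "loan"}),   # financial sense
             ("Ufer",  {"river", "water", "shore"})],   # river sense
    "wall": [("Wand",  {"room", "inside", "paint"}),
             ("Mauer", {"city", "garden", "outside"})],
}

def lexical_transfer(word, context):
    """Pick the TL candidate whose features overlap the context most."""
    candidates = lexicon.get(word, [(word, set())])
    return max(candidates, key=lambda c: len(c[1] & set(context)))[0]

print(lexical_transfer("bank", "he opened an account at the bank".split()))
# → Bank
print(lexical_transfer("bank", "we sat on the bank of the river".split()))
# → Ufer
```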
Structural Differences
Syntax of the SL is different from syntax of the TL:
–Word order within constituents:
 English NPs: art adj n – "the big boy"
 Hebrew NPs: art n art adj – "ha yeled ha gadol"
–Constituent structure:
 English is SVO: Subj Verb Obj – "I saw the man"
 Modern Arabic is VSO: Verb Subj Obj
–Different verb syntax:
 Verb complexes in English vs. in German: "I can eat the apple" / "Ich kann den Apfel essen"
–Case marking and free constituent order:
 German and other languages mark case: "den Apfel esse ich" – the (acc) apple eat I (nom)
MT Handling of Structural Differences
Direct MT Approaches:
–No explicit treatment: phrasal lexicons and sentence-level matches or templates
Syntactic Transfer:
–Structural Transfer Grammars: trigger a rule by matching against the syntactic structure on the SL side; the rule specifies how to reorder and re-structure the syntactic constituents to reflect the syntax of the TL side
Semantic Transfer:
–SL semantic representation abstracts away from SL syntax to functional roles → done during analysis
–TL generation maps semantic structures to correct TL syntax
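A minimal sketch of a structural transfer rule in action, using the English→Hebrew NP reordering from the previous slide; the data representation is my own, not the actual transfer-grammar formalism:

```python
# A structural transfer rule: match an SL POS pattern, emit a TL order.
# tl_order lists SL slot indices; slot 0 appears twice because Hebrew
# repeats the article before the adjective (art n art adj).
rule = {"sl": ["DET", "ADJ", "N"],
        "tl_order": [0, 2, 0, 1]}       # the big boy -> ha yeled ha gadol

def apply_rule(rule, tagged):
    words, tags = zip(*tagged)
    if list(tags) != rule["sl"]:
        return None                     # rule does not trigger
    return [words[i] for i in rule["tl_order"]]

# SL words already passed through lexical transfer into Hebrew forms.
print(apply_rule(rule, [("ha", "DET"), ("gadol", "ADJ"), ("yeled", "N")]))
# → ['ha', 'yeled', 'ha', 'gadol']
```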
Syntax-to-Semantics Differences
Meaning in the TL is conveyed using a different syntactic structure than in the SL:
–Changes in the verb and its arguments
–Passive constructions
–Motion verbs and state verbs
–Case creation and case absorption
Main distinction from structural differences:
–Structural differences are mostly independent of lexical choices and their semantic meaning → addressed by transfer rules that are syntactic in nature
–Syntax-to-semantics mapping differences are meaning-specific: they require the presence of specific words (and meanings) in the SL
Syntax-to-Semantics Differences
Structure-change example:
I like swimming → "Ich schwimme gern" (lit: I swim gladly)
Syntax-to-Semantics Differences
Verb-argument example:
Jones likes the film. → "Le film plaît à Jones." (lit: "the film pleases to Jones")
Use of case roles can eliminate the need for this type of transfer:
–Jones = Experiencer
–film = Theme
Syntax-to-Semantics Differences
Passive constructions example – French reflexive passives:
Ces livres se lisent facilement
*"These books read themselves easily"
These books are easily read
Same Intention, Different Syntax
rigly bitiwgacny – my leg hurts
candy wagac fE rigly – I have pain in my leg
rigly bitiClimny – my leg hurts
fE wagac fE rigly – there is pain in my leg
rigly bitinqaH calya – my leg bothers on me
(Romanization of Arabic from CallHome Egypt.)
MT Handling of Syntax-to-Semantics Differences
Direct MT Approaches:
–No explicit treatment: phrasal lexicons and sentence-level matches or templates
Syntactic Transfer:
–"Lexicalized" Structural Transfer Grammars: trigger a rule by matching against "lexicalized" syntactic structure on the SL side (lexical and functional features); the rule specifies how to reorder and re-structure the syntactic constituents to reflect the syntax of the TL side
Semantic Transfer:
–SL semantic representation abstracts away from SL syntax to functional roles → done during analysis
–TL generation maps semantic structures to correct TL syntax
Example of Structural Transfer Rule (verb-argument)
(Figure: transfer rule for the verb-argument switch, from Hutchins & Somers.)
Semantic Transfer: Theta Structure (case roles)
(Figure from Hutchins & Somers.)
Abstracts away from grammatical functions
Looks more like a "semantic f-structure"
The basis for "semantic transfer"
Idioms and Constructions
Main distinction: the meaning of the whole is not directly compositional from the meaning of its sub-parts → no compositional translation
Examples:
–George is a bull in a china shop
–He kicked the bucket
–Can you please open the window?
Formulaic Utterances
Good night. → tisbaH cala xEr (lit: "waking up on good")
(Romanization of Arabic from CallHome Egypt.)
Constructions
Identifying speaker intention rather than literal meaning for formulaic and task-oriented sentences:
How about … → suggestion
Why don't you … → suggestion
Could you tell me … → request for information
I was wondering … → request for information
MT Handling of Constructions and Idioms
Direct MT Approaches:
–No explicit treatment: phrasal lexicons and sentence-level matches or templates
Syntactic Transfer:
–No effective treatment
–"Highly lexicalized" structural transfer rules can handle some constructions: trigger a rule by matching against the entire construction, including structure, on the SL side; the rule specifies how to generate the correct construction on the TL side
Semantic Transfer:
–Analysis must capture a non-compositional representation of the idiom or construction → specialized rules
–TL generation maps construction semantic structures to correct TL syntax and lexical words
Summary
Main challenges for current state-of-the-art MT approaches – coverage and accuracy:
–Acquiring broad-coverage, high-accuracy translation lexicons (for words and phrases)
–Learning syntactic mappings between languages from parallel word-aligned data
–Overcoming syntax-to-semantics differences and dealing with constructions
–Stronger target language modeling
AVENUE: Learning Transfer Rules
Develop new approaches for automatically acquiring syntactic MT transfer rules from small amounts of elicited, translated, and word-aligned data
–Specifically designed to bootstrap MT for languages for which only limited amounts of electronic resources are available (particularly indigenous minority languages)
–Use machine learning techniques to generalize transfer rules from specific translated examples
–Combine with decoding techniques from SMT to produce the best translation of new input from a lattice of translation segments
Languages: Hebrew, Hindi, Brazilian Portuguese, Mapudungun, Quechua
Multi-Engine MT
New approach developed over the past two years
Main ideas:
–Treat the original engines as "black boxes"
–Align the word and phrase correspondences between the translations
–Build a collection of synthetic combinations based on the aligned words and phrases
–Score the synthetic combinations based on a Language Model and confidence measures
–Select the top-scoring synthetic combination
Architecture issues: integrating "workflows" that produce multiple translations and then combine them with MEMT
–IBM's UIMA architecture
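A minimal sketch of the score-and-select step, assuming a toy bigram language model and illustrative per-hypothesis confidence values; the actual MEMT system's scoring is more elaborate:

```python
import math

def lm_logprob(sentence, bigram_logp, backoff=-6.0):
    """Toy bigram LM score; unseen bigrams get a flat backoff penalty."""
    words = ["<s>"] + sentence.split()
    return sum(bigram_logp.get((words[i - 1], words[i]), backoff)
               for i in range(1, len(words)))

def select_best(hypotheses, bigram_logp, lm_weight=1.0, conf_weight=1.0):
    """Combine LM score with engine confidence; return the top hypothesis."""
    def score(h):
        text, confidence = h
        return (lm_weight * lm_logprob(text, bigram_logp)
                + conf_weight * math.log(confidence))
    return max(hypotheses, key=score)[0]

# Synthetic combinations paired with illustrative confidence values.
hyps = [("venezuela ranked fifth in exporting oil", 0.6),
        ("venezuela is ranked fifth in oil export", 0.8)]
bigrams = {("<s>", "venezuela"): -1.0, ("venezuela", "is"): -1.5,
           ("is", "ranked"): -1.2, ("ranked", "fifth"): -0.8}
print(select_best(hyps, bigrams))   # → the second, better-modeled hypothesis
```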
Example
Sys1: feature prominently venezuela ranked fifth in exporting oil field in the world and eighth in production
Sys2: Venezuela is occupied by the fifth place to export oil in the world, eighth in production
Sys3: Venezuela the top ranked fifth in the oil export in the world and the eighth in the production
MEMT selected: venezuela is the top ranked fifth in the oil export in the world to eighth in production.
Example
Sys1: announced afghan authorities on Saturday reconstituted four intergovernmental committees accelerate the process of disarmament removal packing between fighters and pictures of war are still have enjoyed substantial influence
Sys2: The Afghan authorities on Saturday the formation of the four committees of government to speed up the process of disarmament demobilization of fighters of the leaders of the war who still have a significant influence.
Sys3: the authorities announced Saturday Afghan form four committees government accelerate the process of disarmament and complete disarmament and demobilization followed the leaders of the war who continues to enjoy considerable influence
MEMT selected: the afghan authorities on Saturday announced the formation of the four committees of government to speed up the process of disarmament and demobilization of fighters of the leaders of the war who still have a significant influence.
Automatic MT Evaluation
METEOR: new metric developed at CMU
Improves upon the BLEU metric developed by IBM and used extensively in recent years
Main ideas:
–Assess the similarity between a machine-produced translation and (several) human reference translations
–Similarity is based on word-to-word matching that matches: identical words, morphological variants of the same word (stemming), synonyms
–Similarity is a weighted combination of precision and recall
–Address fluency/grammaticality via a direct penalty: how well-ordered is the matching of the MT output with the reference?
Improved levels of correlation with human judgments of MT quality
The METEOR Metric
Example:
–Reference: "the Iraqi weapons are to be handed over to the army within two weeks"
–MT output: "in two weeks Iraq's weapons will give army"
Matching: Iraqi/Iraq's, weapons, army, two, weeks (5 unigram matches)
P = 5/8 = 0.625
R = 5/14 = 0.357
Fmean = 10*P*R/(9*P+R) = 0.3731
Fragmentation: 3 fragments over 5 matched words → frag = (3-1)/(5-1) = 0.50
Discounting factor: DF = 0.5 * frag³ = 0.0625
Final score: Fmean * (1 - DF) = 0.3731 * 0.9375 = 0.3498
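The slide's arithmetic, reproduced as a small function (this follows the slide's exact-match variant and its fragmentation formula; full METEOR also matches stems and synonyms):

```python
def meteor_score(matches, mt_len, ref_len, fragments):
    """Reproduce the slide's METEOR arithmetic (exact-match variant)."""
    p = matches / mt_len                    # unigram precision
    r = matches / ref_len                   # unigram recall
    fmean = 10 * p * r / (9 * p + r)        # recall-weighted harmonic mean
    frag = (fragments - 1) / (matches - 1)  # fragmentation of the matching
    penalty = 0.5 * frag ** 3               # discounting factor
    return fmean * (1 - penalty)

# 5 matched words, MT output of 8 words, reference of 14 words, 3 fragments.
print(round(meteor_score(5, 8, 14, 3), 4))  # → 0.3498, as on the slide
```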
Questions…
MT for Minority and Indigenous Languages: Challenges
Minimal amounts of parallel text
Possibly competing standards for orthography/spelling
Often relatively few trained linguists
Access to native informants is possible
Need to minimize development time and cost
Learning Transfer Rules for Languages with Limited Resources
Rationale:
–Large bilingual corpora are not available
–Bilingual native informant(s) can translate and align a small pre-designed elicitation corpus, using an elicitation tool
–The elicitation corpus is designed to be typologically comprehensive and compositional
–The transfer-rule engine and new learning approach support acquisition of generalized transfer rules from the data
English-Hindi Example
(Figure not preserved in the transcript.)
Why Machine Translation for Minority and Indigenous Languages?
Commercial MT is economically feasible for only a handful of major languages with large resources (corpora, human developers)
Is there hope for MT for languages with limited resources?
Benefits include:
–Better government access to indigenous communities (epidemics, crop failures, etc.)
–Better participation of indigenous communities in information-rich activities (health care, education, government) without giving up their languages
–Language preservation
–Civilian and military applications (disaster relief)
English-Chinese Example
(Figure not preserved in the transcript.)
Spanish-Mapudungun Example
(Figure not preserved in the transcript.)
English-Arabic Example
(Figure not preserved in the transcript.)
The Elicitation Corpus
Translated and aligned by a bilingual informant
Corpus consists of linguistically diverse constructions
Based on the elicitation and documentation work of field linguists (e.g. Comrie 1977, Bouquiaux 1992)
Organized compositionally: elicit simple structures first, then use them as building blocks
Goal: minimize size, maximize linguistic coverage
Transfer Rule Formalism
Rule components:
–Type information
–Part-of-speech/constituent information
–Alignments
–x-side constraints
–y-side constraints
–xy-constraints, e.g. ((Y1 AGR) = (X1 AGR))

; SL: the old man, TL: ha-ish ha-zaqen
NP::NP [DET ADJ N] -> [DET N DET ADJ]
(
 (X1::Y1) (X1::Y3) (X2::Y4) (X3::Y2)
 ((X1 AGR) = *3-SING)
 ((X1 DEF) = *DEF)
 ((X3 AGR) = *3-SING)
 ((X3 COUNT) = +)
 ((Y1 DEF) = *DEF)
 ((Y3 DEF) = *DEF)
 ((Y2 AGR) = *3-SING)
 ((Y2 GENDER) = (Y4 GENDER))
)
Transfer Rule Formalism (II)
The same rule, highlighting value constraints (e.g. ((X1 AGR) = *3-SING)) and agreement constraints (e.g. ((Y2 GENDER) = (Y4 GENDER))):

; SL: the old man, TL: ha-ish ha-zaqen
NP::NP [DET ADJ N] -> [DET N DET ADJ]
(
 (X1::Y1) (X1::Y3) (X2::Y4) (X3::Y2)
 ((X1 AGR) = *3-SING)
 ((X1 DEF) = *DEF)
 ((X3 AGR) = *3-SING)
 ((X3 COUNT) = +)
 ((Y1 DEF) = *DEF)
 ((Y3 DEF) = *DEF)
 ((Y2 AGR) = *3-SING)
 ((Y2 GENDER) = (Y4 GENDER))
)
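A minimal sketch of checking value and agreement constraints against feature structures represented as dicts; the representation is my own illustration, not the transfer engine's implementation:

```python
def check(constraints, fs):
    """Return True iff every constraint holds in feature structure fs.
    A constraint pairs a feature path with either a constant value
    (value constraint) or another feature path (agreement constraint)."""
    for lhs, rhs in constraints:
        lhs_val = fs.get(lhs)
        rhs_val = fs.get(rhs) if isinstance(rhs, tuple) else rhs
        if lhs_val != rhs_val:
            return False
    return True

constraints = [
    (("X1", "AGR"), "*3-SING"),              # value constraint
    (("Y2", "GENDER"), ("Y4", "GENDER")),    # agreement constraint
]
fs = {("X1", "AGR"): "*3-SING",
      ("Y2", "GENDER"): "masc", ("Y4", "GENDER"): "masc"}
print(check(constraints, fs))  # → True
```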
The Transfer Engine
Analysis: source text is parsed into its grammatical structure, which determines transfer application ordering.
Example: 他 看 书。 (he read book) → [S [NP [N 他]] [VP [V 看] [NP [N 书]]]]
Transfer: a target-language tree is created by reordering, insertion, and deletion: [S [NP [N he]] [VP [V read] [NP [DET a] [N book]]]]. The article "a" is inserted into the object NP; source words are translated with the transfer lexicon.
Generation: target-language constraints are checked and the final translation is produced; e.g., "reads" is chosen over "read" to agree with "he".
Final translation: "He reads a book"
Rule Learning – Overview
Goal: acquire syntactic transfer rules
Use available knowledge from the source side (grammatical structure)
Three steps:
1. Flat Seed Generation: first guesses at transfer rules; flat syntactic structure
2. Compositionality: use previously learned rules to add hierarchical structure
3. Seeded Version Space Learning: refine rules by learning appropriate feature constraints
Flat Seed Rule Generation
Learning example: NP
Eng: the big apple
Heb: ha-tapuax ha-gadol
Generated seed rule:
NP::NP [ART ADJ N] -> [ART N ART ADJ]
((X1::Y1) (X1::Y3) (X2::Y4) (X3::Y2))
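A minimal sketch of deriving the seed rule's POS sequences and alignment list from the tagged, word-aligned example above; tagging and alignments are given as inputs, as the next slide's table specifies, and the NP::NP label is fixed for this example:

```python
def flat_seed(sl_tagged, tl_tagged, word_aligns):
    """Abstract a word-aligned sentence pair to a POS-level flat rule."""
    sl_pos = [pos for _, pos in sl_tagged]
    tl_pos = [pos for _, pos in tl_tagged]
    aligns = [f"(X{i + 1}::Y{j + 1})" for i, j in word_aligns]
    head = f"NP::NP [{' '.join(sl_pos)}] -> [{' '.join(tl_pos)}]"
    return head + " (" + " ".join(aligns) + ")"

eng = [("the", "ART"), ("big", "ADJ"), ("apple", "N")]
heb = [("ha", "ART"), ("tapuax", "N"), ("ha", "ART"), ("gadol", "ADJ")]
print(flat_seed(eng, heb, [(0, 0), (0, 2), (1, 3), (2, 1)]))
# → NP::NP [ART ADJ N] -> [ART N ART ADJ] ((X1::Y1) (X1::Y3) (X2::Y4) (X3::Y2))
```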
Flat Seed Generation
Create a transfer rule that is specific to the sentence pair, but abstracted to the POS level. No syntactic structure.

Element             | Source
--------------------|-------------------------------------------------
SL POS sequence     | f-structure
TL POS sequence     | TL dictionary, aligned SL words
Type information    | corpus, same on SL and TL
Alignments          | informant
x-side constraints  | f-structure
y-side constraints  | TL dictionary, aligned SL words (list of projecting features)
Compositionality
Initial flat rules:
S::S [ART ADJ N V ART N] -> [ART N ART ADJ V P ART N]
((X1::Y1) (X1::Y3) (X2::Y4) (X3::Y2) (X4::Y5) (X5::Y7) (X6::Y8))
NP::NP [ART ADJ N] -> [ART N ART ADJ]
((X1::Y1) (X1::Y3) (X2::Y4) (X3::Y2))
NP::NP [ART N] -> [ART N]
((X1::Y1) (X2::Y2))
Generated compositional rule:
S::S [NP V NP] -> [NP V P NP]
((X1::Y1) (X2::Y2) (X3::Y4))
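A minimal sketch of the compositionality step on the SL side only: occurrences of a lower-level rule's POS sequence are folded into its constituent label (the real step also adjusts the TL side, alignments, and constraints):

```python
def compose(flat_sl, sub_sl, label):
    """Replace each occurrence of sub_sl in flat_sl with label."""
    out, i = [], 0
    while i < len(flat_sl):
        if flat_sl[i:i + len(sub_sl)] == sub_sl:
            out.append(label)           # fold the chunk into one constituent
            i += len(sub_sl)
        else:
            out.append(flat_sl[i])
            i += 1
    return out

seed = ["ART", "ADJ", "N", "V", "ART", "N"]
step1 = compose(seed, ["ART", "ADJ", "N"], "NP")  # → ['NP', 'V', 'ART', 'N']
step2 = compose(step1, ["ART", "N"], "NP")        # → ['NP', 'V', 'NP']
print(step2)  # matches the S::S [NP V NP] rule above
```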
Compositionality – Overview
Traverse the c-structure of the English sentence; add compositional structure for translatable chunks
Adjust constituent sequences and alignments
Remove unnecessary constraints, i.e. those that are contained in the lower-level rule
Seeded Version Space Learning
Input: rules and their example sets
S::S [NP V NP] -> [NP V P NP] {ex1,ex12,ex17,ex26}
((X1::Y1) (X2::Y2) (X3::Y4))
NP::NP [ART ADJ N] -> [ART N ART ADJ] {ex2,ex3,ex13}
((X1::Y1) (X1::Y3) (X2::Y4) (X3::Y2))
NP::NP [ART N] -> [ART N] {ex4,ex5,ex6,ex8,ex10,ex11}
((X1::Y1) (X2::Y2))
Output: rules with feature constraints
S::S [NP V NP] -> [NP V P NP]
((X1::Y1) (X2::Y2) (X3::Y4)
 ((X1 NUM) = (X2 NUM))
 ((Y1 NUM) = (Y2 NUM))
 ((X1 NUM) = (Y1 NUM)))
Seeded Version Space Learning: Overview
Goal: add appropriate feature constraints to the acquired rules
Methodology:
–Preserve general structural transfer
–Learn specific feature constraints from the example set
Seed rules are grouped into clusters of similar transfer structure (type, constituent sequences, alignments)
Each cluster forms a version space: a partially ordered hypothesis space with a specific and a general boundary
The seed rules in a group form the specific boundary of a version space
The general boundary is the (implicit) transfer rule with the same type, constituent sequences, and alignments, but no feature constraints
Seeded Version Space Learning: Generalization
The partial order of the version space:
–Definition: a transfer rule tr1 is strictly more general than another transfer rule tr2 if all f-structures that are satisfied by tr2 are also satisfied by tr1.
Generalize rules by merging them:
–Deletion of a constraint
–Raising two value constraints to an agreement constraint, e.g.
 ((x1 num) = *pl), ((x3 num) = *pl) → ((x1 num) = (x3 num))
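A minimal sketch of one merge step under these two generalization operations; eagerly raising every shared constant pair to an agreement constraint is a simplification of the actual search:

```python
def merge(c1, c2):
    """Merge two constraint sets: drop unshared constraints, then raise
    shared constant values on multiple paths to agreement constraints."""
    shared = c1 & c2                    # deletion of unshared constraints
    merged = set(shared)
    by_value = {}
    for path, val in shared:
        by_value.setdefault(val, []).append(path)
    for val, paths in by_value.items():
        if len(paths) > 1:
            # Raise value constraints to an agreement constraint.
            for p in paths:
                merged.discard((p, val))
            merged.add((paths[0], paths[1]))
    return merged

r1 = {(("x1", "num"), "*pl"), (("x3", "num"), "*pl")}
r2 = {(("x1", "num"), "*pl"), (("x3", "num"), "*pl"), (("x1", "def"), "*def")}
print(merge(r1, r2))
# → {(('x1', 'num'), ('x3', 'num'))}  (pair order may vary)
```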
Seeded Version Space Learning
(Diagram: clusters of seed rules, e.g. NP and VP rules over sequences like v det n, arranged in a version space.)
1. Group seed rules into version spaces as above.
2. Make use of the partial order of rules in the version space. The partial order is defined via the f-structures satisfying the constraints.
3. Generalize in the space by repeated merging of rules:
–Deletion of a constraint
–Moving value constraints to agreement constraints, e.g. ((x1 num) = *pl), ((x3 num) = *pl) → ((x1 num) = (x3 num))
4. Check the translation power of the generalized rules against the sentence pairs.
Examples of Learned Rules (Hindi-to-English)
{NP,14244} ;; Score: 0.0429
NP::NP [N] -> [DET N]
((X1::Y2))
{NP,14434} ;; Score: 0.0040
NP::NP [ADJ CONJ ADJ N] -> [ADJ CONJ ADJ N]
((X1::Y1) (X2::Y2) (X3::Y3) (X4::Y4))
{PP,4894} ;; Score: 0.0470
PP::PP [NP POSTP] -> [PREP NP]
((X2::Y1) (X1::Y2))
Manual Transfer Rules: Hindi Example
;; PASSIVE OF SIMPLE PAST (NO AUX) WITH LIGHT VERB
;; passive of 43 (7b)
{VP,28}
VP::VP : [V V V] -> [Aux V]
(
 (X1::Y2)
 ((x1 form) = root)
 ((x2 type) =c light)
 ((x2 form) = part)
 ((x2 aspect) = perf)
 ((x3 lexwx) = 'jAnA')
 ((x3 form) = part)
 ((x3 aspect) = perf)
 (x0 = x1)
 ((y1 lex) = be)
 ((y1 tense) = past)
 ((y1 agr num) = (x3 agr num))
 ((y1 agr pers) = (x3 agr pers))
 ((y2 form) = part)
)
Manual Transfer Rules: Example
; NP1 ke NP2 -> NP2 of NP1
; Ex: jIvana ke eka aXyAya
; life of (one) chapter
; ==> a chapter of life
{NP,12}
NP::NP : [PP NP1] -> [NP1 PP]
(
 (X1::Y2)
 (X2::Y1)
 ; ((x2 lexwx) = 'kA')
)
{NP,13}
NP::NP : [NP1] -> [NP1]
( (X1::Y1) )
{PP,12}
PP::PP : [NP Postp] -> [Prep NP]
(
 (X1::Y2)
 (X2::Y1)
)
(Diagram: source NP tree for "jIvana ke eka aXyAya" and target NP tree for "one chapter of life".)