A Trainable Transfer-based MT Approach for Languages with Limited Resources Alon Lavie Language Technologies Institute Carnegie Mellon University Joint.

Slides:

Advertisements

Similar presentations

Improving Machine Translation Quality via Hybrid Systems and Refined Evaluation Methods Andreas Eisele DFKI GmbH and Saarland University Helsinki, November.

Advertisements

Enabling MT for Languages with Limited Resources Alon Lavie Language Technologies Institute Carnegie Mellon University.

NICE: Native language Interpretation and Communication Environment Lori Levin, Jaime Carbonell, Alon Lavie, Ralf Brown Carnegie Mellon University.

The current status of Chinese-English EBMT research -where are we now Joy, Ralf Brown, Robert Frederking, Erik Peterson Aug 2001.

Automatic Rule Learning for Resource-Limited Machine Translation Alon Lavie, Katharina Probst, Erik Peterson, Jaime Carbonell, Lori Levin, Ralf Brown Language.

1 Improving a Statistical MT System with Automatically Learned Rewrite Patterns Fei Xia and Michael McCord (Coling 2004) UW Machine Translation Reading.

Course Summary LING 575 Fei Xia 03/06/07. Outline Introduction to MT: 1 Major approaches –SMT: 3 –Transfer-based MT: 2 –Hybrid systems: 2 Other topics.

Semi-Automatic Learning of Transfer Rules for Machine Translation of Low-Density Languages Katharina Probst April 5, 2002.

MT Summit VIII, Language Technologies Institute School of Computer Science Carnegie Mellon University Pre-processing of Bilingual Corpora for Mandarin-English.

Does Syntactic Knowledge help English- Hindi SMT ? Avinesh. PVS. K. Taraka Rama, Karthik Gali.

Language Technologies Institute School of Computer Science Carnegie Mellon University NSF August 6, 2001 NICE: Native language Interpretation and Communication.

Czech-to-English Translation: MT Marathon 2009 Session Preview Jonathan Clark Greg Hanneman Language Technologies Institute Carnegie Mellon University.

Building NLP Systems for Two Resource Scarce Indigenous Languages: Mapudungun and Quechua, and some other languages Christian Monson, Ariadna Font Llitjós,

Profile The METIS Approach Future Work Evaluation METIS II Architecture METIS II, the continuation of the successful assessment project METIS I, is an.

Machine Translation Overview Alon Lavie Language Technologies Institute Carnegie Mellon University LTI Open House March 24, 2006.

Can Controlled Language Rules increase the value of MT? Fred Hollowood & Johann Rotourier Symantec Dublin.

Transfer-based MT with Strong Decoding for a Miserly Data Scenario Alon Lavie Language Technologies Institute Carnegie Mellon University Joint work with:

Coping with Surprise: Multiple CMU MT Approaches Alon Lavie Lori Levin, Jaime Carbonell, Alex Waibel, Stephan Vogel, Ralf Brown, Robert Frederking Language.

Statistical XFER: Hybrid Statistical Rule-based Machine Translation Alon Lavie Language Technologies Institute Carnegie Mellon University Joint work with:

Recent Major MT Developments at CMU Briefing for Joe Olive February 5, 2008 Alon Lavie and Stephan Vogel Language Technologies Institute Carnegie Mellon.

Dependency Tree-to-Dependency Tree Machine Translation November 4, 2011 Presented by: Jeffrey Flanigan (CMU) Lori Levin, Jaime Carbonell In collaboration.

Reordering Model Using Syntactic Information of a Source Tree for Statistical Machine Translation Kei Hashimoto, Hirohumi Yamamoto, Hideo Okuma, Eiichiro.

Approaches to Machine Translation CSC 5930 Machine Translation Fall 2012 Dr. Tom Way.

Rapid Prototyping of a Transfer-based Hebrew-to-English Machine Translation System Alon Lavie Language Technologies Institute Carnegie Mellon University.

Rule Learning - Overview Goal: Syntactic Transfer Rules 1) Flat Seed Generation: produce rules from word- aligned sentence pairs, abstracted only to POS.

Hindi SLE Debriefing AVENUE Transfer System July 3, 2003.

AMTEXT: Extraction-based MT for Arabic Faculty: Alon Lavie, Jaime Carbonell Students and Staff: Laura Kieras, Peter Jansen Informant: Loubna El Abadi.

Transfer-based MT with Strong Decoding for a Miserly Data Scenario Alon Lavie Language Technologies Institute Carnegie Mellon University Joint work with:

MEMT: Multi-Engine Machine Translation Faculty: Alon Lavie, Robert Frederking, Ralf Brown, Jaime Carbonell Students: Shyamsundar Jayaraman, Satanjeev Banerjee.

AVENUE Automatic Machine Translation for low-density languages Ariadna Font Llitjós Language Technologies Institute SCS Carnegie Mellon University.

Ideas for 100K Word Data Set for Human and Machine Learning Lori Levin Alon Lavie Jaime Carbonell Language Technologies Institute Carnegie Mellon University.

Carnegie Mellon Goal Recycle non-expert post-editing efforts to: - Refine translation rules automatically - Improve overall translation quality Proposed.

Hebrew-to-English XFER MT Project - Update Alon Lavie June 2, 2004.

An Overview of the AVENUE Project Presented by Lori Levin Language Technologies Institute School of Computer Science Carnegie Mellon University Pittsburgh,

Machine Translation Overview Alon Lavie Language Technologies Institute Carnegie Mellon University Open House March 18, 2005.

Language Technologies Institute School of Computer Science Carnegie Mellon University NSF, August 6, 2001 Machine Translation for Indigenous Languages.

Coping with Surprise: Multiple CMU MT Approaches Alon Lavie Lori Levin, Jaime Carbonell, Alex Waibel, Stephan Vogel, Ralf Brown, Robert Frederking Language.

MEMT: Multi-Engine Machine Translation Faculty: Alon Lavie, Jaime Carbonell Students and Staff: Gregory Hanneman, Justin Merrill (Shyamsundar Jayaraman,

A Trainable Transfer-based MT Approach for Languages with Limited Resources Alon Lavie Language Technologies Institute Carnegie Mellon University Joint.

The CMU Mill-RADD Project: Recent Activities and Results Alon Lavie Language Technologies Institute Carnegie Mellon University.

Statistical Machine Translation Part II: Word Alignments and EM Alex Fraser Institute for Natural Language Processing University of Stuttgart

Error Analysis of Two Types of Grammar for the purpose of Automatic Rule Refinement Ariadna Font Llitjós, Katharina Probst, Jaime Carbonell Language Technologies.

Approaching a New Language in Machine Translation Anna Sågvall Hein, Per Weijnitz.

Machine Translation Overview Alon Lavie Language Technologies Institute Carnegie Mellon University August 25, 2004.

Large Vocabulary Data Driven MT: New Developments in the CMU SMT System Stephan Vogel, Alex Waibel Work done in collaboration with: Ying Zhang, Alicia.

Bridging the Gap: Machine Translation for Lesser Resourced Languages

Avenue Architecture Learning Module Learned Transfer Rules Lexical Resources Run Time Transfer System Decoder Translation Correction Tool Word- Aligned.

October 10, 2003BLTS Kickoff Meeting1 Transfer with Strong Decoding Learning Module Transfer Rules {PP,4894} ;;Score: PP::PP [NP POSTP] -> [PREP.

CMU Statistical-XFER System Hybrid “rule-based”/statistical system Scaled up version of our XFER approach developed for low-resource languages Large-coverage.

Eliciting a corpus of word- aligned phrases for MT Lori Levin, Alon Lavie, Erik Peterson Language Technologies Institute Carnegie Mellon University.

Seed Generation and Seeded Version Space Learning Version 0.02 Katharina Probst Feb 28,2002.

CMU MilliRADD Small-MT Report TIDES PI Meeting 2002 The CMU MilliRADD Team: Jaime Carbonell, Lori Levin, Ralf Brown, Stephan Vogel, Alon Lavie, Kathrin.

AVENUE: Machine Translation for Resource-Poor Languages NSF ITR

Developing affordable technologies for resource-poor languages Ariadna Font Llitjós Language Technologies Institute Carnegie Mellon University September.

FROM BITS TO BOTS: Women Everywhere, Leading the Way Lenore Blum, Anastassia Ailamaki, Manuela Veloso, Sonya Allin, Bernardine Dias, Ariadna Font Llitjós.

Nov 17, 2005Learning-based MT1 Learning-based MT Approaches for Languages with Limited Resources Alon Lavie Language Technologies Institute Carnegie Mellon.

A Simple English-to-Punjabi Translation System By : Shailendra Singh.

MEMT: Multi-Engine Machine Translation Faculty: Alon Lavie, Robert Frederking, Ralf Brown, Jaime Carbonell Students: Shyamsundar Jayaraman, Satanjeev Banerjee.

Semi-Automatic Learning of Transfer Rules for Machine Translation of Minority Languages Katharina Probst Language Technologies Institute Carnegie Mellon.

Enabling MT for Languages with Limited Resources Alon Lavie and Lori Levin Language Technologies Institute Carnegie Mellon University.

LingWear Language Technology for the Information Warrior Alex Waibel, Lori Levin Alon Lavie, Robert Frederking Carnegie Mellon University.

The AVENUE Project: Automatic Rule Learning for Resource-Limited Machine Translation Faculty: Alon Lavie, Jaime Carbonell, Lori Levin, Ralf Brown Students:

Eliciting a corpus of word-aligned phrases for MT

Approaches to Machine Translation

Faculty: Alon Lavie, Jaime Carbonell, Lori Levin, Ralf Brown Students:

Urdu-to-English Stat-XFER system for NIST MT Eval 2008

Alon Lavie, Jaime Carbonell, Lori Levin,

Vamshi Ambati 14 Sept 2007 Student Research Symposium

Approaches to Machine Translation

Presentation transcript:

A Trainable Transfer-based MT Approach for Languages with Limited Resources Alon Lavie Language Technologies Institute Carnegie Mellon University Joint Work with: Lori Levin, Jaime Carbonell, Katharina Probst, Erik Peterson, Stephan Vogel and Ariadna Font-Llitjos

26 April, 2004EAMT Meeting/ Malta2 Why Machine Translation for Languages with Limited Resources? Commercial MT economically feasible for only a handful of major languages with large resources (corpora, human developers) Statistical MT looks promising – but requires very large volumes of parallel texts Is there hope for MT for languages with limited electronic data resources? Benefits include: –Better government information access to indigenous communities –Better indigenous communities participation in information-rich activities (health care, education, government) without giving up their languages. –Civilian and military applications (disaster relief) –Language preservation

26 April, 2004EAMT Meeting/ Malta3 MT for Languages with Limited Resources: Challenges Minimal amount of parallel text Possibly lack of standards for orthography/spelling Often relatively few trained linguists Access to native bilingual informants possible No real economic incentive, Limited financial resources for developing MT –Need to minimize development time and cost

26 April, 2004EAMT Meeting/ Malta4 AVENUE Partners LanguageCountryInstitutions Mapudungun (in place) Chile Universidad de la Frontera, Institute for Indigenous Studies, Ministry of Education Quechua (discussion) Peru Ministry of Education Aymara (discussion) Bolivia, Peru Ministry of Education

26 April, 2004EAMT Meeting/ Malta5 AVENUE: Two Technical Approaches Generalized EBMT Parallel text 50K- 2MB (uncontrolled corpus) Rapid implementation Proven for major L’s with reduced data Transfer-rule learning Elicitation (controlled) corpus to extract grammatical properties Seeded version- space learning

26 April, 2004EAMT Meeting/ Malta6 Transfer with Strong Decoding Learning Module Transfer Rules {PP,4894} ;;Score: PP::PP [NP POSTP] -> [PREP NP] ((X2::Y1) (X1::Y2)) Translation Lexicon Run Time Transfer System Lattice Decoder English Language Model Word-to-Word Translation Probabilities Word-aligned elicited data SL input TL output

26 April, 2004EAMT Meeting/ Malta7 Learning Transfer-Rules for Languages with Limited Resources Rationale: –Large bilingual corpora not available –Bilingual native informant(s) can translate and align a small pre-designed elicitation corpus, using elicitation tool –Elicitation corpus designed to be typologically comprehensive and compositional –Transfer-rule engine and new learning approach support acquisition of generalized transfer-rules from the data

26 April, 2004EAMT Meeting/ Malta8 English-Hindi Example

26 April, 2004EAMT Meeting/ Malta9 Spanish-Mapudungun Example

26 April, 2004EAMT Meeting/ Malta10 English-Arabic Example

26 April, 2004EAMT Meeting/ Malta11 The Elicitation Corpus Translated, aligned by bilingual informant Rich information about the sentences elicited Corpus consists of linguistically diverse constructions Based on elicitation and documentation work of field linguists (e.g. Comrie 1977, Bouquiaux 1992) Organized compositionally: elicit simple structures first, then use them as building blocks Goal: minimize size, maximize linguistic coverage Typological EC currently of about ~1000 sentences Work in progress: –Feature Detection –Navigation control through the corpus during elicitation –Extensions to phenomena not currently covered –Experimenting with alternative types of elicited data

26 April, 2004EAMT Meeting/ Malta12 Transfer Rule Formalism Type information Part-of-speech/constituent information Alignments x-side constraints y-side constraints xy-constraints, e.g. ((Y1 AGR) = (X1 AGR)) ; SL: the old man, TL: ha-ish ha-zaqen NP::NP [DET ADJ N] -> [DET N DET ADJ] ( (X1::Y1) (X1::Y3) (X2::Y4) (X3::Y2) ((X1 AGR) = *3-SING) ((X1 DEF = *DEF) ((X3 AGR) = *3-SING) ((X3 COUNT) = +) ((Y1 DEF) = *DEF) ((Y3 DEF) = *DEF) ((Y2 AGR) = *3-SING) ((Y2 GENDER) = (Y4 GENDER)) )

26 April, 2004EAMT Meeting/ Malta13 Transfer Rule Formalism (II) Value constraints Agreement constraints ;SL: the old man, TL: ha-ish ha-zaqen NP::NP [DET ADJ N] -> [DET N DET ADJ] ( (X1::Y1) (X1::Y3) (X2::Y4) (X3::Y2) ((X1 AGR) = *3-SING) ((X1 DEF = *DEF) ((X3 AGR) = *3-SING) ((X3 COUNT) = +) ((Y1 DEF) = *DEF) ((Y3 DEF) = *DEF) ((Y2 AGR) = *3-SING) ((Y2 GENDER) = (Y4 GENDER)) )

26 April, 2004EAMT Meeting/ Malta14 The Transfer Engine Analysis Source text is parsed into its grammatical structure. Determines transfer application ordering. Example: 他看书。 (he read book) S NP VP N V NP 他看书 Transfer A target language tree is created by reordering, insertion, and deletion. S NP VP N V NP he read DET N a book Article “a” is inserted into object NP. Source words translated with transfer lexicon. Generation Target language constraints are checked and final translation produced. E.g. “reads” is chosen over “read” to agree with “he”. Final translation: “He reads a book”

26 April, 2004EAMT Meeting/ Malta15 Rule Learning - Overview Goal: Acquire Syntactic Transfer Rules Use available knowledge from the source side (grammatical structure) Three steps: 1.Flat Seed Generation: first guesses at transfer rules; flat syntactic structure 2.Compositionality: use previously learned rules to add hierarchical structure 3.Seeded Version Space Learning: refine rules by learning appropriate feature constraints

26 April, 2004EAMT Meeting/ Malta16 Flat Seed Rule Generation Learning Example: NP Eng: the big apple Heb: ha-tapuax ha-gadol Generated Seed Rule: NP::NP [ART ADJ N]  [ART N ART ADJ] ((X1::Y1) (X1::Y3) (X2::Y4) (X3::Y2))

26 April, 2004EAMT Meeting/ Malta17 Compositionality Initial Flat Rules: S::S [ART ADJ N V ART N]  [ART N ART ADJ V P ART N] ((X1::Y1) (X1::Y3) (X2::Y4) (X3::Y2) (X4::Y5) (X5::Y7) (X6::Y8)) NP::NP [ART ADJ N]  [ART N ART ADJ] ((X1::Y1) (X1::Y3) (X2::Y4) (X3::Y2)) NP::NP [ART N]  [ART N] ((X1::Y1) (X2::Y2)) Generated Compositional Rule: S::S [NP V NP]  [NP V P NP] ((X1::Y1) (X2::Y2) (X3::Y4))

26 April, 2004EAMT Meeting/ Malta18 Compositionality - Overview Traverse the c-structure of the English sentence, add compositional structure for translatable chunks Adjust constituent sequences, alignments in the transfer rule

26 April, 2004EAMT Meeting/ Malta19 Seeded Version Space Learning Input: Rules and their Example Sets S::S [NP V NP]  [NP V P NP] {ex1,ex12,ex17,ex26} ((X1::Y1) (X2::Y2) (X3::Y4)) NP::NP [ART ADJ N]  [ART N ART ADJ] {ex2,ex3,ex13} ((X1::Y1) (X1::Y3) (X2::Y4) (X3::Y2)) NP::NP [ART N]  [ART N] {ex4,ex5,ex6,ex8,ex10,ex11} ((X1::Y1) (X2::Y2)) Output: Rules with Feature Constraints: S::S [NP V NP]  [NP V P NP] ((X1::Y1) (X2::Y2) (X3::Y4) (X1 NUM = X2 NUM) (Y1 NUM = Y2 NUM) (X1 NUM = Y1 NUM))

26 April, 2004EAMT Meeting/ Malta20 Seeded Version Space Learning: Overview Goal: add appropriate feature constraints to the acquired rules Methodology: –Preserve general structural transfer –Learn specific feature constraints from example set Seed rules are grouped into clusters of similar transfer structure (type, constituent sequences, alignments) Each cluster forms a version space: a partially ordered hypothesis space with a specific and a general boundary The seed rules in a group form the specific boundary of a version space The general boundary is the (implicit) transfer rule with the same type, constituent sequences, and alignments, but no feature constraints

26 April, 2004EAMT Meeting/ Malta21 Examples of Automatically Learned Rules (Hindi-to-English) {NP,14244} ;;Score: NP::NP [N] -> [DET N] ( (X1::Y2) ) {NP,14434} ;;Score: NP::NP [ADJ CONJ ADJ N] -> [ADJ CONJ ADJ N] ( (X1::Y1) (X2::Y2) (X3::Y3) (X4::Y4) ) {PP,4894} ;;Score: PP::PP [NP POSTP] -> [PREP NP] ( (X2::Y1) (X1::Y2) )

26 April, 2004EAMT Meeting/ Malta22 Manual Transfer Rules: Hindi Example ;; PASSIVE OF SIMPLE PAST (NO AUX) WITH LIGHT VERB ;; passive of 43 (7b) {VP,28} VP::VP : [V V V] -> [Aux V] ( (X1::Y2) ((x1 form) = root) ((x2 type) =c light) ((x2 form) = part) ((x2 aspect) = perf) ((x3 lexwx) = 'jAnA') ((x3 form) = part) ((x3 aspect) = perf) (x0 = x1) ((y1 lex) = be) ((y1 tense) = past) ((y1 agr num) = (x3 agr num)) ((y1 agr pers) = (x3 agr pers)) ((y2 form) = part) )

26 April, 2004EAMT Meeting/ Malta23 Manual Transfer Rules: Example ; NP1 ke NP2 -> NP2 of NP1 ; Ex: jIvana ke eka aXyAya ; life of (one) chapter ; ==> a chapter of life ; {NP,12} NP::NP : [PP NP1] -> [NP1 PP] ( (X1::Y2) (X2::Y1) ; ((x2 lexwx) = 'kA') ) {NP,13} NP::NP : [NP1] -> [NP1] ( (X1::Y1) ) {PP,12} PP::PP : [NP Postp] -> [Prep NP] ( (X1::Y2) (X2::Y1) ) NP PP NP1 NP P Adj N N1 ke eka aXyAya N jIvana NP NP1 PP Adj N P NP one chapter of N1 N life

26 April, 2004EAMT Meeting/ Malta24 A Limited Data Scenario for Hindi-to-English Conducted during a DARPA “Surprise Language Exercise” (SLE) in June 2003 Put together a scenario with “miserly” data resources: –Elicited Data corpus: phrases –Cleaned portion (top 12%) of LDC dictionary: ~2725 Hindi words (23612 translation pairs) –Manually acquired resources during the SLE: 500 manual bigram translations 72 manually written phrase transfer rules 105 manually written postposition rules 48 manually written time expression rules No additional parallel text!!

26 April, 2004EAMT Meeting/ Malta25 Manual Grammar Development Covers mostly NPs, PPs and VPs (verb complexes) ~70 grammar rules, covering basic and recursive NPs and PPs, verb complexes of main tenses in Hindi (developed in two weeks)

26 April, 2004EAMT Meeting/ Malta26 Adding a “Strong” Decoder XFER system produces a full lattice of translation fragments, ranging from single words to long phrases or sentences Edges are scored using word-to-word translation probabilities, trained from the limited bilingual data Decoder uses an English LM (70m words) Decoder can also reorder words or phrases (up to 4 positions ahead) For XFER (strong), ONLY edges from basic XFER system are used!

26 April, 2004EAMT Meeting/ Malta27 Testing Conditions Tested on section of JHU provided data: 258 sentences with four reference translations –SMT system (stand-alone) –EBMT system (stand-alone) –XFER system (naïve decoding) –XFER system with “strong” decoder No grammar rules (baseline) Manually developed grammar rules Automatically learned grammar rules –XFER+SMT with strong decoder (MEMT)

26 April, 2004EAMT Meeting/ Malta28 Automatic MT Evaluation Metrics Intend to replace or complement human assessment of translation quality of MT produced translation Principle idea: compare how similar is the MT produced translation with human reference translation(s) of the same input Main metric in use today: IBM’s BLEU –Count n-gram (unigrams, bigrams, trigrams, etc) overlap between the MT output and several reference translations –Calculate a combined n-gram precision score NIST variant of BLEU used for official DARPA evaluations

26 April, 2004EAMT Meeting/ Malta29 Results on JHU Test Set SystemBLEUM-BLEUNIST EBMT SMT XFER (naïve) man grammar XFER (strong) no grammar XFER (strong) learned grammar XFER (strong) man grammar XFER+SMT

26 April, 2004EAMT Meeting/ Malta30 Effect of Reordering in the Decoder

26 April, 2004EAMT Meeting/ Malta31 Observations and Lessons (I) XFER with strong decoder outperformed SMT even without any grammar rules in the miserly data scenario –SMT Trained on elicited phrases that are very short –SMT has insufficient data to train more discriminative translation probabilities –XFER takes advantage of Morphology Token coverage without morphology: Token coverage with morphology: Manual grammar currently somewhat better than automatically learned grammar –Learned rules did not yet use version-space learning –Large room for improvement on learning rules –Importance of effective well-founded scoring of learned rules

26 April, 2004EAMT Meeting/ Malta32 Observations and Lessons (II) MEMT (XFER and SMT) based on strong decoder produced best results in the miserly scenario. Reordering within the decoder provided very significant score improvements –Much room for more sophisticated grammar rules –Strong decoder can carry some of the reordering “burden”

26 April, 2004EAMT Meeting/ Malta33 XFER MT for Hebrew-to-English Two month intensive effort to apply our XFER approach to the development of a Hebrew-to-English MT system Challenges: –No large parallel corpus –Limited coverage translation lexicon –Rich Morphology: incomplete analyzer available Plan: –Collect available resources, establish methodology for processing Hebrew input –Translate and align Elicitation Corpus –Learn XFER rules –Develop (small) manual XFER grammar as a point of comparison –System debugging and development –Evaluate performance on unseen test data using automatic evaluation metrics

26 April, 2004EAMT Meeting/ Malta34 Hebrew-to-English XFER System Accomplished: –Baseline system in place –Good lexical coverage: translation pairs –Reasonable morphological coverage –Small manual grammar: 29 rules, mostly NPs –Translated and aligned elicitation corpora –Learning of automatic grammar –Testing and development on dev-test in progress –Results on unseen data within a couple of weeks… Translation Example: in agreement with the interior ministry that copy fund will come to Haaretz agreed hotel homes to do all efforts to remove the african employees from Israel within days from the arrival of the new workers and to let people activities immigration police

26 April, 2004EAMT Meeting/ Malta35 Conclusions Transfer rules (both manual and learned) offer significant contributions that can outperform existing data-driven approaches –Also in medium and large data settings? Initial steps to development of a well-grounded transfer-based MT system with: –Translation segments that are scored based on a well-founded probability model –Strong and effective decoding that incorporates the most advanced techniques used in SMT decoding Working from the “opposite” end of research on incorporating models of syntax into “standard” SMT systems [Knight et al] Our direction makes sense in the limited data scenario

26 April, 2004EAMT Meeting/ Malta36 Future Directions Continued work on automatic rule learning (especially Seeded Version Space Learning) –Use Hebrew and Hindi systems as test platforms for experimenting with advanced learning research Correcting and refining transfer rules by interaction with native bilingual speakers Developing a well-founded model for assigning scores (probabilities) to transfer rules Improving the strong decoder to better fit the specific characteristics of the XFER model Further improved MEMT with: –Combination of output from different translation engines with different scorings – strong decoding capabilities