Bridging the Gap: Machine Translation for Lesser Resourced Languages

Presentation transcript:

Bridging the Gap: Machine Translation for Lesser Resourced Languages Christian Monson, Ariadna Font Llitjós, Lori Levin, Alon Lavie, Alison Alvarez, Roberto Aranovich, Jaime Carbonell, Robert Frederking, Erik Peterson, Kathrin Probst

Inupiaq, Quechua, Mapudungun. Katrina: 100's of speakers. Quechua: 6 million speakers. Mapudungun: 900,000 speakers.

Machine Translation (MT): Source Language to Target Language

Direct: Statistical MT, Example Based MT
+ Short development time
- Requires large bilingual corpus

Transfer Rule Based MT: Morphological Analysis, Syntactic Parsing, Text Generation

Interlingua: Semantic Analysis, Sentence Planning

Deep-analysis MT (transfer, interlingua)
+ High quality
- Expertise intensive development cycle

Our Approach: automate the development of deep-analysis MT

Our Position Linguistic Structure and Bilingual Informants help automate the development of deep-analysis machine translation systems

Sub-Problems Morphology Induction Syntax Refinement

Morphology Induction 1. Linguistic Structure 2. Bilingual Informants

Paradigms Organize Morphology (Mapudungun)
Suffix slots: Loc, Asp, Hab, Mode, Report, Pol / Mood, Tense, Obj Agr, Subj Agr / Mood
Suffixes: pa, tu, pu, ka, Ø, ke, pe, (ü)rke, la, a, fi, ki, fu, Ø, nu, afu, (ü)n, li, chi, yu, …

Paradigm Discovery in 3 Steps
1. Search out partial paradigms in a network of candidates
2. Cluster overlapping partial paradigms
3. Filter the clusters, keeping the largest clusters most likely to model true paradigms

A portion of a Spanish paradigm candidate network (suffix set, stem count: example stems):
e.er.erá.ido.ieron.ió 28: deb, escog, ofrec, reconoc, vend, ...
e.ido.ieron.ir.irá.ió 28: asist, dirig, exig, ocurr, sufr, ...
e.erá.ido.ieron.ió 28: deb, escog, ...
e.er.ido.ieron.ió 46: deb, parec, recog, ...
e.ido.ieron.irá.ió 28: asist, dirig, ...
e.ido.ieron.ir.ió 39: asist, bat, sal, ...
e.er.erá.ieron.ió 32: deb, padec, romp, ...
e.ido.ieron.ió 86: asist, deb, hund, ...
e.erá.ieron.ió 32: deb, padec, ...
er.ido.ieron.ió 58: ascend, ejerc, recog, ...
ido.ieron.ir.ió 44: interrump, sal, ...
azar.e.ido.ieron.ir.ió 1: sal
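The first step, collecting candidate suffix sets from raw words, can be sketched roughly as follows. This is a minimal illustration and not the authors' system: the function name `candidate_schemes`, the `max_suffix_len` cutoff, and the toy word list are assumptions made for the example. It only groups stems by the exact set of suffixes they occur with; clustering and filtering are not shown.

```python
from collections import defaultdict


def candidate_schemes(words, max_suffix_len=5):
    """Group stems by the set of suffixes they occur with (hypothetical helper)."""
    stem_to_suffixes = defaultdict(set)
    for word in set(words):
        # Try every split point; the stem must be non-empty.
        for cut in range(1, len(word) + 1):
            stem, suffix = word[:cut], word[cut:]
            if len(suffix) <= max_suffix_len:
                # Write the empty suffix as "Ø", as the slides do.
                stem_to_suffixes[stem].add(suffix or "Ø")

    schemes = defaultdict(set)
    for stem, suffixes in stem_to_suffixes.items():
        # A single suffix says nothing about a paradigm, so skip it.
        if len(suffixes) >= 2:
            schemes[frozenset(suffixes)].add(stem)
    return schemes


# Toy word list (an assumption), built from stems and suffixes on the slide.
words = ["debe", "debido", "debieron", "debió",
         "vende", "vendido", "vendieron", "vendió"]
schemes = candidate_schemes(words)
print(sorted(schemes[frozenset({"e", "ido", "ieron", "ió"})]))
```

Wrong segmentations such as stem "debi" with suffixes "do", "eron", "ó" also appear as candidates here, which is why the later clustering and filtering steps are needed.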

Morpho Challenge 2007: Unsupervised Morphology Induction Competition
English: 3rd Place Overall
Bested the Strong Baseline Morfessor (Creutz, 2006)
German: 1st Place when Combined with Morfessor
No Mapudungun yet: agglutinative sequences of suffixes coming soon

Our Machine Translation Architecture
INPUT TEXT -> Morphology Analysis -> Machine Translation System -> Morphology Generation -> OUTPUT TEXT
Resources: Morphology Analysis Lexicon (for Morphology Analysis), Grammar & Lexicon (for the Machine Translation System), Morphology Generation Lexicon (for Morphology Generation)

Speaker notes: Finish the feedback loop. Given an arbitrarily small set of linguistic resources, for example a small grammar and a small lexicon, if we add a Rule Refinement (RR) component at the end of our translation process, we can use bilingual speaker feedback to augment and improve the initial resources (grammar and lexicon). The approach I am proposing can be generalized to any rule-based system. We chose to implement our work on this system developed at CMU. Propagate corrections to the underlying representations that produce translations.

Syntax Refinement 1. Linguistic Structure 2. Bilingual Informants

Linguistic Structure: Syntax

English: I didn’t see Maria

Mapudungun: pelafiñ Maria
pe -la -fi -ñ Maria
see -neg -3.obj -1.subj.indicative Maria

Spanish: No vi a María
No vi a María
neg see.1.subj.past.indicative acc Maria

pe-la-fi-ñ Maria
1. V -> pe
2. VSuff -> la: negation = +
3. VSuffG -> VSuff: pass all features up
4. VSuff -> fi: object person = 3
5. VSuffG -> VSuffG VSuff: pass all features up from both children
6. VSuff -> ñ: person = 1, number = sg, mood = ind
7. VSuffG -> VSuffG VSuff: pass all features up from both children
8. V -> V VSuffG: pass all features up from both children; check that 1) negation = + and 2) tense is undefined
9. NP -> N (Maria): person = 3, number = sg, human = +
10. VP -> V NP, S -> VP: check that NP is human = +; pass features up from V
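The feature passing on these slides can be approximated with flat feature dictionaries. The following is only a sketch under stated assumptions: `unify`, `analyze_verb`, and the `MORPHEME_FEATURES` table are hypothetical names, the feature values are the ones listed on the slides, and the final check mirrors only the one rule illustrated (negation = + and tense undefined), not a full grammar.

```python
def unify(a, b):
    """Merge two flat feature dicts; return None if any value clashes."""
    merged = dict(a)
    for key, value in b.items():
        if key in merged and merged[key] != value:
            return None
        merged[key] = value
    return merged


# Features contributed by each morpheme, as listed on the slides.
MORPHEME_FEATURES = {
    "pe": {},
    "la": {"negation": "+"},
    "fi": {"object person": "3"},
    "ñ": {"person": "1", "number": "sg", "mood": "ind"},
}


def analyze_verb(morphemes):
    """Pass all features up, then apply the checks of the illustrated V rule."""
    features = {}
    for morpheme in morphemes:
        features = unify(features, MORPHEME_FEATURES[morpheme])
        if features is None:
            return None
    # Check that negation = + and that tense is undefined.
    if features.get("negation") != "+" or "tense" in features:
        return None
    return features


verb_features = analyze_verb(["pe", "la", "fi", "ñ"])
print(verb_features)
```

Because tense is left undefined on the Mapudungun side, the transfer step is free to assign it on the Spanish side.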

Transfer to Spanish: Top-Down
1. S -> S: pass all features to Spanish side
2. VP -> VP: pass all features down
3. [V NP] -> [V "a" NP]: pass object features down; the accusative marker "a" on objects is introduced because human = +

Transfer rule:
VP::VP [VBar NP] -> [VBar "a" NP]
(
  (X1::Y1)
  (X2::Y3)
  ((X2 type) = (*NOT* personal))
  ((X2 human) =c +)
  (X0 = X1)
  ((X0 object) = X2)
  (Y0 = X0)
  ((Y0 object) = (X0 object))
  (Y1 = Y0)
  (Y3 = (Y0 object))
  ((Y1 objmarker person) = (Y3 person))
  ((Y1 objmarker number) = (Y3 number))
  ((Y1 objmarker gender) = (Y3 gender))
)

4. V: pass person, number, and mood features to the Spanish verb; assign tense = past
5. "no" V: "no" is introduced because negation = +
6. pe -> ver
7. ver -> vi: person = 1, number = sg, mood = indicative, tense = past
8. N: pass features over to Spanish side; Maria -> María

Result: pe-la-fi-ñ Maria -> No vi a María (I didn’t see Maria)
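The effect of the VP transfer rule can be imitated in a few lines. This is a hand-written analogue and not the transfer engine itself: `transfer_vp` is a hypothetical function, features are plain dictionaries, and the reading of `(*NOT* personal)` as "type must not be personal" is an assumption about the rule formalism.

```python
def transfer_vp(vbar, np):
    """Imitate VP::VP [VBar NP] -> [VBar "a" NP] on flat feature dicts."""
    # ((X2 type) = (*NOT* personal)): read here as "type must not be personal".
    if np.get("type") == "personal":
        return None
    # ((X2 human) =c +): constraint, the object must already be marked human.
    if np.get("human") != "+":
        return None
    # (X0 = X1), ((X0 object) = X2), (Y0 = X0), (Y1 = Y0):
    # the target verb carries the source verb features plus the object.
    y1 = dict(vbar)
    y1["object"] = dict(np)
    # ((Y1 objmarker ...) = (Y3 ...)): copy the object's agreement features.
    y1["objmarker"] = {
        name: np[name] for name in ("person", "number", "gender") if name in np
    }
    y3 = dict(np)  # (Y3 = (Y0 object))
    return [("VBar", y1), "a", ("NP", y3)]


verb = {"negation": "+", "person": "1", "number": "sg", "mood": "ind"}
maria = {"person": "3", "number": "sg", "human": "+"}
target = transfer_vp(verb, maria)
print(target[1])
```

When the object is not marked human = +, the function returns None, which stands in for the rule failing to apply so that no "a" is introduced.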

Syntax Refinement 1. Linguistic Structure 2. Bilingual Informants

Syntax Refinement Architecture
INPUT TEXT -> Morphology Analysis -> Run-Time MT System (Grammar & Lexicon) -> Morphology Generation -> OUTPUT TEXT
Feedback loop: OUTPUT TEXT -> Online Translation Correction Tool -> Rule Refinement -> Grammar & Lexicon

Translation Correction Tool (TCTool): online GUI to elicit correction of MT output from non-expert bilingual speakers. Example: Children played a game

The children played a game

Refining the Grammar
MT output tree (nodes S, NP, VP, PolP, V, Det, N): niños jugaron un juego
After correction, "los" is added: los niños jugaron un juego
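One way to recover what an informant changed is to diff the MT output against the corrected sentence at the word level. The sketch below uses Python's standard `difflib`; the function name `correction_ops` and the tuple layout are assumptions, and how the rule refinement module actually maps such an edit back to a grammar rule is not specified on the slides and is not shown here.

```python
import difflib


def correction_ops(mt_output, corrected):
    """List word-level edits that turn the MT output into the correction."""
    source = mt_output.split()
    target = corrected.split()
    matcher = difflib.SequenceMatcher(a=source, b=target, autojunk=False)
    ops = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # (edit type, words removed, words added, position in MT output)
            ops.append((tag, source[i1:i2], target[j1:j2], i1))
    return ops


ops = correction_ops("niños jugaron un juego", "los niños jugaron un juego")
print(ops)
```

For the example on the slide, the only edit is the insertion of "los" at position 0, in front of the noun "niños".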

Syntax Refinement Summary
Increases translation quality on unseen data: English-Spanish experiments (Font Llitjós et al., 2007, MT Summit)
Generalizes to a Mapudungun-Spanish machine translation system

Speaker notes: Today I’ve shown you an example of grammar expansion, but the ARR can also automatically augment the lexicon (see paper).

Overall Summary Linguistic Structure and Bilingual Informants help automate the development of deep-analysis machine translation systems: Morphology Induction Syntax Refinement

Thank You!