Discriminative Modeling of Extraction Sets for Machine Translation. Authors: John DeNero and Dan Klein, UC Berkeley. Presenter: Justin Chiu.

Contribution
Extraction set
◦ Nested collection of all the overlapping phrase pairs consistent with an underlying word alignment
Advantages over word-factored alignment models
◦ Can incorporate features on phrase pairs, not only on word links
◦ Optimizes an extraction-based loss function that relates directly to generating translations
Performs better than both the supervised and the unsupervised baselines

Progress of Statistical MT
Generate translated sentences word by word
Use whole fragments of training examples to build translation rules
◦ Aligned at the word level
◦ Extract fragment-level rules from word-aligned sentence pairs
▪ Tree-to-string translation
Extraction set models
◦ Set of all overlapping phrasal translation rules plus an alignment

Outline
Extraction Set Models
Model Estimation
Model Inference
Experiments

EXTRACTION SET MODELS

Extraction Set Models
Input
◦ Unaligned sentence pair
Output
◦ Extraction set of phrasal translation rules
◦ Word alignment

Extraction Sets from Word Alignments
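
To make the idea of reading phrase pairs off a word alignment concrete, here is a minimal sketch. It uses the standard consistency criterion from phrase-based MT (a bispan is kept if it contains at least one link and no link crosses its boundary), which is an assumption here and not necessarily the exact definition used in the paper; in particular it ignores possible and null links. The names `extract_phrase_pairs` and `max_len` are hypothetical.

```python
from typing import Iterable, Set, Tuple

Link = Tuple[int, int]              # (source index, target index)
Bispan = Tuple[int, int, int, int]  # (src start, src end, tgt start, tgt end), ends exclusive


def extract_phrase_pairs(src_len: int, tgt_len: int,
                         links: Iterable[Link], max_len: int = 3) -> Set[Bispan]:
    """Return every bispan consistent with the given word alignment."""
    links = list(links)
    pairs: Set[Bispan] = set()
    for i in range(src_len):
        for j in range(i + 1, min(src_len, i + max_len) + 1):
            for k in range(tgt_len):
                for l in range(k + 1, min(tgt_len, k + max_len) + 1):
                    # A link is inside when both of its ends fall in the bispan.
                    has_inside = any(i <= s < j and k <= t < l for s, t in links)
                    # A link crosses when exactly one of its ends falls in the bispan.
                    crosses = any((i <= s < j) != (k <= t < l) for s, t in links)
                    if has_inside and not crosses:
                        pairs.add((i, j, k, l))
    return pairs


# Two-word sentence pair with a monotone alignment: 0-0, 1-1.
monotone = extract_phrase_pairs(2, 2, [(0, 0), (1, 1)])
# Same sentence pair with a swapped alignment: 0-1, 1-0.
swapped = extract_phrase_pairs(2, 2, [(0, 1), (1, 0)])
```

In both cases the whole-sentence bispan `(0, 2, 0, 2)` is extracted alongside the two single-word pairs, which illustrates why the resulting phrase pairs overlap and nest.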

Possible and Null Alignment Links
Possible links have two types
◦ Function words that are unique to their language
◦ Short phrases that have no lexical equivalent
Null alignment
◦ Expresses content that is absent from the translation

Interpreting Possible and Null Alignment Links

Linear Model for Extraction Set

Scoring Extraction Sets

MODEL ESTIMATION

MIRA (Margin-Infused Relaxed Algorithm)
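
The slide names MIRA without giving the update rule. As a rough illustration, the sketch below shows one common 1-best form of the MIRA update for a linear model over sparse features. How the guess is chosen, the feature names, and the step cap `max_step` are all assumptions for the example, not details taken from the talk.

```python
from typing import Dict

Features = Dict[str, float]


def dot(weights: Features, feats: Features) -> float:
    """Sparse dot product between a weight vector and a feature vector."""
    return sum(weights.get(name, 0.0) * value for name, value in feats.items())


def mira_update(weights: Features, gold_feats: Features, guess_feats: Features,
                loss: float, max_step: float = 1.0) -> Features:
    """One 1-best MIRA step: the smallest change to the weights that makes the
    gold structure outscore the guess by at least `loss`, capped at `max_step`."""
    # Feature difference between the gold structure and the model's guess.
    diff = dict(gold_feats)
    for name, value in guess_feats.items():
        diff[name] = diff.get(name, 0.0) - value
    norm_sq = sum(v * v for v in diff.values())
    if norm_sq == 0.0:
        return dict(weights)  # identical features: nothing to update
    violation = loss - dot(weights, diff)  # how far the margin falls short
    step = min(max_step, max(0.0, violation / norm_sq))
    updated = dict(weights)
    for name, value in diff.items():
        updated[name] = updated.get(name, 0.0) + step * value
    return updated


# Hypothetical features "a" (fires on gold) and "b" (fires on the guess).
first = mira_update({}, {"a": 1.0}, {"b": 1.0}, loss=1.0)
second = mira_update(first, {"a": 1.0}, {"b": 1.0}, loss=1.0)
```

After the first step the gold structure outscores the guess by exactly the loss, so the second call leaves the weights unchanged.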

Extraction Set Loss Function

MODEL INFERENCE

Possible Decompositions

DP for Extraction Sets

Finding Pseudo-Gold ITG Alignment

EXPERIMENTS

Five systems for comparison
Unsupervised baselines
◦ GIZA++
◦ Joint HMM
Supervised baseline
◦ Block ITG
Extraction Set Coarse Pass
◦ Does not score bispans that cross the bracketing of ITG derivations
Full Extraction Set Model

Data
Discriminative training and alignment evaluation
◦ Trained baseline HMM on 11.3 million words of FBIS newswire data
◦ Hand-aligned portion of the NIST MT02 test set
▪ 150 training and 191 test sentences
End-to-end translation experiments
◦ Trained on a 22.1 million word parallel corpus consisting of sentences up to length 40 of newswire data from the GALE program
◦ NIST MT04/MT05 test sets

Results

Discussion
Syntax labels vs. words
Word alignment to rule
▪ Rule to word alignment
Information from two directions
65% of type 1 error
