Bayesian Subtree Alignment Model based on Dependency Trees
Toshiaki Nakazawa, Sadao Kurohashi (Kyoto University)
2011/11/11 @ IJCNLP2011
Outline: Background, Related Work, Bayesian Subtree Alignment Model, Model Training, Experiments, Conclusion
Background
Alignment quality of GIZA++ is quite insufficient for distant language pairs: wide-range reordering, many-to-many alignment.
En: He is my brother. Zh: 他 是 我 哥哥 。 Fr: Il est mon frère. Ja: 彼 は 私 の 兄 です 。
Alignment Accuracy of GIZA++ (automatic GIZA++ alignment with combination heuristic)

Language pair      Precision  Recall  AER
French-English     87.28      96.30    9.80
English-Japanese   81.17      62.19   29.25
Chinese-Japanese   83.77      75.38   20.39

Sure alignment: clearly right. Possible alignment: reasonable to make, but not so clear.
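The Sure/Possible scoring above is the standard alignment error rate of Och and Ney. A minimal sketch in Python; representing alignment links as sets of (source, target) index pairs is an assumption of this sketch, not something the slides specify:

```python
def alignment_error_rate(predicted, sure, possible):
    """Precision, Recall, and AER against Sure/Possible gold alignments.

    predicted, sure, possible: sets of (src_index, tgt_index) links.
    Sure links also count as Possible, so we take their union.
    """
    possible_all = possible | sure
    a_and_s = predicted & sure          # predicted links that are Sure
    a_and_p = predicted & possible_all  # predicted links that are at least Possible
    precision = len(a_and_p) / len(predicted)
    recall = len(a_and_s) / len(sure)
    aer = 1.0 - (len(a_and_s) + len(a_and_p)) / (len(predicted) + len(sure))
    return precision, recall, aer
```

Note that precision is measured against Possible links while recall is measured against Sure links only, which is why a high-precision aligner can still have a large AER when its recall of Sure links is low, as in the tables above.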
Background
Alignment quality is quite insufficient for distant language pairs: wide-range reordering, many-to-many alignment. English-Japanese and Chinese-Japanese show > 20% AER. We need to incorporate syntactic information.
En: He is my brother. Zh: 他 是 我 哥哥 。 Fr: Il est mon frère. Ja: 彼 は 私 の 兄 です 。
Related Work
Cherry and Lin (2003): discriminative alignment model using the source-side dependency tree; allows one-to-one alignment only.
Nakazawa and Kurohashi (2009): generative model using both sides' dependency trees; allows phrasal alignment; suffers from degeneracy, acquiring incorrect larger phrases, derived from Maximum Likelihood Estimation.
Related Work
DeNero et al. (2008): incorporate prior knowledge about the parameters to avoid the degeneracy of the model; place a Dirichlet Process (DP) prior over the phrase generation model; simple position-based distortion model.
This work: takes advantage of both lines of work, by Nakazawa and Kurohashi and by DeNero et al.
Related Work
Generative story of the (sequential) phrase-based joint probability model [DeNero et al., 2008]:
1. Choose a number of components
2. Generate each phrase pair independently (with a nonparametric Bayesian prior)
3. Choose an ordering for the phrases
Example of the Model
[Figure: steps 1-3 aligning "He is my brother" with 彼 は / です / 私 の / 兄, using simple position-based distortion]
Proposed Model
[Figure: steps 1-3 on the same sentence pair, using dependency tree-based distortion]
Bayesian Subtree Alignment Model based on Dependency Trees
Model Decomposition
[Figure: the model is decomposed into null and non-null generation of phrases and the dependency relations of phrases; cf. [DeNero et al., 2008]]
Dependency Relations
rel = (Up, Down): # of steps for going up, # of steps for going down.
English tree: He / is / my / brother; Japanese tree: 彼 は / 私 の / 兄 / です.
rel("He", "is") = (1, 0); rel("my", "brother") = (1, 0); rel("brother", "is") = (1, 0)
Dependency Relations
rel(" 彼 は ", " です ") = (1, 0); rel(" 私 の ", " 兄 ") = (1, 0); rel(" 兄 ", " です ") = (1, 0)
Dependency Relations
Sentence pair: She has long hair ↔ 彼女 は(NULL) 髪 が 長い
rel("long", "hair") = (0, 1); rel("hair", "she has") = (1, 2); rel(" 髪 が ", " 長い ") = (0, 1)
Dependency Relations
rel(" 彼女 ", " は ") = ? (" は " is NULL)
rel(" 彼女 ", " 長い ") = (0, 2); N(" 彼女 ") = 1, the # of NULL words on the way to the non-null parent
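The (Up, Down) relation in the examples above can be read as path lengths to the lowest common ancestor in the dependency tree. A sketch under that reading; the `parent`-dictionary representation of the tree is an assumption of this sketch:

```python
def rel(node_a, node_b, parent):
    """Dependency relation (Up, Down) between two tree nodes:
    Up steps from node_a up to the lowest common ancestor, then
    Down steps from that ancestor down to node_b.
    `parent` maps each node to its parent; the root maps to None."""
    def path_to_root(n):
        path = [n]
        while parent[n] is not None:
            n = parent[n]
            path.append(n)
        return path

    path_a = path_to_root(node_a)
    path_b = path_to_root(node_b)
    anc_b = set(path_b)
    # first node on node_a's upward path that is also an ancestor of node_b
    up, lca = next((i, n) for i, n in enumerate(path_a) if n in anc_b)
    down = path_b.index(lca)
    return up, down


# English tree of "He is my brother": He -> is, brother -> is, my -> brother
parent = {"is": None, "He": "is", "brother": "is", "my": "brother"}
```

This reproduces the values on the earlier slide, e.g. rel("He", "is") = (1, 0): one step up from "He" reaches "is", and no steps down are needed.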
Dependency Relation Probability
Assign a probability to the tuple: p(R_e = (N, rel) = (N, Up, Down)). The reordering model is decomposed as: [equation omitted in transcript]
Model Training
Model Training
Initialization: create a heuristic phrase alignment, like 'grow-diag-final-and', on the dependency trees using the results from GIZA++; count phrase alignments and dependency relations.
Refine the model by Gibbs sampling, with the operators SWAP, TOGGLE and EXPAND.
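'grow-diag-final-and' is the standard Moses symmetrization heuristic for combining the two directed GIZA++ alignments. A simplified sketch of its core grow step, omitting the two 'final' passes and working on plain index sets rather than the dependency trees the slides use:

```python
def grow_diag(e2f, f2e, len_e, len_f):
    """Simplified grow-diag symmetrization of two directed word alignments.

    e2f, f2e: sets of (e_index, f_index) links from each GIZA++ direction.
    Starts from their intersection and grows into neighboring links from
    the union whenever one side of the link is still unaligned.
    """
    alignment = e2f & f2e  # high-precision starting point
    union = e2f | f2e
    neighbors = [(-1, 0), (0, -1), (1, 0), (0, 1),
                 (-1, -1), (-1, 1), (1, -1), (1, 1)]
    added = True
    while added:
        added = False
        for (e, f) in sorted(alignment):  # snapshot; alignment may grow
            for de, df in neighbors:
                ne, nf = e + de, f + df
                if not (0 <= ne < len_e and 0 <= nf < len_f):
                    continue
                if (ne, nf) in alignment or (ne, nf) not in union:
                    continue
                e_unaligned = all(p[0] != ne for p in alignment)
                f_unaligned = all(p[1] != nf for p in alignment)
                if e_unaligned or f_unaligned:
                    alignment.add((ne, nf))
                    added = True
    return alignment
```

The intersection gives precision and the union gives recall; growing diagonally recovers many-to-many links without accepting every noisy directed link.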
SWAP Operator
Swap the counterparts of two alignments. [Figure: SWAP-1, SWAP-2, with NULL]
TOGGLE Operator
Remove an existing alignment or add a new alignment. [Figure, with NULL]
EXPAND Operator
Expand or contract an aligned subtree. [Figure: EXPAND-1, EXPAND-2, with NULL]
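A Gibbs sampler applies such operators as local proposals over the current alignment state. A minimal sketch of the SWAP move on a list of links; the list-of-pairs representation is an assumption, and acceptance of the proposal under the model posterior is omitted:

```python
import random


def swap_operator(alignments, rng=random):
    """One SWAP proposal: pick two alignment links and exchange their
    target-side counterparts. The real sampler would then score the
    proposal under the model and accept or reject it."""
    if len(alignments) < 2:
        return list(alignments)
    i, j = rng.sample(range(len(alignments)), 2)
    (s1, t1), (s2, t2) = alignments[i], alignments[j]
    proposal = list(alignments)
    proposal[i], proposal[j] = (s1, t2), (s2, t1)
    return proposal
```

TOGGLE and EXPAND can be sketched analogously: TOGGLE adds or removes a single link (moving a phrase to or from NULL), and EXPAND moves a node between an aligned subtree and its neighborhood.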
Alignment Experiment
Training: 1M sentences for Ja-En, 678K for Ja-Zh.
Testing: about 500 hand-annotated parallel sentences (with Sure and Possible alignments).
Measures: Precision, Recall, Alignment Error Rate.
Japanese tools: JUMAN and KNP. English tool: Charniak's nlparser. Chinese tools: MMA and CNP (from NICT).
Alignment Experiment: Ja-En (paper abstracts, 1M sentences)

                  Precision  Recall  AER
Initialization    82.39      61.82   28.99
Proposed          85.93      64.71   25.73
GIZA++ & grow     81.17      62.19   29.25
Berkeley Aligner  85.00      53.82   33.72
Alignment Experiment: Ja-Zh (technical papers, 680K sentences)

                  Precision  Recall  AER
Initialization    84.71      75.46   19.90
Proposed          85.49      75.26   19.60
GIZA++ & grow     83.77      75.38   20.39
Berkeley Aligner  88.43      69.77   21.60
Improved Example
[Figures: GIZA++ vs. Proposed alignments]
Japanese-to-English Translation Experiment
Baseline: just run Moses and MERT.
Japanese-to-English Translation Experiment
Initialization: use the result of initialization as the alignment result for Moses.
Japanese-to-English Translation Experiment
Proposed: use the alignment result of the proposed model after a few iterations for Moses.
[Figure: translation results, GIZA++ vs. Proposed]
Conclusion
Bayesian tree-based phrase alignment model: better alignment accuracy than GIZA++ on distant language pairs.
Translation: currently (temporarily) not improved.
Future work: robustness against parsing errors, using N-best parses or a parse forest; showing improvement in translation with a tree-based decoder (or Hiero?).