
1 Bayesian Subtree Alignment Model based on Dependency Trees. Toshiaki Nakazawa, Sadao Kurohashi (Kyoto University). 2011/11/11 @ IJCNLP2011

2 Outline: Background; Related Work; Bayesian Subtree Alignment Model; Model Training; Experiments; Conclusion

3 Background. The alignment quality of GIZA++ is insufficient for distant language pairs, which require wide-range reordering and many-to-many alignment. En: He is my brother. Zh: 他 是 我 哥哥 。 Fr: Il est mon frère. Ja: 彼 は 私 の 兄 です 。

4 Alignment Accuracy of GIZA++ (automatic alignment, with combination heuristic)

Language pair     Precision  Recall  AER
French-English    87.28      96.30    9.80
English-Japanese  81.17      62.19   29.25
Chinese-Japanese  83.77      75.38   20.39

Sure alignment: clearly right. Possible alignment: reasonable to make, but not so clear.
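These metrics follow the standard Sure/Possible scheme: precision is measured against Possible links, recall against Sure links, and AER combines both. A minimal sketch, using toy link sets rather than the paper's data:

```python
# Standard alignment metrics over Sure (S) and Possible (P) link sets,
# where S is a subset of P and links are (source_index, target_index)
# pairs. The sets below are toy examples, not the paper's data.

def alignment_scores(A, S, P):
    """Return (precision, recall, AER) in percent for a predicted alignment A."""
    a_s = len(A & S)  # predicted links that are Sure
    a_p = len(A & P)  # predicted links that are at least Possible
    precision = 100.0 * a_p / len(A)
    recall = 100.0 * a_s / len(S)
    aer = 100.0 * (1.0 - (a_s + a_p) / (len(A) + len(S)))
    return precision, recall, aer

S = {(0, 0), (1, 1)}
P = S | {(2, 3)}
A = {(0, 0), (1, 1), (2, 2)}
precision, recall, aer = alignment_scores(A, S, P)
# precision ≈ 66.67, recall = 100.0, AER = 20.0
```

Note that a predicted link outside P (here (2, 2)) hurts precision and AER but leaves recall untouched, which is why a high-precision, low-recall aligner can still show a poor AER.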

5 Background. Alignment quality is insufficient for distant language pairs: wide-range reordering and many-to-many alignment are needed, and English-Japanese and Chinese-Japanese both exceed 20% AER. This motivates incorporating syntactic information. En: He is my brother. Zh: 他 是 我 哥哥 。 Fr: Il est mon frère. Ja: 彼 は 私 の 兄 です 。

6 Related Work. Cherry and Lin (2003): a discriminative alignment model using the source-side dependency tree; allows one-to-one alignment only. Nakazawa and Kurohashi (2009): a generative model using dependency trees on both sides; allows phrasal alignment, but suffers from degeneracy, acquiring incorrectly large phrases under Maximum Likelihood Estimation.

7 Related Work. DeNero et al. (2008): incorporate prior knowledge about the parameters to avoid the degeneracy of the model, placing a Dirichlet Process (DP) prior over the phrase generation model; however, their distortion model is simple and position-based. This work takes advantage of both Nakazawa and Kurohashi (2009) and DeNero et al. (2008).

8 Related Work. Generative story of the (sequential) phrase-based joint probability model [DeNero et al., 2008]: 1. Choose a number of components. 2. Generate each phrase pair independently (nonparametric Bayesian prior). 3. Choose an ordering for the phrases.
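Step 2's nonparametric prior can be sketched with the Chinese Restaurant Process view of a Dirichlet Process: each phrase pair is either reused in proportion to how often it has already been generated, or drawn fresh from a base distribution. The phrase inventory and base sampler below are made-up stand-ins for DeNero et al.'s actual base measure:

```python
import random

def crp_draws(n, alpha, base_sample, rng):
    """Draw n values from DP(alpha, base) via the Chinese Restaurant Process."""
    draws = []
    for i in range(n):
        if rng.random() < alpha / (alpha + i):
            draws.append(base_sample(rng))   # new "table": fresh draw from base
        else:
            draws.append(rng.choice(draws))  # reuse a past draw, prob. ∝ count
    return draws

# Hypothetical phrase-pair inventory, for illustration only.
inventory = [("my brother", "兄"), ("he", "彼 は"), ("is", "です")]
rng = random.Random(0)
pairs = crp_draws(10, alpha=1.0, base_sample=lambda r: r.choice(inventory), rng=rng)
```

The rich-get-richer behavior of the CRP concentrates probability mass on a small set of phrase pairs, which is the mechanism the DP prior uses to counteract the ML degeneracy toward ever-larger phrases.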

9 Example of the Model. "He is my brother" ↔ "彼 は 私 の 兄 です", segmented into components C1-C4, with simple position-based distortion (figure).

10 Proposed Model. The same example, but with dependency-tree-based distortion in place of position-based distortion (figure).

11 Bayesian Subtree Alignment Model based on Dependency Trees

12 Model Decomposition (cf. [DeNero et al., 2008]). The model separates null and non-null phrase generation, and adds dependency relations between phrases.

13 Dependency Relations. rel = (Up, Down): the number of steps going up and the number of steps going down in the dependency tree. For "He is my brother" / "彼 は 私 の 兄 です": rel("He", "is") = (1, 0); rel("brother", "is") = (1, 0); rel("my", "brother") = (1, 0).

14 Dependency Relations. On the Japanese side: rel("彼 は", "です") = (1, 0); rel("私 の", "兄") = (1, 0); rel("兄", "です") = (1, 0).

15 Dependency Relations. For "彼女 は 髪 が 長い" (phrases: "she has", "hair", "long"; は is aligned to NULL): rel("long", "hair") = (0, 1); rel("hair", "she has") = (1, 2); rel("髪 が", "長い") = (0, 1).

16 Dependency Relations. Since は is aligned to NULL, rel("彼女", "は") is not used; instead rel("彼女", "長い") = (0, 2) and N("彼女") = 1, where N counts the NULL-aligned words on the way to the non-null parent.
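One plausible way to formalize rel is as path steps through the tree: Up steps from the first phrase to the lowest common ancestor, Down steps from there to the second phrase. The helper below is hypothetical, and the paper's exact convention (notably the tuple order in the Japanese examples) may differ; it reproduces the child-to-head examples above:

```python
# Hypothetical sketch: rel(x, y) = (steps up from x to the lowest common
# ancestor of x and y, steps down from it to y), over a parent map.
# NULL-aligned words would be skipped and counted separately as N(x).

def chain_to_root(node, parent):
    chain = [node]
    while node in parent:
        node = parent[node]
        chain.append(node)
    return chain

def rel(x, y, parent):
    up = chain_to_root(x, parent)
    down = chain_to_root(y, parent)
    lca = next(n for n in up if n in down)  # lowest common ancestor
    return up.index(lca), down.index(lca)

# Dependency tree for "He is my brother": "is" heads "He" and "brother",
# "brother" heads "my".
parent = {"He": "is", "brother": "is", "my": "brother"}
assert rel("He", "is", parent) == (1, 0)
assert rel("my", "brother", parent) == (1, 0)
assert rel("my", "He", parent) == (2, 1)   # up via "brother", "is"; down to "He"
```

Under this reading, a phrase attached directly to its counterpart's head always gets (1, 0), matching all the slide-13/14 examples.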

17 Dependency Relation Probability. Assign a probability to the tuple p(R_e = (N, rel)) = p((N, Up, Down)), and decompose the reordering model accordingly.

18 Model Training

19 Model Training. Initialization: create a heuristic phrase alignment, in the style of 'grow-diag-final-and', on the dependency trees using results from GIZA++, then count phrase alignments and dependency relations. Refine the model by Gibbs sampling with three operators: SWAP, TOGGLE, EXPAND.
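The refinement loop can be sketched as repeated local moves over alignment states. Everything below is a toy stand-in: a single TOGGLE-like operator, a diagonal-favoring score in place of the model posterior, and Metropolis-style acceptance as a simplification of the paper's actual Gibbs operators; only the control flow is meant to be indicative:

```python
import random

def refine(state, score, operators, iters, rng):
    """Refine an alignment state by local moves with Metropolis-style
    acceptance (a simplification of blocked Gibbs sampling)."""
    current = score(state)
    for _ in range(iters):
        proposal = rng.choice(operators)(state, rng)
        proposed = score(proposal)
        # accept with probability min(1, p(proposal) / p(state))
        if proposed >= current or rng.random() < proposed / current:
            state, current = proposal, proposed
    return state

# Toy setup: states are sets of (src, tgt) links over a 3x3 sentence pair.
links = [(i, j) for i in range(3) for j in range(3)]

def toggle(state, rng):
    return state ^ {rng.choice(links)}  # add or remove one link

def diagonal_score(state):  # toy "posterior" favoring monotone alignment
    return 1e-3 + sum(1.0 for (i, j) in state if i == j)

rng = random.Random(0)
final = refine(frozenset(), diagonal_score, [toggle], 200, rng)
```

In the paper's setting, SWAP, TOGGLE, and EXPAND would each be one entry in the operator list, and the score would come from the Bayesian model's posterior over alignments.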

20 SWAP Operator. Swap the counterparts of two alignments (variants SWAP-1 and SWAP-2; NULL alignments may be involved).

21 TOGGLE Operator. Remove an existing alignment (the words become NULL-aligned) or add a new alignment.

22 EXPAND Operator. Expand or contract an aligned subtree (variants EXPAND-1 and EXPAND-2; NULL alignments may be involved).

23 Alignment Experiment. Training data: 1M sentence pairs for Ja-En, 678K for Ja-Zh. Test data: about 500 hand-annotated parallel sentences (with Sure and Possible alignments). Measures: Precision, Recall, Alignment Error Rate. Tools: JUMAN and KNP for Japanese, Charniak's nlparser for English, MMA and CNP (from NICT) for Chinese.

24 Alignment Experiment: Ja-En (paper abstracts, 1M sentences)

                  Precision  Recall  AER
Initialization    82.39      61.82   28.99
Proposed          85.93      64.71   25.73
GIZA++ & grow     81.17      62.19   29.25
Berkeley Aligner  85.00      53.82   33.72

25 Alignment Experiment: Ja-Zh (technical papers, 680K sentences)

                  Precision  Recall  AER
Initialization    84.71      75.46   19.90
Proposed          85.49      75.26   19.60
GIZA++ & grow     83.77      75.38   20.39
Berkeley Aligner  88.43      69.77   21.60

26 Improved Example: alignment figures comparing GIZA++ and the proposed model.

27 Improved Example, continued (GIZA++ vs. Proposed).

28 Improved Example, continued (GIZA++ vs. Proposed).

29 Japanese-to-English Translation Experiment. Baseline: just run Moses and MERT.

30 Japanese-to-English Translation Experiment. Initialization: use the result of initialization as the alignment for Moses.

31 Japanese-to-English Translation Experiment. Proposed: use the alignment result of the proposed model, after a few iterations, for Moses.

32 Translation results: GIZA++ vs. Proposed (figure).

33 Conclusion. We presented a Bayesian tree-based phrase alignment model with better alignment accuracy than GIZA++ on distant language pairs. Translation quality is currently (temporarily) not improved. Future work: robustness to parsing errors, using N-best parses or a parse forest; showing an improvement in translation, e.g. with a tree-based decoder (or Hiero?).

