
Statistical Phrase Alignment Model Using Dependency Relation Probability Toshiaki Nakazawa and Sadao Kurohashi Kyoto University.


1 Statistical Phrase Alignment Model Using Dependency Relation Probability Toshiaki Nakazawa and Sadao Kurohashi Kyoto University

2 Outline
 Background
 Tree-based Statistical Phrase Alignment Model
 Model Training
 Experiments
 Conclusions

3 Conventional Word Sequence Alignment
受 (accept) 光 (light) 素子 (device) に (ni) は (ha) フォト (photo) ゲート (gate) を (wo) 用いた (used)
A photogate is used for the photodetector

4 [Figure: grow-diag-final-and word alignment matrix for "exhibited a strong inhibitory effect on tumor growth in the castrated mice as in the non-castrated mice" and its Japanese translation 非去勢マウスと同様に去勢マウスの腫ようの成長に対し強い抑制効果を示した]
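grow-diag-final-and, the baseline shown above, is the standard heuristic for symmetrizing two directional word alignments. As a rough illustration (not the original Moses implementation), a minimal sketch where each alignment is a set of (target index, source index) pairs:

```python
def grow_diag_final_and(e2f, f2e):
    """Symmetrize two directional word alignments (sets of (i, j) pairs)."""
    neighbors = [(-1, 0), (0, -1), (1, 0), (0, 1),
                 (-1, -1), (-1, 1), (1, -1), (1, 1)]
    alignment = e2f & f2e            # start from the intersection
    union = e2f | f2e
    # grow-diag: add union points adjacent (incl. diagonally) to the
    # current alignment if one of the two words is still unaligned
    changed = True
    while changed:
        changed = False
        for (i, j) in sorted(alignment):
            for (di, dj) in neighbors:
                p = (i + di, j + dj)
                if p in union and p not in alignment:
                    if all(a[0] != p[0] for a in alignment) or \
                       all(a[1] != p[1] for a in alignment):
                        alignment.add(p)
                        changed = True
    # final-and: add remaining union points whose words are both unaligned
    for (i, j) in sorted(union - alignment):
        if all(a[0] != i for a in alignment) and all(a[1] != j for a in alignment):
            alignment.add((i, j))
    return alignment
```

Starting from the intersection keeps precision high; the grow and final steps recover recall from the union.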

5 Conventional Word Sequence Alignment → Proposed Model
受 (accept) 光 (light) 素子 (device) に (ni) は (ha) フォト (photo) ゲート (gate) を (wo) 用いた (used)
A photogate is used for the photodetector
Proposed Model: 1. Dependency trees

6 Proposed Model
受 (accept) 光 (light) 素子 (device) に (ni) は (ha) フォト (photo) ゲート (gate) を (wo) 用いた (used)
A photogate is used for the photodetector
1. Dependency trees
2. Phrase alignment
3. Bi-directional agreement

7 [Figure: the same sentence pair shown two ways: grow-diag-final-and (word alignment matrix) vs. the proposed model (aligned English and Japanese dependency trees for "exhibited a strong inhibitory effect on tumor growth in the castrated mice as in the non-castrated mice" / 非去勢マウスと同様に去勢マウスの腫ようの成長に対し強い抑制効果を示した)]

8 Related Work
 Using tree structures: [Cherry and Lin, 2003], [Quirk et al., 2005], [Galley et al., 2006], ITG, …
 Considering phrase alignment: [Zhang and Vogel, 2005], [Ion et al., 2006], …
 Using two directed models simultaneously: [Liang et al., 2006], [Graca et al., 2008], …

9 Tree-based Statistical Phrase Alignment Model

10 Dependency Analysis of Sentences
Source (Japanese): 受光素子にはフォトゲートを用いた → 受 (accept) 光 (light) 素子 (device) に (ni) は (ha) フォト (photo) ゲート (gate) を (wo) 用いた
Target (English): A photogate is used for the photodetector
[Figure: both sentences shown as dependency trees, preserving word order, with each head node marked]

11 Overview of the Proposed Model (in comparison to the IBM models)
 IBM models find the best alignment as a* = argmax_a P(f, a | e), where f is the source sentence, e is the target sentence, and a is the alignment; the probability decomposes into a word translation (lexical) probability and a word reordering (alignment) probability
 The proposed model replaces word translation with phrase translation probability, and word reordering with dependency relation probability

12 Phrase Translation Probability

13  Note that the sentences are not previously segmented into phrases
[Figure: notation example. Source words f_1 … f_5 are grouped into phrases F_1–F_3 by the mapping s(j): s(1)=1, s(2)=2, s(3)=2, s(4)=3, s(5)=1; target words e_1 … e_4 are grouped into phrases E_1–E_3; the phrase alignment A is A_1=2, A_2=3, A_3=0 (NULL)]
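The slide's notation can be made concrete with a small hypothetical helper (names and data are illustrative, not from the paper): s maps each 1-based source word index to its phrase id, and A maps each source phrase id to a target phrase id, with 0 meaning NULL-aligned.

```python
from collections import defaultdict

def aligned_phrase_pairs(src_words, tgt_phrases, s, A):
    """Read off aligned phrase pairs from the slide's notation.
    s: 1-based source word index -> source phrase id.
    A: source phrase id -> target phrase id (0 = NULL-aligned).
    tgt_phrases: target phrase id -> list of target words."""
    phrases = defaultdict(list)
    for j, word in enumerate(src_words, start=1):
        phrases[s[j]].append(word)           # group words into phrases
    pairs = []
    for k, words in sorted(phrases.items()):
        tgt = tgt_phrases.get(A[k], []) if A[k] != 0 else []
        pairs.append((words, tgt))
    return pairs
```

Using the slide's mapping (s(1)=1, s(2)=2, s(3)=2, s(4)=3, s(5)=1; A_1=2, A_2=3, A_3=0), F_1 = {f_1, f_5} aligns to E_2, and F_3 = {f_4} is NULL-aligned. Note F_1 is not contiguous, which dependency trees permit.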

14 Dependency Relation Probability

15 Dependency Relations
Possible target-side relations between the alignments of a source child f_c and its parent f_p:
 Parent-child: rel(f_c, f_p) = c
 Grandparent-child: rel(f_c, f_p) = c;c
 Inverted parent-child: rel(f_c, f_p) = p
 NULL-aligned parent: rel(f_c, f_p) = NULL_p

16 Dependency Relation Probability
 D_s-pc is a set of parent-child word pairs in the source sentence
 Source-side dependency relation probability is defined in the same manner
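One way to picture the estimation: for every parent-child pair in D_s-pc, classify the target-side relation between the pair's aligned nodes, then normalize the counts. This is an illustrative sketch, not the authors' code; the relation labels follow the slide, and any remaining configurations are lumped into a catch-all "other" label.

```python
from collections import Counter

def relation(parent_of, a_child, a_parent):
    """Classify the target-side relation for a source child/parent pair.
    a_child / a_parent: target tree node each is aligned to (None = NULL)."""
    if a_parent is None:
        return "NULL_p"                      # parent is NULL-aligned
    if a_child is None:
        return "NULL_c"
    if parent_of.get(a_child) == a_parent:
        return "c"                           # parent-child
    if parent_of.get(parent_of.get(a_child)) == a_parent:
        return "c;c"                         # grandparent-child
    if parent_of.get(a_parent) == a_child:
        return "p"                           # inverted parent-child
    return "other"                           # catch-all (assumption)

def relation_probs(pairs, parent_of, align):
    """pairs: parent-child word pairs D_s-pc; align: source word -> target node."""
    counts = Counter(relation(parent_of, align[c], align[p]) for p, c in pairs)
    total = sum(counts.values())
    return {rel: n / total for rel, n in counts.items()}
```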

17 Model Training

18  Step 1: Estimate word translation prob. (IBM Model 1) and initialize dependency relation prob. (word base)
  e.g. p( コロラド |Colorado)=0.7, p( 大学 |university)=0.6, …; p(c)=0.4, p(c;c)=0.3, p(p)=0.2, …
 Step 2: Estimate phrase translation prob. and dependency relation prob. (tree base)
  E-step: 1. Create initial alignment 2. Modify the alignment by hill-climbing, then generate possible phrases
  M-step: Parameter estimation
  e.g. p( コロラド |Colorado)=0.7, p( 大学 |university)=0.6, p( コロラド 大学 |university of Colorado)=0.9, …
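Step 1's word translation probabilities come from IBM Model 1, which is trained with a standard EM loop. A minimal textbook-style sketch (not the authors' implementation):

```python
from collections import defaultdict

def ibm_model1(bitext, iterations=10):
    """Estimate word translation probabilities t(f|e) with EM (IBM Model 1).
    bitext: list of (source_words, target_words) pairs; a NULL token is
    appended to the target side to absorb unaligned source words."""
    src_vocab = {f for fs, _ in bitext for f in fs}
    t = defaultdict(lambda: 1.0 / len(src_vocab))   # uniform initialization
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        for fs, es in bitext:                       # E-step
            es = es + ["NULL"]
            for f in fs:
                z = sum(t[(f, e)] for e in es)      # normalizing constant
                for e in es:
                    c = t[(f, e)] / z               # expected count
                    count[(f, e)] += c
                    total[e] += c
        for (f, e), c in count.items():             # M-step
            t[(f, e)] = c / total[e]
    return t
```

On a toy bitext, words that consistently co-occur accumulate probability mass, which is exactly the word-base table (e.g. p( コロラド |Colorado)) that seeds Step 2.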

19 Step 2 (E-step)
 The initial alignment is greedily created
 The initial alignment is then modified with the operations: Swap, Reject, Add, Extend
[Figure: example of hill-climbing on 受光素子にはフォトゲートを用いた / "A photogate is used for the photodetector", showing the initial alignment followed by Swap, Reject, Add, and Extend steps]
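The hill-climbing loop itself is generic: from the initial alignment, repeatedly move to the best neighbouring alignment produced by the Swap/Reject/Add/Extend operations until none improves the model probability. A schematic sketch (the neighbors function enumerating the four operations is assumed, not shown):

```python
def hill_climb(initial, neighbors, score):
    """Greedy hill-climbing: move to an improving neighbour until none
    exists. neighbors(x) yields candidate alignments (e.g. from the
    Swap/Reject/Add/Extend operations); score(x) is the model probability."""
    current, best = initial, score(initial)
    improved = True
    while improved:
        improved = False
        for cand in neighbors(current):
            s = score(cand)
            if s > best:                 # accept only strict improvements
                current, best, improved = cand, s, True
    return current
```

Because moves are greedy, the search can stop at a local optimum, which is the search-error issue raised on the Discussions slide.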

20 Generate Possible Phrases
 Generate new possible phrases by merging the NULL-aligned nodes into their parent or child non-NULL-aligned nodes
 The new possible phrases are taken into consideration from the next iteration
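The merging step can be pictured as follows; a toy sketch in which the data structures are illustrative assumptions: each NULL-aligned node proposes a new phrase with its parent or child whenever that neighbour is aligned.

```python
def candidate_phrases(parent_of, aligned, words):
    """Propose new phrases by merging each NULL-aligned node with an
    adjacent (parent or child) aligned node in the dependency tree.
    parent_of: node -> head node; aligned: set of non-NULL-aligned nodes."""
    cands = set()
    for n in words:
        if n in aligned:
            continue                          # only NULL-aligned nodes merge
        p = parent_of.get(n)
        if p is not None and p in aligned:
            cands.add((n, p))                 # merge into the aligned parent
        for child, head in parent_of.items():
            if head == n and child in aligned:
                cands.add((child, n))         # merge with an aligned child
    return cands
```

The candidates are only proposals: they enter the phrase table and compete in the next EM iteration rather than being accepted outright.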

21 Model Training
 Step 1: Estimate word translation prob. (IBM Model 1) and initialize dependency relation prob. (word base)
 Step 2: Estimate phrase translation prob. and dependency relation prob. (tree base): E-step (create initial alignment, modify it by hill-climbing, generate possible phrases), M-step (parameter estimation)
  e.g. p( コロラド |Colorado)=0.7, p( 大学 |university)=0.6, p( コロラド 大学 |university of Colorado)=0.9, …

22 Experiments

23 Alignment Experiments
 Training: JST Ja-En paper abstract corpus (1M sentences, Ja: 36.4M words, En: 83.6M words)
 Test: 475 sentences with gold-standard alignments annotated by hand
 Parsers: KNP for Japanese, MSTParser for English
 Evaluation criteria: Precision, Recall, F1
 For the proposed model, we ran 5 iterations in each step
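The evaluation criteria follow directly from comparing predicted alignment links against the gold standard; a straightforward sketch:

```python
def alignment_prf(gold, predicted):
    """Precision, recall, and F1 of predicted alignment links against a
    hand-annotated gold standard (both are sets of (src, tgt) index pairs)."""
    tp = len(gold & predicted)                     # correctly predicted links
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```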

24 Experimental Results

  Method               Prec.  Rec.   F1
  Proposed             87.75  50.27  63.92
  intersection         90.34  34.28  49.71
  grow-final-and       81.32  48.85  61.04
  grow-diag-final-and  79.39  51.15  62.22

The proposed model improves F1 by +1.7 over grow-diag-final-and.

25 Effectiveness of Phrase and Tree

  Setting                     Prec.  Rec.   F1
  Trees + Phrases (Proposed)  85.54  51.00  63.90
  Trees                       89.77  39.47  54.83
  Phrases                     84.41  47.33  60.65
  None                        85.07  38.06  52.59

 Positional relations are used instead of dependency relations in the settings without trees

26 Discussions
 Parsing errors
 Parsing accuracy is basically good, but the parsers still sometimes produce incorrect results
 → Incorporate parsing probability into the model
 Search errors
 Hill-climbing sometimes falls into local optima
 → Random restarts
 Function words
 Function words behave quite differently across languages (e.g. case markers in Japanese, articles in English)
 → Post-processing

27 Post-processing for Function Words
 Reject correspondences between Japanese particles and English "be" or "have"
 Reject correspondences of English articles
 Japanese " する " and " れる ", or English "be" and "have", are merged into their parent verb or adjective if they are NULL-aligned

  Method                        Prec.  Rec.   F1
  Proposed                      87.75  50.27  63.92
  Proposed + modify             87.83  58.40  70.16
  grow-diag-final-and           79.39  51.15  62.22
  grow-diag-final-and + modify  80.46  51.15  62.54

Post-processing improves the proposed model's F1 by +6.2, but grow-diag-final-and's by only +0.3.
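The two rejection rules could be sketched as a simple link filter. This is an illustrative assumption, not the paper's exact rules: the POS tagging scheme and the word lists below are stand-ins.

```python
def post_process(links, src_pos, tgt_words):
    """Drop links pairing a Japanese particle with English be/have, and
    drop links involving English articles. links: set of (src, tgt) index
    pairs; src_pos: POS tag per source word (scheme is an assumption)."""
    ARTICLES = {"a", "an", "the"}                          # assumed list
    BE_HAVE = {"be", "is", "are", "was", "were",
               "have", "has", "had"}                       # assumed list
    kept = set()
    for (j, i) in links:
        tw = tgt_words[i].lower()
        if tw in ARTICLES:
            continue                       # rule 2: reject article links
        if src_pos[j] == "particle" and tw in BE_HAVE:
            continue                       # rule 1: reject particle-be/have
        kept.add((j, i))
    return kept
```

Raising recall this way without hurting precision matches the +6.2 F1 gain in the table: the rejected links were mostly spurious.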

28 Conclusion and Future Work
 Linguistically motivated phrase alignment:
1. Dependency trees
2. Phrase alignment
3. Bi-directional agreement
 Significantly better results than conventional word alignment models
 Future work:
 Apply the proposed model to other language pairs (Japanese-Chinese and so on)
 Incorporate parsing probability into our model
 Investigate the contribution of our alignment results to translation quality

