Slide 1
Forest-to-String Statistical Translation Rules
Yang Liu, Qun Liu, and Shouxun Lin
Institute of Computing Technology, Chinese Academy of Sciences
Slide 2: Outline
- Introduction
- Forest-to-String Translation Rules
- Training
- Decoding
- Experiments
- Conclusion
Slide 3: Syntactic and Non-syntactic Bilingual Phrases
[Figure: parse tree with NP and VP over NR NN VV NN; source words BUSH PRESIDENT MADE SPEECH aligned to "President Bush made a speech"; syntactic vs. non-syntactic phrase pairs highlighted]
Slide 4: Importance of Non-syntactic Bilingual Phrases
- About 28% of bilingual phrases are non-syntactic on an English-Chinese corpus (Marcu et al., 2006).
- Requiring bilingual phrases to be syntactically motivated loses a good amount of valuable knowledge (Koehn et al., 2003).
- Keeping the strengths of phrases while incorporating syntax into statistical translation yields significant improvements (Chiang, 2005).
Slide 5: Previous Work (Galley et al., 2004)
[Figure: tree-to-string rules extracted from the parse tree for BUSH PRESIDENT MADE SPEECH / "President Bush made a speech"]
Slide 6: Previous Work (Marcu et al., 2006)
[Figure: rules for "the mutual understanding" (NPB over DT JJ NN), including rules with the synthetic nonterminal *NPB_*NN]
Slide 7: Previous Work (Liu et al., 2006)
[Figure: tree-to-string rule ( NP ( NR BUSH ) ( NN ) ) covering BUSH PRESIDENT ||| President Bush]
Slide 8: Our Work
We augment the tree-to-string translation model with:
- forest-to-string rules that capture non-syntactic phrase pairs
- auxiliary rules that help integrate forest-to-string rules into the tree-to-string model
Slide 9: Outline
Slide 10: Tree-to-String Rules
[Figure: example rules over the tree for GUNMAN WAS POLICE KILLED — ( NN GUNMAN ) ||| the gunman; a VP rule over SB (WAS) and VV (KILLED) ||| was killed by; the root rule ( IP ( NP ) ( VP ) ( PU ) )]
Slide 11: Derivation
A derivation is a left-most composition of translation rules that explains how a source parse tree, a target sentence, and the word alignment between them are synchronously generated.
[Figure: parse tree for GUNMAN WAS POLICE KILLED with target "the gunman was killed by police ."]
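The derivation on this slide can be mimicked with a toy substitution helper: each rule rewrites a node into a target string whose variables are filled by the children's translations. The rule contents follow the slides' example; the helper itself, the dict layout, and the duplicate-NP key "NP2" are illustrative assumptions, not the paper's implementation.

```python
def derive(symbol, rules):
    """Expand a node by substituting its children's translations left to right."""
    target, children = rules[symbol]
    for k, child in enumerate(children, start=1):
        target = target.replace(f"X{k}", derive(child, rules))
    return target

# Toy rule table following the slides' example derivation.
# "NP2" is an illustrative key for the second, distinct NP rule.
rules = {
    "IP":  ("X1 X2 X3", ["NP", "VP", "PU"]),   # ( IP ( NP ) ( VP ) ( PU ) )
    "NP":  ("the gunman", []),                 # ( NP ( NN GUNMAN ) )
    "VP":  ("was killed by X1", ["NP2"]),      # VP rule with one open NP
    "NP2": ("police", []),                     # ( NP ( NN POLICE ) )
    "PU":  (".", []),
}

print(derive("IP", rules))  # the gunman was killed by police .
```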
Slides 12-17: A Derivation Composed of TRs
[Figures: step-by-step derivation — start from the root rule ( IP ( NP ) ( VP ) ( PU ) ); apply ( NP ( NN GUNMAN ) ) ||| the gunman; then the VP rule ||| was killed by X; then ( NN POLICE ) ||| police; finally ( PU . ) ||| . , yielding "the gunman was killed by police ."]
Slide 18: Forest-to-String and Auxiliary Rules
[Figure: a forest-to-string rule over the tree sequence NP SB, covering GUNMAN and WAS ||| the gunman was]
- A forest is a tree sequence.
- Only the root sequence matters while incorporating forest rules.
Slides 19-23: A Derivation Composed of TRs, FRs, and ARs
[Figures: step-by-step derivation — an auxiliary rule rooted at IP exposes the node sequence NP SB; a forest-to-string rule then covers GUNMAN and WAS together ||| the gunman was; further rules yield "killed by" and the final period; ( NP ( NN POLICE ) ) ||| police completes "the gunman was killed by police ."]
Slide 24: Outline
Slide 25: Training
- Extract both tree-to-string and forest-to-string rules from a word-aligned, source-side parsed bilingual corpus
- Bottom-up strategy
- Auxiliary rules are NOT learnt from real-world data
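Rule extraction starts from bilingual phrases that are consistent with the word alignment: no word inside the target span may link to a source word outside the source span. A minimal sketch of that standard consistency check, using the slide's example sentence (the bottom-up traversal of the source parse tree is omitted here, and the function name is an assumption):

```python
def extract_phrase_pairs(src_len, alignment, max_len=4):
    """Enumerate alignment-consistent phrase pairs.

    alignment: set of (i, j) links with 0-based source/target word indices.
    Returns pairs of inclusive spans ((i1, i2), (j1, j2)).
    """
    pairs = []
    for i1 in range(src_len):
        for i2 in range(i1, min(src_len, i1 + max_len)):
            # target positions linked to the source span [i1, i2]
            tgt = [j for (i, j) in alignment if i1 <= i <= i2]
            if not tgt:
                continue
            j1, j2 = min(tgt), max(tgt)
            # consistent iff no word in [j1, j2] links outside [i1, i2]
            if all(i1 <= i <= i2 for (i, j) in alignment if j1 <= j <= j2):
                pairs.append(((i1, i2), (j1, j2)))
    return pairs

# BUSH PRESIDENT MADE SPEECH <-> President Bush made a speech
links = {(0, 1), (1, 0), (2, 2), (3, 4)}
pairs = extract_phrase_pairs(4, links)
# contains ((0, 1), (0, 1)): BUSH PRESIDENT <-> President Bush
```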
Slides 26-32: An Example (rule extraction)
[Figures: bottom-up extraction over the tree for BUSH PRESIDENT MADE SPEECH / "President Bush made a speech" — first the word-level rules ( NR BUSH ) ||| Bush, ( NN PRESIDENT ) ||| President, ( VV MADE ) ||| made, ( NN SPEECH ) ||| speech; then NP-rooted rules in lexicalized and variable versions; then VP-rooted rules such as ( VP ( VV MADE ) ( NN ) ) ||| made a X]
Slide 33: An Example (rule extraction, continued)
[Figure: 10 forest-to-string rules extracted over tree sequences such as NP VV, e.g. covering MADE BUSH PRESIDENT ||| made President Bush]
Slides 34-35: An Example (rule extraction, continued)
[Figures: the NP and VP subtrees of the same tree; forest rules are restricted, e.g. max_height = 2]
Slide 36: Why We Don't Extract Auxiliary Rules
[Figure: parse tree for SHANGHAI PUDONG DE DEVE WITH LEGAL ESTAB STEP with translation "The development of Shanghai's Pudong is in step with the establishment of its legal system"]
Slide 37: Outline
Slide 38: Decoding
- Input: a source parse tree
- Output: a target sentence
- Bottom-up strategy
- Build auxiliary rules while decoding
- Compute subcell divisions for building auxiliary rules
Slides 39-42: An Example (decoding)
Each step shows the rule applied, the derivation, and the partial translation:
- ( NR BUSH ) ||| Bush ||| 1:1 → "Bush"
- ( NN PRESIDENT ) ||| President ||| 1:1 → "President"
- ( VV MADE ) ||| made ||| 1:1 → "made"
- ( NN SPEECH ) ||| speech ||| 1:1 → "speech"
Slide 43: An Example (decoding, continued)
Rule: ( NP ( NR ) ( NN ) ) ||| X1 X2 ||| 1:2 2:1
Derivation: with ( NR BUSH ) ||| Bush ||| 1:1 and ( NN PRESIDENT ) ||| President ||| 1:1
Translation: "President Bush"
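The rule notation on these slides — source tree pattern, target string, and variable alignment separated by "|||" — is regular enough to parse with a short helper. A minimal sketch, assuming the spaced variables "X 1 X 2" are normalized to X1 X2; the function name and the returned dict layout are illustrative:

```python
def parse_rule(line):
    """Split a rule line of the form 'source ||| target ||| alignment'."""
    source, target, align = [field.strip() for field in line.split("|||")]
    alignment = {}
    for pair in align.split():
        s, t = pair.split(":")
        alignment[int(s)] = int(t)   # source variable -> target position
    return {"source": source, "target": target.split(), "alignment": alignment}

rule = parse_rule("( NP ( NR ) ( NN ) ) ||| X1 X2 ||| 1:2 2:1")
# rule["alignment"] == {1: 2, 2: 1}: the two children are reordered
```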
Slides 44-48: An Example (decoding, continued)
- Rule: ( VP ( VV MADE ) ( NN ) ) ||| made a X ||| 1:1 2:3; with ( NN SPEECH ) ||| speech ||| 1:1 → "made a speech"
- Forest rule: ( NP ( NR ) ( NN PRESIDENT ) ) ( VV MADE ) ||| President X made a ||| 1:2 2:1 3:3; with ( NR BUSH ) ||| Bush ||| 1:1 → "President Bush made a"
- Auxiliary rule: ( NP ( NP ) ( VP ( VV ) ( NN ) ) ) ||| X1 X2 ||| 1:1 2:1 3:2; combining the two partial derivations → "President Bush made a speech"
Slides 49-51: Subcell Division
[Figures: a cell spanning the four source words divided into contiguous subcells, e.g. 1:1 2:2 3:3 4:4 or 1:3 4:4]
All divisions of the four-word cell: 1:4; 1:1 2:4; 1:2 3:4; 1:3 4:4; 1:1 2:2 3:4; 1:1 2:3 4:4; 1:2 3:3 4:4; 1:1 2:2 3:3 4:4. In general there are 2^(n-1) divisions for n words.
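The 2^(n-1) count can be checked by enumerating the divisions directly: pick where the first subcell ends, then recurse on the remainder. A minimal sketch with 1-based inclusive spans matching the slides' notation; the function name is illustrative:

```python
def subcell_divisions(lo, hi):
    """All ways to cut the span lo..hi into contiguous subcells."""
    if lo > hi:
        return [[]]                       # empty remainder: one (empty) division
    divisions = []
    for mid in range(lo, hi + 1):
        # first subcell is lo..mid; recurse on the rest of the span
        for rest in subcell_divisions(mid + 1, hi):
            divisions.append([(lo, mid)] + rest)
    return divisions

divs = subcell_divisions(1, 4)
# len(divs) == 8 == 2**(4-1); includes [(1, 4)] and [(1, 1), (2, 2), (3, 3), (4, 4)]
```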
Slide 52: Build Auxiliary Rule
[Figure: an auxiliary rule built over the tree NP VP / NR NN VV NN while decoding]
Slide 53: Penalize the Use of FRs and ARs
Auxiliary rules, which are built rather than learnt, have no probabilities. We introduce a feature that sums up the node count of auxiliary rules to balance the preference between:
- conventional tree-to-string rules
- new forest-to-string and auxiliary rules
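This penalty can be sketched as one term in a log-linear derivation score: learnt rules contribute their log probability, while auxiliary rules contribute only their node count under a (typically negative) weight. The weight names, rule dicts, and node-count field here are illustrative assumptions, not the paper's feature set:

```python
import math

def derivation_score(rules, weights):
    """Score a derivation: log-probs for learnt rules, node-count penalty for ARs."""
    score = 0.0
    aux_nodes = 0
    for rule in rules:
        if rule["type"] == "auxiliary":
            aux_nodes += rule["nodes"]          # built on the fly, no probability
        else:
            score += weights["tm"] * math.log(rule["prob"])
    score += weights["aux_penalty"] * aux_nodes  # negative weight discourages ARs
    return score

weights = {"tm": 1.0, "aux_penalty": -0.5}
derivation = [{"type": "tree", "prob": 0.5}, {"type": "auxiliary", "nodes": 4}]
# score == log(0.5) - 0.5 * 4
```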
Slide 54: Outline
Slide 55: Experiments
- Training corpus: 31,149 sentence pairs with 843K Chinese words and 949K English words
- Development set: 2002 NIST Chinese-to-English test set (571 of 878 sentences)
- Test set: 2005 NIST Chinese-to-English test set (1,082 sentences)
Slide 56: Tools
- Evaluation: mteval-v11b.pl
- Language model: SRI Language Modeling Toolkit (Stolcke, 2002)
- Significance test: Zhang et al., 2004
- Parser: Xiong et al., 2005
- Minimum error rate training: optimizeV5IBMBLEU.m (Venugopal and Vogel, 2005)
Slide 57: Rules Used in Experiments

Rule | L       | P       | U      | Total
BP   | 251,173 | 0       | 0      | 251,173
TR   | 56,983  | 41,027  | 3,529  | 101,539
FR   | 16,609  | 254,346 | 25,051 | 296,006
Slide 58: Comparison

System  | Rule Set     | BLEU4
Pharaoh | BP           | 0.2182 ± 0.0089
Lynx    | BP           | 0.2059 ± 0.0083
Lynx    | TR           | 0.2302 ± 0.0089
Lynx    | TR + BP      | 0.2346 ± 0.0088
Lynx    | TR + FR + AR | 0.2402 ± 0.0087
Slide 59: TRs Are Still Dominant
To achieve the best result of 0.2402, Lynx made use of:
- 26,082 tree-to-string rules
- 9,219 default rules
- 5,432 forest-to-string rules
- 2,919 auxiliary rules
Slide 60: Effect of Lexicalization

Forest-to-String Rule Set | BLEU4
None                      | 0.2225 ± 0.0085
L                         | 0.2297 ± 0.0081
P                         | 0.2279 ± 0.0083
U                         | 0.2270 ± 0.0087
L + P + U                 | 0.2312 ± 0.0082
Slide 61: Outline
Slide 62: Conclusion
We augment the tree-to-string translation model with:
- forest-to-string rules that capture non-syntactic phrase pairs
- auxiliary rules that help integrate forest-to-string rules into the tree-to-string model
Forest-to-string and auxiliary rules enable tree-to-string models to derive in a more general way and bring significant improvement.
Slide 63: Future Work
- Scale up to large data
- Further investigation of auxiliary rules
Slide 64: Thanks!