INSTITUTE OF COMPUTING TECHNOLOGY Forest-to-String Statistical Translation Rules Yang Liu, Qun Liu, and Shouxun Lin Institute of Computing Technology Chinese.


1 INSTITUTE OF COMPUTING TECHNOLOGY Forest-to-String Statistical Translation Rules Yang Liu, Qun Liu, and Shouxun Lin Institute of Computing Technology Chinese Academy of Sciences

2 Outline: Introduction; Forest-to-String Translation Rules; Training; Decoding; Experiments; Conclusion

3 Syntactic and Non-syntactic Bilingual Phrases (Figure: source parse tree with nodes NP and VP over the tags NR NN VV NN and the words BUSH PRESIDENT MADE SPEECH, aligned to "President Bush made a speech"; one phrase pair is marked syntactic and another non-syntactic.)

4 Importance of Non-syntactic Bilingual Phrases: About 28% of bilingual phrases are non-syntactic on an English-Chinese corpus (Marcu et al., 2006). Requiring bilingual phrases to be syntactically motivated loses a good amount of valuable knowledge (Koehn et al., 2003). Keeping the strengths of phrases while incorporating syntax into statistical translation yields significant improvements (Chiang, 2005).

5 Previous Work: Galley et al., 2004 (Figure: the parse tree over BUSH PRESIDENT MADE SPEECH aligned to "President Bush made a speech".)

6 Previous Work: Marcu et al., 2006 (Figure: NPB over DT JJ NN with "the mutual understanding" aligned to THE MUTUAL UNDERSTANDING; the pair "the mutual" / THE MUTUAL over DT JJ; node labels *NPB_*NN, NPB, and NN.)

7 Previous Work: Liu et al., 2006 (Figure: the parse tree over BUSH PRESIDENT MADE SPEECH; the pair BUSH PRESIDENT / "President Bush" over NR NN under NP.)

8 Our Work: We augment the tree-to-string translation model with (1) forest-to-string rules that capture non-syntactic phrase pairs and (2) auxiliary rules that help integrate forest-to-string rules into the tree-to-string model.

9 Outline: Introduction; Forest-to-String Translation Rules; Training; Decoding; Experiments; Conclusion

10 Tree-to-String Rules (Figure: three example rules: NN GUNMAN with "the gunman"; VP over SB VP, with WAS, NP, VV, NN, and KILLED, with "was killed by"; IP over NP VP PU.)

11 Derivation: A derivation is a left-most composition of translation rules that explains how a source parse tree, a target sentence, and the word alignment between them are synchronously generated. (Figure: IP over NP VP PU with the source words GUNMAN, WAS, POLICE, KILLED and the target "the gunman was killed by police .")

12 A Derivation Composed of TRs (tree-to-string rules)

13 A Derivation Composed of TRs (Figure: the derivation starts from the rule IP over NP VP PU.)

14 A Derivation Composed of TRs (Figure: the rule NP over NN GUNMAN, "the gunman", is added.)

15 A Derivation Composed of TRs (Figure: the rule VP over SB VP, with WAS, NP, VV, NN, and KILLED, "was killed by", is added.)

16 A Derivation Composed of TRs (Figure: the rule NN POLICE, "police", is added.)

17 A Derivation Composed of TRs (Figure: the rule PU ".", is added, completing "the gunman was killed by police .")

18 Forest-to-String and Auxiliary Rules: A forest is a tree sequence. Only the root sequence matters when incorporating forest rules. (Figure: a forest-to-string rule over NP, with NN GUNMAN, and SB, with WAS, paired with "the gunman was"; an auxiliary rule IP over NP VP PU, with VP over SB VP.)
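The remark that a forest is a tree sequence of which only the root sequence matters can be pictured with a small data structure. This is a minimal sketch, not the authors' implementation: the `Tree` class and the `root_sequence` helper are hypothetical names introduced here, and the example forest is read off the slide's figure residue.

```python
from dataclasses import dataclass, field


@dataclass
class Tree:
    """A parse-tree node; a leaf has no children (hypothetical structure)."""
    label: str
    children: list = field(default_factory=list)


def root_sequence(forest):
    """Return the labels of the tree roots, left to right."""
    return tuple(tree.label for tree in forest)


# Forest read off slide 18: NP over NN GUNMAN, followed by SB over WAS.
gunman = Tree("NP", [Tree("NN", [Tree("GUNMAN")])])
was = Tree("SB", [Tree("WAS")])
forest = (gunman, was)

print(root_sequence(forest))
```

Under this reading, anything that combines a forest rule with the rest of the tree needs to know only the pair of root labels, not the internal structure below them.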

19 A Derivation Composed of TRs, FRs, and ARs (tree-to-string, forest-to-string, and auxiliary rules)

20 A Derivation Composed of TRs, FRs, and ARs (Figure: the derivation starts from the auxiliary rule IP over NP VP PU, with VP over SB VP.)

21 A Derivation Composed of TRs, FRs, and ARs (Figure: a forest-to-string rule over NP, with NN GUNMAN, and SB, with WAS, yields "the gunman was".)

22 A Derivation Composed of TRs, FRs, and ARs (Figure: a forest-to-string rule over VP, with NP and VV KILLED, and PU yields "killed by" and ".".)

23 A Derivation Composed of TRs, FRs, and ARs (Figure: the rule NP over NN POLICE, "police", completes "the gunman was killed by police .")

24 Outline: Introduction; Forest-to-String Translation Rules; Training; Decoding; Experiments; Conclusion

25 Training: Extract both tree-to-string and forest-to-string rules from a word-aligned, source-side parsed bilingual corpus, using a bottom-up strategy. Auxiliary rules are NOT learnt from real-world data.

26 An Example (Figure: parse tree with NP and VP over NR NN VV NN and the words BUSH PRESIDENT MADE SPEECH, aligned to "President Bush made a speech". Rule shown: NR BUSH, "Bush".)

27 An Example (Same tree and alignment. Rule shown: NN PRESIDENT, "President".)

28 An Example (Same tree and alignment. Rule shown: VV MADE, "made".)

29 An Example (Same tree and alignment. Rule shown: NN SPEECH, "speech".)

30 An Example (Same tree and alignment. Rules shown, all rooted at NP over NR NN: with BUSH PRESIDENT, "President Bush"; with PRESIDENT only, "President"; with BUSH only, "Bush"; with no words.)

31 An Example (Same tree and alignment.)

32 An Example (Same tree and alignment. Rules shown, all rooted at VP over VV NN: with no words, "a"; with MADE, "made a"; with SPEECH, "a speech"; with MADE SPEECH, "made a speech".)

33 An Example (Same tree and alignment. Forest-to-string rules shown over NP and VV, with NR NN and the words MADE, BUSH, PRESIDENT and the target words "made", "President", "Bush". Label: 10 FRs.)

34 An Example (Same tree and alignment.)

35 An Example (Same tree and alignment, with the nodes NP and VP marked. Label: max_height = 2.)

36 Why We Don’t Extract Auxiliary Rules? (Figure: parse tree with nodes IP, NP, VP-B, NP-B and tags NR, NN, CC, NN, VV over the words SHANGHAI PUDONG DEVE WITH LEGAL ESTAB STEP, aligned to "The development of Shanghai's Pudong is in step with the establishment of its legal system".)

37 Outline: Introduction; Forest-to-String Translation Rules; Training; Decoding; Experiments; Conclusion

38 Decoding: The input is a source parse tree and the output is a target sentence. Decoding proceeds bottom-up, builds auxiliary rules while decoding, and computes subcell divisions for building auxiliary rules.

39 An Example (Figure: source parse tree with NP and VP over NR NN VV NN and the words BUSH PRESIDENT MADE SPEECH.) Rule: NR BUSH, "Bush". Derivation: ( NR BUSH ) ||| Bush ||| 1:1. Translation: Bush

40 An Example. Rule: NN PRESIDENT, "President". Derivation: ( NN PRESIDENT ) ||| President ||| 1:1. Translation: President

41 An Example. Rule: VV MADE, "made". Derivation: ( VV MADE ) ||| made ||| 1:1. Translation: made

42 An Example. Rule: NN SPEECH, "speech". Derivation: ( NN SPEECH ) ||| speech ||| 1:1. Translation: speech

43 An Example. Rule: NP over NR NN. Derivation: ( NP ( NR ) ( NN ) ) ||| X1 X2 ||| 1:2 2:1; ( NR BUSH ) ||| Bush ||| 1:1; ( NN PRESIDENT ) ||| President ||| 1:1. Translation: President Bush

44 An Example (The Rule, Derivation, and Translation fields are empty.)

45 An Example. Rule: VP over VV MADE and NN, "made a". Derivation: ( VP ( VV MADE ) ( NN ) ) ||| made a X ||| 1:1 2:3; ( NN SPEECH ) ||| speech ||| 1:1. Translation: made a speech

46 An Example. Rule: NP over NR and NN PRESIDENT, followed by VV MADE, "President made a". Derivation: ( NP ( NN ) ( NN PRESIDENT ) ) ( VV MADE ) ||| President X made a ||| 1:2 2:1 3:3; ( NR BUSH ) ||| Bush ||| 1:1. Translation: President Bush made a

47 An Example (The Rule, Derivation, and Translation fields are empty.)

48 An Example. Rule: NP over NP and VP, with VP over VV NN. Derivation: ( NP ( NP ) ( VP ( VV ) ( NN ) ) ) ||| X1 X2 ||| 1:1 2:1 3:2; ( NP ( NN ) ( NN PRESIDENT ) ) ( VV MADE ) ||| President X made a ||| 1:2 2:1 3:3; ( NR BUSH ) ||| Bush ||| 1:1; ( NN SPEECH ) ||| speech ||| 1:1. Translation: President Bush made a speech
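The rule strings in slides 39 to 48 can be replayed mechanically to see how the partial translations compose. The following is a rough sketch under stated assumptions, not the Lynx decoder: `parse_rule` and `apply_rule` are hypothetical helpers; each link `i:j` is assumed to tie the i-th source frontier item to the j-th target token; the slide's `X 1 X 2` is written `X1 X2`; and the last rule's `||` separator is read as `|||`.

```python
def parse_rule(line):
    """Split 'source ||| target ||| links' into its three fields."""
    source, target, links = (part.strip() for part in line.split("|||"))
    pairs = [tuple(int(i) for i in link.split(":")) for link in links.split()]
    return source, target.split(), pairs


def apply_rule(line, fillers):
    """Fill target positions with sub-translations.

    fillers maps a tuple of source frontier indices to the translation
    that covers them; target tokens without a filler are kept as they are.
    """
    _, target, pairs = parse_rule(line)
    sources_at = {}  # target position -> aligned source indices
    for src, tgt in pairs:
        sources_at.setdefault(tgt, []).append(src)
    words = []
    for position, token in enumerate(target, start=1):
        key = tuple(sorted(sources_at.get(position, [])))
        words.append(fillers.get(key, token))
    return " ".join(words)


# Slide 43: tree-to-string rule that swaps its two children.
np_only = apply_rule("( NP ( NR ) ( NN ) ) ||| X1 X2 ||| 1:2 2:1",
                     {(1,): "Bush", (2,): "President"})

# Slide 46: forest-to-string rule, copied verbatim from the slide.
forest = apply_rule(
    "( NP ( NN ) ( NN PRESIDENT ) ) ( VV MADE ) ||| President X made a ||| 1:2 2:1 3:3",
    {(1,): "Bush"})

# Slide 48: the top rule maps frontier items 1 and 2 to the same variable,
# which is where the forest rule's translation plugs in.
full = apply_rule("( NP ( NP ) ( VP ( VV ) ( NN ) ) ) ||| X1 X2 ||| 1:1 2:1 3:2",
                  {(1, 2): forest, (3,): "speech"})
print(np_only, "|", forest, "|", full)
```

Keying the fillers by the set of aligned source items is a design choice of this sketch; it makes the many-to-one links `1:1 2:1` of the top rule the natural slot for a forest-to-string rule that spans two roots.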

49 Subcell Division (Figure: parse tree with NP and VP over NR NN VV NN and the words BUSH PRESIDENT MADE SPEECH.) Division: 1:1 2:2 3:3 4:4

50 Subcell Division. Division: 1:3 4:4

51 Subcell Division. All divisions of the span 1:4: 1:4; 1:1 2:4; 1:2 3:4; 1:3 4:4; 1:1 2:2 3:4; 1:1 2:3 4:4; 1:2 3:3 4:4; 1:1 2:2 3:3 4:4. A span of n words has 2^(n-1) divisions.
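The eight divisions listed on slide 51 correspond to choosing, for each of the n-1 gaps between adjacent words, whether to cut there. A short sketch of that enumeration follows; `subcell_divisions` is a hypothetical name, and the slides do not say how the decoder actually computes or prunes the divisions.

```python
from itertools import combinations


def subcell_divisions(start, end):
    """Yield every split of the span start..end (1-based, inclusive)
    into contiguous subcells, each given as a (first, last) pair."""
    gaps = range(start, end)  # a cut at p separates word p from word p + 1
    for cut_count in range(end - start + 1):
        for cuts in combinations(gaps, cut_count):
            bounds = [start - 1, *cuts, end]
            yield [(lo + 1, hi) for lo, hi in zip(bounds, bounds[1:])]


divisions = list(subcell_divisions(1, 4))
for division in divisions:
    print(" ".join(f"{first}:{last}" for first, last in division))
```

For the four-word span this yields 2^(4-1) = 8 divisions. The exponential growth in n suggests that a practical decoder would restrict which divisions it considers, though the slides give no detail on that.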

52 Build Auxiliary Rule (Figure: parse tree with NP and VP over NR NN VV NN and the words BUSH PRESIDENT MADE SPEECH, next to the auxiliary rule built from its nodes NP, VP, NR, NN, VV, NN.)

53 Penalize the Use of FRs and ARs: Auxiliary rules, which are built rather than learnt, have no probabilities. We introduce a feature that sums up the node count of auxiliary rules to balance the preference between conventional tree-to-string rules and the new forest-to-string and auxiliary rules.
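The node-count feature might look like the sketch below. It rests on assumptions not stated in the slides: a rule's node count is taken to be the number of bracketed nodes in its source side, derivations are modelled as (kind, source side) pairs, and the top rule of slide 48 is treated as the auxiliary rule. `node_count` and `auxiliary_penalty` are hypothetical names.

```python
def node_count(source_side):
    """Count bracketed nodes in a rule's source side (assumed definition)."""
    return source_side.count("(")


def auxiliary_penalty(derivation):
    """Sum the node counts of the auxiliary rules used in a derivation."""
    return sum(node_count(source) for kind, source in derivation if kind == "AR")


# Derivation of slide 48, with rule kinds assigned by assumption.
derivation = [
    ("AR", "( NP ( NP ) ( VP ( VV ) ( NN ) ) )"),
    ("FR", "( NP ( NN ) ( NN PRESIDENT ) ) ( VV MADE )"),
    ("TR", "( NR BUSH )"),
    ("TR", "( NN SPEECH )"),
]
print(auxiliary_penalty(derivation))
```

A feature of this kind would typically enter the model score with a tuned weight, so that larger auxiliary rules cost more than smaller ones.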

54 Outline: Introduction; Forest-to-String Translation Rules; Training; Decoding; Experiments; Conclusion

55 Experiments: Training corpus: 31,149 sentence pairs with 843K Chinese words and 949K English words. Development set: 2002 NIST Chinese-to-English test set (571 of 878 sentences). Test set: 2005 NIST Chinese-to-English test set (1,082 sentences).

56 Tools: Evaluation: mteval-v11b.pl. Language model: SRI Language Modeling Toolkit (Stolcke, 2002). Significance test: Zhang et al., 2004. Parser: Xiong et al., 2005. Minimum error rate training: optimizeV5IBMBLEU.m (Venugopal and Vogel, 2005).

57 Rules Used in Experiments
Rule   L         P         U        Total
BP     251,173   0         0
TR     56,983    41,027    3,529    101,539
FR     16,609    254,346   25,051   296,006

58 Comparison
System    Rule Set       BLEU4
Pharaoh   BP             0.2182±0.0089
Lynx      BP             0.2059±0.0083
Lynx      TR             0.2302±0.0089
Lynx      TR + BP        0.2346±0.0088
Lynx      TR + FR + AR   0.2402±0.0087

59 TRs Are Still Dominant: To achieve the best result of 0.2402, Lynx made use of 26,082 tree-to-string rules, 9,219 default rules, 5,432 forest-to-string rules, and 2,919 auxiliary rules.
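The claim that tree-to-string rules still dominate can be checked by turning the four counts on slide 59 into shares. This is only arithmetic on the reported numbers; the dictionary keys are labels chosen here.

```python
# Rule usage counts as reported on slide 59.
usage = {
    "tree-to-string": 26082,
    "default": 9219,
    "forest-to-string": 5432,
    "auxiliary": 2919,
}

total = sum(usage.values())
shares = {kind: count / total for kind, count in usage.items()}

for kind, share in shares.items():
    print(f"{kind:17s} {usage[kind]:6d} {share:6.1%}")
print(f"{'total':17s} {total:6d}")
```

Tree-to-string rules come to just under 60% of the 43,652 rule applications, while forest-to-string and auxiliary rules together account for roughly a fifth.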

60 Effect of Lexicalization
Forest-to-String Rule Set   BLEU4
None                        0.2225±0.0085
L                           0.2297±0.0081
P                           0.2279±0.0083
U                           0.2270±0.0087
L + P + U                   0.2312±0.0082

61 Outline: Introduction; Forest-to-String Translation Rules; Training; Decoding; Experiments; Conclusion

62 Conclusion: We augment the tree-to-string translation model with (1) forest-to-string rules that capture non-syntactic phrase pairs and (2) auxiliary rules that help integrate forest-to-string rules into the tree-to-string model. Forest-to-string and auxiliary rules enable tree-to-string models to derive in a more general way and bring significant improvement.

63 Future Work: Scale up to large data; further investigation of auxiliary rules.

64 Thanks!

