A non-contiguous Tree Sequence Alignment-based Model for Statistical Machine Translation
Jun Sun ┼, Min Zhang ╪, Chew Lim Tan ┼


1 A non-contiguous Tree Sequence Alignment-based Model for Statistical Machine Translation
Jun Sun ┼, Min Zhang ╪, Chew Lim Tan ┼

2 Outline
- Introduction
- Non-contiguous Tree Sequence Modeling
- Rule Extraction
- Non-contiguous Decoding: the Pisces Decoder
- Experiments
- Conclusion

3 Contiguous and Non-contiguous Bilingual Phrases
[Figure: examples of contiguous translational equivalences vs. a non-contiguous translational equivalence]

4 Previous Work on Non-contiguous Phrases
- (-) Zhang et al. (2008) acquire non-contiguous phrasal rules from contiguous tree sequence pairs and find them unhelpful in a real syntax-based translation system.
- (+) Wellington et al. (2006) report statistically that discontinuities are very useful for translational-equivalence analysis using binary-branching structures under word-alignment and parse-tree constraints.
- (+) Bod (2007) also finds that discontinuous phrasal rules bring a significant improvement in a linguistically motivated STSG-based translation model.

5 Previous Work on Non-contiguous Phrases (cont.)
A non-contiguous rule extracted from a contiguous tree sequence pair:
VP(VV(到), NP(CP[0], NN(时候))) ↔ SBAR(WRB(when), S[0])
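A rule like the one above pairs two tree fragments whose substitution sites are co-indexed (the shared `[0]`). A minimal sketch of how such a synchronous rule could be represented (illustrative names only, not the paper's implementation):

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    label: str                             # e.g. "VP", "VV", or a terminal like "到"
    children: List["Node"] = field(default_factory=list)
    link: Optional[int] = None             # co-index of a substitution site, e.g. [0]

# VP(VV(到), NP(CP[0], NN(时候)))  <->  SBAR(WRB(when), S[0])
src = Node("VP", [Node("VV", [Node("到")]),
                  Node("NP", [Node("CP", link=0), Node("NN", [Node("时候")])])])
tgt = Node("SBAR", [Node("WRB", [Node("when")]), Node("S", link=0)])

def sites(node):
    """Collect co-indexed substitution sites in left-to-right order."""
    out = [node.link] if node.link is not None else []
    for child in node.children:
        out.extend(sites(child))
    return out

# The two sides of a synchronous rule share the same linked sites.
assert sites(src) == sites(tgt) == [0]
```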

6 Previous Work on Non-contiguous Phrases (cont.)
[Figure: a test case where no rule in the rule set matches]

7 Proposed Non-contiguous Phrase Modeling
Rules are additionally extracted from non-contiguous tree sequence pairs.

8 Contributions
- The proposed model extracts translation rules not only from contiguous tree sequence pairs but also from non-contiguous tree sequence pairs (pairs with gaps). With the help of non-contiguous tree sequences, the model captures non-contiguous phrases without requiring large applicable contexts, and enhances non-contiguous constituent modeling.
- A decoding algorithm for non-contiguous phrase modeling.

9 Outline
- Introduction
- Non-contiguous Tree Sequence Modeling
- Rule Extraction
- Non-contiguous Decoding: the Pisces Decoder
- Experiments
- Conclusion

10 SncTSSG
- Synchronous Tree Substitution Grammar (STSG, Chiang, 2006)
- Synchronous Tree Sequence Substitution Grammar (STSSG, Zhang et al., 2008)
- Synchronous non-contiguous Tree Sequence Substitution Grammar (SncTSSG)

11 Word-Aligned Parse Tree and Two Parse Tree Sequences
[Figure: (1) a word-aligned bi-parsed tree over 把 我 给 钢笔; (2) two substructures abstracted from the subtrees; (3) two tree sequences]

12 Contiguous Translation Rules
- r1: a contiguous tree-to-tree rule
- r2: a contiguous tree sequence rule

13 Non-contiguous Translation Rules
- r1: a non-contiguous tree-to-tree rule
- r2: a non-contiguous tree sequence rule

14 Outline
- Introduction
- Non-contiguous Tree Sequence Modeling
- Rule Extraction
- Non-contiguous Decoding: the Pisces Decoder
- Experiments
- Conclusion

15 A word-aligned parse tree pair

16 Example of contiguous rule extraction (1)

17 Example of contiguous rule extraction (2)

18 Example of contiguous rule extraction (3)

19 Example of contiguous rule extraction (4): abstract into substructures

20 Example of non-contiguous rule extraction (1): rules extracted from non-contiguous tree sequence pairs

21 Example of non-contiguous rule extraction (2): substructures abstracted from non-contiguous tree sequence pairs

22 Outline
- Introduction
- Non-contiguous Tree Sequence Modeling
- Rule Extraction
- Non-contiguous Decoding: the Pisces Decoder
- Experiments
- Conclusion

23 The Pisces Decoder
Pisces searches with two modules:
- A CFG-based chart parser, used as a pre-processor that maps an input sentence to a parse tree T_s (for details of chart parsing, see Charniak (1997)).
- A span-based tree decoder with three phases:
  - contiguous decoding (same as Zhang et al., 2008)
  - source-side non-contiguous translation
  - target-side tree sequence reordering
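The span-based control flow of the second module can be sketched as follows (a toy illustration, not the actual Pisces code): smaller spans are processed first, so their hypotheses are already in the chart when a larger covering span is translated.

```python
def decode(n, translate_span):
    """Bottom-up span-based decoding over a sentence of n words.

    translate_span(i, j, chart) builds the hypothesis list for span [i, j)
    from the already-filled smaller spans stored in `chart`.
    """
    chart = {}
    for length in range(1, n + 1):              # smaller spans first
        for i in range(n - length + 1):
            chart[(i, i + length)] = translate_span(i, i + length, chart)
    return chart[(0, n)]

# Toy span translator: glue the two halves of each span back together.
def toy(i, j, chart):
    if j - i == 1:
        return [f"w{i}"]
    return chart[(i, i + 1)] + chart[(i + 1, j)]

assert decode(3, toy) == ["w0", "w1", "w2"]
```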

24 Source-side non-contiguous translation
- Source gap insertion: an uncovered neighbouring span such as IN(in) NP(...) can be inserted either to the left of the gap (left insertion) or to the right (right insertion).

25 Target-side tree sequence reordering
- Binarize each span into a left sub-span and a right sub-span.
- Generate new translation hypotheses for the span by inserting the candidate translations of the right sub-span into each gap of the left sub-span's hypotheses.
- Symmetrically, generate hypotheses by inserting the candidate translations of the left sub-span into each gap of the right sub-span's hypotheses.
[Figure: a candidate target-span hypothesis with gaps, split into a left span and a right span]
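The gap-filling step above can be sketched in a few lines (a hedged illustration assuming hypotheses are plain token lists with a gap sentinel; the real decoder works over tree sequences):

```python
GAP = object()   # sentinel marking a gap in a partial target hypothesis

def fill_gaps(gapped_hypo, filler):
    """Yield one completed hypothesis per gap, splicing `filler` into it."""
    for i, tok in enumerate(gapped_hypo):
        if tok is GAP:
            yield gapped_hypo[:i] + filler + gapped_hypo[i + 1:]

# Left sub-span hypothesis "will ___ continue" completed with the right
# sub-span's candidate translation "in the recent surveys".
left = ["will", GAP, "continue"]
right = ["in", "the", "recent", "surveys"]
completed = list(fill_gaps(left, right))
assert completed == [["will", "in", "the", "recent", "surveys", "continue"]]
```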

26 Modeling
The model combines feature functions h_m defined over:
- the source/target sentence
- the source/target parse tree
- non-contiguous source/target tree sequences
- source/target spans
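Since the weights of the feature functions h_m are tuned by minimum error rate training (slide 29), the combination is presumably the standard log-linear formulation; a reconstruction under that assumption, with generic symbols s, t for the source and target sentences:

```latex
\hat{t} \;=\; \operatorname*{arg\,max}_{t}\; \sum_{m=1}^{M} \lambda_m \, h_m(s, t)
```

Each lambda_m is the weight of feature h_m, optimized on the development set.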

27 Features
- the bi-phrasal translation probabilities
- the bi-lexical translation probabilities
- the target language model
- the number of words in the target sentence
- the number of rules used
- the average source-side tree depth of the rules used
- the number of non-contiguous rules used
- the number of reorderings caused by the use of non-contiguous rules
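How these features combine into a single hypothesis score can be sketched as a weighted sum (illustrative feature names and weight values, not those of the paper; probability features enter as log-probabilities):

```python
def loglinear_score(features, weights):
    """Combine feature values h_m with their tuned weights lambda_m."""
    return sum(weights[name] * value for name, value in features.items())

features = {"log_p_phrase": -4.2, "log_p_lex": -6.1, "log_lm": -12.3,
            "word_count": 7, "rule_count": 3, "nc_rule_count": 1}
weights = {"log_p_phrase": 1.0, "log_p_lex": 0.5, "log_lm": 0.8,
           "word_count": -0.1, "rule_count": -0.05, "nc_rule_count": -0.2}

score = loglinear_score(features, weights)   # score ≈ -18.14
```

The decoder keeps, for each span, the hypotheses with the highest such scores.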

28 Outline
- Introduction
- Non-contiguous Tree Sequence Modeling
- Rule Extraction
- Non-contiguous Decoding: the Pisces Decoder
- Experiments
- Conclusion

29 Experimental settings
- Training corpus: Chinese-English FBIS corpus
- Development set: NIST MT 2002 test set
- Test set: NIST MT 2005 test set
- Evaluation metric: case-sensitive BLEU-4 (mteval-v11b.pl)
- Parser: Stanford Parser (Chinese/English)
- Language model: SRILM 4-gram
- Tuning: minimum error rate training (Och, 2003)
- Model optimization: gaps allowed on one side only

30 Model comparison in BLEU
Table 1: Translation results of different models (cBP refers to contiguous bilingual phrases without syntactic structural information, as used in Moses)

System  | Model   | BLEU
Moses   | cBP     | 23.86
Pisces  | STSSG   | 25.92
Pisces  | SncTSSG | 26.53

31 Rule combination
Table 2: Performance of different rule combinations

ID | Rule Set                  | BLEU
1  | cR (STSSG)                | 25.92
2  | cR w/o ncPR               | 25.87
3  | cR w/o ncPR + tgtncR      | 26.14
4  | cR w/o ncPR + srcncR      | 26.50
5  | cR w/o ncPR + src&tgtncR  | 26.51
6  | cR + tgtncR               | 26.11
7  | cR + srcncR               | 26.56
8  | cR + src&tgtncR (SncTSSG) | 26.53

cR: rules derived from contiguous tree sequence pairs (i.e., all STSSG rules)
ncPR: non-contiguous rules derived from contiguous tree sequence pairs, with at least one non-terminal leaf node between two lexicalized leaf nodes
srcncR: non-contiguous rules with gaps in the source side
tgtncR: non-contiguous rules with gaps in the target side
src&tgtncR: non-contiguous rules with gaps in either side

32 Bilingual Phrasal Rules
Table 3: Performance of bilingual phrasal rules

System | Rule Set           | BLEU
Moses  | cBP                | 23.86
Pisces | cBP                | 22.63
Pisces | cBP + tgtncBP      | 23.74
Pisces | cBP + srcncBP      | 23.93
Pisces | cBP + src&tgtncBP  | 24.24

srcncBP: non-contiguous phrasal rules with gaps in the source side
tgtncBP: non-contiguous phrasal rules with gaps in the target side
src&tgtncBP: non-contiguous phrasal rules with gaps in either side

33 Maximal number of gaps
Table 4: Performance and rule set size with different maximal numbers of gaps

Max gaps (source/target) | Rule #     | BLEU
0 / 0                    | 1,661,045  | 25.92
1 / 1                    | +841,263   | 26.53
2 / 2                    | +447,161   | 26.55
3 / 3                    | +17,782    | 26.56
∞                        | +8,223     | 26.57

34 Sample translations

Source: 才/only 过/pass 了/null 五年/five years ，两人/two people 就/null 对簿公堂/confront at court
Reference: after only five years the two confronted each other at court
STSSG: only in the five years, the two candidates would 对簿公堂
SncTSSG: the two people can confront other countries at court leisurely manner only in the five years
Key rules: VV(对簿公堂) → VB(confront) NP(JJ(other),NNS(courts)) IN(at) NN(court) *** JJ(leisurely) NN(manner)

Source: 欧元/Euro 的/'s 大幅/substantial 升值/appreciation 将/will 在/in 近期/recent 的/'s 调查/survey 中/middle 持续/continue 对/for 经济/economy 信心/confidence 产生/produce 影响/impact
Reference: substantial appreciation of the euro will continue to impact the economic confidence in the recent surveys
STSSG: substantial appreciation of the euro has continued to have an impact on confidence in the economy, in the recent surveys will
SncTSSG: substantial appreciation of the euro will continue in the recent surveys have an impact on economic confidence
Key rules: AD(将/will) *** VV(持续/continue) → VP(MD(will),VB(continue)); P(在/in) *** LC(中/middle) → IN(in)

35 Conclusion
- The non-contiguous tree sequence alignment model based on SncTSSG better captures non-contiguous phrases and the reordering caused by non-contiguous constituents with large gaps.
- Observations:
  - In the Chinese-English translation task, gaps are more effective on the Chinese side than on the English side.
  - Allowing only one gap is already effective.
- Future work:
  - Handling redundant non-contiguous rules
  - Optimizing the large rule set

36 The End

