
1 Yang Liu State Key Laboratory of Intelligent Technology and Systems Tsinghua National Laboratory for Information Science and Technology Department of Computer Science and Technology Tsinghua University, Beijing 100084, China ACL 2013

2 Introduction  Current statistical machine translation approaches fall roughly into two categories:  phrase-based  syntax-based  This work proposes a shift-reduce parsing algorithm that combines the advantages of both  The translation units are string-to-dependency phrase pairs  A maximum entropy model is used to resolve conflicts

3 Introduction  datasets: the NIST Chinese-English translation datasets  evaluation: BLEU & TER, with results compared against phrase-based and syntax-based systems

4 Shift-Reduce Parsing for Phrase-based String-to-Dependency Translation  Example: zongtong jiang yu siyue lai lundun fangwen → The President will visit London in April  (figure: word alignment obtained with GIZA++; target-side dependency tree from a context-free grammar parser)

5 Shift-Reduce Parsing for Phrase-based String-to-Dependency Translation  Two broad categories:  well-formed:  fixed  floating (left or right, according to the position of the head)  ill-formed

          source phrase          target phrase       dependency  category
      r1  fangwen                visit               {}          fixed
      r2  yu siyue               in April            {1 → 2}     fixed
      r3  zongtong jiang         The President will  {2 → 1}     floating left
      r4  yu siyue lai lundun    London in April     {2 → 3}     floating right
      r5  zongtong jiang         President will      {}          ill-formed

({i → j} means target word i is the head of target word j within the phrase)
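A minimal sketch of this classification, assuming the definitions reduce to counting phrase-internal roots (the representation and function below are illustrative assumptions, not the paper's code; the completeness conditions of Shen et al. (2008) are noted but not implemented):

    # Simplified sketch (not the paper's code). The phrase covers target
    # sentence positions start..start+n-1; `head` maps every phrase word to
    # the sentence position of its head. True well-formedness additionally
    # requires each root's dependents to be complete (which is why r5 above,
    # "President will", is ill-formed); that check is omitted here.
    def classify(start, n, head):
        end = start + n - 1
        roots = [w for w in range(start, end + 1)
                 if not (start <= head[w] <= end)]    # head lies outside the phrase
        if len(roots) == 1:
            return "fixed"                            # single fixed head, e.g. r1, r2
        if len({head[r] for r in roots}) == 1:        # siblings share one external head
            # left dependents of a head to the right => left floating (r3);
            # right dependents of a head to the left => right floating (r4)
            return "floating left" if head[roots[0]] > end else "floating right"
        return "ill-formed"

For example, for r3 ("The President will" at positions 1-3, with head = {1: 2, 2: 4, 3: 4}, where position 4 is "visit"), this returns "floating left".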

6 shift-reduce algorithm - example  the decoder starts from an empty state (tuple)  terminate: when all source words have been translated and the stack holds a single complete dependency tree  (a minimal sketch of the process follows)
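A minimal sketch of the state transitions, under assumed semantics (the exact attachment direction of Rl and Rr is our reading, not copied from the paper): SHIFT (S) pushes the target-side dependency structure of the next phrase pair; REDUCE-LEFT (Rl) attaches the roots of the second-topmost structure as left dependents of the topmost structure's head; REDUCE-RIGHT (Rr) does the mirror image.

    from dataclasses import dataclass, field

    @dataclass
    class Node:
        word: str
        deps: list = field(default_factory=list)   # dependents of this word

    def shift(stack, structure):                   # S: push the next phrase pair's
        stack.append(structure)                    # structure = list of root Nodes

    def reduce_left(stack):                        # Rl: second item's roots become
        top, second = stack.pop(), stack.pop()     # left dependents of top's head
        top[0].deps = second + top[0].deps
        stack.append(top)

    def reduce_right(stack):                       # Rr: top item's roots become
        top, second = stack.pop(), stack.pop()     # right dependents of second's head
        second[0].deps = second[0].deps + top
        stack.append(second)

    # Start from the empty state; terminate once every source word is
    # translated and a single complete tree remains on the stack.
    stack = []
    shift(stack, [Node("President", [Node("The")]), Node("will")])  # r3
    shift(stack, [Node("visit")])                                   # r1
    reduce_left(stack)        # "The President will" attaches under "visit"
    assert len(stack) == 1 and stack[0][0].word == "visit"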

7

8 A Maximum Entropy Based Shift-Reduce Parsing Model  h : fixed  l : left floating  r : right floating

9 A Maximum Entropy Based Shift-Reduce Parsing Model  maximum entropy model:  a ∈ {S, R_l, R_r}  c : a boolean indicating whether all source words are covered  h(a, c, s_t, s_{t-1}) : vector of binary features  θ : vector of feature weights
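Putting these symbols together, the model is presumably the standard maximum entropy formulation (a reconstruction from the listed components, not copied from the paper):

    P(a \mid c, s_t, s_{t-1}) =
      \frac{\exp\left(\theta \cdot h(a, c, s_t, s_{t-1})\right)}
           {\sum_{a' \in \{S, R_l, R_r\}} \exp\left(\theta \cdot h(a', c, s_t, s_{t-1})\right)}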

10 A Maximum Entropy Based Shift-Reduce Parsing Model

11  To train the model, we would need a gold-standard action sequence for each training example; however, the same example can be reached by multiple action sequences, so no single gold standard exists  To alleviate this problem: a derivation graph that compactly encodes all action sequences

12

13 Decoding  linear model with the following features:  standard features:  relative frequencies in two directions  lexical weights in two directions  phrase penalty  distance-based reordering model  lexicalized reordering model  n-gram language model  word penalty

14 Decoding (continued)  dependency features:  ill-formed structure penalty  dependency language model  maximum entropy parsing model  (the combined score is sketched below)
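The slides do not show the formula, but feature sets like these are conventionally combined as a linear model over derivations (a sketch; the weight notation is ours):

    \mathrm{score}(d) = \sum_{m=1}^{M} \lambda_m \, h_m(d)

where d is a derivation, the h_m range over the standard and dependency features listed on slides 13-14, and the weights λ_m are tuned on the development set.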

15 Decoding

16  During decoding, the context information inside the stack keeps changing, so the dependency language model and maximum entropy model probabilities cannot be fixed in advance  hypergraph reranking is used (Huang and Chiang, 2007; Huang, 2008)  decoding is divided into two parts: first generate a hypergraph, then rerank it with the dependency features

17 Decoding  To improve rule coverage, the ill-formed structures of Shen et al. (2008) are also used:  if an ill-formed structure has a single root: treat it as a (pseudo) fixed structure  otherwise: split it into one (pseudo) left floating structure and one (pseudo) right floating structure (see the sketch below)
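A sketch of this adaptation rule (the representation and the choice of split point are assumptions of ours, not the paper's code):

    # Reuse ill-formed structures rather than discarding them: single-rooted
    # ones act as (pseudo) fixed structures; multi-rooted ones are split into
    # a (pseudo) left floating and a (pseudo) right floating part. Where the
    # root list is split is an assumption of this sketch.
    def adapt_ill_formed(roots, split):
        if len(roots) == 1:
            return [("pseudo fixed", roots)]
        return [("pseudo left floating", roots[:split]),
                ("pseudo right floating", roots[split:])]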

18 Experiments  evaluated on Chinese-English translation  training data: 2.9M sentence pairs, containing 76.0M Chinese words and 82.2M English words  development set: the 2002 NIST MT Chinese-English dataset  test sets: the 2003-2005 NIST datasets

19 Experiments  dependency trees for the English sentences were obtained with the Stanford parser  a 4-gram language model was trained on the Xinhua portion of the GIGAWORD corpus, which contains 238M English words  a 3-gram dependency language model was trained on the English dependency trees

20 Experiments  compared with:  the Moses phrase-based decoder (Koehn et al., 2007)  a re-implementation of the bottom-up string-to-dependency decoder (Shen et al., 2008)  beam limit: 100  phrase table limit: 20
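For the Moses baseline, these two limits correspond to the classic moses.ini options (an illustrative sketch; section names are from older Moses releases and may differ by version):

    # moses.ini fragment (older-style Moses configuration; illustrative)
    [stack]
    100

    [ttable-limit]
    20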

21 Experiments  Moses shares the same feature set with our system except for the dependency features.  For the bottom-up string-to-dependency system, we included both well-formed and ill- formed structures in chart parsing.

22 Experiments

                                           Moses    dependency   this work
      rule number                          103M     587M         124M
      avg. decoding time (per sentence)    3.67 s   13.89 s      4.56 s

23 Experiments

24

25 Conclusion  This work proposes a shift-reduce parsing algorithm for phrase-based string-to-dependency translation; the approach integrates the advantages of phrase-based and string-to-dependency models, and in Chinese-to-English translation experiments it outperforms both baselines (phrase-based and syntax-based)

26 Future work  add more contextual information to the maximum entropy model to better resolve conflicts; additionally, adapt the dynamic programming algorithm of Huang and Sagae (2010) to improve the string-to-dependency decoder

