Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation EMNLP’14 paper by Kyunghyun Cho, et al.


1 Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation EMNLP’14 paper by Kyunghyun Cho, et al.

2 Recurrent Neural Networks (1/3)

3 Recurrent Neural Networks (2/3)
- A variable-length sequence x = (x_1, …, x_T)
- A hidden state h
- An (optional) output y (e.g. the next symbol in the sequence)
- A non-linear activation function f, e.g. the logistic sigmoid or a long short-term memory (LSTM) unit
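The recurrence on this slide can be sketched in a few lines of NumPy. This is a toy illustration, not code from the paper: the names W, U, and rnn_step, the random initialization, and the dimensions are all made up for the example; f is the logistic sigmoid, as the slide suggests.

```python
import numpy as np

rng = np.random.default_rng(0)
hidden, vocab = 4, 3  # toy sizes

W = rng.normal(scale=0.1, size=(hidden, hidden))  # recurrent weights (illustrative, random)
U = rng.normal(scale=0.1, size=(hidden, vocab))   # input weights (illustrative, random)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def rnn_step(h_prev, x_t):
    # h_t = f(W h_{t-1} + U x_t), with f the logistic sigmoid
    return sigmoid(W @ h_prev + U @ x_t)

h = np.zeros(hidden)
for token in [0, 2, 1]:      # a toy sequence of symbol ids
    x = np.zeros(vocab)
    x[token] = 1.0           # one-hot encoding of the input symbol x_t
    h = rnn_step(h, x)
```

The hidden state h is updated at every timestep, so after the loop it summarizes the whole input sequence.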

4 Recurrent Neural Networks (3/3)
- The output at each timestep t is the conditional probability p(x_t | x_{t-1}, …, x_1), e.g. the output of a softmax layer:
  p(x_t = j | x_{t-1}, …, x_1) = exp(w_j h_t) / Σ_{j'} exp(w_{j'} h_t)
- Hence, the probability of the sequence x can be computed as
  p(x) = Π_{t=1}^{T} p(x_t | x_{t-1}, …, x_1)
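The factorization above can be computed directly, accumulating in log space to avoid underflow. The logits below are made-up stand-ins for the per-timestep scores w_j h_t; only the softmax and the product over timesteps are the point.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())   # subtract the max for numerical stability
    return e / e.sum()

# Toy per-timestep logits over a 3-symbol vocabulary (illustrative values).
logits = np.array([[ 2.0, 0.5, -1.0],
                   [ 0.1, 1.2,  0.3],
                   [-0.5, 0.0,  2.5]])
x = [0, 1, 2]  # observed symbols x_1, x_2, x_3

# log p(x) = sum_t log p(x_t | x_{t-1}, ..., x_1)
log_p = sum(np.log(softmax(logits[t])[x[t]]) for t in range(len(x)))
p = np.exp(log_p)
```

Summing log-probabilities is equivalent to the product on the slide, but stays numerically stable for long sequences.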

5 RNN Encoder-Decoder (1/3)

6 RNN Encoder-Decoder (2/3)
- Encoder:
  - Input: a variable-length sequence x
  - Output: a fixed-length vector representation c
- Decoder:
  - Input: the fixed-length vector representation c
  - Output: a variable-length sequence y
- Note that the decoder's hidden state h_t depends on h_{t-1}, y_{t-1}, and c.
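The data flow on this slide can be sketched as follows. Everything here is illustrative: the matrices A, B, C are random stand-ins for learned parameters, plain tanh units replace the paper's gated hidden units, and initializing the decoder state from a transform of c is one plausible choice, not necessarily the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 4  # toy state size

A = rng.normal(scale=0.1, size=(d, d))  # recurrent weights (illustrative)
B = rng.normal(scale=0.1, size=(d, d))  # input weights (illustrative)
C = rng.normal(scale=0.1, size=(d, d))  # context weights (illustrative)

def step(h, x, c=None):
    """One recurrent step; the decoder additionally conditions on c."""
    pre = A @ h + B @ x + (C @ c if c is not None else 0)
    return np.tanh(pre)

# Encoder: read x_1..x_T; the last hidden state becomes the summary c.
src = [rng.normal(size=d) for _ in range(3)]
h = np.zeros(d)
for x_t in src:
    h = step(h, x_t)
c = h  # fixed-length representation of the whole input

# Decoder: each step sees h_{t-1}, the previous output y_{t-1}, and c.
y_prev = np.zeros(d)
h_dec = np.tanh(C @ c)   # initialize from c (one common choice)
outputs = []
for _ in range(2):
    h_dec = step(h_dec, y_prev, c)
    y_prev = h_dec       # stand-in for an output projection
    outputs.append(y_prev)
```

The key point the sketch shows: c is computed once by the encoder and then conditions every decoder step.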

7 RNN Encoder-Decoder (3/3)
- Encoder and decoder are trained jointly to maximize the conditional log-likelihood
  max_θ (1/N) Σ_{n=1}^{N} log p_θ(y_n | x_n)
- Usage:
  - Generate an output sequence given an input sequence
  - Score a given pair of input and output sequences

8 The Hidden Unit (1/2)
- Gated recurrent unit (GRU) with two gates:
  - The update gate z decides how much the hidden state is updated with the new candidate state.
  - The reset gate r decides whether the previous hidden state is ignored.

9 The Hidden Unit (2/2)
- Reset gate: r = σ(W_r x + U_r h_{t-1})
- Update gate: z = σ(W_z x + U_z h_{t-1})
- New (candidate) state: h̃_t = tanh(W x + U (r ⊙ h_{t-1}))
- Final state: h_t = z ⊙ h_{t-1} + (1 − z) ⊙ h̃_t
(σ is the logistic sigmoid; ⊙ denotes element-wise multiplication.)
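The four equations above translate almost line for line into code. This sketch uses random toy parameters (in practice all six matrices are learned jointly) and keeps the paper's interpolation, where z weights the previous state and (1 − z) the candidate state.

```python
import numpy as np

rng = np.random.default_rng(2)
d = 4  # toy state size

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative random parameters; in a real model these are learned.
Wr, Ur = rng.normal(scale=0.1, size=(d, d)), rng.normal(scale=0.1, size=(d, d))
Wz, Uz = rng.normal(scale=0.1, size=(d, d)), rng.normal(scale=0.1, size=(d, d))
W,  U  = rng.normal(scale=0.1, size=(d, d)), rng.normal(scale=0.1, size=(d, d))

def gru_step(h_prev, x):
    r = sigmoid(Wr @ x + Ur @ h_prev)            # reset gate
    z = sigmoid(Wz @ x + Uz @ h_prev)            # update gate
    h_tilde = np.tanh(W @ x + U @ (r * h_prev))  # candidate state
    return z * h_prev + (1 - z) * h_tilde        # final interpolation

h = np.zeros(d)
for x in [rng.normal(size=d) for _ in range(3)]:
    h = gru_step(h, x)
```

When r is near 0 the candidate state ignores h_{t-1}; when z is near 1 the unit simply copies its previous state forward.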

10 Statistical Machine Translation
- The RNN encoder-decoder is used to score phrase pairs.
- Its score serves as an additional feature in the log-linear model of the phrase-based SMT framework.
- The model is trained on phrase pairs (ignoring their corpus frequencies), and the new score is added to the existing phrase table.
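Mechanically, adding the new feature amounts to appending one more score per phrase-table entry. The sketch below is purely illustrative: the tiny phrase table, its existing feature values, and rnn_score (a placeholder for the trained model's log p(target | source)) are all made up for the example.

```python
# Hypothetical phrase table: (source, target) -> existing feature scores
# of the log-linear phrase-based SMT model (illustrative values).
phrase_table = {
    ("la maison", "the house"): [-0.2, -1.1],
    ("la maison", "home"):      [-0.9, -0.7],
}

def rnn_score(source, target):
    """Placeholder for the encoder-decoder's log p(target | source)."""
    return -0.5 * abs(len(source) - len(target))  # toy stand-in, not the real model

# Append the new score as an extra feature on every entry.
for (src, tgt), feats in phrase_table.items():
    feats.append(rnn_score(src, tgt))
```

Because the score is just one more feature, the rest of the SMT pipeline (tuning the log-linear weights, decoding) is unchanged.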

11 Experiments
- English-to-French machine translation

