Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation
EMNLP'14 paper by Kyunghyun Cho et al.
Recurrent Neural Networks (1/3)
Recurrent Neural Networks (2/3)
An RNN consists of:
- A variable-length input sequence x = (x_1, …, x_T)
- A hidden state h, updated at each timestep: h_t = f(h_{t-1}, x_t)
- An optional output y (e.g. the next symbol in a sequence)
- A non-linear activation function f, which may be as simple as a logistic sigmoid or as complex as a long short-term memory (LSTM) unit
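As a concrete illustration, here is a minimal NumPy sketch of the recurrence h_t = f(h_{t-1}, x_t) with a tanh activation. The names rnn_step, W, U, and b are illustrative, not taken from the paper.

    import numpy as np

    def rnn_step(h_prev, x_t, W, U, b):
        # One vanilla RNN update: h_t = tanh(U x_t + W h_{t-1} + b)
        return np.tanh(U @ x_t + W @ h_prev + b)

    # Toy usage over a variable-length sequence (T = 5, inputs of size 4)
    rng = np.random.default_rng(0)
    d_x, d_h = 4, 8
    U = rng.normal(size=(d_h, d_x), scale=0.1)
    W = rng.normal(size=(d_h, d_h), scale=0.1)
    b = np.zeros(d_h)
    h = np.zeros(d_h)
    for x_t in rng.normal(size=(5, d_x)):
        h = rnn_step(h, x_t, W, U, b)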
Recurrent Neural Networks (3/3)
- The output at each timestep t is the conditional probability p(x_t | x_{t-1}, …, x_1)
- e.g. output from a softmax layer over K symbols:
  p(x_t = j | x_{t-1}, …, x_1) = exp(w_j h_t) / Σ_{j'=1..K} exp(w_{j'} h_t)
- Hence the probability of the whole sequence x can be computed as:
  p(x) = Π_{t=1..T} p(x_t | x_{t-1}, …, x_1)
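A minimal sketch of chaining these conditionals to get log p(x), with all names (E, W, U, W_out) illustrative; tokens are vocabulary indices.

    import numpy as np

    def softmax(z):
        e = np.exp(z - z.max())  # shift for numerical stability
        return e / e.sum()

    def sequence_log_prob(tokens, E, W, U, W_out):
        # log p(x) = sum_t log p(x_t | x_{t-1}, ..., x_1)
        h = np.zeros(W.shape[0])
        prev = np.zeros(E.shape[1])  # stands in for a start-of-sequence input
        log_p = 0.0
        for t in tokens:
            h = np.tanh(U @ prev + W @ h)
            log_p += np.log(softmax(W_out @ h)[t])  # log p(x_t | x_<t)
            prev = E[t]                             # embed the observed symbol
        return log_p

    # Toy setup: vocabulary of 10 symbols, embedding and hidden size 8
    rng = np.random.default_rng(0)
    K, d = 10, 8
    E = rng.normal(size=(K, d), scale=0.1)
    W_out = rng.normal(size=(K, d), scale=0.1)
    W = rng.normal(size=(d, d), scale=0.1)
    U = rng.normal(size=(d, d), scale=0.1)
    print(sequence_log_prob([3, 1, 4, 1, 5], E, W, U, W_out))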
RNN Encoder-Decoder (1/3)
RNN Encoder-Decoder (2/3)
- Encoder:
  - Input: a variable-length sequence x
  - Output: a fixed-length vector representation c
- Decoder:
  - Input: the fixed-length vector representation c
  - Output: a variable-length sequence y
- Note that the decoder's hidden state h_t depends on h_{t-1}, y_{t-1}, and c.
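A minimal NumPy sketch of this interface. The function names and parameter matrices (including C, which feeds c into each decoder step) are illustrative assumptions, not the paper's exact parameterization.

    import numpy as np

    def encode(xs, W, U):
        # Read the whole input sequence; the final hidden state is the
        # fixed-length summary vector c.
        h = np.zeros(W.shape[0])
        for x_t in xs:
            h = np.tanh(U @ x_t + W @ h)
        return h  # c

    def decoder_step(h_prev, y_prev, c, W, U, C):
        # Decoder update: h_t is a function of h_{t-1}, y_{t-1}, and c.
        return np.tanh(W @ h_prev + U @ y_prev + C @ c)

    # Toy usage: encode a 3-step input, then take one decoder step
    rng = np.random.default_rng(0)
    d = 8
    W_e = rng.normal(size=(d, d), scale=0.1)
    U_e = rng.normal(size=(d, d), scale=0.1)
    W_d, U_d, C_d = (rng.normal(size=(d, d), scale=0.1) for _ in range(3))
    c = encode(rng.normal(size=(3, d)), W_e, U_e)
    h = decoder_step(np.zeros(d), np.zeros(d), c, W_d, U_d, C_d)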
RNN Encoder-Decoder (3/3)
- Encoder and decoder are trained jointly to maximize the conditional log-likelihood:
  max_θ (1/N) Σ_{n=1..N} log p_θ(y_n | x_n)
- Usage:
  - Generate an output sequence y for a given input sequence x
  - Score a given pair of input and output sequences by p_θ(y | x)
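The scoring usage can be sketched as teacher-forced decoding that accumulates the log-probability of each reference symbol. This is a hypothetical end-to-end example, with all parameter names illustrative.

    import numpy as np

    def softmax(z):
        e = np.exp(z - z.max())  # shift for numerical stability
        return e / e.sum()

    def score_pair(x_seq, y_ids, E, W_e, U_e, W_d, U_d, C_d, W_out):
        # Returns log p(y | x): the quantity whose average over training
        # pairs is maximized, and the score used to rank candidate outputs.
        c = np.zeros(W_e.shape[0])
        for x_t in x_seq:                          # encoder: compress x into c
            c = np.tanh(U_e @ x_t + W_e @ c)
        h = np.tanh(C_d @ c)                       # initialize decoder from c
        y_prev, log_p = np.zeros(E.shape[1]), 0.0
        for j in y_ids:                            # decoder, teacher-forced on y
            h = np.tanh(W_d @ h + U_d @ y_prev + C_d @ c)
            log_p += np.log(softmax(W_out @ h)[j])
            y_prev = E[j]
        return log_p

    # Toy usage: hidden size 8, output vocabulary of 10 symbols
    rng = np.random.default_rng(0)
    d, K = 8, 10
    E = rng.normal(size=(K, d), scale=0.1)
    mats = [rng.normal(size=(d, d), scale=0.1) for _ in range(5)]
    W_out = rng.normal(size=(K, d), scale=0.1)
    print(score_pair(rng.normal(size=(3, d)), [2, 7, 1], E, *mats, W_out))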
The Hidden Unit (1/2)
Gated Recurrent Unit (GRU), with 2 gates:
- Update gate z: decides how much the hidden state is updated with the new candidate state
- Reset gate r: decides whether the previous hidden state is ignored
The Hidden Unit (2/2)
- Reset gate: r_t = σ(W_r x_t + U_r h_{t-1})
- Update gate: z_t = σ(W_z x_t + U_z h_{t-1})
- New (candidate) state: h̃_t = tanh(W x_t + U (r_t ⊙ h_{t-1}))
- Final state: h_t = z_t ⊙ h_{t-1} + (1 − z_t) ⊙ h̃_t
(⊙ is element-wise multiplication; when r_t ≈ 0 the previous state is ignored, and z_t interpolates between the previous and candidate states)
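A direct NumPy transcription of these four equations (parameter names illustrative). Note the paper's convention: the update gate z weights the previous state, not the candidate.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def gru_step(h_prev, x_t, W_r, U_r, W_z, U_z, W, U):
        r = sigmoid(W_r @ x_t + U_r @ h_prev)          # reset gate
        z = sigmoid(W_z @ x_t + U_z @ h_prev)          # update gate
        h_tilde = np.tanh(W @ x_t + U @ (r * h_prev))  # candidate state
        return z * h_prev + (1.0 - z) * h_tilde        # final state

    # Toy usage: one step with input and hidden size 8
    rng = np.random.default_rng(0)
    d = 8
    mats = [rng.normal(size=(d, d), scale=0.1) for _ in range(6)]
    h = gru_step(np.zeros(d), rng.normal(size=d), *mats)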
Statistical Machine Translation
- Use the RNN encoder-decoder to score phrase pairs
- The score serves as an additional feature in the log-linear model of the phrase-based SMT framework: log p(f | e) = Σ_n w_n f_n(f, e) + log Z(e)
- Trained on the phrase pairs from the phrase table, ignoring their frequencies
- The new score is added to the existing phrase table
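A sketch of how such a score could be appended to a phrase table as one more log-linear feature. The table format and the score_phrase stub are hypothetical; the stub stands in for the trained encoder-decoder.

    def rescore_phrase_table(table, score_phrase):
        # table: list of (source_phrase, target_phrase, [feature_scores]);
        # append log p(target | source) as an extra log-linear feature.
        return [(src, tgt, feats + [score_phrase(src, tgt)])
                for src, tgt, feats in table]

    def score_phrase(src, tgt):
        # Stub: a real system would run the trained RNN encoder-decoder here.
        return -0.1 * (len(src.split()) + len(tgt.split()))

    table = [("la maison", "the house", [-0.2, -1.3]),
             ("la maison bleue", "the blue house", [-0.5, -2.1])]
    print(rescore_phrase_table(table, score_phrase))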
Experiments
- English-to-French machine translation