1
Paraphrase Generation Using Deep Learning
Prasanna Vaidya, Co-Founder, DiscoveryAI
2
Agenda
What is Paraphrase Generation?
Use Cases
Building Blocks
Technologies
Publicly Available Datasets & Compute Power
Evaluation Metrics
Important Research Papers
Questions & Answers
3
What is Paraphrase Generation?
Paraphrasing, the act of expressing the same meaning in different ways, is an important subtask in various Natural Language Processing (NLP) applications.
How old is your child? -> Age of your kid?
4
Why it is important & Use Cases
Information Retrieval
Conversational Systems
Content Summarisation
5
Research Areas
Recognition - Identify if two textual units are paraphrases of each other
Extraction - Extract paraphrase instances from a thesaurus or a corpus
Generation - Generate a reference paraphrase given a source text
6
Building Blocks
7
Word Embeddings
Word embedding is a technique where words or phrases from the vocabulary are mapped to vectors of real numbers.
[Figure: example embedding for the word "King"]
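To make the idea concrete, here is a minimal sketch of comparing words by their vectors. The 3-dimensional vectors are made up for illustration (an assumption, not taken from the talk); real embeddings are learned from a corpus and have many more dimensions.

```python
import math

# Hypothetical 3-dimensional embeddings, hand-picked for illustration only.
# Real embeddings are learned from data and typically have 100+ dimensions.
EMBEDDINGS = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.2, 0.8],
    "child": [0.2, 0.5, 0.5],
    "kid":   [0.25, 0.5, 0.45],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(word):
    """Return the other vocabulary word whose vector is most similar to `word`."""
    target = EMBEDDINGS[word]
    others = (w for w in EMBEDDINGS if w != word)
    return max(others, key=lambda w: cosine_similarity(target, EMBEDDINGS[w]))


print(nearest("child"))  # prints: kid
```

Words with similar meaning end up close together, which is what lets a model treat "child" and "kid" as near-interchangeable when paraphrasing.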
8
Neural Networks
9
Limitations of Neural Networks
Plain feed-forward neural networks don't have memory: each input is processed independently of the inputs that came before it.
10
Enter Recurrent Neural Nets
They are networks with loops in them, allowing information to persist.
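The loop can be sketched in a few lines. This is a toy recurrent cell with a single scalar hidden state and hand-set weights (W_X, W_H and BIAS are illustrative assumptions, not trained values): the hidden state from one step is fed back in at the next step.

```python
import math

# Hand-set scalar weights for illustration only; a real RNN learns these
# and uses weight matrices over vector-valued inputs and hidden states.
W_X, W_H, BIAS = 0.5, 0.8, 0.0


def rnn_step(x, h_prev):
    """One recurrent step: mix the current input with the previous hidden state."""
    return math.tanh(W_X * x + W_H * h_prev + BIAS)


def run_rnn(inputs, h0=0.0):
    """Feed a sequence through the cell and return the hidden state after each step."""
    states = []
    h = h0
    for x in inputs:
        h = rnn_step(x, h)  # the loop: h feeds back into the next step
        states.append(h)
    return states


# The second input is 0.0 in both runs, yet the final states differ,
# because the first run still carries information from its first input.
print(run_rnn([1.0, 0.0]))
print(run_rnn([0.0, 0.0]))
```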
11
Limitations of RNNs
I grew up in Pune…I speak fluent Marathi.
Predicting the word "Marathi" depends on "Pune", which may appear many words earlier. In theory, RNNs are absolutely capable of handling such "long-term dependencies." Sadly, in practice, RNNs don't seem to be able to learn them.
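One commonly cited reason is the vanishing gradient: the training signal that links a late output to an early input shrinks at every step it travels back through. A rough sketch on a scalar tanh recurrence (the weight 0.8 and starting state 0.5 are arbitrary assumptions):

```python
import math


def gradient_through_time(w_h, steps, h0=0.5):
    """Return d h_steps / d h_0 for the recurrence h_t = tanh(w_h * h_{t-1})."""
    h = h0
    grad = 1.0
    for _ in range(steps):
        h = math.tanh(w_h * h)
        # Chain rule: d tanh(z)/dz = 1 - tanh(z)^2, and dz/dh_prev = w_h.
        grad *= w_h * (1.0 - h * h)
    return grad


# The influence of the initial state on later states shrinks with distance.
for steps in (1, 5, 20, 50):
    print(steps, gradient_through_time(0.8, steps))
```

With a recurrent weight below 1 in magnitude, every step multiplies the gradient by a factor smaller than 1, so after a few dozen steps almost nothing is left to learn from.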
12
Long Short-Term Memory (LSTM)
LSTMs are explicitly designed to avoid the long-term dependency problem. Remembering information for long periods of time is their default behavior.
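A sketch of why remembering is the default: the cell state is updated additively and guarded by gates. Below is one step of a standard LSTM cell reduced to scalars, with hand-set gate parameters (PARAMS is an illustrative assumption, not learned) that keep the forget gate open and the input gate closed.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One step of a scalar LSTM cell. `params` maps gate name -> (w_x, w_h, bias)."""
    def pre_activation(name):
        w_x, w_h, bias = params[name]
        return w_x * x + w_h * h_prev + bias

    f = sigmoid(pre_activation("forget"))       # how much old memory to keep
    i = sigmoid(pre_activation("input"))        # how much new information to write
    o = sigmoid(pre_activation("output"))       # how much memory to expose
    g = math.tanh(pre_activation("candidate"))  # candidate new information
    c = f * c_prev + i * g                      # additive cell-state update
    h = o * math.tanh(c)
    return h, c


# Hand-set parameters (assumption): forget gate ~1, input gate ~0.
PARAMS = {
    "forget": (0.0, 0.0, 10.0),
    "input": (0.0, 0.0, -10.0),
    "output": (0.0, 0.0, 10.0),
    "candidate": (1.0, 0.0, 0.0),
}

h, c = 0.0, 1.0  # start with a value stored in the cell state
for _ in range(50):
    h, c = lstm_step(0.0, h, c, PARAMS)
print(c)  # still close to 1.0 after 50 steps
```

In a trained LSTM the gates are functions of the input and the previous hidden state, so the network learns when to keep, overwrite, or expose its memory.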
13
Similarity with Machine Translation
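The paraphrasing task can be modelled as a machine translation task: instead of translating into another language, the model translates a sentence into a different sentence in the same language.
How are you? -> ¿Cómo estás?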
14
Encoder Decoder Model
The encoder encodes the input sequence into an internal representation called the 'context vector', which the decoder uses to generate the output sequence. The lengths of the input and output sequences can be different.

    import seq2seq
    from seq2seq.models import SimpleSeq2Seq

    # Encoder-decoder model: 5-dim inputs, 8 output steps of 8 dims each
    model = SimpleSeq2Seq(input_dim=5, hidden_dim=10, output_length=8, output_dim=8, depth=3)
    model.compile(loss='mse', optimizer='rmsprop')
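What the model does internally can be sketched without any library. This toy version uses untrained random weights and plain Python lists (all helper names here are hypothetical, and this is not how the seq2seq library is implemented); it only shows the data flow: a variable-length input is folded into one context vector, which is then unrolled into a fixed number of output steps.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Random weight matrix; stands in for weights that training would learn."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def encode(inputs, w_in, w_rec):
    """Fold the whole input sequence into a single context vector."""
    hidden = [0.0] * len(w_rec)
    for x in inputs:
        pre = [a + b for a, b in zip(matvec(w_in, x), matvec(w_rec, hidden))]
        hidden = [math.tanh(p) for p in pre]
    return hidden


def decode(context, w_rec, w_out, output_length):
    """Unroll the context vector into `output_length` output vectors."""
    hidden = context
    outputs = []
    for _ in range(output_length):
        hidden = [math.tanh(p) for p in matvec(w_rec, hidden)]
        outputs.append(matvec(w_out, hidden))
    return outputs


rng = random.Random(0)
INPUT_DIM, HIDDEN_DIM, OUTPUT_DIM, OUTPUT_LENGTH = 5, 10, 8, 8
enc_w_in = make_matrix(HIDDEN_DIM, INPUT_DIM, rng)
enc_w_rec = make_matrix(HIDDEN_DIM, HIDDEN_DIM, rng)
dec_w_rec = make_matrix(HIDDEN_DIM, HIDDEN_DIM, rng)
dec_w_out = make_matrix(OUTPUT_DIM, HIDDEN_DIM, rng)

# A 3-step input sequence produces an 8-step output sequence.
source = [[rng.random() for _ in range(INPUT_DIM)] for _ in range(3)]
context = encode(source, enc_w_in, enc_w_rec)
target = decode(context, dec_w_rec, dec_w_out, OUTPUT_LENGTH)
print(len(source), len(context), len(target))  # prints: 3 10 8
```

The context vector is the only link between the two halves, which is why input and output lengths are free to differ.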
15
Publicly Available Datasets
16
Compute Requirements
Training on PPDB took 32 hours on an AWS p2.xlarge instance.
17
Evaluation Metrics
BLEU (bilingual evaluation understudy) is an algorithm for evaluating the quality of text which has been machine-translated from one natural language to another.
METEOR (Metric for Evaluation of Translation with Explicit Ordering) is based on the harmonic mean of unigram precision and recall, with recall weighted higher than precision.
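A simplified sketch of the core of each metric, for intuition only. Full BLEU combines clipped precisions for 1- to 4-grams with a brevity penalty, and full METEOR adds stemming, synonym matching and a word-order penalty; none of that is included here, and the function names are illustrative.

```python
from collections import Counter


def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(candidate, reference, n):
    """BLEU-style modified n-gram precision: each candidate n-gram is
    credited at most as many times as it occurs in the reference."""
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
    return overlap / total


def meteor_fmean(candidate, reference):
    """METEOR-style harmonic mean of unigram precision and recall,
    weighting recall 9 times more than precision (exact matches only)."""
    cand_counts, ref_counts = Counter(candidate), Counter(reference)
    matches = sum(min(count, ref_counts[tok]) for tok, count in cand_counts.items())
    if matches == 0:
        return 0.0
    precision = matches / len(candidate)
    recall = matches / len(reference)
    return 10 * precision * recall / (recall + 9 * precision)


reference = "how old is your child".split()
candidate = "how old is your kid".split()
print(clipped_precision(candidate, reference, 1))  # unigram precision
print(clipped_precision(candidate, reference, 2))  # bigram precision
print(meteor_fmean(candidate, reference))
```

Both scores rely on surface word overlap, so a valid paraphrase that uses different words ("kid" for "child") is penalised even though the meaning is preserved.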
18
Results - How are you?
how you doin ' , man
uh , how are you
how ya been
how ya feelin ' , kid
how the hell are you
19
Important Research Papers
Neural Paraphrase Generation with Stacked Residual LSTM Networks
Paraphrase Generation with Deep Reinforcement Learning
A Deep Generative Framework for Paraphrase Generation
20
Thank You! Questions? @getprasannav