Please enjoy. https://botnik.org/content/harry-potter.html

CIS 700-004: Lecture 9W Generative RNNs 03/13/19

Today's Agenda
- Review of RNNs vs. LSTMs
- Review of language models
- Generative RNNs
- Admiring Karpathy's Ode to the RNN
- Applications of generative LSTMs
- Seq2seq

Why do LSTMs patch the deficiencies of RNNs? And how does adding a cell state help?

Perspective #1: better memory
Real life is chaotic. Information quickly transforms and vanishes. We want the network to learn how to update its beliefs:
- Scenes without Captain America should not update beliefs about Captain America.
- Scenes with Iron Man should focus on gathering details about him.
The RNN's knowledge of the world then evolves more gently.
https://www.quora.com/What-is-an-intuitive-explanation-of-LSTMs-and-GRUs

Perspective #1: better memory
Cell state = long-term memory, hidden state = working memory.
- We learn which parts of long-term memory are worth keeping: the forget gate.
- We learn new information: the candidate cell state.
- We learn which parts of the new information are worth storing long-term: the input gate.
https://www.quora.com/What-is-an-intuitive-explanation-of-LSTMs-and-GRUs
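To make the gates concrete, here is a minimal hand-rolled LSTM step in PyTorch. This is our own sketch (class and variable names included), not the lecture's code.

```python
import torch
import torch.nn as nn

class LSTMCellSketch(nn.Module):
    """One LSTM time step, written out so each gate is visible."""
    def __init__(self, input_size, hidden_size):
        super().__init__()
        # One linear layer produces all four gate pre-activations at once.
        self.linear = nn.Linear(input_size + hidden_size, 4 * hidden_size)

    def forward(self, x_t, h_prev, c_prev):
        z = self.linear(torch.cat([x_t, h_prev], dim=-1))
        f, i, g, o = z.chunk(4, dim=-1)
        f = torch.sigmoid(f)           # forget gate: what to keep from long-term memory
        i = torch.sigmoid(i)           # input gate: which new information to store
        g = torch.tanh(g)              # candidate cell state: the new information itself
        o = torch.sigmoid(o)           # output gate: what to expose as working memory
        c_t = f * c_prev + i * g       # update the cell state (long-term memory)
        h_t = o * torch.tanh(c_t)      # update the hidden state (working memory)
        return h_t, c_t
```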

Perspective #2: vanishing gradient
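A tiny numerical illustration of the problem (our own sketch, not from the slides): backprop through many vanilla tanh-RNN steps and watch the gradient reaching the first step shrink toward zero.

```python
import torch

torch.manual_seed(0)
hidden = 64
W = torch.randn(hidden, hidden) * 0.05      # small recurrent weights
h0 = torch.randn(hidden, requires_grad=True)

h = h0
for _ in range(100):                        # 100 unrolled time steps, inputs omitted
    h = torch.tanh(h @ W)

h.sum().backward()
print(h0.grad.norm())                       # nearly zero: the signal from step 0 has vanished
# Scaling W up instead makes the same product of Jacobians explode.
# The LSTM's additive cell-state update (c_t = f * c_prev + i * g) is what
# lets gradients flow across many steps without this repeated squashing.
```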

Andrej Karpathy

Generative RNNs

Temperature
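A quick sketch of what temperature does at sampling time. Assume `logits` is the RNN's output over the vocabulary at one step; the function name is ours.

```python
import torch
import torch.nn.functional as F

def sample_next_token(logits, temperature=1.0):
    """Sample one token id from unnormalized logits.

    Dividing by the temperature before the softmax sharpens the distribution
    (temperature < 1: safer, more repetitive text) or flattens it
    (temperature > 1: more surprising, more error-prone text).
    """
    probs = F.softmax(logits / temperature, dim=-1)
    return torch.multinomial(probs, num_samples=1).item()

# At temperature 0.1 this almost always picks the highest-scoring token.
print(sample_next_token(torch.tensor([2.0, 1.0, 0.1]), temperature=0.1))
```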

Review: language models Language models assign probabilities to sequences of words. Previously: the n-gram assumption
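In symbols (our notation): a language model factors a sequence's probability with the chain rule, and an n-gram model truncates the conditioning history.

```latex
P(w_1, \dots, w_T) \;=\; \prod_{t=1}^{T} P(w_t \mid w_1, \dots, w_{t-1})
\;\approx\; \prod_{t=1}^{T} P(w_t \mid w_{t-n+1}, \dots, w_{t-1})
```

An RNN language model keeps the full factorization on the left, summarizing the history in its hidden state instead of truncating it.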

The Unreasonable Effectiveness of RNNs http://karpathy.github.io/2015/05/21/rnn-effectiveness/

PixelRNN

Image captioning

Image captioning in PyTorch: CNN encoding
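The lecture's notebook code isn't preserved in this transcript; here is a minimal sketch of the usual pattern (a CNN backbone whose pooled features are projected to the decoder's embedding size; all names are ours).

```python
import torch
import torch.nn as nn
import torchvision.models as models

class EncoderCNN(nn.Module):
    """Encode an image into one feature vector for the caption decoder."""
    def __init__(self, embed_size):
        super().__init__()
        resnet = models.resnet18()                      # in practice, load pretrained weights
        self.backbone = nn.Sequential(*list(resnet.children())[:-1])  # drop the classifier
        self.fc = nn.Linear(resnet.fc.in_features, embed_size)

    def forward(self, images):                          # images: (batch, 3, 224, 224)
        with torch.no_grad():                           # the backbone is often frozen
            feats = self.backbone(images).flatten(1)    # (batch, 512)
        return self.fc(feats)                           # (batch, embed_size)
```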

Image captioning in PyTorch: RNN decoding (train)
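And a matching decoder sketch for training with teacher forcing (again our own names, continuing the `EncoderCNN` sketch above).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DecoderRNN(nn.Module):
    """Caption decoder trained with teacher forcing."""
    def __init__(self, embed_size, hidden_size, vocab_size):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_size)
        self.lstm = nn.LSTM(embed_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, vocab_size)

    def forward(self, img_feats, captions):
        # The image feature acts as the first "word"; then we feed the gold
        # caption tokens (teacher forcing) and predict each next token.
        emb = self.embed(captions[:, :-1])                    # (batch, T-1, embed)
        inputs = torch.cat([img_feats.unsqueeze(1), emb], 1)  # (batch, T, embed)
        out, _ = self.lstm(inputs)
        return self.fc(out)                                   # (batch, T, vocab)

# Training step (sketch):
#   logits = decoder(encoder(images), captions)
#   loss = F.cross_entropy(logits.reshape(-1, logits.size(-1)), captions.reshape(-1))
```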

Image captioning in PyTorch: RNN decoding (test)
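At test time there are no gold tokens, so the decoder feeds its own predictions back in. A greedy-decoding sketch (the end-token id here is a made-up assumption):

```python
import torch

@torch.no_grad()
def generate_caption(decoder, img_feat, max_len=20, end_token=1):
    """Greedy decoding with the DecoderRNN sketch above.

    img_feat: (1, embed_size) feature from the CNN encoder.
    end_token: assumed id of the <end> symbol in our hypothetical vocabulary.
    """
    tokens, state = [], None
    inputs = img_feat.unsqueeze(1)                    # (1, 1, embed): image is the first input
    for _ in range(max_len):
        out, state = decoder.lstm(inputs, state)
        next_id = decoder.fc(out.squeeze(1)).argmax(dim=-1)  # greedy; could sample with temperature
        if next_id.item() == end_token:
            break
        tokens.append(next_id.item())
        inputs = decoder.embed(next_id).unsqueeze(1)  # feed the prediction back in
    return tokens
```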

We've covered many modes of RNNs:
- Fixed-to-variable (k->n)
- Variable-to-fixed (n->k)
- Variable-to-variable (n->n)
Does this cover all variable-to-variable use cases?

What if we want to input and output arbitrary variable length vectors?

seq2seq

How do we evaluate a variable-length language output if there are many possible correct answers?

BLEU (2002) "bilingual evaluation understudy" "the closer a machine translation is to a professional human translation, the better it is"
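For a concrete feel, here is a sketch using NLTK's implementation (the sentences are invented; smoothing keeps short sentences from scoring exactly zero).

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

references = [["the", "cat", "is", "on", "the", "mat"]]   # one or more human translations
hypothesis = ["the", "cat", "sat", "on", "the", "mat"]    # machine output

# BLEU measures modified n-gram precision (here up to 4-grams) against the references.
score = sentence_bleu(references, hypothesis,
                      smoothing_function=SmoothingFunction().method1)
print(f"BLEU: {score:.3f}")
```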

Seq2seq translation in PyTorch
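The lecture's PyTorch notebook isn't in this transcript; here is a bare-bones encoder-decoder sketch of the idea (GRUs, teacher forcing, no attention; all names are ours).

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    """Minimal encoder-decoder translator."""
    def __init__(self, src_vocab, tgt_vocab, embed_size=256, hidden_size=512):
        super().__init__()
        self.src_embed = nn.Embedding(src_vocab, embed_size)
        self.tgt_embed = nn.Embedding(tgt_vocab, embed_size)
        self.encoder = nn.GRU(embed_size, hidden_size, batch_first=True)
        self.decoder = nn.GRU(embed_size, hidden_size, batch_first=True)
        self.out = nn.Linear(hidden_size, tgt_vocab)

    def forward(self, src, tgt):
        # Encode the source; its final hidden state initializes the decoder.
        # This fixed-size "thought vector" is the bottleneck that attention later removes.
        _, h = self.encoder(self.src_embed(src))
        dec_out, _ = self.decoder(self.tgt_embed(tgt[:, :-1]), h)
        return self.out(dec_out)          # logits predicting tgt[:, 1:]
```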

"“Drop your RNN and LSTM, they are no good!”

"Use attention. Attention really is all you need "Use attention. Attention really is all you need!" -- Eugenio Culurciello

Some funny LSTMs

LSTMs on humor https://amoudgl.github.io/blog/funnybot/

LSTMs on humor: Results

LSTMs on humor: Continued. Antijokes!

LSTMs on humor: Continued. Low temperature.

Is Kanye learnable? https://towardsdatascience.com/text-predictor-generating-rap-lyrics-with-recurrent-neural-networks-lstms-c3a1acbbda79