RNNs & LSTM Hadar Gorodissky Niv Haim.


(1) Vanilla mode of processing without an RNN, from fixed-sized input to fixed-sized output (e.g. image classification). (2) Image captioning takes an image and outputs a sentence of words. (3) Sentiment analysis, where a given sentence is classified as expressing positive or negative sentiment. (4) Machine translation: an RNN reads a sentence in English and then outputs a sentence in French. (5) Synced sequence input and output, e.g. video classification where we wish to label each frame of the video.
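These modes differ only in how the recurrent layer is wired up, not in the cell itself. A minimal Keras sketch (an assumption, not taken from the slides; layer sizes and input shapes are made up for illustration) of how two of the modes map onto layer settings:

```python
from tensorflow import keras
from tensorflow.keras import layers

# (3) many-to-one, e.g. sentiment analysis: one label per input sequence
many_to_one = keras.Sequential([
    layers.Input(shape=(None, 32)),         # variable-length sequence of 32-dim vectors
    layers.SimpleRNN(64),                   # returns only the last hidden state
    layers.Dense(1, activation="sigmoid"),  # positive / negative sentiment
])

# (5) synced many-to-many, e.g. labelling every frame of a video
many_to_many = keras.Sequential([
    layers.Input(shape=(None, 32)),
    layers.SimpleRNN(64, return_sequences=True),                     # one hidden state per time step
    layers.TimeDistributed(layers.Dense(10, activation="softmax")),  # one label per time step
])

many_to_one.summary()
many_to_many.summary()
```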

RNN
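As a reminder of the recurrence the following slides analyze, here is a minimal numpy sketch of the standard vanilla RNN step h_t = tanh(W_hh h_{t-1} + W_xh x_t + b); the dimensions and random data are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
input_dim, hidden_dim, T = 8, 16, 20

W_xh = rng.normal(scale=0.1, size=(hidden_dim, input_dim))   # input-to-hidden weights
W_hh = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))  # recurrent hidden-to-hidden weights
b = np.zeros(hidden_dim)

xs = rng.normal(size=(T, input_dim))  # a toy input sequence of T steps
h = np.zeros(hidden_dim)              # initial hidden state

for x_t in xs:
    # the same weights are reused at every time step
    h = np.tanh(W_hh @ h + W_xh @ x_t + b)

print(h.shape)  # (16,)
```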

BPTT It is sufficient for the largest eigenvalue λ1 of the recurrent weight matrix to be smaller than 1 for long-term components to vanish (as t → ∞) and necessary for it to be larger than 1 for gradients to explode. We can generalize these results to nonlinear functions σ where the absolute value of σ′(x) is bounded.

RNN - exploding / vanishing gradients It is sufficient for the largest eigenvalue λ1 of the recurrent weight matrix to be smaller than 1 for long-term components to vanish (as t → ∞) and necessary for it to be larger than 1 for gradients to explode. We can generalize these results to nonlinear functions σ where the absolute value of σ′(x) is bounded.
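A rough numpy illustration of this condition (an assumption, not from the slides): repeatedly multiplying a gradient vector by the transpose of W_hh, as BPTT does over many time steps, shrinks its norm when the largest eigenvalue is below 1 and blows it up when it is above 1.

```python
import numpy as np

rng = np.random.default_rng(0)
hidden_dim, T = 16, 50

for scale in (0.9, 1.1):  # largest eigenvalue magnitude roughly below / above 1
    # scaled orthogonal matrix: every eigenvalue has magnitude exactly `scale`
    W_hh = scale * np.linalg.qr(rng.normal(size=(hidden_dim, hidden_dim)))[0]
    grad = rng.normal(size=hidden_dim)
    for _ in range(T):
        grad = W_hh.T @ grad  # one backward step through time (linear part only)
    print(f"scale={scale}: gradient norm after {T} steps = {np.linalg.norm(grad):.3e}")

# The norm shrinks by orders of magnitude for scale=0.9 and grows by orders of magnitude for scale=1.1.
```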

RNN - exploding / vanishing gradients Desmos demo... It's a bit easier to analyze iterated functions if they're monotonic. Eventually, the iterates either shoot off to infinity or wind up at fixed points called sinks, or attractors. Observe that fixed points with derivative f′(x) < 1 are sinks and fixed points with f′(x) > 1 are sources.
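A small Python stand-in for the Desmos demo (the specific functions are assumptions): iterating a monotonic function either settles into a sink or gets pushed away from a source towards another attractor.

```python
import numpy as np

def iterate(f, x0, steps=30):
    # apply f repeatedly, as the unrolled recurrence does to the hidden state
    x = x0
    for _ in range(steps):
        x = f(x)
    return x

# slope 0.5 < 1 at the fixed point 0: a sink, iterates are pulled towards it
print(iterate(lambda x: np.tanh(0.5 * x), x0=2.0))  # close to 0

# slope 2 > 1 at the fixed point 0: a source, iterates are pushed to the nonzero attractor near 0.96
print(iterate(lambda x: np.tanh(2.0 * x), x0=0.1))
```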

RNN - Gradient Clipping Clipping handles exploding gradients. What about vanishing gradients?
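A hedged sketch of how clipping is typically applied (the exact setup behind the slide is an assumption): Keras optimizers rescale any gradient tensor whose norm exceeds a threshold via the clipnorm argument, and the same idea is easy to write by hand.

```python
import numpy as np
from tensorflow import keras

# rescale any gradient tensor whose norm exceeds 1.0 before the update
optimizer = keras.optimizers.SGD(learning_rate=0.01, clipnorm=1.0)

def clip_by_norm(grad, threshold=1.0):
    # keep the direction of the gradient, shrink its magnitude to the threshold
    norm = np.linalg.norm(grad)
    if norm > threshold:
        grad = grad * (threshold / norm)
    return grad

print(np.linalg.norm(clip_by_norm(np.full(100, 5.0))))  # 1.0
```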

LSTM The architecture was designed to make it easy to remember information over long time periods until it's needed. The cell state is updated from a forget gate, an input gate and new candidate values: c_t = f_t ⊙ c_{t-1} + i_t ⊙ c̃_t, where f_t is the forget gate, i_t the input gate and c̃_t the new candidates (⊙ is elementwise multiplication).
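A minimal numpy sketch of one LSTM step showing these gates in code (weight shapes, initialization and data are assumptions for illustration):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    # W, U, b hold the parameters of all four gates stacked together: f, i, o, g
    z = W @ x_t + U @ h_prev + b
    f_t, i_t, o_t, g_t = np.split(z, 4)
    f_t = sigmoid(f_t)              # forget gate: what to keep from the old cell state
    i_t = sigmoid(i_t)              # input gate: how much of the new candidates to write
    o_t = sigmoid(o_t)              # output gate: what part of the cell to expose
    g_t = np.tanh(g_t)              # new candidate values
    c_t = f_t * c_prev + i_t * g_t  # cell state update from the slide above
    h_t = o_t * np.tanh(c_t)        # hidden state
    return h_t, c_t

rng = np.random.default_rng(0)
input_dim, hidden_dim = 8, 16
W = rng.normal(scale=0.1, size=(4 * hidden_dim, input_dim))
U = rng.normal(scale=0.1, size=(4 * hidden_dim, hidden_dim))
b = np.zeros(4 * hidden_dim)

h, c = np.zeros(hidden_dim), np.zeros(hidden_dim)
h, c = lstm_step(rng.normal(size=input_dim), h, c, W, U, b)
print(h.shape, c.shape)  # (16,) (16,)
```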

LSTM - How it solves the vanishing problem
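A brief, assumed illustration of why the cell-state path keeps the error signal alive: through c_t = f_t ⊙ c_{t-1} + i_t ⊙ c̃_t, the gradient with respect to c_{t-1} is just the forget gate f_t, so with forget gates close to 1 the signal is passed back almost unchanged instead of shrinking exponentially.

```python
# forget-gate activations of 0.98 and a vanilla-RNN factor of 0.3 are assumed values for illustration
T = 50
grad_lstm, grad_rnn = 1.0, 1.0
for _ in range(T):
    grad_lstm *= 0.98  # gradient factor along the LSTM cell-state path
    grad_rnn *= 0.3    # typical per-step shrinkage in a plain RNN
print(grad_lstm)  # ~0.36: decays gently over 50 steps
print(grad_rnn)   # ~7e-27: vanishes
```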

LSTM - Keras Implementation
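A minimal Keras sketch of this kind of implementation; the task (many-to-one sequence classification) and every hyperparameter below are assumptions chosen only for illustration.

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Input(shape=(None, 32)),         # variable-length sequences of 32-dim features
    layers.LSTM(64),                        # 64 LSTM units; returns only the last hidden state
    layers.Dense(1, activation="sigmoid"),  # one prediction per sequence
])
model.compile(optimizer=keras.optimizers.Adam(clipnorm=1.0),  # gradient clipping from the earlier slide
              loss="binary_crossentropy",
              metrics=["accuracy"])
model.summary()
# model.fit(x_train, y_train, ...)  # x_train of shape (num_sequences, timesteps, 32)
```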

References
http://karpathy.github.io/2015/05/21/rnn-effectiveness/
https://arxiv.org/pdf/1211.5063.pdf
https://colah.github.io/posts/2015-08-Understanding-LSTMs/
http://www.cs.toronto.edu/~rgrosse/courses/csc321_2017/readings/L15%20Exploding%20and%20Vanishing%20Gradients.pdf
https://ayearofai.com/rohan-lenny-3-recurrent-neural-networks-10300100899b
http://proceedings.mlr.press/v37/jozefowicz15.pdf