ECE 6504: Deep Learning for Perception. Dhruv Batra, Virginia Tech. Topics: LSTMs (intuition and variants); [Abhishek:] Lua / Torch tutorial.


Administrivia: HW3 is out today and due in 2 weeks. Please, please start early. (C) Dhruv Batra

RNN: basic block diagram. (C) Dhruv Batra. Image Credit: Christopher Olah

Key Problem: learning long-term dependencies is hard. (C) Dhruv Batra. Image Credit: Christopher Olah

Meet LSTMs: how about we explicitly encode memory? (C) Dhruv Batra. Image Credit: Christopher Olah

LSTMs Intuition: Memory. The cell state carries the memory. (C) Dhruv Batra. Image Credit: Christopher Olah

LSTMs Intuition: Forget Gate. Should we continue to remember this "bit" of information or not? (C) Dhruv Batra. Image Credit: Christopher Olah

LSTMs Intuition: Input Gate. Should we update this "bit" of information or not? If so, with what? (C) Dhruv Batra. Image Credit: Christopher Olah

LSTMs Intuition: Memory Update. Forget that + memorize this. (C) Dhruv Batra. Image Credit: Christopher Olah
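Concretely, the forget gate, input gate, and memory update described on these slides can be written (in Christopher Olah's notation, where [h_{t-1}, x_t] is the concatenation of the previous hidden state and the current input) as:

```latex
f_t = \sigma\!\left(W_f \cdot [h_{t-1}, x_t] + b_f\right)           % forget gate
i_t = \sigma\!\left(W_i \cdot [h_{t-1}, x_t] + b_i\right)           % input gate
\tilde{C}_t = \tanh\!\left(W_C \cdot [h_{t-1}, x_t] + b_C\right)    % candidate memory
C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t                     % forget that + memorize this
```

The last line is exactly "forget that + memorize this": elementwise, f_t scales down the old memory and i_t gates in the new candidate.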

LSTMs Intuition: Output Gate. Should we output this "bit" of information to "deeper" layers? (C) Dhruv Batra. Image Credit: Christopher Olah


LSTMs: a pretty sophisticated cell. (C) Dhruv Batra. Image Credit: Christopher Olah
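Putting the gates together, one LSTM time step can be sketched in NumPy as below. This is an illustrative sketch, not the course's Torch code: the weight names (W_f, W_i, W_c, W_o) and the `lstm_step` helper are hypothetical, chosen to mirror the gate equations.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM time step. `p` holds illustrative weight matrices W_f, W_i,
    W_c, W_o (each applied to the concatenation [h_{t-1}, x_t]) and biases."""
    z = np.concatenate([h_prev, x_t])
    f = sigmoid(p["W_f"] @ z + p["b_f"])        # forget gate: keep this "bit"?
    i = sigmoid(p["W_i"] @ z + p["b_i"])        # input gate: update this "bit"?
    c_tilde = np.tanh(p["W_c"] @ z + p["b_c"])  # candidate new memory
    c_t = f * c_prev + i * c_tilde              # forget that + memorize this
    o = sigmoid(p["W_o"] @ z + p["b_o"])        # output gate: expose this "bit"?
    h_t = o * np.tanh(c_t)
    return h_t, c_t

# Tiny usage example with random weights (hidden size 3, input size 2).
rng = np.random.default_rng(0)
H, X = 3, 2
p = {f"W_{g}": rng.standard_normal((H, H + X)) for g in "fico"}
p.update({f"b_{g}": np.zeros(H) for g in "fico"})
h, c = lstm_step(rng.standard_normal(X), np.zeros(H), np.zeros(H), p)
```

Because the output gate is a sigmoid and the cell state passes through a tanh, every entry of the hidden state h stays strictly inside (-1, 1).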

LSTM Variant #1: Peephole Connections. Let the gates see the cell state / memory. (C) Dhruv Batra. Image Credit: Christopher Olah

LSTM Variant #2: Coupled Gates. Only memorize new if forgetting old. (C) Dhruv Batra. Image Credit: Christopher Olah

LSTM Variant #3: Gated Recurrent Units (GRUs). Changes: no explicit memory cell (the memory is the hidden output itself), and a single update gate z both memorizes the new and forgets the old. (C) Dhruv Batra. Image Credit: Christopher Olah
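A GRU step can be sketched in the same illustrative NumPy style (weight names W_z, W_r, W_h are hypothetical). Note the two changes from the LSTM: there is no separate cell state, and the single update gate z interpolates between keeping the old hidden state and writing the new candidate.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, h_prev, p):
    """One GRU time step; `p` holds illustrative weights W_z, W_r, W_h and biases."""
    zc = np.concatenate([h_prev, x_t])
    z = sigmoid(p["W_z"] @ zc + p["b_z"])   # update gate: forget old <=> memorize new
    r = sigmoid(p["W_r"] @ zc + p["b_r"])   # reset gate
    h_tilde = np.tanh(p["W_h"] @ np.concatenate([r * h_prev, x_t]) + p["b_h"])
    # Memory lives in the hidden output itself; z couples forgetting and writing.
    return (1.0 - z) * h_prev + z * h_tilde

# Usage example (hidden size 3, input size 2).
rng = np.random.default_rng(0)
H, X = 3, 2
p = {k: rng.standard_normal((H, H + X)) for k in ("W_z", "W_r", "W_h")}
p.update({b: np.zeros(H) for b in ("b_z", "b_r", "b_h")})
h = gru_step(rng.standard_normal(X), np.zeros(H), p)
```

The interpolation (1 - z) * h_prev + z * h_tilde is the coupled-gates idea taken to its limit: one gate decides both what to drop and what to write.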

RMSProp Intuition. Gradients ≠ direction to the optimum: gradients point in the direction of steepest ascent locally, not where we want to go long term. Gradient magnitudes are mismatched across parameters: where the magnitude is large we should travel a small distance, and where it is small we should travel a large distance. (C) Dhruv Batra. Image Credit: Geoffrey Hinton

RMSProp Intuition. Keep a running record of previous squared gradients to estimate per-parameter magnitudes, and divide each gradient by this accumulated value. (C) Dhruv Batra
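That intuition is the standard RMSProp update: keep an exponential moving average of squared gradients and divide each step by its square root, so parameters with large gradients take small steps and vice versa. The function and parameter names below are illustrative.

```python
import numpy as np

def rmsprop_update(w, grad, cache, lr=0.01, decay=0.9, eps=1e-8):
    """One RMSProp step: update the running average of squared gradients,
    then take a per-parameter rescaled step."""
    cache = decay * cache + (1.0 - decay) * grad ** 2   # accumulate magnitudes
    w = w - lr * grad / (np.sqrt(cache) + eps)          # divide by the accumulator
    return w, cache

# Two parameters with wildly different gradient magnitudes end up taking
# steps of roughly the same size once normalized.
w = np.array([1.0, 1.0])
grad = np.array([100.0, 0.001])
w_new, cache = rmsprop_update(w, grad, np.zeros(2))
```

Starting from an empty cache, both coordinates step by roughly lr / sqrt(1 - decay) regardless of their raw gradient magnitude, which is exactly the "travel a small distance when the magnitude is large" behavior from the slide.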