Presentation transcript:

Learn to Comment: Week 6
Mahdi Kalayeh, David Hill

Overview
- Quick introduction to LSTMs and BPTT
- Results for this week
- GPU implementation

Introduction to LSTMs: RNNs
[Slide diagram: a recurrent network with input, hidden, and output layers; W_ih connects input to hidden, W_hh is the recurrent hidden-to-hidden weight, and W_ho connects hidden to output.]
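As a concrete reading of the diagram, here is a minimal NumPy sketch of a single vanilla-RNN step. The weight names follow the slide (W_ih, W_hh, W_ho); the tanh nonlinearity, the bias terms, and the function name rnn_step are illustrative assumptions, not details from the slides.

```python
import numpy as np

def rnn_step(x, h_prev, W_ih, W_hh, W_ho, b_h, b_o):
    """One forward step of a vanilla RNN.

    x      : input vector at the current time step
    h_prev : hidden state from the previous time step
    """
    h = np.tanh(W_ih @ x + W_hh @ h_prev + b_h)   # new hidden state
    y = W_ho @ h + b_o                            # output scores at this step
    return h, y
```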

Unrolling RNNs
[Slide diagram: the network unrolled over time steps t = 0, 1, 2, with Input_t feeding Hidden_t and Hidden_t feeding Output_t; each Hidden_t also feeds Hidden_{t+1}. The hidden state at t = -1 is initialized to a neutral value.]
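Unrolling just repeats that same step with shared weights across time. A hedged sketch of the unrolled forward pass, with the t = -1 hidden state initialized to zeros (the zero initialization and the helper name rnn_forward are assumptions for illustration):

```python
import numpy as np

def rnn_forward(xs, W_ih, W_hh, W_ho, b_h, b_o):
    """Run a vanilla RNN over a whole input sequence xs (a list of vectors)."""
    h = np.zeros(W_hh.shape[0])               # neutral hidden state at t = -1
    hidden_states, outputs = [], []
    for x in xs:                              # the same weights are reused at every step
        h = np.tanh(W_ih @ x + W_hh @ h + b_h)
        hidden_states.append(h)
        outputs.append(W_ho @ h + b_o)
    return hidden_states, outputs
```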

LSTM Unit
- Designed to mitigate the exploding/vanishing gradient problem
- Learns dependencies over greater temporal depth

LSTM Unit
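For reference, a rough NumPy sketch of the standard LSTM gating equations behind the unit shown on this slide. The stacked weight matrix, the gate ordering, and the name lstm_step are assumptions made for brevity, not details taken from the slides.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM step.  W stacks the input/forget/output/candidate weights and
    has shape (4*H, D + H) for input size D and hidden size H."""
    H = h_prev.shape[0]
    z = W @ np.concatenate([x, h_prev]) + b
    i = sigmoid(z[0 * H:1 * H])        # input gate
    f = sigmoid(z[1 * H:2 * H])        # forget gate
    o = sigmoid(z[2 * H:3 * H])        # output gate
    g = np.tanh(z[3 * H:4 * H])        # candidate cell update
    c = f * c_prev + i * g             # additive cell-state path helps gradients survive
    h = o * np.tanh(c)                 # hidden state passed to the rest of the network
    return h, c
```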

LSTM Backprop
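The backward pass is the error-prone part of a hand-written LSTM/BPTT implementation, so a common sanity check is to compare the analytic gradients against centered finite differences. A minimal, framework-free sketch; loss_fn is a placeholder closure over the network's parameters, and numerical_grad is an illustrative name:

```python
import numpy as np

def numerical_grad(loss_fn, W, eps=1e-5):
    """Centered finite-difference gradient of loss_fn() with respect to W.

    loss_fn takes no arguments and returns a scalar loss computed with the
    current contents of W (it closes over W and the training example).
    """
    grad = np.zeros_like(W, dtype=float)
    for idx in np.ndindex(W.shape):
        original = W[idx]
        W[idx] = original + eps
        loss_plus = loss_fn()
        W[idx] = original - eps
        loss_minus = loss_fn()
        W[idx] = original                       # restore the parameter
        grad[idx] = (loss_plus - loss_minus) / (2 * eps)
    return grad
```

Comparing this against the gradient produced by the hand-written backward pass (for example, the maximum absolute difference over a small network) is a quick way to catch indexing or sign errors in the BPTT code.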

Experiment Results
Previously: tested several 256-unit models
This week: tested the full-sized 512-unit model
- also on Flickr8k
- concatenated features
- same learning parameters

Experiment Results

Model                      LSTM Size   Bleu-1   Bleu-2   Bleu-3   Bleu-4
Ours: GoogLeNet            256         57.6     37.3     23.6     15.1
Ours: Places               256         52.8     32.4     19.5     11.7
Ours: GoogLeNet + Places   256         59.4     39.9     26.3     17.3
Ours: GoogLeNet + Places   512         59.3     39.6     25.5     16.2
Google: NIC                            63       ...
Human                                  70

Result Analysis
- Test scaling the learning rate over epochs
- Early stopping on Bleu
- Consider dropout
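For the first two items above, a hedged sketch of how per-epoch learning-rate scaling and Bleu-based early stopping could be wired into a training loop. train_one_epoch and bleu_on_validation are hypothetical stand-ins for the project's own routines, and the schedule constants are placeholders:

```python
def train_with_schedule(model, train_one_epoch, bleu_on_validation,
                        base_lr=0.01, decay=0.5, decay_every=5,
                        patience=3, max_epochs=50):
    """Step-decay the learning rate over epochs and stop early once
    validation Bleu has not improved for `patience` consecutive epochs."""
    best_bleu, epochs_without_gain = 0.0, 0
    for epoch in range(max_epochs):
        lr = base_lr * (decay ** (epoch // decay_every))   # scale lr over epochs
        train_one_epoch(model, lr)
        bleu = bleu_on_validation(model)                   # e.g. Bleu-4 on held-out captions
        if bleu > best_bleu:
            best_bleu, epochs_without_gain = bleu, 0
        else:
            epochs_without_gain += 1
            if epochs_without_gain >= patience:            # early stopping on Bleu
                break
    return model, best_bleu
```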

GPU Implementation
Working on GPU: forwarding, single-example backprop
Needs work: backprop over a batch