Learn to Comment: Week 6
Mahdi Kalayeh, David Hill
Overview
- Quick introduction to LSTMs and BPTT
- Results for this week
- GPU implementation
Introduction to LSTMs: RNNs
[Figure: a recurrent network with input, hidden, and output layers, connected by weights W_ih (input-to-hidden), W_hh (hidden-to-hidden), and W_ho (hidden-to-output).]
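For reference, the recurrence the diagram depicts, written in the usual form (the tanh nonlinearity is an assumption; it is the common choice):

h_t = \tanh(W_{ih} x_t + W_{hh} h_{t-1})
y_t = W_{ho} h_t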
Unrolling RNNs
[Figure: the same network unrolled over time steps t = 0, 1, 2, with Input_t feeding Hidden_t and Hidden_t feeding Output_t; the weights are shared across steps.]
- Hidden state initialized to a neutral value at t = -1
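A minimal sketch of the unrolled forward pass (NumPy; names and shapes are illustrative assumptions, not the project's actual code):

import numpy as np

def rnn_forward(xs, W_ih, W_hh, W_ho):
    """Vanilla RNN forward pass over a sequence of input vectors xs."""
    h = np.zeros(W_hh.shape[0])           # neutral hidden state at t = -1
    hs, ys = [], []
    for x in xs:                          # one iteration per unrolled step
        h = np.tanh(W_ih @ x + W_hh @ h)  # same weights at every step
        hs.append(h)
        ys.append(W_ho @ h)
    return hs, ys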
LSTM Unit
- Designed to mitigate the exploding/vanishing gradient problem
- Learns dependencies at greater temporal depth
LSTM Unit
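The slide does not spell out the equations, so for reference, the standard LSTM update with input, forget, and output gates (the common formulation; the project's exact variant may differ):

i_t = \sigma(W_i [x_t, h_{t-1}] + b_i)    (input gate)
f_t = \sigma(W_f [x_t, h_{t-1}] + b_f)    (forget gate)
o_t = \sigma(W_o [x_t, h_{t-1}] + b_o)    (output gate)
g_t = \tanh(W_g [x_t, h_{t-1}] + b_g)     (candidate cell)
c_t = f_t \odot c_{t-1} + i_t \odot g_t   (memory cell)
h_t = o_t \odot \tanh(c_t)                (hidden state)

The additive cell update is what lets gradients flow back over many steps, which is why the unit mitigates the vanishing gradient problem noted above.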
LSTM Backprop
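As a reminder of how backpropagation through time accumulates gradients across the unrolled steps, here is a minimal BPTT sketch for the vanilla RNN above (illustrative only; the LSTM backward pass adds the corresponding gate and cell terms):

import numpy as np

def rnn_backward(xs, hs, dys, W_ih, W_hh, W_ho):
    """BPTT: sum each weight's gradient over all time steps."""
    dW_ih, dW_hh, dW_ho = (np.zeros_like(W) for W in (W_ih, W_hh, W_ho))
    dh_next = np.zeros(W_hh.shape[0])     # gradient arriving from step t+1
    for t in reversed(range(len(xs))):
        dW_ho += np.outer(dys[t], hs[t])
        dh = W_ho.T @ dys[t] + dh_next    # output path + recurrent path
        dz = dh * (1.0 - hs[t] ** 2)      # backprop through tanh
        h_prev = hs[t - 1] if t > 0 else np.zeros_like(hs[t])
        dW_ih += np.outer(dz, xs[t])
        dW_hh += np.outer(dz, h_prev)
        dh_next = W_hh.T @ dz             # pass gradient to step t-1
    return dW_ih, dW_hh, dW_ho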
Experiment Results
- Previously: tested several 256-unit models
- This week: tested the full-sized 512-unit model
  - also on flickr-8k
  - concatenated features
  - same learning parameters
Experiment Results

Model                     LSTM Size   Bleu-1   Bleu-2   Bleu-3   Bleu-4
Ours: GoogLeNet           256         57.6     37.3     23.6     15.1
Ours: Places              256         52.8     32.4     19.5     11.7
Ours: GoogLeNet + Places  256         59.4     39.9     26.3     17.3
Ours: GoogLeNet + Places  512         59.3     39.6     25.5     16.2
Google: NIC               -           63       -        -        -
Human                     -           70       -        -        -
Result Analysis
- Test scaling the learning rate over epochs (see the sketch below)
- Early stopping on Bleu
- Consider dropout
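A minimal sketch of how the first two ideas could fit into the training loop (the decay factor, patience, and the model.* methods are all hypothetical placeholders, not the project's API):

def train(model, epochs=50, base_lr=0.01, decay=0.95, patience=5):
    """Step-decayed learning rate plus early stopping on validation Bleu."""
    best_bleu, stale = 0.0, 0
    for epoch in range(epochs):
        lr = base_lr * decay ** epoch     # scale learning rate each epoch
        model.train_one_epoch(lr)         # hypothetical training step
        bleu = model.evaluate_bleu()      # e.g. Bleu-4 on a validation set
        if bleu > best_bleu:
            best_bleu, stale = bleu, 0
            model.save_checkpoint()       # keep the best-scoring model
        else:
            stale += 1
            if stale >= patience:         # Bleu stopped improving: stop early
                break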
GPU Implementation
- Working on GPU: forwarding, single-example backprop
- Needs work: backprop over a batch (see the note below)
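One reason batching matters on the GPU: the per-example outer products in the backward pass collapse into a single matrix multiply over the batch, which is the operation GPUs are best at. A sketch of the idea (shapes and names are assumptions):

import numpy as np

def batched_weight_grad(dZ, X):
    """Sum of per-example outer products, computed as one matmul.

    dZ: (B, H) pre-activation gradients for a batch of B examples
    X:  (B, D) corresponding inputs
    Returns the (H, D) weight gradient summed over the batch.
    """
    return dZ.T @ X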