Image Captions With Deep Learning Yulia Kogan & Ron Shiff
Lecture outline
Part 1 – NLP and RNN Introduction
  "The Unreasonable Effectiveness of Recurrent Neural Networks"
  Basic Recurrent Neural Network NLP example
  Long Short-Term Memory RNNs
Part 2 – Image Captioning Algorithms using RNNs
The Unreasonable Effectiveness of Recurrent Neural Networks (taken from Andrej Karpathy's blog)
So far – "old school" neural networks: fixed-length inputs and outputs
RNNs operate over sequences of vectors (input and/or output)
Examples: image captions, sentiment analysis, machine translation, "word prediction"
The Unreasonable Effectiveness of Recurrent Neural Networks – sample output: RNN-generated algebraic geometry (LaTeX)
The Unreasonable Effectiveness of Recurrent Neural Networks – sample output: RNN-generated Shakespeare
Word Vectors
Classical word representation is "one hot": each word is represented by a sparse binary vector of vocabulary size $|V|$, with a single 1 at the word's index.
Word Vectors
A more modern approach: represent each word by a dense vector $x \in \mathbb{R}^d$ with $d \ll |V|$.
"Semantically" close words are close in the vector space.
Semantic relations are preserved in the vector space: "king" + "woman" - "man" ≈ "queen"
Word Vectors
A word vector can be written as $x = L\,w$, where $L \in \mathbb{R}^{d \times |V|}$ is the embedding matrix and $w$ is a "one hot" vector.
Beneficial for most deep learning tasks.
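As a concrete illustration (not from the original slides), here is a minimal numpy sketch of the lookup $x = L\,w$; the toy vocabulary, embedding dimension, and random matrix $L$ are all illustrative assumptions.

```python
import numpy as np

# A one-hot vector selects a column of the embedding matrix L,
# giving the dense word vector x = L @ w.
vocab = ["king", "queen", "man", "woman", "hotel"]   # hypothetical toy vocabulary
V, d = len(vocab), 3                                 # vocabulary size, embedding dim

rng = np.random.default_rng(0)
L = rng.standard_normal((d, V))                      # embedding matrix (d x |V|)

w = np.zeros(V)
w[vocab.index("king")] = 1.0                         # "one hot" vector for "king"

x = L @ w                                            # dense word vector
assert np.allclose(x, L[:, vocab.index("king")])     # same as a column lookup
print(x)
```

In practice the matrix product is never formed explicitly; frameworks implement it as an index lookup into $L$.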
RNN – Language Model (based on Richard Socher's lecture, Deep Learning in NLP, Stanford)
A language model computes a probability for a sequence of words:
$P(w_1, \dots, w_T) = \prod_{t=1}^{T} P(w_t \mid w_1, \dots, w_{t-1})$
Examples:
Word ordering: $P(\text{the cat is small}) > P(\text{small the is cat})$
Word choice: $P(\text{walking home after school}) > P(\text{walking house after school})$
Useful for machine translation and speech recognition
Recurrent Neural Networks – Language Model
Each output depends on all previous inputs.
RNN – Language Model
Input: word vectors $x_1, \dots, x_T$
At each time step, compute:
$h_t = \sigma\!\left(W^{(hh)} h_{t-1} + W^{(hx)} x_t\right)$
Output: $\hat{y}_t = \mathrm{softmax}\!\left(W^{(S)} h_t\right)$
Recurrent Neural Networks – Language Model
Total objective is to maximize the log-likelihood w.r.t. the parameters.
Log-likelihood: $J(\theta) = \sum_{t} \sum_{j=1}^{|V|} y_{t,j} \log \hat{y}_{t,j}$
where $y_t$ is the "one hot" vector containing the true word at time $t$.
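To make the recurrence and the per-step objective concrete, here is a minimal numpy sketch of one forward pass (not from the original slides); the dimensions, weight scales, and random "sentence" are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Illustrative sizes: d-dim word vectors, Dh hidden units, |V| vocabulary words.
d, Dh, V = 4, 5, 10
rng = np.random.default_rng(1)
W_hh = rng.standard_normal((Dh, Dh)) * 0.1   # hidden-to-hidden weights
W_hx = rng.standard_normal((Dh, d)) * 0.1    # input-to-hidden weights
W_S  = rng.standard_normal((V, Dh)) * 0.1    # hidden-to-output weights

# One forward pass over a random "sentence" of T word vectors.
T = 6
xs = rng.standard_normal((T, d))             # word vectors x_1..x_T
true_ids = rng.integers(0, V, size=T)        # indices of the true words
h = np.zeros(Dh)
J = 0.0
for t in range(T):
    h = sigmoid(W_hh @ h + W_hx @ xs[t])     # h_t = sigma(W_hh h_{t-1} + W_hx x_t)
    y_hat = softmax(W_S @ h)                 # y_hat_t = softmax(W_S h_t)
    J += np.log(y_hat[true_ids[t]])          # log-likelihood of the true word
print("total log-likelihood:", J)
```

Training maximizes this quantity (equivalently, minimizes the summed cross-entropy) with backpropagation through time.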
RNNs – HARD TO TRAIN!
Vanishing/Exploding gradient problem
For stochastic gradient descent we calculate the derivative of the loss w.r.t. the parameters: $\frac{\partial J}{\partial W}$
Reminder: $h_t = \sigma\!\left(W^{(hh)} h_{t-1} + W^{(hx)} x_t\right)$, where $\hat{y}_t = \mathrm{softmax}\!\left(W^{(S)} h_t\right)$
Applying the chain rule:
$\frac{\partial J^{(t)}}{\partial W} = \sum_{k=1}^{t} \frac{\partial J^{(t)}}{\partial \hat{y}_t} \frac{\partial \hat{y}_t}{\partial h_t} \frac{\partial h_t}{\partial h_k} \frac{\partial h_k}{\partial W}$
Vanishing/Exploding gradient problem
Update equation: $h_j = \sigma\!\left(W^{(hh)} h_{j-1} + W^{(hx)} x_j\right)$
By the chain rule:
$\frac{\partial h_t}{\partial h_k} = \prod_{j=k+1}^{t} \frac{\partial h_j}{\partial h_{j-1}} = \prod_{j=k+1}^{t} \left(W^{(hh)}\right)^{\top} \mathrm{diag}\!\left[\sigma'\!\left(W^{(hh)} h_{j-1} + W^{(hx)} x_j\right)\right]$
Vanishing/Exploding gradient problem
The repeated factors of $W^{(hh)}$ make the gradients very small or very large:
"small $W$" – vanishing gradient: long-time dependencies contribute almost nothing to the update
"large $W$" – exploding gradient (bad for optimization)
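The geometric behaviour of the Jacobian product $\frac{\partial h_t}{\partial h_k}$ can be seen numerically. The sketch below (not from the original slides) multiplies random factors $\left(W^{(hh)}\right)^{\top}\mathrm{diag}[\sigma']$ for a growing number of steps; the matrix sizes, weight scales, and random $\sigma'$ values are illustrative assumptions.

```python
import numpy as np

# The gradient term dh_t/dh_k is a product of (t-k) Jacobians W_hh^T diag(sigma'(.)).
# Its norm shrinks or grows geometrically with the number of steps,
# depending on the scale of W_hh.
def jacobian_product_norm(scale, steps, Dh=10, seed=0):
    rng = np.random.default_rng(seed)
    W_hh = rng.standard_normal((Dh, Dh)) * scale
    J = np.eye(Dh)
    for _ in range(steps):
        sig_prime = rng.uniform(0.0, 0.25, size=Dh)   # sigmoid' is at most 0.25
        J = J @ (W_hh.T * sig_prime)                  # equals W_hh^T @ diag(sigma')
    return np.linalg.norm(J)

for scale in (0.5, 5.0):                              # "small W" vs "large W"
    norms = [jacobian_product_norm(scale, k) for k in (1, 5, 20)]
    print(f"scale {scale}: norm after 1/5/20 steps = {norms}")
```

With the small scale the norm collapses toward zero (vanishing gradient); with the large scale it blows up (exploding gradient).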
LSTMs – Long Short-Term Memory
Introduced in 1997 by Hochreiter and Schmidhuber
Address the vanishing/exploding gradient problem using gating
Figures taken from Christopher Olah's blog
LSTMs – Equations:
$f_t = \sigma\!\left(W_f [h_{t-1}, x_t] + b_f\right)$  (forget gate)
$i_t = \sigma\!\left(W_i [h_{t-1}, x_t] + b_i\right)$  (input gate)
$\tilde{C}_t = \tanh\!\left(W_C [h_{t-1}, x_t] + b_C\right)$  (candidate values)
$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t$  (cell state update)
$o_t = \sigma\!\left(W_o [h_{t-1}, x_t] + b_o\right)$  (output gate)
$h_t = o_t \odot \tanh(C_t)$  (output)
Note the different notation from the RNN slides: $h_t$ here plays the role of $y_t$, and the cell state $C_t$ plays the role of $h_t$.
LSTMs – "Forget Gate"
$f_t = 0$: forget; $f_t = 1$: keep
Examples: a period "." ends a sentence; when a new subject appears, the old subject's gender can be forgotten
LSTMs – "Input gate layer"
What information goes into the new candidate $\tilde{C}_t$?
The "input gate layer" $i_t$ decides which values we'll update; a tanh layer creates the candidate values $\tilde{C}_t$.
LSTMs – Updating the memory cell
$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t$
The dependence on earlier cells is no longer exponential in $W$ – information can flow:
e.g. $C_t$ reaches back to $C_{t-2}$ only through the forget gates, $f_t \odot f_{t-1} \odot C_{t-2}$, with no repeated multiplication by $W$.
LSTMs – Setting the output
Finally, we need to decide what to output:
$o_t = \sigma\!\left(W_o [h_{t-1}, x_t] + b_o\right)$, $h_t = o_t \odot \tanh(C_t)$
The output is a filtered version of the cell state.
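Putting the gate equations together, here is a minimal numpy sketch of a single LSTM step (not from the original slides); the dimensions, weight scales, and random inputs are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, C_prev, W_f, W_i, W_C, W_o, b_f, b_i, b_C, b_o):
    """One LSTM time step following the gate equations above."""
    z = np.concatenate([h_prev, x_t])        # [h_{t-1}, x_t]
    f_t = sigmoid(W_f @ z + b_f)             # forget gate: 0 = forget, 1 = keep
    i_t = sigmoid(W_i @ z + b_i)             # input gate
    C_tilde = np.tanh(W_C @ z + b_C)         # candidate cell values
    C_t = f_t * C_prev + i_t * C_tilde       # additive cell update (no repeated W on C)
    o_t = sigmoid(W_o @ z + b_o)             # output gate
    h_t = o_t * np.tanh(C_t)                 # hidden state / output
    return h_t, C_t

# Illustrative sizes: 4-dim inputs, 3-dim hidden/cell state.
d, Dh = 4, 3
rng = np.random.default_rng(2)
Wf, Wi, Wc, Wo = (rng.standard_normal((Dh, Dh + d)) * 0.1 for _ in range(4))
bf, bi, bc, bo = (np.zeros(Dh) for _ in range(4))

h, C = np.zeros(Dh), np.zeros(Dh)
for x_t in rng.standard_normal((5, d)):      # run 5 time steps
    h, C = lstm_step(x_t, h, C, Wf, Wi, Wc, Wo, bf, bi, bc, bo)
print(h, C)
```

Note how the cell state is only ever scaled by gates and added to, which is what keeps the gradient path through time well behaved.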
Conclusions
RNNs are very powerful
RNNs are hard to train
Nowadays, gating (LSTMs) is the way to go!
Acknowledgments:
Andrej Karpathy – http://karpathy.github.io/2015/05/21/rnn-effectiveness/
Richard Socher – http://cs224d.stanford.edu/
Christopher Olah – http://colah.github.io/posts/2015-08-Understanding-LSTMs/