Long Short Term Memory within Recurrent Neural Networks

Presentation transcript:

Long Short Term Memory within Recurrent Neural Networks Tyler Fricks

Overview
Types of Neural Networks for Background
- Perceptron
- Feed Forward
- Denoising
RNNs and LSTM
Training Recurrent Neural Networks
- Gradient Methods
- 2nd Order Optimization

Overview: Continued
RNN Design Tools and Frameworks
- TensorFlow
- Matlab
- Others
Examples of LSTM Networks in Use
- DRAW
- Speech Recognition

Types of ANNs: Perceptron
- A very simple, single-neuron classifier
- Works well for linearly separable data
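For concreteness, a single perceptron can be written in a few lines of NumPy; this is an illustrative sketch, and the learning rate and the toy AND dataset are assumptions, not taken from the slides.

```python
# Minimal perceptron: one neuron with a step activation, trained with the
# perceptron learning rule on a linearly separable toy problem (logical AND).
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 0, 0, 1])             # AND is linearly separable

w = np.zeros(2)
b = 0.0
for _ in range(10):                     # a few passes over the data
    for x_i, y_i in zip(X, y):
        pred = 1 if x_i @ w + b > 0 else 0
        w += 0.1 * (y_i - pred) * x_i   # perceptron update rule
        b += 0.1 * (y_i - pred)

print([(1 if x_i @ w + b > 0 else 0) for x_i in X])   # [0, 0, 0, 1]
```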

Types of ANNs: Feed Forward
- Multiple layers of perceptrons, or other neurons with different activation functions
- All information flows in one direction, from input to output

Types of ANNs: Denoising
- Removes noise from data
- Assists with feature selection
- Cleaner data is easier to work with
- Has more nodes in the hidden layer than inputs
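As a rough illustration of the idea, a minimal denoising autoencoder can be sketched with tf.keras; the layer sizes, the noise level, and the use of MNIST as example data are assumptions for this sketch, not part of the original slides.

```python
# Minimal denoising autoencoder sketch (assumed layer sizes; MNIST used only as example data).
import numpy as np
import tensorflow as tf

(x_train, _), _ = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0

# Corrupt the inputs with Gaussian noise; the network learns to reconstruct the clean version.
x_noisy = np.clip(x_train + 0.3 * np.random.normal(size=x_train.shape), 0.0, 1.0)

autoencoder = tf.keras.Sequential([
    tf.keras.layers.Dense(1024, activation="relu", input_shape=(784,)),  # hidden layer wider than the input
    tf.keras.layers.Dense(784, activation="sigmoid"),                    # reconstruct the clean pixels
])
autoencoder.compile(optimizer="adam", loss="mse")
autoencoder.fit(x_noisy, x_train, epochs=5, batch_size=128)
```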

Types of ANNs: Denoising Example
[Figure: an original image, a corrupted (noisy) version, and the denoised reconstruction]

Recurrent Neural Networks
- Developed in the 1980s, but computing power at the time was too low to pursue them
- Based on feed-forward networks, with short-term memory added to the neurons
- RNN neurons have access to the current input and to the result from one time-step in the past
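A single RNN step can be sketched in a few lines of NumPy; the weight shapes, the tanh activation, and the random toy inputs below are illustrative assumptions, not taken from the slides.

```python
# One step of a vanilla RNN cell: the new hidden state mixes the current input
# with the hidden state carried over from the previous time-step.
import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    return np.tanh(x_t @ W_xh + h_prev @ W_hh + b_h)

rng = np.random.default_rng(0)
input_dim, hidden_dim = 8, 16
W_xh = rng.normal(scale=0.1, size=(input_dim, hidden_dim))
W_hh = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))
b_h = np.zeros(hidden_dim)

# Run the same cell over a short sequence, carrying the hidden state forward.
h = np.zeros(hidden_dim)
for x_t in rng.normal(size=(5, input_dim)):
    h = rnn_step(x_t, h, W_xh, W_hh, b_h)
```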

RNN Advantages Over Feed Forward
- Feed-forward networks cannot base decisions on past data, which makes sequence predictions nearly impossible
- Example: a network reads the string "Apple" character by character, and both networks must predict the letter that follows the second "p"
- The feed-forward network must treat every letter as a possible next character
- The RNN can use the previous characters to rule out unpronounceable letter combinations

Long Short Term Memory Networks
- Based on RNNs
- Adds memory control (gates) to each neuron, similar to a D flip-flop circuit
- The gates are usually sigmoid
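The gating idea can be made concrete with a minimal NumPy sketch of one LSTM step; the weight layout (one combined matrix for all four gates) and the toy dimensions are assumptions for illustration.

```python
# One step of an LSTM cell: sigmoid gates decide what to forget, what to write,
# and what to expose, loosely like the enable line on a D flip-flop.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    # W maps the concatenated [input, previous hidden state] to all four gate pre-activations.
    z = np.concatenate([x_t, h_prev]) @ W + b
    f, i, o, g = np.split(z, 4)
    f, i, o = sigmoid(f), sigmoid(i), sigmoid(o)   # forget, input, output gates
    g = np.tanh(g)                                 # candidate cell values
    c = f * c_prev + i * g                         # gated update of the cell (memory) state
    h = o * np.tanh(c)                             # gated output / new hidden state
    return h, c

rng = np.random.default_rng(1)
input_dim, hidden_dim = 8, 16
W = rng.normal(scale=0.1, size=(input_dim + hidden_dim, 4 * hidden_dim))
b = np.zeros(4 * hidden_dim)

h = c = np.zeros(hidden_dim)
for x_t in rng.normal(size=(5, input_dim)):
    h, c = lstm_step(x_t, h, c, W, b)
```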

Long Short Term Memory Networks: Advantages
- Extends prediction ability: memory is not restricted to the present and one step back in time
- Gradients are kept steep, so the vanishing gradient problem is less of an issue

Training Methods for RNNs: Gradient Methods
Backpropagation Through Time (a gradient method)
- Sends error values backwards with respect to time
- Requires "unfolding" the network at each time-step
- Allows discovery of each layer's contribution to the error
Vanishing Gradient Problem
- Over time, the contribution of the inner layers moves towards zero
- The error signal propagated backwards becomes tiny, causing terribly slow training
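A quick numerical illustration of the vanishing gradient, added here as a NumPy sketch (not from the original slides): pushing an error signal back through many tanh time-steps multiplies it by the recurrent Jacobian over and over, and its norm shrinks towards zero.

```python
# Vanishing gradient illustration: backpropagating through repeated tanh steps
# keeps multiplying the error by the recurrent Jacobian, shrinking its norm.
import numpy as np

rng = np.random.default_rng(2)
hidden_dim, steps = 16, 50
W_hh = rng.normal(scale=0.2, size=(hidden_dim, hidden_dim))

# Forward pass: run the recurrence and remember every hidden state.
h = rng.normal(size=hidden_dim)
states = []
for _ in range(steps):
    h = np.tanh(h @ W_hh)
    states.append(h)

# Backward pass: push an error signal back through time and watch its norm shrink.
grad = rng.normal(size=hidden_dim)
for t, h_t in enumerate(reversed(states)):
    grad = (grad * (1.0 - h_t ** 2)) @ W_hh.T   # one step back through tanh and W_hh
    if t % 10 == 0:
        print(f"{t:2d} steps back: gradient norm = {np.linalg.norm(grad):.2e}")
```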

Training Methods for RNNs: Gradient Methods
Exploding Gradient Problem
- Opposite of the vanishing gradient problem
- Caused by training on long sequences of data
- Updates become far too large, making training unstable
Stochastic Gradient Descent
- Method of finding a local minimum of the error/loss function
- Its randomness allows exploration/exploitation behaviour, as in problems like GridWorld
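A minimal stochastic-gradient-descent update is a one-liner per parameter; the gradient-norm clipping shown alongside it is a common guard against exploding gradients, added here as an assumption rather than something stated on the slides.

```python
# Plain SGD step with gradient-norm clipping (clipping is a common mitigation
# for exploding gradients; it is an addition here, not from the slides).
import numpy as np

def sgd_step(params, grads, lr=0.01, max_norm=5.0):
    # Rescale the overall gradient if it is too large, then update each parameter.
    total_norm = np.sqrt(sum(np.sum(g ** 2) for g in grads))
    scale = min(1.0, max_norm / (total_norm + 1e-12))
    return [p - lr * scale * g for p, g in zip(params, grads)]
```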

Training Methods for RNNs: 2nd Order Optimization
- New training method that attempts to find the global minimum
- Not affected by the vanishing/exploding gradient problems
- Takes a huge amount of thought and computing power to implement: the Hessian matrix must be computed
- "2nd order" refers to second derivatives (the Hessian), not to differential equations
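To show what the Hessian buys you, here is a toy Newton's-method step on a 2D quadratic; this is an illustrative assumption only, since practical second-order RNN training (e.g. Hessian-free methods) approximates the Hessian rather than forming it explicitly.

```python
# Toy Newton's-method step: the Hessian rescales the gradient so one update
# jumps straight to the minimum of a quadratic.
import numpy as np

A = np.array([[3.0, 0.5],
              [0.5, 1.0]])                 # positive-definite quadratic: f(x) = 0.5 * x^T A x
x = np.array([4.0, -2.0])

grad = A @ x                               # first derivative of f
hess = A                                   # second derivative (the Hessian) of f
x_new = x - np.linalg.solve(hess, grad)    # Newton step: x - H^{-1} g

print(x_new)                               # lands at the minimum [0, 0] in a single step
```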

RNN Design Tools and Frameworks: TensorFlow
- Open source, developed by Google
- "an open source library for high performance numerical computation across a variety of platforms"
- Supports Python, JavaScript, C++, Java, Go, and Swift
- Provides GPGPU support for faster training
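A minimal next-character LSTM model in tf.keras might look like the sketch below; the vocabulary size, sequence length, and layer widths are placeholder assumptions, not values from the presentation.

```python
# Minimal LSTM sequence model in tf.keras (sizes are placeholder assumptions).
import tensorflow as tf

vocab_size, seq_len = 128, 40        # e.g. ASCII characters, 40-character windows

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, 64),                   # map character ids to vectors
    tf.keras.layers.LSTM(128),                                    # recurrent layer with gated memory
    tf.keras.layers.Dense(vocab_size, activation="softmax"),      # predict the next character
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.summary()
# model.fit(x, y, epochs=10)  # x: (n, seq_len) int character ids, y: (n,) next-character ids
```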

RNN Design Tools and Frameworks: Matlab
- Deep Learning Toolbox add-on for Matlab
- Graphical design of neural networks
- Costly: Matlab license $2,150, Deep Learning Toolbox $1,000

RNN Design Tools and Frameworks: PyTorch
- Successor to Torch (for Lua)
- Primary competitor to TensorFlow
- Faster, but slightly more difficult to use, with less API support
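For comparison, roughly the same character-prediction model in PyTorch; again the layer sizes and dummy batch are assumptions for illustration.

```python
# Equivalent minimal LSTM model in PyTorch (layer sizes are placeholder assumptions).
import torch
import torch.nn as nn

class CharLSTM(nn.Module):
    def __init__(self, vocab_size=128, embed_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, x):                 # x: (batch, seq_len) of character ids
        h, _ = self.lstm(self.embed(x))   # h: (batch, seq_len, hidden_dim)
        return self.out(h[:, -1])         # logits for the next character

model = CharLSTM()
logits = model(torch.randint(0, 128, (8, 40)))   # dummy batch of 8 sequences
print(logits.shape)                              # torch.Size([8, 128])
```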

RNN Design Tools and Frameworks: Misc
- scikit-learn: a collection of machine learning algorithms
- DL4J: the Eclipse Foundation's deep learning framework, with GPGPU support
- Weka: developed by the University of Waikato, aimed at teaching and education

DeepMind DRAW (Deep Recurrent Attentive Writer)
- Image generation, built up over time the way a human would draw
- Updates the image iteratively across time-steps
- Its attention mechanism is a 2D array of Gaussian filters applied over the image

Speech Recognition
- Usually uses LSTM networks, due to their predictive ability
- Again, if the last letter predicted was "C", "X" would be ruled out as the next letter

Image Captioning
- Automatically captions images based on recognized features
- Saves humans hours of tedious work
- A convolutional neural network encodes the image; the LSTM generates the caption conditioned on those features
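A rough tf.keras sketch of the CNN-encoder / LSTM-decoder pattern follows; the choice of MobileNetV2 as the encoder, the vocabulary size, caption length, and layer widths are all assumptions made for the example, not details from the presentation.

```python
# Sketch of a CNN-encoder / LSTM-decoder captioning model in tf.keras
# (encoder choice and all sizes are placeholder assumptions).
import tensorflow as tf

vocab_size, max_caption_len = 10000, 20

# Encoder: a pretrained CNN turns the image into a feature vector.
cnn = tf.keras.applications.MobileNetV2(include_top=False, pooling="avg",
                                        input_shape=(224, 224, 3))
image_in = tf.keras.Input(shape=(224, 224, 3))
img_feat = tf.keras.layers.Dense(256, activation="relu")(cnn(image_in))

# Decoder: an LSTM over the caption-so-far, conditioned on the image
# by using the image features as its initial state.
caption_in = tf.keras.Input(shape=(max_caption_len,))
emb = tf.keras.layers.Embedding(vocab_size, 256)(caption_in)
seq = tf.keras.layers.LSTM(256)(emb, initial_state=[img_feat, img_feat])
word_logits = tf.keras.layers.Dense(vocab_size, activation="softmax")(seq)

model = tf.keras.Model([image_in, caption_in], word_logits)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```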

Anomalies in a Time Series
- Detection of "out of the ordinary" data in a time series
- EKG data, for example, could have an abnormal wave detected by a neural network
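One common way to use an LSTM for this is to predict the next sample and flag points with unusually large prediction error; the sketch below is an illustration under assumed settings (window size, layer sizes, a synthetic sine wave standing in for EKG data, and a 3-sigma threshold), not a method described on the slides.

```python
# Sketch of LSTM-based anomaly detection on a univariate series.
import numpy as np
import tensorflow as tf

def make_windows(series, window=50):
    x = np.stack([series[i:i + window] for i in range(len(series) - window)])
    y = series[window:]
    return x[..., None], y                  # shapes (n, window, 1) and (n,)

series = np.sin(np.linspace(0, 100, 5000)).astype("float32")   # stand-in for EKG data
x, y = make_windows(series)

model = tf.keras.Sequential([
    tf.keras.layers.LSTM(32, input_shape=(50, 1)),
    tf.keras.layers.Dense(1),               # predict the next sample
])
model.compile(optimizer="adam", loss="mse")
model.fit(x, y, epochs=3, batch_size=64, verbose=0)

# Flag time-steps where the prediction error is unusually large.
errors = np.abs(model.predict(x, verbose=0).ravel() - y)
anomalies = np.where(errors > errors.mean() + 3 * errors.std())[0]
```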

Conclusion

Questions