Presentation transcript:

Introduction to Recurrent Neural Networks (RNN) and Long Short-Term Memory (LSTM) Wenjie Pei In this coffee talk, I would like to present some basic knowledge about the RNN and LSTM models. I was going to present a paper about an RNN model, but I think it is better to give you an introduction to these models first, so that you can get an overview of them.

Artificial Neural Networks
Feedforward neural networks: ANNs without cyclic connections between nodes.
(Feedback) Recurrent neural networks: ANNs with cyclic connections between nodes.

Feedforward Neural Networks
Multilayer perceptron (MLP)
Universal function approximation theorem: sufficiently many nonlinear hidden units can approximate any continuous mapping function.
Each node here first calculates the weighted sum over all of its inputs and then processes it with an activation function. So we can see that this model is very powerful, right? Drawback: the output depends only on the current input, so no temporal dependencies are taken into account.
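
To make the per-node computation concrete, here is a minimal NumPy sketch (mine, not from the talk): each hidden unit takes a weighted sum of its inputs and passes it through an activation function. The layer sizes, the weight names W1 and W2, and the tanh activation are illustrative assumptions.

import numpy as np

def mlp_forward(x, W1, b1, W2, b2):
    """One-hidden-layer MLP: weighted sum per node, then a nonlinearity."""
    h = np.tanh(W1 @ x + b1)   # hidden units: weighted sum + activation
    return W2 @ h + b2         # linear output layer

rng = np.random.default_rng(0)
x = rng.standard_normal(3)                      # the current input only
W1, b1 = rng.standard_normal((5, 3)), np.zeros(5)
W2, b2 = rng.standard_normal((2, 5)), np.zeros(2)
print(mlp_forward(x, W1, b1, W2, b2))           # output depends on x alone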

Recurrent Neural Networks
Feedback from the hidden-unit activation of the last time step to the current time step.
Universal approximation theory: an RNN with sufficient hidden units can approximate any measurable sequence-to-sequence mapping or dynamic system.
Moving from MLPs to RNNs is a transition from a static process to a dynamic one that takes the time dimension into account. Advantage: memory of previous inputs, so contextual information can be incorporated.
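
A minimal sketch of this recurrence (mine, not from the talk): the hidden activation from the last time step is fed back into the current step, so each output reflects all previous inputs. The weight names (W_xh, W_hh, W_hy) and the tanh activation are illustrative assumptions.

import numpy as np

def rnn_forward(xs, W_xh, W_hh, W_hy, h0):
    """Run a simple RNN over a sequence; h carries memory between steps."""
    h, ys = h0, []
    for x in xs:                          # one step per sequence element
        h = np.tanh(W_xh @ x + W_hh @ h)  # feedback from the previous step
        ys.append(W_hy @ h)               # output uses accumulated context
    return ys

rng = np.random.default_rng(0)
xs = [rng.standard_normal(3) for _ in range(4)]
W_xh, W_hh = rng.standard_normal((5, 3)), rng.standard_normal((5, 5))
W_hy, h0 = rng.standard_normal((2, 5)), np.zeros(5)
print(rnn_forward(xs, W_xh, W_hh, W_hy, h0)[-1])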

Recurrent Neural Networks
Bidirectional RNNs: two hidden layers process the sequence in opposite directions, one forward and one backward, so the output at each time step can use both past and future context.
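
A minimal sketch (mine) of the bidirectional idea: one pass runs forward over the sequence, one runs backward, and the two hidden states are concatenated per step. For brevity both directions share one set of weights here; in a real bidirectional RNN each direction has its own.

import numpy as np

def birnn_states(xs, W_xh, W_hh):
    """Concatenate forward and backward hidden states at every step."""
    def scan(seq):
        h, hs = np.zeros(W_hh.shape[0]), []
        for x in seq:
            h = np.tanh(W_xh @ x + W_hh @ h)
            hs.append(h)
        return hs
    fwd = scan(xs)                 # context from the past
    bwd = scan(xs[::-1])[::-1]     # context from the future
    return [np.concatenate([f, b]) for f, b in zip(fwd, bwd)]

rng = np.random.default_rng(0)
xs = [rng.standard_normal(3) for _ in range(4)]
W_xh, W_hh = rng.standard_normal((5, 3)), rng.standard_normal((5, 5))
print(birnn_states(xs, W_xh, W_hh)[0].shape)   # (10,): forward + backward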

Recurrent Neural Networks
Vanishing gradient problem: this model cannot maintain a long memory. The sensitivity to an input decays exponentially over time, for two reasons: (1) the activation function's derivative shrinks the gradient at every step it is propagated back through time; (2) the hidden state is diluted by the other inputs arriving at each step.
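
A small numeric illustration (mine) of this decay: each step back through time multiplies the gradient by the activation's derivative times a recurrent weight, and when that product is below 1 the gradient shrinks exponentially. The weight and pre-activation values below are arbitrary illustrative choices.

import numpy as np

w = 0.9     # illustrative recurrent weight
a = 0.5     # illustrative pre-activation
grad = 1.0
for t in range(1, 21):
    grad *= (1.0 - np.tanh(a) ** 2) * w   # tanh'(a) * w < 1 here
    if t % 5 == 0:
        print(f"{t:2d} steps back: gradient ~ {grad:.2e}")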

Long Short-Term Memory (LSTM)
In this model, the hidden layer is replaced by memory blocks; each block can contain several memory cells. Here is the model for one cell: it has an input, an output, and three gates. What are the roles of these three gates? All of their values lie in [0, 1]:
Input gate [0, 1]: how much information from the input may go into the cell.
Forget gate [0, 1]: how much of the cell state from the last time step is kept.
Output gate [0, 1]: how much information from the cell is output.
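
A minimal sketch of one cell step, following the standard LSTM equations (my code, not from the talk); each gate is a sigmoid, so its value lies in [0, 1] as described above. Packing the four pre-activations into a single weight matrix W is an implementation choice, not part of the slide.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM cell step; W maps [x; h_prev] to four gate pre-activations."""
    z = W @ np.concatenate([x, h_prev]) + b
    i, f, o, g = np.split(z, 4)
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)  # gate values in [0, 1]
    c = f * c_prev + i * np.tanh(g)   # forget old state, admit gated input
    h = o * np.tanh(c)                # output gate controls what leaves
    return h, c

rng = np.random.default_rng(0)
n_in, n_hid = 3, 4
W = rng.standard_normal((4 * n_hid, n_in + n_hid))
b = np.zeros(4 * n_hid)
h, c = np.zeros(n_hid), np.zeros(n_hid)
for x in (rng.standard_normal(n_in) for _ in range(3)):
    h, c = lstm_step(x, h, c, W, b)
print(h)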

Long Short-Term Memory
Advantage: memory over long time periods; the cell state conveys information across time steps without decay.

Applications
Applications to sequence labeling problems:
Handwritten character recognition
Speech recognition
Protein secondary structure prediction
…
If you want to know more about the latest papers: wait for my next coffee talk.