Presentation transcript:

Introduction to Recurrent Neural Networks (RNN) and Long Short-Term Memory (LSTM) Wenjie Pei In this coffee talk, I would like to present some basic knowledge about the RNN and LSTM models. I was going to present a paper about an RNN model, but I think it is better to give you an introduction to these models first, so that you can get an overview of them.

Artificial Neural Networks
Feedforward neural networks: ANNs without cyclic connections between nodes.
(Feedback) Recurrent neural networks: ANNs with cyclic connections between nodes.

Feedforward Neural Networks
Multilayer perceptron (MLP)
Universal function approximation theorem: sufficiently many nonlinear hidden units can approximate any continuous mapping function.
Each node here first calculates the weighted sum over all of its inputs and then processes it with an activation function. So we can see that this model is very powerful, right? Drawback: the output depends only on the current input, so no temporal dependencies are taken into account.
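
To make the per-node computation concrete, here is a minimal NumPy sketch (mine, not from the talk): each hidden unit takes a weighted sum of its inputs and passes it through an activation function. The layer sizes, the weight names W1 and W2, and the tanh activation are illustrative assumptions.

import numpy as np

def mlp_forward(x, W1, b1, W2, b2):
    """One-hidden-layer MLP: weighted sum per node, then a nonlinearity."""
    h = np.tanh(W1 @ x + b1)   # hidden units: weighted sum + activation
    return W2 @ h + b2         # linear output layer

rng = np.random.default_rng(0)
x = rng.standard_normal(3)                      # the current input only
W1, b1 = rng.standard_normal((5, 3)), np.zeros(5)
W2, b2 = rng.standard_normal((2, 5)), np.zeros(2)
print(mlp_forward(x, W1, b1, W2, b2))           # output depends on x alone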

Recurrent Neural Networks
Feedback from the hidden-unit activation of the last time step to the current time step.
Universal approximation theory: an RNN with sufficient hidden units can approximate any measurable sequence-to-sequence mapping or dynamic system.
Moving from MLPs to RNNs is a transition from a static process to a dynamic one that takes the time dimension into account. Advantage: memory of previous inputs, so contextual information can be incorporated.
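
A minimal sketch of this recurrence (mine, not from the talk): the hidden activation from the last time step is fed back into the current step, so each output reflects all previous inputs. The weight names (W_xh, W_hh, W_hy) and the tanh activation are illustrative assumptions.

import numpy as np

def rnn_forward(xs, W_xh, W_hh, W_hy, h0):
    """Run a simple RNN over a sequence; h carries memory between steps."""
    h, ys = h0, []
    for x in xs:                          # one step per sequence element
        h = np.tanh(W_xh @ x + W_hh @ h)  # feedback from the previous step
        ys.append(W_hy @ h)               # output uses accumulated context
    return ys

rng = np.random.default_rng(0)
xs = [rng.standard_normal(3) for _ in range(4)]
W_xh, W_hh = rng.standard_normal((5, 3)), rng.standard_normal((5, 5))
W_hy, h0 = rng.standard_normal((2, 5)), np.zeros(5)
print(rnn_forward(xs, W_xh, W_hh, W_hy, h0)[-1])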

Recurrent Neural Networks
Bidirectional RNNs: two hidden layers process the sequence in opposite directions, one forward and one backward, so the output at each time step can use both past and future context.
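
A minimal sketch (mine) of the bidirectional idea: one pass runs forward over the sequence, one runs backward, and the two hidden states are concatenated per step. For brevity both directions share one set of weights here; in a real bidirectional RNN each direction has its own.

import numpy as np

def birnn_states(xs, W_xh, W_hh):
    """Concatenate forward and backward hidden states at every step."""
    def scan(seq):
        h, hs = np.zeros(W_hh.shape[0]), []
        for x in seq:
            h = np.tanh(W_xh @ x + W_hh @ h)
            hs.append(h)
        return hs
    fwd = scan(xs)                 # context from the past
    bwd = scan(xs[::-1])[::-1]     # context from the future
    return [np.concatenate([f, b]) for f, b in zip(fwd, bwd)]

rng = np.random.default_rng(0)
xs = [rng.standard_normal(3) for _ in range(4)]
W_xh, W_hh = rng.standard_normal((5, 3)), rng.standard_normal((5, 5))
print(birnn_states(xs, W_xh, W_hh)[0].shape)   # (10,): forward + backward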

Recurrent Neural Networks
Vanishing gradient problem: this model cannot maintain a long memory. The sensitivity to an input decays exponentially over time, for two reasons: (1) the activation function's derivative shrinks the gradient at every step it is propagated back through time; (2) the hidden state is diluted by the other inputs arriving at each step.
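
A small numeric illustration (mine) of this decay: each step back through time multiplies the gradient by the activation's derivative times a recurrent weight, and when that product is below 1 the gradient shrinks exponentially. The weight and pre-activation values below are arbitrary illustrative choices.

import numpy as np

w = 0.9     # illustrative recurrent weight
a = 0.5     # illustrative pre-activation
grad = 1.0
for t in range(1, 21):
    grad *= (1.0 - np.tanh(a) ** 2) * w   # tanh'(a) * w < 1 here
    if t % 5 == 0:
        print(f"{t:2d} steps back: gradient ~ {grad:.2e}")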

Long Short-Term Memory (LSTM)
In this model, the hidden layer is replaced by memory blocks; each block can contain several memory cells. Here is the model for one cell: it has an input, an output, and three gates. What are the roles of these three gates? All of their values lie in [0, 1]:
Input gate [0, 1]: how much information from the input may go into the cell.
Forget gate [0, 1]: how much of the cell state from the last time step is kept.
Output gate [0, 1]: how much information from the cell is output.
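
A minimal sketch of one cell step, following the standard LSTM equations (my code, not from the talk); each gate is a sigmoid, so its value lies in [0, 1] as described above. Packing the four pre-activations into a single weight matrix W is an implementation choice, not part of the slide.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM cell step; W maps [x; h_prev] to four gate pre-activations."""
    z = W @ np.concatenate([x, h_prev]) + b
    i, f, o, g = np.split(z, 4)
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)  # gate values in [0, 1]
    c = f * c_prev + i * np.tanh(g)   # forget old state, admit gated input
    h = o * np.tanh(c)                # output gate controls what leaves
    return h, c

rng = np.random.default_rng(0)
n_in, n_hid = 3, 4
W = rng.standard_normal((4 * n_hid, n_in + n_hid))
b = np.zeros(4 * n_hid)
h, c = np.zeros(n_hid), np.zeros(n_hid)
for x in (rng.standard_normal(n_in) for _ in range(3)):
    h, c = lstm_step(x, h, c, W, b)
print(h)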

Long Short-Term Memory
Advantage: memory over long time periods; the cell state conveys information across time steps without decay.

Applications
Applications to sequence labeling problems:
Handwritten character recognition
Speech recognition
Protein secondary structure prediction
…
If you want to know more about the latest papers: wait for my next coffee talk.