Welcome to Deep Loria!

Deep Loria
- Mailing list: deeploria@inria.fr
- Web site: http://deeploria.gforge.inria.fr/
- Git repository: https://gforge.inria.fr/projects/deeploria

Deep Loria: involvement
I'm no DL expert! (at most a trigger)
Deep Loria will be what you make of it:
- We need volunteers!
- Propose anything
- Organize, participate, animate...
Next meeting (please think about it):
- Coffee & discussion session?
- Paper reading group: who is willing to take care of it?
- Demo for Yann LeCun's visit?
- ...?

Outline
- Motivation
- Lightning-speed overview of DNN basics:
  - Neuron vs. random variable; activations
  - Layers: dense, RNN
  - Vanishing gradient
  - More layers: LSTM, RBM/DBN, CNN, autoencoder
- Implementation with Keras/Theano

Why all this buzz about DNNs? Because of their expressive power; cf. "On the Expressive Power of Deep Learning: A Tensor Analysis" by Nadav Cohen, Or Sharir, Amnon Shashua: "[...] besides a negligible set, all functions that can be implemented by a deep network of polynomial size, require an exponential size if one wishes to implement (or approximate) them with a shallow network".

Basic neuron

Activations
- sigmoid = logistic
- relu = rectified linear
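To make this concrete, here is a minimal NumPy sketch (not from the slides) of a single neuron output using these two activations; the inputs, weights and bias are made-up values.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))      # logistic: squashes any real value into (0, 1)

def relu(x):
    return np.maximum(0.0, x)            # rectified linear: identity for x > 0, zero otherwise

# a basic neuron: weighted sum of the inputs plus a bias, passed through an activation
x = np.array([0.5, -1.0, 2.0])           # inputs (toy values)
w = np.array([0.1, 0.4, -0.3])           # weights (toy values)
b = 0.2                                  # bias
print(sigmoid(np.dot(w, x) + b), relu(np.dot(w, x) + b))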

Dense layer
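In Keras (the library used later in this talk), a dense layer is a single line; the layer sizes below are illustrative, not from the slides.

from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
# fully connected layer: each of the 64 units sees all 100 inputs
model.add(Dense(64, input_dim=100, activation='sigmoid'))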

Alternative “neuron”
- Graphical model: node = random variable
- Connection = “dependency” between variables
- Restricted Boltzmann Machine (RBM)
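For reference (not on the slide), the standard RBM assigns each configuration of visible units v and hidden units h an energy

E(v, h) = - aᵀ v - bᵀ h - vᵀ W h

where W is the weight matrix and a, b are the visible and hidden biases; training lowers the energy of configurations that look like the data.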

Training
- Dense: minimize the error
  - Stochastic Gradient Descent (SGD) = gradient descent (back-propagation)
- RBM: minimize the energy
  - Contrastive Divergence = gradient descent (Gibbs sampling)
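The SGD side boils down to repeatedly applying the update w ← w − η · ∂E/∂w. A tiny illustrative sketch (placeholder gradient values, not a real training loop):

import numpy as np

learning_rate = 0.01                  # eta, the step size
W = np.random.randn(10, 5)            # current weights of one dense layer
grad_E = np.random.randn(10, 5)       # dE/dW as back-propagation would provide it (placeholder values)
W = W - learning_rate * grad_E        # step against the gradient to decrease the error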

DNN vs. DBN
- N x Dense → DNN (Deep Neural Network)
- N x RBM → DBN (Deep Belief Network)
- Dense layers are discriminative = they model the “boundary” between classes
- RBMs are generative = they model every class
- Performance: RBMs better (?)
- Efficiency: RBMs are much more difficult to train
- Usage: 90% Dense

Recurrent neural network
- Takes the past into account to predict the next step
- Just like HMMs, CRFs...
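A hedged Keras sketch of a simple recurrent layer (the sequence length and sizes are made up): the hidden state is what carries the past from one step to the next.

from keras.models import Sequential
from keras.layers import SimpleRNN, Dense

model = Sequential()
# read sequences of 20 steps, each step a 50-dimensional vector
model.add(SimpleRNN(32, input_shape=(20, 50)))
model.add(Dense(1, activation='sigmoid'))   # e.g. predict a label from the summary of the past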

Issue 1: Vanishing gradient
- Back-propagation of the error E = chain rule: N layers → N factors of the activation's gradient
- The gradient decreases exponentially with N (see the numeric illustration below)
- Consequence: the deepest layers are never learnt
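To see why the decrease is exponential: the derivative of the sigmoid is at most 0.25, and the chain rule multiplies one such factor per layer, so even in the best case the gradient shrinks like 0.25^N. A small numeric illustration (not from the slides):

import numpy as np

def sigmoid_grad(x):
    s = 1.0 / (1.0 + np.exp(-x))
    return s * (1.0 - s)                      # at most 0.25, reached at x = 0

best_factor = sigmoid_grad(0.0)               # 0.25: the largest factor one sigmoid layer can contribute
for n_layers in (2, 5, 10, 20):
    print(n_layers, best_factor ** n_layers)  # 0.0625, ~1e-3, ~1e-6, ~1e-12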

Vanishing gradient: solutions
- More data!
- Rectified linear units (gradient = 1)
- Unsupervised pre-training: DBNs, autoencoders
- LSTMs instead of plain RNNs

Autoencoders
- Re-create the inputs = a model of the data
- With dimensionality reduction = compression
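A minimal Keras autoencoder sketch (dimensions are illustrative): the network is trained to reproduce its own input through a narrow bottleneck.

from keras.models import Sequential
from keras.layers import Dense

autoencoder = Sequential()
autoencoder.add(Dense(32, input_dim=784, activation='relu'))    # encoder: compress 784 dims down to 32
autoencoder.add(Dense(784, activation='sigmoid'))               # decoder: reconstruct the 784 dims
autoencoder.compile(loss='mse', optimizer='adam')
# autoencoder.fit(X, X, ...)  -- the target is the input itself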

LSTM
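In Keras an LSTM is a drop-in replacement for the simple RNN sketched earlier (same made-up sizes); its gates are what let gradients survive over long sequences.

from keras.models import Sequential
from keras.layers import LSTM, Dense

model = Sequential()
model.add(LSTM(64, input_shape=(20, 50)))   # gated recurrent layer instead of SimpleRNN
model.add(Dense(1, activation='sigmoid'))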

Vanishing gradient

Issue 2: Overfitting

Overfitting: solutions
- Share weights: e.g. convolutional layers
- Regularization: e.g. dropout, drop-connect... (see the dropout sketch below)
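A sketch of the dropout idea in Keras (layer sizes are illustrative): during training, each activation of the previous layer is silenced with probability 0.5.

from keras.models import Sequential
from keras.layers import Dense, Dropout

model = Sequential()
model.add(Dense(128, input_dim=20, activation='relu'))
model.add(Dropout(0.5))                      # randomly drop half of the 128 activations at each update
model.add(Dense(10, activation='softmax'))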

Time to code, isn't it?

Keras example: Reuters
- Trains an MLP to classify texts into 46 topics
- In the root directory of Keras, run: python examples/reuters_mlp.py

Keras example
[slide: the reuters_mlp.py model code, highlighting the max_words input dimension, the 512-unit hidden layer and the 46 topics]
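The model in that example is roughly the following (a sketch from memory of examples/reuters_mlp.py at the time; exact values such as max_words may differ):

from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation

max_words = 1000     # dimension of the bag-of-words input vectors (assumed value)
nb_classes = 46      # number of Reuters topics

model = Sequential()
model.add(Dense(512, input_shape=(max_words,)))   # hidden layer of 512 units
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))                  # one probability per topic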

Tricks for the model
- Score = categorical cross-entropy = a kind of smooth, continuous classification error
- Softmax = normalizes the outputs into probabilities
- Adam = adaptive gradient method
(these choices map onto one compile call, sketched below)
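Continuing the model sketched above, the three choices correspond to:

model.compile(loss='categorical_crossentropy',    # smooth, continuous classification error
              optimizer='adam',                   # adaptive-step gradient descent
              metrics=['accuracy'])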

Tricks for the data
- X_train = int[#sentences][#words] = word indexes
- Convert each list of word indexes into a matrix: #sentences x bag-of-words vectors (dim = #words), as sketched below
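A small sketch of that conversion with the Keras Tokenizer (toy data; the argument was named nb_words in the Keras version of the time, num_words in newer releases):

from keras.preprocessing.text import Tokenizer

max_words = 1000
X_train = [[1, 5, 7], [2, 3, 3, 9]]             # toy data: one list of word indexes per sentence
tokenizer = Tokenizer(nb_words=max_words)
X_bow = tokenizer.sequences_to_matrix(X_train, mode='binary')
print(X_bow.shape)                              # (2, 1000): one bag-of-words row per sentence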

Plot accuracy as a function of epochs

sudo apt-get install python-matplotlib

import matplotlib.pyplot as plt
[…]
plt.plot(history.history['acc'])
plt.show()

Plot a matrix of weights

plt.matshow(model.get_weights()[0], cmap=plt.cm.gray)
plt.show()
# or save the figure to a file instead of showing it:
plt.savefig("fig.png")

Rules of thumb
- Check for overfitting: plot training accuracy vs. test accuracy
- Check for vanishing gradients: plot the weights or gradients
- Normalize your inputs & outputs (see the sketch below)
- Try to automatically augment your training set: add noise, rotate/translate images...
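For the normalization rule, a common recipe is to standardize each input feature (this exact sketch is an assumption, not from the slides):

import numpy as np

X = np.random.randn(100, 8) * 5.0 + 3.0              # fake data with arbitrary scale and offset
X = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-8)    # zero mean, unit variance per feature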