Welcome to Deep Loria!
Deep Loria
Mailing list: deeploria@inria.fr
Web site: http://deeploria.gforge.inria.fr/
Git repository: https://gforge.inria.fr/projects/deeploria
Deeploria: involvement
I'm no DL expert!!! (at most a trigger)
Deeploria will be what you make of it:
Need volunteers!!
Propose anything
Organize, participate, animate...
Next meeting (please think about it):
Coffee & discussion session?
→ paper reading group: who's willing to take care of it?
Demo for Yann LeCun's visit?
…?
Outline
Motivation
Lightning-speed overview of DNN basics
Neuron vs. random variable; activations
Layers: dense, RNN
Vanishing gradient
More layers: LSTM, RBM/DBN, CNN, autoencoder
Implementation with Keras/Theano
Why all this buzz about DNNs?
Because of expressive power
cf. “On the Expressive Power of Deep Learning: A Tensor Analysis” by Nadav Cohen, Or Sharir, Amnon Shashua:
“[...] besides a negligible set, all functions that can be implemented by a deep network of polynomial size, require an exponential size if one wishes to implement (or approximate) them with a shallow network”
Basic neuron
Activations
sigmoid = logistic
relu = rectified linear
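A minimal numpy sketch (all values made up for illustration) of one neuron computing an activation of a weighted sum, with the two activations above:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))   # logistic: squashes into (0, 1)

def relu(z):
    return np.maximum(0.0, z)         # rectified linear: 0 below zero, identity above

x = np.array([0.5, -1.2, 3.0])        # inputs
w = np.array([0.1, 0.4, -0.2])        # weights
b = 0.05                              # bias
print(sigmoid(np.dot(w, x) + b))      # neuron output with a sigmoid activation
print(relu(np.dot(w, x) + b))         # same neuron with a relu activation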
Dense layer
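A dense layer is just the vectorized version: every output unit is one such neuron connected to all inputs. A numpy sketch with arbitrary sizes:

import numpy as np

x = np.random.randn(100)              # input vector (100 features)
W = np.random.randn(64, 100)          # one row of weights per output unit
b = np.zeros(64)                      # one bias per output unit
h = np.maximum(0.0, W.dot(x) + b)     # relu(W x + b): the 64 outputs of the dense layer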
Alternative “neuron”
Graphical model: node = random variable
Connection = “dependency” between variables
Restricted Boltzmann Machine (RBM)
Training
Dense: minimize error
Stochastic Gradient Descent (SGD) = gradient descent (back-propagation)
RBM: minimize energy
Contrastive Divergence = gradient descent (Gibbs sampling)
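To illustrate the SGD side only (Contrastive Divergence is not shown), here is a toy hand-written gradient-descent loop on a single logistic neuron with cross-entropy loss; data, sizes and learning rate are made up:

import numpy as np

rng = np.random.RandomState(0)
X = rng.randn(200, 2)                            # toy inputs
y = (X[:, 0] + X[:, 1] > 0).astype(float)        # toy binary labels
w, b, lr = np.zeros(2), 0.0, 0.1                 # weights, bias, learning rate

for epoch in range(10):
    for xi, yi in zip(X, y):                     # stochastic = one example at a time
        p = 1.0 / (1.0 + np.exp(-(np.dot(w, xi) + b)))   # forward pass (sigmoid)
        grad = p - yi                            # d(cross-entropy)/d(pre-activation)
        w -= lr * grad * xi                      # gradient step on the weights
        b -= lr * grad                           # gradient step on the bias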
DNN vs. DBN
N x Dense → DNN (Deep Neural Network)
N x RBM → DBN (Deep Belief Network)
Dense layers are discriminative = model the “boundary” between classes
RBMs are generative = model every class
Performance: RBM better (?)
Efficiency: RBMs much more difficult to train
Usage: 90% for Dense
Recurrent neural network (RNN)
Take the past into account to predict the next step
Just like HMMs, CRFs...
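A numpy sketch of the recurrence (sizes made up): the hidden state h is what carries the past from one step to the next.

import numpy as np

def rnn_step(x_t, h_prev, U, W, b):
    return np.tanh(U.dot(x_t) + W.dot(h_prev) + b)   # new state mixes input and previous state

U = np.random.randn(16, 8)    # input-to-hidden weights
W = np.random.randn(16, 16)   # hidden-to-hidden weights (the "memory")
b = np.zeros(16)
h = np.zeros(16)              # initial state
for x_t in np.random.randn(5, 8):   # a sequence of 5 input vectors
    h = rnn_step(x_t, h, U, W, b)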
Issue 1: Vanishing gradient
Back-propagation of the error E = chain rule:
N layers → a product of N activation-gradient factors
The gradient decreases exponentially with N
Consequence: the deepest layers are never learnt
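A back-of-the-envelope illustration (ignoring the weight factors, which also enter the product): the derivative of the sigmoid is at most 0.25, so a product of N such factors shrinks exponentially.

# largest possible sigmoid derivative is 0.25; multiply N of them
for N in (1, 5, 10, 20):
    print(N, 0.25 ** N)   # 0.25, ~0.001, ~1e-6, ~1e-12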
Vanishing gradient: solutions
More data!
Rectified linear (gradient = 1)
Unsupervised pre-training: DBNs, autoencoders
LSTMs instead of RNNs
Autoencoders
Re-create the inputs = model of the data
With dimensionality reduction = compression
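A minimal Keras sketch of the idea, assuming 784-dimensional inputs compressed to 32 dimensions (the sizes are arbitrary and the exact layer argument names depend on your Keras version):

from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(32, activation='relu', input_dim=784))   # encoder = compression to 32 dims
model.add(Dense(784, activation='sigmoid'))              # decoder = reconstruction of the input
model.compile(loss='mse', optimizer='adam')
# model.fit(X, X, ...)   # the targets are the inputs themselves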
LSTM
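A hedged Keras sketch of an LSTM used as a sequence classifier; the vocabulary size, layer sizes and the binary task are placeholders, not taken from the slides:

from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense

model = Sequential()
model.add(Embedding(10000, 128))            # word index -> 128-dim vector
model.add(LSTM(64))                         # gated memory cell instead of a plain RNN
model.add(Dense(1, activation='sigmoid'))   # e.g. binary classification of the sequence
model.compile(loss='binary_crossentropy', optimizer='adam')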
Vanishing gradient
Issue 2: Overfitting
Overfitting: solutions
Share weights: e.g. convolutional layers (sketch below)
Regularization: e.g. Drop-out, Drop-connect...
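A tiny numpy sketch of the weight-sharing idea behind convolutional layers: the same small filter is reused at every position instead of a full weight matrix (sizes are arbitrary).

import numpy as np

x = np.random.randn(10)                  # input signal of length 10
w = np.array([0.2, 0.5, 0.3])            # one shared filter of 3 weights
y = np.array([np.dot(w, x[i:i+3]) for i in range(len(x) - 2)])   # the same w applied everywhere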
Time to code, isn't it?
Keras example: Reuters
Trains an MLP to classify texts into 46 topics
In the root dir of Keras, run:
python examples/reuters_mlp.py
Keras example
[Architecture diagram: input of dimension max_words → dense layer of 512 units → output over 46 topics]
Tricks for the model
Score = categorical cross-entropy = a kind of smooth, continuous classification error
Softmax = normalizes the outputs as probabilities
Adam = gradient descent with adaptive learning rates
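For reference, a sketch in the spirit of reuters_mlp.py; the real script and the exact Keras argument names (e.g. nb_epoch vs. epochs) may differ from this:

from keras.models import Sequential
from keras.layers import Dense, Dropout

max_words, nb_classes = 1000, 46
model = Sequential()
model.add(Dense(512, activation='relu', input_dim=max_words))   # bag-of-words in, 512 hidden units
model.add(Dropout(0.5))                                         # regularization (see overfitting slide)
model.add(Dense(nb_classes, activation='softmax'))              # probabilities over the 46 topics
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# history = model.fit(X_train, Y_train, nb_epoch=5, batch_size=32)   # data built as on the next slide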
Tricks for the data
X_train = int[#sentences][#words] = word indexes
Convert each list of word indexes into a matrix = #sentences x bag-of-words vectors (dim = #words), as sketched below
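A sketch of that conversion with Keras utilities (argument names such as nb_words differ between Keras versions):

from keras.datasets import reuters
from keras.preprocessing.text import Tokenizer
from keras.utils import np_utils

max_words = 1000
(X_train, y_train), (X_test, y_test) = reuters.load_data(nb_words=max_words)   # lists of word indexes
tokenizer = Tokenizer(nb_words=max_words)
X_train = tokenizer.sequences_to_matrix(X_train, mode='binary')   # one bag-of-words row per text
Y_train = np_utils.to_categorical(y_train, 46)                    # topic id -> one-hot vector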
Plot accuracy as a function of epochs
sudo apt-get install python-matplotlib

import matplotlib.pyplot as plt
[…]
plt.plot(history.history['acc'])
plt.show()
Plot the matrix of weights
plt.matshow(model.get_weights()[0], cmap=plt.cm.gray)
plt.show()
or plt.savefig("fig.png")
Rules of thumb
Check overfitting: plot training acc vs. test acc
Check vanishing gradient: plot weights or gradients
Normalize your inputs & outputs
Try to automatically augment your training set: add noise (sketch below), rotate/translate images...
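For the "add noise" variant, a minimal sketch assuming the X_train/Y_train matrices from the Reuters example above (the noise level 0.01 is arbitrary):

import numpy as np

# X_train, Y_train as built in the Reuters example
X_noisy = X_train + np.random.normal(0.0, 0.01, X_train.shape)   # jitter every (normalized) input
X_aug = np.concatenate([X_train, X_noisy])                       # twice as much training data
Y_aug = np.concatenate([Y_train, Y_train])                       # the labels do not change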