Mini Presentations - part 2


Mini Presentations - part 2, Class 7, 28.5.2017

Dropout Aviv & Daniel

Dropout - general
What is Dropout? Keep each neuron's output with some probability (applied to some or all layers).
Why use dropout? It reduces overfitting: the net learns multiple independent representations of the same data, which prevents complex co-adaptations.

Dropout - code
Dropout parameters: keep_prob - use p=0.5-0.8 at train time vs. p=1 at test time.
Other parameters: increase the learning rate by a factor of 10-100; use a high momentum value (0.9 or 0.99).
Dropout is best used when the network is overfitting (see the sketch below).
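A minimal sketch of the keep_prob setup in TensorFlow 1.x style (the layer sizes and placeholder names are illustrative assumptions, not from the slides):

    import tensorflow as tf

    x = tf.placeholder(tf.float32, [None, 784])
    keep_prob = tf.placeholder(tf.float32)            # feed 0.5-0.8 during training, 1.0 at test
    h = tf.layers.dense(x, 256, activation=tf.nn.relu)
    h_drop = tf.nn.dropout(h, keep_prob)              # each unit is kept with probability keep_prob
    logits = tf.layers.dense(h_drop, 10)

Feeding keep_prob as a placeholder lets the same graph run with dropout on for training and off for evaluation.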

Intro to Keras Gal & Gal

Introduction to Keras Gal Benor, Gal Yona

Using Keras - Motivation
Keras is a high-level neural networks API, written in Python, that runs on top of either TensorFlow or Theano as its backend.
The core data structure of Keras is a model, which is just a way to organize layers; the model's layers represent the neural network in a simple and manageable way.
Keras offers two ways to model our network (a sketch of both follows below):
Sequential model - a linear stack of layers.
Keras functional API - arbitrary graphs of layers.
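A minimal sketch of both styles (the layer sizes are illustrative assumptions, not from the slides):

    from keras.models import Sequential, Model
    from keras.layers import Input, Dense

    # Sequential model: a linear stack of layers
    seq = Sequential()
    seq.add(Dense(64, activation='relu', input_dim=100))
    seq.add(Dense(10, activation='softmax'))

    # Functional API: arbitrary graphs of layers
    inp = Input(shape=(100,))
    h = Dense(64, activation='relu')(inp)
    out = Dense(10, activation='softmax')(h)
    model = Model(inputs=inp, outputs=out)
    model.compile(optimizer='sgd', loss='categorical_crossentropy')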

Intro to Keras 2 Yotam Amar

Locally Connected Layer
A mid-point between fully connected and convolution. Notation: Is = input size, Ic = input channels, k = kernel size, Os = output size, Oc = output channels.
Number of weights: fully connected - Is×Ic×Os×Oc; convolution - Ic×k×Oc.

Locally Connected Layer
Same notation as above. Number of weights for locally connected: Is×Ic×k×Oc - a convolution-sized kernel, but with a separate (unshared) kernel at every output position.
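An illustrative weight count (the numbers are assumptions for the sake of the example, not from the slides), with Is = Os = 100, Ic = 16, k = 3, Oc = 32:
Fully connected: 100 × 16 × 100 × 32 = 5,120,000 weights.
Convolution: 16 × 3 × 32 = 1,536 weights (one kernel shared across all positions).
Locally connected: 100 × 16 × 3 × 32 = 153,600 weights (one kernel per output position).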

Locally Connected Layer
Keras implementation: just replace Conv1D with LocallyConnected1D (a 2D version is also available):
keras.layers.local.LocallyConnected1D(filters, kernel_size, strides=1, padding='valid', data_format=None, activation=None, use_bias=True, kernel_initializer='glorot_uniform', bias_initializer='zeros', kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, bias_constraint=None)
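A minimal usage sketch (the input shape and layer sizes are illustrative assumptions):

    from keras.models import Sequential
    from keras.layers import LocallyConnected1D, Flatten, Dense

    model = Sequential()
    # same call signature as Conv1D, but with a separate (unshared) kernel per output position
    model.add(LocallyConnected1D(filters=32, kernel_size=3, input_shape=(100, 16)))
    model.add(Flatten())
    model.add(Dense(10, activation='softmax'))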

Residual Networks Lee Peleg


Motivation
Deeper networks perform better on image recognition tasks, but beyond a few tens of layers a degradation suddenly appears in the training error. Why would a deeper network hurt the training error, when the extra layers could simply learn an identity mapping?

Scheme
The representation to be learned: instead of fitting the desired mapping H(x) directly, the block fits the residual F(x) = H(x) - x, so the block output is y = F(x) + x.
The residual F(x) can be built out of several non-linear layers; with only a single linear layer no advantage is observed.
When dimensions change, the shortcut applies dimensionality reduction through a linear mapping: y = F(x, {W_i}) + W_s·x.

Tensorflow Implementation (Conv2D and BNReLU are helper wrappers, not raw TF ops):

    def residual(name, l, increase_dim=False, first=False):
        # ... (out_channel and stride1 are set here; elided on the slide)
        with tf.variable_scope(name) as scope:
            b1 = l if first else BNReLU(l)
            c1 = Conv2D('conv1', b1, out_channel, stride=stride1, nl=BNReLU)
            c2 = Conv2D('conv2', c1, out_channel)
            l = c2 + l
            return l

(Figure on the slide: Plain vs. ResNet architectures.)
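For comparison, a hedged sketch of the same idea with plain TF 1.x layers (not the presenter's code; it assumes x already has out_channel channels, so the identity shortcut can be added directly):

    import tensorflow as tf

    def residual_block(x, out_channel):
        shortcut = x                                        # identity shortcut
        y = tf.layers.conv2d(x, out_channel, 3, padding='same', activation=tf.nn.relu)
        y = tf.layers.conv2d(y, out_channel, 3, padding='same')
        return tf.nn.relu(y + shortcut)                     # F(x) + x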

Gradient Clipping Hagai & Yonatan & Omer

Gradient clipping Omer Shabtai

The problem
The exploding gradients problem refers to a large increase in the norm of the gradient during training. It happens when there are strong long-term correlations, and it is the opposite of the vanishing gradient problem.
Prone nets: many-layered (deep) nets, and RNNs where back-propagation might extend far into the past.

RNN back-propagation: the gradient vanishes when the largest eigenvalue of the recurrent weight matrix is < 1 and explodes when it is > 1 (see the sketch below).
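A hedged sketch of why the eigenvalue matters (notation follows Pascanu et al., linked under Further reading below): back-propagating the error at step t to step k multiplies a chain of Jacobians,

    \frac{\partial \mathcal{E}_t}{\partial \theta}
      = \sum_{k \le t} \frac{\partial \mathcal{E}_t}{\partial h_t}
        \left( \prod_{k < i \le t} \frac{\partial h_i}{\partial h_{i-1}} \right)
        \frac{\partial h_k}{\partial \theta},
    \qquad
    \frac{\partial h_i}{\partial h_{i-1}}
      = W_{\mathrm{rec}}^{\top}\,\operatorname{diag}\!\big(\sigma'(h_{i-1})\big),

so the norm of the product behaves roughly like \lambda_{\max}^{\,t-k}, where \lambda_{\max} is the largest eigenvalue of W_{\mathrm{rec}}: it vanishes for \lambda_{\max} < 1 and explodes for \lambda_{\max} > 1.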

Solution – Gradient clipping

Implementation

Tensorflow built-in functions - Basic (value clipping):
tf.clip_by_value(t, clip_value_min, clip_value_max, name=None)
Each element of t is clipped into [clip_value_min, clip_value_max]. Note: value clipping doesn't keep the gradient direction!
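A hedged sketch of value-clipping the gradients before the update (TF 1.x style; `loss` is assumed to be an existing scalar loss tensor, and the clip range is arbitrary):

    import tensorflow as tf

    opt = tf.train.GradientDescentOptimizer(learning_rate=0.1)
    grads_and_vars = opt.compute_gradients(loss)            # loss: assumed scalar tensor
    clipped = [(tf.clip_by_value(g, -1.0, 1.0), v)
               for g, v in grads_and_vars if g is not None]
    train_op = opt.apply_gradients(clipped)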

Norm clipping

Advanced (L2 norms): typically used just before applying an optimizer. The clip_norm threshold resembles a regularisation parameter, but still allows long-term memory.
tf.clip_by_norm(t, clip_norm, axes=None, name=None) - over a single layer / a single gradient matrix.
tf.clip_by_average_norm(t, clip_norm, name=None) - over a gradient ensemble.

When applied to a list of tensors:
tf.clip_by_global_norm(t_list, clip_norm, use_norm=None, name=None)
This is the correct way to perform gradient clipping, but it is slower than clip_by_norm() because it needs all the parameters' gradients before it can clip (see the sketch below).
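A hedged sketch of the usual pattern (TF 1.x style; `loss` is assumed to be an existing scalar loss tensor and 5.0 is an arbitrary threshold):

    import tensorflow as tf

    opt = tf.train.AdamOptimizer(1e-3)
    grads_and_vars = opt.compute_gradients(loss)            # loss: assumed scalar tensor
    grads, variables = zip(*grads_and_vars)
    clipped_grads, global_norm = tf.clip_by_global_norm(grads, clip_norm=5.0)
    train_op = opt.apply_gradients(zip(clipped_grads, variables))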

Further reading
On the difficulty of training Recurrent Neural Networks: https://arxiv.org/pdf/1211.5063.pdf
Gradient clipping in TensorFlow: https://www.tensorflow.org/versions/r0.12/api_docs/python/train/gradient_clipping