Class 7: Mini Presentations - part 2 (28.5.2017)
Dropout Aviv & Daniel
Dropout - general
What is dropout? Keep each neuron's output with some probability (applied to some or all layers).
Why use dropout? It reduces overfitting: the net learns multiple independent representations of the same data and prevents complex co-adaptations.
Dropout - code
Dropout parameter keep_prob: train with p = 0.5-0.8 vs. test with p = 1.
Other parameters: increase the learning rate by a factor of 10-100 and use a high momentum value (0.9 or 0.99).
Best used when overfitting.
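As a minimal sketch of how keep_prob is typically used (assuming TensorFlow 1.x, which the later slides also use; the layer sizes and names here are illustrative, not from the presentation):

import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 784])            # input batch
keep_prob = tf.placeholder(tf.float32)                 # 0.5-0.8 during training, 1.0 at test time

hidden = tf.layers.dense(x, 256, activation=tf.nn.relu)
hidden_drop = tf.nn.dropout(hidden, keep_prob)         # each activation is kept with probability keep_prob
logits = tf.layers.dense(hidden_drop, 10)

# train: sess.run(train_op, feed_dict={x: batch_x, keep_prob: 0.5})
# test:  sess.run(logits,   feed_dict={x: test_x,  keep_prob: 1.0})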
Intro to Keras Gal & Gal
Introduction to Keras Gal Benor, Gal Yona
Using Keras - Motivation
Keras is a high-level neural networks API, written in Python, that runs on top of either TensorFlow or Theano as its backend.
The core data structure of Keras is a model, which is simply a way to organize layers; the model's layers represent the neural network in a simple and manageable way.
Keras offers two ways to model our network:
Sequential model - a linear stack of layers
Keras functional API - arbitrary graphs of layers
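A minimal sketch of both styles (assuming Keras 2 with a TensorFlow or Theano backend; the layer sizes are illustrative):

from keras.models import Sequential, Model
from keras.layers import Dense, Input

# Sequential model: a linear stack of layers
seq_model = Sequential()
seq_model.add(Dense(64, activation='relu', input_dim=100))
seq_model.add(Dense(10, activation='softmax'))
seq_model.compile(optimizer='sgd', loss='categorical_crossentropy')

# Functional API: layers are called on tensors, allowing arbitrary graphs of layers
inputs = Input(shape=(100,))
h = Dense(64, activation='relu')(inputs)
outputs = Dense(10, activation='softmax')(h)
func_model = Model(inputs=inputs, outputs=outputs)
func_model.compile(optimizer='sgd', loss='categorical_crossentropy')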
Intro to Keras 2 Yotam Amar
Locally Connected Layer
It's a mid-point between fully connected and convolution.
Notation: Is = input size, Ic = input channels, k = kernel size, Os = output size, Oc = output channels.
Number of weights:
Fully connected: Is × Ic × Os × Oc
Convolution: Ic × k × Oc
Locally connected: Os × Ic × k × Oc (a separate kernel at each output position)
Locally Connected Layer
Keras implementation: just replace Conv1D with LocallyConnected1D (2D is also available).
keras.layers.local.LocallyConnected1D(filters, kernel_size, strides=1, padding='valid', data_format=None, activation=None, use_bias=True, kernel_initializer='glorot_uniform', bias_initializer='zeros', kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, bias_constraint=None)
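A minimal sketch comparing the weight counts in practice (assuming Keras 2; the shapes below - input length 32, 3 channels, kernel size 5, 8 filters - are illustrative):

from keras.models import Sequential
from keras.layers import Conv1D, LocallyConnected1D

conv = Sequential([Conv1D(8, 5, input_shape=(32, 3))])
local = Sequential([LocallyConnected1D(8, 5, input_shape=(32, 3))])

conv.summary()   # kernel weights: Ic*k*Oc    = 3*5*8     = 120  (plus biases)
local.summary()  # kernel weights: Os*Ic*k*Oc = 28*3*5*8  = 3360 (plus biases)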
Residual Networks Lee Peleg
Motivation
Deeper networks perform better on image recognition tasks.
However, when reaching tens of layers, a degradation suddenly appears in the training error.
But why would a deeper network hurt the training error, when it could just learn an identity mapping for the extra layers?
Scheme
The representation to be learned: instead of the desired mapping H(x), the stacked layers fit the residual F(x) = H(x) - x, and the block outputs y = F(x) + x.
The residual representation can be built out of several non-linear layers; with only one linear layer no advantage is observed.
Dimensionality matching through a linear mapping on the shortcut: y = F(x) + Ws·x.
Tensorflow Implementation
def residual(name, l, increase_dim=False, first=False):
    ...  # out_channel and stride1 are set here based on increase_dim (elided in the slide)
    with tf.variable_scope(name) as scope:
        b1 = l if first else BNReLU(l)
        c1 = Conv2D('conv1', b1, out_channel, stride=stride1, nl=BNReLU)
        c2 = Conv2D('conv2', c1, out_channel)
        l = c2 + l          # the shortcut: add the block input to F(x)
        return l
Plain vs. ResNet (comparison figure)
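For comparison with the Keras functional API introduced earlier, here is a minimal sketch of a basic residual block (not the presenters' code; Keras 2 is assumed and the shapes and filter counts are illustrative):

from keras.models import Model
from keras.layers import Input, Conv2D, BatchNormalization, Activation, add

def residual_block(x, channels):
    # F(x): two 3x3 convolutions with batch norm and ReLU
    f = Conv2D(channels, (3, 3), padding='same')(x)
    f = BatchNormalization()(f)
    f = Activation('relu')(f)
    f = Conv2D(channels, (3, 3), padding='same')(f)
    f = BatchNormalization()(f)
    # y = F(x) + x  (assumes x already has `channels` feature maps)
    y = add([f, x])
    return Activation('relu')(y)

inputs = Input(shape=(32, 32, 16))
outputs = residual_block(inputs, 16)
model = Model(inputs=inputs, outputs=outputs)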
Gradient Clipping Hagai & Yonatan & Omer
Gradient clipping Omer Shabtai
The problem
The exploding gradients problem refers to a large increase in the norm of the gradient during training. It happens when there are strong long-term correlations, and it is the opposite of the vanishing gradient problem.
Prone nets: many-layered (deep) nets, and RNNs, where backpropagation might extend into the far past.
RNN back propagation: vanishing for max eigenvalue < 1, exploding for max eigenvalue > 1.
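The eigenvalue condition comes from the product of Jacobians in backpropagation through time; a sketch of the standard argument (notation follows the Pascanu et al. paper cited under "Further reading", not the original slide):

\frac{\partial E_t}{\partial \theta}
  = \sum_{k \le t} \frac{\partial E_t}{\partial h_t}\,
    \frac{\partial h_t}{\partial h_k}\,
    \frac{\partial h_k}{\partial \theta},
\qquad
\frac{\partial h_t}{\partial h_k}
  = \prod_{i=k+1}^{t} W_{rec}^{\top}\,\operatorname{diag}\!\big(\sigma'(h_{i-1})\big)

The norm of this product shrinks or grows roughly like the largest eigenvalue of W_rec raised to the power t - k, hence vanishing below 1 and exploding above 1.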
Solution - gradient clipping: if the gradient norm ||g|| exceeds a threshold, rescale it to g ← (threshold / ||g||) · g, keeping its direction.
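A plain NumPy sketch of this rule (illustrative only; the threshold value is arbitrary):

import numpy as np

def clip_gradient(g, threshold):
    # Rescale g if its L2 norm exceeds the threshold, preserving its direction
    norm = np.linalg.norm(g)
    if norm > threshold:
        g = g * (threshold / norm)
    return g

g = np.ones(100) * 5.0                        # norm 50
print(np.linalg.norm(clip_gradient(g, 5.0)))  # -> 5.0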
Implementation
Tensorflow built-in functions
Basic (value clipping): tf.clip_by_value(t, clip_value_min, clip_value_max, name=None)
Each element of t is clipped to the range [clip_value_min, clip_value_max].
Value clipping doesn't keep the gradient direction!
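A minimal sketch of value clipping inside a training step (assuming TensorFlow 1.x; the toy loss below is illustrative, not from the slides):

import tensorflow as tf

# Toy model so the snippet is self-contained
x = tf.placeholder(tf.float32, [None, 3])
w = tf.Variable(tf.zeros([3, 1]))
loss = tf.reduce_mean(tf.square(tf.matmul(x, w) - 1.0))

optimizer = tf.train.GradientDescentOptimizer(0.01)
grads_and_vars = optimizer.compute_gradients(loss)
# Clip each gradient element-wise to [-1, 1]; note this can change the gradient's direction
clipped = [(tf.clip_by_value(g, -1.0, 1.0), v) for g, v in grads_and_vars]
train_op = optimizer.apply_gradients(clipped)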
Norm - clipping
Advanced (L2 norms): typically used before applying an optimizer. The clip threshold resembles a regularisation parameter, but allows long-term memory.
tf.clip_by_norm(t, clip_norm, axes=None, name=None) (over a single layer / a single gradient matrix)
tf.clip_by_average_norm(t, clip_norm, name=None) (over the gradient ensemble)
When applied to a tensor list: tf.clip_by_global_norm(t_list, clip_norm, use_norm=None, name=None)
This is the correct way to perform gradient clipping, but it is slower than clip_by_norm(), since it needs all parameters before it can clip.
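A minimal sketch of global-norm clipping, reusing the toy loss and optimizer from the value-clipping sketch above (TensorFlow 1.x assumed; the clip_norm value is arbitrary):

grads_and_vars = optimizer.compute_gradients(loss)
grads, variables = zip(*grads_and_vars)
# Rescale all gradients jointly so that their combined L2 norm is at most 5.0
clipped_grads, global_norm = tf.clip_by_global_norm(grads, clip_norm=5.0)
train_op = optimizer.apply_gradients(list(zip(clipped_grads, variables)))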
Further reading
On the difficulty of training Recurrent Neural Networks: https://arxiv.org/pdf/1211.5063.pdf
Gradient clipping in TensorFlow: https://www.tensorflow.org/versions/r0.12/api_docs/python/train/gradient_clipping