Many slides and slide ideas thanks to Marc'Aurelio Ranzato and Michael Nielsen. http://www.cs.toronto.edu/~ranzato/files/ranzato_CNN_stanford2015.pdf http://neuralnetworksanddeeplearning.com/

http://illusionoftheyear.com/2009/05/the-illusion-of-sex/ http://public.gettysburg.edu/~rrussell/Russell_2009.pdf The left face appears female, the right appears male. The eyes and lips have not had their brightness/contrast adjusted, yet the face on the left seems to have darker lips. This is an example of simultaneous contrast. [Russell, 2009]

Simultaneous contrast http://pixgood.com/color-contrast-art-examples.html

In this one, the skin was left unchanged, but the lips and eyes were lightened on the left and darkened on the right.

No illusion – both faces appear androgynous. “A sex difference in facial contrast and its exaggeration by cosmetics”: female faces have greater luminance contrast between the eyes, lips, and surrounding skin than do male faces. This contrast difference in some way represents ‘femininity’. Cosmetics can be seen to increase the contrast of these areas, and so increase femininity.

Perceptron: This is convolution!

Shared weights: the same filter weights are applied at every image location. The output should be a little smaller than the input, as we aren’t padding.

Filter = ‘local’ perceptron. Also called kernel. This is the major trick. For each layer of the neural network, we’re going to learn multiple filters.
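To make the “local perceptron with shared weights” idea concrete, here is a minimal NumPy sketch (an illustration, not any particular library’s implementation) of sliding one filter over an image with no padding, which is why the output shrinks:

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Apply one shared-weight filter at every location ('valid' mode:
    no padding, so the output is smaller than the input)."""
    H, W = image.shape
    kH, kW = kernel.shape
    out = np.zeros((H - kH + 1, W - kW + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = image[i:i + kH, j:j + kW]   # local receptive field
            out[i, j] = np.sum(patch * kernel)  # same weights everywhere
    return out

image = np.random.rand(28, 28)
kernel = np.array([[1., 0., -1.],
                   [2., 0., -2.],
                   [1., 0., -1.]])              # a hand-crafted, Sobel-like edge filter
print(conv2d_valid(image, kernel).shape)        # (26, 26): smaller, since we didn't pad
```

A convolutional layer learns a bank of such filters, and each filter produces its own feature map.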

By pooling responses at different locations, we gain robustness to the exact spatial location of image features.
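A minimal sketch of 2x2 max pooling over one feature map (illustrative only): the strongest response in each window is kept, so shifting a feature by a pixel or two inside the window does not change the pooled output.

```python
import numpy as np

def max_pool2d(feature_map, size=2, stride=2):
    """Keep the maximum response in each window; small spatial shifts of a
    feature inside a window leave the pooled output unchanged."""
    H, W = feature_map.shape
    out = np.zeros(((H - size) // stride + 1, (W - size) // stride + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            window = feature_map[i * stride:i * stride + size,
                                 j * stride:j * stride + size]
            out[i, j] = window.max()
    return out

fm = np.random.rand(26, 26)
print(max_pool2d(fm).shape)  # (13, 13)
```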

FLOPS = the cost just to execute the network (one forward pass), not to train it.

Universality: a network with a single hidden layer can learn any function, so long as it is well-behaved (continuous), to some approximation; more perceptrons give a better approximation. Visual proof (Michael Nielsen): http://neuralnetworksanddeeplearning.com/chap4.html The visual proof uses sigmoid activation functions.
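The flavour of the visual proof can be sketched in a few lines of NumPy: pairs of steep sigmoid units form “bumps”, and enough bumps approximate a 1-D function. The weights below are constructed by hand rather than learned, and the target is just an arbitrary smooth example function.

```python
import numpy as np

def sigmoid(z):
    # clip to avoid overflow warnings for very steep units
    return 1.0 / (1.0 + np.exp(-np.clip(z, -500.0, 500.0)))

def approximate(f, x, n_bumps=50):
    """Approximate f on [0, 1] with one hidden layer of sigmoid units: each
    'bump' is a steep sigmoid turning on at the left edge of an interval minus
    one turning off at the right edge, weighted by the function's value there."""
    steepness = 20.0 * n_bumps                  # steeper units as bumps get narrower
    edges = np.linspace(0.0, 1.0, n_bumps + 1)
    heights = f((edges[:-1] + edges[1:]) / 2.0)
    y = np.zeros_like(x)
    for left, right, h in zip(edges[:-1], edges[1:], heights):
        y += h * (sigmoid(steepness * (x - left)) - sigmoid(steepness * (x - right)))
    return y

x = np.linspace(0.05, 0.95, 1000)               # evaluate away from the interval endpoints
target = lambda t: 0.2 + 0.4 * t**2 + 0.3 * t * np.sin(15 * t)  # arbitrary smooth target
for n in (10, 50, 250):
    print(n, np.max(np.abs(approximate(target, x, n) - target(x))))
# more perceptrons = a better approximation: the error shrinks as n grows
```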

If a network with a single hidden layer can learn any function… …given enough parameters… …then why do we go deeper? Intuitively, composition is efficient because it allows reuse. Empirically, deep networks do a better job than shallow networks at learning hierarchies of knowledge.

The problem of fitting: too many parameters = overfitting; not enough parameters = underfitting; more data = less chance to overfit. How do we know what is required?

Regularization: attempt to guide the solution so it does not overfit, while still giving it the freedom of many parameters.

Data fitting problem [Nielsen]

Which is better? Which is better a priori? How do we know a priori how many parameters we need? Cross-validation is one way. [Figure: the same data fit with a 9th-order polynomial and with a 1st-order polynomial.] [Nielsen]
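As a sketch of using held-out data to choose model complexity (hypothetical data, not the figure’s), compare a 1st-order and a 9th-order polynomial fit:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical noisy samples from an underlying linear trend.
x = rng.uniform(0.0, 1.0, 20)
y = 2.0 * x + 1.0 + rng.normal(0.0, 0.2, size=x.shape)
x_train, y_train = x[:14], y[:14]
x_val, y_val = x[14:], y[14:]            # held-out data, as in cross-validation

for degree in (1, 9):
    coeffs = np.polyfit(x_train, y_train, deg=degree)
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    val_err = np.mean((np.polyval(coeffs, x_val) - y_val) ** 2)
    print(f"degree {degree}: train MSE {train_err:.3f}, validation MSE {val_err:.3f}")
# The 9th-order fit drives the training error down by fitting the noise, but it
# typically does worse on the held-out points; that gap is the sign of overfitting.
```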

Regularization: attempt to guide the solution so it does not overfit, while still giving it the freedom of many parameters. Idea: penalize the use of parameters to prefer small weights.

Regularization: the idea is to add a cost to having high weights: C = C_0 + (λ/2n) Σ_w w², where C_0 is the original (unregularized) cost, λ is the regularization parameter, and n is the size of the training data. Intuitively, the effect of regularization is to make it so the network prefers to learn small weights, all other things being equal. Large weights will only be allowed if they considerably improve the first part of the cost function. Put another way, regularization can be viewed as a way of compromising between finding small weights and minimizing the original cost function. [Nielsen]

Both can describe the data… …but one is simpler. Why does this help reduce overfitting? Occam’s razor: “Among competing hypotheses, the one with the fewest assumptions should be selected.” For us: large weights cause large changes in behaviour in response to small changes in the input; simpler models (with smaller changes) are more robust to noise.

Regularization, written out for the cross-entropy loss: C = −(1/n) Σ_x [y ln a + (1 − y) ln(1 − a)] + (λ/2n) Σ_w w². The first term is the normal cross-entropy loss (binary classes); the second is the regularization term; as before, λ is the regularization parameter and n is the size of the training data. [Nielsen]
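A minimal sketch of what this looks like in code (illustrative names, not from any particular framework): the penalty is added to the loss, and its gradient, (λ/n)·w, appears as a small shrinking of the weights at every update, which is why L2 regularization is also called weight decay.

```python
import numpy as np

def regularized_cost(y_true, y_pred, weights, lam, n):
    """Binary cross-entropy plus the L2 penalty (lam / 2n) * sum of squared weights."""
    eps = 1e-12  # avoid log(0)
    cross_entropy = -np.mean(y_true * np.log(y_pred + eps)
                             + (1.0 - y_true) * np.log(1.0 - y_pred + eps))
    l2_penalty = (lam / (2.0 * n)) * sum(np.sum(w ** 2) for w in weights)
    return cross_entropy + l2_penalty

def sgd_step(w, grad_w, lr, lam, n):
    """The penalty contributes (lam / n) * w to the gradient, so every step
    also shrinks the weights a little ('weight decay')."""
    return w - lr * (grad_w + (lam / n) * w)
```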

Regularization: Dropout. Our networks typically start with random weights, so every time we train we get a slightly different outcome. Why random weights? If the weights are all equal, the responses across filters will be equivalent, and the network doesn’t train. http://stackoverflow.com/questions/20027598/why-should-weights-of-neural-networks-be-initialized-to-random-numbers [Nielsen]
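A tiny NumPy demonstration of why equal weights fail (a hand-rolled 4-3-1 network with a squared-error loss, purely illustrative): with constant initialization every hidden unit computes the same thing and receives the same gradient, so the units can never become different; random initialization breaks that symmetry.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
x, target = rng.normal(size=4), 1.0          # one training example

for init in ("constant", "random"):
    W1 = np.full((3, 4), 0.5) if init == "constant" else rng.normal(scale=0.5, size=(3, 4))
    w2 = np.full(3, 0.5)
    h = sigmoid(W1 @ x)                      # hidden activations
    y = sigmoid(w2 @ h)                      # output
    dy = (y - target) * y * (1.0 - y)        # backprop, squared-error loss
    dW1 = np.outer(dy * w2 * h * (1.0 - h), x)
    print(init, "all hidden units get the same gradient:", np.allclose(dW1, dW1[0]))
# constant: True  -> the units stay identical forever (the network doesn't train)
# random:   False -> the units can specialize into different filters
```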

Regularization: our networks typically start with random weights, so every time we train we get a slightly different outcome. Why not train 5 different networks with random starts and vote on their outcome? It works fine, and it helps generalization because the error is averaged. Any problem here? It is slow: we have to train and evaluate several networks.

Regularization: Dropout [Nielsen]

Regularization: Dropout. At each mini-batch: randomly select a subset of neurons and ignore them. At test time: halve the outgoing weights to compensate for having trained on only half of the neurons. Effect: neurons become less dependent on the output of connected neurons, which forces the network to learn more robust features that are useful to more subsets of neurons. It is like averaging over many different trained networks with different random initializations, except cheaper to train. [Nielsen]
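A minimal sketch of dropout as described above (drop neurons while training, scale at test time; many libraries instead use “inverted dropout”, which rescales during training so test time is untouched):

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(activations, p_keep=0.5, train=True):
    """Training: zero out a random subset of neurons. Test: keep every neuron
    but scale by p_keep, so the next layer sees the same expected input."""
    if train:
        mask = rng.random(activations.shape) < p_keep   # 1 = keep, 0 = drop
        return activations * mask
    return activations * p_keep

h = np.ones(10)
print(dropout(h, train=True))    # roughly half the activations are zeroed
print(dropout(h, train=False))   # all kept, scaled by 0.5
```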

More regularization: adding more data is a kind of regularization; pooling is a kind of regularization; data augmentation is a kind of regularization.
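As a sketch of the data-augmentation idea (hypothetical transforms; the right ones depend on the task), each training image can be randomly flipped and cropped on the fly, giving the network many label-preserving variants of the same example:

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(image, crop=24):
    """Random horizontal flip plus a random crop. The label is unchanged, so
    this effectively enlarges the training set."""
    if rng.random() < 0.5:
        image = image[:, ::-1]               # horizontal flip
    H, W = image.shape
    top = rng.integers(0, H - crop + 1)
    left = rng.integers(0, W - crop + 1)
    return image[top:top + crop, left:left + crop]

example = np.random.rand(28, 28)
print(augment(example).shape)                # (24, 24)
```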

CNNs for Medical Imaging - Connectomics This is some of James’ work with Harvard Brain Science Center. The brain represented as a network: image obtained from MR connectomics. (Provided by Patric Hagmann, CHUV-UNIL, Lausanne, Switzerland) [Patric Hagmann]

Vision for understanding the brain: 1 mm cubed of brain, imaged at 5-30 nanometers. How much data? [Kaynig-Fittkau et al.]

Vision for understanding the brain: 1 mm cubed of brain, imaged at 5-30 nanometers. How much data? 1 petabyte = 1,000,000,000,000,000 bytes ≈ all the photos uploaded to Facebook per day. [Kaynig-Fittkau et al.]
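A rough back-of-the-envelope for where that number comes from, assuming roughly 5 nm x 5 nm in-plane resolution, 30 nm section thickness, and 1 byte per voxel (assumed values, not the slide’s exact figures):

```python
mm = 1e-3                        # 1 mm in metres
voxel_x = voxel_y = 5e-9         # 5 nm in-plane resolution (assumed)
voxel_z = 30e-9                  # 30 nm section thickness (assumed)

voxels = (mm / voxel_x) * (mm / voxel_y) * (mm / voxel_z)
petabytes = voxels * 1 / 1e15    # 1 byte per voxel
print(f"{voxels:.1e} voxels ~ {petabytes:.1f} PB")   # ~1.3e15 voxels, about a petabyte
```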

[Kaynig-Fittkau et al.]

Vision for understanding the brain Use computer vision to classify neurons in the brain. Attempt to build a connectivity graph of all neurons – ‘connectome’ – so that we can map and simulate brain function.

[Kaynig-Fittkau et al.]

https://arxiv.org/abs/1704.00848 [Haehn et al.]

Network architecture [Haehn et al.]: 170,000 parameters; 9,000 positive/negative example patches of 75 x 75; 80,000 image patches. https://arxiv.org/abs/1704.00848
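As a rough illustration of what a network at this scale looks like (a hypothetical PyTorch model for 75 x 75 patches; this is NOT the architecture from Haehn et al., it only shows that a small CNN lands in the same ballpark of roughly 10^5 parameters):

```python
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),   # 75 -> 71 -> 35
    nn.Conv2d(8, 16, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),  # 35 -> 31 -> 15
    nn.Flatten(),
    nn.Linear(16 * 15 * 15, 48), nn.ReLU(),
    nn.Linear(48, 2),                                             # positive / negative
)
print(sum(p.numel() for p in model.parameters()))  # about 1.8e5 parameters
```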

https://arxiv.org/abs/1704.00848 [Haehn et al.]

https://arxiv.org/abs/1704.00848 [Haehn et al.]