CSE 190 Neural Networks: How to train a network to look and see

Presentation transcript:

CSE 190 Neural Networks: How to train a network to look and see. Gary Cottrell, Week 9, Lecture 2.

Introduction: How do we deal with the high dimensionality of visual input?

Introduction: Our field of view is about 200° horizontally and 130° vertically – a HUGE image. Compare to the size of MNIST images! How do we deal with the high dimensionality of visual input? Sampling!

Introduction: We have a foveated retina – we only have high resolution over about 2° of visual angle. We move our eyes about three times a second, which pencils out to about 172,000 times a day! So we sample 2° of visual angle at the highest resolution roughly 172,000 times per day. Perhaps we could apply this idea to computer vision.
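As a quick sanity check on that figure (assuming roughly 16 waking hours per day): 3 saccades/s × 3,600 s/h × 16 h ≈ 172,800 saccades per day, i.e. about 172,000.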

Introduction: And we have (Kanan & Cottrell, 2010). We used a salience map to decide where to sample from an image, and stored fragments of the image. For a new image, we took new samples and figured out who or what it was by a kind of nearest-neighbor voting.
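A minimal sketch of that pipeline, under my own assumptions (hypothetical helper names and shapes, grayscale images, raw pixels as features – not the actual Kanan & Cottrell code): sample fragment locations in proportion to the salience map, and let a new image's samples vote for the labels of their nearest stored neighbors.

```python
import numpy as np

def sample_locations(salience, n_samples, rng):
    """Draw (row, col) locations with probability proportional to the salience map."""
    p = salience.ravel() / salience.sum()
    idx = rng.choice(salience.size, size=n_samples, p=p)
    return np.stack(np.unravel_index(idx, salience.shape), axis=1)   # (n_samples, 2)

def extract_fragment(image, yx, size=16):
    """Crop a size x size fragment centered near (y, x); assumes a 2-D grayscale image
    at least size x size."""
    y, x = yx
    h, w = image.shape
    y0 = max(0, min(y - size // 2, h - size))
    x0 = max(0, min(x - size // 2, w - size))
    return image[y0:y0 + size, x0:x0 + size].ravel()

def classify_by_voting(image, salience, stored_fragments, stored_labels, n_samples=100, seed=0):
    """Each sampled fragment votes for the label of its nearest stored fragment."""
    rng = np.random.default_rng(seed)
    votes = {}
    for yx in sample_locations(salience, n_samples, rng):
        frag = extract_fragment(image, yx)
        nearest = np.argmin(np.linalg.norm(stored_fragments - frag, axis=1))
        label = stored_labels[nearest]
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)
```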

Humans make ~170,000 saccades each day.

What’s wrong with this picture? The model sampled randomly from the image according to the probability distribution of the salience map. Clearly, we (humans and other animals) don’t do this – we can recognize a face in two fixations (Hsiao & Cottrell, 2008). Can we learn a policy for sampling from an image efficiently?

The Recurrent Attention Model: Researchers at DeepMind (purchased by Google for $400,000,000 in 2014) have developed a network that can “move its eyes” and recognize multiple objects in an image (Ba, Mnih, & Kavukcuoglu, ICLR 2015). It is trained end-to-end to sample from an image, decide the next location to look at, and output a classification. It was initially used to read street addresses.

The Recurrent Attention Model [figure: the architecture diagram]

The Recurrent Attention Model [figure: the architecture diagram – start here]

The Recurrent Attention Model: The little arrow from the little picture is actually a 3-layer convnet with no pooling that learns, from a coarse version of the image, to create the initial state of the recurrent network that decides where to look next.
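A hedged PyTorch-style sketch of such a context network (layer sizes and strides are illustrative, not the paper's hyperparameters): three convolutions with no pooling layers, flattened and projected to the initial state of the upper recurrent network.

```python
import torch
import torch.nn as nn

class ContextNetwork(nn.Module):
    """Maps a coarse, downsampled version of the image to the initial state of r(2)."""
    def __init__(self, state_size=256, coarse_size=32):
        super().__init__()
        self.convs = nn.Sequential(                       # three conv layers, no pooling layers
            nn.Conv2d(1, 16, kernel_size=5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
        )
        n_features = 32 * (coarse_size // 8) ** 2         # spatial size shrinks 8x via the strides
        self.to_state = nn.Linear(n_features, state_size)

    def forward(self, coarse_image):                      # (batch, 1, coarse_size, coarse_size)
        h = self.convs(coarse_image).flatten(1)
        return self.to_state(h)                           # initial hidden state of the controller
```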

The Recurrent Attention Model [figure: the architecture diagram, with the controller network highlighted]

The Recurrent Attention Model: This is half of the recurrent network – the part that I’ll call the controller network, because it keeps the state of where we’ve looked and is input to the emission network to produce where to look next. It is an LSTM network.

The Recurrent Attention Model: From the controller network, the little arrow marked “emission” is really just a feedforward network with one hidden layer that learns to produce an (x, y) location for where to look next, based on the current state of the r(2) network (n is the time step).
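Putting the last two slides together, here is a hedged sketch (illustrative sizes, not the paper's exact configuration): the controller as an LSTM cell, and the emission network as a feedforward net with one hidden layer that maps the controller state to an (x, y) fixation in [-1, 1].

```python
import torch
import torch.nn as nn

class Controller(nn.Module):
    """Upper recurrent net r(2): an LSTM cell that tracks where we have looked so far."""
    def __init__(self, input_size=256, state_size=256):
        super().__init__()
        self.cell = nn.LSTMCell(input_size, state_size)

    def forward(self, glimpse_feature, state=None):
        return self.cell(glimpse_feature, state)          # returns the new (h, c)

class EmissionNetwork(nn.Module):
    """One hidden layer; maps the controller state r(2)_n to the next (x, y) fixation."""
    def __init__(self, state_size=256, hidden_size=256):  # hidden size matches the layer it gates
        super().__init__()
        self.hidden = nn.Linear(state_size, hidden_size)
        self.loc = nn.Linear(hidden_size, 2)

    def forward(self, controller_h):
        h = torch.relu(self.hidden(controller_h))         # the 'where' vector used for gating
        return torch.tanh(self.loc(h)), h                 # (x, y) in [-1, 1], plus the gate
```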

The Recurrent Attention Model: So, the first computation takes a coarse version of the image and runs it through a convnet, which sets the initial state of the r(2) network, which feeds into the emission network, which produces a first fixation.

The Recurrent Attention Model: This (x, y) location decides what patch of the image is input to the “glimpse network.”
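A minimal sketch of the glimpse extraction (my own assumed sensor: a single fixed-size crop, whereas the actual model may use multiple resolutions): map the (x, y) in [-1, 1] to pixel coordinates and crop a patch around it.

```python
import torch

def take_glimpse(image, loc, patch_size=24):
    """image: (batch, C, H, W); loc: (batch, 2) with (x, y) in [-1, 1].
    Returns a (batch, C, patch_size, patch_size) crop centered on each location."""
    _, _, H, W = image.shape
    patches = []
    for img, (x, y) in zip(image, loc):
        cx = int((x.item() + 1) / 2 * (W - patch_size)) + patch_size // 2
        cy = int((y.item() + 1) / 2 * (H - patch_size)) + patch_size // 2
        y0, x0 = cy - patch_size // 2, cx - patch_size // 2
        patches.append(img[:, y0:y0 + patch_size, x0:x0 + patch_size])
    return torch.stack(patches)
```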

The Recurrent Attention Model: So, after training, it focuses on the first digit in the address. [figure: input image → glimpse → glimpse network]

The Recurrent Attention Model: This little arrow is a feedforward convnet with three convolutional layers followed by a fully connected hidden layer. It is gated by the hidden layer of the location network, one-to-one (element-wise).
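A hedged sketch of that "what × where" combination (sizes illustrative): three conv layers and a fully connected layer over the glimpse patch, multiplied element-wise by the emission network's hidden vector, so the two vectors must have the same length.

```python
import torch
import torch.nn as nn

class GlimpseNetwork(nn.Module):
    """'What' pathway: convnet over the glimpse patch, gated by the 'where' vector."""
    def __init__(self, channels=1, patch_size=24, feature_size=256):
        super().__init__()
        self.convs = nn.Sequential(                       # three conv layers
            nn.Conv2d(channels, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
        )
        self.fc = nn.Linear(32 * patch_size * patch_size, feature_size)

    def forward(self, glimpse_patch, where_hidden):       # where_hidden: (batch, feature_size)
        what = torch.relu(self.fc(self.convs(glimpse_patch).flatten(1)))
        return what * where_hidden                        # one-to-one multiplicative gating
```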

The Story So Far… This section of the network:

The Story So Far… This section of the network, in more detail, is this: [figure: the network unrolled over time, with X and Y at T = 0, 1, 2]

The Story So Far… This section of the network: Again, the hidden units in the emission network – which has exactly the same number of hidden units as the first recurrent network – are connected one-to-one with multiplicative connections. That is, the hidden layer of the lower recurrent net is gated by the location network.

The Story So Far… This section of the network: Note that this also gives a pathway for the error to propagate from the actual target network (which is fed by the lower recurrent net) all the way back to the hidden nodes of the emission network, but not to the output of the emission network – the location.

The Recurrent Attention Model [figure: the architecture diagram – start here; end here (if done)]

The Recurrent Attention Model: How do we train this? The y is compared to the target (presumably read out when the LSTM units are good and ready), and then we backprop. They actually stop the gradient calculation after the first mislabeled target – so shorter sequences come first. This is sometimes called curriculum learning.
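A hedged rendering of that classification loss (my own sketch of the idea, not the authors' code): accumulate a cross-entropy term per emitted digit, but stop as soon as a digit is mislabeled, so the effective sequences are short early in training.

```python
import torch
import torch.nn.functional as F

def classification_loss(digit_logits, targets):
    """digit_logits: list of (1, n_classes) tensors, one per emitted digit;
    targets: (n_digits,) long tensor. Stops accumulating after the first mislabeled digit."""
    loss = digit_logits[0].new_zeros(())
    for logits, target in zip(digit_logits, targets):
        loss = loss + F.cross_entropy(logits, target.view(1))
        if logits.argmax(dim=1).item() != target.item():
            break                                         # truncate: shorter sequences first
    return loss
```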

The Recurrent Attention Model: That takes care of the classification part, but what about the location part? Here we can use reinforcement learning to reward the network when it picks a location that works well. The reinforcement signal is based on the fraction of digits it gets right.
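A hedged sketch of the location loss in standard REINFORCE form (the paper's exact reward shaping and baseline may differ): treat each emitted location as a sample from a Gaussian policy and weight its log-probability by the reward, here the fraction of digits classified correctly.

```python
import torch

def location_loss(loc_means, loc_samples, reward, sigma=0.1, baseline=0.0):
    """REINFORCE: loc_means and loc_samples are lists of (batch, 2) tensors, one per glimpse;
    reward is e.g. the fraction of digits the network got right; baseline reduces variance."""
    log_probs = []
    for mean, sample in zip(loc_means, loc_samples):
        dist = torch.distributions.Normal(mean, sigma)
        log_probs.append(dist.log_prob(sample.detach()).sum(dim=1))   # no gradient into the sample
    log_prob = torch.stack(log_probs).sum(dim=0)          # sum over glimpses
    return -((reward - baseline) * log_prob).mean()       # minimize negative expected reward
```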

The Recurrent Attention Model: But how do we even get started? We let the network choose random locations at first, to encourage it to explore. Later, we exploit what it has learned and explore less.
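One simple way to get that explore-then-exploit schedule (an assumption on my part, not necessarily the paper's recipe): sample each fixation from a Gaussian centered on the emission network's output and anneal the noise over training, so early fixations are close to random and later ones follow the learned policy.

```python
import torch

def sample_location(loc_mean, step, total_steps, sigma_start=1.0, sigma_end=0.05):
    """Anneal exploration noise linearly from sigma_start to sigma_end over training."""
    frac = min(step / total_steps, 1.0)
    sigma = sigma_start + frac * (sigma_end - sigma_start)
    loc = loc_mean + sigma * torch.randn_like(loc_mean)   # nearly random early, greedy later
    return loc.clamp(-1.0, 1.0)                           # keep the fixation inside the image
```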

The Recurrent Attention Model [figure: the architecture diagram – start here]

So, what can we do with all this machinery? We can find pairs of digits in images! (Whoohoo!) (Really? We did all this to do that?) OK, yeah, well, we can do it better than anyone else! (OK, better than we did it last year…)

How the network behaves

But wait! There’s more! We can add those two digits (we couldn’t do that last year).

But wait! There’s more! We can read street numbers!

But wait! There’s more! We can read street numbers backwards!

Was all that really necessary?