CSE 190 Neural Networks: How to train a network to look and see

Presentation transcript:

CSE 190 Neural Networks: How to train a network to look and see. Gary Cottrell, Week 9, Lecture 2.

Introduction: How do we deal with the high dimensionality of visual input?

Introduction: Our field of view is about 200° horizontally and 130° vertically – a HUGE image. Compare to the size of MNIST images! How do we deal with the high dimensionality of visual input? Sampling!

Introduction: We have a foveated retina – we only have high resolution over about 2° of visual angle. We move our eyes about three times a second, which pencils out to about 172,000 times a day! So we sample 2° of visual angle at the highest resolution roughly 172,000 times per day. Perhaps we could apply this idea to computer vision.
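As a quick sanity check on that figure (assuming roughly 16 waking hours per day): 3 saccades/s × 3,600 s/h × 16 h ≈ 172,800 saccades per day, i.e. about 172,000.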

Introduction: And we have (Kanan & Cottrell, 2010). We used a salience map to decide where to sample from an image, and stored fragments of the image. For a new image, we took new samples and figured out who or what it was by a kind of nearest-neighbor voting.
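A minimal sketch of that pipeline, under my own assumptions (hypothetical helper names and shapes, grayscale images, raw pixels as features – not the actual Kanan & Cottrell code): sample fragment locations in proportion to the salience map, and let a new image's samples vote for the labels of their nearest stored neighbors.

```python
import numpy as np

def sample_locations(salience, n_samples, rng):
    """Draw (row, col) locations with probability proportional to the salience map."""
    p = salience.ravel() / salience.sum()
    idx = rng.choice(salience.size, size=n_samples, p=p)
    return np.stack(np.unravel_index(idx, salience.shape), axis=1)   # (n_samples, 2)

def extract_fragment(image, yx, size=16):
    """Crop a size x size fragment centered near (y, x); assumes a 2-D grayscale image
    at least size x size."""
    y, x = yx
    h, w = image.shape
    y0 = max(0, min(y - size // 2, h - size))
    x0 = max(0, min(x - size // 2, w - size))
    return image[y0:y0 + size, x0:x0 + size].ravel()

def classify_by_voting(image, salience, stored_fragments, stored_labels, n_samples=100, seed=0):
    """Each sampled fragment votes for the label of its nearest stored fragment."""
    rng = np.random.default_rng(seed)
    votes = {}
    for yx in sample_locations(salience, n_samples, rng):
        frag = extract_fragment(image, yx)
        nearest = np.argmin(np.linalg.norm(stored_fragments - frag, axis=1))
        label = stored_labels[nearest]
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)
```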

Humans make ~170,000 saccades each day.

What’s wrong with this picture? The model sampled randomly from the image according to the probability distribution of the salience map. Clearly, we (humans and other animals) don’t do this – we can recognize a face in two fixations (Hsiao & Cottrell, 2008). Can we learn a policy for sampling from an image efficiently?

The Recurrent Attention Model: Researchers at DeepMind (purchased by Google for $400,000,000 in 2014) have developed a network that can “move its eyes” and recognize multiple objects in an image (Ba, Mnih, & Kavukcuoglu, ICLR 2015). It is trained end-to-end to sample from an image, decide the next location to look at, and output a classification. It was initially used to read street addresses.

The Recurrent Attention Model [figure: the architecture diagram]

The Recurrent Attention Model [figure: the architecture diagram – start here]

The Recurrent Attention Model: The little arrow from the little picture is actually a 3-layer convnet with no pooling that learns, from a coarse version of the image, to create the initial state of the recurrent network that decides where to look next.
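A hedged PyTorch-style sketch of such a context network (layer sizes and strides are illustrative, not the paper's hyperparameters): three convolutions with no pooling layers, flattened and projected to the initial state of the upper recurrent network.

```python
import torch
import torch.nn as nn

class ContextNetwork(nn.Module):
    """Maps a coarse, downsampled version of the image to the initial state of r(2)."""
    def __init__(self, state_size=256, coarse_size=32):
        super().__init__()
        self.convs = nn.Sequential(                       # three conv layers, no pooling layers
            nn.Conv2d(1, 16, kernel_size=5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
        )
        n_features = 32 * (coarse_size // 8) ** 2         # spatial size shrinks 8x via the strides
        self.to_state = nn.Linear(n_features, state_size)

    def forward(self, coarse_image):                      # (batch, 1, coarse_size, coarse_size)
        h = self.convs(coarse_image).flatten(1)
        return self.to_state(h)                           # initial hidden state of the controller
```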

The Recurrent Attention Model [figure: the architecture diagram, with the controller network highlighted]

The Recurrent Attention Model: This is half of the recurrent network – the part that I’ll call the controller network, because it keeps the state of where we’ve looked and is input to the emission network to produce where to look next. It is an LSTM network.

The Recurrent Attention Model: From the controller network, the little arrow marked “emission” is really just a feedforward network with one hidden layer that learns to produce an (x, y) location for where to look next, based on the current state of the r(2) network (n is the time step).
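Putting the last two slides together, here is a hedged sketch (illustrative sizes, not the paper's exact configuration): the controller as an LSTM cell, and the emission network as a feedforward net with one hidden layer that maps the controller state to an (x, y) fixation in [-1, 1].

```python
import torch
import torch.nn as nn

class Controller(nn.Module):
    """Upper recurrent net r(2): an LSTM cell that tracks where we have looked so far."""
    def __init__(self, input_size=256, state_size=256):
        super().__init__()
        self.cell = nn.LSTMCell(input_size, state_size)

    def forward(self, glimpse_feature, state=None):
        return self.cell(glimpse_feature, state)          # returns the new (h, c)

class EmissionNetwork(nn.Module):
    """One hidden layer; maps the controller state r(2)_n to the next (x, y) fixation."""
    def __init__(self, state_size=256, hidden_size=256):  # hidden size matches the layer it gates
        super().__init__()
        self.hidden = nn.Linear(state_size, hidden_size)
        self.loc = nn.Linear(hidden_size, 2)

    def forward(self, controller_h):
        h = torch.relu(self.hidden(controller_h))         # the 'where' vector used for gating
        return torch.tanh(self.loc(h)), h                 # (x, y) in [-1, 1], plus the gate
```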

The Recurrent Attention Model: So, the first computation takes a coarse version of the image and runs it through a convnet, which sets the initial state of the r(2) network, which feeds into the emission network, which produces a first fixation.

The Recurrent Attention Model: This (x, y) location decides what patch of the image is input to the “glimpse network.”
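A minimal sketch of the glimpse extraction (my own assumed sensor: a single fixed-size crop, whereas the actual model may use multiple resolutions): map the (x, y) in [-1, 1] to pixel coordinates and crop a patch around it.

```python
import torch

def take_glimpse(image, loc, patch_size=24):
    """image: (batch, C, H, W); loc: (batch, 2) with (x, y) in [-1, 1].
    Returns a (batch, C, patch_size, patch_size) crop centered on each location."""
    _, _, H, W = image.shape
    patches = []
    for img, (x, y) in zip(image, loc):
        cx = int((x.item() + 1) / 2 * (W - patch_size)) + patch_size // 2
        cy = int((y.item() + 1) / 2 * (H - patch_size)) + patch_size // 2
        y0, x0 = cy - patch_size // 2, cx - patch_size // 2
        patches.append(img[:, y0:y0 + patch_size, x0:x0 + patch_size])
    return torch.stack(patches)
```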

The Recurrent Attention Model: So, after training, it focuses on the first digit in the address. [figure: input image → glimpse → glimpse network]

The Recurrent Attention Model: This little arrow is a feedforward convnet with three convolutional layers followed by a fully connected hidden layer. It is gated by the hidden layer of the location network, one-to-one (element-wise).
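A hedged sketch of that "what × where" combination (sizes illustrative): three conv layers and a fully connected layer over the glimpse patch, multiplied element-wise by the emission network's hidden vector, so the two vectors must have the same length.

```python
import torch
import torch.nn as nn

class GlimpseNetwork(nn.Module):
    """'What' pathway: convnet over the glimpse patch, gated by the 'where' vector."""
    def __init__(self, channels=1, patch_size=24, feature_size=256):
        super().__init__()
        self.convs = nn.Sequential(                       # three conv layers
            nn.Conv2d(channels, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
        )
        self.fc = nn.Linear(32 * patch_size * patch_size, feature_size)

    def forward(self, glimpse_patch, where_hidden):       # where_hidden: (batch, feature_size)
        what = torch.relu(self.fc(self.convs(glimpse_patch).flatten(1)))
        return what * where_hidden                        # one-to-one multiplicative gating
```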

The Story So Far… This section of the network:

The Story So Far… This section of the network, in more detail, is this: [figure: the network unrolled over time, with X and Y at T = 0, 1, 2]

The Story So Far… This section of the network: Again, the hidden units in the emission network – which has exactly the same number of hidden units as the first recurrent network – are connected one-to-one with multiplicative connections. That is, the hidden layer of the lower recurrent net is gated by the location network.

The Story So Far… This section of the network: Note that this also gives a pathway for the error to propagate from the actual target network (which is fed by the lower recurrent net) all the way back to the hidden nodes of the emission network, but not to the output of the emission network – the location.

The Recurrent Attention Model [figure: the architecture diagram – start here; end here (if done)]

The Recurrent Attention Model: How do we train this? The y is compared to the target (presumably read out when the LSTM units are good and ready), and then we backprop. They actually stop the gradient calculation after the first mislabeled target – so shorter sequences come first. This is sometimes called curriculum learning.
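A hedged rendering of that classification loss (my own sketch of the idea, not the authors' code): accumulate a cross-entropy term per emitted digit, but stop as soon as a digit is mislabeled, so the effective sequences are short early in training.

```python
import torch
import torch.nn.functional as F

def classification_loss(digit_logits, targets):
    """digit_logits: list of (1, n_classes) tensors, one per emitted digit;
    targets: (n_digits,) long tensor. Stops accumulating after the first mislabeled digit."""
    loss = digit_logits[0].new_zeros(())
    for logits, target in zip(digit_logits, targets):
        loss = loss + F.cross_entropy(logits, target.view(1))
        if logits.argmax(dim=1).item() != target.item():
            break                                         # truncate: shorter sequences first
    return loss
```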

The Recurrent Attention Model: That takes care of the classification part, but what about the location part? Here we can use reinforcement learning to reward the network when it picks a location that works well. The reinforcement signal is based on the fraction of digits it gets right.
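A hedged sketch of the location loss in standard REINFORCE form (the paper's exact reward shaping and baseline may differ): treat each emitted location as a sample from a Gaussian policy and weight its log-probability by the reward, here the fraction of digits classified correctly.

```python
import torch

def location_loss(loc_means, loc_samples, reward, sigma=0.1, baseline=0.0):
    """REINFORCE: loc_means and loc_samples are lists of (batch, 2) tensors, one per glimpse;
    reward is e.g. the fraction of digits the network got right; baseline reduces variance."""
    log_probs = []
    for mean, sample in zip(loc_means, loc_samples):
        dist = torch.distributions.Normal(mean, sigma)
        log_probs.append(dist.log_prob(sample.detach()).sum(dim=1))   # no gradient into the sample
    log_prob = torch.stack(log_probs).sum(dim=0)          # sum over glimpses
    return -((reward - baseline) * log_prob).mean()       # minimize negative expected reward
```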

The Recurrent Attention Model: But how do we even get started? We let the network choose random locations at first, to encourage it to explore. Later, we exploit what it has learned and explore less.
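One simple way to get that explore-then-exploit schedule (an assumption on my part, not necessarily the paper's recipe): sample each fixation from a Gaussian centered on the emission network's output and anneal the noise over training, so early fixations are close to random and later ones follow the learned policy.

```python
import torch

def sample_location(loc_mean, step, total_steps, sigma_start=1.0, sigma_end=0.05):
    """Anneal exploration noise linearly from sigma_start to sigma_end over training."""
    frac = min(step / total_steps, 1.0)
    sigma = sigma_start + frac * (sigma_end - sigma_start)
    loc = loc_mean + sigma * torch.randn_like(loc_mean)   # nearly random early, greedy later
    return loc.clamp(-1.0, 1.0)                           # keep the fixation inside the image
```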

The Recurrent Attention Model [figure: the architecture diagram – start here]

So, what can we do with all this machinery? We can find pairs of digits in images! (Whoohoo!) (Really? We did all this to do that?) OK, yeah, well, we can do it better than anyone else! (OK, better than we did it last year…)

How the network behaves

But wait! There’s more! We can add those two digits (we couldn’t do that last year).

But wait! There’s more! We can read street numbers!

But wait! There’s more! We can read street numbers backwards!

Was all that really necessary?