CSC 578 Neural Networks and Deep Learning

CSC 578 Neural Networks and Deep Learning Fall 2018/19 10. Capsule (Overview) Noriko Tomuro

Introduction to Capsules A Capsule Network (CapsNet) is an approach proposed by Geoffrey Hinton (although his original idea dates back to the 1990s). CapsNets are intended to overcome a weakness of CNNs, in particular MaxPooling. Downsampling by MaxPooling is effective for reducing the size of feature maps and for finding the important features present in the image.

Identified features are location invariant, and that is one of the strengths of MaxPooling. However, the identified features become independent of one another: the spatial relationships between them are lost. As a result, high-level features (composed of low-level features) are not robust to changes in pose (translation and rotation).
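The information loss described above can be seen in a few lines of numpy. This is a minimal sketch (the `maxpool2` helper and the toy feature maps are illustrative, not from the lecture): two maps with the same feature at different positions inside a pooling window produce identical pooled outputs, so the position is gone.

```python
import numpy as np

def maxpool2(x):
    # 2x2 max pooling with stride 2 over a 2D feature map.
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

a = np.zeros((4, 4)); a[0, 0] = 1.0  # feature detected in the window's corner
b = np.zeros((4, 4)); b[1, 1] = 1.0  # same feature, shifted within the window

print(np.array_equal(maxpool2(a), maxpool2(b)))  # -> True: the shift is invisible
```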

Solution: Capsule Networks Hinton himself stated that the fact that max pooling works so well is a big mistake and a disaster: "The pooling operation used in convolutional neural networks is a big mistake and the fact that it works so well is a disaster." Hinton took inspiration from a field that had already solved this problem: 3D computer graphics. In 3D graphics, a pose matrix is a standard technique for representing the relationships between objects; poses are essentially matrices encoding translation plus rotation. Capsules also more closely mimic the human visual system, which creates a tree-like hierarchical structure for each focal point in order to recognize objects.

Part-Whole Hierarchies Any object is made of parts, which may themselves be viewed as objects. The parts have instantiation parameters, all the way down the parse tree. An object is defined not only by the set of parts that compose it, but also by the relationships among their instantiation parameters. [Parse-tree diagram: human -> arm, torso, leg; arm -> wrist, thumb, index finger]

https://www.slideshare.net/charlesmartin141/capsule-networks-84754653

https://www.slideshare.net/aureliengeron/introduction-to-capsule-networks-capsnets


https://twitter.com/KirkDBorne

Step 1: Input images. MNIST dataset. Step 2: Convolution. Apply 256 9x9 filters to the 28x28 input and obtain 256 20x20 feature maps (after ReLU). https://medium.freecodecamp.org/understanding-capsule-networks-ais-alluring-new-architecture-bdb228173ddc

Step 3a: Primary Caps. Apply a 9x9x256 filter with stride 2. Each filter yields a 6x6 map (since floor((20 - 9)/2) + 1 = 6). Doing this 256 times gives a stack of 256 6x6 maps. https://medium.freecodecamp.org/understanding-capsule-networks-ais-alluring-new-architecture-bdb228173ddc

We cut the stack into 32 decks with 8 cards per deck. Each deck is a "capsule layer." Each capsule layer has 36 "capsules" (one per 6x6 location), and each capsule holds an array of 8 values: a "vector." https://medium.freecodecamp.org/understanding-capsule-networks-ais-alluring-new-architecture-bdb228173ddc
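The deck-cutting above is just a reshape. A minimal numpy sketch (array names and the random placeholder values are illustrative):

```python
import numpy as np

# Hypothetical PrimaryCaps output for one image: a stack of 256 6x6 maps,
# matching the shapes in the walkthrough above.
stack = np.random.randn(6, 6, 256)

# Cut the 256 maps into 32 decks of 8: each spatial location within a deck
# is one capsule, i.e. an 8-dimensional vector.
capsules = stack.reshape(6, 6, 32, 8)

print(capsules.shape)  # (6, 6, 32, 8)
print(32 * 6 * 6)      # 1152 capsules in total
```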

These "capsules" are our new pixels. A capsule stores 8 values (not just 1) per location, which lets us record more than just whether or not a shape was found at that spot. We can store the details needed to describe the shape: type of shape, position, rotation, color, and size. These are called "instantiation parameters." With more complex images we will end up needing more details; they can include pose (position, size, orientation), deformation, velocity, albedo, hue, texture, and so on.


Learning in Capsule Networks How, then, do we coax the network into actually learning these things? When training a traditional CNN, we only care about whether or not the model predicts the right classification. With a capsule network, we also have something called a "reconstruction": we take the vector the network produced and try to recreate the original input image from that vector alone. We then grade the model on how closely the reconstruction matches the original image.

https://medium.freecodecamp.org/understanding-capsule-networks-ais-alluring-new-architecture-bdb228173ddc

Step 3b: Squashing. Apply the squashing function. This function rescales each capsule vector so that only its length changes, not its direction. The length is squashed into the range [0, 1) so it can be interpreted as a probability. https://medium.freecodecamp.org/understanding-capsule-networks-ais-alluring-new-architecture-bdb228173ddc
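The squashing function can be sketched in numpy as follows (the small `eps` term is an assumption added here for numerical stability; it is not part of the slide):

```python
import numpy as np

def squash(v, eps=1e-8):
    # Rescale v: direction is preserved, length is mapped into [0, 1).
    # Short vectors shrink toward 0; long vectors approach length 1.
    sq = np.sum(v ** 2, axis=-1, keepdims=True)
    return (sq / (1.0 + sq)) * (v / np.sqrt(sq + eps))

v = np.array([3.0, 4.0])                   # length 5
s = squash(v)
print(round(float(np.linalg.norm(s)), 4))  # 0.9615, i.e. 25/26: close to 1
```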

https://medium.freecodecamp.org/understanding-capsule-networks-ais-alluring-new-architecture-bdb228173ddc

This is what the lengths of the capsule vectors look like after squashing. At this point it is almost impossible to guess what each capsule is looking for. Keep in mind that each "pixel" is actually a vector of length 8. https://medium.freecodecamp.org/understanding-capsule-networks-ais-alluring-new-architecture-bdb228173ddc

Step 4: Routing by Agreement. This step decides what information to send to the next level. Each capsule tries to predict the next layer's activations based on its own output: https://medium.freecodecamp.org/understanding-capsule-networks-ais-alluring-new-architecture-bdb228173ddc
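The routing step can be sketched as the dynamic routing-by-agreement of the original CapsNet paper (a toy numpy version with random prediction vectors; the EM-based variant mentioned on the later slides works differently):

```python
import numpy as np

def squash(v, eps=1e-8):
    sq = np.sum(v ** 2, axis=-1, keepdims=True)
    return (sq / (1.0 + sq)) * (v / np.sqrt(sq + eps))

def route(u_hat, iters=3):
    # u_hat: prediction vectors from lower capsules, shape (n_lower, n_upper, dim).
    n_lower, n_upper, _ = u_hat.shape
    b = np.zeros((n_lower, n_upper))                          # routing logits
    for _ in range(iters):
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)  # coupling coefficients
        s = np.einsum('ij,ijd->jd', c, u_hat)                 # weighted sum per upper capsule
        v = squash(s)                                         # upper-capsule outputs
        b = b + np.einsum('ijd,jd->ij', u_hat, v)             # agreement raises the logits
    return v

rng = np.random.default_rng(0)
u_hat = rng.standard_normal((1152, 10, 16)) * 0.1  # 1152 primary capsules, 10 digits
v = route(u_hat)
print(v.shape)  # (10, 16)
```

Predictions that agree with an upper capsule's output get their routing logits increased, so over the iterations information is routed toward the capsules that "agree."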

Mapping From Capsules in One Layer to the Next Michael Mozer, http://www.cs.colorado.edu/~mozer/Teaching/syllabi/DeepLearningFall2017/

Capsule Coupling Michael Mozer, http://www.cs.colorado.edu/~mozer/Teaching/syllabi/DeepLearningFall2017/

Capsule Coupling/Agreement Michael Mozer, http://www.cs.colorado.edu/~mozer/Teaching/syllabi/DeepLearningFall2017/

Routing Algorithm: probabilities of coupling capsules between layers. The algorithm is essentially an EM clustering algorithm. Michael Mozer, http://www.cs.colorado.edu/~mozer/Teaching/syllabi/DeepLearningFall2017/

Step 5: DigitCaps. After agreement, we end up with ten 16-dimensional vectors, one for each digit. This matrix is our final prediction: the length of each vector is the confidence that the corresponding digit is present; the longer, the better. https://medium.freecodecamp.org/understanding-capsule-networks-ais-alluring-new-architecture-bdb228173ddc
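Reading off the prediction from DigitCaps is then just an argmax over vector lengths. A toy numpy sketch with made-up capsule values:

```python
import numpy as np

# Hypothetical DigitCaps output: ten 16-dimensional vectors, one per digit.
digit_caps = np.zeros((10, 16))
digit_caps[3, :4] = 0.45   # digit 3's capsule is the longest (length 0.9)
digit_caps[7, 0] = 0.20    # a weak response for digit 7

lengths = np.linalg.norm(digit_caps, axis=-1)  # confidence per digit
print(int(np.argmax(lengths)))  # -> 3
```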

Step 6: Reconstruction. The final activity vector is used to generate a reconstruction of the input image via a decoder consisting of 3 fully connected layers. The reconstruction loss minimizes the sum of squared differences between the outputs of the logistic units and the pixel intensities. https://medium.freecodecamp.org/understanding-capsule-networks-ais-alluring-new-architecture-bdb228173ddc

https://medium.freecodecamp.org/understanding-capsule-networks-ais-alluring-new-architecture-bdb228173ddc

The network is trained discriminatively, using iterative routing-by-agreement, by minimizing a margin loss on the digit capsules plus a reconstruction loss: the Euclidean distance between the image and the output of the decoder that reconstructs the input from the terminal capsules. https://medium.freecodecamp.org/understanding-capsule-networks-ais-alluring-new-architecture-bdb228173ddc
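The two loss terms can be sketched in numpy, using the margin-loss constants from the CapsNet paper (m+ = 0.9, m- = 0.1, lambda = 0.5); the function names here are illustrative:

```python
import numpy as np

def margin_loss(lengths, target, m_plus=0.9, m_minus=0.1, lam=0.5):
    # The correct digit's capsule should be long (> m_plus);
    # every other capsule should be short (< m_minus).
    t = np.zeros_like(lengths)
    t[target] = 1.0
    present = t * np.maximum(0.0, m_plus - lengths) ** 2
    absent = lam * (1.0 - t) * np.maximum(0.0, lengths - m_minus) ** 2
    return float(np.sum(present + absent))

def reconstruction_loss(recon, image):
    # Sum of squared differences between decoder output and pixel intensities.
    return float(np.sum((recon - image) ** 2))

# A confident, correct prediction incurs zero margin loss:
print(margin_loss(np.array([0.05] * 9 + [0.95]), target=9))  # -> 0.0
```

In the paper the reconstruction loss is scaled down heavily (by 0.0005) before being added to the margin loss, so that it does not dominate training.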

Capsules: the Future of ANNs? The idea is widely regarded as appealing, but it has not been tested on other large datasets. The first results seem promising, but so far only a few datasets have been tried, and the implemented systems are very slow to train. [Quora] "At this moment in time it is not possible to say whether capsule networks are the future for neural AI. Other experiments besides image classification will need to be conducted to prove that the technique is robust for all the other kinds of learning that involve aspects of perception besides the visual one. ... Lots more work has to be done on the structure of these learning architectures."