Introduction to MATRIX CAPSULES WITH EM ROUTING


Introduction to MATRIX CAPSULES WITH EM ROUTING, 1st May 2018

Who Am I?
Education:
- 2002, B.S., Department of Geography, Chinese Culture University
- 2006, M.S., Department of Geography, National Taiwan University
Experience:
- Group Leader, Remote Sensing and Value-Added Data Group, Center for Spatial Information Research, National Taiwan University
- Research Assistant and Group Leader, Value-Added Data Group, Center for Spatial Information Research, National Taiwan University
- Group Leader, Remote Sensing and Value-Added Data Group, Space and Environment Technology Foundation
- Research Assistant, Remote Sensing Group and Value-Added Data Group, Space and Environment Technology Foundation
- Lecturer, Department of Leisure Management, Taiwan Hospitality and Tourism College
- Part-time Geography Teacher, Tatung Senior High School
Journal Articles:
- Tzu-How Chu, Meng-Lung Lin and Chia-Hao Chang (2012). mGUIDING (Mobile Guiding): Using a Mobile GIS App for Guiding. Scandinavian Journal of Hospitality and Tourism, 12(3): 269-283.
- Tzu-How Chu, Meng-Lung Lin, Chia-Hao Chang and Cheng-Wu Chen (2011). Developing a Tour Guiding Information System for Tourism Service Using Mobile GIS and GPS Techniques. Advances in Information Sciences and Service Sciences, 3(6): 49-58.
- Chia-Hao Chang, Tzu-How Chu and Ying-Yu Liu (2005). Monitoring Rooftop Additions in Taipei City Using High-Resolution Remote Sensing Imagery. Journal of Taiwan Geographic Information Science, No. 3, pp. 15-26. (in Chinese)

Geoffrey E. Hinton, "The Godfather of AI": British cognitive psychologist and computer scientist; Emeritus Professor at the Dept. of Computer Science, University of Toronto. (AI: Artificial Intelligence)

Convolutional Neural Networks (CNNs) (figure source: https://chtseng.wordpress.com/2017/09/12/%E5%88%9D%E6%8E%A2%E5%8D%B7%E7%A9%8D%E7%A5%9E%E7%B6%93%E7%B6%B2%E8%B7%AF/)

ReLU (figure source: https://chtseng.wordpress.com/2017/09/12/%E5%88%9D%E6%8E%A2%E5%8D%B7%E7%A9%8D%E7%A5%9E%E7%B6%93%E7%B6%B2%E8%B7%AF/)

Figure source: https://chtseng.wordpress.com/2017/09/12/%E5%88%9D%E6%8E%A2%E5%8D%B7%E7%A9%8D%E7%A5%9E%E7%B6%93%E7%B6%B2%E8%B7%AF/

Calculated from training data (figure sources: https://chtseng.wordpress.com/2017/09/12/%E5%88%9D%E6%8E%A2%E5%8D%B7%E7%A9%8D%E7%A5%9E%E7%B6%93%E7%B6%B2%E8%B7%AF/ and https://brohrer.mcknote.com/zh-Hant/how_machine_learning_works/how_convolutional_neural_networks_work.html)

The human face as we imagine it vs. the human face as CNNs "think" of it (figure source: https://medium.com/ai%C2%B3-theory-practice-business/understanding-hintons-capsule-networks-part-i-intuition-b4b559d1159b)

What Is a Capsule? Instead of aiming for viewpoint invariance in the activities of “neurons” that use a single scalar output to summarize the activities of a local pool of replicated feature detectors, artificial neural networks should use local “capsules” that perform some quite complicated internal computations on their inputs and then encapsulate the results of these computations into a small vector of highly informative outputs. Each capsule learns to recognize an implicitly defined visual entity over a limited domain of viewing conditions and deformations and it outputs both the probability that the entity is present within its limited domain and a set of “instantiation parameters” that may include the precise pose, lighting and deformation of the visual entity relative to an implicitly defined canonical version of that entity. --Geoffrey E. Hinton et al., 2011, Transforming Auto-Encoders, ICANN 2011

Differences between CNNs and Capsules An important difference between capsules and standard neural nets is that the activation of a capsule is based on a comparison between multiple incoming pose predictions whereas in a standard neural net it is based on a comparison between a single incoming activity vector and a learned weight vector.

How Do Capsules Work? Each capsule has a 4x4 pose matrix, M, and an activation probability, a. In between each capsule i in layer L and each capsule j in layer L+1 is a 4x4 trainable transformation matrix, Wij. The pose matrix of capsule i is transformed by Wij to cast a vote Vij = MiWij for the pose matrix of capsule j.
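A minimal NumPy sketch of the vote computation Vij = MiWij (the capsule counts and variable names are illustrative only, not taken from the paper or slides):

```python
import numpy as np

n_in, n_out = 8, 4                        # capsules in layer L and layer L+1 (illustrative sizes)
M = np.random.randn(n_in, 4, 4)           # pose matrix M_i of each capsule i in layer L
W = np.random.randn(n_in, n_out, 4, 4)    # trainable transformation matrices W_ij

# Vote of capsule i for the pose of capsule j: V_ij = M_i @ W_ij
V = np.einsum('ipq,ijqr->ijpr', M, W)     # shape (n_in, n_out, 4, 4)
```

These votes, together with the activations of layer L, are the inputs to the EM routing procedure described next.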

Expectation-Maximization Algorithm
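The slide only names the algorithm, so here is a simplified NumPy sketch of EM routing between two capsule layers, following the procedure described in the paper. For illustration, the learned per-capsule costs beta_u and beta_a and the inverse temperature lambda are fixed scalars (the paper learns the betas and anneals lambda), and the numerical details (epsilon, vectorized shapes) are my own assumptions:

```python
import numpy as np

def em_routing(a_in, V, n_iters=3, beta_u=0.0, beta_a=0.0, lam=1.0, eps=1e-9):
    """Simplified EM routing between capsule layers L and L+1.

    a_in: (n_i,)          activations of the lower-level capsules
    V:    (n_i, n_j, 16)  votes V_ij, i.e. flattened 4x4 pose matrices
    Returns the activations (n_j,) and mean poses (n_j, 16) of layer L+1.
    """
    n_i, n_j, d = V.shape
    R = np.full((n_i, n_j), 1.0 / n_j)                 # routing assignments r_ij

    for _ in range(n_iters):
        # M-step: fit one Gaussian per higher-level capsule j from its assigned votes.
        Ra = R * a_in[:, None]                         # weight assignments by input activation
        Ra_sum = Ra.sum(axis=0) + eps                  # (n_j,)
        mu = (Ra[:, :, None] * V).sum(axis=0) / Ra_sum[:, None]
        sigma2 = (Ra[:, :, None] * (V - mu[None]) ** 2).sum(axis=0) / Ra_sum[:, None]
        cost = (beta_u + 0.5 * np.log(sigma2 + eps)) * Ra_sum[:, None]
        a_out = 1.0 / (1.0 + np.exp(-lam * (beta_a - cost.sum(axis=1))))   # logistic

        # E-step: re-assign each lower-level capsule across the higher-level Gaussians.
        log_p = -0.5 * (((V - mu[None]) ** 2) / (sigma2[None] + eps)
                        + np.log(2.0 * np.pi * (sigma2[None] + eps))).sum(axis=2)
        R = a_out[None, :] * np.exp(log_p)
        R = R / (R.sum(axis=1, keepdims=True) + eps)

    return a_out, mu
```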

SPREAD LOSS Use “spread loss” to directly maximize the gap between the activation of the target class (at) and the activations of the other classes, which makes training less sensitive to the initialization and hyper-parameters of the model: Li = max(0, m - (at - ai))^2 and L = Σ(i≠t) Li, where L is the total spread loss, Li is the spread loss for wrong class i, at is the activation of the target class, ai is the activation of wrong class i, and m is the margin.
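A minimal NumPy sketch of the spread loss for a single example (function and variable names are my own; the note about the margin schedule follows the paper):

```python
import numpy as np

def spread_loss(a, target, m=0.9):
    """Spread loss: sum over wrong classes i of max(0, m - (a_t - a_i))^2.

    a: (n_classes,) class-capsule activations; target: index of the true class.
    In the paper the margin m is increased linearly from 0.2 to 0.9 during training.
    """
    gaps = np.maximum(0.0, m - (a[target] - a))
    gaps[target] = 0.0                     # the target class does not contribute
    return np.sum(gaps ** 2)
```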

Capsules Architecture (ReLU: Rectified Linear Unit)

The Small NORB Dataset Contains images of 50 toys belonging to 5 generic categories: four-legged animals, human figures, airplanes, trucks, and cars. Imaged by two cameras under 6 lighting conditions, 9 elevations (30 to 70 degrees every 5 degrees), and 18 azimuths (0 to 340 every 20 degrees). NORB: NYU Object Recognition Benchmark

Experiments The training and test sets each contain 24,300 stereo pairs of 96x96-pixel images: 5 categories x 5 physical instances x 18 azimuths x 9 elevations x 6 lighting conditions. Images are downsampled from 96x96 to 48x48 pixels. For training, a 32x32-pixel patch is randomly cropped and random brightness and contrast are added. For testing, a 32x32-pixel patch is cropped from the center of the image.
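A rough Python sketch of this preprocessing for one image (the downsampling method and the brightness/contrast jitter ranges are assumptions; they are not specified on the slide):

```python
import numpy as np

def preprocess(img, training, rng=np.random.default_rng()):
    """Downsample a 96x96 smallNORB image to 48x48, then take a 32x32 crop.

    Training: random crop plus random brightness/contrast jitter.
    Test: central crop, no jitter.
    """
    img = img[::2, ::2].astype(np.float32)            # naive 2x downsample to 48x48
    if training:
        y, x = rng.integers(0, 48 - 32 + 1, size=2)
        img = img[y:y + 32, x:x + 32]                 # random 32x32 crop
        img = img * rng.uniform(0.8, 1.2) + rng.uniform(-20.0, 20.0)  # contrast, brightness
        img = np.clip(img, 0.0, 255.0)
    else:
        img = img[8:40, 8:40]                         # central 32x32 crop
    return img
```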

Capsules Architecture: ReLU Conv1 -> PrimaryCaps -> ConvCaps1 -> ConvCaps2 -> Class Capsules
ReLU Conv1 [14, 14, 32]: the original 32x32 image is passed through a 5x5 convolutional layer with stride 2 and 32 channels, followed by ReLU.
PrimaryCaps [14, 14, 32] capsules: each capsule has a 4x4 pose matrix plus 1 activation (layer L). M and a for layer L+1 are calculated by EM routing.
ConvCaps1 [6, 6, 32] capsules: 3x3 convolutional capsule layer with stride 2. M and a for layer L+2 are calculated by EM routing.
ConvCaps2 [4, 4, 32] capsules: 3x3 convolutional capsule layer with stride 1. M and a for layer L+3 are calculated by EM routing.
Class Capsules [5, 1]: one capsule for each of the 5 toy categories, fully connected to layer L+2 (ConvCaps2) with Coordinate Addition. The final M and a of the Class Capsules are calculated by EM routing.
In ReLU Conv1, the original image is convolved with 32 different 5x5 kernels at a stride of 2, and negative results are set to 0 by the Rectified Linear Unit (ReLU). In PrimaryCaps, the 32 channels are kept; for each position, a 4x4 pose matrix and an activation are computed from the 32 channels and packaged into that position's 32 capsules.
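A rough PyTorch-style sketch of the first two stages only (ReLU Conv1 and PrimaryCaps), in the way they are commonly implemented; the 1x1-convolution construction of the poses and activations and the sigmoid on the activations are assumptions, not the authors' code:

```python
import torch
import torch.nn as nn

class CapsFrontEnd(nn.Module):
    """ReLU Conv1 + PrimaryCaps for a 32x32 single-channel input (illustrative sketch)."""

    def __init__(self, n_caps=32, pose_dim=16):
        super().__init__()
        self.n_caps = n_caps
        # ReLU Conv1: 32 kernels of 5x5, stride 2 -> feature map [32, 14, 14]
        self.conv1 = nn.Conv2d(1, 32, kernel_size=5, stride=2)
        # PrimaryCaps: at each position, build 32 capsules (a 4x4 pose matrix = 16
        # values, plus 1 activation each) from the 32 Conv1 channels.
        self.pose = nn.Conv2d(32, n_caps * pose_dim, kernel_size=1)
        self.act = nn.Conv2d(32, n_caps, kernel_size=1)

    def forward(self, x):                        # x: [batch, 1, 32, 32]
        f = torch.relu(self.conv1(x))            # [batch, 32, 14, 14]
        b, _, h, w = f.shape
        pose = self.pose(f).view(b, self.n_caps, 4, 4, h, w)
        a = torch.sigmoid(self.act(f))           # [batch, 32, 14, 14]
        return pose, a
```

The ConvCaps and Class Capsule layers would then apply the transformation matrices Wij and the EM routing sketched above.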

Conclusions A new type of capsule system is proposed, together with a new iterative routing procedure between capsule layers based on the EM algorithm. On the smallNORB data set, errors drop from 5.2% (baseline CNN) to 1.8%; relative to the previous state of the art, the paper reports a 45% reduction in test errors.

Relations between Capsules and My Proposal (Draft) Temporal changes of different land cover and land use types can be identified by integrating multi-source and multi-temporal remote sensing data. With multi-temporal remote sensing data, the change of pixel values at the same position could be described by a pose transformation matrix.

The End