Introduction to MATRIX CAPSULES WITH EM ROUTING


Introduction to MATRIX CAPSULES WITH EM ROUTING, 1st May 2018

Who Am I?
Education:
- 2002, B.S., Department of Geography, Chinese Culture University
- 2006, M.S., Department of Geography, National Taiwan University
Experience:
- Group Leader, Remote Sensing and Value-Added Data Group, Center for Spatial Information Research, National Taiwan University
- Research Assistant and Group Leader, Value-Added Data Group, Center for Spatial Information Research, National Taiwan University
- Group Leader, Remote Sensing and Value-Added Data Group, Space and Environment Technology Foundation
- Research Assistant, Remote Sensing Group and Value-Added Data Group, Space and Environment Technology Foundation
- Lecturer, Department of Leisure Management, Taiwan Hospitality and Tourism College
- Part-time Geography Teacher, Tatung Senior High School
Journal Articles:
- Tzu-How Chu, Meng-Lung Lin and Chia-Hao Chang (2012). mGUIDING (Mobile Guiding): Using a Mobile GIS App for Guiding. Scandinavian Journal of Hospitality and Tourism, 12(3): 269-283.
- Tzu-How Chu, Meng-Lung Lin, Chia-Hao Chang and Cheng-Wu Chen (2011). Developing a Tour Guiding Information System for Tourism Service Using Mobile GIS and GPS Techniques. Advances in Information Sciences and Service Sciences, 3(6): 49-58.
- Chia-Hao Chang, Tzu-How Chu and Ying-Yu Liu (2005). Monitoring Rooftop Additions in Taipei City Using High-Resolution Remote Sensing Imagery. Journal of Taiwan Geographic Information Science, No. 3, pp. 15-26. (in Chinese)

Geoffrey E. Hinton, "The Godfather of AI": British cognitive psychologist and computer scientist; Emeritus Professor at the Dept. of Computer Science, University of Toronto. (AI: Artificial Intelligence)

Convolutional Neural Networks (CNNs) (figure source: https://chtseng.wordpress.com/2017/09/12/%E5%88%9D%E6%8E%A2%E5%8D%B7%E7%A9%8D%E7%A5%9E%E7%B6%93%E7%B6%B2%E8%B7%AF/)

ReLU (figure source: https://chtseng.wordpress.com/2017/09/12/%E5%88%9D%E6%8E%A2%E5%8D%B7%E7%A9%8D%E7%A5%9E%E7%B6%93%E7%B6%B2%E8%B7%AF/)

Figure source: https://chtseng.wordpress.com/2017/09/12/%E5%88%9D%E6%8E%A2%E5%8D%B7%E7%A9%8D%E7%A5%9E%E7%B6%93%E7%B6%B2%E8%B7%AF/

Calculated from training data (figure sources: https://chtseng.wordpress.com/2017/09/12/%E5%88%9D%E6%8E%A2%E5%8D%B7%E7%A9%8D%E7%A5%9E%E7%B6%93%E7%B6%B2%E8%B7%AF/ and https://brohrer.mcknote.com/zh-Hant/how_machine_learning_works/how_convolutional_neural_networks_work.html)

The human face as we imagine it vs. the human face as CNNs "think" of it (figure source: https://medium.com/ai%C2%B3-theory-practice-business/understanding-hintons-capsule-networks-part-i-intuition-b4b559d1159b)

What Is a Capsule? Instead of aiming for viewpoint invariance in the activities of “neurons” that use a single scalar output to summarize the activities of a local pool of replicated feature detectors, artificial neural networks should use local “capsules” that perform some quite complicated internal computations on their inputs and then encapsulate the results of these computations into a small vector of highly informative outputs. Each capsule learns to recognize an implicitly defined visual entity over a limited domain of viewing conditions and deformations and it outputs both the probability that the entity is present within its limited domain and a set of “instantiation parameters” that may include the precise pose, lighting and deformation of the visual entity relative to an implicitly defined canonical version of that entity. --Geoffrey E. Hinton et al., 2011, Transforming Auto-Encoders, ICANN 2011

Differences between CNNs and Capsules An important difference between capsules and standard neural nets is that the activation of a capsule is based on a comparison between multiple incoming pose predictions whereas in a standard neural net it is based on a comparison between a single incoming activity vector and a learned weight vector.

How Do Capsules Work? Each capsule has a 4x4 pose matrix, M, and an activation probability, a. In between each capsule i in layer L and each capsule j in layer L+1 is a 4x4 trainable transformation matrix, Wij. The pose matrix of capsule i is transformed by Wij to cast a vote Vij = MiWij for the pose matrix of capsule j.
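A minimal NumPy sketch of the vote computation Vij = MiWij (the capsule counts and variable names are illustrative only, not taken from the paper or slides):

```python
import numpy as np

n_in, n_out = 8, 4                        # capsules in layer L and layer L+1 (illustrative sizes)
M = np.random.randn(n_in, 4, 4)           # pose matrix M_i of each capsule i in layer L
W = np.random.randn(n_in, n_out, 4, 4)    # trainable transformation matrices W_ij

# Vote of capsule i for the pose of capsule j: V_ij = M_i @ W_ij
V = np.einsum('ipq,ijqr->ijpr', M, W)     # shape (n_in, n_out, 4, 4)
```

These votes, together with the activations of layer L, are the inputs to the EM routing procedure described next.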

Expectation-Maximization Algorithm
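The slide only names the algorithm, so here is a simplified NumPy sketch of EM routing between two capsule layers, following the procedure described in the paper. For illustration, the learned per-capsule costs beta_u and beta_a and the inverse temperature lambda are fixed scalars (the paper learns the betas and anneals lambda), and the numerical details (epsilon, vectorized shapes) are my own assumptions:

```python
import numpy as np

def em_routing(a_in, V, n_iters=3, beta_u=0.0, beta_a=0.0, lam=1.0, eps=1e-9):
    """Simplified EM routing between capsule layers L and L+1.

    a_in: (n_i,)          activations of the lower-level capsules
    V:    (n_i, n_j, 16)  votes V_ij, i.e. flattened 4x4 pose matrices
    Returns the activations (n_j,) and mean poses (n_j, 16) of layer L+1.
    """
    n_i, n_j, d = V.shape
    R = np.full((n_i, n_j), 1.0 / n_j)                 # routing assignments r_ij

    for _ in range(n_iters):
        # M-step: fit one Gaussian per higher-level capsule j from its assigned votes.
        Ra = R * a_in[:, None]                         # weight assignments by input activation
        Ra_sum = Ra.sum(axis=0) + eps                  # (n_j,)
        mu = (Ra[:, :, None] * V).sum(axis=0) / Ra_sum[:, None]
        sigma2 = (Ra[:, :, None] * (V - mu[None]) ** 2).sum(axis=0) / Ra_sum[:, None]
        cost = (beta_u + 0.5 * np.log(sigma2 + eps)) * Ra_sum[:, None]
        a_out = 1.0 / (1.0 + np.exp(-lam * (beta_a - cost.sum(axis=1))))   # logistic

        # E-step: re-assign each lower-level capsule across the higher-level Gaussians.
        log_p = -0.5 * (((V - mu[None]) ** 2) / (sigma2[None] + eps)
                        + np.log(2.0 * np.pi * (sigma2[None] + eps))).sum(axis=2)
        R = a_out[None, :] * np.exp(log_p)
        R = R / (R.sum(axis=1, keepdims=True) + eps)

    return a_out, mu
```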

SPREAD LOSS Use “spread loss” to directly maximize the gap between the activation of the target class (at) and the activations of the other classes, which makes training less sensitive to the initialization and hyper-parameters of the model: Li = max(0, m - (at - ai))^2 and L = Σ(i≠t) Li, where L is the total spread loss, Li is the spread loss for wrong class i, at is the activation of the target class, ai is the activation of wrong class i, and m is the margin.
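A minimal NumPy sketch of the spread loss for a single example (function and variable names are my own; the note about the margin schedule follows the paper):

```python
import numpy as np

def spread_loss(a, target, m=0.9):
    """Spread loss: sum over wrong classes i of max(0, m - (a_t - a_i))^2.

    a: (n_classes,) class-capsule activations; target: index of the true class.
    In the paper the margin m is increased linearly from 0.2 to 0.9 during training.
    """
    gaps = np.maximum(0.0, m - (a[target] - a))
    gaps[target] = 0.0                     # the target class does not contribute
    return np.sum(gaps ** 2)
```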

Capsules Architecture (ReLU: Rectified Linear Unit)

The Small NORB Dataset Contains images of 50 toys belonging to 5 generic categories: four-legged animals, human figures, airplanes, trucks, and cars. Imaged by two cameras under 6 lighting conditions, 9 elevations (30 to 70 degrees every 5 degrees), and 18 azimuths (0 to 340 every 20 degrees). NORB: NYU Object Recognition Benchmark

Experiments The training and test sets each contain 24,300 stereo pairs of 96x96-pixel images: 5 categories x 5 physical instances x 18 azimuths x 9 elevations x 6 lighting conditions. Images are downsampled from 96x96 to 48x48 pixels. For training, a 32x32-pixel patch is randomly cropped and random brightness and contrast are added. For testing, a 32x32-pixel patch is cropped from the center of the image.
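A rough Python sketch of this preprocessing for one image (the downsampling method and the brightness/contrast jitter ranges are assumptions; they are not specified on the slide):

```python
import numpy as np

def preprocess(img, training, rng=np.random.default_rng()):
    """Downsample a 96x96 smallNORB image to 48x48, then take a 32x32 crop.

    Training: random crop plus random brightness/contrast jitter.
    Test: central crop, no jitter.
    """
    img = img[::2, ::2].astype(np.float32)            # naive 2x downsample to 48x48
    if training:
        y, x = rng.integers(0, 48 - 32 + 1, size=2)
        img = img[y:y + 32, x:x + 32]                 # random 32x32 crop
        img = img * rng.uniform(0.8, 1.2) + rng.uniform(-20.0, 20.0)  # contrast, brightness
        img = np.clip(img, 0.0, 255.0)
    else:
        img = img[8:40, 8:40]                         # central 32x32 crop
    return img
```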

Capsules Architecture: ReLU Conv1 -> PrimaryCaps -> ConvCaps1 -> ConvCaps2 -> Class Capsules
ReLU Conv1 [14, 14, 32]: the original 32x32 image is passed through a 5x5 convolutional layer with stride 2 and 32 channels, followed by ReLU.
PrimaryCaps [14, 14, 32] capsules: each capsule has a 4x4 pose matrix plus 1 activation (layer L). M and a for layer L+1 are calculated by EM routing.
ConvCaps1 [6, 6, 32] capsules: 3x3 convolutional capsule layer with stride 2. M and a for layer L+2 are calculated by EM routing.
ConvCaps2 [4, 4, 32] capsules: 3x3 convolutional capsule layer with stride 1. M and a for layer L+3 are calculated by EM routing.
Class Capsules [5, 1]: one capsule for each of the 5 toy categories, fully connected to layer L+2 (ConvCaps2) with Coordinate Addition. The final M and a of the Class Capsules are calculated by EM routing.
In ReLU Conv1, the original image is convolved with 32 different 5x5 kernels at a stride of 2, and negative results are set to 0 by the Rectified Linear Unit (ReLU). In PrimaryCaps, the 32 channels are kept; for each position, a 4x4 pose matrix and an activation are computed from the 32 channels and packaged into that position's 32 capsules.
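A rough PyTorch-style sketch of the first two stages only (ReLU Conv1 and PrimaryCaps), in the way they are commonly implemented; the 1x1-convolution construction of the poses and activations and the sigmoid on the activations are assumptions, not the authors' code:

```python
import torch
import torch.nn as nn

class CapsFrontEnd(nn.Module):
    """ReLU Conv1 + PrimaryCaps for a 32x32 single-channel input (illustrative sketch)."""

    def __init__(self, n_caps=32, pose_dim=16):
        super().__init__()
        self.n_caps = n_caps
        # ReLU Conv1: 32 kernels of 5x5, stride 2 -> feature map [32, 14, 14]
        self.conv1 = nn.Conv2d(1, 32, kernel_size=5, stride=2)
        # PrimaryCaps: at each position, build 32 capsules (a 4x4 pose matrix = 16
        # values, plus 1 activation each) from the 32 Conv1 channels.
        self.pose = nn.Conv2d(32, n_caps * pose_dim, kernel_size=1)
        self.act = nn.Conv2d(32, n_caps, kernel_size=1)

    def forward(self, x):                        # x: [batch, 1, 32, 32]
        f = torch.relu(self.conv1(x))            # [batch, 32, 14, 14]
        b, _, h, w = f.shape
        pose = self.pose(f).view(b, self.n_caps, 4, 4, h, w)
        a = torch.sigmoid(self.act(f))           # [batch, 32, 14, 14]
        return pose, a
```

The ConvCaps and Class Capsule layers would then apply the transformation matrices Wij and the EM routing sketched above.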

Conclusions A new type of capsule system is proposed, together with a new iterative routing procedure between capsule layers based on the EM algorithm. On the smallNORB data set, errors drop from 5.2% (baseline CNN) to 1.8%; relative to the previous state of the art, the paper reports a 45% reduction in test errors.

Relations between Capsules and My Proposal (Draft) Temporal changes of different land cover and land use types can be identified by integrating multi-source and multi-temporal remote sensing data. With multi-temporal remote sensing data, the change of pixel values at the same position could be described by a pose transformation matrix.

The End