Convolutional Neural Networks

Presentation transcript:

Convolutional Neural Networks
Su-A Kim, 12th August 2014 @CVLAB

Table of contents
Introduce Convolutional Neural Networks.
Introduce the application paper: "DeepFace: Closing the Gap to Human-Level Performance in Face Verification", CVPR 2014.

History
In 1995, Yann LeCun and Yoshua Bengio introduced the concept of convolutional neural networks. (Around 1989 there were already attempts to solve such problems with neural networks trained by back-propagation, but CNNs came into use from 1995.)

Recap of ConvNet
A neural network with a specialized connectivity structure. Feed-forward processing: input image → convolution (learned filters) → non-linearity (rectified linear) → pooling (local max) → feature maps. Supervised: the convolutional filters are trained by back-propagating the classification error. Performing a convolution is the same as filtering the image. (Slide: R. Fergus)
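A minimal sketch of this convolution / non-linearity / pooling pipeline in PyTorch (channel counts and kernel sizes are illustrative, not taken from the slides):

```python
import torch
import torch.nn as nn

# Convolution (learned) -> non-linearity (ReLU) -> pooling (local max),
# the three stages listed above; channel/kernel sizes are arbitrary.
convnet_stage = nn.Sequential(
    nn.Conv2d(in_channels=3, out_channels=16, kernel_size=5),  # learned filters
    nn.ReLU(),                                                 # rectified linear
    nn.MaxPool2d(kernel_size=2),                               # local max pooling
)

x = torch.randn(1, 3, 32, 32)        # a dummy input image
feature_maps = convnet_stage(x)      # -> (1, 16, 14, 14) feature maps
```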

Connectivity and weight sharing depend on the layer
Fully connected layers have all-different weights; convolutional layers use shared weights (weights drawn in the same color are shared). Why is weight sharing useful? It lets the network extract the same feature regardless of where it appears in the input (under translation, small rotation, and so on). Thanks to local connectivity and weight sharing, a convolution layer has far fewer parameters than a fully connected layer.
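A back-of-the-envelope comparison to make the parameter saving concrete (the 32 x 32 x 3 input, 16 output maps, and 5 x 5 kernel are hypothetical sizes, not taken from the slides):

```python
# Hypothetical sizes: 32x32x3 input, 16 output feature maps.
in_h, in_w, in_c, out_maps = 32, 32, 3, 16
k = 5                                  # 5x5 convolution kernel

# Fully connected: every output unit sees every input pixel.
fc_params = (in_h * in_w * in_c) * (in_h * in_w * out_maps)

# Convolutional: one shared k x k x in_c filter (+ bias) per output map,
# reused at every spatial position.
conv_params = out_maps * (k * k * in_c + 1)

print(fc_params)    # 50331648
print(conv_params)  # 1216
```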

Convolution layer
Detects the same feature at different positions in the input image: input image → filter (kernel) → feature map. (Slide: R. Fergus)
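For intuition, the same idea written from scratch in NumPy: one (hypothetical) filter is slid over every position of the image, and its responses form the feature map:

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide one kernel over the image; each output value is the response of
    the same feature detector at a different position ('valid' convolution)."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.random.rand(8, 8)
kernel = np.array([[1., 0., -1.],      # a hypothetical vertical-edge filter
                   [1., 0., -1.],
                   [1., 0., -1.]])
feature_map = conv2d_valid(image, kernel)   # shape (6, 6)
```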

Non-linearity
Tanh (hyperbolic tangent); Sigmoid: 1/(1+exp(-x)); Rectified linear (ReLU): max(0, x). ReLU simplifies back-propagation, makes learning faster, and makes the features sparse, so it is the preferred option. (Slide: R. Fergus)
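The three non-linearities above, written out in NumPy for reference:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))   # squashes values to (0, 1)

def tanh(x):
    return np.tanh(x)                 # squashes values to (-1, 1)

def relu(x):
    return np.maximum(0.0, x)         # max(0, x): cheap gradient, sparse outputs

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(sigmoid(x), tanh(x), relu(x))
```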

Sub-sampling layer
Spatial pooling: average or max (see Boureau et al., ICML 2010 for a theoretical analysis suggesting that max pooling works better). Role of pooling: invariance to small transformations, and reducing the effect of noise, shifts, and distortions. (Slide: R. Fergus)
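A minimal NumPy sketch of the two pooling options (non-overlapping windows assumed):

```python
import numpy as np

def pool2d(feature_map, size=2, mode="max"):
    """Non-overlapping spatial pooling; assumes the map size is divisible by `size`."""
    h, w = feature_map.shape
    blocks = feature_map.reshape(h // size, size, w // size, size)
    if mode == "max":
        return blocks.max(axis=(1, 3))   # max pooling
    return blocks.mean(axis=(1, 3))      # average pooling

fm = np.arange(16.0).reshape(4, 4)
print(pool2d(fm, 2, "max"))   # 2x2 output, each value is the max of a 2x2 block
print(pool2d(fm, 2, "mean"))
```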

Normalization
Contrast normalization (within/across feature maps) equalizes the feature maps, emphasizing coarse structure rather than fine detail. Compare the feature maps before and after contrast normalization. (Slide: R. Fergus)
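A rough sketch of per-feature-map contrast normalization; this is a simplified global version, whereas the original operation works over local neighborhoods:

```python
import numpy as np

def contrast_normalize(feature_maps, eps=1e-5):
    """Subtract the mean and divide by the standard deviation of each feature map,
    so maps with very different dynamic ranges become comparable."""
    mean = feature_maps.mean(axis=(1, 2), keepdims=True)   # per-map mean
    std = feature_maps.std(axis=(1, 2), keepdims=True)     # per-map std
    return (feature_maps - mean) / (std + eps)

fmaps = np.random.rand(16, 14, 14)          # 16 feature maps of size 14x14
normalized = contrast_normalize(fmaps)
```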

LeNet 5
C1, C3, C5: convolutional layers (5 × 5 convolution kernels). S2, S4: sub-sampling layers (by a factor of 2). F6: fully connected layer. About 187,000 connections; about 14,000 trainable weights.
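A PyTorch sketch of this LeNet-5 layout for a 32 x 32 grayscale input; the original trainable sub-sampling and partial C3 connectivity are simplified here to average pooling and standard convolutions:

```python
import torch
import torch.nn as nn

class LeNet5(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5),    # C1: 6 maps, 5x5 kernels -> 28x28
            nn.Tanh(),
            nn.AvgPool2d(2),                   # S2: sub-sample by factor 2 -> 14x14
            nn.Conv2d(6, 16, kernel_size=5),   # C3: 16 maps, 5x5 kernels -> 10x10
            nn.Tanh(),
            nn.AvgPool2d(2),                   # S4: sub-sample by factor 2 -> 5x5
            nn.Conv2d(16, 120, kernel_size=5), # C5: 120 maps, 5x5 kernels -> 1x1
        )
        self.classifier = nn.Sequential(
            nn.Tanh(),
            nn.Flatten(),
            nn.Linear(120, 84),                # F6: fully connected
            nn.Tanh(),
            nn.Linear(84, num_classes),
        )

    def forward(self, x):                      # x: (batch, 1, 32, 32)
        return self.classifier(self.features(x))

logits = LeNet5()(torch.randn(1, 1, 32, 32))   # -> (1, 10)
```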

LeNet 5
Robust even to noisy inputs.

About CNNs
A special kind of multi-layer neural network. It implicitly extracts the relevant features: a feed-forward network that can extract topological properties from an image. Like almost every other neural network, CNNs are trained with a version of the back-propagation algorithm.

DeepFace: Closing the Gap to Human-Level Performance in Face Verification
Yaniv Taigman, Ming Yang, Marc'Aurelio Ranzato, Lior Wolf. Facebook AI Research, Tel Aviv University.
Humans reach about 97.5% on this task. The paper develops a learning method that uses an extremely large face dataset to learn a face representation that generalizes to other datasets, and reaches an accuracy of 97.35%.

Architecture
Face alignment → representation (CNN).

Face Alignment
(1) 2D alignment: first detect the face region, then extract six fiducial points (eyes, nose, mouth) inside it. The points are predicted by a pre-trained SVR (Support Vector Regressor) that uses an LBP histogram as the image descriptor. The extraction is refined over several iterations: the induced similarity matrix T warps the current image into a new one, fiducial points are re-extracted on the warped image, and the process repeats until their positions are accurate. The result of this stage is the 2D-aligned crop (b).
(2) 3D alignment: 67 landmarks → landmark mapping → 2D-3D alignment → frontalization → 2D projection.

Representation: C1-M2-C3
Input: 152 × 152. These three layers extract low-level features such as simple edges and texture. Max pooling (M2) makes the output of the convolution more robust; applied to aligned facial images, it reduces the effect of small registration errors. Why is max pooling applied only after the first convolution layer? Because several rounds of pooling would lose the precise positions of detailed facial structure and micro-texture. Although these layers account for most of the computation, they hold few parameters; they merely expand the input into a set of simple local features.
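A rough PyTorch sketch of the C1-M2-C3 front end. The layer sizes used here (32 filters of 11 x 11, 3 x 3 max pooling with stride 2, 16 filters of 9 x 9) come from my reading of the DeepFace paper, not from the slide, so treat them as assumptions:

```python
import torch
import torch.nn as nn

# C1-M2-C3: low-level feature extractor over the 152x152 aligned RGB face.
# Only one max-pooling layer is used, right after C1.
front_end = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=11),         # C1
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=3, stride=2),    # M2: the only pooling layer
    nn.Conv2d(32, 16, kernel_size=9),         # C3
    nn.ReLU(),
)

x = torch.randn(1, 3, 152, 152)   # aligned face crop
low_level = front_end(x)
print(low_level.shape)            # torch.Size([1, 16, 62, 62])
```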

Representation: L4-L5-L6 (locally connected)
Like a convolutional layer, a locally connected layer applies a filter bank, but every location in the feature map learns its own filters (in a CNN, all locations in a feature map share the same filters). Why locally connected layers? Different regions of an aligned face have different local statistics: the region between the eyes and the eyebrows looks very different from the region between the nose and the mouth and is highly discriminative, so the spatial stationarity assumption behind convolution does not hold. Using locally connected layers does not affect the computational cost of feature extraction, but it does increase the number of parameters that must be trained.
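PyTorch has no built-in locally connected layer, so below is a minimal sketch of one, built with nn.Unfold, that keeps a separate filter bank for every output position; the class name and all sizes are illustrative:

```python
import torch
import torch.nn as nn

class LocallyConnected2d(nn.Module):
    """Like Conv2d, but every output location has its own (unshared) filters."""
    def __init__(self, in_ch, out_ch, in_size, kernel, stride=1):
        super().__init__()
        h, w = in_size
        out_h = (h - kernel) // stride + 1
        out_w = (w - kernel) // stride + 1
        self.out_hw = (out_h, out_w)
        self.unfold = nn.Unfold(kernel, stride=stride)
        # one (out_ch x in_ch*k*k) filter bank per spatial location
        self.weight = nn.Parameter(
            torch.randn(out_h * out_w, out_ch, in_ch * kernel * kernel) * 0.01)
        self.bias = nn.Parameter(torch.zeros(out_ch, out_h, out_w))

    def forward(self, x):
        patches = self.unfold(x)                               # (B, C*k*k, L)
        out = torch.einsum('bkl,lok->bol', patches, self.weight)
        return out.view(x.size(0), -1, *self.out_hw) + self.bias

lc = LocallyConnected2d(in_ch=3, out_ch=4, in_size=(12, 12), kernel=5)
y = lc(torch.randn(1, 3, 12, 12))   # -> (1, 4, 8, 8), one filter set per position
```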

Representation: F7-F8 (fully connected)
These layers can capture correlations between features extracted in distant parts of the face image, such as the position and shape of the eyes and the position and shape of the mouth. The output of F7 is used as the raw face representation feature vector. The output of F8 is fed to a K-way softmax that produces a probability distribution over the class labels.
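A sketch of the fully connected head. The 4096-dimensional F7, the 16 x 21 x 21 input shape, and the 4030-way output are assumptions based on my reading of the paper, not stated on the slide:

```python
import torch
import torch.nn as nn

num_identities = 4030                      # assumed: one output unit per training identity
f7_f8 = nn.Sequential(
    nn.Flatten(),
    nn.Linear(16 * 21 * 21, 4096),         # F7: its activations are the raw face descriptor
    nn.ReLU(),
    nn.Linear(4096, num_identities),       # F8: scores fed to the K-way softmax
)

local_features = torch.randn(1, 16, 21, 21)   # assumed output shape of the locally connected layers
logits = f7_f8(local_features)
probs = torch.softmax(logits, dim=1)          # probability distribution over class labels
```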

Training
The goal is to maximize the probability of the correct class. The loss is minimized by back-propagation and the parameters are updated with stochastic gradient descent (SGD). ReLU (rectified linear units) are used as the activation function instead of tanh or sigmoid, so the features produced by this network are very sparse.
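A minimal single-step training sketch matching this description (softmax cross-entropy minimized with SGD); the tiny `model`, the dummy batch, and the learning-rate/momentum values are placeholders:

```python
import torch
import torch.nn as nn

# `model` is a tiny stand-in for the real network; the loss and update rule are
# the ones described above: K-way softmax cross-entropy minimized with SGD.
model = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32 * 3, 10))
criterion = nn.CrossEntropyLoss()                 # softmax + negative log-likelihood
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)  # typical values

images = torch.randn(8, 3, 32, 32)                # dummy mini-batch
labels = torch.randint(0, 10, (8,))

logits = model(images)
loss = criterion(logits, labels)                  # -log p(correct class), averaged
optimizer.zero_grad()
loss.backward()                                   # back-propagate the classification error
optimizer.step()                                  # SGD parameter update
```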

Result
LFW: Labeled Faces in the Wild database (the de facto benchmark). YTF: YouTube Faces. DeepFace reduces the error of the previous best methods by more than 50%. (About 100 of the YouTube Faces pairs are mislabeled; if those are counted as errors, the accuracy is about 92.5%.)

References
[1] Bouchain, David. "Character recognition using convolutional neural networks." Institute for Neural Information Processing 2007 (2006).
[2] Bouvrie, Jake. "Notes on convolutional neural networks." (2006).
[3] Glorot, Xavier, Antoine Bordes, and Yoshua Bengio. "Deep sparse rectifier neural networks." Proceedings of the 14th International Conference on Artificial Intelligence and Statistics, JMLR W&CP Vol. 15, 2011.
[4] Ahonen, Timo, Abdenour Hadid, and Matti Pietikainen. "Face description with local binary patterns: Application to face recognition." IEEE Transactions on Pattern Analysis and Machine Intelligence 28.12 (2006): 2037-2041.
[5] Bengio, Yoshua. "Learning deep architectures for AI." Foundations and Trends in Machine Learning 2.1 (2009): 1-127.