A Proposal Defense On Deep Residual Network For Face Recognition Presented By SAGAR MISHRA MECE 15945
Face Recognition
Faces are complex, multidimensional, meaningful visual stimuli, and developing a computational model for face recognition is difficult. It is a very active research topic because of its applications in human-robot interaction, human-machine interfaces, driving safety, and security. Despite significant improvements, face recognition remains a challenging problem that calls for ever more accurate algorithms.
WHY DEEP RESIDUAL NETWORK ??
Problem Statement
A deeper network can model more complex problems, which is expected to improve accuracy. However, training a deeper network is more difficult because of two problems.
Vanishing gradients: gradients shrink as they are propagated back through many layers, so neurons can "die" during training and become useless, which causes information loss.
Optimization difficulty: the number of weights and biases grows with depth, which makes the network very hard to train.
Consequences of a Deeper Network
When the number of layers in a plain network is increased, both the training error and the test error increase. Kaiming He et al. [1] increased the depth of a plain convolutional network from 20 to 56 layers and showed that both errors went up.
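Deep Residual Learning
Plain net: H(x) is any desired mapping; the hope is that the two weight layers fit H(x) directly.
Residual net: H(x) = F(x) + x, where F(x) is the residual mapping; the two weight layers fit F(x), and the input x is added back through an identity shortcut.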
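The relation H(x) = F(x) + x can be sketched in a few lines. This is a minimal illustration only: it assumes two small fully connected layers in place of the convolutional layers a real ResBlock would use, and it applies a ReLU after the addition, as is common practice. The helper names `linear` and `residual_block` are hypothetical.

```python
def relu(v):
    """Element-wise ReLU on a list of numbers."""
    return [max(0.0, a) for a in v]

def linear(weights, bias, x):
    """Dense layer: weights is a list of rows, one row per output unit."""
    return [sum(w * a for w, a in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def residual_block(x, w1, b1, w2, b2):
    """Return relu(F(x) + x), where F is two weight layers with a ReLU between."""
    f = linear(w2, b2, relu(linear(w1, b1, x)))      # residual mapping F(x)
    return relu([fi + xi for fi, xi in zip(f, x)])   # identity shortcut adds x

# With all-zero weights F(x) = 0, so the block passes non-negative x through unchanged.
zeros = [[0.0, 0.0], [0.0, 0.0]]
x = [1.5, 2.0]
y = residual_block(x, zeros, [0.0, 0.0], zeros, [0.0, 0.0])
```

The zero-weight case shows why the shortcut helps: a block that has learned nothing still behaves as an identity mapping rather than destroying its input.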
Research Objective
To design a face recognition system using a deep residual network.
To compare the performance of this network with its predecessors: VGG-Face and the best result of the paper "Deeply learned face representations are sparse, selective, and robust" [6].
The face recognition system is expected to reduce the training period and the training error by using a deep residual network.
The comparison is based on the following parameters: accuracy, training period, and training error.
System Model
Face Detection
Detecting faces in the set of images is the first step. Our model will use Histograms of Oriented Gradients (HOG).
Steps for the HOG transform:
Convert the image to grayscale (black and white).
For every pixel, look at the pixels that directly surround it.
Find the direction in which the image gets darker.
Replace every pixel with an arrow pointing in that direction; these arrows are called gradients.
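The per-pixel gradient step listed above can be sketched as follows. This is an illustrative sketch in plain Python, assuming the image is already a 2D list of grayscale values; it covers only the gradient computation, not the later grouping of gradients into cell histograms. The function name `gradients` is hypothetical.

```python
import math

def gradients(img):
    """Per-pixel gradient magnitude and angle (degrees) for a 2D grayscale list.

    Uses central differences on interior pixels; border pixels are left at zero.
    """
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    ang = [[0.0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            gx = img[r][c + 1] - img[r][c - 1]   # horizontal brightness change
            gy = img[r + 1][c] - img[r - 1][c]   # vertical brightness change
            mag[r][c] = math.hypot(gx, gy)
            # Negated so that the arrow points toward darker pixels.
            ang[r][c] = math.degrees(math.atan2(-gy, -gx)) % 360.0
    return mag, ang

# Brightness increases from left to right, so "darker" lies to the left (180 degrees).
img = [[0, 50, 100],
       [0, 50, 100],
       [0, 50, 100]]
mag, ang = gradients(img)
```

Working with directions of brightness change rather than raw pixel values makes the representation less sensitive to overall lighting.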
Preprocessing
Faces can appear in different poses and orientations. To account for this, the face landmark estimation algorithm introduced in 2014 by Vahid Kazemi and Josephine Sullivan [5] will be used in our model. The basic idea is to locate 68 landmark points that exist on every face. Finally, an affine transformation will be used to center the nose, eyes, and mouth.
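As a rough illustration of the alignment step, an affine transformation can be fitted from three landmark correspondences and then applied to any point. The sketch below assumes the landmarks have already been detected; the coordinates and the helper names `fit_affine` and `apply_affine` are hypothetical, and a real pipeline would typically use more landmarks and a library routine to warp the whole image.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def solve3(a, b):
    """Solve the 3x3 system a @ x = b with Cramer's rule."""
    d = det3(a)
    out = []
    for k in range(3):
        ak = [[b[i] if j == k else a[i][j] for j in range(3)] for i in range(3)]
        out.append(det3(ak) / d)
    return out

def fit_affine(src, dst):
    """Affine map taking three source landmarks onto three target positions.

    Returns rows (a, b, tx) and (c, d, ty) with
    x' = a*x + b*y + tx and y' = c*x + d*y + ty.
    """
    a = [[x, y, 1.0] for x, y in src]
    return solve3(a, [p[0] for p in dst]), solve3(a, [p[1] for p in dst])

def apply_affine(rows, point):
    """Apply the fitted affine map to one (x, y) point."""
    x, y = point
    return tuple(r[0] * x + r[1] * y + r[2] for r in rows)

# Hypothetical landmark positions: left eye, right eye, mouth centre.
detected = [(30.0, 40.0), (70.0, 44.0), (52.0, 90.0)]
template = [(30.0, 30.0), (70.0, 30.0), (50.0, 80.0)]
rows = fit_affine(detected, template)
aligned = [apply_affine(rows, p) for p in detected]
```

After the transformation the three landmarks sit at the template positions, so every face presented to the network has its eyes and mouth in roughly the same place.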
Feature Extraction
A residual network will be trained on the available dataset and used to extract features from each image. A residual network with a depth of 19 layers will be trained. A VGG-19 net will also be trained as a baseline for comparison.
Deep Residual Block (ResBlock)
Convolutional Neural Network
Consists of input, output, and hidden layers. The hidden layers are convolutional, pooling, and fully connected layers.
Convolutional layer: performs convolution and sends its output to the next layer. It consists of filters, which are convolved across the width and height of the input image to generate feature maps. At each position the operation is simply a dot product between the filter and the image patch beneath it.
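The sliding dot product performed by a convolutional layer can be sketched directly. This minimal example assumes a single-channel image, stride 1, and no padding, and, like most deep learning frameworks, it does not flip the kernel. The function name `conv2d` is hypothetical.

```python
def conv2d(image, kernel):
    """Slide kernel over image (stride 1, no padding) and take a dot product at each position."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Dot product between the kernel and the patch under it.
            row.append(sum(image[r + i][c + j] * kernel[i][j]
                           for i in range(kh) for j in range(kw)))
        out.append(row)
    return out

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, -1]]
feature_map = conv2d(image, kernel)   # [[-4, -4], [-4, -4]]
```

A 3x3 input and a 2x2 filter give a 2x2 feature map, since the filter fits in two positions along each axis.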
Convolutional Neural Network
Pooling layer: also called downsampling. It combines the outputs of a cluster of neurons in one layer into a single neuron in the next layer. Max pooling uses the maximum value from each cluster of neurons.
Fully connected layer: a traditional multilayer perceptron that acts as the output layer. It connects every neuron in one layer to every neuron in the next layer.
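Max pooling can be illustrated on a small feature map. The sketch assumes non-overlapping 2x2 windows, a common default that the slides do not specify; `max_pool` is a hypothetical helper name.

```python
def max_pool(fmap, size=2):
    """Non-overlapping size x size max pooling.

    Trailing rows or columns that do not fill a whole window are dropped.
    """
    out = []
    for r in range(0, len(fmap) - size + 1, size):
        row = []
        for c in range(0, len(fmap[0]) - size + 1, size):
            # Keep only the largest value in each window.
            row.append(max(fmap[r + i][c + j]
                           for i in range(size) for j in range(size)))
        out.append(row)
    return out

fmap = [[1, 3, 2, 0],
        [4, 2, 1, 5],
        [0, 1, 7, 2],
        [3, 6, 4, 8]]
pooled = max_pool(fmap)   # [[4, 5], [6, 8]]
```

Each 2x2 cluster collapses to one value, so the 4x4 map shrinks to 2x2 while keeping the strongest response in each region.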
Convolutional Neural Network
Rectified Linear Unit (ReLU)
An element-wise operation (applied per pixel) that replaces all negative values in the feature map with zero.
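Applied to a feature map stored as a 2D list, ReLU is a one-line operation. The sketch below is illustrative, and the input values are made up.

```python
def relu(feature_map):
    """Replace every negative value in a 2D feature map with zero."""
    return [[max(0, v) for v in row] for row in feature_map]

activated = relu([[-4, 2],
                  [0, -1]])   # [[0, 2], [0, 0]]
```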
Face Recognition
In the final step, the features extracted from the image under test are compared with the features stored in the face database to identify the face.
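One simple way to carry out this comparison is a nearest-neighbour search over the stored feature vectors. The slides do not state which similarity measure will be used, so the sketch below assumes cosine similarity with a rejection threshold; the threshold value, the feature vectors, and the name `identify` are all hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def identify(probe, database, threshold=0.8):
    """Return the best-matching name, or None if no score reaches the threshold."""
    best_name, best_score = None, threshold
    for name, feat in database.items():
        score = cosine_similarity(probe, feat)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name

# Toy 3-dimensional features; real network features would be much longer.
database = {"person_a": [1.0, 0.0, 0.0],
            "person_b": [0.0, 1.0, 0.0]}
match = identify([0.9, 0.1, 0.0], database)     # close to person_a
unknown = identify([0.0, 0.0, 1.0], database)   # matches nobody
```

The threshold lets the system report an unknown face instead of forcing a match to the closest enrolled person.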
Training the Network
The entire network will be trained with Stochastic Gradient Descent (SGD) with backpropagation.
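The SGD update rule can be shown on a deliberately tiny model. The sketch below fits y = w*x + b to noise-free data, updating the parameters after every sample; in a deep network the same update is applied to every weight, with backpropagation supplying the gradients. The learning rate, epoch count, and function name `sgd_fit` are illustrative assumptions.

```python
import random

def sgd_fit(samples, lr=0.05, epochs=200, seed=0):
    """Fit y = w*x + b by SGD on squared error, one sample per update."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    data = list(samples)
    for _ in range(epochs):
        rng.shuffle(data)                 # visit samples in random order
        for x, y in data:
            err = (w * x + b) - y         # forward pass and error
            w -= lr * err * x             # gradient of 0.5 * err**2 w.r.t. w
            b -= lr * err                 # gradient of 0.5 * err**2 w.r.t. b
    return w, b

# Data generated from y = 2x + 1, so SGD should recover w near 2 and b near 1.
samples = [(x, 2.0 * x + 1.0) for x in (0.0, 1.0, 2.0, 3.0)]
w, b = sgd_fit(samples)
```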
Training Dataset
The dataset that will be used is LFW (Labeled Faces in the Wild). It consists of face images, each labeled with the name of the person pictured.
Proposed Model
Performance Metric
The results of this model will be compared with VGG-Face and the best result of the paper [6]. The comparison with these earlier models will cover accuracy, training error, and training period.
Tools
The tools, programming language, and software that will be used in this thesis work are listed below:
Python programming language
Eclipse (PyCharm)
Caffe/Keras platform
Expected Output
An implementation of a deep residual network for face recognition.
The results of the residual network will be evaluated and compared with other leading models on accuracy, training error, and training period.
Schedule
Timeline chart of tasks by month (2017/2018): Sep, Oct, Nov, Dec, Jan, Feb.
Tasks: Literature Review; Proposal Defense; System Design and Coding; Mid-Term Defense; Final Submission of Thesis; Documentation of Thesis; Research and Experiments.
References
[1] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," ICCV.
[2] S. Lawrence, A. D. Back, et al., "Face recognition: A convolutional neural network approach," IEEE, 1997.
[3] Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, et al., "Caffe: Convolutional architecture for fast feature embedding," arXiv, 2014.
[4] X. Li and Z. Cui, "Deep residual network for plank classification," ICCV.
[5] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," ICLR.
[6] Y. Sun, X. Wang, and X. Tang, "Deeply learned face representations are sparse, selective, and robust," CoRR, 2014.
[7] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," NIPS, 2012.
[8] M. Nakada, H. Wang, and D. Terzopoulos, "AcFR: Active face recognition using convolutional neural network," 2017.