Deep Face Recognition Omkar M. Parkhi Andrea Vedaldi Andrew Zisserman

Presentation transcript:

Deep Face Recognition Omkar M. Parkhi, Andrea Vedaldi, Andrew Zisserman Visual Geometry Group, Department of Engineering Science, University of Oxford. Presented by Shih-Chuan Weng

Introduction Today, CNNs have taken the computer vision community by storm, significantly improving the state of the art in many applications. One of the most important ingredients for the success of such methods is the availability of large quantities of training data. However, large-scale public face datasets have been lacking and, largely because of this, most of the recent advances in the community remain restricted to Internet giants such as Facebook and Google.

This figure compares the size of the face datasets used by these two large companies with the other existing datasets.

Contribution This paper makes two contributions. First, it designs a procedure that can assemble a large-scale face dataset with little label noise. Second, it shows that a deep CNN, without any embellishments but with appropriate training, can achieve results comparable to the state of the art.

Dataset Collection Stage 1. Bootstrapping and filtering a list of candidate identity names. A list of candidate identities is obtained from IMDB (an initial list of 5,000 names, half male and half female; 200 images per person are downloaded using Google Image Search). The candidate list is then filtered to remove identities for which there are not enough distinct images, and to eliminate any overlap with standard benchmark datasets; 2,622 names remain. Stage 2. Collecting more images for each identity. Each name is queried in both Google and Bing Image Search, gathering about 2,000 images per person.
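The Stage 1 filtering can be pictured as a simple selection over the candidate names. The sketch below is a minimal illustration, not the authors' code: the helper names, the distinct-image threshold, and the benchmark name set are hypothetical placeholders.

```python
# Hypothetical sketch of Stage 1 filtering: keep candidate identities that have
# enough distinct downloaded images and that do not appear in benchmark datasets
# such as LFW or YTF. Thresholds and helper names are illustrative only.

def filter_candidates(candidate_images, benchmark_names, min_distinct=50):
    """candidate_images: dict mapping identity name -> list of image file paths."""
    kept = []
    for name, images in candidate_images.items():
        if name in benchmark_names:
            continue                      # avoid overlap with evaluation benchmarks
        if len(set(images)) < min_distinct:
            continue                      # not enough distinct images for this identity
        kept.append(name)
    return kept

# Toy usage:
# candidates = {"Actor A": ["a1.jpg", "a2.jpg"], "Actor B": ["b1.jpg"]}
# benchmark = {"Actor B"}
# final_list = filter_candidates(candidates, benchmark)   # ~2,622 names remain in the paper
```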

Stage 3. Improving purity with an automatic filter. This stage focuses on automatically removing erroneous faces from each identity's image set using a classifier. Reference: https://www.robots.ox.ac.uk/~vgg/publications/2013/Simonyan13/simonyan13.pdf
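As a rough picture of this kind of filtering, one can train a one-vs-rest linear classifier for each identity on precomputed face descriptors and keep only the highest-scoring images. The sketch below uses scikit-learn and placeholder features; it is an assumption-laden illustration, not the authors' implementation.

```python
# Hypothetical sketch of Stage 3: rank an identity's candidate images with a
# one-vs-rest linear SVM trained on face descriptors, then keep the top-scoring ones.
import numpy as np
from sklearn.svm import LinearSVC

def rank_and_keep(pos_feats, neg_feats, keep_top=1000):
    """pos_feats: descriptors of the candidate identity; neg_feats: other identities."""
    X = np.vstack([pos_feats, neg_feats])
    y = np.concatenate([np.ones(len(pos_feats)), np.zeros(len(neg_feats))])
    clf = LinearSVC(C=1.0).fit(X, y)
    scores = clf.decision_function(pos_feats)   # higher score = more likely a correct face
    order = np.argsort(-scores)
    return order[:keep_top]                     # indices of images to retain

# Toy usage with random 128-D descriptors standing in for real face features:
# pos = np.random.randn(2000, 128); neg = np.random.randn(5000, 128)
# kept_idx = rank_and_keep(pos, neg, keep_top=1000)
```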

Stage 4. Near-duplicate removal. Both exact duplicates and near duplicates (images differing only in colour balance, or with text superimposed) are removed. Reference: https://hal.inria.fr/inria-00633013/document/
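One simple way to picture near-duplicate removal is a greedy pass over compact image descriptors, dropping any image whose descriptor is almost identical to one already kept. This is only an assumed sketch; the descriptor choice and threshold are placeholders and this is not the exact procedure of the referenced work.

```python
# Hypothetical sketch of Stage 4: greedy near-duplicate removal over compact
# per-image descriptors (e.g. VLAD or CNN feature vectors).
import numpy as np

def remove_near_duplicates(descriptors, threshold=0.1):
    """descriptors: (N, D) array, assumed L2-normalized. Returns indices to keep."""
    kept = []
    for i, d in enumerate(descriptors):
        if all(np.linalg.norm(d - descriptors[j]) > threshold for j in kept):
            kept.append(i)                 # sufficiently different from everything kept so far
    return kept

# Toy usage:
# feats = np.random.randn(100, 64)
# feats /= np.linalg.norm(feats, axis=1, keepdims=True)
# unique_idx = remove_near_duplicates(feats, threshold=0.1)
```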

Dataset Collection Stage 5. Final manual filtering. The goal is to increase the purity (precision) of the data using human annotations. Although this stage is described as manual, an AlexNet-style CNN is first used to score the images, and roughly the top 375 scoring images per identity are retained. The picture on this slide shows the AlexNet architecture (reference: https://www.nvidia.cn/content/tesla/pdf/machine-learning/imagenet-classification-with-deep-convolutional-nn.pdf).
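The scoring step can be sketched as ranking each identity's images by the trained network's softmax confidence for that identity and keeping the best ones (with a human annotator then validating the ranked list). The snippet below is a minimal illustration under these assumptions, not the paper's exact pipeline.

```python
# Hypothetical sketch of Stage 5 ranking: order an identity's images by the CNN's
# softmax score for that identity and keep roughly the top 375.
import numpy as np

def rank_by_identity_score(softmax_scores, identity_index, keep_top=375):
    """softmax_scores: (N, C) softmax outputs for N images over C identities."""
    scores = softmax_scores[:, identity_index]   # confidence that the image shows this identity
    order = np.argsort(-scores)                  # best-scoring images first
    return order[:keep_top]

# Toy usage:
# probs = np.random.dirichlet(np.ones(2622), size=2000)   # fake softmax outputs
# keep_idx = rank_by_identity_score(probs, identity_index=7)
```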

Dataset statistics after each stage of processing. Type A and M specify whether the processing stage was carried out automatically or manually.

Learning a face classifier

Architecture and training Training is posed as an N = 2622-way classification problem. The architecture is VGGNet, which ends with a fully-connected classifier layer whose parameters are (W, b). The classification error is measured with the softmax log-loss. Reference: https://arxiv.org/pdf/1409.1556.pdf While this architecture can already be used for face verification by comparing images with the Euclidean distance between their descriptors, the paper adds a triplet loss on top to improve the scores.
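The classification stage boils down to a standard softmax cross-entropy setup. The following PyTorch-style sketch is only illustrative: the backbone stands in for the VGG-style network and the feature dimension is an assumption.

```python
# Minimal sketch of the N-way face classification stage (not the authors' code).
# A VGG-style backbone produces a D-dimensional descriptor phi(l); a final
# fully-connected layer (W, b) maps it to N = 2622 identity scores, trained
# with the softmax log-loss (cross-entropy).
import torch
import torch.nn as nn

N_IDENTITIES = 2622
FEATURE_DIM = 4096                       # assumed descriptor size of the backbone

class FaceClassifier(nn.Module):
    def __init__(self, backbone):
        super().__init__()
        self.backbone = backbone                                  # VGG-style conv + FC stack
        self.classifier = nn.Linear(FEATURE_DIM, N_IDENTITIES)    # the (W, b) layer

    def forward(self, images):
        phi = self.backbone(images)          # descriptors, shape (B, FEATURE_DIM)
        return self.classifier(phi)          # identity scores, shape (B, N_IDENTITIES)

# Training step with the softmax log-loss:
# logits = model(batch_images)
# loss = nn.CrossEntropyLoss()(logits, batch_identity_labels)
# loss.backward(); optimizer.step()
```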

Triplet loss The triplet loss is a loss function defined over sets of examples {anchor, positive, negative}: three images, where the anchor and the positive share the same identity and the negative has a different one. Its purpose is to pull images of the same identity closer together and push different identities further apart. In this paper it is used to refine the score vectors (x_t = W φ(l_t) + b, similar in form to a linear regression y = Wx + b) so that they perform well in the final verification application, giving higher accuracy than the classification architecture alone.
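A minimal sketch of the triplet loss itself is shown below, assuming L2-normalized embeddings and a margin hyper-parameter alpha (the value used here is a placeholder, not necessarily the paper's setting).

```python
# Hypothetical sketch of the triplet loss: for (anchor, positive, negative) embeddings,
# penalise triplets where the anchor is not closer to the positive than to the
# negative by at least a margin alpha.
import torch
import torch.nn.functional as F

def triplet_loss(anchor, positive, negative, alpha=0.2):
    """anchor/positive/negative: (B, L) embedding tensors, assumed L2-normalized."""
    d_pos = (anchor - positive).pow(2).sum(dim=1)   # squared distance to the same identity
    d_neg = (anchor - negative).pow(2).sum(dim=1)   # squared distance to a different identity
    return F.relu(d_pos - d_neg + alpha).mean()     # hinge: only violating triplets contribute
```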

Architecture and training The descriptors φ(l_t) are l2-normalized (think of the embeddings as lying on the unit sphere) and followed by an affine projection W', which is the only part that needs to be learned: it is trained to minimize the empirical triplet loss. Optimizing the formula on the left of the slide yields W'; with W' fixed, the formula on the right is then used to compute the new embedding vectors. Reference: https://arxiv.org/pdf/1503.03832.pdf
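Putting the pieces together, the embedding stage can be sketched as learning only a projection matrix W' on top of frozen, L2-normalized CNN descriptors. Again this is an illustrative PyTorch-style sketch with assumed dimensions, reusing the triplet_loss function defined above; it is not the paper's exact training procedure.

```python
# Hypothetical sketch of the embedding stage: CNN descriptors are frozen and
# L2-normalized; only the projection W' (a bias-free linear layer) is trained
# with the triplet loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

FEATURE_DIM, EMBED_DIM = 4096, 1024      # assumed descriptor / embedding sizes

projection = nn.Linear(FEATURE_DIM, EMBED_DIM, bias=False)   # the matrix W'
optimizer = torch.optim.SGD(projection.parameters(), lr=0.01)

def embed(phi):
    """phi: (B, FEATURE_DIM) frozen CNN descriptors -> (B, EMBED_DIM) embeddings."""
    return projection(F.normalize(phi, dim=1))    # x_t = W' * phi / ||phi||_2

# One training step on a batch of (anchor, positive, negative) descriptors:
# loss = triplet_loss(embed(phi_a), embed(phi_p), embed(phi_n), alpha=0.2)
# optimizer.zero_grad(); loss.backward(); optimizer.step()
```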

The results of the above experiments. The last configuration, with embedding learning, achieves the highest accuracy; the fifth row corresponds to the VGGNet network alone.

Results

LFW, unrestricted setting: the method achieves results comparable to the state of the art whilst requiring less data (than DeepFace and FaceNet) and using a simpler network architecture (than DeepID-2,3). Note that the DeepID3 results are for the test set with label errors corrected, which has not been done by any other method. The right side of the slide shows the ROC curves.

Thank You