1 Rotation Invariant Face Detection Using Neural Network Lecturers: Mehdi Dehghani - Mahdy Bashary Supervisor: Dr. Bagheri Shouraki Spring 2007.

Presentation transcript:

1 Rotation Invariant Face Detection Using Neural Network Lecturers: Mehdi Dehghani - Mahdy Bashary Supervisor: Dr. Bagheri Shouraki Spring 2007

2 Agenda
- What’s face detection?
- Usages
- Face Detection Techniques in Grayscale Images
- Template-Based Face Detection with Neural Network
  - Structure
  - Router Network
  - Detector Network
  - Arbitration Among Multiple Networks
  - Empirical Results

3 Face Detection Face detection is a computer technology that determines the locations and sizes of human faces in arbitrary (digital) images. It detects facial features and ignores anything else, such as buildings, trees and bodies.

4 Usages Biometrics: often as part of a face recognition system. Security surveillance (e.g., logging people who pass through an area by saving their faces). Image database management (e.g., making the face pictures in a database uniform by aligning each face to the center of its image).

5 Face Detection Techniques in Grayscale Images Template-based face detection: these techniques encode facial images directly in terms of pixel intensities. These images can be characterized by probabilistic models of the set of face images or implicitly by neural networks or other mechanisms.

6 Face Detection Techniques in Grayscale Images (cont.) Feature-based face detection: This approach is based on extracting features and applying either manually or automatically generated rules to evaluate them (e.g., finding the positions of the eyes, mouth and nose, and checking whether the nose lies in the triangle formed by the eyes and mouth).

7 Template-Based Face Detection

8 Image Pyramid It is used to detect faces larger than the window size. It is built by repeatedly reducing the size of the input image by subsampling. The amount of reduction at each stage is determined by the detector network's invariance to scale.
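
The pyramid construction can be sketched in a few lines. This is only an illustration: the reduction factor of 1.2, the nearest-neighbor subsampling, and the function names are assumptions, since the slide does not specify them. The 20-pixel window size matches the detector input described later in the deck.

```python
def subsample(image, factor):
    """Shrink a grayscale image (a list of rows) by `factor`
    using nearest-neighbor sampling."""
    height, width = len(image), len(image[0])
    new_h, new_w = int(height / factor), int(width / factor)
    return [[image[int(r * factor)][int(c * factor)] for c in range(new_w)]
            for r in range(new_h)]


def image_pyramid(image, factor=1.2, window=20):
    """Return progressively smaller copies of `image`, stopping once a
    level can no longer contain a single window x window region.
    The factor of 1.2 is an assumed value, not taken from the slides."""
    levels = [image]
    while True:
        smaller = subsample(levels[-1], factor)
        if len(smaller) < window or len(smaller[0]) < window:
            break
        levels.append(smaller)
    return levels
```

A fixed 20×20 window scanned over a coarse level covers a proportionally larger region of the original image, which is how larger faces come into reach.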

9 Rotation Invariance Rotation invariance is the ability to detect faces that are rotated in-plane.

10 Rotation Invariance (cont.) The simplest approach would be to employ an upright face detector, repeatedly rotating the input image in small increments and applying the detector to each rotated image. However, this would be an extremely computationally expensive procedure.

11 Structure Image Pyramid → Router Network → Detector Network

12 Router Network First, the window is preprocessed using histogram equalization, and given to a router network. The rotation angle returned by the router is then used to rotate the window with the potential face to an upright position. Finally, the derotated window is preprocessed and passed to one or more detector networks.

13 Router Network (cont.) Derotator Compute Orientation

14 Output Angle Single Unit: The activation of a single output unit (usually either between 0 and 1 or between -1 and +1) is mapped linearly onto the range of rotation angles to determine the angle of rotation. 1-of-N Encoding: N units are used to represent the output, and each unit represents 360º/N. For example, if there were 180 units and unit 30 had the highest activation, this would indicate a rotation of 60º.
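
A minimal sketch of the 1-of-N decoding follows. The function name is hypothetical; the arithmetic is just the winning unit's index times 360/N.

```python
def angle_from_one_of_n(activations):
    """Return the angle (in degrees) represented by the most active unit,
    where unit i of N stands for an angle of i * 360 / N."""
    n = len(activations)
    winner = max(range(n), key=lambda i: activations[i])
    return winner * 360.0 / n
```

With 180 units and unit 30 most active, this reproduces the slide's example of a 60º rotation.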

15 Output Angle (cont.) Suppose each output unit contributes a vector from the center of a circle toward that unit's position, with a length equal to the unit's activation (drawn as pixel intensity). The direction of the average of these vectors is interpreted as the angle of the face.
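
The vector-average decoding can be sketched as follows, assuming that unit i points in direction i × 360/N degrees and that its activation is the vector length. The function name is hypothetical.

```python
import math


def angle_from_vector_average(activations):
    """Sum one vector per output unit (direction i * 360 / N, length equal
    to the activation) and return the direction of the sum in [0, 360)."""
    n = len(activations)
    x = sum(a * math.cos(2 * math.pi * i / n) for i, a in enumerate(activations))
    y = sum(a * math.sin(2 * math.pi * i / n) for i, a in enumerate(activations))
    return math.degrees(math.atan2(y, x)) % 360.0
```

The sum and the average point in the same direction, so the division by N is skipped. Unlike picking the single most active unit, this estimate is not limited to multiples of 360/N.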

16 Architecture The architecture for the router network consists of three layers: an input layer of 400 units, a hidden layer of 15 units, and an output layer of 36 units. Each layer is fully connected to the next. Each unit uses a hyperbolic tangent activation function, and the network is trained using the standard error backpropagation algorithm.
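
A forward pass through a network of this shape might look like the sketch below. The per-unit bias term and the initial weight range are assumptions not stated on the slide, and training is omitted.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Random weights for one fully connected layer; the last entry of
    each row is a bias (the bias is an assumption, not from the slide)."""
    return [[rng.uniform(-0.05, 0.05) for _ in range(n_in + 1)]
            for _ in range(n_out)]


def forward(layer, inputs):
    """Apply tanh(weights . inputs + bias) for every unit in the layer."""
    return [math.tanh(w[-1] + sum(wi * xi for wi, xi in zip(w, inputs)))
            for w in layer]


def router_forward(window_pixels, hidden, output):
    """400 inputs -> 15 hidden tanh units -> 36 output tanh units."""
    return forward(output, forward(hidden, window_pixels))
```

The 400 inputs correspond to a 20×20 pixel window flattened row by row.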

17 Generating training set The training examples are generated from a set of manually labelled example images containing 1048 faces. In each face, the eyes, tip of the nose, and the corners and center of the mouth are labelled. We first compute the average location for each of the labelled features over the entire training set. Then, each face is aligned with the average feature locations, by computing the rotation, translation, and scaling that minimizes the distances between the corresponding features. After iterating these steps a small number of times, the alignments converge.
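
The alignment of one face to the average feature locations can be sketched with complex arithmetic, treating each labelled feature (x, y) as the complex number x + yj. This closed-form least-squares fit is one standard way to obtain a rotation, scale and translation. It is offered as an illustration, not as the exact procedure behind the slides, and the function names are hypothetical.

```python
def fit_similarity(points, targets):
    """Least-squares fit of a * p + b to the targets, where the complex
    number `a` encodes rotation and scale and `b` encodes translation."""
    n = len(points)
    p_mean = sum(points) / n
    t_mean = sum(targets) / n
    num = sum((t - t_mean) * (p - p_mean).conjugate()
              for p, t in zip(points, targets))
    den = sum(abs(p - p_mean) ** 2 for p in points)
    a = num / den
    b = t_mean - a * p_mean
    return a, b


def align(points, targets):
    """Move a face's feature points as close as possible to the targets."""
    a, b = fit_similarity(points, targets)
    return [a * p + b for p in points]
```

Iterating, as the slide describes, would alternate between aligning every face to the current average and recomputing the average from the aligned faces.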

18 Generating training set (cont.) …

19 Generating training set (cont.) Example upright frontal face images aligned to one another.

20 Training Router Network To generate the training set, the faces are rotated to a random orientation.

21 Training Router Network (cont.) Value[i] = cos(θ − i × 10º), i = 0, …, 35
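
The target vector for one training face can be computed directly from this formula. The sketch below assumes θ is the face's rotation in degrees and that the 36 outputs are spaced 10º apart; the function name is hypothetical.

```python
import math


def router_targets(theta_degrees, n_units=36):
    """Target output for each unit i: cos(theta - i * 360 / n_units)."""
    step = 360.0 / n_units
    return [math.cos(math.radians(theta_degrees - i * step))
            for i in range(n_units)]
```

The unit whose direction is closest to θ gets a target near +1, and the unit on the opposite side of the circle gets a target near -1.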

22 Review Derotator Compute Orientation

23 Detector Network at a glance It takes a 20×20 pixel region of the image as input and generates an output ranging from -1 to 1, signifying the absence or presence of a face, respectively.

24 The Preprocessing Light Correction: This process equalizes lighting effects across different parts of the window, which compensates for a variety of lighting conditions. Histogram Equalization: Histogram equalization is performed on the window, which compensates for differences in camera input gains.
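
The histogram equalization step can be sketched as below, using the common cumulative-distribution formulation on integer gray levels. The light correction step is not covered here, because the slide does not say how it is computed. The function name and the 256-level default are assumptions.

```python
def equalize_histogram(window, levels=256):
    """Histogram-equalize a grayscale window given as a list of rows of
    integers in [0, levels)."""
    pixels = [p for row in window for p in row]
    counts = [0] * levels
    for p in pixels:
        counts[p] += 1
    # Cumulative distribution: number of pixels at or below each level.
    cdf, running = [], 0
    for c in counts:
        running += c
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    total = len(pixels)
    if total == cdf_min:
        # Every pixel has the same value, so there is nothing to spread out.
        return [row[:] for row in window]
    scale = (levels - 1) / (total - cdf_min)
    return [[round((cdf[p] - cdf_min) * scale) for p in row] for row in window]
```

After equalization the darkest pixel maps to 0 and the brightest to `levels - 1`, so windows captured with different gains become comparable.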

25 The Preprocessing

26 Detector Neural Network It uses a multi-layer perceptron. There are three types of hidden units: four which look at 10 × 10 pixel subregions, 16 which look at 5 × 5 pixel subregions, and six which look at overlapping 20 × 5 pixel horizontal stripes of pixels. In particular, the horizontal stripes allow the hidden units to detect such features as mouths or pairs of eyes, while the hidden units with square receptive fields might detect features such as individual eyes, the nose, or corners of the mouth.
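
The receptive fields of the hidden units can be enumerated as in the sketch below. The 3-pixel vertical step between stripes is an assumption chosen so that six overlapping 5-pixel-tall stripes exactly span the 20 rows; the slide only says that the stripes overlap.

```python
def receptive_fields(size=20):
    """Return (top, left, height, width) for each hidden unit's input region."""
    fields = []
    # Four 10x10 quadrants.
    for top in range(0, size, 10):
        for left in range(0, size, 10):
            fields.append((top, left, 10, 10))
    # Sixteen 5x5 tiles.
    for top in range(0, size, 5):
        for left in range(0, size, 5):
            fields.append((top, left, 5, 5))
    # Six overlapping horizontal stripes, 20 pixels wide and 5 pixels tall.
    # The vertical step of 3 pixels is an assumption (see the lead-in).
    for top in range(0, 16, 3):
        fields.append((top, 0, 5, size))
    return fields
```

This gives 26 regions in total, each feeding a hidden unit that sees only its own part of the window.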

27 Training Technique The network is trained using backpropagation with momentum. The detectors have two sets of training examples: images which are faces, and images which are not. Training a neural network for the face detection task is challenging because of the difficulty in characterizing prototypical “nonface” images.

28 Generating Face Images Face examples for the training set are generated from each original image by randomly rotating the image (about its center point) by up to 10º, scaling it between 90 percent and 110 percent, translating it by up to half a pixel, and mirroring it. The randomization gives the filter invariance to translations of less than a pixel, scalings of 20 percent, and rotations over a 20º range (±10º).
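
Drawing the random parameters for one generated example could look like the sketch below. Uniform sampling and a 50 percent mirroring chance are assumptions, since the slide gives only the ranges. Applying the transform to pixels is left out.

```python
import random


def sample_augmentation(rng):
    """Draw one set of random transform parameters within the slide's ranges."""
    return {
        "rotation_deg": rng.uniform(-10.0, 10.0),  # up to 10 degrees either way
        "scale": rng.uniform(0.9, 1.1),            # 90 to 110 percent
        "shift_x": rng.uniform(-0.5, 0.5),         # up to half a pixel
        "shift_y": rng.uniform(-0.5, 0.5),
        "mirror": rng.random() < 0.5,              # assumed 50 percent chance
    }
```

Passing a seeded `random.Random` instance keeps the generated training set reproducible.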

29 General non-face images Practically any image can serve as a nonface example because the space of nonface images is much larger than the space of face images. However, collecting a “representative” set of nonfaces is difficult.

30 A “bootstrap” training algorithm
1. Create an initial set of nonface images by generating 1000 random images.
2. Train the neural network to produce an output of +1.0 for the face examples and -1.0 for the nonface examples. In the first iteration, the network’s weights are initialized randomly. After the first iteration, the weights computed by training in the previous iteration are used as the starting point.
3. Run the system on an image of scenery which contains no faces. Collect subimages in which the network incorrectly identifies a face (an output activation > 0.0).
4. Select up to 250 of these subimages at random, and add them to the training set as negative examples. Go to step 2.
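
The control flow of this loop can be sketched as follows. The callables `train` and `false_positives` are hypothetical placeholders for the actual network training and scanning code, and running one round per scenery image is an assumption about when the loop stops.

```python
import random


def bootstrap_train(train, false_positives, faces, nonfaces, scenery_images,
                    max_new=250, rng=None):
    """Alternate between training and harvesting false detections.

    train(faces, negatives, weights) -> new weights; weights=None means
    "start from random weights". false_positives(weights, scenery) ->
    list of subimages wrongly classified as faces (output > 0.0).
    Both callables are hypothetical placeholders.
    """
    rng = rng or random.Random(0)
    negatives = list(nonfaces)                        # step 1: initial nonfaces
    weights = None
    for scenery in scenery_images:
        weights = train(faces, negatives, weights)    # step 2
        mistakes = false_positives(weights, scenery)  # step 3
        if len(mistakes) > max_new:                   # step 4: keep at most 250
            mistakes = rng.sample(mistakes, max_new)
        negatives.extend(mistakes)
    # Final pass so the last batch of negatives is also trained on.
    return train(faces, negatives, weights), negatives
```

Because each round adds only examples the current network gets wrong, the negative set concentrates on nonfaces that are actually confusable with faces.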

31 An Example

32 An Example of Result

33 Refinement The raw output from a single network will contain a number of false detections, so a strategy is needed to reduce the number of false positives. There are two ways to improve the reliability of the detector: cleaning up the outputs from an individual network, and arbitrating among multiple networks.

34 Clean-Up Heuristic Real faces are usually detected at several nearby positions or scales, while false detections often occur with less consistency. This observation leads to a heuristic which can eliminate false detections. If a particular location is correctly identified as a face, then all other detection locations which overlap it are likely to be errors and can therefore be eliminated. So we preserve the locations with the higher number of detections within a small neighborhood, and eliminate locations with fewer detections.
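
A simplified version of this heuristic is sketched below. It works at a single pyramid level on (x, y) window centers. The neighborhood radius, the minimum count, and the omission of the scale dimension are all assumptions made to keep the example short.

```python
def clean_up(detections, radius=2, min_count=2, window=20):
    """Keep well-supported detections and drop weaker overlapping ones.

    detections: list of (x, y) window centers at one pyramid level.
    """
    def support(d):
        # Number of detections (including d itself) within `radius` pixels.
        return sum(1 for e in detections
                   if abs(e[0] - d[0]) <= radius and abs(e[1] - d[1]) <= radius)

    # Visit locations from most to least supported.
    scored = sorted(((support(d), d) for d in set(detections)), reverse=True)
    kept = []
    for count, d in scored:
        if count < min_count:
            continue  # isolated detection: treated as a false positive
        # Keep d only if its window does not overlap an already kept window.
        if all(abs(d[0] - k[0]) >= window or abs(d[1] - k[1]) >= window
               for k in kept):
            kept.append(d)
    return kept
```

A tight cluster of detections collapses to one surviving location, while a lone detection elsewhere is discarded.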

35 Illustration For Heuristic

36 The Result

37 Arbitration Among Multiple Networks To reduce the number of false positives, we can apply multiple networks and arbitrate between their outputs to produce the final decision. Each network is trained using the same algorithm with the same set of face examples, but with different random initial weights, random initial nonface images, and permutations of the order of presentation of the scenery images. The detection and false positive rates of the individual networks will be quite close. However, because of different training conditions and because of self-selection of negative training examples, the networks will have different biases and will make different errors.

38 Arbitration Among Multiple Networks
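
The slide does not say which arbitration rule is used. As one simple possibility, assumed here purely for illustration, a detection can be accepted only when two networks agree on roughly the same location (a logical AND). The function name and the 2-pixel tolerance are hypothetical.

```python
def arbitrate_and(detections_a, detections_b, tolerance=2):
    """Keep detections from network A that network B confirms within
    `tolerance` pixels in both x and y."""
    return [a for a in detections_a
            if any(abs(a[0] - b[0]) <= tolerance and abs(a[1] - b[1]) <= tolerance
                   for b in detections_b)]
```

Since the networks make different errors, a false positive from one network is unlikely to be confirmed by the other, while true faces tend to be found by both.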

39

40 Analysis of the Networks Because the output of the router network is used to derotate the input for the detector, the angular accuracy of the router must be compatible with the angular invariance of the detector.
- To measure the accuracy of the router, we generated test example images based on the training images, with angles between -30º and 30º at 1º increments.
- We applied the detector to the same set of test images as the router, and measured the fraction of faces which were correctly classified as a function of the angle of the face.
Because 92% of the router's errors fall between -10º and 10º, and the detector network detects about 90 percent of faces rotated between -10º and 10º, the two networks are compatible.

41 Empirical Results Upright Test Set: There are a total of 130 images, with 511 faces (of which 469 are within 10º of upright). Rotated Test Set: There are 50 images containing 223 faces, of which 210 are at angles of more than 10º from upright.

42 Proposed System In the current system, the detector network is trained on scenery images fed directly to it. If the detector network is instead trained on scenery images that have first passed through the router network, the performance of the system increases.

43 Exhaustive Search of Orientations To demonstrate the effectiveness of the router for rotation invariant detection, we applied the two sets of detector networks described above without the router. The detectors were instead applied at 18 different orientations (in increments of 20º) for each image location.

44 Upright Detection Accuracy To check that the ability to detect rotated faces has not come at the expense of accuracy on upright faces, we apply the upright face detector to the test set images.

45 Comparison Our new system has a slightly lower detection rate on upright faces for two reasons.
- First, the detector networks cannot recover from all the errors made by the router network.
- Second, the detector networks which are trained with derotated negative examples are more conservative in signalling detections; this is because the derotation process makes the negative examples look more like faces, which makes the classification problem harder.

46

47

48 Movie Examples

49 References
1. H. A. Rowley, S. Baluja, and T. Kanade, “Neural Network-Based Face Detection,” IEEE Trans. PAMI, vol. 20, Jan.
2. H. A. Rowley, S. Baluja, and T. Kanade, “Rotation Invariant Neural Network-Based Face Detection,” Proc. IEEE Conf. Computer Vision and Pattern Recognition.
3. H. A. Rowley, “Neural Network Face Detection,” PhD Thesis, May.
4. S. Baluja, “Face detection with in-plane rotation: Early concepts and preliminary results,” JPRC, Justsystem Pittsburgh Research Center, 1997.

50 Any Questions?