Face Recognition in Video. Int. Conf. on Audio- and Video-Based Biometric Person Authentication (AVBPA '03), Guildford, UK, June 9-11, 2003. Dr. Dmitry Gorodnichy.


1 Face Recognition in Video
Int. Conf. on Audio- and Video-Based Biometric Person Authentication (AVBPA '03), Guildford, UK, June 9-11, 2003
Dr. Dmitry Gorodnichy, Computational Video Group, Institute for Information Technology, National Research Council Canada / Conseil national de recherches Canada

2 What makes FR in video special?
Constraints:
- Real-time processing is required.
- Low resolution: 160x120 images, or MPEG-decoded.
- Low quality: weak exposure, blurriness, cheap lenses.
Importance:
- Video is becoming ubiquitous. Cameras are everywhere.
- For security, human-computer interaction, video-conferencing, entertainment...
Essence:
- It is inherently dynamic!
- It has parallels with biological vision!
NB: Living organisms also process very poor images*, yet they are very successful in tracking, detection and recognition.
* except for a very small area (the fovea)

3 Lessons from biological vision
- Images are of very low resolution except at the fixation point. The eyes look at points which attract visual attention.
- Saliency lies in: a) motion, b) colour, c) disparity, d) intensity. These channels are processed independently in the brain. (Think of a frog catching a fly, or a bull running at a torero.) Intensity means: frequencies, orientation, gradient.
- The brain processes sequences of images rather than single images. Bad image quality is compensated by the abundance of images.
- Animals and humans perceive colour non-linearly.
- Colour and motion are used for segmentation. Intensity is used for recognition.
- Bottom-up (image-driven) visual attention is very fast and precedes top-down (goal-driven) attention: 25 ms vs. 1 sec.
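The independent-channel idea above can be sketched in code. This is a minimal illustration, not the original system: motion, colour and intensity saliency maps are computed separately from a pair of frames, mirroring the claim that the brain processes these channels independently. The crude colour-opponency formula and the per-channel normalization are simplifying assumptions.

```python
import numpy as np

def saliency_channels(prev_frame, frame):
    """Compute independent saliency channels from two RGB frames
    (uint8 arrays of shape HxWx3): motion, colour and intensity
    are handled separately, as in biological vision."""
    f = frame.astype(np.float32)
    p = prev_frame.astype(np.float32)
    intensity = f.mean(axis=2)            # luminance channel
    motion = np.abs(f - p).mean(axis=2)   # frame-to-frame change
    # crude colour-opponency: red-green and blue-yellow differences
    rg = np.abs(f[..., 0] - f[..., 1])
    by = np.abs(f[..., 2] - 0.5 * (f[..., 0] + f[..., 1]))
    colour = rg + by

    def norm(c):
        # normalize each channel to [0, 1] independently
        rng = c.max() - c.min()
        return (c - c.min()) / rng if rng > 0 else np.zeros_like(c)

    return {"motion": norm(motion), "colour": norm(colour),
            "intensity": norm(intensity)}
```

A fused saliency map (e.g. a weighted sum of the three) could then drive the bottom-up attention mentioned in the last bullet.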

4 Localization first, then recognition
Try to recognize the face at right. What about the next one? What did you do?
- First you detected face-looking regions.
- Then, if they were too small or badly oriented, you did nothing. Otherwise, you turned your face, right?
- ...to align your eyes with the eyes in the picture.
- ...since this was the coordinate system in which you stored the face.
This is what biological vision does.
- Localization (and tracking) of the object precedes its recognition.
- These tasks are performed by two different parts of the visual cortex.
So why should computer vision not do the same?

5 These mesmerizing eyes
Did you notice that you started examining this slide by looking at the eyes (or circles) at left?
- Such pictures are sold commercially to capture infants' attention.
Now imagine that the eyes blinked...
- For sure you would be looking at them!
No wonder animals and humans look at each other's eyes.
- This is apart from psychological reasons.
Eyes are the most salient features on a face. Besides, there are two of them, which creates a hypnotic effect (due to the fact that the saliency of a pixel just attended is inhibited, to avoid attending it again soon). Finally, they are also the best (and the only) stable landmarks on a face which can be used as a reference. The intra-ocular distance (IOD) makes a very convenient unit of measurement!

6 Which part of the face is the most informative? What is the minimal size of a recognizable face?
1. By studying previous work: [CMU, MIT, UIUC, Fraunhofer, MERL, ...]
2. By examining averaged faces: 9x9, 12x12, 16x16, 24x24
3. By computing the statistical relationship between face pixels in 1500 faces from the BioID Face Database.
Using the RGB colours, each point in this 576x576 array shows how frequently two pixels of the 24x24 face are darker than one another, brighter than one another, or the same (within a certain tolerance). The presence of high-contrast RGB colours in the image indicates a strong relationship between the face pixels. The strongest such relationship is observed for 24x24 images centered on the eyes, as shown on the next slide.
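The pixel-relationship statistic described in item 3 can be approximated as follows. This is a hypothetical reconstruction, not the slide's exact computation: over a stack of faces, the function counts how often each pixel is darker than, brighter than, or equal to (within a tolerance) each other pixel, producing the P x P frequency arrays the slide describes (P = 576 for 24x24 faces).

```python
import numpy as np

def pairwise_pixel_stats(faces, tol=8):
    """For a stack of NxN grayscale faces (shape: count x N x N),
    return three PxP maps (P = N*N) giving the fraction of faces in
    which pixel i is darker than, brighter than, or within tol of
    pixel j."""
    flat = faces.reshape(len(faces), -1).astype(np.int16)  # count x P
    diff = flat[:, :, None] - flat[:, None, :]             # count x P x P
    darker = (diff < -tol).mean(axis=0)
    brighter = (diff > tol).mean(axis=0)
    same = 1.0 - darker - brighter
    return darker, brighter, same
```

For 24x24 faces the intermediate tensor is count x 576 x 576, which is fine for illustration but would be computed incrementally in a real pipeline.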

7 Anthropometrics of the face
Surprised by the binary nature of our faces? But it's true: tested with 1500 faces from the BioID face database and multiple experiments with perceptual user interfaces [Nouse'02, BlinkDet'03].
Do you also see that colour is not important for recognition?
- While for detection, it is.
(Figure: facial dimensions measured in units of IOD.)

8 Canonical eye-centered face model
Size 24x24 is sufficient for face memorization & recognition, and is optimal for low-quality video and for fast processing.
Canonical face model suitable for on-line Face Memorization and Recognition in video [Gorodnichy'03]. (Figure: face box dimensions given in units of IOD.)
Procedure: after the eyes are located, the face is extracted from the video and resized to the canonical 24x24 form, in which it is memorized or recognized.
Canonical face model suitable for Face Recognition in documents [Identix'02].
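The extraction procedure can be sketched as follows, assuming the eye coordinates are already known. The box width of 2 x IOD and the nearest-neighbour resize are illustrative choices, not the paper's exact canonical model.

```python
import numpy as np

def extract_canonical_face(image, left_eye, right_eye, size=24):
    """Given eye centres (x, y) in a grayscale image, crop an
    eye-centered square whose side is 2 * IOD (intra-ocular
    distance) and resize it to size x size."""
    (x1, y1), (x2, y2) = left_eye, right_eye
    iod = np.hypot(x2 - x1, y2 - y1)
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    half = iod  # the box spans 2 * IOD, centered between the eyes
    top, left = int(cy - half), int(cx - half)
    crop = image[max(top, 0):int(cy + half), max(left, 0):int(cx + half)]
    # nearest-neighbour resize without external libraries
    ys = np.linspace(0, crop.shape[0] - 1, size).astype(int)
    xs = np.linspace(0, crop.shape[1] - 1, size).astype(int)
    return crop[np.ix_(ys, xs)]
```

In the pipeline described on the slide, the returned 24x24 patch is what gets memorized or recognized.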

9 Face Processing Tasks
Hierarchy of face recognition tasks ("I look and see..."):
"Something yellow moves" - Face Segmentation (FS)
"It's a face" - Face Detection (FD)
"Let's follow it!" - Face Tracking (FT, crude)
"It's at (x, y, z, θ)" - Face Localization (FL, precise)
"S/he smiles, blinks" - Facial Event Recognition (FER)
"It's a face of a child" - Face Classification (FC)
"Face unknown. Store it!" - Face Memorization (FM)
"It's Mila!" - Face Identification (FI)
Applicability of 160x120 video to the tasks, according to face anthropometrics:
Face size            ½ image   ¼ image   1/8 image   1/16 image
In pixels            80x80     40x40     20x20       10x10
Between eyes (IOD)
Eye size
Nose size
FS                   +         +         +           b
FD                   +         +         b           -
FT                   +         +         b           -
FL                   +         b         -           -
FER                  +         +         b           -
FC                   +         +         b           -
FM / FI              +         +         -           -
(+ good, b barely applicable, - not good; tested with Perceptual User Interfaces)
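The pixel figures in the table follow from simple arithmetic on the 160-pixel-wide frame. The IOD-to-face-width ratio of one half used below is an approximate anthropometric assumption, not a value stated on the slide.

```python
def face_metrics(frame_width=160, fraction=8, iod_ratio=0.5):
    """For a face occupying 1/fraction of the frame width, return its
    approximate pixel size and intra-ocular distance (IOD) in pixels."""
    face_px = frame_width // fraction
    iod_px = int(face_px * iod_ratio)
    return face_px, iod_px

for frac in (2, 4, 8, 16):
    face_px, iod_px = face_metrics(fraction=frac)
    print(f"face 1/{frac} of image: {face_px}x{face_px} px, IOD ~ {iod_px} px")
```

This reproduces the 80x80 / 40x40 / 20x20 / 10x10 row of the table and explains why memorization and identification, which need the 24x24 canonical face, stop working below roughly 1/8-image faces.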

10 Perceptual Vision Interfaces
Goal: to detect, track and recognize the face and facial movements of the user.
(Diagram: multi-channel video processing framework)
- colour calibration
- face detection
- face tracking (crude)
- nose tracking (precise): x, y (z, θ) coordinates
- blink detection: binary ON/OFF "click" event
- face classification
- face memorization / face identification: recognition ("Unknown User!")
- PUI monitor

11 Recent Advances in PUI
1. Nouse™ ("Use Nose as Mouse") Face Tracking
- based on tracking the rotation-invariant convex-shape nose feature [FGR'02]
- head-motion- and scale-invariant, with sub-pixel precision
"Nouse™ brings users with disabilities and video game fans one step closer to a more natural way of interacting hands-free with computers" - Silicon Valley North magazine, Jan 2002
"It is a convincing demonstration of the potential uses of cameras as natural interfaces." - The Industrial Physicist, Feb
2. Eye blink detection in moving heads
- based on computing second-order change [Gorodnichy'03] and non-linear change detection [Durucan'01]
- is currently used to enable people with brain injury [AAATE'03]
1 & 2: After each blink, the eye and nose positions are retrieved. If they form an equilateral triangle (i.e. the face is parallel to the image plane), then the face is extracted and recognized / memorized.
Figure 1. This logo of the Nouse™ Technology website is written by nose.
Figure 2. A camera tracks the point of each player's nose closest to the camera and links it to the red "bat" at the top (or bottom) of the table to return the computer ball across the "net." (The Industrial Physicist)
Figure 3. Commonly used first-order change (left image) has many pixels due to head motion (shown in the middle). Second-order change (right image) detects only the local change (a change in the change), making it possible to detect eye blinks in moving heads, which was previously not possible. (Frames t-2, t-1, t)
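The second-order change idea in Figure 3 can be sketched as follows. This is a simplified illustration (the cited work uses a non-linear change measure): a first-order change map is computed for each consecutive frame pair, and only pixels that change in the current pair but did not change in the previous one are kept, suppressing the uniform change caused by head motion.

```python
import numpy as np

def second_order_change(f0, f1, f2, thresh=20):
    """Detect local change (e.g. an eye blink) across three consecutive
    grayscale frames f0, f1, f2 (times t-2, t-1, t). Pixels that were
    already changing between f0 and f1 (head motion) are suppressed."""
    c1 = np.abs(f1.astype(np.int16) - f0.astype(np.int16)) > thresh
    c2 = np.abs(f2.astype(np.int16) - f1.astype(np.int16)) > thresh
    # second-order change: newly changed pixels only
    return c2 & ~c1
```

A blink on a still head appears only in the second change map and survives; a steadily moving head produces overlapping first- and second-order change and is largely cancelled.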

12 Recognition with Associative Memory
We use Pseudo-Inverse Associative Memory for on-line memorization and storing of faces in video. The advantages of this memory over others, as well as the C++ code, are available from our website.
Main features:
- It stores binary patterns as attractors.
- The associativity is achieved by converging from any state to an attractor.
- Faces are made attractors by using the Pseudo-Inverse learning rule: C = V V^+
- Saturation of the network is avoided by using the desaturation technique [Gorodnichy'95]: C_ii = D * C_ii (0 < D < 1)
Converting a 24x24 face to a binary feature vector:
A) V_i = I_i - I_ave
B) V_ij = sign(I_i - I_j)
C) V_ij = Viola(i, j, k, l)
D) V_ij = Haar(i, j, k, l)
PINN website:
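The learning rule and desaturation above can be sketched in NumPy. This is a sketch, not the authors' C++ library: bipolar (+/-1) patterns are stored via the projection matrix C = V V^+, the diagonal is scaled by D, and recall iterates sign(C s) until it settles on an attractor.

```python
import numpy as np

def train_pinn(patterns, desaturation=0.9):
    """Pseudo-inverse neural network: store bipolar (+/-1) patterns as
    attractors via C = V V^+, then apply the desaturation trick
    C_ii <- D * C_ii with 0 < D < 1."""
    V = np.array(patterns, dtype=float).T   # columns are stored patterns
    C = V @ np.linalg.pinv(V)               # projection onto pattern subspace
    np.fill_diagonal(C, desaturation * np.diag(C))
    return C

def recall(C, state, max_iters=50):
    """Iterate the network from any state until it converges to an attractor."""
    s = np.sign(np.asarray(state, dtype=float))
    for _ in range(max_iters):
        nxt = np.sign(C @ s)
        nxt[nxt == 0] = 1
        if np.array_equal(nxt, s):
            break
        s = nxt
    return s
```

Stored patterns are exact fixed points of the projection rule, and a face vector corrupted by a few flipped bits converges back to the stored attractor, which is what makes the memory usable for recognition from noisy video frames.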

13 Summary & Demos
The face is detected:
- using motion at far range (non-linear change detection),
- using colour at close range (non-linear colour mapping to a perceptually uniform space),
then tracked until convenient for recognition, using blink detection and nose tracking, then localized and transformed to the canonical 24x24 representation, then recognized using the PINN associative memory trained on pixel differences.
In experiments: with 63 faces from the BioID database and 9 faces of our lab users (all of which are shown) stored, the system has no problem recognizing our users after a single blink (or several). In many cases, as a user involuntarily blinks, s/he is not even aware that his/her face has been memorized / recognized.
E.g. images retrieved from a blink (at left) are recognized as the right image.
More at