Slide 1
The Analysis of Faces in Brains and Machines
CS 332 Visual Processing in Computer and Biological Vision Systems

- The next few classes focus on the analysis of faces, an example of a problem where substantial progress has been made in computer systems – automatic face detection and recognition (the best systems built by Google and Facebook) – in psychology – understanding the human ability to recognize faces and how we represent information about faces in the process – and in neuroscience – how are face images represented and processed in the brain? A specialized network of patches of neural tissue is involved in the process of face recognition.
- We begin by asking: why is the analysis of faces important for humans? What aspects of intelligent behavior use information about faces? What kind of information can we infer from a face? Reflecting on your own experience, what are some answers to these questions? (image: Rafael Reif) Stay tuned...
Slide 2
Why is face analysis important?
Remember/recognize people we've seen before
Categorization – e.g. gender, race, age, kinship
Social communication – emotions/mood, intentions, trustworthiness, competence or intelligence, attractiveness
Scene understanding – e.g. direction of gaze suggests focus of attention

- Remembering and recognizing people we've seen before is important in our lives (family, friends, co-workers, acquaintances, famous people).
- We infer important information from faces, e.g. gender, race, age, kinship (family resemblance).
- Faces are an important component of social communication – facial expressions convey emotions or overall mood, and intentions; we read a lot in faces, e.g. about whether a person is trustworthy, competent, intelligent, or attractive.
- Face analysis contributes to our ability to understand what's going on in a scene: e.g. we determine the direction of gaze of all these people and infer that their attention is focused on this computer, and we feel we can almost read their thoughts from the expressions on their faces.
- Hopefully you're convinced that the analysis of faces is an important aspect of intelligence.
Slide 3
Why is face recognition hard?
(image labels: changing pose, changing illumination, aging, clutter, occlusion, changing expression)

- Why is it hard? A particular person can appear very different in an image. Some of the factors that lead to variability: different points of view or pose relative to the viewer; changing illumination (which can make you look more like someone else than like yourself – the illuminations here are fairly extreme, but see the Tony Blair images from the web); different accessories (glasses).
- Changing facial expression, age (from young & adventurous to old & grumpy), hair changes.
- Most of the time, people are not looking right toward you or the camera – they are going about their daily business, appearing with different poses, illuminations, and sizes (seen at a distance, very small, low resolution).
- Scenes are cluttered with lots of objects, and faces can be occluded (the bit of face next to Rafael Reif; the occluded security guard) – recognition in unconstrained environments adds to the challenge of detecting and recognizing faces.
Slide 4
How good are we at face recognition?
- So how good are we at face recognition? Let's begin with a demonstration (for people who have never seen this example before).
- Imagine you are given a stack of cards, each containing one of these face images, and are asked to sort the cards into piles corresponding to different identities – all images of the same person in one pile, different people in different piles. We can't mimic the actual experimental setup here, but take a moment to scan through informally and estimate: how many different people are depicted?
- 5-10? More than 10? Fewer than 5? The answer is two (one of them is the author Jenkins). A formal study used an array of images of two Dutch celebrities from the web. Subjects who were not familiar with the people created about 7.5 piles on average, ranging from 3 to 16 (none correct), but Dutch subjects familiar with the celebrities were ~100% correct.
- Key point: there is an important distinction between recognition of familiar vs. unfamiliar faces. If a face is familiar, we are good at recognizing it over a wide range of appearances that differ in illumination, expression, age, etc.
- If it is unfamiliar, we are not so good at making these generalizations.

Jenkins, White, Van Montfort & Burton, Cognition, 2011
Slide 5
Face recognition performance in humans
(plot label: chance performance)

- Psychologists assess face recognition ability in many ways; one method that has become standard is the Cambridge Face Memory Test of Duchaine & Nakayama, originally a clinical test to diagnose prosopagnosia, the inability to recognize faces (not even family members). Versions of the test are available online, e.g. at testmybrain.org, so you can test your own face recognition ability.
- The Cambridge Face Memory Test measures the ability to learn a new face. The images used in the study are cropped to remove hair, so judgments are based just on the face, with 3 views of each individual (frontal and turned +/- 30 deg). The experiment starts with a learning phase: you see 3 views of the same person, one after another for 3 sec each, then three test panels, one of which is identical to a face just learned, and you select it. This is repeated for 6 different faces; people generally do well in this phase (> 95%).
- You are then given a photo array of the 6 faces, all frontal, to study for 20 sec, followed by 30 trials of 3 faces each, all novel (changed in pose or lighting from the particular images seen previously); each trial contains one of the 6 target faces learned, and you select which. You then study the 6 faces again for 20 sec; the final 24 trials are like the second block but with added noise.
- A recent study by Wilmer & colleagues focused on individual differences in performance on a set of cognitive tasks including face recognition. In addition to the CFMT, they included a test of the ability to recognize famous people (designed in a way that got around the fact that you may recognize a face but not remember the name – not so critical here).
- The data come from many individuals (each dot is a different subject), plotting % correct on famous faces vs. learning new faces; chance is 33% (1 in 3). What leaps out as striking? The range of performance, from "I'm not good with faces" to super-recognizers.
- Keep this human variation in mind when evaluating the performance of a model vs. humans on a task.

testmybrain.org
Wilmer et al., 2012
Duchaine & Nakayama, 2006
Slide 8
Face recognition performance in humans
Which of the 10 photos on the bottom depicts the target face?
Viewers are ~ 70% correct
Performance degrades with changes in pose, expression
Only slight improvement with short video clip of target

- One more study worth mentioning reinforces the challenge of recognizing new faces. Given a display with a target face at the top and an array of 10 faces on the bottom, the target may or may not be present; the subject indicates whether it is present and, if so, which one (the trial may seem easy – which is it?). With different photos taken in the same pose, viewers were only 70% correct, under the best possible viewing conditions. With variations, e.g. a slightly different pose or expression, performance was worse, and 5-sec videos of the target gave only slight improvement. This reinforces the challenge of the task; in this context, computer vision systems can outperform human observers.

Importance of familiar vs. unfamiliar face recognition!
Bruce et al., 1999
Slide 9
How good are the best machines?
Public databases of face images serve as benchmarks:
Labeled Faces in the Wild (LFW): > 13,000 images of celebrities, 5,749 different identities
YouTube Faces Database (YTF): 3,425 videos, 1,595 different identities

Private face image datasets:
(Facebook) Social Face Classification dataset: 4.4 million face photos, 4,030 different identities
(Google) millions of face images, ~ 8 million different identities

                    LFW     YTF
Facebook DeepFace   97.4%   91.4%
Google FaceNet      99.6%   95.1%
Human performance   97.5%   89.7%

- We can also ask: how good are the best machines, and how do we measure this? Many publicly available databases of face images are used as benchmarks to compare face recognition performance across systems. Two worth mentioning: Labeled Faces in the Wild (> 13,000 images of celebrities from the web, 5,749 distinct individuals, some with more than one sample) and the YouTube Faces Database (> 3,000 videos, ~ 1,500 identities). Hundreds of papers compare the performance of different algorithms on databases like these. Two systems to highlight come from Facebook and Google (you may have seen them in the news).
- Both systems use deep networks (more on these next week). The deep networks are trained to recognize faces from a vast set of training data – collections of face images that only companies like Facebook and Google can amass (private, of course), labeled with identity. A typical task used to measure performance: given a pair of images, is it the same person or different? (See the sketch below.) Studies have evaluated human performance on the public databases (via Mechanical Turk crowdsourcing) and compared it to the Facebook and Google systems; the impression is that the machines perform similarly to or better than humans. Among the few errors by Google FaceNet are false accepts (the machine says a pair is the same person when it is not) and false rejects (the same person, but the machine says not).
- LFW is not really so "wild": there is limited variation in pose, illumination, and occluding accessories such as sunglasses, and humans are better in challenging conditions.
- And if the real aim is to understand how people learn to recognize faces and how the process takes place in the brain: human learning is not supervised learning with millions of training examples; we learn new faces to a large extent from relatively little experience.
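The same/different verification task used in these benchmarks is straightforward to score. Below is a minimal sketch – not any system's actual evaluation code – assuming we already have fixed-length face embeddings (hypothetical 128-D vectors here): a pair is declared "same person" when the distance between its embeddings falls below a threshold, and accuracy is the fraction of pairs judged correctly.

```python
import numpy as np

def verify_pairs(emb_a, emb_b, threshold):
    """Judge each pair "same person" when the Euclidean distance
    between its two embeddings is below the threshold."""
    dists = np.linalg.norm(emb_a - emb_b, axis=1)
    return dists < threshold      # True -> predicted same identity

def pair_accuracy(emb_a, emb_b, same_labels, threshold):
    """Fraction of pairs judged correctly.  Errors split into
    false accepts (predicted same, actually different people) and
    false rejects (same person, predicted different)."""
    pred = verify_pairs(emb_a, emb_b, threshold)
    return float(np.mean(pred == same_labels))

# Toy usage: random vectors standing in for real face embeddings.
rng = np.random.default_rng(0)
a = rng.normal(size=(10, 128))
b = rng.normal(size=(10, 128))
labels = rng.integers(0, 2, size=10).astype(bool)
print(pair_accuracy(a, b, labels, threshold=15.0))
```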
Slide 10
Machine vision applications of face recognition
security, forensics
access control
surveillance

- If we could build a machine vision system that is good at this task, as the human system appears to be, it would be a very lucrative business with many applications.
- E.g. applications related to security and forensics: we hear a lot about the use of face recognition systems in large public places like airports and government buildings.
- Verifying a person's identity to control access to a computer or secure facility, or to confirm identity for airport travel or bank machines.
- These are different problems. For a problem like access control, we try to establish a 1:1 match between the user being viewed and an individual in a database, and we can create favorable viewing conditions. Surveillance is a 1:N matching problem – one face image against many individuals in a large database – and we may also be viewing at a distance, covertly, under varying conditions.
Slide 11
More applications of face recognition
content-based image retrieval
social media
graphics, HCI
humanoid robots

- Whether we like it or not, face recognition capability is probably here to stay in social media. It is controversial – why? From the perspective of privacy and security.
- Google banned face recognition on Glass in June 2013 (developers were adding face recognition; Dubai police planned to use it) and stopped marketing Glass in January 2015 (it was not popular).
- We want to be able to retrieve images of interest by searching a database of images based on visual content, not just the usual textual tags used in the past.
- A better understanding of things like human facial expression (how it is generated and inferred) is having an impact on the creation of technologies like humanoid robots that can mimic human facial expression more naturally (the Japanese robot here was created by the Intelligent Robotics Lab at Osaka University and the Tokyo-based robotics company Kokoro Co.).
- It also leads to more naturalistic rendering of faces using computer graphics (shown: NVIDIA FaceWorks), important for the entertainment industry.
Slide 12
Aspects of face processing
Face detection – find image regions that contain faces
Face identification – who is the person?
Categorization – gender, age, race
Facial expression – mood, emotion
Non-verbal social perception and communication

- Let's examine computational strategies for processing faces.
- We noted earlier that there are many aspects of face analysis; we will focus on the first two. We start with a couple of examples of face recognition methods that were significant from a historical perspective, then examine in some detail one of the common methods that has been used for face detection in computer vision systems.
- The other aspects are things you might consider exploring in a final project.
Slide 13
It all began with Takeo Kanade (1973)…
PhD thesis, Picture Processing System by Computer Complex and Recognition of Human Faces
Special-purpose algorithms to locate eyes, nose, mouth, boundaries of face
~ 40 geometric features, e.g. ratios of distances and angles between features

- The first major work on face recognition in computer vision was the 1973 PhD thesis of Takeo Kanade (Kyoto University, now at CMU), back when I was graduating from high school. Takeo himself appears in the image of results from his later work on face detection.
- It was a complete system: take a digital image, process it to find places where the intensity changes (a significant difference in brightness between adjacent regions) – that is, find the edges in the image – then apply lots of special-purpose algorithms to locate particular features, e.g. certain points on the eyes, nose, mouth, the top of the head, the boundaries of the cheeks, etc.
- It computed ~ 40 geometric quantities, conveyed on the cartoon face on the right, e.g. ratios of distances between feature points and angles (e.g. of the jaw line). This vector of geometric measurements served as a signature used to identify the individual in the image (see the sketch below).
- Why relative measures rather than absolute? Variation in size.
- This was quite a technical feat at the time. Kanade built his own digitizer for photos, producing 140x200-pixel images (a JPEG from a digital camera today is thousands of pixels on a side) with 32 shades of gray (vs. 256 levels per channel in an RGB camera). With images of 20 different people, the system achieved 75% correct recognition.
- The general approach has been pursued in many face recognition systems in the years since: detailed metric descriptions of the parts of the face and their geometric arrangement.
- At Expo '70 in Osaka, Japan, there was a booth: a photo was taken and analyzed, and the parameters were compared to those of famous people (Monroe, JFK, Churchill) to find a resemblance.
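To make this style of representation concrete, here is a minimal sketch of a Kanade-style geometric signature. The landmark names and the particular ratios and angles are illustrative choices of mine, not Kanade's exact ~40 features.

```python
import numpy as np

def geometric_signature(pts):
    """pts: dict mapping landmark name -> np.array([x, y]).
    Returns a small vector of relative measurements.  Ratios of
    distances and angles are invariant to overall image scale,
    which is why relative rather than absolute measures are used."""
    def dist(a, b):
        return np.linalg.norm(pts[a] - pts[b])

    eye_sep  = dist('left_eye', 'right_eye')
    face_len = dist('nose_bridge', 'chin')
    mouth_w  = dist('mouth_left', 'mouth_right')

    # An angle feature at the chin, between the two jaw lines.
    v1 = pts['jaw_left'] - pts['chin']
    v2 = pts['jaw_right'] - pts['chin']
    jaw_angle = np.arccos(v1 @ v2 /
                          (np.linalg.norm(v1) * np.linalg.norm(v2)))

    return np.array([eye_sep / face_len, mouth_w / eye_sep, jaw_angle])
```

Recognition then amounts to comparing this feature vector against the stored signatures of known individuals.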
Slide 14
Eigenfaces for recognition (Turk & Pentland)
Principal Components Analysis (PCA)
Goal: reduce the dimensionality of the data while retaining as much information as possible from the original dataset
PCA allows us to compute a linear transformation that maps data from a high-dimensional space to a lower-dimensional subspace

- Continuing our tour: an approach to face recognition championed by Turk and Pentland in the early 90s, known as the eigenface approach, is based on the technique of principal components analysis.
- At some level we think of an image as a two-dimensional thing, but it is really a very high-dimensional entity if we treat each image pixel as a dimension that can take on a range of values corresponding to the brightness of the image at that location.
- Much of the information stored in individual pixels may be redundant, especially for nearby pixels. So for the purpose of representing the image for recognition, it would be nice if we could take the high-dimensional data and map it onto a lower-dimensional subspace that removes much of that redundancy. PCA is one technique that can do this.
- Without going into the details of the method, here is a very simple example to capture the basic intuition (sketched below). Imagine lots of images; we measure the brightness at two pixels across all the images and plot all the combinations of brightness. If the pixels are nearby, chances are they are highly correlated, and most of the variation in brightness is captured by where it lies along one dimension – the first principal component. We can then project all the data onto this one dimension, which may be adequate for whatever task we are using the data for.
- Why the term eigenfaces? The PCA method involves computing the eigenvectors and eigenvalues of a covariance matrix constructed from the data. The first principal component is associated with the eigenvector with the largest eigenvalue (the direction of the red line) and captures the direction of largest variance in the data. For higher-dimensional data, the second principal component captures the direction of the next largest variation, and so on.
- What is the analogy for images of faces?
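Here is that two-pixel intuition as a minimal numerical sketch, using synthetic data: brightness at two nearby pixels is simulated as highly correlated across 500 images, and the eigendecomposition of the 2x2 covariance matrix shows the first principal component capturing most of the variance.

```python
import numpy as np

# Synthetic data: two nearby pixels share most of their brightness.
rng = np.random.default_rng(1)
shared = rng.uniform(0, 255, size=500)        # common brightness
pix1 = shared + rng.normal(0, 5, size=500)    # pixel 1
pix2 = shared + rng.normal(0, 5, size=500)    # pixel 2 (nearby)
X = np.column_stack([pix1, pix2])             # 500 x 2 data matrix

Xc = X - X.mean(axis=0)                # center the data
cov = Xc.T @ Xc / (len(Xc) - 1)        # 2x2 covariance matrix
evals, evecs = np.linalg.eigh(cov)     # eigenvalues in ascending order

# Fraction of total variance along the first principal component
# (the direction of the "red line" in the slide's plot).
print("variance captured by 1st PC:", evals[-1] / evals.sum())

# Project every point onto that single dimension.
proj = Xc @ evecs[:, -1]
```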
Slide 15
Typical sample training set…
One or more images per person
Aligned & cropped to common pose, size
Simple background

- The process begins with lots of images of people's faces, one or more samples for each person in the database, which is large (> hundreds).
- For the technique to work well, the faces are normalized to a common pose, cropped to a common size, and set against a simple background.
- The Yale face database includes changes in lighting and expression, with and without glasses.
- Results shown are from C. DeCoro; see also the YouTube videos by Mahvish Nasir.
- The eigenvectors are mutually orthogonal, and the algorithm is global. Turk & Pentland gave an algorithm to compute the eigenvectors and eigenvalues efficiently: the original N x N image data would give an N^2 x N^2 covariance matrix (A·A^T), but we can initially compute the eigenvectors and eigenvalues of A^T·A, an M x M matrix where M is the number of image samples, and then map back to the N^2-dimensional space (see the sketch below).

Sample images from the Yale face database
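A minimal sketch of that shortcut, assuming the training images arrive as flattened rows of an M x N^2 array: if v is an eigenvector of the small M x M matrix A^T·A, then A·v is an eigenvector of the full covariance A·A^T with the same eigenvalue.

```python
import numpy as np

def eigenfaces(images, k):
    """images: M x N^2 array, one flattened face image per row.
    Returns the mean face and the top-k eigenfaces as columns."""
    mean = images.mean(axis=0)
    A = (images - mean).T                 # N^2 x M, columns = faces
    small = A.T @ A                       # M x M instead of N^2 x N^2
    evals, evecs = np.linalg.eigh(small)  # ascending eigenvalues

    # Map each small eigenvector v back to image space: A @ v is an
    # eigenvector of A @ A.T (the full covariance, up to scaling).
    E = A @ evecs[:, ::-1][:, :k]         # top-k, largest first
    E /= np.linalg.norm(E, axis=0)        # unit-norm columns
    return mean, E
```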
Slide 16
Eigenfaces for recognition (Turk & Pentland)
Ψ(x,y): average face (across all faces)
Perform PCA on a large set of training images to create a set of eigenfaces, Ei(x,y), that span the data set
The first components capture most of the variation across the data set; later components capture subtle variations

Each face image F(x,y) can be expressed as a weighted combination of the eigenfaces Ei(x,y):
F(x,y) = Ψ(x,y) + Σi wi*Ei(x,y)

- The analogs of the eigenvectors can be portrayed as images, which Turk & Pentland referred to as eigenfaces. They look creepy, but they capture the variations that exist across the database of face images.
- In the upper left corner is the average face: the brightness at each location is the average of the brightness at that same location across the entire stack of image samples (generated here from a database where each image had a dark background). Before PCA, the average is subtracted from each image in the training set.
- To its right is the first eigenface. Starting with the average face, we add in some amount of the first eigenface and capture a large part of the variation in face images, then add some amount of the next eigenface to capture more variation in the dataset; as we add in a bit of each one, we capture more of the variation.
- Denoting the eigenfaces as Ei(x,y), we express each face image in the data set as a weighted sum of the eigenfaces, combined with the average face, where the weights vary between individuals.
- We have some intuition about the contribution that some eigenfaces make: e.g. the first may capture the overall skin tone of the individual, others a beard, glasses, etc.
- Typically about 25 eigenfaces capture enough of the variation to get reasonable recognition performance.
- Once you have the eigenfaces, computing the weights for a new image is a direct calculation (see the sketch below), but how is this representation then used for recognition?
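Continuing the eigenfaces() sketch above (assuming E holds unit-norm eigenface columns and mean is Ψ), computing the weights is just a projection onto each eigenface, and the weighted sum reconstructs an approximation of the face:

```python
def face_weights(face, mean, E):
    """w_i = E_i . (F - Psi): project the mean-subtracted face
    onto each eigenface.  face is a flattened N^2 vector."""
    return E.T @ (face - mean)

def reconstruct(weights, mean, E):
    """F(x,y) ~= Psi(x,y) + sum_i w_i * E_i(x,y); approximate
    because only the top-k eigenfaces are kept."""
    return mean + E @ weights
```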
Slide 17
Representing individual faces
Each face image F(x,y) can be expressed as a weighted combination of the eigenfaces Ei(x,y):
F(x,y) = Ψ(x,y) + Σi wi*Ei(x,y)

Recognition process:
Compute weights wi for the novel face image
Find the image m in the face database with the most similar weights, e.g. by minimizing
e_m = Σi (wi - wi^m)^2

- The top repeats what was just said: given the original face image and the set of first k eigenfaces computed from the training set, each eigenface is associated with a separate weight; the weighted eigenfaces added together, along with the mean image, give a rough depiction of the face – not exact, since by preserving only a subset of the eigenfaces we lose some subtle variations.
- The weights are a signature for this person: there is a different set of weights for each individual in the database of faces we can recognize. We then search through the weights of the known individuals for the one that yields the closest match (sketched below).
- Different metrics can be used; the simplest is the Euclidean distance above, where i indexes the weights and m indexes the sample individuals in the database. It measures how different two sets of weights are, and we choose the identity m that minimizes the difference.
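A minimal sketch of this matching step, assuming stored_weights holds one weight vector per known individual. The optional threshold is an illustrative parameter for rejecting faces too far from every stored identity, in the spirit of Turk & Pentland's distinction between known and unknown faces.

```python
import numpy as np

def recognize(w_novel, stored_weights, threshold=None):
    """stored_weights: M x k array, row m = weight vector for
    person m.  Returns the index m minimizing the Euclidean
    distance e_m, or None if no stored identity is close enough."""
    dists = np.linalg.norm(stored_weights - w_novel, axis=1)
    m = int(np.argmin(dists))
    if threshold is not None and dists[m] > threshold:
        return None          # novel face matches no known identity
    return m
```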
Slide 18
Faces everywhere...