“Bayesian Identity Clustering”

Presentation transcript:

“Bayesian Identity Clustering”
Simon J.D. Prince, Department of Computer Science, University College London
James Elder, Centre for Vision Research, York University
http://pvl.cs.ucl.ac.uk   s.prince@cs.ucl.ac.uk

The problem: How many different people are present, and which face corresponds to which person? It could be 80 pictures of the same person, or 1 picture each of 80 different people. A difficult task: it compounds face recognition with a choice of model size.

The problem: Faces don't necessarily all have the same pose.

Face Clustering Application 1: Photo content labelling. The system analyzes a set of photos and provides one picture of each individual present, which you label. The labels are then propagated to your entire set of photos.

Face Clustering Application 2: Security synopses. Here is a collection of face images captured by a pan/tilt security camera. A crime is committed in the area. How many people were present in the last few hours, and when did each enter and leave?

Face Clustering Application 3: Web search. Here are images from Google image search for the query “john smith”. Identify which pictures belong together and present only one image of each person as options, rather than hundreds of pages with repeated images.

Latent variables for identity
(i) Face images depend on several interacting factors: these include the person's identity (the signal) and the pose and illumination (nuisance variables).
(ii) Image generation is noisy: even in matching conditions, images of the same person differ. This remaining variation comprises unmodeled factors and sensor noise.
(iii) Identity can never be known exactly: since generation is noisy, there will always be uncertainty in any estimate of identity, regardless of how we form this estimate.
(iv) Recognition tasks do not require identity estimates: in face recognition, we can ask whether two faces have the same identity, regardless of what that identity is.

Latent Identity Variables
GOAL: to build a generative model of the whole face manifold. We hypothesize an underlying representation of identity (h) and a generation process f(.) that creates the image (a generative model). A within-individual noise process e explains why two images of the same person are not identical.
[Figure: each point in identity space represents the abstract identity of a person; identity space explains the data in feature space via a deterministic transformation plus additive noise (h1 explains x1, h2 explains x2 and xp, h3 explains x3).]

Latent Identity Variables (LIVs)
Hypothesize the existence of an underlying representation of identity:
- If two identity variables take the same value, they represent the same person.
- If two identity variables take different values, they represent different people.
[Figure: as before – identity space explains the data in feature space via a deterministic transformation plus additive noise.]

Latent Identity Variables (LIVs)
Hypothesize the existence of an underlying representation of identity:
- If two points in identity space are the same, they represent the same person.
- If two points in identity space differ, they represent different people.
Generation: x1 = f(h1) + e, a deterministic transformation of the identity variable plus additive noise.
[Figure: as before, with the noise vector e shown offsetting x1 from f(h1).]

Latent Identity Variables (LIVs)
Hypothesize the existence of an underlying representation of identity:
- If two points in identity space are the same, they represent the same person.
- If two points in identity space differ, they represent different people.
Two images of the same person share one identity variable: x2 = f(h2) + ei and xp = f(h2) + ej differ only in their noise terms.
[Figure: as before, with h2 explaining both x2 and xp, each offset by its own noise vector.]

Latent Identity Variables (LIVs)
So an LIV model partitions the data into signal and noise. In recognition we:
- Compare the probability of the data under different assignments between the data (left figure) and the identity variables (right figure).
- Acknowledge that the values of the identity variables are fundamentally uncertain, so consider all possibilities (i.e. marginalize over them) – never estimate identity!
[Figure: as before – identity space explains the data in feature space via a deterministic transformation plus additive noise.]
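To make the marginalization concrete for the simplest case of two images x_1 and x_2 (generic latent-variable notation, not copied from the slides), the two hypotheses are each scored by a marginal likelihood in which the identity variables have been integrated out:

```latex
% Same person: both images are explained by one shared identity variable h
p(x_1, x_2 \mid \text{same}) = \int p(x_1 \mid h)\, p(x_2 \mid h)\, p(h)\, \mathrm{d}h

% Different people: each image has its own identity variable
p(x_1, x_2 \mid \text{diff}) = \int p(x_1 \mid h_1)\, p(h_1)\, \mathrm{d}h_1 \int p(x_2 \mid h_2)\, p(h_2)\, \mathrm{d}h_2
```

Clustering generalizes this pairwise comparison to every possible assignment of the N images to identity variables; the assignment with the highest marginal likelihood wins, without any single identity ever being estimated.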

Probabilistic Linear Discriminant Analysis (Prince & Elder 2007)
x_ij = m + F h_i + G w_ij + e_ij
where x_ij is the observed data from the j-th image of the i-th individual, m is the overall mean, F h_i is a weighted sum of basis functions F for between-individual variation (identity), G w_ij is a weighted sum of basis functions G for within-individual variation, and e_ij is noise.
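A minimal sketch of this generative process (the dimensions, variable names and noise level below are illustrative choices, not the values used in the talk):

```python
import numpy as np

rng = np.random.default_rng(0)

D, DF, DG = 100, 32, 16          # feature dim, identity dim, within-individual dim (illustrative)
m = rng.normal(size=D)           # overall mean
F = rng.normal(size=(D, DF))     # between-individual basis (identity)
G = rng.normal(size=(D, DG))     # within-individual basis
sigma = 0.1                      # residual noise standard deviation

def sample_person(n_images):
    """Generate n_images of one synthetic person: the identity h is shared,
    the within-individual factor w and the noise e are drawn per image."""
    h = rng.normal(size=DF)                      # latent identity variable
    images = []
    for _ in range(n_images):
        w = rng.normal(size=DG)                  # per-image within-individual factor
        e = sigma * rng.normal(size=D)           # residual noise
        images.append(m + F @ h + G @ w + e)     # x_ij = m + F h_i + G w_ij + e_ij
    return np.stack(images)

X = sample_person(4)   # four images of the same synthetic identity
```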

Probabilistic LDA
Or, written out in terms of the individual basis images (the columns of F and G):
x_j = m + h1 F(:,1) + h2 F(:,2) + h3 F(:,3) + w1j G(:,1) + w2j G(:,2) + w3j G(:,3) + noise
(image x_j, mean m, noise with covariance S)
Each image is the mean face, plus a weighted sum of between-individual basis images F(:,k) whose weights h are shared by all images of that person, plus a weighted sum of within-individual basis images G(:,k) whose weights w_kj are specific to the image, plus noise.
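Because all of the factors are Gaussian, the marginal likelihoods needed for model comparison have closed forms: stacking the images in a hypothesis gives a joint Gaussian whose cross-covariance ties together images that share an identity through F Fᵀ. The sketch below illustrates this for a pair of images; it is a standard consequence of the Gaussian model rather than the authors' code, and the function names are mine:

```python
import numpy as np
from scipy.stats import multivariate_normal

def log_lik_pair(x1, x2, m, F, G, Sigma, same_identity):
    """Log p(x1, x2) under PLDA when the two images do / do not share an identity."""
    within = G @ G.T + Sigma                  # within-individual + residual covariance
    between = F @ F.T                         # between-individual (identity) covariance
    cross = between if same_identity else np.zeros_like(between)
    mean = np.concatenate([m, m])
    cov = np.block([[between + within, cross],
                    [cross,            between + within]])
    x = np.concatenate([x1, x2])
    return multivariate_normal(mean=mean, cov=cov, allow_singular=True).logpdf(x)

def same_person_log_odds(x1, x2, m, F, G, Sigma):
    """Positive values favour 'same person', negative favour 'different people'."""
    return (log_lik_pair(x1, x2, m, F, G, Sigma, same_identity=True)
            - log_lik_pair(x1, x2, m, F, G, Sigma, same_identity=False))
```

The same construction extends to any grouping of N images: each candidate partition defines one big Gaussian, and the partitions are compared by their log marginal likelihoods.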

Within-individual and between-individual variation
[Figure: faces synthesized at m + 2F(:,1), m + 2F(:,2) and m + 2F(:,3) illustrate between-individual variation; faces at m + 2G(:,1), m + 2G(:,2) and m + 2G(:,3) illustrate within-individual variation.]

Model Comparison
Frame clustering as model comparison.
[Figure: graphical models for alternative partitions – e.g. each image x explained by its own identity variable (h1, h2, h3) versus several images sharing a single identity variable h1.]

Data Extraction
- XM2VTS database: trained with 8 images each of 195 individuals; tested with the remaining 100 individuals.
- Find facial feature points.
- Extract a normalized vector of pooled gradients around each feature.
- Build a PLDA model for each feature.
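The slides do not specify the exact descriptor, but a pooled-gradient feature of this general kind could look like the following sketch (patch size, grid, and normalization here are my assumptions, not the authors' settings):

```python
import numpy as np

def pooled_gradient_feature(image, cx, cy, patch=24, cells=4):
    """Pool image-gradient magnitudes in a cells x cells grid around the feature
    point (cx, cy), then L2-normalize. Assumes the point is not too close to the
    image border; patch and grid sizes are illustrative choices."""
    half = patch // 2
    win = image[cy - half:cy + half, cx - half:cx + half].astype(float)
    gy, gx = np.gradient(win)                 # vertical and horizontal gradients
    mag = np.hypot(gx, gy)                    # gradient magnitude
    cell = patch // cells
    pooled = mag.reshape(cells, cell, cells, cell).sum(axis=(1, 3)).ravel()
    return pooled / (np.linalg.norm(pooled) + 1e-8)
```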

Model Explosion
The number of possible partitions (models) grows extremely fast with the number of images:

IMAGES:  2   3   4   5    6    7     8
MODELS:  2   5  15  52  203  877  4140

For small N, we can compare all hypotheses.
PROBLEM 1: For larger N the number of models is too large (~10^115 for 100 images) and we must resort to approximate methods. An agglomerative approximation is used – start with N separate groups and repeatedly merge the best two until the likelihood stops increasing.
PROBLEM 2: We almost never get the answer exactly right when N = 100, so % correct is not a good metric. How should we measure clustering performance?
[Figure: % correct vs. number of images for the PLDA model and chance.]
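The counts above are the Bell numbers, i.e. the number of ways to partition N items into groups. A quick check via the Bell triangle (plain Python, not part of the original material):

```python
def bell_numbers(n_max):
    """Return [B(0), B(1), ..., B(n_max)], where B(n) is the number of ways
    to partition n items (here: the number of candidate clusterings of n images)."""
    bells = []
    row = [1]
    for _ in range(n_max + 1):
        bells.append(row[0])            # first entry of each triangle row is B(n)
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
    return bells

B = bell_numbers(100)
print(B[2:9])            # [2, 5, 15, 52, 203, 877, 4140] -> matches the table above
print(len(str(B[100])))  # 116 digits, i.e. roughly 10^115 partitions of 100 images
```

The agglomerative approximation sidesteps this explosion by only ever scoring the partitions reachable by greedily merging the current pair of clusters that most increases the marginal likelihood.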

Cluster Alignment

Clustering Metrics
- Rand Metric
- Variation of Information
- Precision and Recall
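As one concrete example of these metrics, Variation of Information compares two clusterings through their entropies and mutual information, VI = H(A) + H(B) - 2 I(A;B); zero means the clusterings are identical. The implementation below is a generic sketch of that standard definition, not code from the talk:

```python
from collections import Counter
from math import log

def variation_of_information(labels_a, labels_b):
    """VI between two clusterings given as equal-length label lists."""
    n = len(labels_a)
    pa = Counter(labels_a)                    # cluster sizes in clustering A
    pb = Counter(labels_b)                    # cluster sizes in clustering B
    pab = Counter(zip(labels_a, labels_b))    # joint counts
    h_a = -sum((c / n) * log(c / n) for c in pa.values())
    h_b = -sum((c / n) * log(c / n) for c in pb.values())
    mi = sum((c / n) * log((c / n) / ((pa[a] / n) * (pb[b] / n)))
             for (a, b), c in pab.items())
    return h_a + h_b - 2 * mi

# e.g. ground-truth identities vs. estimated clusters for six face images
print(variation_of_information([0, 0, 1, 1, 2, 2], [0, 0, 1, 2, 2, 2]))
```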

Which clusters are the best? Surely M1=M2<M3=M4?

Which clusters are the best? We propose using the number of splits and merges required to match the ground-truth clustering.

Quantitative Results: XM2VTS Frontal Data. Always 4 images per person; 5, 10, 15, 20 or 25 people; 100 repeats for each data point with different people and different images.

Qualitative Results: Split-merge cost here = 0.97 (average was 0.98).

Cross-pose case: What if there is more than one pose?

Tied PLDA
x_ijk = m_k + F_k h_i + G_k w_ijk + e_ijk
where x_ijk is the observed data from the j-th image of the i-th individual in the k-th pose, m_k is the overall mean for pose k, F_k h_i is a weighted sum of basis functions F for between-individual variation (identity) for this pose, G_k w_ijk is a weighted sum of basis functions G for within-individual variation for this pose, and e_ijk is noise. The identity variable h_i is shared (tied) across poses, which is what allows the same person to be linked across different poses.
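A minimal sketch of this tied generative picture (dimensions, names and noise level are illustrative; the only point is that one identity vector is shared across the pose-specific models):

```python
import numpy as np

rng = np.random.default_rng(1)
D, DF, DG, N_POSES = 100, 32, 16, 2   # illustrative dimensions and number of poses

# Pose-specific parameters: mean, identity basis, within-individual basis
m = [rng.normal(size=D) for _ in range(N_POSES)]
F = [rng.normal(size=(D, DF)) for _ in range(N_POSES)]
G = [rng.normal(size=(D, DG)) for _ in range(N_POSES)]
sigma = 0.1

def sample_image(h, k):
    """One image of the person with identity h, seen in pose k."""
    w = rng.normal(size=DG)            # per-image within-individual factor
    e = sigma * rng.normal(size=D)     # residual noise
    return m[k] + F[k] @ h + G[k] @ w + e   # x_ijk = m_k + F_k h_i + G_k w_ijk + e_ijk

h_person = rng.normal(size=DF)         # single identity vector, tied across poses
frontal = sample_image(h_person, k=0)
profile = sample_image(h_person, k=1)  # same identity, different pose-specific model
```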

Tied PLDA
[Figure: example basis images showing between-individual variation (pose 1), within-individual variation (pose 1) and within-individual variation (pose 2).]

Quantitative Results: XM2VTS Frontal Data. Always 4 images per person; 5, 10, 15, 20 or 25 people; 100 repeats for each data point with different people and different images.

Qualitative Results – Clustering Results: Mixed Pose. Split-merge cost = 0.85 (average 0.865).

Conclusions
Clustering of faces is a difficult problem: multiple face recognition plus a choice of model size.
Our method marginalizes over the number of identities – it can compare models of different size.
It extends to the cross-pose case.
Learn more about these techniques via http://pvl.cs.ucl.ac.uk