Multi-Attribute Spaces: Calibration for Attribute Fusion and Similarity Search
Walter Scheirer, Neeraj Kumar, Peter N. Belhumeur, Terrance E. Boult. CVPR 2012.
University of Oxford, 5th December 2012.

Attribute-based Image Description
Example attribute labels: 4-legged, orange, striped, furry, white; symmetric, Ionic columns, classical; male, Asian, beard, smiling. Slide courtesy: Neeraj Kumar.

Attribute Classifiers
Attribute and Simile Classifiers for Face Verification. N. Kumar, A. C. Berg, P. N. Belhumeur, and S. K. Nayar. ICCV 2009.
FaceTracer: A Search Engine for Large Collections of Images with Faces. N. Kumar, P. N. Belhumeur, and S. K. Nayar. ECCV 2008.

Attribute Fusion
FaceTracer example query: "smiling Asian men with glasses". Slide courtesy: Neeraj Kumar.

Score Normalization: Problem
Normalization is necessary to prevent a high-confidence score for one attribute from dominating the results. An ideal normalization technique should:
1) Map scores to a common range, say [0, 1].
2) Assign a perceptual meaning to the scores, so that equal normalized scores reflect equal confidence.
The positive and negative score distributions of different classifiers do not necessarily follow the same distribution, so fitting a Gaussian (or any other fixed distribution) to the scores satisfies condition 1 but not condition 2.
(Figure: positive and negative score distributions for several classifiers.)

Score Normalization: Solution
Model the separation between the positive scores and the negative scores. If we knew the distribution of the negative scores, we could run a hypothesis test for each positive score against that distribution. Unfortunately, we know nothing about the overall negative distribution. We do, however, know something about its tail.

Extreme Value Theory
Central Limit Theorem: the mean of a sufficiently large number of i.i.d. random variables is approximately normally distributed.
Extreme Value Theorem: the maximum of a sufficiently large number of i.i.d. random variables converges to one of three distributions: Gumbel, Fréchet, or Weibull. If the values are bounded from above and below, the maxima follow a Weibull distribution.
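As a quick numerical illustration (my addition, not from the talk): block maxima of bounded i.i.d. samples cluster near the upper bound, and the gap between the bound and each maximum follows a Weibull law, which we can check with scipy.

```python
import numpy as np
from scipy.stats import weibull_min

rng = np.random.default_rng(0)

# 10,000 blocks of 50 i.i.d. uniforms on [0, 1); keep each block's maximum.
maxima = rng.uniform(0.0, 1.0, size=(10_000, 50)).max(axis=1)

# For variables bounded above, EVT says the gap (bound - maximum)
# follows a Weibull distribution; here the bound is 1.
gaps = 1.0 - maxima
shape, loc, scale = weibull_min.fit(gaps, floc=0.0)
print(f"fitted Weibull shape k = {shape:.2f}, scale lambda = {scale:.3f}")
# For uniforms the limiting shape is k = 1 (an exponential-like tail).
```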

Weibull Distribution
PDF: $f(x; k, \lambda) = \frac{k}{\lambda}\left(\frac{x}{\lambda}\right)^{k-1} e^{-(x/\lambda)^k}$ for $x \ge 0$, and $0$ otherwise.
CDF: $F(x; k, \lambda) = 1 - e^{-(x/\lambda)^k}$ for $x \ge 0$.
$k$ and $\lambda$ are the shape and scale parameters, respectively.
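For reference (my addition), scipy's weibull_min parameterization matches these formulas with c = k and scale = λ:

```python
import numpy as np
from scipy.stats import weibull_min

k, lam, x = 1.5, 2.0, 1.3
# PDF: (k/lam) * (x/lam)**(k-1) * exp(-(x/lam)**k)
assert np.isclose(weibull_min.pdf(x, k, scale=lam),
                  (k / lam) * (x / lam)**(k - 1) * np.exp(-(x / lam)**k))
# CDF: 1 - exp(-(x/lam)**k)
assert np.isclose(weibull_min.cdf(x, k, scale=lam),
                  1 - np.exp(-(x / lam)**k))
```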

Extreme Value Theory: Application
(Figure: the overall negative score distribution, with its tail highlighted as the maximum values of underlying random variables.)
The tail of the negative scores can be viewed as a collection of maxima of underlying random variables. Hence, by Extreme Value Theory, it follows a Weibull distribution.

W-score Normalization: Procedure
For any classifier:
1) Fix the decision boundary on the scores (ideally this should be at score = 0).
2) Select the N largest samples (the tail size) from the negative side of the boundary.
3) Fit a Weibull distribution to these tail scores.
4) Renormalize all scores using the cumulative distribution function (CDF) of the fitted Weibull (see the sketch below).
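A minimal Python sketch of this procedure, assuming the decision boundary is at 0 and using scipy's weibull_min; the function name w_score and the shifting details are my own, and the paper's released implementation may differ.

```python
import numpy as np
from scipy.stats import weibull_min

def w_score(scores, tail_size=100):
    """Normalize raw classifier scores to [0, 1] w-scores.

    Fits a Weibull to the extreme tail of the negative scores and
    maps every score through the fitted CDF.
    """
    scores = np.asarray(scores, dtype=float)

    # 1) Decision boundary at 0: scores < 0 are treated as negatives.
    negatives = scores[scores < 0]

    # 2) Take the tail_size largest negatives, i.e. those closest to
    #    the boundary -- the extrema that EVT reasons about.
    tail = np.sort(negatives)[-tail_size:]

    # 3) Fit a Weibull. Shift the tail to be strictly positive, since
    #    weibull_min with loc=0 has support on (0, inf).
    shift = tail.min() - 1e-6
    shape, loc, scale = weibull_min.fit(tail - shift, floc=0.0)

    # 4) Renormalize every score through the fitted CDF: a w-score
    #    near 1 means the score lies far beyond the negative tail.
    return weibull_min.cdf(scores - shift, shape, loc=loc, scale=scale)

# Example: toy scores from one attribute classifier.
rng = np.random.default_rng(1)
raw = np.concatenate([rng.normal(-2.0, 1.0, 1000),   # negatives
                      rng.normal(1.5, 0.5, 50)])     # positives
print(w_score(raw)[-5:])  # w-scores of some positive examples, near 1
```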

Results: Dataset
The "Labeled Faces in the Wild" (LFW) dataset: about 13,000 images of roughly 5,000 celebrities (Labeled Faces in the Wild: A Database for Studying Face Recognition in Unconstrained Environments). 75 attribute classification scores are available from "Attribute and Simile Classifiers for Face Verification", Kumar et al., ICCV 2009.

Results

Multi-Attribute Fusion
A joint score could be computed as the product of the individual attribute probabilities, but this is problematic: the attributes may not be independent, and a single low probability (due to a bad classifier, or to the absence of images exhibiting an attribute) drives the product to zero. Instead of the product, the authors propose using the l1 norm of the probabilities as the fusion score (see the sketch below).
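A small illustrative comparison, assuming per-attribute w-scores in [0, 1] (the function names here are hypothetical, not from the paper): the product collapses when one attribute score is near zero, while the l1-norm fusion degrades gracefully.

```python
import numpy as np

def fuse_product(w_scores):
    # Product fusion: one near-zero attribute kills the joint score.
    return np.prod(w_scores, axis=-1)

def fuse_l1(w_scores):
    # L1-norm fusion: sum of per-attribute w-scores, robust to a
    # single poorly calibrated or data-starved attribute.
    return np.sum(w_scores, axis=-1)

# Two candidate images scored on ["male", "asian", "smiling"]:
a = np.array([0.95, 0.90, 0.02])  # one weak attribute score
b = np.array([0.60, 0.60, 0.60])  # uniformly mediocre scores

print(fuse_product(a), fuse_product(b))  # a is punished: 0.017 vs 0.216
print(fuse_l1(a), fuse_l1(b))            # a still ranks higher: 1.87 vs 1.80
```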

Results

Similarity Search
Given a query image and a set of attributes, find the nearest images. The same difference in raw scores can correspond to different perceived differences in different parts of the score range, so distances between the query's attribute scores and those of its nearest neighbors need to be normalized. The procedure:
1) Compute attribute scores on the query image.
2) Collect the query's nearest-neighbor distances.
3) Fit a Weibull distribution to these distances and normalize with its CDF (see the sketch below).
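A hedged sketch of the per-attribute distance normalization step, again using scipy's weibull_min; normalized_similarity is my own name, and the paper's exact tail selection and cross-attribute combination may differ. Distances are bounded below by 0, so the smallest (nearest-neighbor) distances are the extrema a Weibull can model.

```python
import numpy as np
from scipy.stats import weibull_min

def normalized_similarity(query_score, gallery_scores, tail_size=100):
    """Map raw attribute-score distances to calibrated similarities.

    The query's nearest-neighbor distances are treated as extrema and
    modeled with a Weibull; 1 - CDF(distance) then serves as a
    similarity in [0, 1].
    """
    dists = np.abs(np.asarray(gallery_scores, dtype=float) - query_score)

    # Nearest-neighbor distances form the lower tail of all distances.
    tail = np.sort(dists)[:tail_size]

    # Small offset keeps the fit away from exact zeros.
    shape, loc, scale = weibull_min.fit(tail + 1e-6, floc=0.0)

    # Small distance -> CDF near 0 -> similarity near 1.
    return 1.0 - weibull_min.cdf(dists, shape, loc=loc, scale=scale)

# Usage: calibrated similarities for one attribute dimension.
rng = np.random.default_rng(2)
sims = normalized_similarity(0.82, rng.uniform(0.0, 1.0, 5000))
```

Per-attribute similarities calibrated this way can then be fused across attributes as on the fusion slide.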

Summary
Provides an intuitive way of normalizing classifier scores.
Provides a principled way of combining attributes.
Relies on choosing the right decision threshold and tail size.
Requires a fair bit of tuning.

Questions?