Transform-based Models


Transform-based Models (Outline)
- Principal component analysis (PCA), a.k.a. the Karhunen-Loeve transform (KLT)
- Application to face recognition, with a MATLAB demo
- DFT, DCT and wavelet transforms
- Statistical modeling of transform coefficients (sparse representations)
- Application to texture synthesis, with MATLAB demos
EE565 Advanced Image Processing, Copyright Xin Li 2008

PCA/KLT
What are principal components?
- the direction of maximum variance in the input space (physical interpretation)
- the principal eigenvector of the covariance matrix (mathematical definition)
Theoretical derivations (this is not a theory course like EE513): several different approaches exist in the statistics, economics and communication-theory literature.

Standard Derivation (Covariance Method)
Basic idea: diagonalization of the covariance matrix. For a zero-mean random vector x with covariance C_x = E[x x^T], find a unitary matrix P (the unitary condition P^T P = I; refer to EE465) such that
C_x = P Λ P^T,  so that y = P^T x has covariance C_y = P^T C_x P = Λ = diag(λ1, ..., λN).
The transform coefficients y are uncorrelated, and the columns of P (the eigenvectors of C_x) are the principal components.
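To make the diagonalization concrete, here is a minimal MATLAB sketch on made-up correlated data (illustrative only, not the course demo; the mixing matrix is arbitrary):
X = randn(500,4) * [1 0.9 0 0; 0 1 0 0; 0 0 1 0.5; 0 0 0 1];  % toy correlated data, one row per sample
C = cov(X);                                 % sample covariance matrix
[P, Lambda] = eig(C);                       % C = P*Lambda*P', P unitary
[lam, idx] = sort(diag(Lambda), 'descend'); % order by decreasing variance
P = P(:, idx);                              % principal components
Y = (X - mean(X,1)) * P;                    % decorrelated coefficients; cov(Y) is ~diag(lam)
(The last line uses implicit expansion, so it needs MATLAB R2016b or later.)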

Geometric Interpretation (direction of maximum variation/information)

Why Does PCA/KLT Make Sense?
In pattern recognition (e.g., R. Duda's textbook "Pattern Classification") and in signal processing (e.g., S. Mallat's textbook "A Wavelet Tour of Signal Processing"):
- Analytical results are available for stationary Gaussian processes, up to the unknown parameters (low-order statistics).
- Classical ML/Bayes parameter estimation works most effectively under the independence assumption (recall the curse of dimensionality).
- A transform facilitates satisfying this assumption.
In economics, the same idea appears as the "Hotelling transform" (Google it).

Example: Transform Facilitates Modeling
(Figure: scatter plot of correlated data in (x1, x2) coordinates, with rotated principal axes (y1, y2).)
x1 and x2 are highly correlated: p(x1, x2) ≠ p(x1)p(x2).
After the rotation, y1 and y2 are much less correlated: p(y1, y2) ≈ p(y1)p(y2).

Comparison Between LR and LT
Linear regression (AR model):
- hyperplane fitting (a local strategy)
- dimensionality reduction: data space mapped to parameter space
- does not preserve distortion (refer to EE467, closed-loop optimization in speech coding)
Linear transform (PCA/KLT):
- rotation of the coordinate system (a global strategy)
- dimensionality reduction: keep only the components with the largest eigenvalues in the data space
- preserves distortion (unitary property of P)

Transform-based Models
- Principal component analysis (PCA), a.k.a. the Karhunen-Loeve transform (KLT)
- Application to face recognition, with a MATLAB demo
- DFT, DCT and wavelet transforms
- Statistical modeling of transform coefficients (sparse representations)
- Application to texture synthesis, with MATLAB demos

Appearance-based Recognition (adapted from CMU class 15385-s06)
Directly represent appearance (image brightness), not geometry.
Why? It avoids modeling geometry and the complex interactions between geometry, lighting and reflectance.
Why not? Too many possible appearances: with m "visual degrees of freedom" (e.g., pose, lighting, etc.) and R discrete samples for each DOF, there are on the order of R^m appearances to store.
"Nature is not economical of structures, only of principles." - Abdus Salam

The Space of Faces
(Figure: a face image expressed as the sum of two component images.)
An image with N pixels is a point in N-dimensional space; a collection of M images is a cloud of M points in R^N. We can define vectors in this space as we did in the 2D case. [Apologies to former President Bush]

Key Idea: Linear Subspace
Images in the possible set are highly correlated, so compress them to a low-dimensional linear subspace that captures the key appearance characteristics of the visual DOFs. The linearity assumption is a double-edged sword: it facilitates analytical derivation and computational solution, but Nature seldom works in a linear fashion.
EIGENFACES [Turk and Pentland, 1991]: use PCA!

Example of Eigenfaces
(Figure: a training set of face images and its first 15 principal components.)
The 15 eigenfaces are the eigenvectors corresponding to the 15 largest eigenvalues; they look somewhat like ghost faces.

Linear Subspaces Explained by a 2D Toy Example (easier for visualization)
Suppose the data points are arranged as above. Classification can be expensive: we must either search (e.g., nearest neighbors) or store large probability density functions.
Idea: fit a line; the classifier measures distance to the line. Convert each point x into (v1, v2) coordinates:
- the v1 coordinate measures position along the line: use it to specify which orange point it is;
- the v2 coordinate measures distance to the line: use it for classification, since it is near 0 for the orange points.

Dimensionality Reduction
We can represent the orange points with only their v1 coordinates, since their v2 coordinates are all essentially 0. This makes it much cheaper to store and compare points; it is a much bigger deal for higher-dimensional problems.

Linear Subspaces (PCA in 2D)
Consider the variation along a direction v among all of the orange points:
var(v) = Σ_i ((x_i - mean(x)) · v)^2 = v^T A v,  where A is the scatter (covariance) matrix and v is a unit vector.
Which unit vector v maximizes var? Which unit vector v minimizes var?
Solution: v1, the eigenvector of A with the largest eigenvalue, maximizes it; v2, the eigenvector of A with the smallest eigenvalue, minimizes it.

PCA in Higher Dimensions
Suppose each data point is N-dimensional. The same procedure applies: the eigenvectors of A define a new coordinate system.
- The eigenvector with the largest eigenvalue captures the most variation among the training vectors x.
- The eigenvector with the smallest eigenvalue captures the least variation.
We can compress the data by using only the top few eigenvectors. This corresponds to choosing a "linear subspace" (representing points on a line, plane, or "hyper-plane"); these eigenvectors are known as the principal components.

Problem: Size of the Covariance Matrix A
Suppose each data point is N-dimensional (N pixels). The covariance matrix A is N x N, so there are up to N eigenfaces.
Example: for N = 256 x 256 pixels, A is 65536 x 65536 and there are 65536 eigenvectors! Typically only 20-30 eigenvectors suffice, so this direct method is very inefficient.

Efficient Computation of Eigenvectors* (you can skip this if you don't like matrix algebra)
Let B be M x N, where M is the number of images, N is the number of pixels, and M << N. Then A = B^T B is N x N, which is much larger than M x M.
Use BB^T instead: an eigenvector of BB^T is easily converted into one of B^T B:
(BB^T) y = λ y  =>  B^T (BB^T) y = λ (B^T y)  =>  (B^T B)(B^T y) = λ (B^T y),
so B^T y is an eigenvector of B^T B with the same eigenvalue λ.
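A minimal MATLAB sketch of this trick (illustrative only; the data matrix here is random stand-in data rather than real face images):
M = 20; N = 64*64;                  % M images, N pixels, M << N
B = randn(M, N);                    % stand-in for vectorized face images (one per row)
B = B - mean(B, 1);                 % subtract the mean face
[V, D] = eig(B * B');               % small M-by-M eigenproblem instead of N-by-N
U = B' * V;                         % each column B'*y is an eigenvector of B'*B
U = U ./ vecnorm(U, 2, 1);          % normalize each eigenface to unit length
(vecnorm requires MATLAB R2017b or later; older versions can use sqrt(sum(U.^2,1)).)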

Eigenfaces: Summary in Words
Eigenfaces are the eigenvectors of the covariance matrix of the probability distribution over the vector space of human faces. They are the "epitomized face ingredients" derived from the statistical analysis of many pictures of human faces, and a human face may be considered a combination of these epitomized faces.

Generating Eigenfaces: In Words
A large set of images of human faces is taken. The images are normalized to line up the eyes, mouths and other features. The eigenvectors of the covariance matrix of the face-image vectors are then extracted; these eigenvectors are called eigenfaces.
Keep in mind that if B^T B is too large, you can work with BB^T instead (the algebraic trick above).

Eigenfaces for Face Recognition
When properly weighted, eigenfaces can be summed together to create an approximate gray-scale rendering of a human face. Remarkably few eigenvector terms are needed to give a fair likeness of most people's faces, so eigenfaces provide a means of applying "data compression" to faces for identification purposes (note: NOT for transmission purposes).

Detection with Eigenfaces
The set of faces is a "subspace" of the set of images; suppose it is K-dimensional. We can find the best subspace using PCA. This is like fitting a "hyper-plane" to the set of faces, spanned by the vectors v1, v2, ..., vK. Any face can then be written as
x ≈ μ + a1 v1 + a2 v2 + ... + aK vK,
where μ is the mean face.

Eigenfaces
PCA extracts the eigenvectors of A, giving a set of vectors v1, v2, v3, ... Each of these vectors is a direction in face space. What do they look like?

Projecting onto the Eigenfaces
(Projection is easier to understand in the 2D toy example, though conceptually high dimensions work the same way.) The eigenfaces v1, ..., vK span the space of faces. A face x is converted to eigenface coordinates by
ai = vi^T (x - μ),  i = 1, ..., K,
so that x ≈ μ + a1 v1 + ... + aK vK and x is represented by the coefficient vector a = (a1, ..., aK).

Key Property of the Eigenspace Representation
Given two images x1, x2 that are used to construct the eigenspace, let a1, a2 be their coefficient vectors and x̂1, x̂2 their eigenspace reconstructions. Then
||a1 - a2|| ≈ ||x̂1 - x̂2|| ≈ ||x1 - x2||,
that is, distance in eigenspace is approximately equal to the distance (correlation) between the two images.

Advantage of Dimensionality Reduction
x ∈ R^(H x W) → a ∈ R^K, with K << H x W.
Training set: x1, x2, ..., xM → a1, a2, ..., aM. New image: x → a.
For detection: threshold d = mean_k(||a - ak||).
For recognition: select the index k minimizing ||a - ak||.

Detection: Is this a face or not?

Recognition: Whose Face Is This?
Algorithm:
1. Process the image database (a set of images with labels): run PCA to compute the eigenfaces, and calculate the K coefficients for each image.
2. Given a new image x to be recognized, calculate its K coefficients.
3. Detect whether x is a face.
4. If it is a face, identify who it is: find the closest labeled face in the database (nearest neighbor in K-dimensional space), as sketched below.
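A minimal MATLAB sketch of steps 2-4. All variable names here are hypothetical stand-ins: U (eigenfaces), mu (mean face), A and labels (training coefficients and identities) and the threshold T would come from the actual training step.
% Stand-in data, for illustration only:
N = 64*64; K = 10; M = 50;
U = orth(randn(N, K));              % eigenfaces (orthonormal columns)
mu = zeros(1, N);                   % mean face
A = randn(M, K); labels = (1:M)';   % training coefficients and identities
T = 5;                              % detection threshold (arbitrary)
x = randn(64, 64);                  % new image to recognize
% Recognition:
a = (x(:)' - mu) * U;               % K coefficients of the new image
d = sqrt(sum((A - a).^2, 2));       % distances to all training faces in eigenspace
[dmin, j] = min(d);                 % nearest neighbor
if dmin < T                         % detection: close enough to face space?
    identity = labels(j);           % recognized identity
end
(A - a uses implicit expansion, MATLAB R2016b or later.)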

An Excellent Toy Example to Help Your Understanding (M’ is the number of eigenfaces used)

Choosing the Dimension K
(Figure: eigenvalue spectrum λi plotted against index i.)
How many eigenfaces should we use? Look at the decay of the eigenvalues: the eigenvalue tells you the amount of variance "in the direction" of that eigenface, so ignore eigenfaces with low variance.

New Ideas We Can Play With
A localized version of eigenfaces-based recognition:
- For each subject k = 1, ..., K, obtain its associated N eigenfaces vk1, ..., vkN.
- For a new image x, project it onto all K sets of "localized" eigenface spaces, obtaining K reconstructed copies x̂1, ..., x̂K.
- Select the subject k minimizing ||x - x̂k||.
Connection with sparse representation (or l1 regularization): refer to Prof. Yi Ma's homepage and his new PAMI paper.

Transform-based Models
- Principal component analysis (PCA), a.k.a. the Karhunen-Loeve transform (KLT)
- Application to face recognition, with a MATLAB demo
- DFT, DCT and wavelet transforms
- Statistical modeling of transform coefficients (sparse representations)
- Application to texture synthesis, with MATLAB demos

Alternative Tools (more suitable for telecommunication applications)
- Discrete Fourier Transform (DFT): can be shown to be the KLT for circularly stationary processes (the eigenvectors take the form of the discrete Fourier basis).
- Discrete Cosine Transform (DCT): a good approximation of the KLT for an AR(1) process with high correlation (e.g., a = 0.9); you are asked to show this in HW#1.
- Wavelet transform: an effective tool for characterizing transient signals.
A quick numerical check of the DCT/KLT connection is sketched below (it does not give away the HW#1 derivation).
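A minimal MATLAB sketch comparing the KLT of an AR(1) process with the DCT basis (dctmtx is from the Image Processing Toolbox; the block size n = 8 is an arbitrary choice):
n = 8; a = 0.9;
R = toeplitz(a .^ (0:n-1));   % AR(1) covariance: R(i,j) = a^|i-j|
[P, ~] = eig(R);              % KLT basis vectors (columns)
D = dctmtx(n);                % DCT basis vectors (rows)
disp(abs(D * P));             % entries near 0 or 1 show the DCT is close to the KLT, up to ordering and sign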

One-Minute Tour of Wavelets
(Figure: analysis filter bank decomposing the input x(n) into a low-band approximation signal s(n) and a high-band detail signal d(n).)

Wavelet Transform on Images
(Figure: after the row transform the image splits into s(m,n) and d(m,n); after the column transform it splits into the four subbands LL, HL, LH, HH.)
After the row transform, each row is decomposed into a low band (approximation) and a high band (detail). Note that the order of the row/column transforms does not matter.
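A one-level 2-D decomposition in MATLAB (dwt2 is from the Wavelet Toolbox; the test image and the 'haar' filter are arbitrary choices for this sketch):
x = double(imread('cameraman.tif'));
[cA, cH, cV, cD] = dwt2(x, 'haar');     % approximation (LL) plus horizontal, vertical, diagonal details
imagesc([cA cH; cV cD]); colormap gray; % tile the four subbands for display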

From One Level to Multiple Levels (the LL band is recursively decomposed)

Relationship to Linear Regression (Sparsity Perspective)
In an AR model, effectiveness is measured by the energy (sparsity) of the prediction errors; in transform-based models, it is measured by the energy (sparsity) of the transform coefficients. Improved sparsity (lower energy) implies a better match between the assumed model and the observed data, as the sketch below illustrates.
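A toy MATLAB comparison on a synthetic AR(1) signal (illustrative only; the Haar transform here is computed by hand):
x = filter(1, [1 -0.9], randn(1, 1024));   % synthetic AR(1) signal, a = 0.9
e = x(2:end) - 0.9 * x(1:end-1);           % AR prediction errors (residuals)
d = (x(1:2:end) - x(2:2:end)) / sqrt(2);   % Haar high-band (detail) coefficients
fprintf('signal energy %.0f, residual energy %.0f, detail energy %.0f\n', ...
        sum(x.^2), sum(e.^2), sum(d.^2));  % both residual and detail energies are far below the signal energy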

Empirical Observation
(Figure: take a natural image, apply the wavelet transform, and extract the HL band; its coefficient histogram shows a single sharp peak at zero.)

Univariate Probability Models
Laplacian: p(x) = (1/(2s)) exp(-|x|/s), with scale parameter s.
Gaussian: p(x) = (1/(sqrt(2*pi)*sigma)) exp(-x^2/(2*sigma^2)), with standard deviation sigma.

Gaussian Distribution

Laplacian Distribution

Statistical Testing
How do we know which parametric model better fits the empirical distribution of wavelet coefficients? In addition to visual inspection (which is often subjective and less accurate), we can use statistical testing tools to objectively evaluate the closeness of an empirical cumulative distribution function (ECDF) to the hypothesized one. One of the most widely used techniques is the Kolmogorov-Smirnov test (MATLAB function: >help kstest).

Kolmogorov-Smirnov Test*
The K-S test is based on the maximum distance between the empirical CDF (ECDF) F_n(x) and the hypothesized CDF F(x) (e.g., the normal distribution N(0,1)):
KS = max over x of |F_n(x) - F(x)|.

Example
Usage: [H,P,KS,CV] = KSTEST(X,CDF). If CDF is omitted, the hypothesized distribution is N(0,1). H = 0 means accept the hypothesis and H = 1 means reject it; the P-value satisfies 0 < P < 1, and the larger P is, the more likely the hypothesis.
- d: high-band wavelet coefficients of the lena image (note the normalization by signal variance): the N(0,1) hypothesis is rejected.
- x: computer-generated Gaussian samples: the hypothesis is accepted.
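A minimal sketch of this experiment (kstest is from the Statistics Toolbox; it assumes d already holds the high-band wavelet coefficients of an image):
d = d(:) / std(d(:));            % normalize by the signal's standard deviation
[H1, P1] = kstest(d);            % typically H1 = 1: reject N(0,1) for wavelet coefficients
x = randn(numel(d), 1);          % computer-generated N(0,1) samples
[H2, P2] = kstest(x);            % typically H2 = 0: accept N(0,1)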

Generalized Gaussian/Laplacian Distribution
p(x) = (P / (2*alpha*Gamma(1/P))) * exp(-(|x|/alpha)^P),
where P is the shape parameter, alpha is the scale (variance) parameter, and Gamma is the gamma function. P = 1 gives the Laplacian and P = 2 gives the Gaussian as special cases.
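A quick MATLAB sketch of this family (the parameterization follows the formula above; the plotted parameter values are arbitrary):
ggd = @(x, alpha, P) P ./ (2*alpha*gamma(1/P)) .* exp(-(abs(x)/alpha).^P);
x = linspace(-5, 5, 501);
plot(x, ggd(x, 1, 1), x, ggd(x, sqrt(2), 2), x, ggd(x, 1, 0.7));
legend('P = 1 (Laplacian)', 'P = 2 (Gaussian)', 'P = 0.7 (heavier-tailed)');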

Model Parameter Estimation*
- Maximum likelihood estimation
- Method of moments
- Linear regression method
References:
[1] K. Sharifi and A. Leon-Garcia, "Estimation of shape parameter for generalized Gaussian distributions in subband decompositions of video," IEEE T-CSVT, no. 1, pp. 52-56, February 1995.
[2] www.cimat.mx/reportes/enlinea/I-01-18_eng.pdf

Transform-based Models
- Principal component analysis (PCA), a.k.a. the Karhunen-Loeve transform (KLT)
- Application to face recognition, with a MATLAB demo
- DFT, DCT and wavelet transforms
- Statistical modeling of transform coefficients (sparse representations)
- Application to texture synthesis, with MATLAB demos

Wavelet-based Texture Synthesis
Basic idea: two visually similar textures will also have similar statistics.
- Pyramid-based (using steerable pyramids): facilitates the statistical modeling.
- Histogram matching: enforces the first-order statistical constraint.
- Texture matching: alternates histogram matching in the spatial and wavelet domains.
- Boundary handling: use periodic extension.
- Color consistency: use a color transformation.

Histogram Matching
A generalization of histogram equalization: the target is the histogram of a given image instead of the uniform distribution.

Histogram Equalization via Uniform Quantization
(Figure: mapping y = T(x) from gray level x in [1, L] to y in [1, L].)
Note: the mapping T is the scaled cumulative probability function of the input image, y = (L-1) * cdf(x); equalization amounts to uniform quantization of the cumulative histogram.

MATLAB Implementation
function y = hist_eq(x)
% x: grayscale image with integer values in 0..255
h = zeros(1,256);
for i = 1:256
    h(i) = sum(sum(x == i-1));    % calculate the histogram of the input image
end
y = x; s = sum(h);
for i = 1:256                     % perform histogram equalization
    I = find(x == i-1);
    y(I) = sum(h(1:i))/s*255;     % map gray level i-1 through the scaled cumulative histogram
end

Histogram Equalization Example

Histogram Specification
Given an input histogram and a target histogram, the mapping is obtained by composing the equalization transform of the input with the inverse of the equalization transform of the target; a sketch is given below.
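A minimal MATLAB sketch of histogram specification for 8-bit images (hist_match is a hypothetical helper name; imhist and intlut are from the Image Processing Toolbox):
function y = hist_match(x, ref)
cx = cumsum(imhist(uint8(x)));   cx = cx / cx(end);   % CDF of the input image
cr = cumsum(imhist(uint8(ref))); cr = cr / cr(end);   % CDF of the reference image
lut = zeros(256, 1, 'uint8');
for g = 0:255
    [~, m] = min(abs(cr - cx(g+1)));   % invert the reference CDF
    lut(g+1) = m - 1;
end
y = intlut(uint8(x), lut);             % apply the lookup table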

Texture Matching
Objective: the histograms of both the subbands and the synthesized image match those of the given template.
Basic hypothesis: if two texture images look visually similar, then they have similar histograms in both the spatial and wavelet domains.

Image Examples

I.I.D. Assumption Challenged
If the wavelet coefficients of each subband were indeed i.i.d., then a random permutation of the coefficients should produce another image of the same class (natural images). The fundamental question here: does the WT completely decorrelate image signals?

Image Example
(Figure: the result of randomly permuting the high-band coefficients of an image.)
You can run the MATLAB demo to check this experiment.

Another Experiment
(Figure: joint pdf of two uncorrelated random variables X and Y.)

Joint PDF of Wavelet Coefficients
(Figure: joint pdf of two correlated random variables X and Y, where X is a wavelet coefficient and Y is one of its neighbors.)
Neighborhood I(Q): {left, up, cousin and aunt}. Unlike the uncorrelated case above, the joint pdf shows clear dependency between a coefficient and its neighbors.

Texture Synthesis via Parametric Models in the Wavelet Space
Basic idea: two visually similar textures will also have similar statistics. Instead of matching histograms (nonparametric models), we can build parametric models for the wavelet coefficients and force the synthesized image to inherit the parameters of the given image.
Model parameters: 710 parameters were used in Portilla and Simoncelli's experiment (4 orientations, 4 scales, 7x7 neighborhood).

Statistical Constraints
Four types of constraints:
- marginal statistics
- raw coefficient correlation
- coefficient magnitude statistics
- cross-scale phase statistics
Synthesis alternates projections onto the four constraint sets: projection onto convex sets (POCS).

Convex Set
A set Ω is said to be convex if for any two points x1, x2 in Ω and any 0 ≤ λ ≤ 1, the point λx1 + (1-λ)x2 also lies in Ω.
(Figure: examples of convex sets and of non-convex sets.)

Projection Operator
(Figure: f is projected onto the convex set C, giving g.)
In simple words, the projection of f onto a convex set C is the element g in C that is closest to f in terms of Euclidean distance: g = argmin over h in C of ||f - h||.

Alternating Projection
(Figure: iterates x0, x1, x2, ..., x∞ bouncing between two convex sets C1 and C2 and converging to a point in their intersection.)
Projection-onto-convex-sets (POCS) theorem: if C1, ..., Ck are convex sets, then the alternating projections P1, ..., Pk converge to a point in the intersection of C1, ..., Ck, provided the intersection is not empty.
Alternating projection does not always converge for non-convex sets. Can you think of a counter-example?

Convex Constraint Sets
● Non-negative set: {x : x(n) ≥ 0 for all n}
● Bounded-value set: {x : a ≤ x(n) ≤ b}, or {x : ||x||∞ ≤ B}
● Bounded-variance set: {x : ||x - x0||^2 ≤ σ^2}, where x0 is a given signal
A sketch of alternating projections onto two of these sets follows.
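A toy MATLAB sketch of POCS using the non-negative set C1 and the bounded-variance ball C2 (the signal length, radius and iteration count are arbitrary):
x0 = abs(randn(64, 1)); sigma = 1;    % ball center chosen inside C1, so C1 ∩ C2 is nonempty
x  = randn(64, 1);                    % arbitrary starting point
for k = 1:50
    x = max(x, 0);                    % project onto C1: clip negative samples
    r = x - x0;
    if norm(r) > sigma                % project onto C2: pull back onto the ball
        x = x0 + sigma * r / norm(r);
    end
end                                   % x now lies (approximately) in C1 ∩ C2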

Non-convex Constraint Sets
- Histogram matching, as used in Heeger & Bergen, 1995.
- Bounded skewness and kurtosis: skewness = E[(x-μ)^3]/σ^3, kurtosis = E[(x-μ)^4]/σ^4.
The derivation of the projection operators onto these constraint sets is tedious; the reader is referred to the paper and MATLAB code by Portilla & Simoncelli.

Image Examples
(Figure: original and synthesized textures.)

Image Examples (Cont'd)
(Figure: original and synthesized textures.)

When Does It Fail?
(Figure: original and synthesized textures for a failure case.)

Summary
Textures represent an important class of structures in natural images: unlike edges, which characterize object boundaries, textures are often associated with the homogeneous properties of object surfaces. Wavelet-domain parametric models provide a parsimonious representation of the high-order statistical dependencies within textural images.