Joint Estimation of Image Clusters and Image Transformations
Brendan J. Frey, Computer Science, University of Waterloo, Canada; Beckman Institute and ECE, University of Illinois at Urbana-Champaign
Nebojsa Jojic, Beckman Institute, University of Illinois at Urbana-Champaign
We'd like to cluster images, but the unknown subjects have unknown positions, unknown rotations, unknown scales, unknown levels of shearing...
One approach: normalization. Images → Normalization → Normalized images → Pattern Analysis. Drawback: normalization requires manual labor.
Another approach: apply every transformation to each image. Images → Huge data set → Pattern Analysis. Drawbacks: this assumes all transformations are equally likely, the noise gets copied, and the analysis is more complex.
Yet another approach: extract transformation-invariant features. Images → Transformation-invariant data → Pattern Analysis. Drawbacks: such data is difficult to work with and may hide useful features.
Our approach: joint normalization and pattern analysis, applied directly to the images.
What transforming an image does in the vector space of pixel intensities: a continuous transformation moves an image along a continuous curve. Our clustering algorithm should assign images near this nonlinear manifold to the same cluster.
Tractable approaches to modeling the transformation manifold:
– Linear approximation: good locally, bad globally
– Finite-set approximation: good globally, bad locally
Related work
Generative models
– Local invariance: PCA, Turk, Moghaddam, Pentland (96); factor analysis, Hinton, Revow, Dayan, Ghahramani (96); Frey, Colmenarez, Huang (98)
– Layered motion: Black, Jepson, Wang, Adelson, Weiss (93-98)
– Learning discrete representations of generative manifolds: generative topographic maps, Bishop, Svensen, Williams (98)
Discriminative models
– Local invariance: tangent distance, tangent prop, Simard, Le Cun, Denker, Victorri (92-93)
– Global invariance: convolutional neural networks, Le Cun, Bottou, Bengio, Haffner (98)
Generative density modeling
The goal is to find a probability model that
– reflects the structure we want to extract
– can randomly generate plausible images
– represents the data using parameters
ML estimation is used to find the parameters. We can use the class-conditional likelihoods p(image|class) for recognition, detection, ...
Mixture of Gaussians
– The probability that an image comes from cluster c = 1, 2, ... is P(c) = π_c
– The probability of the pixel intensities z, given that the image is from cluster c, is p(z|c) = N(z; μ_c, Φ_c)
– The parameters π_c, μ_c and Φ_c represent the data
– For an input z, the cluster responsibilities are P(c|z) = p(z|c)P(c) / Σ_c' p(z|c')P(c')
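The responsibilities follow directly from Bayes' rule. A minimal NumPy sketch (not the authors' MATLAB scripts; the diagonal form of Φ_c is an assumption carried over from the experiments later in the talk), computed in log space for numerical stability:

```python
import numpy as np

def log_gaussian_diag(z, mu, var):
    """log N(z; mu, diag(var)) for a flattened image z of length N."""
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (z - mu) ** 2 / var)

def responsibilities(z, pi, mu, var):
    """P(c|z) = p(z|c) P(c) / sum_c' p(z|c') P(c').

    pi:  (C,)   mixing proportions pi_c
    mu:  (C, N) cluster means mu_c
    var: (C, N) diagonal variances Phi_c
    """
    log_joint = np.array([np.log(pi[c]) + log_gaussian_diag(z, mu[c], var[c])
                          for c in range(len(pi))])
    log_joint -= log_joint.max()   # shift before exponentiating, for stability
    p = np.exp(log_joint)
    return p / p.sum()
```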
Example: Hand-crafted model
P(c) = π_c, p(z|c) = N(z; μ_c, Φ_c), with π_1 = 0.6, π_2 = 0.4 (the means μ_c and variances Φ_c are shown as images on the slide).
Example: Simulation
First sample a cluster from P(c) = π_c (here c = 1), then sample the pixel intensities from p(z|c) = N(z; μ_c, Φ_c) to obtain an image z. A second draw gives c = 2 and a different image z. (π_1 = 0.6, π_2 = 0.4; sampled images shown on the slides.)
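A minimal sketch of this ancestral sampling, again assuming diagonal variances:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_mog(pi, mu, var):
    """Draw c ~ P(c) = pi_c, then z ~ N(mu_c, diag(var_c))."""
    c = rng.choice(len(pi), p=pi)
    z = mu[c] + np.sqrt(var[c]) * rng.standard_normal(mu.shape[1])
    return c, z
```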
Example: Inference
For images z from the data set, compute the responsibilities P(c|z): one image is assigned mostly to c = 1, another mostly to c = 2. (π_1 = 0.6, π_2 = 0.4)
Example: Learning - E step
Starting from π_1 = 0.5, π_2 = 0.5, compute the responsibilities P(c=1|z) and P(c=2|z) for each image z in the data set.
Example: Learning - M step (π_1 = 0.5, π_2 = 0.5)
Set μ_1 to the average of z weighted by P(c=1|z); set μ_2 to the average of z weighted by P(c=2|z).
Set Φ_1 to the average of diag((z - μ_1)(z - μ_1)^T) weighted by P(c=1|z); set Φ_2 to the average of diag((z - μ_2)(z - μ_2)^T) weighted by P(c=2|z).
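A minimal sketch of this M step, where R[n, c] = P(c|z_n) comes from the E step:

```python
import numpy as np

def m_step(Z, R):
    """Z: (T, N) images; R: (T, C) responsibilities. Returns pi, mu, var."""
    Nc = R.sum(axis=0)                      # effective count per cluster
    pi = Nc / Nc.sum()                      # updated mixing proportions
    mu = (R.T @ Z) / Nc[:, None]            # responsibility-weighted means
    # weighted average of squared deviations: the diagonal of (z-mu)(z-mu)^T
    var = np.stack([(R[:, c:c+1] * (Z - mu[c]) ** 2).sum(0) / Nc[c]
                    for c in range(R.shape[1])])
    return pi, mu, var + 1e-6               # small floor keeps variances positive
```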
Example: After iterating EM...
The estimates converge to π_1 = 0.6, π_2 = 0.4, with means and variances close to the hand-crafted model.
Adding "transformation" as a discrete latent variable
Say there are N pixels. We assume we are given a set of sparse N x N transformation-generating matrices G_1, ..., G_l, ..., G_L. These generate a set of transformed points from a single point.
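For illustration, a minimal sketch of one such matrix: a sparse permutation that shifts an image. The wrap-around border handling and the use of scipy.sparse are assumptions, not details from the talk.

```python
import numpy as np
from scipy.sparse import coo_matrix

def shift_matrix(W, H, dx, dy):
    """Sparse N x N permutation G_l that shifts a flattened H x W image
    by (dx, dy), with wrap-around at the borders (an assumption)."""
    idx = np.arange(W * H).reshape(H, W)
    # src[r, c] is the index of the input pixel whose value lands at (r, c)
    src = np.roll(np.roll(idx, dy, axis=0), dx, axis=1)
    N = W * H
    return coo_matrix((np.ones(N), (idx.ravel(), src.ravel())),
                      shape=(N, N)).tocsr()
```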
Transformed Mixture of Gaussians
– The probability that the image comes from cluster c = 1, 2, ... is P(c) = π_c
– The probability of the latent image z for cluster c is p(z|c) = N(z; μ_c, Φ_c)
– The probability of transformation l = 1, 2, ... is P(l) = ρ_l
– The probability of the observed image x is p(x|z,l) = N(x; G_l z, Ψ)
– Parameters ρ_l, π_c, μ_c and Φ_c represent the data; the cluster/transformation responsibilities P(c,l|x) are quite easy to compute
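To see why P(c,l|x) is easy: integrating out z gives p(x|c,l) = N(x; G_l μ_c, G_l Φ_c G_l^T + Ψ). A minimal sketch under the assumption that each G_l is a pixel permutation (like the shifts above) and Φ_c, Ψ are diagonal, so every covariance stays diagonal; perms[l] is the index array with (G_l z) = z[perms[l]]:

```python
import numpy as np

def joint_responsibilities(x, pi, rho, mu, var, psi, perms):
    """R[c, l] = P(c, l | x) for permutation transformations."""
    C, L = len(pi), len(rho)
    logR = np.empty((C, L))
    for c in range(C):
        for l in range(L):
            m = mu[c][perms[l]]             # G_l mu_c
            v = var[c][perms[l]] + psi      # G_l Phi_c G_l^T + Psi (diagonal)
            logR[c, l] = (np.log(pi[c]) + np.log(rho[l])
                          - 0.5 * np.sum(np.log(2 * np.pi * v)
                                         + (x - m) ** 2 / v))
    logR -= logR.max()                      # stability before exponentiating
    R = np.exp(logR)
    return R / R.sum()
```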
Example: Hand-crafted model
G_1 = shift left and up, G_2 = I, G_3 = shift right and up
l = 1, 2, 3; π_1 = 0.6, π_2 = 0.4; ρ_1 = ρ_2 = ρ_3 = 0.33
Example: Simulation
Sample a cluster from P(c) (here c = 1), sample a latent image z from p(z|c), sample a transformation from P(l) (here l = 1), then sample the observed image x from N(x; G_l z, Ψ). A second draw gives c = 2 and l = 3, producing a differently shifted image. (Sampled images shown on the slides.)
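A minimal sketch of this generative process, using the same permutation representation of G_l as above:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_tmg(pi, rho, mu, var, psi, perms):
    """c ~ P(c); z ~ N(mu_c, Phi_c); l ~ P(l); x ~ N(G_l z, Psi)."""
    c = rng.choice(len(pi), p=pi)
    z = mu[c] + np.sqrt(var[c]) * rng.standard_normal(mu.shape[1])
    l = rng.choice(len(rho), p=rho)
    x = z[perms[l]] + np.sqrt(psi) * rng.standard_normal(mu.shape[1])
    return c, l, z, x
```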
ML estimation of a Transformed Mixture of Gaussians using EM
E step: compute P(l|x), P(c|x) and p(z|c,x) for each x in the data
M step: set
– π_c = avg of P(c|x)
– ρ_l = avg of P(l|x)
– μ_c = avg mean of p(z|c,x)
– Φ_c = avg variance of p(z|c,x)
– Ψ = avg variance of p(x - G_l z | x)
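The E step needs the posterior over the latent image for each (c, l). A minimal sketch for the permutation/diagonal case, using standard conjugate-Gaussian algebra (the elementwise form is a consequence of those assumptions, not something stated on the slide):

```python
import numpy as np

def posterior_z(x, mu_c, var_c, psi, perm):
    """p(z | x, c, l) for (G_l z)[i] = z[perm[i]]. Returns (mean, variance)."""
    inv_perm = np.argsort(perm)                 # inverse permutation
    prec = 1.0 / var_c + (1.0 / psi)[inv_perm]  # posterior precision (diagonal)
    post_var = 1.0 / prec
    post_mean = post_var * (mu_c / var_c + (x / psi)[inv_perm])
    return post_mean, post_var
```

The M-step quantities on the slide are then responsibility-weighted averages of these per-(c, l) posterior means and variances over the training images.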
A Tough Toy Problem
– 4 different shapes
– 25 possible locations
– cluttered background
– fixed distraction
– 100 "clusters"
– 200 training cases
Mixture of Gaussians: mean and first 5 principal components.
Transformed Mixture of Gaussians: 5 horizontal shifts + 5 vertical shifts, 20 iterations of EM.
Face Clustering
Examples from a set of 400 outdoor images of 2 people (44 x 28 pixels).
Mixture of Gaussians: 15 iterations of EM (MATLAB takes 1 minute). Cluster means shown for c = 1, 2, 3, 4.
Transformed mixture of Gaussians
– 11 horizontal shifts; 11 vertical shifts
– 4 clusters
– each cluster has 1 mean and 1 variance for each latent pixel, plus 1 variance for each observed pixel
– training: 15 iterations of EM (MATLAB script takes 10 sec/image)
Transformed mixture of Gaussians: cluster means for c = 1, 2, 3, 4, shown at initialization and after 1-15, 20 and 30 iterations of EM.
Mixture of Gaussians, for comparison: cluster means for c = 1, 2, 3, 4 after 30 iterations of EM.
Modeling Written Digits
A TMG that Captures Writing Angle
P(l|x) identifies the writing angle in image x. (Figure: learned means laid out on a grid of clusters x transformations.)
Wrap-up
– MATLAB scripts available at ...
– Other domains: audio, bioinformatics, ...
– Other latent image models p(z): factor analysis (probabilistic PCA) (ICCV99); mixtures of factor analyzers (NIPS99); time series (CVPR00)
– Automatic video clustering
– Fast variational inference and learning