Maximum likelihood separation of spatially autocorrelated images using a Markov model
Shahram Hosseini 1, Rima Guidara 1, Yannick Deville 1 and Christian Jutten 2
1. Laboratoire d’Astrophysique de Toulouse-Tarbes (LATT), Observatoire Midi-Pyrénées - Université Paul Sabatier Toulouse 3, 14 A. Edouard Belin, Toulouse, France.
2. Laboratoire des Images et des Signaux, UMR CNRS-INPG-UPS, 46 Avenue Félix Viallet, Grenoble, France.
MAXENT 2006, July 8-13, Paris, France.

Slide 2: OUTLINE
- Problem statement
- A maximum likelihood approach using a Markov model
  - Second-order Markov random field
  - Score function estimation
  - Gradient-based optimisation algorithm
- Experimental results
- Conclusion

Slide 3: Problem statement
Assumptions:
- Linear instantaneous mixture: K unknown independent source images, K observations, N = N1 × N2 samples per image.
- The unknown mixing matrix A is invertible.
- Each source is spatially autocorrelated and can be modeled as a second-order Markov random field.
Goal: compute the separating matrix B by a maximum likelihood (ML) approach.
[Diagram omitted: the mixing matrix A produces the observations; the separating matrix B reconstructs the sources.]
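For reference, the assumed model can be written compactly in the notation usual for blind source separation (the symbols x, s, y below are our own shorthand, not reproduced from the slide):

```latex
x(n_1,n_2) = A\, s(n_1,n_2), \qquad
y(n_1,n_2) = B\, x(n_1,n_2), \qquad
B \approx A^{-1} \ \text{(up to scaling and permutation)}
```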

Slide 4: Motivations
Maximum likelihood approach:
- Provides an asymptotically efficient estimator (smallest error covariance matrix among unbiased estimators).
Modeling the source images by Markov random fields:
- Most real images exhibit spatial autocorrelation between nearby pixels.
- Spatial autocorrelation can make the model identifiable where basic blind source separation methods fail, e.g. when the source images are Gaussian but spatially autocorrelated.
- Markov random fields allow this spatial autocorrelation to be taken into account without any a priori assumption on the probability density of the sources.

Slide 5: ML approach (1)
- Independence of the sources ⇒ the likelihood factorizes across sources [equation omitted in the transcript].
- We denote the joint PDF of all the samples of all the components of the observation vector x [symbol omitted].
- Maximum likelihood estimate: maximize this joint PDF with respect to B [equation omitted], where the factors are the joint PDFs of all the samples of each source s_i.
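A hedged reconstruction of the omitted factorization, in the form usual for maximum likelihood ICA (the symbols f_x, f_{s_i} and the |det B|^N Jacobian term are assumptions, not read from the slide):

```latex
\hat{B}_{ML} = \arg\max_{B} f_x(\text{all samples of } x), \qquad
f_x = |\det B|^{N} \prod_{i=1}^{K} f_{s_i}\big(\text{all samples of } y_i\big),
\quad y = Bx .
```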

Slide 6: ML approach (2)
- Decomposition of the joint PDF of each source using the Bayes rule [equation omitted in the transcript].
- Many sweeping trajectories preserve continuity and can exploit the spatial autocorrelation within the image [three trajectory diagrams omitted].
- Since these sweeping schemes are essentially equivalent, we chose the horizontal one.

Slide 7: ML approach (3)
- Bayes rule decomposition resulting from a horizontal sweeping [equation omitted in the transcript].
- To simplify this factorization, the sources are modeled by second-order Markov random fields: the conditional PDF of a pixel given all the remaining pixels of the image equals its conditional PDF given its 8 nearest neighbors.
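Putting the two statements in symbols (the notation f and the wording of the conditioning sets are ours; the slide's own equations are not in the transcript), the horizontal sweep and the second-order Markov property read, for each source s_i:

```latex
f\big(\text{all samples of } s_i\big)
  = \prod_{n_1,n_2} f\!\big(s_i(n_1,n_2)\,\big|\,\text{pixels already visited by the horizontal sweep}\big),
```
```latex
f\!\big(s_i(n_1,n_2)\,\big|\,\text{all other pixels}\big)
  = f\!\big(s_i(n_1,n_2)\,\big|\,\text{its 8 nearest neighbors}\big).
```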

Slide 8: ML approach (4)
- [Symbol omitted in the transcript] is the set of the predecessors of a pixel, in the sense of the horizontal sweeping trajectory.
- For a pixel not located on the boundary of the image, we obtain [equation omitted]. Denote [definition omitted].
- If the image is large enough, the pixels situated on the boundaries can be neglected, and we can then write [equation omitted].
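A plausible reading of the omitted interior-pixel equation, combining the horizontal sweep with the second-order Markov assumption (the specific set of four already-visited neighbors below is an assumption consistent with such a sweep, not copied from the slide):

```latex
f\!\big(s_i(n_1,n_2)\,\big|\,\text{predecessors}\big)
 = f\!\big(s_i(n_1,n_2)\,\big|\,
     s_i(n_1,n_2-1),\ s_i(n_1-1,n_2-1),\ s_i(n_1-1,n_2),\ s_i(n_1-1,n_2+1)\big)
```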

Slide 9: ML approach (5)
- The initial joint PDF to be maximized [equation omitted in the transcript].
- Taking the logarithm of this criterion, dividing it by N, and defining the spatial average operator [definition omitted], the log-likelihood L1 can finally be written as [equation omitted].
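A hedged reconstruction of the omitted log-likelihood, writing Ê for the spatial average over pixels and y = Bx for the reconstructed sources (the exact conditioning set follows the assumption made above and is not read from the slide):

```latex
L_1 = \sum_{i=1}^{K} \hat{E}_{n_1,n_2}\!\left[
        \log f\!\big(y_i(n_1,n_2)\,\big|\,\text{its already-visited neighbors}\big)
      \right] + \log\left|\det B\right|
```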

Slide 10: ML approach (6)
- Taking the derivative of L1 with respect to the separating matrix B, we have [equation omitted in the transcript].
- We define the conditional score function of the source s_i with respect to the pixel [symbol omitted] by [equation omitted].
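The definition of a conditional score function is standard and can be sketched as follows; the pixel-offset notation (k, l) is introduced here for illustration and is not taken from the slide:

```latex
\psi_{s_i}^{(k,l)}(n_1,n_2) \;=\;
  -\,\frac{\partial}{\partial s_i(n_1-k,\,n_2-l)}
     \log f\!\big(s_i(n_1,n_2)\,\big|\,\text{its already-visited neighbors}\big)
```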

Slide 11: ML approach (7)
Denoting:
- [symbol omitted] the column vector whose components are the conditional score functions of the K sources,
- [symbol omitted] the K-dimensional vector of observations,
we finally obtain [equation omitted], where [definition omitted].
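For orientation only: in maximum likelihood ICA the corresponding gradient typically takes the form below. Whether the slide's equation matches it exactly cannot be checked from the transcript, and here the conditional score terms would additionally be summed over the neighborhood offsets:

```latex
\frac{\partial L_1}{\partial B}
 \;=\; \big(B^{T}\big)^{-1}
   \;-\; \hat{E}_{n_1,n_2}\!\left[\,\boldsymbol{\psi}_{y}(n_1,n_2)\;\mathbf{x}(n_1,n_2)^{T}\right]
```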

Slide 12: Estimation of score functions
- The conditional score functions must be estimated to solve our ML problem. They can only be estimated via the reconstructed sources y_i(n1, n2).
- We used the method proposed in [D.-T. Pham, IEEE Trans. on Signal Processing, Oct. 2004]:
  - a non-parametric kernel density estimator using third-order cardinal spline kernels,
  - estimation of joint entropies using a discrete Riemann sum.
- No prior knowledge of the source distributions is needed, and the conditional score functions are well estimated.
- However, the method is very time consuming, especially for large images.
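To make this step concrete, here is a minimal sketch of a non-parametric conditional score estimate. It substitutes a Gaussian product kernel for the third-order cardinal spline kernels of Pham's method and evaluates the score directly from the joint density, so it illustrates the general idea rather than the estimator actually used in the talk; the function name and the O(N^2) implementation are our own.

```python
import numpy as np

def conditional_score_kde(y0, neighbors, bandwidth=0.3):
    """Estimate psi(n) = -d/dy0 log f(y0(n) | neighbors(n)) at every sample,
    using a Gaussian product kernel on the joint samples (y0, neighbors).
    Since f(neighbors) does not depend on y0, the conditional score equals
    the derivative of -log of the JOINT density with respect to y0.

    y0        : (N,) array of current-pixel values of one estimated source
    neighbors : (N, d) array of the corresponding neighbor-pixel values
    """
    z = np.column_stack([y0, neighbors])            # (N, d+1) joint samples
    diff = (z[:, None, :] - z[None, :, :]) / bandwidth
    w = np.exp(-0.5 * np.sum(diff ** 2, axis=2))    # (N, N) kernel weights
    # d f / d y0 at sample i: sum_j of -((y0_i - y0_j) / h^2) * w_ij (common constants cancel)
    num = np.sum(-(diff[:, :, 0] / bandwidth) * w, axis=1)
    den = np.sum(w, axis=1) + 1e-12                 # joint density, same constants
    return -num / den                               # -(df/dy0) / f
```

This direct form is quadratic in the number of pixels, which is exactly why the spline-kernel and binning machinery of the cited method matters for full-size images.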

Slide 13: An equivariant algorithm
- Initialize B = I.
- Repeat until convergence:
  - Compute the estimated sources y = Bx and normalize them to unit power.
  - Estimate the conditional score functions.
  - Compute the matrix [equation omitted in the transcript].
  - Update B [update rule omitted].
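A minimal Python sketch of this loop, assuming the omitted matrix and update follow the classical relative (natural) gradient form I - Ê[psi(y) y^T]; `estimate_scores` is a hypothetical helper standing in for the kernel estimator sketched above.

```python
import numpy as np

def separate(x, estimate_scores, step=0.1, n_iter=100, tol=1e-6):
    """Sketch of the equivariant gradient loop outlined on the slide.

    x               : (K, N) mixed images, each flattened to N pixels
    estimate_scores : callable y -> (K, N) conditional score functions of the
                      current estimated sources (hypothetical helper)
    """
    K = x.shape[0]
    B = np.eye(K)
    for _ in range(n_iter):
        y = B @ x
        y = y / y.std(axis=1, keepdims=True)      # normalize sources to unit power
        psi = estimate_scores(y)                  # conditional score functions
        H = psi @ y.T / y.shape[1]                # spatial average of psi(y) y^T
        G = np.eye(K) - H                         # assumed relative-gradient matrix
        B = B + step * (G @ B)                    # equivariant (relative) update
        if np.linalg.norm(G) < tol:
            break
    return B, y
```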

Slide 14: OUTLINE
- Problem statement
- A maximum likelihood approach
  - Second-order Markov model
  - Score function estimation
  - Gradient optimisation algorithm
- Experimental results
- Conclusion

Slide 15: Experimental results
Comparison with two classical methods:
1. SOBI algorithm
   - A second-order method: joint diagonalisation of covariance matrices evaluated at different lags.
   - Exploits autocorrelation but ignores possible non-Gaussianity.
2. Pham-Garat algorithm
   - A maximum likelihood approach in which the sources are assumed i.i.d.
   - Exploits non-Gaussianity but ignores possible autocorrelation.

Slide 16: Artificial data (1)
Generation of two autocorrelated images:
1. Generate two independent white, uniformly distributed noise images.
2. Filter these i.i.d. noise images by two Infinite Impulse Response (IIR) filters [filter equations omitted in the transcript].
In this case, the generated images perfectly satisfy the working hypotheses: the source images are stationary, second-order Markov random fields.
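One illustrative way to produce such images, assuming a simple separable first-order 2-D autoregressive (IIR) filter; the actual filters and parameter values of the experiment are not given in the transcript, and the function below is our own sketch.

```python
import numpy as np

def make_autocorrelated_image(shape, rho1, rho2, rng):
    """Filter a white, uniformly distributed noise image with a separable
    first-order 2-D autoregressive (IIR) filter (illustrative only)."""
    n1, n2 = shape
    e = rng.uniform(-1.0, 1.0, size=shape)        # white, uniformly distributed noise
    s = np.zeros(shape)
    for i in range(n1):
        for j in range(n2):
            s[i, j] = (e[i, j]
                       + (rho1 * s[i - 1, j] if i > 0 else 0.0)
                       + (rho2 * s[i, j - 1] if j > 0 else 0.0)
                       - (rho1 * rho2 * s[i - 1, j - 1] if i > 0 and j > 0 else 0.0))
    return s

rng = np.random.default_rng(0)
s1 = make_autocorrelated_image((256, 256), 0.5, 0.5, rng)   # first source image
s2 = make_autocorrelated_image((256, 256), 0.3, 0.9, rng)   # second source image
```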

Slide 17: Signal to Interference Ratio (SIR)
- Mixing matrix: [matrix omitted in the transcript].
- The mean of the SIR over 100 Monte Carlo simulations is computed and plotted as a function of the filter parameter ρ22 [plot omitted].
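The SIR itself is not defined in the transcript; a common definition in blind source separation, which we assume here, is the output signal-to-interference ratio in decibels for each source:

```latex
\mathrm{SIR}_i \;=\; 10\,\log_{10}
  \frac{\hat{E}\big[s_i^2\big]}{\hat{E}\big[(y_i - s_i)^2\big]}
\qquad \text{(sources and estimates normalized, permutation resolved).}
```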

Slide 18: Artificial data (2)
- Artificial data are now generated by filtering i.i.d. noise images with two Finite Impulse Response (FIR) filters.
- The resulting source images are stationary but can no longer be modeled by second-order Markov random fields.
- The mean of the SIR over 100 Monte Carlo simulations is computed and plotted as a function of the selectivity of one of the filters [plot omitted].

Slide 19: Astrophysical images (1)
- The working hypotheses no longer hold: the images are non-stationary and not second-order Markov random fields.
- Weak mixture: SIR = 70 dB.
- Strong mixture: separation failed because of a poor initial estimation of the conditional score functions (the initially estimated sources are very different from the actual sources).

Slide 20: Astrophysical images (2)
SIR comparison:
- Markov method: 70 dB
- Pham-Garat: 13 dB
- SOBI: 36 dB
Solution for the strong mixture: initialize our method with a sub-optimal algorithm such as SOBI, so as to start from a low residual mixture ratio.

Slide 21: To conclude…
- A quasi-optimal maximum likelihood method taking into account both non-Gaussianity and spatial autocorrelation has been proposed.
- Good performance has been achieved on both artificial and real data.
- The method is very time consuming. Possible solutions to reduce the computational cost:
  - a parametric polynomial estimator of the conditional score functions,
  - a modified equivariant Newton optimization algorithm.