Image Modeling & Segmentation


Image Modeling & Segmentation
Aly Farag and Asem Ali, Lecture #3

Parametric methods
These methods are useful when the underlying distribution is known in advance, or is simple enough to be modeled by a simple distribution function or a mixture of such functions. The parametric model is very compact (low memory and CPU usage), since only a few parameters need to be fitted. The model's parameters are estimated with methods such as maximum likelihood estimation, Bayesian estimation, and expectation maximization. A location parameter simply shifts the graph left or right along the horizontal axis; a scale parameter stretches (>1) or compresses (<1) the pdf.
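As a quick illustration (standard definition, not from the original slides): a density in a location-scale family can be written as

```latex
\[
p(x \mid \mu, \sigma) \;=\; \frac{1}{\sigma}\, f\!\left(\frac{x-\mu}{\sigma}\right),
\qquad \sigma > 0 ,
\]
```

where the location parameter \(\mu\) shifts the graph along the horizontal axis and the scale parameter \(\sigma\) stretches (\(\sigma > 1\)) or compresses (\(\sigma < 1\)) the standard density \(f\).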

Parametric methods (1- Maximum Likelihood Estimator: MLE)
Suppose that n samples x1, x2, …, xn are drawn independently and identically distributed (i.i.d.) from a distribution φ(θ) with parameter vector θ = (θ1, …, θr).
Known: the data samples and the distribution type. Unknown: θ.
The MLE method estimates θ by maximizing the log likelihood of the data, written p(x | θ) to show the dependence of p on θ explicitly. By the i.i.d. assumption the likelihood factorizes over the samples, and by the monotonicity of the log, maximizing the log likelihood is equivalent to maximizing the likelihood itself.
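The likelihood expressions this slide refers to are the standard ones (reconstructed here, since the original equation images are not part of the transcript):

```latex
\[
L(\theta) \;=\; p(x_1,\dots,x_n \mid \theta)
\;\overset{\text{i.i.d.}}{=}\; \prod_{k=1}^{n} p(x_k \mid \theta),
\qquad
\hat{\theta}_{\mathrm{MLE}}
\;=\; \arg\max_{\theta} \ln L(\theta)
\;=\; \arg\max_{\theta} \sum_{k=1}^{n} \ln p(x_k \mid \theta).
\]
```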

Parametric methods (1- Maximum Likelihood Estimator: MLE)
Let ℓ(θ) denote the log likelihood of the data. Then calculate its derivative with respect to θ, and find θ by letting the derivative equal zero. Coin Example ………………..
In some cases we can find a closed form for θ. Example: suppose that n samples x1, x2, …, xn are drawn independently and identically distributed (i.i.d.) from a 1-D N(μ, σ); find the MLE of μ and σ. (MATLAB demo.)
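For this Gaussian example the closed forms are the familiar sample mean and (biased) sample variance, \(\hat{\mu} = \tfrac{1}{n}\sum_k x_k\) and \(\hat{\sigma}^2 = \tfrac{1}{n}\sum_k (x_k - \hat{\mu})^2\). The MATLAB demo itself is not included in the transcript; a minimal Python sketch of the same experiment (illustrative only) could be:

```python
# Minimal sketch (not the original MATLAB demo): MLE of mu and sigma for
# i.i.d. 1-D Gaussian samples, using the closed-form estimates.
import numpy as np

rng = np.random.default_rng(42)
true_mu, true_sigma, n = 5.0, 2.0, 10_000
x = rng.normal(true_mu, true_sigma, n)

mu_hat = x.mean()                                # sample mean
sigma_hat = np.sqrt(((x - mu_hat) ** 2).mean())  # sqrt of biased sample variance

print(f"mu_hat    = {mu_hat:.3f}  (true {true_mu})")
print(f"sigma_hat = {sigma_hat:.3f}  (true {true_sigma})")
```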

Parametric methods (1- Maximum Likelihood Estimator: MLE)
An estimator of a parameter is unbiased if the expected value of the estimate equals the true value of the parameter. Example: the sample mean as an estimate of μ.
An estimator of a parameter is biased if the expected value of the estimate differs from the true value of the parameter. Example: the MLE of the variance; its bias doesn't make much difference once n becomes large.
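For the Gaussian MLEs above, the standard results (reconstructed, since the slide gives the examples only as figures) are

```latex
\[
\mathbb{E}[\hat{\mu}] = \mu \quad \text{(unbiased)},
\qquad
\mathbb{E}\!\left[\hat{\sigma}^2_{\mathrm{MLE}}\right]
= \frac{n-1}{n}\,\sigma^2 \quad \text{(biased)},
\]
```

so the bias of the variance estimate shrinks like 1/n and is negligible for large n.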

Parametric methods
What if there are distinct subpopulations in the observed data? Example: in 1894, Pearson tried to model the distribution of the ratio between forehead and body-length measurements of crabs. He used a two-component mixture; it was hypothesized that the two-component structure was related to the possibility of this particular population of crabs evolving into two new subspecies.
Mixture model: the underlying density is assumed to have the form of a weighted sum of component densities (see below). The components of the mixture are densities parameterized by θ_j, and the weights w_j are constrained to sum to one. What is the difference between a mixture model and a kernel-based estimator?
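The mixture form this slide refers to is the standard one (writing the component parameters as θ_j and the weights as w_j):

```latex
\[
p(x \mid \Theta) \;=\; \sum_{j=1}^{M} w_j \, p_j(x \mid \theta_j),
\qquad
\sum_{j=1}^{M} w_j = 1, \quad w_j \ge 0 .
\]
```

Unlike a kernel-based estimator, which places one kernel at every sample, a mixture model uses a small, fixed number M of parameterized components.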

Parametric methods
Example: given n samples {xi, Ci1, Ci2} (complete data) drawn i.i.d. from a mixture of two normal distributions, where xi is the observed value of the ith instance and Ci1, Ci2 indicate which of the two normal distributions was used to generate xi (Cij = 1 if the jth distribution generated xi, 0 otherwise), the parameters can be estimated by MLE (see the complete-data log likelihood below). But how can we estimate the parameters given incomplete data, where Ci1 and Ci2 are not known?
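With complete data the indicators pick out the generating component, so the log likelihood decouples into per-component Gaussian fits (standard form, reconstructed here):

```latex
\[
\ln p\bigl(\{x_i, C_{i1}, C_{i2}\}_{i=1}^{n} \,\big|\, \Theta\bigr)
\;=\; \sum_{i=1}^{n} \sum_{j=1}^{2} C_{ij}
\Bigl[\ln w_j + \ln \mathcal{N}\bigl(x_i \mid \mu_j, \sigma_j^2\bigr)\Bigr],
\]
```

so each μ_j, σ_j² is just the MLE computed from the samples with C_ij = 1.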

Parametric methods (2- Expectation Maximization: EM)
The EM algorithm is a general method for finding the maximum-likelihood estimate of the parameters of an underlying distribution from a given data set when the data is incomplete or has missing values.
EM algorithm: given initial parameters Θ0, repeatedly re-estimate the expected values of the hidden binary variables Cij, then recalculate the MLE of Θ using these expected values in place of the hidden variables.
Note: EM is an unsupervised method, whereas MLE on the complete data above is supervised. To use EM you must know the number of classes K and the parametric form of the distribution.

Illustrative example: complete vs. incomplete data (figure).

Illustrative example (continued, figure).

Parametric methods (2- Expectation Maximization: EM)
Assume a joint density function p(x, C | Θ) for the complete data set. The EM algorithm first finds the expected value of the complete-data log likelihood with respect to the unknown data C, given the observed data x and the current parameter estimates Θ^(i-1). Here Θ^(i-1) are the current parameter estimates used to evaluate the expectation, and Θ are the new parameters that we optimize to maximize Q. The evaluation of this expectation is called the E-step of the algorithm. The second step (the M-step) is to maximize the expectation computed in the first step. These two steps are repeated as necessary; each iteration is guaranteed not to decrease the log likelihood, and the algorithm converges to a local maximum of the likelihood function.
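The expectation described here is the usual EM Q-function (reconstructed, as the slide's equations are not in the transcript):

```latex
\[
Q\bigl(\Theta, \Theta^{(i-1)}\bigr)
\;=\; \mathbb{E}\Bigl[\, \ln p(\mathbf{x}, C \mid \Theta)
\;\Big|\; \mathbf{x}, \Theta^{(i-1)} \Bigr],
\qquad
\Theta^{(i)} \;=\; \arg\max_{\Theta}\, Q\bigl(\Theta, \Theta^{(i-1)}\bigr).
\]
```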

Parametric methods (2- Expectation Maximization: EM)
The mixture-density parameter estimation problem. In the E-step, using Bayes' rule we can compute the posterior probability that component j generated sample x_k, given the current parameter estimates; these posteriors are then used in the M-step updates. Grades Example ………………..
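The Bayes-rule computation in the E-step gives the posterior "responsibility" of component j for sample x_k (standard form, reconstructed here):

```latex
\[
p\bigl(j \mid x_k, \Theta^{(i-1)}\bigr)
\;=\;
\frac{w_j^{(i-1)}\, p_j\bigl(x_k \mid \theta_j^{(i-1)}\bigr)}
     {\sum_{l=1}^{M} w_l^{(i-1)}\, p_l\bigl(x_k \mid \theta_l^{(i-1)}\bigr)} .
\]
```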

Parametric methods (2- Expectation Maximization: EM)
For some distributions it is possible to get analytical expressions for the parameter updates. For example, if we assume d-dimensional Gaussian component distributions, both the E-step and the M-step have closed forms (see below).
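The analytical expressions referred to here are the standard Gaussian-mixture updates (reconstructed, since the slide shows them only as images). With responsibilities r_kj = p(j | x_k, Θ^(i-1)) from the E-step, the M-step is

```latex
\[
w_j^{\text{new}} = \frac{1}{n} \sum_{k=1}^{n} r_{kj},
\qquad
\mu_j^{\text{new}} = \frac{\sum_{k} r_{kj}\, x_k}{\sum_{k} r_{kj}},
\qquad
\Sigma_j^{\text{new}} =
\frac{\sum_{k} r_{kj}\,\bigl(x_k - \mu_j^{\text{new}}\bigr)\bigl(x_k - \mu_j^{\text{new}}\bigr)^{\!\top}}
     {\sum_{k} r_{kj}} .
\]
```

A compact NumPy sketch of the two steps (an illustration of these standard updates, not the authors' code; the function name em_gmm is hypothetical):

```python
# EM for a d-dimensional Gaussian mixture: minimal E-step / M-step sketch.
import numpy as np
from scipy.stats import multivariate_normal

def em_gmm(X, K, n_iter=100, seed=0):
    n, d = X.shape
    rng = np.random.default_rng(seed)
    w = np.full(K, 1.0 / K)                       # mixing weights
    mu = X[rng.choice(n, size=K, replace=False)]  # initial means: random samples
    Sigma = np.stack([np.cov(X.T).reshape(d, d) + 1e-6 * np.eye(d)
                      for _ in range(K)])
    for _ in range(n_iter):
        # E-step: responsibilities r[k, j] = p(j | x_k, current parameters)
        r = np.column_stack([w[j] * multivariate_normal.pdf(X, mu[j], Sigma[j])
                             for j in range(K)])
        r /= r.sum(axis=1, keepdims=True)
        # M-step: closed-form updates of weights, means and covariances
        Nj = r.sum(axis=0)
        w = Nj / n
        mu = (r.T @ X) / Nj[:, None]
        for j in range(K):
            diff = X - mu[j]
            Sigma[j] = (r[:, j, None] * diff).T @ diff / Nj[j] + 1e-6 * np.eye(d)
    return w, mu, Sigma
```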

Parametric methods (2- Expectation Maximization: EM) Example: MATLAB demo
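The MATLAB demo itself is not part of the transcript. As a stand-in, the same experiment can be run with scikit-learn's EM-based mixture fitter (an illustrative sketch, not the original demo):

```python
# Hypothetical stand-in for the MATLAB demo: fit a 2-component Gaussian
# mixture to synthetic 1-D data with scikit-learn's EM implementation.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2.0, 0.8, 400),
                    rng.normal(3.0, 1.2, 600)]).reshape(-1, 1)

gmm = GaussianMixture(n_components=2, random_state=0).fit(x)
print("weights:", gmm.weights_)
print("means:  ", gmm.means_.ravel())
print("stds:   ", np.sqrt(gmm.covariances_).ravel())
```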