Expectation-Maximization (EM) Chapter 3 (Duda et al.) – Section 3.9


1 Expectation-Maximization (EM) Chapter 3 (Duda et al.) – Section 3.9
CS479/679 Pattern Recognition Dr. George Bebis

2 Expectation-Maximization (EM)
EM is an iterative method for performing ML estimation: it starts with an initial estimate of θ, then iteratively refines the current estimate so as to increase the likelihood of the observed data, p(D/θ).

3 Expectation-Maximization (EM)
EM represents a general framework; it works best in situations where the data is incomplete (or can be thought of as incomplete). Some creativity is required to recognize where the EM algorithm can be applied. It is the standard method for estimating the parameters of Mixtures of Gaussians (MoG).

4 Incomplete Data Often, it is impossible to apply ML estimation directly because certain features cannot be measured. The EM algorithm is ideal for problems with unobserved (missing) data.

5 Example (Moon, 1996) Assume a trinomial distribution over counts (x1, x2, x3) with x1 + x2 + x3 = k:

p(x1, x2, x3 / θ) = [k! / (x1! x2! x3!)] p1^x1 p2^x2 p3^x3

where the cell probabilities p1, p2, p3 depend on the unknown parameter θ.

6 Example (Moon, 1996) (cont’d)

7 EM: Main Idea If the complete data x were available, we could estimate θ directly by ML, i.e., by maximizing ln p(Dx/θ).
Since x is not available: maximize the expectation of ln p(Dx/θ) with respect to the unknown variables, given the observed data Dy and an estimate of θ.
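
Formally, this expectation is the standard Q-function of EM, stated here for concreteness (Dx denotes the complete data, Dy the observed data, and θ^t the current estimate):

```latex
Q(\theta;\,\theta^{t}) \;=\; E\big[\ln p(D_x/\theta)\,\big|\,D_y,\ \theta^{t}\big]
```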

8 EM Steps
(1) Initialization
(2) Expectation
(3) Maximization
(4) Test for convergence

9 EM Steps (cont’d) (1) Initialization step: initialize the algorithm with a guess θ0. (2) Expectation step: compute the expectation of the complete-data log-likelihood with respect to the unobserved variables, using the current estimate of the parameters and conditioned upon the observations. When ln p(Dx/θ) is a linear function of the unobserved variables, the expectation step is equivalent to replacing the unobserved variables by their conditional expectations given Dy and the current θ.
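
In that linear case, the E-step reduces to plugging the expected values of the unobserved variables z into the complete-data log-likelihood (a standard identity, stated here for completeness):

```latex
Q(\theta;\,\theta^{t}) \;=\; \ln p(D_x/\theta)\,\Big|_{\,z \,=\, E[z \,/\, D_y,\ \theta^{t}]}
```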

10 EM Steps (cont’d) (3) Maximization step: obtain a new estimate of the parameters by maximizing the expectation computed in Step 2. (4) Test for convergence: if ||θ^(t+1) − θ^t|| < ε, stop; otherwise, go to Step 2.
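
In symbols, the M-step and the convergence test are (ε is a user-chosen tolerance):

```latex
\theta^{t+1} \;=\; \arg\max_{\theta}\, Q(\theta;\,\theta^{t}), \qquad
\text{stop if } \lVert \theta^{t+1} - \theta^{t} \rVert < \varepsilon
```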

11 Example (Moon, 1996) (cont’d)
The complete-data likelihood is the trinomial pmf from slide 5, p(x1, x2, x3 / θ) = [k! / (x1! x2! x3!)] p1^x1 p2^x2 p3^x3. Suppose the complete counts x = (x1, x2, x3) are not fully observed.

12 Example (Moon, 1996) (cont’d)
Take the expected value of the complete-data log-likelihood with respect to the unobserved counts. Let’s look at the M-step for a minute before completing the E-step …

13 Example (Moon, 1996) (cont’d)
Maximizing the expected log-likelihood shows that we only need to estimate the conditional expectations of the unobserved counts. Let’s go back and complete the E-step now …

14 Example (Moon, 1996) (cont’d)
(see Moon’s paper, page 53, for a proof)

15 Example (Moon, 1996) (cont’d)
Initialization: choose an initial guess θ0.
Expectation step: compute the expected values of the unobserved counts, given the observed data and the current estimate of θ.
Maximization step: update θ in closed form from the expected counts.
Convergence step: stop when the estimates stop changing.
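
To make the recipe concrete, here is a minimal runnable sketch of EM for a trinomial incomplete-data problem in the spirit of this example. The specific cell probabilities p1 = 1/4, p2 = 1/4 + θ/4, p3 = 1/2 − θ/4 and the observation model (only y1 = x1 + x2 and y2 = x3 are observed) are our assumptions for illustration; see Moon’s paper for his exact setup.

```python
# EM sketch for a trinomial with incomplete data (assumed parameterization):
#   (x1, x2, x3) ~ Trinomial(k; p1, p2, p3), with
#   p1 = 1/4, p2 = 1/4 + theta/4, p3 = 1/2 - theta/4,
# where only y1 = x1 + x2 and y2 = x3 are observed (the x1/x2 split is hidden).

def em_trinomial(y1, y2, theta0=0.5, tol=1e-8, max_iter=100):
    theta = theta0                                  # initialization step
    for _ in range(max_iter):
        # E-step: given y1, x2 | y1 ~ Binomial(y1, p2 / (p1 + p2)),
        # so the expected unobserved count is:
        p1, p2 = 0.25, 0.25 + theta / 4.0
        x2_exp = y1 * p2 / (p1 + p2)
        # M-step: maximize the expected complete-data log-likelihood
        #   x2_exp * ln((1 + theta)/4) + y2 * ln((2 - theta)/4),
        # whose zero-derivative condition gives a closed-form update:
        theta_new = (2.0 * x2_exp - y2) / (x2_exp + y2)
        if abs(theta_new - theta) < tol:            # convergence step
            return theta_new
        theta = theta_new
    return theta

print(em_trinomial(y1=100, y2=25))  # converges to the observed-data MLE (1.2)
```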

16 Example (Moon, 1996) (cont’d)

17 Convergence properties of EM
The solution depends on the initial estimate θ0. At each iteration, a value of θ is computed so that the likelihood function does not decrease. There is no guarantee that EM will converge to a global maximum. The algorithm is guaranteed to be stable, i.e., there is no chance of "overshooting" or diverging from the maximum.

18 Expectation-Maximization (EM)
Recall (slide 3): EM is a general framework and is the standard method for estimating the parameters of Mixtures of Gaussians (MoG), to which we now turn.

19 Mixture of 2D Gaussians - Example

20 Mixture Model A mixture model combines K component densities with mixing weights π1, π2, π3, …, πK (diagram on slide).

21 Mixture of 1D Gaussians - Example
(Figure: a mixture of three 1D Gaussians with mixing weights π1 = 0.3, π2 = 0.2, π3 = 0.5.)

22 Mixture Parameters

23 Fitting a Mixture Model to a set of observations Dx
Two fundamental problems: (1) estimating the number of mixture components K, and (2) estimating the mixture parameters (πk, θk), k = 1, 2, …, K.

24 Mixtures of Gaussians (see Chapter 10)
p(x/θ) = Σk=1..K πk p(x/θk), where each component density p(x/θk) is a Gaussian:

p(x/θk) = 1 / ((2π)^(d/2) |Σk|^(1/2)) exp(−(1/2)(x − μk)^T Σk^(−1) (x − μk))

The parameters θk are (μk, Σk).
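
As a concrete illustration of this definition, a mixture density can be evaluated directly from it (a minimal NumPy sketch; the function and variable names are ours):

```python
import numpy as np

def gaussian_pdf(x, mu, Sigma):
    """Multivariate Gaussian density N(x; mu, Sigma)."""
    d = len(mu)
    diff = x - mu
    norm = (2 * np.pi) ** (d / 2) * np.sqrt(np.linalg.det(Sigma))
    return np.exp(-0.5 * diff @ np.linalg.solve(Sigma, diff)) / norm

def mixture_pdf(x, pis, mus, Sigmas):
    """p(x/theta) = sum_k pi_k * N(x; mu_k, Sigma_k)."""
    return sum(pi * gaussian_pdf(x, mu, S) for pi, mu, S in zip(pis, mus, Sigmas))
```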

25 Mixtures of Gaussians (cont’d)
(Diagram: the mixture density as a weighted sum of K Gaussian components with weights π1, π2, π3, …, πK.)

26 Estimating Mixture Parameters Using ML – not easy! The log-likelihood contains the logarithm of a sum over components, so setting its derivatives to zero does not yield closed-form solutions.

27 Estimating Mixture Parameters Using EM: Case of Unknown Means
Assumptions: the number of components and all parameters except the means μk are known. Observation: if we knew which component generated each sample, estimating the means would be easy … but we don’t!

28 Estimating Mixture Parameters Using EM: Case of Unknown Means (cont’d)
Introduce hidden or unobserved variables zi, where zik = 1 if sample xi was generated by the k-th component and zik = 0 otherwise.

29 Estimating Mixture Parameters Using EM: Case of Unknown Means (cont’d)
Main steps using EM

30 Estimating Mixture Parameters Using EM: Case of Unknown Means (cont’d)
Expectation Step

31 Estimating Mixture Parameters Using EM: Case of Unknown Means (cont’d)
Expectation Step

32 Estimating Mixture Parameters Using EM: Case of Unknown Means (cont’d)
Expectation Step

33 Estimating Mixture Parameters Using EM: Case of Unknown Means (cont’d)
Expectation Step E(zik) is just the probability that xi was generated by the k-th component, given xi and the current parameter estimates (by Bayes’ rule):

E(zik) = πk p(xi/μk) / Σj=1..K πj p(xi/μj)
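
A minimal NumPy sketch of this E-step for the unknown-means case (assuming, for this sketch, equal priors and a common known variance sigma2; all names are ours):

```python
import numpy as np

def e_step_means(X, mus, sigma2):
    """Responsibilities E[z_ik]: probability that x_i came from component k,
    assuming equal priors and a common known variance sigma2."""
    # Squared distances ||x_i - mu_k||^2, shape (n, K)
    d2 = ((X[:, None, :] - mus[None, :, :]) ** 2).sum(axis=2)
    w = np.exp(-d2 / (2 * sigma2))
    return w / w.sum(axis=1, keepdims=True)  # normalize over components
```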

34 Estimating Mixture Parameters Using EM: Case of Unknown Means (cont’d)
Maximization Step Each mean is updated as the responsibility-weighted average of the samples:

μk = Σi E(zik) xi / Σi E(zik)
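
The same update in NumPy (a sketch; X holds the samples as rows and R holds the responsibilities E(zik) computed by the E-step above):

```python
def m_step_means(X, R):
    """Update each mean as the responsibility-weighted sample average:
    mu_k = sum_i E[z_ik] x_i / sum_i E[z_ik]. Expects NumPy arrays."""
    return (R.T @ X) / R.sum(axis=0)[:, None]
```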

35 Estimating Mixture Parameters Using EM: Case of Unknown Means (cont’d)
Summary

36 Estimating Mixture Parameters Using EM: Case of Unknown Means (cont’d)
Summary
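
Putting the two steps together, a minimal sketch of the complete unknown-means EM loop (reusing the helper functions sketched above; all names are ours):

```python
import numpy as np

def em_unknown_means(X, mus0, sigma2, n_iter=100, tol=1e-6):
    """EM for Gaussian-mixture means only (known common variance, equal priors)."""
    mus = mus0.copy()
    for _ in range(n_iter):
        R = e_step_means(X, mus, sigma2)       # E-step: responsibilities
        new_mus = m_step_means(X, R)           # M-step: weighted means
        if np.abs(new_mus - mus).max() < tol:  # convergence test
            return new_mus
        mus = new_mus
    return mus
```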

37 Estimating Mixture Parameters Using EM: General Case
Need to review Lagrange Optimization first …

38 Lagrange Optimization
To maximize f(x) subject to a constraint g(x) = 0, form the Lagrangian L(x, λ) = f(x) + λ g(x), set its gradient with respect to x to zero, and solve for x and λ together with g(x) = 0: n+1 equations in n+1 unknowns.

39 Lagrange Optimization (cont’d)
Example: maximize f(x1,x2) = x1x2 subject to the constraint g(x1,x2) = x1 + x2 − 1 = 0. This gives 3 equations in 3 unknowns.
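
Working the example through (the algebra is standard):

```latex
L(x_1,x_2,\lambda) = x_1 x_2 + \lambda\,(x_1 + x_2 - 1)\\
\frac{\partial L}{\partial x_1} = x_2 + \lambda = 0,\qquad
\frac{\partial L}{\partial x_2} = x_1 + \lambda = 0,\qquad
x_1 + x_2 - 1 = 0\\
\Rightarrow\; x_1 = x_2 = \tfrac{1}{2},\qquad \lambda = -\tfrac{1}{2}
```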

40 Estimating Mixture Parameters Using EM: General Case
Introduce hidden or unobserved variables zi, defined as before: zik = 1 if xi was generated by the k-th component, and 0 otherwise.

41 Estimating Mixture Parameters Using EM: General Case (cont’d)
Expectation Step

42 Estimating Mixture Parameters Using EM: General Case (cont’d)
Expectation Step (cont’d)

43 Estimating Mixture Parameters Using EM: General Case (cont’d)
Expectation Step (cont’d)

44 Estimating Mixture Parameters Using EM: General Case (cont’d)
Maximization Step Maximize the expected complete-data log-likelihood subject to the constraint Σk πk = 1; use Lagrange optimization for the mixing weights.
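
A sketch of the Lagrange step for the mixing weights (standard derivation; only the πk-dependent part of the expected log-likelihood is shown, and n is the number of samples):

```latex
L = \sum_{i=1}^{n}\sum_{k=1}^{K} E[z_{ik}]\,\ln \pi_k
    \;+\; \lambda\Big(\sum_{k=1}^{K}\pi_k - 1\Big)\\
\frac{\partial L}{\partial \pi_k} = \frac{\sum_i E[z_{ik}]}{\pi_k} + \lambda = 0
\;\Rightarrow\; \pi_k = \frac{1}{n}\sum_{i=1}^{n} E[z_{ik}]
```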

45 Estimating Mixture Parameters Using EM: General Case (cont’d)
Maximization Step (cont’d)

46 Estimating Mixture Parameters Using EM: General Case (cont’d)
Summary

47 Estimating Mixture Parameters Using EM: General Case (cont’d)
Summary
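
For reference, a minimal end-to-end sketch of EM for a general Gaussian mixture, combining the updates above (all names are ours; a production implementation would add log-domain arithmetic and covariance regularization):

```python
import numpy as np

def em_gmm(X, pis, mus, Sigmas, n_iter=100):
    """EM for a Gaussian mixture: updates mixing weights, means, covariances."""
    n, d = X.shape
    K = len(pis)
    for _ in range(n_iter):
        # E-step: r_ik = pi_k N(x_i; mu_k, Sigma_k) / sum_j pi_j N(x_i; mu_j, Sigma_j)
        R = np.empty((n, K))
        for k in range(K):
            diff = X - mus[k]
            inv = np.linalg.inv(Sigmas[k])
            norm = (2 * np.pi) ** (d / 2) * np.sqrt(np.linalg.det(Sigmas[k]))
            quad = np.einsum('ij,jk,ik->i', diff, inv, diff)  # (x-mu)^T S^-1 (x-mu)
            R[:, k] = pis[k] * np.exp(-0.5 * quad) / norm
        R /= R.sum(axis=1, keepdims=True)
        # M-step
        Nk = R.sum(axis=0)                  # effective counts per component
        pis = Nk / n                        # pi_k = (1/n) sum_i E[z_ik]
        mus = (R.T @ X) / Nk[:, None]       # responsibility-weighted means
        for k in range(K):
            diff = X - mus[k]
            Sigmas[k] = (R[:, k, None] * diff).T @ diff / Nk[k]
    return pis, mus, Sigmas
```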

48 Estimating the Number of Components K

