Expectation-Maximization (EM)
Chapter 3 (Duda et al.), Section 3.9
CS479/679 Pattern Recognition, Dr. George Bebis
Expectation-Maximization (EM)
EM is an iterative method for ML estimation: it starts with an initial estimate of θ and iteratively refines that estimate so as to increase the likelihood of the observed data, p(D | θ).
Expectation-Maximization (EM)
EM is a general framework; it works best in situations where the data is incomplete (or can be thought of as incomplete). Some creativity is required to recognize where the EM algorithm can be applied. It is the standard method for estimating the parameters of Mixtures of Gaussians (MoG).
Incomplete Data
Often ML estimation cannot be applied directly because certain features cannot be measured. The EM algorithm is well suited to problems with unobserved (missing) data.
Example (Moon, 1996)
Assume the complete data x = (x1, x2, x3) follows a trinomial distribution with x1 + x2 + x3 = k:
p(x1, x2, x3 | θ) = [k! / (x1! x2! x3!)] p1^x1 p2^x2 p3^x3,
where the cell probabilities p1, p2, p3 depend on the unknown parameter θ.
EM: Main Idea
If x were available, we could use ML to estimate θ, i.e., maximize ln p(Dx | θ).
Since x is not available, maximize instead the expectation of ln p(Dx | θ) with respect to the unknown variables, given Dy and the current estimate of θ.
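In symbols, each iteration maximizes the auxiliary function Q (standard EM notation; θ^t denotes the current estimate):

```latex
Q(\theta;\theta^t) = E\big[\ln p(D_x \mid \theta) \,\big|\, D_y, \theta^t\big],
\qquad \theta^{t+1} = \arg\max_{\theta} Q(\theta;\theta^t).
```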
EM Steps
(1) Initialization
(2) Expectation
(3) Maximization
(4) Test for convergence
EM Steps (cont’d)
(1) Initialization step: initialize the algorithm with a guess θ0.
(2) Expectation step: the expectation is taken with respect to the unobserved variables, using the current parameter estimate and conditioned on the observations. When ln p(Dx | θ) is a linear function of the unobserved variables, the expectation step reduces to substituting the expected values of the unobserved variables into ln p(Dx | θ).
EM Steps (cont’d)
(3) Maximization step: obtain a new estimate of the parameters by maximizing the expected log-likelihood: θ^{t+1} = argmax_θ Q(θ; θ^t).
(4) Test for convergence: if the change in the estimate is below a threshold, e.g., ||θ^{t+1} − θ^t|| < ε, stop; otherwise, go to Step 2.
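The four steps map directly onto a short loop. A minimal Python sketch, assuming the caller supplies the problem-specific pieces (the names e_step, m_step, and the tolerance eps are illustrative, not from the slides):

```python
import numpy as np

def em(theta0, e_step, m_step, eps=1e-6, max_iter=1000):
    """Generic EM loop: alternate E and M steps until the estimate stabilizes.

    e_step(theta) -> expected values of the unobserved variables
                     given the observations and theta
    m_step(stats) -> new parameter estimate maximizing the expected
                     complete-data log-likelihood
    """
    theta = theta0                                    # (1) Initialization
    for _ in range(max_iter):
        stats = e_step(theta)                         # (2) Expectation
        theta_new = m_step(stats)                     # (3) Maximization
        if np.linalg.norm(np.atleast_1d(theta_new)
                          - np.atleast_1d(theta)) < eps:
            return theta_new                          # (4) Converged
        theta = theta_new
    return theta
```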
Example (Moon, 1996) (cont’d)
Suppose the cell probabilities p1, p2, p3 are specific functions of a single unknown parameter θ (Moon’s paper gives the exact parameterization); the complete-data likelihood keeps the trinomial form
p(x | θ) = [k! / (x1! x2! x3!)] p1^x1 p2^x2 p3^x3.
Example (Moon, 1996) (cont’d)
Take the expected value of the complete-data log-likelihood given the observed data and the current estimate θ^t; the ln k! and factorial terms do not involve θ. Let’s look at the M-step for a minute before completing the E-step …
Example (Moon, 1996) (cont’d)
Maximizing the expected log-likelihood with respect to θ shows that we only need to estimate the expected values of the unobserved counts. Let’s go back and complete the E-step now …
Example (Moon, 1996) (cont’d)
The required expectations follow from the conditional distribution of the unobserved counts given the observations (see Moon’s paper, page 53, for a proof).
Example (Moon, 1996) (cont’d)
Putting the steps together:
Initialization: start with a guess θ0.
Expectation step: compute the expected values of the unobserved counts given the observations and θ^t.
Maximization step: update θ using these expected counts.
Convergence test: repeat until the estimate stabilizes.
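As a runnable illustration of this loop, here is a sketch under an assumed setup that is not taken from the slides: the observed data are y1 = x1 and y2 = x2 + x3, and the cell probabilities follow the hypothetical parameterization p(θ) = (1/4, (1 + θ)/4, (2 − θ)/4) for θ in (0, 1). Given y2, the count x2 is Binomial(y2, p2/(p2 + p3)), which yields a closed-form E-step; the M-step is done numerically:

```python
import numpy as np

def cell_probs(theta):
    # Hypothetical parameterization (an assumption for illustration,
    # not the one used in Moon's paper).
    return 0.25, (1.0 + theta) / 4.0, (2.0 - theta) / 4.0

def em_trinomial(y1, y2, theta=0.5, eps=1e-6, max_iter=500):
    grid = np.linspace(1e-3, 1.0 - 1e-3, 4000)     # candidate theta values
    for _ in range(max_iter):
        p1, p2, p3 = cell_probs(theta)
        # E-step: given y2 = x2 + x3, x2 | y2 ~ Binomial(y2, p2/(p2 + p3)),
        # so the expected unobserved counts are:
        ex2 = y2 * p2 / (p2 + p3)
        ex3 = y2 - ex2
        # M-step: maximize the expected complete-data log-likelihood over a
        # grid (the factorial terms do not depend on theta and are dropped).
        g1, g2, g3 = cell_probs(grid)
        q = y1 * np.log(g1) + ex2 * np.log(g2) + ex3 * np.log(g3)
        theta_new = grid[int(np.argmax(q))]
        if abs(theta_new - theta) < eps:
            return theta_new
        theta = theta_new
    return theta

print(em_trinomial(y1=38, y2=62))   # illustrative counts, k = 100
```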
Convergence Properties of EM
The solution depends on the initial estimate θ0. At each iteration, a value of θ is computed so that the likelihood function does not decrease. There is no guarantee that the algorithm will converge to a global maximum. The algorithm is guaranteed to be stable, i.e., there is no chance of overshooting or diverging from the maximum.
Mixture of 2D Gaussians - Example
Mixture Model
A mixture model combines K component densities using mixing weights π1, π2, …, πK:
p(x | Θ) = Σ_{k=1..K} πk p(x | θk).
Mixture of 1D Gaussians - Example
Three components with mixing weights π1 = 0.3, π2 = 0.2, π3 = 0.5.
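A small sketch of such a mixture in Python; only the weights come from the slide, while the means and standard deviations below are made-up values for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Mixing weights from the slide; means/sigmas are illustrative assumptions.
pis    = np.array([0.3, 0.2, 0.5])
mus    = np.array([-2.0, 0.0, 3.0])
sigmas = np.array([0.5, 1.0, 0.8])

def sample(n):
    """Sampling: pick a component with probability pi_k, then draw from it."""
    z = rng.choice(len(pis), size=n, p=pis)    # latent component labels
    return rng.normal(mus[z], sigmas[z])

x = sample(1000)
print(x.mean(), x.std())
```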
Mixture Parameters
Fitting a Mixture Model to a Set of Observations Dx
Two fundamental problems:
(1) Estimate the number of mixture components K.
(2) Estimate the mixture parameters (πk, θk), k = 1, 2, …, K.
Mixtures of Gaussians (see Chapter 10)
p(x | Θ) = Σ_{k=1..K} πk p(x | θk), where each component density is Gaussian:
p(x | θk) = N(x; μk, Σk) = [1 / ((2π)^{d/2} |Σk|^{1/2})] exp(−(1/2)(x − μk)^T Σk^{−1} (x − μk)).
The parameters θk are (μk, Σk).
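A minimal sketch evaluating this density with SciPy; the weights, means, and covariances below are made-up illustrative values:

```python
import numpy as np
from scipy.stats import multivariate_normal

# Illustrative 2D mixture: all numeric values are assumptions.
pis  = [0.3, 0.2, 0.5]
mus  = [np.array([-2.0, 0.0]), np.array([0.0, 1.0]), np.array([3.0, -1.0])]
covs = [np.eye(2) * 0.5, np.eye(2), np.eye(2) * 0.8]

def mog_pdf(x):
    """p(x | Theta) = sum_k pi_k * N(x; mu_k, Sigma_k)."""
    return sum(pi * multivariate_normal.pdf(x, mean=mu, cov=cov)
               for pi, mu, cov in zip(pis, mus, covs))

print(mog_pdf(np.array([0.0, 0.0])))
```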
Mixtures of Gaussians (cont’d)
[Diagram: the mixture density as a weighted combination of Gaussian components, with weights π1, π2, π3, …, πK.]
Estimating Mixture Parameters Using ML – not easy!
Direct ML estimation is difficult because the log-likelihood contains the logarithm of a sum over components, which has no closed-form maximizer.
Estimating Mixture Parameters Using EM: Case of Unknown Means
Assumptions: the number of components, the mixing weights, and the component covariances are known; only the means μk are unknown.
Observation: if we knew which component had generated each sample, estimating the means would be easy … but we don’t!
Estimating Mixture Parameters Using EM: Case of Unknown Means (cont’d)
Introduce hidden (unobserved) variables zi, where zik = 1 if sample xi was generated by the k-th component and zik = 0 otherwise.
Estimating Mixture Parameters Using EM: Case of Unknown Means (cont’d)
Main steps using EM
Estimating Mixture Parameters Using EM: Case of Unknown Means (cont’d)
Expectation Step: E(zik) is just the probability that xi was generated by the k-th component.
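Under the stated assumptions (known covariances and mixing weights; for concreteness, equal weights and a shared variance σ²), this expectation takes the familiar closed form, sketched here following the standard derivation:

```latex
E(z_{ik}) = \frac{p(x_i \mid \mu_k)}{\sum_{m=1}^{K} p(x_i \mid \mu_m)}
          = \frac{\exp\!\big(-\|x_i-\mu_k\|^2 / 2\sigma^2\big)}
                 {\sum_{m=1}^{K}\exp\!\big(-\|x_i-\mu_m\|^2 / 2\sigma^2\big)}
```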
Estimating Mixture Parameters Using EM: Case of Unknown Means (cont’d)
Maximization Step: update each mean using the expected memberships as weights (the standard update for this case):
μk ← Σ_{i=1..n} E(zik) xi / Σ_{i=1..n} E(zik).
Estimating Mixture Parameters Using EM: Case of Unknown Means (cont’d)
Summary: iterate the E-step (compute E(zik)) and the M-step (update the means μk) until the estimates stabilize.
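A compact, runnable sketch of the whole unknown-means procedure in 1D, assuming (as above) equal weights and a shared known σ; the data and initialization are illustrative:

```python
import numpy as np

def em_unknown_means(x, k, sigma, n_iter=100, seed=0):
    """EM for a 1D Gaussian mixture with equal weights and a shared, known
    sigma; only the means are estimated (the 'unknown means' case)."""
    rng = np.random.default_rng(seed)
    mu = rng.choice(x, size=k, replace=False)      # initialize from the data
    for _ in range(n_iter):
        # E-step: E(z_ik) proportional to exp(-(x_i - mu_k)^2 / 2 sigma^2)
        d2 = (x[:, None] - mu[None, :]) ** 2
        w = np.exp(-d2 / (2.0 * sigma ** 2))
        w /= w.sum(axis=1, keepdims=True)          # normalize over components
        # M-step: each mean is a weighted average of the samples
        mu = (w * x[:, None]).sum(axis=0) / w.sum(axis=0)
    return mu

# Illustrative data from two components.
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(-2, 1, 200), rng.normal(3, 1, 300)])
print(em_unknown_means(x, k=2, sigma=1.0))
```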
Estimating Mixture Parameters Using EM: General Case
Need to review Lagrange Optimization first …
Lagrange Optimization
To maximize f(x) subject to a constraint g(x) = 0, form the Lagrangian L(x, λ) = f(x) + λ g(x), set its partial derivatives to zero, and solve for x and λ: n + 1 equations in n + 1 unknowns.
Lagrange Optimization (cont’d)
Example: maximize f(x1, x2) = x1 x2 subject to the constraint g(x1, x2) = x1 + x2 − 1 = 0; this yields 3 equations in 3 unknowns.
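Working the example through (the slide leaves the algebra to the reader):

```latex
L(x_1, x_2, \lambda) = x_1 x_2 + \lambda (x_1 + x_2 - 1)

\frac{\partial L}{\partial x_1} = x_2 + \lambda = 0,\quad
\frac{\partial L}{\partial x_2} = x_1 + \lambda = 0,\quad
x_1 + x_2 - 1 = 0

\Rightarrow\; x_1 = x_2 = \tfrac{1}{2},\quad \lambda = -\tfrac{1}{2},\quad
f_{\max} = \tfrac{1}{4}.
```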
Estimating Mixture Parameters Using EM: General Case
Introduce hidden (unobserved) variables zi, as before: zik = 1 if xi was generated by the k-th component and zik = 0 otherwise.
Estimating Mixture Parameters Using EM: General Case (cont’d)
Expectation Step: as before, E(zik) is the probability that xi was generated by the k-th component, now weighted by the mixing priors:
E(zik) = πk p(xi | θk) / Σ_{m=1..K} πm p(xi | θm).
Estimating Mixture Parameters Using EM: General Case (cont’d)
Maximization Step: maximize the expected log-likelihood subject to the constraint Σk πk = 1; use Lagrange optimization for the mixing weights.
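A sketch of the constrained step for the mixing weights, following the standard derivation (the μk and Σk updates come from unconstrained differentiation):

```latex
\frac{\partial}{\partial \pi_k}\Big[\sum_{i=1}^{n}\sum_{m=1}^{K} E(z_{im})\ln \pi_m
  + \lambda\Big(\sum_{m=1}^{K}\pi_m - 1\Big)\Big]
  = \frac{\sum_{i=1}^{n} E(z_{ik})}{\pi_k} + \lambda = 0
```

Summing the resulting conditions over k and applying the constraint gives λ = −n, hence πk = (1/n) Σi E(zik).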
Estimating Mixture Parameters Using EM: General Case (cont’d)
Maximization Step (cont’d): the resulting updates are the standard ones:
πk ← (1/n) Σ_{i=1..n} E(zik)
μk ← Σ_i E(zik) xi / Σ_i E(zik)
Σk ← Σ_i E(zik) (xi − μk)(xi − μk)^T / Σ_i E(zik).
Estimating Mixture Parameters Using EM: General Case (cont’d)
Summary: iterate the E-step (compute E(zik)) and the M-step (update πk, μk, Σk) until convergence.
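Putting the general case together as a runnable Python sketch; the data are illustrative, and the small ridge term on the covariances is an assumption added for numerical stability:

```python
import numpy as np
from scipy.stats import multivariate_normal

def em_gmm(x, k, n_iter=100, seed=0):
    """EM for a Gaussian mixture: estimates weights, means, covariances."""
    n, d = x.shape
    rng = np.random.default_rng(seed)
    pis = np.full(k, 1.0 / k)
    mus = x[rng.choice(n, size=k, replace=False)]   # init means from data
    covs = np.array([np.cov(x.T) + 1e-6 * np.eye(d)] * k)
    for _ in range(n_iter):
        # E-step: responsibilities E(z_ik) = pi_k N(x_i) / sum_m pi_m N(x_i)
        w = np.column_stack([
            pis[j] * multivariate_normal.pdf(x, mean=mus[j], cov=covs[j])
            for j in range(k)
        ])
        w /= w.sum(axis=1, keepdims=True)
        nk = w.sum(axis=0)                          # effective counts per component
        # M-step: closed-form updates from the Lagrange/derivative conditions
        pis = nk / n
        mus = (w.T @ x) / nk[:, None]
        for j in range(k):
            diff = x - mus[j]
            covs[j] = (w[:, j, None] * diff).T @ diff / nk[j] + 1e-6 * np.eye(d)
    return pis, mus, covs

# Illustrative 2D data from two clusters.
rng = np.random.default_rng(1)
x = np.vstack([rng.normal(-2, 1, (200, 2)), rng.normal(3, 1, (300, 2))])
pis, mus, covs = em_gmm(x, k=2)
print(pis, mus, sep="\n")
```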
Estimating the Number of Components K