Expectation-Maximization (EM) Chapter 3 (Duda et al.) – Section 3.9


1 Expectation-Maximization (EM) Chapter 3 (Duda et al.) – Section 3.9
CS479/679 Pattern Recognition Dr. George Bebis

2 Expectation-Maximization (EM)
EM is an iterative method for performing ML estimation: it starts with an initial estimate of θ, then iteratively refines the current estimate so as to increase the likelihood of the observed data, p(D/θ).

3 Expectation-Maximization (EM)
EM represents a general framework; it works best in situations where the data is incomplete (or can be thought of as incomplete). Some creativity is required to recognize where the EM algorithm can be applied. It is the standard method for estimating the parameters of Mixtures of Gaussians (MoG).

4 Incomplete Data Often, it is impossible to apply ML estimation directly because certain features cannot be measured. The EM algorithm is ideal for problems with unobserved (missing) data.

5 Example (Moon, 1996) Assume a trinomial distribution over counts (x1, x2, x3) with x1 + x2 + x3 = k:

p(x1, x2, x3 / θ) = [k! / (x1! x2! x3!)] p1^x1 p2^x2 p3^x3

where the cell probabilities p1, p2, p3 depend on the unknown parameter θ.

6 Example (Moon, 1996) (cont’d)

7 EM: Main Idea If the complete data x were available, we could estimate θ directly by ML, i.e., by maximizing ln p(Dx/θ).
Since x is not available: maximize the expectation of ln p(Dx/θ) with respect to the unknown variables, given the observed data Dy and an estimate of θ.
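
Formally, this expectation is the standard Q-function of EM, stated here for concreteness (Dx denotes the complete data, Dy the observed data, and θ^t the current estimate):

```latex
Q(\theta;\,\theta^{t}) \;=\; E\big[\ln p(D_x/\theta)\,\big|\,D_y,\ \theta^{t}\big]
```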

8 EM Steps
(1) Initialization
(2) Expectation
(3) Maximization
(4) Test for convergence

9 EM Steps (cont’d) (1) Initialization step: initialize the algorithm with a guess θ0. (2) Expectation step: compute the expectation of the complete-data log-likelihood with respect to the unobserved variables, using the current estimate of the parameters and conditioned upon the observations. When ln p(Dx/θ) is a linear function of the unobserved variables, the expectation step is equivalent to replacing the unobserved variables by their conditional expectations given Dy and the current θ.
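
In that linear case, the E-step reduces to plugging the expected values of the unobserved variables z into the complete-data log-likelihood (a standard identity, stated here for completeness):

```latex
Q(\theta;\,\theta^{t}) \;=\; \ln p(D_x/\theta)\,\Big|_{\,z \,=\, E[z \,/\, D_y,\ \theta^{t}]}
```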

10 EM Steps (cont’d) (3) Maximization step: obtain a new estimate of the parameters by maximizing the expectation computed in Step 2. (4) Test for convergence: if ||θ^(t+1) − θ^t|| < ε, stop; otherwise, go to Step 2.
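
In symbols, the M-step and the convergence test are (ε is a user-chosen tolerance):

```latex
\theta^{t+1} \;=\; \arg\max_{\theta}\, Q(\theta;\,\theta^{t}), \qquad
\text{stop if } \lVert \theta^{t+1} - \theta^{t} \rVert < \varepsilon
```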

11 Example (Moon, 1996) (cont’d)
The complete-data likelihood is the trinomial pmf from slide 5, p(x1, x2, x3 / θ) = [k! / (x1! x2! x3!)] p1^x1 p2^x2 p3^x3. Suppose the complete counts x = (x1, x2, x3) are not fully observed.

12 Example (Moon, 1996) (cont’d)
Take the expected value of the complete-data log-likelihood with respect to the unobserved counts. Let’s look at the M-step for a minute before completing the E-step …

13 Example (Moon, 1996) (cont’d)
Maximizing the expected log-likelihood shows that we only need to estimate the conditional expectations of the unobserved counts. Let’s go back and complete the E-step now …

14 Example (Moon, 1996) (cont’d)
(see Moon’s paper, page 53, for a proof)

15 Example (Moon, 1996) (cont’d)
Initialization: choose an initial guess θ0.
Expectation step: compute the expected values of the unobserved counts, given the observed data and the current estimate of θ.
Maximization step: update θ in closed form from the expected counts.
Convergence step: stop when the estimates stop changing.
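
To make the recipe concrete, here is a minimal runnable sketch of EM for a trinomial incomplete-data problem in the spirit of this example. The specific cell probabilities p1 = 1/4, p2 = 1/4 + θ/4, p3 = 1/2 − θ/4 and the observation model (only y1 = x1 + x2 and y2 = x3 are observed) are our assumptions for illustration; see Moon’s paper for his exact setup.

```python
# EM sketch for a trinomial with incomplete data (assumed parameterization):
#   (x1, x2, x3) ~ Trinomial(k; p1, p2, p3), with
#   p1 = 1/4, p2 = 1/4 + theta/4, p3 = 1/2 - theta/4,
# where only y1 = x1 + x2 and y2 = x3 are observed (the x1/x2 split is hidden).

def em_trinomial(y1, y2, theta0=0.5, tol=1e-8, max_iter=100):
    theta = theta0                                  # initialization step
    for _ in range(max_iter):
        # E-step: given y1, x2 | y1 ~ Binomial(y1, p2 / (p1 + p2)),
        # so the expected unobserved count is:
        p1, p2 = 0.25, 0.25 + theta / 4.0
        x2_exp = y1 * p2 / (p1 + p2)
        # M-step: maximize the expected complete-data log-likelihood
        #   x2_exp * ln((1 + theta)/4) + y2 * ln((2 - theta)/4),
        # whose zero-derivative condition gives a closed-form update:
        theta_new = (2.0 * x2_exp - y2) / (x2_exp + y2)
        if abs(theta_new - theta) < tol:            # convergence step
            return theta_new
        theta = theta_new
    return theta

print(em_trinomial(y1=100, y2=25))  # converges to the observed-data MLE (1.2)
```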

16 Example (Moon, 1996) (cont’d)

17 Convergence properties of EM
The solution depends on the initial estimate θ0. At each iteration, a value of θ is computed so that the likelihood function does not decrease. There is no guarantee that EM will converge to a global maximum. The algorithm is guaranteed to be stable, i.e., there is no chance of "overshooting" or diverging from the maximum.

18 Expectation-Maximization (EM)
Recall (slide 3): EM is a general framework and is the standard method for estimating the parameters of Mixtures of Gaussians (MoG), to which we now turn.

19 Mixture of 2D Gaussians - Example

20 Mixture Model A mixture model combines K component densities with mixing weights π1, π2, π3, …, πK (diagram on slide).

21 Mixture of 1D Gaussians - Example
(Figure: a mixture of three 1D Gaussians with mixing weights π1 = 0.3, π2 = 0.2, π3 = 0.5.)

22 Mixture Parameters

23 Fitting a Mixture Model to a set of observations Dx
Two fundamental problems: (1) estimating the number of mixture components K, and (2) estimating the mixture parameters (πk, θk), k = 1, 2, …, K.

24 Mixtures of Gaussians (see Chapter 10)
p(x/θ) = Σk=1..K πk p(x/θk), where each component density p(x/θk) is a Gaussian:

p(x/θk) = 1 / ((2π)^(d/2) |Σk|^(1/2)) exp(−(1/2)(x − μk)^T Σk^(−1) (x − μk))

The parameters θk are (μk, Σk).
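
As a concrete illustration of this definition, a mixture density can be evaluated directly from it (a minimal NumPy sketch; the function and variable names are ours):

```python
import numpy as np

def gaussian_pdf(x, mu, Sigma):
    """Multivariate Gaussian density N(x; mu, Sigma)."""
    d = len(mu)
    diff = x - mu
    norm = (2 * np.pi) ** (d / 2) * np.sqrt(np.linalg.det(Sigma))
    return np.exp(-0.5 * diff @ np.linalg.solve(Sigma, diff)) / norm

def mixture_pdf(x, pis, mus, Sigmas):
    """p(x/theta) = sum_k pi_k * N(x; mu_k, Sigma_k)."""
    return sum(pi * gaussian_pdf(x, mu, S) for pi, mu, S in zip(pis, mus, Sigmas))
```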

25 Mixtures of Gaussians (cont’d)
(Diagram: the mixture density as a weighted sum of K Gaussian components with weights π1, π2, π3, …, πK.)

26 Estimating Mixture Parameters Using ML – not easy! The log-likelihood contains the logarithm of a sum over components, so setting its derivatives to zero does not yield closed-form solutions.

27 Estimating Mixture Parameters Using EM: Case of Unknown Means
Assumptions: the number of components and all parameters except the means μk are known. Observation: if we knew which component generated each sample, estimating the means would be easy … but we don’t!

28 Estimating Mixture Parameters Using EM: Case of Unknown Means (cont’d)
Introduce hidden or unobserved variables zi, where zik = 1 if sample xi was generated by the k-th component and zik = 0 otherwise.

29 Estimating Mixture Parameters Using EM: Case of Unknown Means (cont’d)
Main steps using EM

30 Estimating Mixture Parameters Using EM: Case of Unknown Means (cont’d)
Expectation Step

31 Estimating Mixture Parameters Using EM: Case of Unknown Means (cont’d)
Expectation Step

32 Estimating Mixture Parameters Using EM: Case of Unknown Means (cont’d)
Expectation Step

33 Estimating Mixture Parameters Using EM: Case of Unknown Means (cont’d)
Expectation Step E(zik) is just the probability that xi was generated by the k-th component, given xi and the current parameter estimates (by Bayes’ rule):

E(zik) = πk p(xi/μk) / Σj=1..K πj p(xi/μj)
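
A minimal NumPy sketch of this E-step for the unknown-means case (assuming, for this sketch, equal priors and a common known variance sigma2; all names are ours):

```python
import numpy as np

def e_step_means(X, mus, sigma2):
    """Responsibilities E[z_ik]: probability that x_i came from component k,
    assuming equal priors and a common known variance sigma2."""
    # Squared distances ||x_i - mu_k||^2, shape (n, K)
    d2 = ((X[:, None, :] - mus[None, :, :]) ** 2).sum(axis=2)
    w = np.exp(-d2 / (2 * sigma2))
    return w / w.sum(axis=1, keepdims=True)  # normalize over components
```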

34 Estimating Mixture Parameters Using EM: Case of Unknown Means (cont’d)
Maximization Step Each mean is updated as the responsibility-weighted average of the samples:

μk = Σi E(zik) xi / Σi E(zik)
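
The same update in NumPy (a sketch; X holds the samples as rows and R holds the responsibilities E(zik) computed by the E-step above):

```python
def m_step_means(X, R):
    """Update each mean as the responsibility-weighted sample average:
    mu_k = sum_i E[z_ik] x_i / sum_i E[z_ik]. Expects NumPy arrays."""
    return (R.T @ X) / R.sum(axis=0)[:, None]
```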

35 Estimating Mixture Parameters Using EM: Case of Unknown Means (cont’d)
Summary

36 Estimating Mixture Parameters Using EM: Case of Unknown Means (cont’d)
Summary
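
Putting the two steps together, a minimal sketch of the complete unknown-means EM loop (reusing the helper functions sketched above; all names are ours):

```python
import numpy as np

def em_unknown_means(X, mus0, sigma2, n_iter=100, tol=1e-6):
    """EM for Gaussian-mixture means only (known common variance, equal priors)."""
    mus = mus0.copy()
    for _ in range(n_iter):
        R = e_step_means(X, mus, sigma2)       # E-step: responsibilities
        new_mus = m_step_means(X, R)           # M-step: weighted means
        if np.abs(new_mus - mus).max() < tol:  # convergence test
            return new_mus
        mus = new_mus
    return mus
```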

37 Estimating Mixture Parameters Using EM: General Case
Need to review Lagrange Optimization first …

38 Lagrange Optimization
To maximize f(x) subject to a constraint g(x) = 0, form the Lagrangian L(x, λ) = f(x) + λ g(x), set its gradient with respect to x to zero, and solve for x and λ together with g(x) = 0: n+1 equations in n+1 unknowns.

39 Lagrange Optimization (cont’d)
Example: maximize f(x1,x2) = x1x2 subject to the constraint g(x1,x2) = x1 + x2 − 1 = 0. This gives 3 equations in 3 unknowns.
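
Working the example through (the algebra is standard):

```latex
L(x_1,x_2,\lambda) = x_1 x_2 + \lambda\,(x_1 + x_2 - 1)\\
\frac{\partial L}{\partial x_1} = x_2 + \lambda = 0,\qquad
\frac{\partial L}{\partial x_2} = x_1 + \lambda = 0,\qquad
x_1 + x_2 - 1 = 0\\
\Rightarrow\; x_1 = x_2 = \tfrac{1}{2},\qquad \lambda = -\tfrac{1}{2}
```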

40 Estimating Mixture Parameters Using EM: General Case
Introduce hidden or unobserved variables zi, defined as before: zik = 1 if xi was generated by the k-th component, and 0 otherwise.

41 Estimating Mixture Parameters Using EM: General Case (cont’d)
Expectation Step

42 Estimating Mixture Parameters Using EM: General Case (cont’d)
Expectation Step (cont’d)

43 Estimating Mixture Parameters Using EM: General Case (cont’d)
Expectation Step (cont’d)

44 Estimating Mixture Parameters Using EM: General Case (cont’d)
Maximization Step Maximize the expected complete-data log-likelihood subject to the constraint Σk πk = 1; use Lagrange optimization for the mixing weights.
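
A sketch of the Lagrange step for the mixing weights (standard derivation; only the πk-dependent part of the expected log-likelihood is shown, and n is the number of samples):

```latex
L = \sum_{i=1}^{n}\sum_{k=1}^{K} E[z_{ik}]\,\ln \pi_k
    \;+\; \lambda\Big(\sum_{k=1}^{K}\pi_k - 1\Big)\\
\frac{\partial L}{\partial \pi_k} = \frac{\sum_i E[z_{ik}]}{\pi_k} + \lambda = 0
\;\Rightarrow\; \pi_k = \frac{1}{n}\sum_{i=1}^{n} E[z_{ik}]
```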

45 Estimating Mixture Parameters Using EM: General Case (cont’d)
Maximization Step (cont’d)

46 Estimating Mixture Parameters Using EM: General Case (cont’d)
Summary

47 Estimating Mixture Parameters Using EM: General Case (cont’d)
Summary
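
For reference, a minimal end-to-end sketch of EM for a general Gaussian mixture, combining the updates above (all names are ours; a production implementation would add log-domain arithmetic and covariance regularization):

```python
import numpy as np

def em_gmm(X, pis, mus, Sigmas, n_iter=100):
    """EM for a Gaussian mixture: updates mixing weights, means, covariances."""
    n, d = X.shape
    K = len(pis)
    for _ in range(n_iter):
        # E-step: r_ik = pi_k N(x_i; mu_k, Sigma_k) / sum_j pi_j N(x_i; mu_j, Sigma_j)
        R = np.empty((n, K))
        for k in range(K):
            diff = X - mus[k]
            inv = np.linalg.inv(Sigmas[k])
            norm = (2 * np.pi) ** (d / 2) * np.sqrt(np.linalg.det(Sigmas[k]))
            quad = np.einsum('ij,jk,ik->i', diff, inv, diff)  # (x-mu)^T S^-1 (x-mu)
            R[:, k] = pis[k] * np.exp(-0.5 * quad) / norm
        R /= R.sum(axis=1, keepdims=True)
        # M-step
        Nk = R.sum(axis=0)                  # effective counts per component
        pis = Nk / n                        # pi_k = (1/n) sum_i E[z_ik]
        mus = (R.T @ X) / Nk[:, None]       # responsibility-weighted means
        for k in range(K):
            diff = X - mus[k]
            Sigmas[k] = (R[:, k, None] * diff).T @ diff / Nk[k]
    return pis, mus, Sigmas
```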

48 Estimating the Number of Components K

