1
Incomplete Graphical Models Nan Hu
2
Outline: motivation; K-means clustering and the coordinate descent algorithm; density estimation via EM on unconditional mixtures; regression and classification via EM on conditional mixtures; a general formulation of the EM algorithm.
3
K-means clustering. Problem: given a set of observations, how do we group them into K clusters, assuming the value of K is given? The algorithm alternates between two phases: in the first phase each observation is assigned to its nearest cluster mean, and in the second phase each cluster mean is recomputed as the average of the observations assigned to it.
4
K-means clustering. Illustration: the original data set and the cluster assignments after the first, second, and third iterations.
5
K-means clustering as coordinate descent. The algorithm minimizes the distortion measure J = Σ_n Σ_k r_{nk} ‖x_n − μ_k‖² by alternating coordinate descent: holding the means fixed, J is minimized over the assignments r_{nk}; holding the assignments fixed, setting the partial derivatives ∂J/∂μ_k to zero makes each μ_k the mean of the points assigned to cluster k.
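A minimal NumPy sketch of this coordinate descent, for illustration only (the function and variable names are chosen here, not taken from the slides):

```python
import numpy as np

def kmeans(X, K, n_iter=50, seed=0):
    """Coordinate descent on the distortion J = sum_n ||x_n - mu_{z_n}||^2."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    mu = X[rng.choice(len(X), size=K, replace=False)].copy()  # initial means
    for _ in range(n_iter):
        # First phase: assign each point to its nearest mean (minimize J over z).
        z = np.argmin(((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=-1), axis=1)
        # Second phase: set dJ/dmu_k = 0, i.e. each mean becomes the average
        # of the points currently assigned to it.
        for k in range(K):
            if np.any(z == k):
                mu[k] = X[z == k].mean(axis=0)
    return mu, z
```

Each sweep can only decrease J, so the procedure converges to a local minimum of the distortion.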
6
Unconditional mixture. Problem: if the sample data exhibit a multimodal density, how do we estimate the true density? Fitting a single density to such a bimodal data set converges, but the result bears little relationship to the truth.
7
Unconditional mixture. A "divide-and-conquer" way to solve this problem is to introduce a latent variable Z, a multinomial node taking on one of K values, with graphical model Z → X. Assigning a density model to each subpopulation, the overall density is the mixture p(x | θ) = Σ_k π_k p(x | θ_k), where π_k = P(Z = k).
8
Unconditional mixture: Gaussian mixture models. In this model the mixture components are Gaussian distributions with parameters μ_k and Σ_k, so the probability model for a Gaussian mixture is p(x | θ) = Σ_k π_k N(x | μ_k, Σ_k).
9
Unconditional mixture. Two quantities drive the estimation: the posterior probability of the latent variable Z given an observation (the responsibility), and the log likelihood of the observed data under the mixture.
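The equations on this slide are not reproduced in the transcript; the standard forms for a Gaussian mixture, in notation chosen here (τ for responsibilities, N data points, K components), are:

```latex
\tau_{nk} \;\equiv\; p(z_n = k \mid x_n, \theta)
  \;=\; \frac{\pi_k \,\mathcal{N}(x_n \mid \mu_k, \Sigma_k)}
             {\sum_{j=1}^{K} \pi_j \,\mathcal{N}(x_n \mid \mu_j, \Sigma_j)},
\qquad
\ell(\theta)
  \;=\; \sum_{n=1}^{N} \log \sum_{k=1}^{K} \pi_k \,\mathcal{N}(x_n \mid \mu_k, \Sigma_k).
```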
10
Unconditional mixture. Take the partial derivative of the log likelihood with respect to the mixing proportions π_k, enforcing the constraint Σ_k π_k = 1 with a Lagrange multiplier; solving gives the update for π_k.
11
Unconditional mixture. Take the partial derivative of the log likelihood with respect to the component means μ_k; setting it to zero gives the update for μ_k.
12
Unconditional mixture. Take the partial derivative of the log likelihood with respect to the component covariances Σ_k; setting it to zero gives the update for Σ_k.
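The resulting updates are likewise not in the transcript; the standard Gaussian-mixture solutions of these three stationarity conditions, in the notation above, are:

```latex
\pi_k = \frac{1}{N}\sum_{n=1}^{N} \tau_{nk},
\qquad
\mu_k = \frac{\sum_{n} \tau_{nk}\, x_n}{\sum_{n} \tau_{nk}},
\qquad
\Sigma_k = \frac{\sum_{n} \tau_{nk}\,(x_n - \mu_k)(x_n - \mu_k)^{\top}}{\sum_{n} \tau_{nk}}.
```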
13
Unconditional mixture: the EM algorithm. The two phases alternate until convergence: the first phase (E step) computes the responsibilities τ_{nk} from the current parameters, and the second phase (M step) re-estimates π_k, μ_k, and Σ_k from the responsibilities.
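A compact NumPy/SciPy sketch of this E/M alternation for a Gaussian mixture, for illustration only (no safeguards against degenerate components beyond a small ridge; the names are chosen here, not taken from the slides):

```python
import numpy as np
from scipy.stats import multivariate_normal

def em_gmm(X, K, n_iter=100, seed=0):
    N, D = X.shape
    rng = np.random.default_rng(seed)
    pi = np.full(K, 1.0 / K)                       # mixing proportions
    mu = X[rng.choice(N, size=K, replace=False)]   # component means
    Sigma = np.array([np.cov(X.T) + 1e-6 * np.eye(D) for _ in range(K)])
    for _ in range(n_iter):
        # E step: responsibilities tau_nk = p(z_n = k | x_n, theta).
        tau = np.stack([pi[k] * multivariate_normal.pdf(X, mu[k], Sigma[k])
                        for k in range(K)], axis=1)
        tau /= tau.sum(axis=1, keepdims=True)
        # M step: re-estimate pi, mu, Sigma from the responsibilities.
        Nk = tau.sum(axis=0)
        pi = Nk / N
        mu = (tau.T @ X) / Nk[:, None]
        for k in range(K):
            diff = X - mu[k]
            Sigma[k] = (tau[:, k, None] * diff).T @ diff / Nk[k] + 1e-6 * np.eye(D)
    return pi, mu, Sigma, tau
```

Each iteration can only increase the incomplete log likelihood, which is exactly the property established in the general formulation at the end of these slides.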
14
Unconditional mixture: the EM algorithm from the expected-complete-log-likelihood point of view. Suppose we had observed the latent variables; the data set would then be completely observed, and the resulting likelihood is called the complete log likelihood.
15
Unconditional mixture. We treat the latent indicators z_{nk} as random variables and take expectations conditioned on X and the current parameters θ. Note that the z_{nk} are binary random variables, so E[z_{nk} | x_n, θ] = p(z_n = k | x_n, θ) = τ_{nk}. Using τ_{nk} as the "best guess" for z_{nk} and substituting it into the complete log likelihood gives the expected complete log likelihood.
16
Unconditional mixture. Maximizing the expected complete log likelihood by setting its derivatives to zero recovers exactly the same updates for π_k, μ_k, and Σ_k as before.
17
Conditional mixture: graphical model. The input X is a parent of both the latent variable Z and the output Y, and Z is also a parent of Y; Z is a multinomial node taking on one of K values. This model is used for regression and classification. The relationship between X and Z can be modeled in a discriminative classification way, e.g. with the softmax function.
18
Conditional mixture. The conditional density of Y is obtained by marginalizing over Z, with X taken to be always observed: p(y | x, θ) = Σ_k p(z = k | x) p(y | x, z = k). The posterior probability of the latent variable given a data pair (x, y) is then defined by Bayes' rule from these two factors.
19
Conditional mixture. Some specific choices of mixture components: Gaussian components for regression, and logistic components for binary classification, where φ is the logistic function φ(z) = 1 / (1 + e^{−z}).
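The component densities themselves are not reproduced in the transcript; a standard parameterization (as in the mixture-of-experts literature, with gating parameters ξ_k, regression weights β_k, and classifier weights η_k as notation chosen here) would be:

```latex
p(z = k \mid x, \xi) = \frac{e^{\xi_k^{\top} x}}{\sum_{j=1}^{K} e^{\xi_j^{\top} x}}
  \quad\text{(softmax gating)},
\qquad
p(y \mid x, z = k) = \mathcal{N}\!\left(y \mid \beta_k^{\top} x,\; \sigma_k^2\right)
  \quad\text{(Gaussian component)},
```
```latex
p(y \mid x, z = k) = \phi(\eta_k^{\top} x)^{\,y}\,\bigl(1 - \phi(\eta_k^{\top} x)\bigr)^{1-y},
\qquad
\phi(z) = \frac{1}{1 + e^{-z}}
  \quad\text{(logistic component, } y \in \{0,1\}\text{)}.
```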
20
Conditional mixture: parameter estimation via EM. Write down the complete log likelihood as if the latent indicators z_{nk} were observed, then use their posterior expectations τ_{nk} = p(z_n = k | x_n, y_n, θ) as the "best guess" for the indicators.
21
Conditional mixture. The expected complete log likelihood can then be written by substituting τ_{nk} for z_{nk}; taking partial derivatives and setting them to zero yields the update formulas for EM.
22
Conditional mixture: summary of the EM algorithm. (E step) Calculate the posterior probabilities τ_{nk}. (M step) Use the IRLS algorithm to update the gating parameters, based on the data pairs (x_n, τ_n). (M step) Use the weighted IRLS algorithm to update the component parameters, based on the data points (x_n, y_n), with weights τ_{nk}.
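A small NumPy sketch of the E step only, under the Gaussian-component parameterization assumed above (xi, beta, sigma2 are hypothetical names; the M step is not shown):

```python
import numpy as np

def softmax(logits):
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(logits)
    return e / e.sum(axis=1, keepdims=True)

def estep_conditional_mixture(X, y, xi, beta, sigma2):
    """Responsibilities tau_nk for a conditional mixture with Gaussian components.

    X: (N, D) inputs, y: (N,) targets, xi: (K, D) gating parameters,
    beta: (K, D) regression weights, sigma2: (K,) component noise variances.
    """
    gate = softmax(X @ xi.T)                          # p(z = k | x_n)
    resid = y[:, None] - X @ beta.T                   # y_n - beta_k^T x_n
    lik = np.exp(-0.5 * resid**2 / sigma2) / np.sqrt(2 * np.pi * sigma2)
    tau = gate * lik                                  # joint, up to p(y | x_n)
    return tau / tau.sum(axis=1, keepdims=True)       # normalize over k
```

In the full algorithm, the M step would re-fit the gating parameters by IRLS against these responsibilities and each component by a weighted fit with weights tau[:, k].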
23
General formulation. Let X denote all observable variables, Z all latent variables, and θ all parameters. If Z were observed, the ML estimate would maximize the complete log likelihood ℓ_c(θ) = log p(x, z | θ). However, Z is in fact not observed, so only the incomplete log likelihood ℓ(θ) = log p(x | θ) = log Σ_z p(x, z | θ) is available.
24
General formulation. Suppose p(x, z | θ) factors in some way; the complete log likelihood then decomposes accordingly. Since z is unknown, it is not clear how to carry out this ML estimation directly. However, we can average over the random variable z.
25
General formulation. Using a distribution q(z | x) as an estimate of the unobserved z, the complete log likelihood becomes the expected complete log likelihood Σ_z q(z | x) log p(x, z | θ). This expected complete log likelihood is solvable, and, hopefully, improving it also improves the incomplete log likelihood in some way. (This is the basic idea behind EM.)
26
General formulation. EM maximizes the incomplete log likelihood by way of a lower bound: applying Jensen's inequality to ℓ(θ) yields the auxiliary function L(q, θ).
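Written out (the standard bound, with q an arbitrary distribution over the latent variables), the step reads:

```latex
\ell(\theta)
  = \log \sum_{z} p(x, z \mid \theta)
  = \log \sum_{z} q(z \mid x)\,\frac{p(x, z \mid \theta)}{q(z \mid x)}
  \;\ge\; \sum_{z} q(z \mid x)\,\log \frac{p(x, z \mid \theta)}{q(z \mid x)}
  \;\equiv\; \mathcal{L}(q, \theta).
```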
27
General formulation. Given q, maximizing L(q, θ) over θ is equivalent to maximizing the expected complete log likelihood, since the entropy term −Σ_z q log q does not depend on θ.
28
General formulation. Given θ, the choice q(z | x) = p(z | x, θ) yields the maximum of L(q, θ) over q. Note that ℓ(θ) is an upper bound of L(q, θ).
29
General formulation. From the above, every step of EM maximizes L(q, θ). However, how do we know that maximizing L(q, θ) in the end also maximizes the incomplete log likelihood ℓ(θ)?
30
General formulation. The difference between ℓ(θ) and L(q, θ) is a KL divergence, which is non-negative and uniquely minimized (at zero) when q(z | x) = p(z | x, θ).
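Written out (a standard identity, not reproduced on the slide), the gap is:

```latex
\ell(\theta) - \mathcal{L}(q, \theta)
  = \sum_{z} q(z \mid x)\,\log \frac{q(z \mid x)}{p(z \mid x, \theta)}
  = \mathrm{KL}\bigl(q(z \mid x)\,\big\|\,p(z \mid x, \theta)\bigr) \;\ge\; 0 .
```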
31
General formulation: EM and alternating minimization. Recall that maximizing the likelihood is exactly the same as minimizing the KL divergence between the empirical distribution and the model. Including the latent variable, the KL divergence becomes a "complete KL divergence" between joint distributions on (x, z).
32
General formulation. (Figure slide: the definition of the "complete KL divergence", between the empirical distribution combined with q(z | x) and the model joint p(x, z | θ); the equation itself is not reproduced in the transcript.)
33
General formulation: the reformulated EM algorithm. (E step) Hold θ fixed and choose q to maximize L(q, θ), equivalently minimizing the complete KL divergence over q. (M step) Hold q fixed and choose θ to maximize L(q, θ). This is an alternating minimization algorithm.
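In symbols (standard form, not reproduced on the slide):

```latex
\text{E step:}\quad q^{(t+1)} = \arg\max_{q}\; \mathcal{L}\bigl(q, \theta^{(t)}\bigr)
  = p\bigl(z \mid x, \theta^{(t)}\bigr),
\qquad
\text{M step:}\quad \theta^{(t+1)} = \arg\max_{\theta}\; \mathcal{L}\bigl(q^{(t+1)}, \theta\bigr).
```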
34
Summary. Unconditional mixture: graphical model and EM algorithm. Conditional mixture: graphical model and EM algorithm. A general formulation of the EM algorithm: maximizing the auxiliary function, or equivalently minimizing the "complete KL divergence".
35
Incomplete Graphical Models Thank You!