
1 Incomplete Graphical Models Nan Hu

2 Outline: Motivation; K-means clustering; Coordinate descent algorithm; Density estimation (EM on unconditional mixtures); Regression and classification (EM on conditional mixtures); A general formulation of the EM algorithm.

3 K-means clustering. Problem: given a set of observations, how do we group them into K clusters, assuming the value of K is given? First phase: holding the cluster means fixed, assign each observation to its nearest mean. Second phase: holding the assignments fixed, recompute each mean as the average of the observations assigned to it.

4 K-means clustering. (Figure: the original data set, followed by the cluster assignments after the first, second, and third iterations.)

5 K-means clustering as a coordinate descent algorithm. The algorithm minimizes the distortion measure $J = \sum_{n=1}^{N}\sum_{k=1}^{K} r_{nk}\,\lVert x_n - \mu_k \rVert^2$ by alternately setting the partial derivatives with respect to the assignments $r_{nk}$ and the means $\mu_k$ to zero.
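
The two phases can be written down directly. Below is a minimal NumPy sketch of this coordinate descent, not taken from the slides; the function name kmeans, the random initialization, and the fixed iteration cap are illustrative choices.

```python
import numpy as np

def kmeans(X, K, n_iters=100, seed=0):
    """Two-phase coordinate descent on J = sum_n sum_k r_nk ||x_n - mu_k||^2."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    mu = X[rng.choice(len(X), size=K, replace=False)].copy()  # initial means: K random points
    assign = np.zeros(len(X), dtype=int)
    for _ in range(n_iters):
        # First phase: holding the means fixed, assign each point to its nearest mean.
        dists = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)  # (N, K) squared distances
        assign = dists.argmin(axis=1)
        # Second phase: holding the assignments fixed, recompute each mean.
        new_mu = np.array([X[assign == k].mean(axis=0) if np.any(assign == k) else mu[k]
                           for k in range(K)])
        if np.allclose(new_mu, mu):  # stop once the means no longer move
            break
        mu = new_mu
    return mu, assign
```

For example, `mu, assign = kmeans(data, K=3)` returns the K means and the final cluster assignment of each point.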

6 Unconditional Mixture. Problem: if the sample data exhibit a multimodal density, how do we estimate the true density? Fitting a single density to a bimodal data set converges, but the result bears little relationship to the truth.

7 Unconditional Mixture. A "divide-and-conquer" way to solve this problem: introduce a latent variable Z, a multinomial node taking on one of K values, with an arrow from Z to X in the graphical model. Assigning a density model to each subpopulation, the overall density is the mixture $p(x \mid \theta) = \sum_{k=1}^{K} \pi_k\, p(x \mid \theta_k)$.

8 Unconditional Mixture: Gaussian Mixture Models. In this model the mixture components are Gaussian distributions with parameters $(\mu_k, \Sigma_k)$. The probability model for a Gaussian mixture is $p(x \mid \theta) = \sum_{k=1}^{K} \pi_k\, \mathcal{N}(x \mid \mu_k, \Sigma_k)$.

9 Unconditional Mixture. Posterior probability of the latent variable Z: $\tau_k(x) \equiv p(z^k = 1 \mid x, \theta) = \dfrac{\pi_k\, \mathcal{N}(x \mid \mu_k, \Sigma_k)}{\sum_{j=1}^{K} \pi_j\, \mathcal{N}(x \mid \mu_j, \Sigma_j)}$. Log likelihood: $\ell(\theta; D) = \sum_{n=1}^{N} \log \sum_{k=1}^{K} \pi_k\, \mathcal{N}(x_n \mid \mu_k, \Sigma_k)$.

10 Unconditional Mixture. Taking the partial derivative of $\ell$ with respect to $\pi_k$, with a Lagrange multiplier enforcing $\sum_k \pi_k = 1$, and solving, we have $\pi_k = \frac{1}{N}\sum_{n=1}^{N} \tau_k(x_n)$.

11 Unconditional Mixture. Taking the partial derivative of $\ell$ with respect to $\mu_k$ and setting it to zero, we have $\mu_k = \dfrac{\sum_{n} \tau_k(x_n)\, x_n}{\sum_{n} \tau_k(x_n)}$.

12 Unconditional Mixture. Taking the partial derivative of $\ell$ with respect to $\Sigma_k$ and setting it to zero, we have $\Sigma_k = \dfrac{\sum_{n} \tau_k(x_n)\,(x_n - \mu_k)(x_n - \mu_k)^{\mathsf T}}{\sum_{n} \tau_k(x_n)}$.

13 Unconditional Mixture: the EM algorithm. First phase (E step): with the parameters fixed, compute the posterior probabilities $\tau_k(x_n)$. Second phase (M step): with the posteriors fixed, update $\pi_k$, $\mu_k$, and $\Sigma_k$ using the formulas above. A sketch of both phases follows below.
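
As an illustration of these two phases, here is a compact NumPy/SciPy sketch of EM for a Gaussian mixture using the closed-form updates derived on the previous slides; the function name gmm_em, the initialization, and the small covariance regularizer are assumptions of this sketch rather than part of the slides.

```python
import numpy as np
from scipy.stats import multivariate_normal

def gmm_em(X, K, n_iters=50, seed=0):
    """EM for a K-component Gaussian mixture: the E step computes posteriors tau,
    the M step applies the closed-form updates for pi, mu, Sigma."""
    X = np.asarray(X, dtype=float)
    N, D = X.shape
    rng = np.random.default_rng(seed)
    pi = np.full(K, 1.0 / K)
    mu = X[rng.choice(N, size=K, replace=False)].copy()
    Sigma = np.array([np.cov(X.T) + 1e-6 * np.eye(D) for _ in range(K)])
    for _ in range(n_iters):
        # E step: tau[n, k] = pi_k N(x_n | mu_k, Sigma_k) / sum_j pi_j N(x_n | mu_j, Sigma_j)
        tau = np.column_stack([pi[k] * multivariate_normal.pdf(X, mu[k], Sigma[k])
                               for k in range(K)])
        tau /= tau.sum(axis=1, keepdims=True)
        # M step: maximizers of the expected complete log likelihood.
        Nk = tau.sum(axis=0)                              # effective counts per component
        pi = Nk / N                                       # mixing proportions
        mu = (tau.T @ X) / Nk[:, None]                    # component means
        for k in range(K):
            diff = X - mu[k]
            Sigma[k] = (tau[:, k, None] * diff).T @ diff / Nk[k] + 1e-6 * np.eye(D)
    return pi, mu, Sigma, tau
```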

14 Unconditional Mixture: the EM algorithm from the expected complete log likelihood point of view. Suppose we observed the latent variables; the data set would then be completely observed, and the resulting likelihood, $\ell_c(\theta; x, z) = \sum_{n}\sum_{k} z_n^k \left[\log \pi_k + \log \mathcal{N}(x_n \mid \mu_k, \Sigma_k)\right]$, is called the complete log likelihood.

15 Unconditional Mixture. We treat the $z_n^k$ as random variables and take expectations conditioned on $X$ and the current parameters $\theta^{(t)}$. Note that the $z_n^k$ are binary random variables, so $E[z_n^k \mid x_n, \theta^{(t)}] = p(z_n^k = 1 \mid x_n, \theta^{(t)}) = \tau_k(x_n)$. Using this as the "best guess" for $z_n^k$, we have the expected complete log likelihood $\langle \ell_c \rangle = \sum_{n}\sum_{k} \tau_k(x_n)\left[\log \pi_k + \log \mathcal{N}(x_n \mid \mu_k, \Sigma_k)\right]$.

16 Unconditional Mixture. Maximizing the expected complete log likelihood by setting its derivatives to zero recovers exactly the same update equations for $\pi_k$, $\mu_k$, and $\Sigma_k$ as before.

17 Conditional Mixture. Graphical model: arrows from X to Z, from X to Y, and from Z to Y. The latent variable Z is a multinomial node taking on one of K values. The model is used for regression and classification. The relationship between X and Z can be modeled in a discriminative classification way, e.g. with a softmax function.

18 Conditional Mixture. Marginalizing over Z (X is taken to be always observed) gives $p(y \mid x, \theta) = \sum_{k=1}^{K} p(z^k = 1 \mid x, \xi)\, p(y \mid z^k = 1, x, \theta_k)$. The posterior probability is defined as $\tau^k(x, y) = \dfrac{p(z^k = 1 \mid x, \xi)\, p(y \mid z^k = 1, x, \theta_k)}{\sum_{j} p(z^j = 1 \mid x, \xi)\, p(y \mid z^j = 1, x, \theta_j)}$.

19 Conditional Mixture. Some specific choices of mixture components. Gaussian components (regression): $p(y \mid z^k = 1, x, \theta_k) = \mathcal{N}(y \mid \beta_k^{\mathsf T} x, \sigma_k^2)$. Logistic components (classification): $p(y \mid z^k = 1, x, \theta_k) = \mu(\beta_k^{\mathsf T} x)^{y}\,\bigl(1 - \mu(\beta_k^{\mathsf T} x)\bigr)^{1 - y}$, where $\mu(z) = 1/(1 + e^{-z})$ is the logistic function.

20 Conditional Mixture. Parameter estimation via EM. Complete log likelihood: $\ell_c(\theta; D) = \sum_{n}\sum_{k} z_n^k \left[\log p(z_n^k = 1 \mid x_n, \xi) + \log p(y_n \mid z_n^k = 1, x_n, \theta_k)\right]$. Using the posterior expectation $E[z_n^k \mid x_n, y_n, \theta] = \tau_n^k$ as the "best guess" for $z_n^k$, we replace $z_n^k$ by $\tau_n^k$.

21 Conditional Mixture. The expected complete log likelihood can then be written as $\langle \ell_c \rangle = \sum_{n}\sum_{k} \tau_n^k \left[\log p(z_n^k = 1 \mid x_n, \xi) + \log p(y_n \mid z_n^k = 1, x_n, \theta_k)\right]$. Taking partial derivatives and setting them to zero yields the update formulas for EM.

22 Conditional Mixture. Summary of the EM algorithm for the conditional mixture (a sketch follows below). (E step): calculate the posterior probabilities $\tau_n^k$. (M step): use the IRLS algorithm to update the gating parameters $\xi$, based on the data pairs $(x_n, \tau_n)$. (M step): use the weighted IRLS algorithm to update the component parameters $\theta_k$, based on the data pairs $(x_n, y_n)$, with weights $\tau_n^k$.
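
A rough sketch of this EM loop, for a mixture of linear-Gaussian experts with a softmax gate, is shown below. It simplifies what the slide describes: the gating update uses a few gradient steps in place of IRLS, and the expert update uses weighted least squares (the Gaussian-component analogue of weighted IRLS). All names (moe_em, gate_lr, gate_steps) are illustrative.

```python
import numpy as np

def moe_em(X, y, K, n_iters=50, gate_lr=0.1, gate_steps=20, seed=0):
    """EM for a mixture of linear-Gaussian experts with a softmax gate."""
    X = np.asarray(X, dtype=float); y = np.asarray(y, dtype=float)
    N, D = X.shape
    rng = np.random.default_rng(seed)
    V = rng.normal(scale=0.1, size=(K, D))      # gating parameters (softmax over V @ x)
    beta = rng.normal(scale=0.1, size=(K, D))   # expert regression weights
    sigma2 = np.ones(K)                         # expert noise variances
    for _ in range(n_iters):
        # Gate probabilities p(z = k | x) via softmax, and expert likelihoods p(y | x, z = k).
        logits = X @ V.T
        gate = np.exp(logits - logits.max(axis=1, keepdims=True))
        gate /= gate.sum(axis=1, keepdims=True)
        lik = np.column_stack([
            np.exp(-(y - X @ beta[k]) ** 2 / (2 * sigma2[k])) / np.sqrt(2 * np.pi * sigma2[k])
            for k in range(K)])
        # E step: posterior responsibilities tau[n, k] proportional to gate * likelihood.
        tau = gate * lik
        tau /= tau.sum(axis=1, keepdims=True)
        # M step (experts): weighted least squares with weights tau[:, k].
        for k in range(K):
            W = tau[:, k]
            A = X.T @ (W[:, None] * X) + 1e-6 * np.eye(D)
            beta[k] = np.linalg.solve(A, X.T @ (W * y))
            sigma2[k] = (W * (y - X @ beta[k]) ** 2).sum() / W.sum()
        # M step (gate): gradient ascent on sum_n sum_k tau log gate (stand-in for IRLS).
        for _step in range(gate_steps):
            logits = X @ V.T
            gate = np.exp(logits - logits.max(axis=1, keepdims=True))
            gate /= gate.sum(axis=1, keepdims=True)
            V += gate_lr * (tau - gate).T @ X / N
    return V, beta, sigma2
```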

23 General Formulation. Let $X$ denote all observable variables, $Z$ all latent variables, and $\theta$ all parameters. If $z$ were observed, the ML estimate would be $\hat{\theta} = \arg\max_\theta \log p(x, z \mid \theta)$. However, $z$ is in fact not observed. Complete log likelihood: $\ell_c(\theta; x, z) = \log p(x, z \mid \theta)$. Incomplete log likelihood: $\ell(\theta; x) = \log p(x \mid \theta) = \log \sum_z p(x, z \mid \theta)$.

24 General Formulation. Suppose $p(x, z \mid \theta)$ factors in some way; the complete log likelihood then decomposes into terms that are easy to maximize. Since $z$ is unknown, it is not clear how to carry out this ML estimation directly. However, we can average over the randomness of $z$ using a distribution $q(z \mid x)$.

25 General Formulation. Using $q(z \mid x)$ as an estimate of the distribution of $z$, the complete log likelihood becomes the expected complete log likelihood $\langle \ell_c(\theta; x, z) \rangle_q = \sum_z q(z \mid x) \log p(x, z \mid \theta)$. This expected complete log likelihood is solvable, and, hopefully, improving it also improves the incomplete log likelihood in some way. (This is the basic idea behind EM.)

26 General Formulation. EM maximizes the incomplete log likelihood via Jensen's inequality: $\ell(\theta; x) = \log \sum_z p(x, z \mid \theta) = \log \sum_z q(z \mid x)\,\dfrac{p(x, z \mid \theta)}{q(z \mid x)} \ge \sum_z q(z \mid x) \log \dfrac{p(x, z \mid \theta)}{q(z \mid x)} \equiv \mathcal{L}(q, \theta)$. The lower bound $\mathcal{L}(q, \theta)$ is called the auxiliary function; a small numerical check follows below.
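
As a quick sanity check of this bound, the snippet below evaluates the incomplete log likelihood and the auxiliary function for a single observation under a toy two-component 1-D Gaussian mixture and several arbitrary choices of q; all numbers are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
x = 1.3                                    # a single observation
pi = np.array([0.4, 0.6])                  # mixing proportions
mu = np.array([-1.0, 2.0]); var = np.array([1.0, 0.5])

def normal_pdf(x, m, v):
    return np.exp(-(x - m) ** 2 / (2 * v)) / np.sqrt(2 * np.pi * v)

joint = pi * normal_pdf(x, mu, var)        # p(x, z = k | theta) for k = 1, 2
incomplete = np.log(joint.sum())           # l(theta; x) = log sum_z p(x, z | theta)

for _ in range(5):
    q = rng.dirichlet(np.ones(2))          # an arbitrary distribution q(z | x)
    auxiliary = (q * np.log(joint / q)).sum()   # L(q, theta)
    assert auxiliary <= incomplete + 1e-12      # Jensen: auxiliary never exceeds l

# Equality holds when q is the posterior p(z | x, theta):
posterior = joint / joint.sum()
assert np.isclose((posterior * np.log(joint / posterior)).sum(), incomplete)
```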

27 General Formulation. Given $q$, maximizing $\mathcal{L}(q, \theta)$ over $\theta$ is equal to maximizing the expected complete log likelihood, since $\mathcal{L}(q, \theta) = \sum_z q(z \mid x) \log p(x, z \mid \theta) - \sum_z q(z \mid x) \log q(z \mid x)$ and the second (entropy) term does not depend on $\theta$.

28 General Formulation. Given $\theta^{(t)}$, the choice $q^{(t+1)}(z \mid x) = p(z \mid x, \theta^{(t)})$ yields the maximum of $\mathcal{L}(q, \theta^{(t)})$. Note: $\ell(\theta^{(t)}; x)$ is the upper bound of $\mathcal{L}(q, \theta^{(t)})$.

29 General Formulation. From the above, at every step of EM we maximize $\mathcal{L}(q, \theta)$. However, how do we know that maximizing $\mathcal{L}(q, \theta)$ also maximizes the incomplete log likelihood $\ell(\theta; x)$?

30 General Formulation. The difference between $\ell(\theta; x)$ and $\mathcal{L}(q, \theta)$ is the KL divergence $D\bigl(q(z \mid x)\,\|\,p(z \mid x, \theta)\bigr)$, which is non-negative and uniquely minimized (at zero) when $q(z \mid x) = p(z \mid x, \theta)$; the decomposition is written out below.
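
Writing the difference out from the definitions above gives the standard decomposition:

```latex
\begin{align*}
\mathcal{L}(q,\theta)
  &= \sum_z q(z \mid x)\,\log \frac{p(x, z \mid \theta)}{q(z \mid x)}
   = \sum_z q(z \mid x)\,\log \frac{p(z \mid x, \theta)\,p(x \mid \theta)}{q(z \mid x)} \\
  &= \log p(x \mid \theta) - \sum_z q(z \mid x)\,\log \frac{q(z \mid x)}{p(z \mid x, \theta)}
   = \ell(\theta; x) - D\!\bigl(q(z \mid x)\,\|\,p(z \mid x, \theta)\bigr).
\end{align*}
```

Since $D(q\,\|\,p) \ge 0$, with equality if and only if $q(z \mid x) = p(z \mid x, \theta)$, the auxiliary function never exceeds $\ell(\theta; x)$, and the E step closes the gap exactly.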

31 General Formulation: EM and alternating minimization. Recall that maximizing the likelihood is exactly the same as minimizing the KL divergence between the empirical distribution and the model. Including the latent variable, the KL divergence becomes a "complete KL divergence" between joint distributions on $(x, z)$.

32 General Formulation

33 General Formulation. Reformulated EM algorithm as alternating minimization of the complete KL divergence. (E step): with $\theta^{(t)}$ fixed, minimize over $q$, which gives $q^{(t+1)}(z \mid x) = p(z \mid x, \theta^{(t)})$. (M step): with $q^{(t+1)}$ fixed, minimize over $\theta$, which is equivalent to maximizing the expected complete log likelihood.

34 Summary. Unconditional mixture: graphical model and EM algorithm. Conditional mixture: graphical model and EM algorithm. A general formulation of the EM algorithm: maximizing the auxiliary function; minimizing the "complete KL divergence".

35 Incomplete Graphical Models Thank You!

