Published by Arleen Hood. Modified over 6 years ago.
1
Data Spectroscopy: Learning Mixture Models with Eigenspaces of Probability Distributions
Mikhail Belkin, Dept. of Computer Science, Ohio State University
Joint work with Tao Shi (Ohio State University, Dept. of Statistics) and Bin Yu (University of California, Berkeley, Dept. of Statistics)
2
Spectral geometry
Classical question: can you hear the shape of a drum? You cannot hear the full shape, but you can hear dimension, area, etc. You can also hear homology groups.
3
Eigenfunctions
Form an orthogonal basis. Contain complete information about the manifold. Any function can be written as a Fourier series.
4
Can you hear the shape of a probability distribution?
5
Spectrum of a probability distribution p
Kernel K, probability distribution p. The spectrum is discrete, and the eigenfunctions form an orthogonal basis for L2(p).
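One way to make this spectrum concrete is to approximate the kernel operator by a kernel matrix built from a sample. The sketch below assumes a Gaussian kernel with bandwidth `omega` and a one-dimensional standard normal sample; the kernel choice, the function names, and the parameter values are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def gaussian_kernel_matrix(x, omega):
    """K[i, j] = exp(-(x_i - x_j)^2 / (2 * omega^2)) for a 1-D sample x."""
    diff = x[:, None] - x[None, :]
    return np.exp(-diff**2 / (2.0 * omega**2))

def empirical_spectrum(x, omega):
    """Eigenpairs of K / n, sorted by decreasing eigenvalue.

    K / n is the sample analogue of the operator
    f -> integral of K(x, y) f(y) p(y) dy.
    """
    n = len(x)
    vals, vecs = np.linalg.eigh(gaussian_kernel_matrix(x, omega) / n)
    order = np.argsort(vals)[::-1]
    return vals[order], vecs[:, order]

rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=1.0, size=500)   # assumed example data
vals, vecs = empirical_spectrum(x, omega=1.0)
print(vals[:5])   # leading eigenvalues decay quickly
```

Because the diagonal of the Gaussian kernel matrix is 1, the eigenvalues of `K / n` sum to 1, which is a quick sanity check on the computation.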
6
How to hear a Gaussian
P = N(m, s). Eigenvalues and eigenvectors are available in closed form.
Easy to estimate mean and variance.
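For reference, one standard closed form for the one-dimensional case is given here. It assumes a Gaussian kernel with bandwidth \omega and reads N(m, s) as mean m and standard deviation s; the kernel choice and the symbols \omega and \beta are assumptions introduced for this note, not notation confirmed by the slides. H_i denotes the Hermite polynomial of order i.

```latex
K(x, y) = \exp\!\left(-\frac{(x - y)^2}{2\omega^2}\right),
\qquad
\beta = \frac{2 s^2}{\omega^2},

\lambda_i = \sqrt{\frac{2}{1 + \beta + \sqrt{1 + 2\beta}}}
            \left(\frac{\beta}{1 + \beta + \sqrt{1 + 2\beta}}\right)^{i},
\qquad i = 0, 1, 2, \dots

\phi_i(x) \propto
  \exp\!\left(-\frac{(x - m)^2}{2 s^2}\cdot\frac{\sqrt{1 + 2\beta} - 1}{2}\right)
  H_i\!\left(\left(\tfrac{1}{4} + \tfrac{\beta}{2}\right)^{1/4}\frac{x - m}{s}\right)
```

Under these assumptions the top eigenvalue simplifies to \lambda_0 = 2 / (1 + \sqrt{1 + 2\beta}), so \beta (and hence s^2 = \beta\omega^2 / 2) can be read off from \lambda_0, and the top eigenfunction \phi_0 is a Gaussian bump centered at m.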
7
How to hear a Gaussian
Variance: recovered from the spectrum and estimated from data.
8
From data: mean
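A minimal sketch of how mean and variance might be read off a sample, assuming a Gaussian kernel with bandwidth `omega` and the standard closed form for its top eigenvalue under a Gaussian distribution, lambda_0 = 2 / (1 + sqrt(1 + 2*beta)) with beta = 2*sigma^2 / omega^2. The function name `hear_gaussian` and the weighted-average mean estimator are illustrative assumptions, not necessarily the estimators shown on the slides.

```python
import numpy as np

def hear_gaussian(x, omega):
    """Estimate (mean, variance) of a 1-D Gaussian sample from the top
    eigenpair of the Gaussian kernel matrix K / n (hypothetical helper)."""
    n = len(x)
    diff = x[:, None] - x[None, :]
    K = np.exp(-diff**2 / (2.0 * omega**2)) / n
    vals, vecs = np.linalg.eigh(K)          # eigenvalues in ascending order
    lam0 = vals[-1]                         # top eigenvalue
    v0 = np.abs(vecs[:, -1])                # top eigenvector, sign made nonnegative

    # Invert lambda_0 = 2 / (1 + s), where s = sqrt(1 + 2*beta).
    s = 2.0 / lam0 - 1.0
    beta = (s**2 - 1.0) / 2.0
    var_hat = beta * omega**2 / 2.0         # beta = 2 * sigma^2 / omega^2

    # The top eigenfunction is a bump centered at the mean, so a
    # v0-weighted average of the sample points estimates the mean.
    mean_hat = np.sum(x * v0) / np.sum(v0)
    return mean_hat, var_hat

rng = np.random.default_rng(1)
sample = rng.normal(loc=3.0, scale=2.0, size=1000)   # assumed example data
mean_hat, var_hat = hear_gaussian(sample, omega=2.0)
print(mean_hat, var_hat)   # should land near 3 and 4, up to sampling error
```

The variance estimate is fairly sensitive to small errors in the top eigenvalue, so it is best treated as a rough estimate or a starting point rather than a final answer.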
9
Top eigenfunction
The only eigenfunction with no sign change. Multiplicity one. Decays quickly away from the mean.
10
Classical problem: Gaussian Mixtures
Identifying the parameters of a Gaussian mixture distribution from a finite sample.
11
Gaussian Mixture Distributions
First considered by Pearson, 1894. Analyzed 1000 crabs from Naples. Concluded there were two distinct populations.
12
Expectation Maximization
EM is the most popular method. Sensitive to initialization. Does not detect the number of components.
13
How to hear a mixture
14
How to hear a mixture
Single component: mean and covariance can be expressed analytically.
Mixture of components: assuming enough separation, the spectra of the components do not interfere with each other (perturbation theory).
Estimation from data: construct kernel matrices.
Estimating the number of components: look for nearly positive eigenfunctions.
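The component-counting idea can be sketched as follows: build the kernel matrix, take its leading eigenvectors, and count those with no sign change once near-zero entries are ignored. The Gaussian kernel, the tolerance `max|v| / n`, the cutoff `n_top`, and the two-component example data are all assumptions made for illustration, not details taken from the slides.

```python
import numpy as np

def nearly_positive_eigenvectors(x, omega, n_top=6):
    """Return indices (among the n_top leading eigenvectors of K / n)
    of eigenvectors with no sign change above a small tolerance."""
    n = len(x)
    diff = x[:, None] - x[None, :]
    K = np.exp(-diff**2 / (2.0 * omega**2)) / n
    vals, vecs = np.linalg.eigh(K)            # ascending eigenvalues
    top = vecs[:, ::-1][:, :n_top]            # columns by decreasing eigenvalue

    keep = []
    for j in range(top.shape[1]):
        v = top[:, j]
        eps = np.max(np.abs(v)) / n           # assumed tolerance for "nearly zero"
        big = v[np.abs(v) > eps]              # entries that count as nonzero
        if np.all(big > 0) or np.all(big < 0):
            keep.append(j)
    return keep, top

# Assumed example: two well-separated components with unequal weights.
rng = np.random.default_rng(2)
x = np.concatenate([rng.normal(-5.0, 1.0, size=240),
                    rng.normal(5.0, 1.0, size=160)])
keep, top = nearly_positive_eigenvectors(x, omega=1.0)
print(len(keep))                              # estimated number of components
centers = [x[np.argmax(np.abs(top[:, j]))] for j in keep]
print(centers)                                # rough component locations
```

Each nearly positive eigenvector is concentrated on one component, so the sample point where it peaks gives a rough location for that component, which is the kind of estimate that could seed EM.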
15
Perturbation theory
16
Example
17
Data Spectroscopy
A new method for estimating Gaussian mixture distributions. Allows estimating the number of components. Can serve as initialization for EM. Raises some interesting theoretical questions and extensions, particularly for clustering.