Scalable Training of Mixture Models via Coresets. Daniel Feldman, Matthew Faulkner, Andreas Krause. MIT.

1 Scalable Training of Mixture Models via Coresets. Daniel Feldman, Matthew Faulkner, Andreas Krause. MIT

2 Fitting Mixtures to Massive Data: EM on the full data, generally expensive. Importance sample, then weighted EM: fast!
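A minimal sketch of what "weighted EM" can look like, assuming a 1-D Gaussian mixture and a data set of (value, weight) pairs. The function name weighted_em_1d, the initial means, and the toy data are hypothetical and are not taken from the slides. The only change from plain EM is that every sum in the M-step is scaled by the point's weight.

```python
import math

def weighted_em_1d(data, init_means, iters=50, min_var=1e-6):
    """EM for a 1-D Gaussian mixture fitted to weighted points (x, w)."""
    k = len(init_means)
    means = list(init_means)
    variances = [1.0] * k
    mix = [1.0 / k] * k
    total_w = sum(w for _, w in data)
    for _ in range(iters):
        # E-step: responsibilities, computed in log space for stability.
        resp = []
        for x, _ in data:
            logs = [
                math.log(mix[j])
                - 0.5 * math.log(2 * math.pi * variances[j])
                - (x - means[j]) ** 2 / (2 * variances[j])
                for j in range(k)
            ]
            top = max(logs)
            ps = [math.exp(v - top) for v in logs]
            s = sum(ps)
            resp.append([p / s for p in ps])
        # M-step: every sufficient statistic is scaled by the point's weight.
        for j in range(k):
            nj = max(sum(w * r[j] for (_, w), r in zip(data, resp)), 1e-12)
            means[j] = sum(w * r[j] * x for (x, w), r in zip(data, resp)) / nj
            var = sum(w * r[j] * (x - means[j]) ** 2
                      for (x, w), r in zip(data, resp)) / nj
            variances[j] = max(var, min_var)
            mix[j] = nj / total_w
    return mix, means, variances

# Toy weighted sample: three heavy points near 0, three light points near 10.
data = [(-1.0, 10.0), (0.0, 10.0), (1.0, 10.0),
        (9.0, 1.0), (10.0, 1.0), (11.0, 1.0)]
mix, means, variances = weighted_em_1d(data, init_means=[0.0, 10.0])
```

Because the weights only enter as multipliers, running this on a small weighted sample costs time proportional to the sample size rather than to the size of the full data set.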

3 Coresets for Mixture Models

4 Naïve Uniform Sampling

5 Sample a set U of m points uniformly: high variance. Small cluster is missed.
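A back-of-the-envelope check of why uniform sampling misses small clusters, assuming the m draws are independent and uniform (sampling with replacement). The function name and the numbers below are illustrative only.

```python
def miss_probability(cluster_size, n, m):
    """Chance that m independent uniform draws from n points all avoid
    a cluster that contains cluster_size of those points."""
    return (1.0 - cluster_size / n) ** m

# A cluster holding 0.1% of the data, and a uniform sample of 500 points.
p_miss = miss_probability(cluster_size=10, n=10_000, m=500)
```

With these illustrative numbers the cluster is more likely than not to be absent from the sample, which is the failure mode the slide points at.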

6 Sampling Distribution: Bias sampling towards small clusters

7 Importance Weights: Sampling distribution and weights
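The formulas on slides 6 and 7 did not survive extraction. The sketch below assumes the standard importance-sampling convention: draw m points with probability q(x) and give each drawn point the weight 1 / (m q(x)), so that weighted sums over the sample are unbiased estimates of sums over the full data. The function name, the distribution q, and the toy data are hypothetical.

```python
import random

def importance_sample(points, q, m, rng):
    """Draw m points i.i.d. from q and attach the weight 1 / (m * q_i)."""
    idx = rng.choices(range(len(points)), weights=q, k=m)
    return [(points[i], 1.0 / (m * q[i])) for i in idx]

# Toy data: a large cluster at 0 and a small cluster at 100.
points = [0.0] * 990 + [100.0] * 10
# Hypothetical sampling distribution: half of the mass on the small cluster.
q = [0.5 / 990] * 990 + [0.5 / 10] * 10

rng = random.Random(0)
sample = importance_sample(points, q, m=200, rng=rng)

# The weighted sample sum estimates the full-data sum.
estimate = sum(w * x for x, w in sample)
full_sum = sum(points)
```

Points that are oversampled receive small weights and points that are undersampled receive large ones, which is what keeps the biased sample from distorting the fitted model.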

8 Creating a Sampling Distribution: Iteratively find representative points

9 Creating a Sampling Distribution: Iteratively find representative points. Sample a small set uniformly at random.

10 Creating a Sampling Distribution: Iteratively find representative points. Sample a small set uniformly at random. Remove half the blue points nearest the samples.

16 Creating a Sampling Distribution: Iteratively find representative points. Sample a small set uniformly at random. Remove half the blue points nearest the samples. Small clusters are represented.

17 Creating a Sampling Distribution: Partition data via a Voronoi diagram centered at points

18 Creating a Sampling Distribution: Sampling distribution. Points in sparse cells, and points far from centers, get more mass.

19 Importance Weights: Sampling distribution and weights. Points in sparse cells, and points far from centers, get more mass.
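The steps on the "Creating a Sampling Distribution" slides can be sketched as follows. This is a sketch under stated assumptions, not the authors' exact construction: the batch size, the stopping rule, and the specific form q(x) proportional to d(x)^2 / (sum of d^2) + 1 / |cell(x)| are assumptions chosen to match the slide's description (more mass for points in sparse cells and for points far from their center). All function names are hypothetical.

```python
import random

def dist2(p, c):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, c))

def nearest(p, centers):
    """(squared distance, index) of the center closest to p."""
    return min((dist2(p, c), j) for j, c in enumerate(centers))

def representative_points(points, batch, rng):
    """Repeatedly sample a small uniform batch, then drop the half of the
    remaining points that lies nearest to that batch."""
    remaining = list(points)
    centers = []
    while len(remaining) > 2 * batch:
        new = rng.sample(remaining, batch)
        centers.extend(new)
        remaining.sort(key=lambda p: nearest(p, new)[0])
        remaining = remaining[len(remaining) // 2:]  # keep the far half
    centers.extend(remaining)  # the leftovers become centers too
    return centers

def sampling_distribution(points, centers):
    """Assumed form: q(x) ~ d(x)^2 / sum(d^2) + 1 / |Voronoi cell of x|."""
    assign = [nearest(p, centers) for p in points]
    cell_size = {}
    for _, j in assign:
        cell_size[j] = cell_size.get(j, 0) + 1
    total_d2 = sum(d2 for d2, _ in assign) or 1.0
    raw = [d2 / total_d2 + 1.0 / cell_size[j] for d2, j in assign]
    z = sum(raw)
    return [r / z for r in raw]

# Toy data: 2000 points around the origin, 5 points around (50, 50).
rng = random.Random(1)
big = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2000)]
small = [(50 + rng.gauss(0, 1), 50 + rng.gauss(0, 1)) for _ in range(5)]
points = big + small

centers = representative_points(points, batch=10, rng=rng)
q = sampling_distribution(points, centers)
small_mass = sum(q[-5:])  # probability mass placed on the small cluster
```

Far-away points survive every halving round, so the small cluster ends up with at least one center of its own. Its cell is sparse, so it receives a much larger share of the sampling mass than the 5 / 2005 it would get under uniform sampling.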

20 Importance Sample

21 Coresets via Adaptive Sampling

22 A General Coreset Framework: Contributions for Mixture Models

23 A Geometric Perspective: Gaussian level sets can be expressed purely geometrically: affine subspace

24 Geometric Reduction: Lifts geometric coreset tools to mixture models. Soft-min
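The slide's own formula for the "soft-min" was lost in extraction. One standard way to read the label, written here for spherical components and in our own notation (the symbols w_i, mu_i, sigma_i, phi_i, and the dimension d are assumptions, not taken from the slides), is that the negative log-likelihood of a point under a mixture is a soft minimum over per-component costs, each of which is a scaled squared distance plus a constant:

```latex
-\ln \sum_{i=1}^{k} w_i \,\mathcal{N}\!\left(x;\, \mu_i,\, \sigma_i^2 I\right)
  = -\ln \sum_{i=1}^{k} \exp\!\bigl(-\phi_i(x)\bigr),
\qquad
\phi_i(x) = \frac{\lVert x - \mu_i \rVert^2}{2\sigma_i^2}
          + \frac{d}{2}\ln\!\left(2\pi\sigma_i^2\right) - \ln w_i ,

\min_i \phi_i(x) - \ln k
  \;\le\; -\ln \sum_{i=1}^{k} \exp\!\bigl(-\phi_i(x)\bigr)
  \;\le\; \min_i \phi_i(x).
```

The second line shows that the soft-min stays within ln k of the hard minimum over the per-component costs, which is the kind of quantity that distance-based coreset tools for clustering work with.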

25 Semi-Spherical Gaussian Mixtures

26 Extensions and Generalizations: Level Sets

27 Composition of Coresets: Merge [cf. Har-Peled, Mazumdar 04]

28 Composition of Coresets: Merge, Compress [Har-Peled, Mazumdar 04]

29 Coresets on Streams: Merge, Compress [Har-Peled, Mazumdar 04]

30 Coresets on Streams: Merge, Compress [Har-Peled, Mazumdar 04]

31 Coresets on Streams: Merge, Compress [Har-Peled, Mazumdar 04]. Error grows linearly with number of compressions

32 Coresets on Streams: Error grows with height of tree
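The merge-and-compress scheme on the streaming slides can be sketched as a binary-counter tree. This is a sketch only: the class name StreamingCoreset is hypothetical, and compress() is a stand-in (a weighted subsample that preserves total weight), not the coreset construction from the talk. The point is the bookkeeping: summaries are merged only with summaries of the same level, so each input point passes through a number of compressions equal to the tree height rather than to the number of blocks seen.

```python
import random

def compress(weighted, size, rng):
    """Stand-in compressor: weighted subsample that preserves total weight.
    A real coreset construction would be plugged in here."""
    if len(weighted) <= size:
        return list(weighted)
    total = sum(w for _, w in weighted)
    picks = rng.choices(weighted, weights=[w for _, w in weighted], k=size)
    return [(x, total / size) for x, _ in picks]

class StreamingCoreset:
    def __init__(self, size, rng):
        self.size = size
        self.rng = rng
        self.levels = {}   # tree level -> one compressed summary
        self.buffer = []   # raw points not yet forming a full block

    def add(self, x):
        self.buffer.append((x, 1.0))
        if len(self.buffer) == self.size:
            self._push(self.buffer, 0)
            self.buffer = []

    def _push(self, summary, level):
        # Like carrying in a binary counter: two summaries at the same level
        # are merged and compressed into one summary at the next level.
        while level in self.levels:
            merged = self.levels.pop(level) + summary
            summary = compress(merged, self.size, self.rng)
            level += 1
        self.levels[level] = summary

    def result(self):
        out = list(self.buffer)
        for summary in self.levels.values():
            out.extend(summary)
        return out

rng = random.Random(0)
stream = StreamingCoreset(size=50, rng=rng)
for i in range(1000):
    stream.add(float(i))
summary = stream.result()
```

After 1000 points in blocks of 50 (20 blocks), only two summaries are held in memory, and no point has been compressed more than four times.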

33 Coresets in Parallel

34 Handwritten Digits: MNIST data: 60,000 training, 10,000 testing. Obtain 100-dimensional features from 28x28 pixel images via PCA. Fit GMM with k=10 components.

35 Neural Tetrode Recordings: Waveforms of neural activity at four co-located electrodes in a live rat hippocampus. 4 x 38 samples = 152 dimensions. T. Siapas et al., Caltech

36 Community Seismic Network: Detect and monitor earthquakes using smart phones, USB sensors, and cloud computing. CSN Sensors Worldwide

37 Learning User Acceleration: 17-dimensional acceleration feature vectors. Plot labels: Bad, Good

38 Seismic Anomaly Detection: GMM used for anomaly detection. Plot labels: Bad, Good
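One common way to use a fitted GMM for anomaly detection is to flag observations whose log-density under the model falls below a threshold. The slides do not say which rule was used, so the sketch below is an assumption. It is also 1-D for brevity (the slides use 17-dimensional features), and the model parameters and threshold are made up for illustration.

```python
import math

def gmm_log_density(x, weights, means, variances):
    """Log density of a 1-D Gaussian mixture, via log-sum-exp for stability."""
    terms = [
        math.log(w) - 0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))

def is_anomaly(x, model, threshold):
    """Flag x when the model assigns it a log-density below the threshold."""
    weights, means, variances = model
    return gmm_log_density(x, weights, means, variances) < threshold

# Hypothetical model of "normal" readings: two modes, at 0 and at 5.
model = ([0.7, 0.3], [0.0, 5.0], [1.0, 1.0])
typical = is_anomaly(0.2, model, threshold=-10.0)
unusual = is_anomaly(25.0, model, threshold=-10.0)
```

In practice the threshold would be chosen from held-out normal data, for example as a low quantile of the log-densities the model assigns to it.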

39 Conclusions: Lift geometric coreset tools to the statistical realm (new complexity result for GMM level sets). Parallel (MapReduce) and streaming implementations. Strong empirical performance, enables learning on mobile devices. GMMs admit coresets of size independent of n (extensions for other mixture models).

