Slide 1: Sparse Approximate Gaussian Processes
Slide 2: Outline
- Introduction to GPs
- Subset of Data
- Bayesian Committee Machine
- Subset of Regressors
- Sparse Pseudo GPs / FITC
- Partially Independent Training Conditional
- Sparse Spectrum Gaussian Processes
Slide 3: Introduction to GPs
- Data from a real process
- Distribution over function values
Slide 4: Introduction to GPs
Prior: $f \sim \mathcal{N}(0, K)$, where $K_{ij} = k(x_i, x_j)$ for a covariance function $k$.
Likelihood: $y \mid f \sim \mathcal{N}(f, \sigma_n^2 I)$.
Slide 5: Introduction to GPs
Posterior: $p(f \mid y) \propto p(y \mid f)\, p(f)$, giving
$f \mid y \sim \mathcal{N}\!\big(K (K + \sigma_n^2 I)^{-1} y,\; K - K (K + \sigma_n^2 I)^{-1} K\big)$.
Slide 6: Introduction to GPs
Test point posterior at $x_*$:
$\mu_* = k_*^\top (K + \sigma_n^2 I)^{-1} y$
$\sigma_*^2 = k(x_*, x_*) - k_*^\top (K + \sigma_n^2 I)^{-1} k_* + \sigma_n^2$
where $k_* = [k(x_*, x_1), \ldots, k(x_*, x_N)]^\top$.
Slide 7: Introduction to GPs – Computational Requirements
Training (maximise the marginal likelihood of the training set): Memory O(N²), Computation O(N³)
Prediction: Memory O(N²), Computation O(N²)
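To make these costs concrete, here is a minimal NumPy sketch of exact GP regression with a squared-exponential kernel; the function names and hyperparameter defaults are illustrative, not from the slides. The Cholesky factorisation of the N × N covariance matrix is the O(N³) training step listed above.

```python
import numpy as np

def rbf(X1, X2, ell=1.0, sf2=1.0):
    """Squared-exponential kernel: sf2 * exp(-|x - x'|^2 / (2 ell^2))."""
    d2 = np.sum(X1**2, 1)[:, None] + np.sum(X2**2, 1)[None, :] - 2 * X1 @ X2.T
    return sf2 * np.exp(-0.5 * d2 / ell**2)

def gp_predict(X, y, Xs, noise=0.1):
    """Exact GP posterior mean/variance at test points Xs."""
    K = rbf(X, X) + noise**2 * np.eye(len(X))  # N x N: O(N^2) memory
    L = np.linalg.cholesky(K)                  # O(N^3) computation
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    Ks = rbf(X, Xs)                            # N x N* cross-covariances
    mu = Ks.T @ alpha                          # predictive mean
    v = np.linalg.solve(L, Ks)
    var = np.diag(rbf(Xs, Xs)) - np.sum(v**2, 0) + noise**2  # predictive variance
    return mu, var
```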
Slide 8: Subset of Data
Choose m points from the training set:
- Randomly
- By maximising a differential entropy score [Lawrence et al. 2003]: the informative vector machine
- By information gain [Seeger et al. 2003], based on KL divergence
- By minimising test set error / maximising test set marginal likelihood
Computation then scales with m rather than N (see the sketch below).
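A minimal sketch of the random variant, reusing gp_predict from the exact-GP sketch above; the subset size m and the seed are arbitrary illustrative choices:

```python
import numpy as np

# Subset of Data: run the exact GP on m randomly chosen training points,
# so all costs scale with m instead of N.
def sod_predict(X, y, Xs, m=100, noise=0.1, seed=0):
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=min(m, len(X)), replace=False)
    return gp_predict(X[idx], y[idx], Xs, noise=noise)
```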
Slide 9: Bayesian Committee Machine
Partition the training data into subsets $D_1, D_2, \ldots, D_k$, train a GP on each, and combine the predictive distributions by precision weighting:
$\sigma_{BCM}^{-2}(x_*) = \sum_{i=1}^{k} \sigma_i^{-2}(x_*) - (k - 1)\, \sigma_{prior}^{-2}(x_*)$
$\mu_{BCM}(x_*) = \sigma_{BCM}^{2}(x_*) \sum_{i=1}^{k} \sigma_i^{-2}(x_*)\, \mu_i(x_*)$
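A sketch of this combination rule, reusing gp_predict from the earlier sketch. It assumes every committee member shares the same kernel and noise level, and that the returned variances include the noise term; both assumptions are mine, not from the slides.

```python
import numpy as np

# BCM: fit a GP per partition and fuse the predictions by precision
# weighting, subtracting the prior precision counted k-1 extra times.
def bcm_predict(parts, Xs, noise=0.1, sf2=1.0):
    """parts: list of (X_i, y_i) partitions of the training set."""
    prior_var = sf2 + noise**2  # prior predictive variance (assumes k(x,x) = sf2)
    mus, vars_ = zip(*(gp_predict(Xi, yi, Xs, noise) for Xi, yi in parts))
    prec = sum(1.0 / v for v in vars_) - (len(parts) - 1) / prior_var
    var = 1.0 / prec
    mu = var * sum(m / v for m, v in zip(mus, vars_))
    return mu, var
```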
Slide 10: Subset of Regressors
Full GP: $f(x_*) = \sum_{i=1}^{N} \alpha_i\, k(x_*, x_i)$
SR GP: $f(x_*) = \sum_{m=1}^{M} \alpha_m\, k(x_*, x_m)$, using only a subset of M regressors
Likelihood of $\alpha$: $y \mid \alpha \sim \mathcal{N}(K_{NM}\, \alpha,\; \sigma_n^2 I)$
Prior on $\alpha$: $\alpha \sim \mathcal{N}(0, K_{MM}^{-1})$
Posterior on $\alpha$: $\alpha \mid y \sim \mathcal{N}\!\big(\Sigma\, K_{MN}\, y / \sigma_n^2,\; \Sigma\big)$, with $\Sigma = \big(K_{MN} K_{NM} / \sigma_n^2 + K_{MM}\big)^{-1}$
Slide 11: Subset of Regressors
Effective covariance function: $k_{SR}(x, x') = k_M(x)^\top K_{MM}^{-1}\, k_M(x')$
Marginal likelihood: $y \sim \mathcal{N}\!\big(0,\; K_{NM} K_{MM}^{-1} K_{MN} + \sigma_n^2 I\big)$
Training: Memory O(NM), Computation O(NM²)
Prediction: Memory O(M²), Computation O(M²)
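A sketch of SR prediction under the equations above, reusing the rbf helper from the exact-GP sketch. The jitter value and the extra observation-noise term in the predictive variance are my additions.

```python
import numpy as np

# Subset of Regressors: Bayesian linear regression on M basis functions
# k(., x_m) with prior alpha ~ N(0, Kmm^{-1}).
def sr_predict(X, y, Xm, Xs, noise=0.1):
    Kmm = rbf(Xm, Xm) + 1e-6 * np.eye(len(Xm))  # jitter for stability
    Kmn = rbf(Xm, X)                            # M x N
    A = Kmn @ Kmn.T + noise**2 * Kmm            # M x M: O(NM^2) to build
    alpha = np.linalg.solve(A, Kmn @ y)         # posterior mean of the weights
    Ksm = rbf(Xs, Xm)                           # N* x M
    mu = Ksm @ alpha
    var = noise**2 * np.sum(Ksm * np.linalg.solve(A, Ksm.T).T, 1) + noise**2
    return mu, var
```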
Slide 12: Sparse Pseudo-Input GPs, a.k.a. Fully Independent Training Conditional
Define a pseudo data set $\bar{D} = \{\bar{x}_m, \bar{f}_m\}_{m=1}^{M}$ of pseudo-inputs and pseudo-targets.
Probability of an observation under a GP conditioned on the pseudo data:
$p(y_n \mid x_n, \bar{X}, \bar{f}) = \mathcal{N}\!\big(k_n^\top K_{MM}^{-1} \bar{f},\; k(x_n, x_n) - k_n^\top K_{MM}^{-1} k_n + \sigma_n^2\big)$, with $k_n = k_M(x_n)$.
Slide 13: Sparse Pseudo-Input GPs, a.k.a. Fully Independent Training Conditional
Prior on pseudo-targets: $\bar{f} \sim \mathcal{N}(0, K_{MM})$.
Integrating out $\bar{f}$ gives the predictive posterior at $x_*$:
$\mu_* = k_*^\top Q_M^{-1} K_{MN} (\Lambda + \sigma_n^2 I)^{-1} y$
$\sigma_*^2 = k(x_*, x_*) - k_*^\top \big(K_{MM}^{-1} - Q_M^{-1}\big) k_* + \sigma_n^2$
where $Q_M = K_{MM} + K_{MN} (\Lambda + \sigma_n^2 I)^{-1} K_{NM}$ and $\Lambda = \mathrm{diag}\big(K_{NN} - K_{NM} K_{MM}^{-1} K_{MN}\big)$.
The N × N inverse of the full GP is replaced by M × M inverses.
Slide 14: Sparse Pseudo-Input GPs, a.k.a. Fully Independent Training Conditional
Marginal likelihood: $y \sim \mathcal{N}\!\big(0,\; K_{NM} K_{MM}^{-1} K_{MN} + \Lambda + \sigma_n^2 I\big)$
Effective covariance function: $k_{FITC}(x, x') = k_M(x)^\top K_{MM}^{-1} k_M(x') + \delta_{xx'}\big[k(x, x) - k_M(x)^\top K_{MM}^{-1} k_M(x)\big]$
Training: Memory O(NM), Computation O(NM²)
Prediction: Memory O(M²), Computation O(M²)
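A sketch of FITC prediction under the equations above, reusing the rbf helper from the exact-GP sketch; the jitter term is an added stabiliser. Only M × M matrices are factored.

```python
import numpy as np

# FITC / SPGP prediction: all solves involve M x M matrices.
def fitc_predict(X, y, Xm, Xs, noise=0.1, sf2=1.0):
    Kmm = rbf(Xm, Xm, sf2=sf2) + 1e-6 * np.eye(len(Xm))
    Kmn = rbf(Xm, X, sf2=sf2)
    Qnn_diag = np.sum(Kmn * np.linalg.solve(Kmm, Kmn), 0)  # diag(Knm Kmm^-1 Kmn)
    Lam = sf2 - Qnn_diag + noise**2                        # FITC diagonal + noise
    B = Kmm + (Kmn / Lam) @ Kmn.T                          # M x M
    alpha = np.linalg.solve(B, Kmn @ (y / Lam))
    Ksm = rbf(Xs, Xm, sf2=sf2)
    mu = Ksm @ alpha
    var = (sf2
           - np.sum(Ksm * np.linalg.solve(Kmm, Ksm.T).T, 1)  # - k*^T Kmm^-1 k*
           + np.sum(Ksm * np.linalg.solve(B, Ksm.T).T, 1)    # + k*^T B^-1 k*
           + noise**2)
    return mu, var
```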
Slide 15: Partially Independent Training Conditional
Effective covariance function: as in FITC, but the correction $K_{NN} - Q_{NN}$ (with $Q_{NN} = K_{NM} K_{MM}^{-1} K_{MN}$) is kept block-diagonal rather than diagonal, one block per group of training points:
$K_{PITC} = Q_{NN} + \mathrm{blockdiag}\big(K_{NN} - Q_{NN}\big)$
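A sketch of PITC prediction; it differs from the FITC sketch only in that the correction is handled per block. The `blocks` argument (a list of index arrays partitioning the training set) is a hypothetical interface of my own, and the rbf helper is reused from the exact-GP sketch.

```python
import numpy as np

# PITC: block-diagonal correction instead of FITC's diagonal one.
def pitc_predict(X, y, Xm, Xs, blocks, noise=0.1, sf2=1.0):
    Kmm = rbf(Xm, Xm, sf2=sf2) + 1e-6 * np.eye(len(Xm))
    Kmn = rbf(Xm, X, sf2=sf2)
    KinvKmn = np.linalg.solve(Kmm, Kmn)
    # Accumulate Kmn Lam^-1 Knm and Kmn Lam^-1 y block by block.
    B = Kmm.copy()
    z = np.zeros(len(Xm))
    for b in blocks:                                  # b: indices of one block
        Kbb = rbf(X[b], X[b], sf2=sf2)
        Lb = Kbb - Kmn[:, b].T @ KinvKmn[:, b] + noise**2 * np.eye(len(b))
        LbinvKnm = np.linalg.solve(Lb, Kmn[:, b].T)   # |b| x M
        B += Kmn[:, b] @ LbinvKnm
        z += LbinvKnm.T @ y[b]
    alpha = np.linalg.solve(B, z)
    Ksm = rbf(Xs, Xm, sf2=sf2)
    mu = Ksm @ alpha
    var = (sf2 - np.sum(Ksm * np.linalg.solve(Kmm, Ksm.T).T, 1)
               + np.sum(Ksm * np.linalg.solve(B, Ksm.T).T, 1) + noise**2)
    return mu, var
```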
Slide 16: Sparse Spectrum GPs
[Lázaro-Gredilla et al., Sparse Spectrum Gaussian Process Regression, JMLR 2010]
Wiener-Khintchine theorem / Bochner's theorem: a stationary covariance function is the Fourier transform of a positive spectral density,
$k(x - x') = \int S(s)\, e^{2\pi i\, s^\top (x - x')}\, ds$
Sample pairs $\{s_r, -s_r\}$: a finite set of frequencies (symmetric pairs keep the covariance real).
Effective covariance function: $k_{SS}(x, x') = \frac{\sigma_0^2}{M} \sum_{r=1}^{M} \cos\!\big(2\pi\, s_r^\top (x - x')\big)$
Slide 17: Sparse Spectrum GPs
Trigonometric Bayesian regression: a linear model with 2M basis functions $\cos(2\pi s_r^\top x)$ and $\sin(2\pi s_r^\top x)$ and Gaussian priors $\mathcal{N}(0, \sigma_0^2 / M)$ on the weights.
Sparse spectrum: the frequencies $s_r$ are optimised together with the kernel hyperparameters rather than sampled from the spectral density.
Slide 18: Sparse Spectrum GPs
Marginal likelihood: $y \sim \mathcal{N}\!\big(0,\; \frac{\sigma_0^2}{M} \Phi \Phi^\top + \sigma_n^2 I\big)$, with $\Phi$ the N × 2M design matrix of trigonometric features.
Predictive moments: $\mu_* = \phi(x_*)^\top A^{-1} \Phi^\top y$ and $\sigma_*^2 = \sigma_n^2 \big(1 + \phi(x_*)^\top A^{-1} \phi(x_*)\big)$, where $A = \Phi^\top \Phi + \frac{M \sigma_n^2}{\sigma_0^2} I$ is 2M × 2M.
Training: Memory O(NM), Computation O(NM²)
Prediction: Memory O(M²), Computation O(M²)
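A sketch of SSGP prediction as trigonometric Bayesian linear regression, following the predictive moments above; the function names and defaults are illustrative.

```python
import numpy as np

# Sparse Spectrum GP: 2M trig features, weight prior variance s02 / M.
def ssgp_predict(X, y, Xs, S, s02=1.0, noise=0.1):
    """S: M x D matrix of sampled/optimised frequencies s_r."""
    def phi(Z):  # N x 2M design matrix of cos/sin features
        P = 2 * np.pi * Z @ S.T
        return np.hstack([np.cos(P), np.sin(P)])
    M = S.shape[0]
    Phi = phi(X)
    A = Phi.T @ Phi + (M * noise**2 / s02) * np.eye(2 * M)  # A is 2M x 2M
    w = np.linalg.solve(A, Phi.T @ y)                       # posterior weight mean
    Ps = phi(Xs)
    mu = Ps @ w
    var = noise**2 * (1 + np.sum(Ps * np.linalg.solve(A, Ps.T).T, 1))
    return mu, var
```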