Lecture 09: Gaussian Processes

Slides:

Advertisements

Similar presentations

Bayesian Learning & Estimation Theory

Advertisements

ECE 8443 – Pattern Recognition LECTURE 05: MAXIMUM LIKELIHOOD ESTIMATION Objectives: Discrete Features Maximum Likelihood Resources: D.H.S: Chapter 3 (Part.

CS Statistical Machine learning Lecture 13 Yuan (Alan) Qi Purdue CS Oct

Pattern Recognition and Machine Learning

Machine Learning CMPT 726 Simon Fraser University CHAPTER 1: INTRODUCTION.

Descriptive statistics Experiment  Data  Sample Statistics Experiment  Data  Sample Statistics Sample mean Sample mean Sample variance Sample variance.

Presenting: Assaf Tzabari

Visual Recognition Tutorial1 Random variables, distributions, and probability density functions Discrete Random Variables Continuous Random Variables.

Visual Recognition Tutorial

PATTERN RECOGNITION AND MACHINE LEARNING

ECE 8443 – Pattern Recognition LECTURE 06: MAXIMUM LIKELIHOOD AND BAYESIAN ESTIMATION Objectives: Bias in ML Estimates Bayesian Estimation Example Resources:

Bayesian Inference Ekaterina Lomakina TNU seminar: Bayesian inference 1 March 2013.

ECE 8443 – Pattern Recognition ECE 8423 – Adaptive Signal Processing Objectives: Deterministic vs. Random Maximum A Posteriori Maximum Likelihood Minimum.

ECE 8443 – Pattern Recognition LECTURE 07: MAXIMUM LIKELIHOOD AND BAYESIAN ESTIMATION Objectives: Class-Conditional Density The Multivariate Case General.

Ch 2. Probability Distributions (1/2) Pattern Recognition and Machine Learning, C. M. Bishop, Summarized by Yung-Kyun Noh and Joo-kyung Kim Biointelligence.

Learning Theory Reza Shadmehr Linear and quadratic decision boundaries Kernel estimates of density Missing data.

ECE 8443 – Pattern Recognition ECE 8527 – Introduction to Machine Learning and Pattern Recognition LECTURE 07: BAYESIAN ESTIMATION (Cont.) Objectives:

Gaussian Processes Li An Li An

Sparse Kernel Methods 1 Sparse Kernel Methods for Classification and Regression October 17, 2007 Kyungchul Park SKKU.

Lecture 2: Statistical learning primer for biologists

Gaussian Processes For Regression, Classification, and Prediction.

CS Statistical Machine learning Lecture 12 Yuan (Alan) Qi Purdue CS Oct

Intro. ANN & Fuzzy Systems Lecture 15. Pattern Classification (I): Statistical Formulation.

Sparse Approximate Gaussian Processes. Outline Introduction to GPs Subset of Data Bayesian Committee Machine Subset of Regressors Sparse Pseudo GPs /

Statistical NLP: Lecture 4 Mathematical Foundations I: Probability Theory (Ch2)

Ch 2. Probability Distributions (1/2) Pattern Recognition and Machine Learning, C. M. Bishop, Summarized by Joo-kyung Kim Biointelligence Laboratory,

Gaussian Process Networks Nir Friedman and Iftach Nachman UAI-2K.

CS Statistical Machine learning Lecture 7 Yuan (Alan) Qi Purdue CS Sept Acknowledgement: Sargur Srihari’s slides.

Crash course in probability theory and statistics – part 2 Machine Learning, Wed Apr 16, 2008.

Chapter 3: Maximum-Likelihood Parameter Estimation

LECTURE 06: MAXIMUM LIKELIHOOD ESTIMATION

Probability Theory and Parameter Estimation I

Lecture 04: Logistic Regression

ICS 280 Learning in Graphical Models

Outline Introduction Signal, random variable, random process and spectra Analog modulation Analog to digital conversion Digital transmission through baseband.

Ch3: Model Building through Regression

Parameter Estimation 主講人：虞台文.

Lecture 02: Linear Regression

CH 5: Multivariate Methods

Lecture 05: K-nearest neighbors

Graduate School of Information Sciences, Tohoku University

Lecture 04: Logistic Regression

Special Topics In Scientific Computing

Latent Variables, Mixture Models and EM

Probabilistic Models for Linear Regression

Lecture 26: Faces and probabilities

CS489/698: Intro to ML Lecture 08: Kernels 10/12/17 Yao-Liang Yu.

CS489/698: Intro to ML Lecture 08: Kernels 10/12/17 Yao-Liang Yu.

EC 331 The Theory of and applications of Maximum Likelihood Method

Statistical NLP: Lecture 4

The Multivariate Normal Distribution, Part 2

CS489/698: Intro to ML Lecture 08: Kernels 10/12/17 Yao-Liang Yu.

Lecture 11: Mixture of Gaussians

Lecture 02: Linear Regression

Pattern Recognition and Machine Learning

Lecture 04: Logistic Regression

CS480/680: Intro to ML Lecture 09: Kernels 23/02/2019 Yao-Liang Yu.

More Parameter Learning, Multinomial and Continuous Variables

Generally Discriminant Analysis

Lecture 10: Gaussian Processes

LECTURE 09: BAYESIAN LEARNING

LECTURE 07: BAYESIAN ESTIMATION

Lecture 03: K-nearest neighbors

Parametric Methods Berlin Chen, 2005 References:

Learning From Observed Data

Biointelligence Laboratory, Seoul National University

Multivariate Methods Berlin Chen

Mathematical Foundations of BME

Multivariate Methods Berlin Chen, 2005 References:

HKN ECE 313 Exam 2 Review Session

Presentation transcript:

Lecture 09: Gaussian Processes CS489/698: Intro to ML Lecture 09: Gaussian Processes 10/24/17 Yao-Liang Yu

Outline Gaussian Distribution Gaussian Process Gaussian Linear Regression Advanced 10/24/17 Yao-Liang Yu

Announcement Assignment 3 due on Oct 31. 10/24/17 Yao-Liang Yu

Gaussian distribution covariance, PSD mean dimension 10/24/17 Yao-Liang Yu

Carl Friedrich Gauss (1777 – 1855) 10/24/17 Yao-Liang Yu

Important facts marginal joint conditional 10/24/17 Yao-Liang Yu

Derivation 10/24/17 Yao-Liang Yu

Outline Gaussian Distributions Gaussian Processes Gaussian Linear Regression Advanced 10/24/17 Yao-Liang Yu

What is a random variable? A random variable is a function Z(ω) ω Z(ω) 10/24/17 Yao-Liang Yu

Gaussian process A collection of Gaussian random variables {Zt : t in T} such that for any finite N, {Zt : t in N} is jointly Gaussian A Gaussian process is a function of two variables Z(t,ω) For any finite N, {Zt := Z(t,ω) | t in N} is a Gaussian random vector For any ω, Zω := Z(t, ω) : T  R is a function of one variable t (sample path) Does Gaussian process exist? 10/24/17 Yao-Liang Yu

Example 10/24/17 Yao-Liang Yu

Mean and covariance function For each t, Z(t,ω) is a Gaussian random variable hence has mean For each s and t, the covariance between Zs and Zt: Say mean function covariance function 10/24/17 Yao-Liang Yu

Recap: verifying a kernel For any n, for any x1, x2, …, xn, the kernel matrix K with is symmetric and positive semidefinite ( ) Symmetric: Kij = Kji Positive semidefinite (PSD): for all 10/24/17 Yao-Liang Yu

What is a covariance function? For t1, t2, …, tn, by definition is the covariance between and K is symmetric and PSD Thus, the covariance functionκis a kernel ! 10/24/17 Yao-Liang Yu

Conversely Theorem. Given any function and kernel function , exist This may not hold for other distributions !!! 10/24/17 Yao-Liang Yu

Andrey Kolmogorov (1903 - 1987) Foundations of the theory probability 10/24/17 Yao-Liang Yu

Effect of kernel 10/24/17 Yao-Liang Yu

Example: Linear Regression unknown but deterministic independent of each other for different x Z t Y is a Gaussian process 10/24/17 Yao-Liang Yu

Maximum Likelihood Having observed Need to estimate w Choose w that explains the observations best i.i.d. likelihood of data 10/24/17 Yao-Liang Yu

Example: Bayesian Linear Regression independent of w independent of each other for different x Z t Y is a Gaussian process 10/24/17 Yao-Liang Yu

Maximum A Posteriori likelihood of data prior posterior normalization Different prior on w leads to different regularization w ~ N(0, I) is equivalent as ridge regression w ~ Lap(0, I) is equivalent as Lasso 10/24/17 Yao-Liang Yu

Outline Gaussian Distributions Gaussian Process Gaussian Linear Regression Advanced 10/24/17 Yao-Liang Yu

Abstract View Let be a Gaussian process Having observed Need to predict 10/24/17 Yao-Liang Yu

Familiar view Equivalent in finite dimensions Feature map of k Equivalent in finite dimensions Incorrect but “intuitive” in infinite dimensions 10/24/17 Yao-Liang Yu

Back to abstract is jointly Gaussian What is the distribution of ? 10/24/17 Yao-Liang Yu

Recap: Important facts marginal joint conditional 10/24/17 Yao-Liang Yu

Recap: Example 10/24/17 Yao-Liang Yu

Application t = (position, velocity, acceleration) Z = torque 10/24/17 Yao-Liang Yu

Outline Gaussian Distributions Gaussian Process Gaussian Linear Regression Advanced 10/24/17 Yao-Liang Yu

Gaussian process classification Binary label Y generated by Zt is a Gaussian process (real-valued) Given observations Need to predict integrate out latent variable 10/24/17 Yao-Liang Yu

Questions? 10/24/17 Yao-Liang Yu