Gaussian Process Networks
Nir Friedman and Iftach Nachman, UAI 2000

Abstract
- Learning the structure of a Bayesian network requires evaluating the marginal likelihood of the data given each candidate structure.
- For continuous networks, Gaussians and Gaussian mixtures have been used as priors over the parameters.
- In this paper, a new prior, the Gaussian process, is presented.

Introduction
- Bayesian networks are particularly effective in domains where the interactions between variables are fairly local.
- Motivation: molecular biology problems
  - Understanding the transcription of genes.
  - Continuous variables are necessary.
- Gaussian process prior
  - A Bayesian method.
  - Its semi-parametric nature allows learning complicated functional relationships between variables.

Learning Continuous Networks
- The posterior probability of a network structure given the data.
- Three assumptions:
  - Structure modularity
  - Parameter independence
  - Parameter modularity
- Under these assumptions, the posterior probability can be represented as a product of local, per-family terms, as sketched below.
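A sketch of this decomposition in the usual structure-scoring notation (the notation here is an assumption, not copied from the slides): with structure modularity, parameter independence, and parameter modularity, the structure posterior factors over families,

\[
P(G \mid D) \;\propto\; P(G)\,P(D \mid G)
\;=\; \prod_i \rho\!\left(\mathrm{Pa}_i = \mathrm{Pa}^G_i\right)\,
P\!\left(x_i[1{:}M] \;\middle|\; \mathrm{pa}^G_i[1{:}M]\right),
\]

so each family of a candidate structure can be scored independently.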

Priors for Continuous Variables
- Linear Gaussian: very simple, but captures only linear dependencies.
- Gaussian mixtures: approximations are required for learning.
- Kernel methods: require a smoothness parameter.
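For reference, a minimal sketch of the linear Gaussian conditional, the simplest of these priors (standard form, shown here for context rather than taken from the slides):

\[
P\!\left(X \mid u_1, \dots, u_k\right) \;=\; \mathcal{N}\!\left(a_0 + \sum_{j=1}^{k} a_j u_j,\; \sigma^2\right).
\]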

Gaussian Process (1/2)
- Basics of Gaussian processes
  - A prior over a variable X regarded as a function of its parents U.
  - The stochastic process over U is said to be a Gaussian process if, for each finite set of values u_1:M = {u[1], …, u[M]}, the distribution over the corresponding random variables x_1:M = {X[1], …, X[M]} is a multivariate normal distribution.
- The joint distribution of x_1:M is given below.
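A sketch of this joint density in the zero-mean form commonly used for GP regression (the covariance matrix C has entries C_ij = C(u[i], u[j]) for a chosen covariance function; the notation is an assumption):

\[
P\!\left(x_{1:M}\right) \;=\; \frac{1}{(2\pi)^{M/2}\,|C|^{1/2}}
\exp\!\left(-\tfrac{1}{2}\, x_{1:M}^{\top} C^{-1} x_{1:M}\right).
\]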

Gaussian Process (2/2)
- Prediction
  - P(X[M+1] | x_1:M, u_1:M, u[M+1]) is a univariate Gaussian distribution.
- Covariance functions
  - Williams and Rasmussen suggest the function sketched below.
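A sketch of the standard forms (the hyperparameter names are assumptions): writing k for the vector with entries k_i = C(u[i], u[M+1]) and c = C(u[M+1], u[M+1]), the predictive distribution is

\[
P\!\left(X[M{+}1] \mid x_{1:M}, u_{1:M}, u[M{+}1]\right)
\;=\; \mathcal{N}\!\left(k^{\top} C^{-1} x_{1:M},\; c - k^{\top} C^{-1} k\right),
\]

and the covariance function proposed by Williams and Rasmussen combines a squared-exponential term, a bias, a linear term, and observation noise:

\[
C(u_i, u_j) \;=\; v_0 \exp\!\left(-\tfrac{1}{2}\sum_{a} w_a \left(u_{i,a} - u_{j,a}\right)^2\right)
\;+\; a_0 \;+\; a_1 \sum_{a} u_{i,a}\, u_{j,a} \;+\; \delta_{ij}\, v_1 .
\]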

Learning Networks with Gaussian Process Priors
- The score of a family is defined as the marginal likelihood of the child values given the parent values.
- With a Gaussian process prior, this marginal probability can be computed in closed form, as sketched in the code below.
- The parameters of the covariance function are handled with either:
  - a MAP approximation, or
  - a Laplace approximation.
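A minimal numerical sketch (not the authors' code) of this closed-form family score, assuming a simple squared-exponential covariance with additive noise; the kernel form, hyperparameter values, and function names are illustrative assumptions:

```python
import numpy as np

def sq_exp_kernel(U, v0=1.0, w=1.0, noise=0.1):
    """Covariance matrix C[i, j] = v0 * exp(-0.5 * w * ||u_i - u_j||^2), plus noise on the diagonal."""
    sq_dists = np.sum((U[:, None, :] - U[None, :, :]) ** 2, axis=-1)
    return v0 * np.exp(-0.5 * w * sq_dists) + noise * np.eye(U.shape[0])

def gp_family_score(x, U, **kernel_params):
    """Log marginal likelihood log P(x | U) for one child/parents family under a GP prior."""
    C = sq_exp_kernel(U, **kernel_params)
    M = x.shape[0]
    L = np.linalg.cholesky(C)                              # C = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, x))    # alpha = C^{-1} x
    log_det = 2.0 * np.sum(np.log(np.diag(L)))             # log |C|
    return -0.5 * (x @ alpha + log_det + M * np.log(2.0 * np.pi))

# Toy usage: score a candidate family with a single parent.
rng = np.random.default_rng(0)
U = rng.normal(size=(30, 1))
x = np.sin(U[:, 0]) + 0.1 * rng.normal(size=30)
print(gp_family_score(x, U, v0=1.0, w=2.0, noise=0.01))
```

In a structure search, this score would be evaluated once per candidate family and summed (in log space) over families to compare structures.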

Artificial Experimentation (1/3)
- Two variables X and Y with a non-invertible relationship (Y depends on X through a function that cannot be inverted to recover X).
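As a purely hypothetical illustration (the functional forms used in the paper are not reproduced here), data with a non-invertible dependency could be generated as follows:

```python
import numpy as np

# Hypothetical illustration only: y depends on x through a non-invertible
# function (x cannot be recovered from y), plus observation noise.
rng = np.random.default_rng(1)
x = rng.uniform(-3.0, 3.0, size=200)
y = x ** 2 + 0.2 * rng.normal(size=200)
```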

Artificial Experimentation (2/3)
- Results of learning non-invertible dependencies.

Artificial Experimentation (3/3)
- Comparison of the Gaussian, Gaussian process, and kernel methods.

Discussion
- The connection between reproducing kernel Hilbert spaces (RKHS) and Gaussian processes.
- The method is currently being applied to the analysis of biological data.