Kernel Stick-Breaking Process

Kernel Stick-Breaking Process. D. B. Dunson and J. Park. Discussion led by Qi An, Jan 19th, 2007.

Outline: Motivation; Model formulation and properties; Prediction rules; Posterior computation; Examples; Conclusions.

Motivation: Consider the problem of estimating the conditional density of a response variable using a mixture model, f(y | x) = ∫ f(y | x, θ) dG_x(θ), where G_x is an unknown probability measure indexed by the predictor x. The problem of defining priors for the collection of random probability measures {G_x} has received increasing attention in recent years; examples include the Dirichlet process (DP) and the dependent Dirichlet process (DDP).

One model: In the DDP, the atoms can vary with x according to a stochastic process while the weights remain fixed. Dunson et al. proposed a model that instead allows the weights to vary with predictors, but that model lacks reasonable marginalization and updating properties.

Model formulation: Introduce countable sequences of mutually independent random components, V_h ~ Beta(a_h, b_h), Γ_h ~ H, and G_h* ~ Q, for h = 1, 2, .... The kernel stick-breaking process (KSBP) can then be defined as

G_x = Σ_{h≥1} π_h(x; V_h, Γ_h) G_h*, with π_h(x; V_h, Γ_h) = U_h(x) ∏_{l<h} [1 − U_l(x)] and U_h(x) = V_h K(x, Γ_h),

where K(·,·) ∈ [0,1] is a bounded kernel measuring how close the predictor x is to the location Γ_h.
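To make the construction concrete, here is a minimal sketch (not from the paper) that draws truncated KSBP weights at a single predictor value, assuming a Gaussian kernel K(x, Γ) = exp(−ψ(x − Γ)²) with locations drawn uniformly on [0, 1]; the kernel, the precision ψ, and the truncation level N are all illustrative choices.

    import numpy as np

    def ksbp_weights(x, N=50, a=1.0, b=1.0, psi=10.0, rng=None):
        """Draw truncated KSBP weights pi_h(x) for h = 1..N, assuming a
        Gaussian kernel K(x, Gamma) = exp(-psi*(x - Gamma)^2) and
        Gamma_h ~ Uniform(0, 1); both choices are illustrative."""
        rng = np.random.default_rng() if rng is None else rng
        V = rng.beta(a, b, size=N)               # stick fractions V_h ~ Beta(a, b)
        Gamma = rng.uniform(0.0, 1.0, size=N)    # kernel locations Gamma_h ~ H
        U = V * np.exp(-psi * (x - Gamma) ** 2)  # U_h(x) = V_h K(x, Gamma_h)
        stick = np.concatenate(([1.0], np.cumprod(1.0 - U)[:-1]))
        return U * stick                         # pi_h(x) = U_h(x) prod_{l<h}(1 - U_l(x))

    pi = ksbp_weights(x=0.3)
    print(pi[:5], pi.sum())  # total mass is just under 1 at moderate N

Weights with small index h, or whose locations Γ_h fall near x, receive most of the mass, matching the description below.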

About the model: The model for G_x is a predictor-dependent mixture over an infinite sequence of basis probability measures G_h*, located at Γ_h. Bases located close to x, and bases with smaller index h, tend to receive higher probability weight. The KSBP accommodates dependence between G_x and G_x'.

Special cases: If K(x, Γ) = 1 for all x and Γ, and G_h* ~ DP(αG_0), the KSBP is a stick-breaking mixture of DPs. If K(x, Γ) = 1 and G_h* = δ_{θ_h} with θ_h ~ G_0, we obtain G_x ≡ G, with G having a stick-breaking prior. If, in addition, V_h ~ Beta(1 − a, b + ha), we obtain a Pitman-Yor process.
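A quick numerical check of the stick-breaking reduction, reusing ksbp_weights from the sketch above: taking psi = 0 makes the assumed kernel identically 1, and with V_h ~ Beta(1, α) the mean weights should match the DP stick-breaking values E[π_h] = (1/(1+α)) (α/(1+α))^{h−1}.

    alpha = 2.0
    draws = np.stack([ksbp_weights(x=0.0, N=10, a=1.0, b=alpha, psi=0.0)
                      for _ in range(10000)])
    h = np.arange(1, 11)
    theory = (1.0 / (1.0 + alpha)) * (alpha / (1.0 + alpha)) ** (h - 1)
    print(np.round(draws.mean(axis=0), 4))
    print(np.round(theory, 4))  # the two rows agree up to Monte Carlo error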

Properties: Let U_h(x) = V_h K(x, Γ_h), so that π_h(x) = U_h(x) ∏_{l<h} [1 − U_l(x)]. For the first moment we can obtain E{G_x(B)} = G_0(B), where G_0 is the mean of Q; note there is no dependency on V and Γ. From the second moment we obtain the correlation between the measures, corr{G_x(B), G_{x'}(B)}; it can be proven that this correlation is positive and approaches the value 1 in the limit as x → x'.
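The limiting behavior is easy to see numerically. The hedged sketch below (same assumed Gaussian-kernel setup, with point-mass bases G_h* = δ_{θ_h} and B = [0, 0.5)) estimates corr{G_x(B), G_{x'}(B)} by simulation; the estimate climbs toward 1 as x' approaches x.

    def gx_mass(x, theta, V, Gamma, psi=10.0, B=(0.0, 0.5)):
        # G_x(B) for a point-mass KSBP draw, G_h* = delta_{theta_h}
        U = V * np.exp(-psi * (x - Gamma) ** 2)
        stick = np.concatenate(([1.0], np.cumprod(1.0 - U)[:-1]))
        pi = U * stick
        return pi[(theta >= B[0]) & (theta < B[1])].sum()

    rng = np.random.default_rng(0)
    x = 0.3
    for xp in (0.31, 0.4, 0.7):
        pairs = []
        for _ in range(5000):
            V = rng.beta(1.0, 1.0, size=100)
            Gamma = rng.uniform(size=100)
            theta = rng.uniform(size=100)
            pairs.append((gx_mass(x, theta, V, Gamma),
                          gx_mass(xp, theta, V, Gamma)))
        g1, g2 = np.array(pairs).T
        print(xp, np.corrcoef(g1, g2)[0, 1])  # correlation rises as xp -> x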

Alternative representation: The KSBP also admits an alternative representation, under which the moments and the correlation coefficient take an analogous form.

Truncation: For a stick-breaking Gibbs sampler, we need to make a truncation approximation. The authors prove that the residual weights decrease exponentially fast in N, so an accurate approximation may be obtained for moderate N. The approximated model can be expressed by retaining only the first N sticks, G_x ≈ Σ_{h=1}^N π_h(x) G_h*.
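The exponential decay is visible in simulation. Continuing the illustrative setup above, the expected residual mass 1 − Σ_{h≤N} π_h(x) falls roughly geometrically in N:

    rng = np.random.default_rng(1)
    for N in (10, 25, 50, 100):
        res = np.mean([1.0 - ksbp_weights(0.3, N=N, rng=rng).sum()
                       for _ in range(5000)])
        print(N, f"{res:.2e}")  # expected residual mass per truncation level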

Prediction rules: Consider a special case in which the bases are point masses, G_h* = δ_{θ_h} with θ_h ~ G_0. The model can then be equivalently expressed through allocation variables: each subject i is assigned to an atom S_i, with Pr(S_i = h) = π_h(x_i).

Prediction rules: Define the allocation variables S = (S_1, ..., S_n), where each cluster corresponds to a subset of the integers between 1 and n sharing an atom. It can be proven that the probability that subjects i and j belong to the same cluster has a closed form that depends on x_i and x_j through the kernel. The predictive distribution for a new subject is then obtained by marginalization, with the sum taken over the set of possible r-dimensional subsets of {1, ..., s} that include i.
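Although the closed form is not reproduced here, the co-clustering probability is straightforward to approximate by Monte Carlo under the illustrative setup used throughout: conditional on the weights, Pr(S_i = S_j) = Σ_h π_h(x_i) π_h(x_j), so we average that quantity over draws of (V, Γ).

    def pi_given(x, V, Gamma, psi=10.0):
        # KSBP weights at x for fixed sticks V and locations Gamma
        U = V * np.exp(-psi * (x - Gamma) ** 2)
        stick = np.concatenate(([1.0], np.cumprod(1.0 - U)[:-1]))
        return U * stick

    def coclust_prob(xi, xj, N=100, reps=10000, seed=2):
        rng = np.random.default_rng(seed)
        total = 0.0
        for _ in range(reps):
            V, Gamma = rng.beta(1.0, 1.0, N), rng.uniform(size=N)
            pi, pj = pi_given(xi, V, Gamma), pi_given(xj, V, Gamma)
            total += np.dot(pi / pi.sum(), pj / pj.sum())
        return total / reps

    print(coclust_prob(0.3, 0.32), coclust_prob(0.3, 0.8))  # nearby > distant

Normalizing each weight vector by its sum accounts for the small truncation residual.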

Posterior computation: From the prior, we can obtain the full conditionals for a Gibbs sampler:
1. Sample S_i.
2. Sample C_{S_i} when S_i = 0 (assign subject i to a new atom at an occupied location).
3. Sample θ_h.

4. Sample V_h.
5. Sample Γ_h, using a Metropolis-Hastings step, or a Gibbs step if H is a set of discrete potential locations.
For steps 4 and 5, first sample the augmentation variables, and then alternate between (i) sampling (A_ih, B_ih) from their conditional distribution and (ii) updating V_h by sampling from its conditional posterior.
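To give a flavor of step 1, here is a minimal, hedged sketch of the allocation update in a truncated sampler, assuming a Gaussian likelihood f(y | θ) = N(y; θ, σ²); the likelihood and hyperparameters are illustrative, the paper's full sampler also includes the augmentation steps listed above, and pi_given comes from the previous sketch.

    from scipy.stats import norm

    def sample_S(y_i, x_i, V, Gamma, theta, sigma=0.5, rng=None):
        # Gibbs update for one allocation indicator in the truncated model:
        # Pr(S_i = h | -) proportional to pi_h(x_i) * N(y_i; theta_h, sigma^2)
        rng = np.random.default_rng() if rng is None else rng
        probs = pi_given(x_i, V, Gamma) * norm.pdf(y_i, loc=theta, scale=sigma)
        probs /= probs.sum()
        return rng.choice(len(theta), p=probs)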

Simulated examples

Conclusions: This stick-breaking process is useful in settings in which there is uncertainty about an uncountable collection of probability measures. Beyond density regression, the process can be applied to predictor-dependent clustering, dynamic modeling, and spatial data analysis. Many tools developed for exchangeable stick-breaking processes can be applied to the KSBP formulation with minimal modification. A predictor-dependent urn scheme is obtained, which generalizes the Pólya urn scheme.