Bayesian Multivariate Logistic Regression, by Sean O'Brien and David Dunson (Biometrics, 2004). Presented by Lihan He, ECE, Duke University, May 16, 2008.

Outline

 Univariate logistic regression
 Multivariate logistic regression
 Prior specification and convergence
 Posterior computation
 Experimental results
 Conclusions

Univariate Logistic Regression Model

$$\Pr(y_i = 1 \mid x_i) = F(x_i'\beta) = \frac{\exp(x_i'\beta)}{1 + \exp(x_i'\beta)}$$

Equivalent: $y_i = \mathbf{1}(z_i > 0)$, where $z_i = x_i'\beta + \epsilon_i$ and $\epsilon_i \sim L(\cdot)$

$z_i$: latent variable; $L(\cdot)$: logistic density

logistic density: $L(\epsilon) = \dfrac{e^{\epsilon}}{(1 + e^{\epsilon})^2}$; CDF: $F(\epsilon) = \dfrac{e^{\epsilon}}{1 + e^{\epsilon}}$
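
A minimal Python sketch of the latent-variable equivalence (the values of beta and x below are made up for illustration): thresholding $z_i = x_i'\beta + \epsilon_i$ with standard logistic noise at zero reproduces $\Pr(y_i = 1) = F(x_i'\beta)$.

```python
# Sketch: latent-variable form of univariate logistic regression.
# beta and x are hypothetical values, not from the paper.
import numpy as np
from scipy.stats import logistic

rng = np.random.default_rng(0)
beta = np.array([0.5, -1.0])
x = np.array([1.0, 0.3])                   # covariate vector (first entry = intercept)
eta = x @ beta                             # linear predictor x'beta

# Latent-variable form: y = 1(z > 0), z = x'beta + eps, eps ~ standard logistic
z = eta + logistic.rvs(size=100_000, random_state=rng)
print((z > 0).mean())                      # Monte Carlo estimate of Pr(y = 1)

# Direct form: Pr(y = 1) = F(x'beta), the logistic CDF
print(logistic.cdf(eta))                   # should closely match the estimate above
```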

Univariate Logistic Regression Model

Approximation using a t distribution: set $\nu = 7.3$ and $\sigma^2 = \pi^2(\nu - 2)/(3\nu)$; the scaled $t_\nu$ density $\sigma^{-1} f_{t_\nu}(\epsilon/\sigma)$ then closely approximates the standard logistic density $L(\epsilon)$.
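
A short sketch checking this approximation numerically, using the $\nu$ and $\sigma^2$ from the slide (the code itself is illustrative, not from the paper):

```python
# Sketch: compare the standard logistic density with its scaled-t approximation.
import numpy as np
from scipy.stats import logistic, t

nu = 7.3                                          # degrees of freedom (from the slide)
sigma = np.sqrt(np.pi**2 * (nu - 2) / (3 * nu))   # sigma^2 = pi^2 (nu - 2) / (3 nu)

xs = np.linspace(-6, 6, 241)
approx = t.pdf(xs / sigma, df=nu) / sigma         # scaled t_nu density
exact = logistic.pdf(xs)                          # standard logistic density
print(np.abs(approx - exact).max())               # very small (on the order of 1e-3)
```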

Multivariate Logistic Regression Model

Binary variable for each output: $y_{ij} = \mathbf{1}(z_{ij} > 0)$ for $j = 1, \ldots, p$, with $z_i = (z_{i1}, \ldots, z_{ip})'$ following the proposed multivariate logistic density with location $\mu_i = X_i\beta$ and correlation matrix $R$.

Each marginal pdf of $z_{ij}$ has univariate logistic density; dependence enters through a multivariate $t_\nu$ copula, $z_{ij} = \mu_{ij} + F^{-1}\{F_{t_\nu}(u_{ij})\}$ with $u_i \sim t_{p,\nu}(0, R)$, where $F^{-1}(\cdot)$ is the inverse CDF of the univariate logistic density.

Multivariate Logistic Regression Model

Properties
 The marginal univariate densities of $z_j$, for $j = 1, \ldots, p$, have univariate logistic form
 For $p = 1$, it reduces to the univariate logistic density
 $R$ is a correlation matrix (with 1's on the diagonal), reflecting the correlations between the $z_j$, and hence the correlations between the $y_j$
 For $R = \mathrm{diag}(1, \ldots, 1)$, it reduces to a product of univariate logistic densities, and the elements of $z$ are uncorrelated
 Good convergence properties for MCMC sampling
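
A sketch of the copula construction from the previous slide (parameter values are hypothetical): draw from a multivariate $t_\nu$ with correlation $R$, push each coordinate through the $t_\nu$ CDF and then the inverse logistic CDF. This also checks the first property above, that the marginals come out univariate logistic while $R$ drives the dependence.

```python
# Sketch: sampling from a multivariate logistic distribution built from a t copula.
# Hypothetical illustration; nu and R are example values.
import numpy as np
from scipy.stats import multivariate_t, t, logistic

rng = np.random.default_rng(0)
nu = 7.3
R = np.array([[1.0, 0.6],
              [0.6, 1.0]])                 # correlation matrix, 1's on the diagonal

u = multivariate_t.rvs(loc=[0.0, 0.0], shape=R, df=nu, size=50_000, random_state=rng)
z = logistic.ppf(t.cdf(u, df=nu))          # each margin: t -> uniform -> logistic

print(logistic.fit(z[:, 0]))               # marginal fit: loc ~ 0, scale ~ 1 (logistic)
print(np.corrcoef(z.T))                    # dependence inherited from R
```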

Multivariate Logistic Regression Model

Likelihood: $\Pr(y_i \mid \beta, R)$ is the integral of the multivariate logistic density of $z_i$ over the region of $z$ values consistent with the observed $y_i$.

M-ary variable for each output (ordered): assume $y_{ij} \in \{1, \ldots, M\}$; define cutpoints $-\infty = \gamma_0 < \gamma_1 < \cdots < \gamma_M = \infty$ and set $y_{ij} = m$ iff $\gamma_{m-1} < z_{ij} \le \gamma_m$.
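
A tiny sketch of the cutpoint mapping for the ordered case (cutpoint and latent values are made up):

```python
# Sketch: ordinal outcome from a latent variable via cutpoints (hypothetical values).
import numpy as np

gamma = np.array([-np.inf, -1.0, 0.5, np.inf])  # cutpoints for M = 3 ordered categories
z = np.array([-2.3, 0.1, 1.7])                  # example latent values
y = np.searchsorted(gamma, z)                   # y = m  iff  gamma_{m-1} < z <= gamma_m
print(y)                                        # [1 2 3]
```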

Prior specification and convergence

$\beta$: $\pi(\beta) = N(\beta_0, \Sigma_0)$, or an improper uniform prior $\pi(\beta) \propto 1$

$R$: uniform density on $[-1, 1]$ for each element in a non-diagonal position, subject to $R$ remaining a valid (positive definite) correlation matrix
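
One common way to realize this prior is rejection sampling: draw each off-diagonal element uniformly on $[-1, 1]$ and keep the matrix only if it is positive definite. A sketch under that assumption (the paper may handle the constraint differently):

```python
# Sketch: uniform prior on off-diagonal elements of R, restricted by rejection
# to valid (positive definite) correlation matrices. Illustrative only.
import numpy as np

rng = np.random.default_rng(0)

def sample_R_prior(p):
    while True:
        R = np.eye(p)
        rows, cols = np.triu_indices(p, k=1)
        vals = rng.uniform(-1.0, 1.0, size=rows.size)  # uniform on [-1, 1]
        R[rows, cols] = vals
        R[cols, rows] = vals                           # symmetrize
        if np.linalg.eigvalsh(R).min() > 0:            # accept only valid correlation matrices
            return R

print(sample_R_prior(3))
```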

Posterior Computation

Posterior: $p(\beta, R \mid y) \propto \pi(\beta)\,\pi(R)\prod_i \Pr(y_i \mid \beta, R)$; the prior and likelihood are not conjugate.

Proposal distribution: use a multivariate t distribution to approximate the multivariate logistic density in the likelihood part.

Importance sampling: sample from the proposal distribution to approximate samples from the exact posterior, and use importance weights for exact inference.

Posterior Computation

Introduce latent variables $\varphi$ and $z$: writing the t approximation as a scale mixture of normals, the proposal is expressed with $z_i \mid \varphi_i \sim N_p(X_i\beta,\ \varphi_i^{-1}\sigma^2 R)$ and $\varphi_i \sim \mathrm{Gamma}(\nu/2, \nu/2)$.

Sample $\beta$, $\varphi$ and $z$ from their full conditionals, since under this representation the likelihood is conjugate to the prior.

Update $R$ using a Metropolis step (accept/reject): propose $R^*$; set $R = R^*$ with probability $\min(1, \alpha)$; keep the current $R$ otherwise.
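
A generic sketch of the Metropolis accept/reject step, shown for a single off-diagonal correlation (log_target is a placeholder for the log full conditional of R; the paper's actual proposal may differ):

```python
# Sketch: Metropolis accept/reject update for one off-diagonal element of R.
# log_target stands in for the log full conditional of R; hypothetical.
import numpy as np

rng = np.random.default_rng(0)

def metropolis_step_R(r, log_target, step=0.05):
    r_prop = r + rng.uniform(-step, step)       # random-walk proposal R*
    if not -1.0 < r_prop < 1.0:                 # proposals outside [-1, 1] are invalid
        return r
    log_alpha = log_target(r_prop) - log_target(r)
    if np.log(rng.uniform()) < log_alpha:       # set R = R* with probability min(1, alpha)
        return r_prop
    return r                                    # keep the current R otherwise
```

For $p > 2$, the proposed matrix must also be checked for positive definiteness, as in the prior sketch above.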

Posterior Computation

Importance weights for inference: weight each draw by the ratio of the exact multivariate logistic likelihood to the approximating multivariate t likelihood,

$$w^{(s)} \propto \prod_i \frac{f_{\mathrm{logistic}}\big(z_i^{(s)} \mid \beta^{(s)}, R^{(s)}\big)}{f_{t}\big(z_i^{(s)} \mid \beta^{(s)}, R^{(s)}\big)}$$
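
A sketch of the weight computation (univariate latent draws shown for brevity; the function name and setup are hypothetical):

```python
# Sketch: importance weight = exact (logistic) density / proposal (scaled-t) density.
import numpy as np
from scipy.stats import logistic, t

nu = 7.3
sigma = np.sqrt(np.pi**2 * (nu - 2) / (3 * nu))

def log_importance_weight(z_draws):
    # Accumulate log density ratios over the latent draws of one posterior sample.
    log_exact = logistic.logpdf(z_draws).sum()
    log_proposal = (t.logpdf(z_draws / sigma, df=nu) - np.log(sigma)).sum()
    return log_exact - log_proposal

# Posterior expectations are then self-normalized weighted averages over samples s:
#   E[h(beta) | y]  ~=  sum_s w_s h(beta_s) / sum_s w_s
```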

Application

Subjects: 584 twin pregnancies
Output: small for gestational age (SGA), defined as a birthweight below the 10th percentile for a given gestational age in a reference population. Binary output: $y_{ij} \in \{0, 1\}$, $i = 1, \ldots, 584$, $j = 1, 2$
Covariates: $x_{ij}$ for the $i$th pregnancy and the $j$th infant

Application

 Obtains nearly identical estimates to those of the AP study for the regression coefficients.
 Female gender ($\beta_1$), prior preterm delivery ($\beta_4$, $\beta_5$), and smoking ($\beta_8$) are associated with an increased risk of SGA.
 Outcomes for twins are highly correlated, as captured by $R$.

Conclusions

 Proposes a multivariate logistic density for the multivariate logistic regression model.
 The proposed multivariate logistic density is closely approximated by a multivariate t distribution.
 The density has properties that facilitate efficient sampling and guaranteed convergence.
 The marginals are univariate logistic densities.
 The correlation structure is embedded within the model.