The Nested Dirichlet Process
Duke University Machine Learning Group
Presented by Kai Ni, Nov. 10, 2006
Paper by Abel Rodriguez, David B. Dunson, and Alan E. Gelfand, submitted to JASA, 2006

Outline: introduction; the nested Dirichlet process; applications to simulated data and health care quality.

Motivation
General problem: extending the Dirichlet process to accommodate multiple dependent distributions.
Methods:
– Inducing dependence through a shared source, e.g., the dependent Dirichlet process (DDP) and the hierarchical Dirichlet process (HDP).
– Inducing dependence through linear combinations of realizations of independent Dirichlet processes, e.g., Muller (2004) defines the distribution of each group as a mixture of a global component and a local component.

Background
The paper is motivated by two related problems: clustering probability distributions, and simultaneous multilevel clustering in a nested setting. Consider the example of hospital quality analysis:
– In assessing quality of care, we need to cluster centers according to the distribution of patient outcomes and identify outlying centers.
– We also want to simultaneously cluster patients within centers, and borrow information across centers that have similar clusters.

The Dirichlet process
A single clustering problem can be analyzed with a Dirichlet process (DP). The stick-breaking construction is usually the starting point of analysis:
$$G(\cdot) = \sum_{k=1}^{\infty} \pi_k \, \delta_{\theta_k^*}(\cdot), \qquad \theta_k^* \sim H, \qquad \pi_k = w_k \prod_{l<k} (1 - w_l), \qquad w_k \sim \mathrm{Beta}(1 - a,\, b + k a).$$
General $(a, b)$ yields the Pitman-Yor process; setting $a = 0$ and $b = \alpha$ recovers the standard DP$(\alpha, H)$.
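The construction above is easy to simulate. Here is a minimal NumPy sketch of a truncated stick-breaking draw from DP$(\alpha, H)$; the function names, the truncation level $K$, and the choice $H = N(0, 1)$ are illustrative assumptions, not part of the paper.

```python
import numpy as np

def dp_stick_breaking(alpha, base_sampler, K, rng):
    """Truncated stick-breaking draw from DP(alpha, H).

    Returns K atoms theta*_k ~ H and weights
    pi_k = w_k * prod_{l<k} (1 - w_l), with w_k ~ Beta(1, alpha).
    """
    w = rng.beta(1.0, alpha, size=K)
    w[-1] = 1.0  # close the stick so the truncated weights sum to 1
    pi = w * np.concatenate(([1.0], np.cumprod(1.0 - w[:-1])))
    return base_sampler(K, rng), pi

rng = np.random.default_rng(0)
# Illustrative base measure H = N(0, 1):
atoms, pi = dp_stick_breaking(alpha=1.0,
                              base_sampler=lambda n, r: r.normal(size=n),
                              K=50, rng=rng)
theta = rng.choice(atoms, size=1000, p=pi)  # draws theta_i | G ~ G
```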

The nested Dirichlet process mixture
Suppose $y_{ij}$, for $i = 1, \dots, n_j$, are observations within center $j$. We assume exchangeability for centers, with $y_{ij} \sim F_j$. A collection of distributions $\{F_1, \dots, F_J\}$ is said to follow a nested Dirichlet process mixture if
$$F_j(\cdot) = \int p(\cdot \mid \theta)\, G_j(d\theta), \qquad \{G_1, \dots, G_J\} \sim \mathrm{nDP}(\alpha, \beta, H).$$

The nested Dirichlet process
The collection $\{G_1, \dots, G_J\}$, used as the mixing distribution, is said to follow a nested Dirichlet process with parameters $(\alpha, \beta, H)$ if
$$G_j \sim Q = \sum_{k=1}^{\infty} \pi_k\, \delta_{G_k^*}, \qquad G_k^*(\cdot) = \sum_{l=1}^{\infty} w_{lk}\, \delta_{\theta_{lk}^*}(\cdot), \qquad \theta_{lk}^* \sim H,$$
with stick-breaking weights $\pi_k = u_k \prod_{s<k}(1 - u_s)$, $u_k \sim \mathrm{Beta}(1, \alpha)$, and $w_{lk} = v_{lk} \prod_{s<l}(1 - v_{sk})$, $v_{lk} \sim \mathrm{Beta}(1, \beta)$. From the construction we have $Q \sim \mathrm{DP}(\alpha, \mathrm{DP}(\beta, H))$ and, marginally, $G_j \sim \mathrm{DP}(\beta, H)$ for every $j$. Hence each $G_j$ has the properties $E[G_j(B)] = H(B)$ and $\mathrm{Var}[G_j(B)] = H(B)(1 - H(B))/(\beta + 1)$ for any measurable set $B$.
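A draw from the nested construction can be sketched the same way: one outer stick over whole distributions and one inner stick within each of them. This is a hedged illustration under assumed truncation levels $K$ and $L$, not the paper's algorithm:

```python
import numpy as np

def sticks(conc, n, rng):
    """Truncated stick-breaking weights with Beta(1, conc) sticks."""
    v = rng.beta(1.0, conc, size=n)
    v[-1] = 1.0  # close the stick
    return v * np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))

def ndp_draw(alpha, beta, base_sampler, J, K, L, rng):
    """Truncated draw of {G_1, ..., G_J} ~ nDP(alpha, beta, H).

    Centers that pick the same index z_j share the same distribution
    G*_{z_j}, which is what clusters centers as well as observations.
    """
    pi = sticks(alpha, K, rng)                        # weights of Q over the G*_k
    atoms = [base_sampler(L, rng) for _ in range(K)]  # theta*_{lk} ~ H
    w = [sticks(beta, L, rng) for _ in range(K)]      # weights within each G*_k
    z = rng.choice(K, size=J, p=pi)                   # G_j = G*_{z_j}
    return z, atoms, w

rng = np.random.default_rng(1)
z, atoms, w = ndp_draw(alpha=3.0, beta=3.0,
                       base_sampler=lambda n, r: r.normal(size=n),
                       J=10, K=35, L=55, rng=rng)
y = rng.choice(atoms[z[0]], p=w[z[0]])  # one observation from center j = 0
```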

Prior correlation
The prior correlation between two distributions $G_j$ and $G_{j'}$, $j \neq j'$, is
$$\mathrm{Corr}\big(G_j(B), G_{j'}(B)\big) = \frac{1}{1 + \alpha}.$$
The prior correlation between draws from the process is
$$\mathrm{Corr}(\theta_{ij}, \theta_{i'j'}) = \frac{1}{(1 + \alpha)(1 + \beta)} \quad (j \neq j'), \qquad \mathrm{Corr}(\theta_{ij}, \theta_{i'j}) = \frac{1}{1 + \beta}.$$
The correlation within a center is therefore larger than the correlation between centers. The model generalizes three standard cases in the limits: as $\alpha \to 0$ all centers share a single common distribution, as $\alpha \to \infty$ the $G_j$ become independent $\mathrm{DP}(\beta, H)$ draws, and as $\beta \to \infty$ each $G_k^*$ converges to $H$.
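The first of these formulas is easy to verify numerically. A small Monte Carlo sketch (our own code; $B$ is taken to be the negative half-line under $H = N(0,1)$, and the truncation levels are assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
alpha, beta, K, L, reps = 2.0, 3.0, 60, 60, 20000

def sticks(conc, shape, rng):
    """Stick-breaking weights along the last axis (Beta(1, conc) sticks)."""
    v = rng.beta(1.0, conc, size=shape)
    v[..., -1] = 1.0  # close the stick
    head = np.ones(shape[:-1] + (1,))
    tail = np.cumprod(1.0 - v[..., :-1], axis=-1)
    return v * np.concatenate([head, tail], axis=-1)

pairs = np.empty((reps, 2))
for r in range(reps):
    pi = sticks(alpha, (K,), rng)         # outer weights over the G*_k
    w = sticks(beta, (K, L), rng)         # inner weights within each G*_k
    atoms = rng.normal(size=(K, L))       # theta*_{lk} ~ H = N(0, 1)
    GB = (w * (atoms < 0.0)).sum(axis=1)  # G*_k(B) for B = (-inf, 0)
    z = rng.choice(K, size=2, p=pi)       # the distributions picked by j, j'
    pairs[r] = GB[z]

print(np.corrcoef(pairs.T)[0, 1], "vs", 1.0 / (1.0 + alpha))  # both ~ 1/3
```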

Truncations
For posterior computation, both stick-breaking sums are truncated: we keep $K$ distributional atoms $G_k^*$ and $L$ atoms $\theta_{lk}^*$ within each of them, in the spirit of the blocked Gibbs sampler of Ishwaran and James (2001).

Truncation error example for nDP(3, 3, H): as the number of groups $J$ increases, $K$ needs to be increased. A typical choice is $K = 35$ and $L = 55$.
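A quick heuristic for why levels like these suffice (a back-of-the-envelope check, not the paper's formal truncation-error bound): with Beta$(1, c)$ sticks, the expected mass left after the first $n$ atoms is $(c/(c+1))^n$.

```python
# Expected leftover mass E[prod_{k=1}^{n} (1 - v_k)] = (c / (c + 1))**n
# for v_k ~ Beta(1, c); a heuristic check, not the paper's exact bound.
def leftover(c, n):
    return (c / (c + 1.0)) ** n

print(leftover(3.0, 35))  # outer level, K = 35 -> ~4.3e-05
print(leftover(3.0, 55))  # inner level, L = 55 -> ~1.3e-07
```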

Sampling by double truncation
With both levels truncated, the nDP reduces to a finite mixture of finite mixtures, and all unknowns can be updated with a blocked Gibbs sampler.

Simulated data
The simulation illustrates the discriminating capability of the nDP and its ability to provide more accurate density estimates.

Density estimation results: case (a) uses the nDP and case (b) uses the DPM. The nDP captures the small mode better and also emphasizes the importance of the main mode. The relative entropy (KL divergence) between the estimate (red) and the true distribution (black) is 0.011 under the nDP, and larger under the DPM.
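The reported "entropy of the estimation to the true distribution" is the relative entropy (KL divergence); for densities tabulated on a grid it can be approximated as below (a generic sketch, not the paper's code):

```python
import numpy as np

def kl_on_grid(p_true, p_est, dx):
    """Approximate KL(p_true || p_est) for densities tabulated on a grid."""
    mask = p_true > 0
    return np.sum(p_true[mask] * np.log(p_true[mask] / p_est[mask])) * dx
```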

Health care quality in the United States
Data: 3077 hospitals in 51 territories (50 states plus DC). The number of hospitals per state varies, as does the number of patients per hospital. Four covariates are available for each center: type of hospital, ownership, whether the hospital provides emergency services, and whether it has an accreditation. We are interested in clustering states according to their quality. After adjusting for the effects of the available covariates, we fit a main-effects ANOVA and use the nDP to model the state-specific error distributions.
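In symbols, a plausible reading of this model (the notation is ours, not taken verbatim from the slides):
$$y_{ij} = \mu + x_{ij}^{\top}\gamma + \epsilon_{ij}, \qquad \epsilon_{ij} \sim F_j, \qquad F_j(\cdot) = \int N(\cdot \mid \theta, \sigma^2)\, G_j(d\theta), \qquad \{G_1, \dots, G_{51}\} \sim \mathrm{nDP}(\alpha, \beta, H),$$
where $j$ indexes states and $i$ indexes patient outcomes within state $j$; states with the same $G_j$ form a cluster.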

Conclusion
The authors proposed the nested Dirichlet process to simultaneously cluster groups and observations within groups. Groups are clustered by their entire distribution rather than by particular features of it. While nonparametric, the nDP encompasses a number of typical parametric and nonparametric models as limiting cases.