1
The Nested Dirichlet Process Duke University Machine Learning Group Presented by Kai Ni, Nov. 10, 2006 Paper by Abel Rodriguez, David B. Dunson, and Alan E. Gelfand, submitted to JASA, 2006
2
Outline Introduction Nested Dirichlet process Application to haplotype inference
3
Motivation General problem – Extending the Dirichlet process to accommodate multiple dependent distributions. Methods –Inducing dependence through a shared source. For example, the dependent Dirichlet process (DDP) and the hierarchical Dirichlet process (HDP). –Inducing dependence through linear combinations of realizations of independent Dirichlet processes. For example, Müller (2004) defines the distribution of each group as a mixture of a global component and a local component.
4
Background The paper is motivated by two related problems: clustering probability distributions and simultaneous multilevel clustering in a nested setting. Consider the example of hospital analysis: –In assessing quality of care, we need to cluster centers according to the distribution of patient outcomes and to identify outlying centers. –We also want to simultaneously cluster patients within the centers, borrowing information across centers that have similar clusters.
5
The Dirichlet process A single clustering problem can be analyzed with a Dirichlet process (DP). The stick-breaking construction is usually the starting point of analysis: $G = \sum_{k=1}^{\infty} \pi_k \delta_{\theta_k^*}$, with atoms $\theta_k^* \sim H$ i.i.d. and weights $\pi_k = u_k \prod_{l<k}(1-u_l)$, $u_k \sim \mathrm{Beta}(1-a, b+ka)$. General $(a, b)$ yields the Pitman-Yor process; $a = 0$ and $b = \alpha$, i.e. $u_k \sim \mathrm{Beta}(1, \alpha)$, results in the standard DP.
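To make the stick-breaking construction concrete, here is a minimal Python sketch (ours, not from the slides); the truncation level K, the helper name, and the N(0, 1) base measure are illustrative assumptions.

```python
import numpy as np

def sample_dp_stick_breaking(alpha, base_sampler, K=50, rng=None):
    """Draw an approximate G ~ DP(alpha, H) via truncated stick-breaking:
    pi_k = u_k * prod_{l<k} (1 - u_l), with u_k ~ Beta(1, alpha)."""
    rng = np.random.default_rng() if rng is None else rng
    u = rng.beta(1.0, alpha, size=K)
    u[-1] = 1.0  # force the truncated weights to sum to one
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - u[:-1])))
    weights = u * remaining
    atoms = base_sampler(K, rng)  # K i.i.d. draws from the base measure H
    return atoms, weights

# Example with H = N(0, 1) as the base measure (an illustrative choice):
atoms, weights = sample_dp_stick_breaking(
    alpha=1.0, base_sampler=lambda k, rng: rng.normal(0.0, 1.0, size=k))
```

For the Pitman-Yor generalization one would instead draw u_k ~ Beta(1 − a, b + k·a).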
6
The nested Dirichlet process mixture Suppose $y_{ij}$, for $i = 1, \dots, n_j$, are observations within center $j$. We assume exchangeability for centers, with $y_{ij} \sim F_j$. A collection of distributions $\{F_1, \dots, F_J\}$ is said to follow a nested Dirichlet process mixture if $F_j(\cdot) = \int p(\cdot \mid \theta)\, G_j(d\theta)$ with $\{G_1, \dots, G_J\} \sim \mathrm{nDP}(\alpha, \beta, H)$.
7
The nested Dirichlet process The collection $\{G_1, \dots, G_J\}$, used as the mixing distribution, is said to follow a nested Dirichlet process with parameters $\alpha$, $\beta$, and $H$ if $G_j \sim Q$, where $Q = \sum_{k=1}^{\infty} \pi_k^* \delta_{G_k^*}$ and $G_k^* = \sum_{l=1}^{\infty} w_{lk}^* \delta_{\theta_{lk}^*}$, with $\theta_{lk}^* \sim H$ and the weights $\{\pi_k^*\}$ and $\{w_{lk}^*\}$ generated by stick-breaking with parameters $\alpha$ and $\beta$, respectively. From the construction, we have $\Pr(G_j = G_{j'}) = 1/(\alpha+1)$ for $j \neq j'$, and marginally $G_j \sim \mathrm{DP}(\beta H)$ for every $j$. For each $G_j$ we have the properties $E[G_j(B)] = H(B)$ and $\mathrm{Var}[G_j(B)] = H(B)(1 - H(B))/(\beta + 1)$.
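A minimal sketch of the nested construction (our illustration, reusing the sample_dp_stick_breaking helper above; the truncation levels and function names are assumptions, not the paper's code): the outer stick picks which of K candidate distributions each center uses, and each candidate is itself a truncated DP.

```python
def sample_ndp_prior(alpha, beta, base_sampler, J, K=35, L=55, rng=None):
    """Draw J center distributions from a doubly truncated nDP(alpha, beta, H):
    outer stick-breaking weights pi over K candidate distributions G*_k,
    each G*_k a truncated DP(beta, H) with L atoms."""
    rng = np.random.default_rng() if rng is None else rng
    _, pi = sample_dp_stick_breaking(alpha, base_sampler, K=K, rng=rng)
    G_star = [sample_dp_stick_breaking(beta, base_sampler, K=L, rng=rng)
              for _ in range(K)]
    # Each center picks one shared candidate distribution; ties across
    # centers are exactly what clusters centers by their entire distribution.
    assignments = rng.choice(K, size=J, p=pi)
    return [G_star[k] for k in assignments], assignments
```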
8
Prior correlation The prior correlation between two distributions $G_j$ and $G_{j'}$ ($j \neq j'$) is $\mathrm{Corr}(G_j(B), G_{j'}(B)) = 1/(\alpha+1)$. The prior correlation between draws from the process is $\mathrm{Corr}(\theta_{ij}, \theta_{i'j}) = 1/(\beta+1)$ within a center and $\mathrm{Corr}(\theta_{ij}, \theta_{i'j'}) = 1/[(\alpha+1)(\beta+1)]$ between centers, so the correlation within a center is larger than the one between centers. Three standard cases are recovered as limits of the parameters: as $\alpha \to \infty$ the $G_j$ become independent $\mathrm{DP}(\beta H)$ draws, as $\alpha \to 0$ all centers share a single common DP, and as $\beta \to \infty$ each $G_k^*$ reduces to $H$.
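As a quick numerical check of these formulas (using the $\alpha = \beta = 3$ values from the truncation example below, an illustrative choice): $\mathrm{Corr}(G_j(B), G_{j'}(B)) = 1/4 = 0.25$; within a center, $\mathrm{Corr}(\theta_{ij}, \theta_{i'j}) = 1/4 = 0.25$; between centers, $\mathrm{Corr}(\theta_{ij}, \theta_{i'j'}) = 1/16 = 0.0625$, confirming that within-center correlation dominates.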
9
Truncations For computation, both infinite stick-breaking sums are truncated: the outer sum at $K$ distributional atoms $G_k^*$ and the inner sums at $L$ atoms $\theta_{lk}^*$ per distribution.
10
Truncation error example for nDP(3, 3, H) As the number of groups J increases, K needs to be increased. A typical choice is K = 35 and L = 55.
11
Sampling by double truncation
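To illustrate the key step that clusters centers by their whole distributions, here is a hedged sketch of the center-indicator update in a blocked Gibbs sampler under double truncation (a sketch under assumptions: unit-variance Gaussian kernel and the function name are ours; the full sampler in the paper also updates within-center indicators, stick weights, and atoms).

```python
import numpy as np
from scipy.stats import norm

def center_assignment_probs(y_j, pi, G_star):
    """Probability that center j uses each distributional atom G*_k,
    proportional to pi_k * prod_i sum_l w_lk * p(y_ij | theta_lk).
    G_star is a list of (atoms, weights) pairs as in the sketch above;
    the N(theta, 1) observation kernel is an illustrative assumption."""
    log_p = np.log(pi)
    for k, (atoms, w) in enumerate(G_star):
        # Mixture density of center j's data under candidate distribution k.
        dens = norm.pdf(y_j[:, None], loc=atoms[None, :], scale=1.0) @ w
        log_p[k] += np.sum(np.log(dens + 1e-300))
    p = np.exp(log_p - log_p.max())  # stabilize before normalizing
    return p / p.sum()
```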
13
Simulated data This experiment demonstrates the discriminating capability of the nDP and its ability to provide more accurate density estimates.
15
Density estimation result Case (a) uses the nDP and case (b) uses the DPM. The nDP captures the small mode better and also emphasizes the importance of the main mode. The relative entropy of the estimate (red) to the true distribution (black) is 0.011 under the nDP, while under the DPM it is 0.017.
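For reference, a discrepancy like this can be computed numerically; a minimal sketch, assuming both densities are tabulated on a common evenly spaced grid (our assumption about how such a number would be obtained):

```python
import numpy as np

def kl_on_grid(p_true, p_est, dx):
    """Numerical relative entropy KL(true || estimate) for densities
    tabulated on an evenly spaced grid with spacing dx."""
    mask = p_true > 0
    return dx * np.sum(p_true[mask] * np.log(p_true[mask] / p_est[mask]))
```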
16
Health care quality in the United States Data – 3077 hospitals in 51 territories (50 states + DC). Both the number of hospitals per state and the number of patients per hospital vary. Four covariates are available for each center: type of hospital, ownership, whether the hospital provides emergency services, and whether it is accredited. We are interested in clustering states according to their quality. After adjusting for the effect of the available covariates through a main-effects ANOVA, we use the nDP to model the state-specific error distributions.
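In symbols, the model described here is plausibly of the form (our reconstruction; the notation is not from the slide): $y_{ij} = x_{ij}^{\top}\gamma + \epsilon_{ij}$ with $\epsilon_{ij} \sim F_j$ and $\{F_1, \dots, F_{51}\}$ following a nested Dirichlet process mixture, where $j$ indexes states and $i$ indexes patients within state $j$.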
19
Conclusion The authors proposed the nested Dirichlet process to simultaneously cluster groups and observations within groups. Groups are clustered by their entire distributions rather than by particular features of them. While nonparametric, the nDP encompasses a number of typical parametric and nonparametric models as limiting cases.