Published by Asher Augustus Jordan. Modified over 9 years ago.
1
The Infinite Hierarchical Factor Regression Model. Piyush Rai and Hal Daume III, NIPS 2008. Presented by Bo Chen, March 26, 2009.
2
Outline: 1. Introduction; 2. The Infinite Hierarchical Factor Regression Model; 3. Indian Buffet Process and Beta Process; 4. Experiment; 5. Summary.
3
Introduction. The latent factor representation has two benefits: 1. discovering the latent process underlying the data; 2. simpler predictive modeling through a compact data representation. Large P, small N. N >= 10 · d · C. The fundamental advantages over the standard FA model: 1. it does not assume a known number of factors; 2. it does not assume the factors are independent; 3. it does not assume all features are relevant to the factor analysis.
4
Algorithm. Model:
5
Graphical Model. T is used to eliminate spurious genes (noise features): T_p determines whether the p-th customer enters the restaurant to eat any dish at all.
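The gating role of T can be illustrated with a minimal sketch. It assumes, for illustration only, that the loading matrix is an elementwise product of a binary matrix Z and real-valued weights V, and that Z is drawn from a finite Bernoulli model as a stand-in for the IBP. The names `sample_masked_loadings`, `rho`, and `pi` are hypothetical and do not come from the slides.

```python
import random

def sample_masked_loadings(P, K, rho=0.5, pi=0.3, seed=0):
    """Draw a P x K loading matrix whose p-th row is zeroed when T[p] == 0.

    Assumptions (not from the slides): T[p] ~ Bernoulli(rho),
    Z[p][k] ~ Bernoulli(pi) as a finite stand-in for the IBP,
    V[p][k] ~ N(0, 1), and A[p][k] = T[p] * Z[p][k] * V[p][k].
    """
    rng = random.Random(seed)
    # T[p] = 1 means gene (customer) p enters the restaurant at all.
    T = [1 if rng.random() < rho else 0 for _ in range(P)]
    A = []
    for p in range(P):
        row = []
        for _ in range(K):
            z = 1 if rng.random() < pi else 0   # does gene p use factor k?
            v = rng.gauss(0.0, 1.0)             # loading weight if it does
            row.append(T[p] * z * v)
        A.append(row)
    return T, A

if __name__ == "__main__":
    T, A = sample_masked_loadings(P=6, K=4, seed=1)
    for t, row in zip(T, A):
        print(t, [round(a, 2) for a in row])
```

Rows with T_p = 0 come out entirely zero, so those genes contribute to no factor.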
6
Indian Buffet Process: from latent classes to latent features. For a finite feature model: (Tom Griffiths, 2006). An Indian restaurant with countably infinitely many dishes.
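The restaurant metaphor can be sketched as a short sampler. It assumes the standard one-parameter IBP: the first customer takes Poisson(alpha) dishes, and customer i takes an existing dish k with probability m_k / i, then Poisson(alpha / i) new dishes. The function names and the Knuth-style Poisson helper are illustrative choices, not part of the slides.

```python
import math
import random

def poisson(rng, lam):
    """Knuth's multiplication method; adequate for small rates."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def sample_ibp(n_customers, alpha, seed=0):
    """Return a binary customers-by-dishes matrix as a list of rows."""
    rng = random.Random(seed)
    counts = []   # counts[k] = number of customers who took dish k so far
    rows = []
    for i in range(1, n_customers + 1):
        # Revisit each existing dish with probability m_k / i.
        row = [1 if rng.random() < m / i else 0 for m in counts]
        # Then try Poisson(alpha / i) brand-new dishes.
        new = poisson(rng, alpha / i)
        counts = [m + z for m, z in zip(counts, row)] + [1] * new
        rows.append(row + [1] * new)
    # Pad earlier rows with zeros for dishes introduced later.
    n_dishes = len(counts)
    return [r + [0] * (n_dishes - len(r)) for r in rows]

if __name__ == "__main__":
    for row in sample_ibp(n_customers=8, alpha=2.0, seed=1):
        print("".join(str(z) for z in row))
```

The number of columns is not fixed in advance; it grows with the number of customers, which is what lets the model avoid assuming a known number of factors.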
7
Differences between DP and IBP: DP class matrix vs. IBP ‘class’ matrix. 1. Latent feature; 2. Clustering; 3. Others. Different styles match different problems.
8
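Two-Parameter Finite Model. The first customer samples Poisson(α) dishes; the i-th customer samples a previously sampled dish k with probability m_k / (β + i − 1), where m_k is the number of earlier customers who chose dish k, then samples Poisson(αβ / (β + i − 1)) new dishes (Z. Ghahramani et al., 2006).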
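A minimal sketch of this two-parameter scheme follows. It assumes the standard rates: an existing dish k is taken with probability m_k / (beta + i - 1), and customer i adds Poisson(alpha * beta / (beta + i - 1)) new dishes. The helper names `reuse_prob` and `new_dish_rate` are hypothetical.

```python
import math
import random

def reuse_prob(m_k, i, beta):
    """Probability that customer i takes a dish already chosen m_k times."""
    return m_k / (beta + i - 1)

def new_dish_rate(i, alpha, beta):
    """Poisson rate for the number of new dishes tried by customer i."""
    return alpha * beta / (beta + i - 1)

def poisson(rng, lam):
    """Knuth's multiplication method; adequate for small rates."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def sample_two_param_ibp(n_customers, alpha, beta, seed=0):
    """Return a binary customers-by-dishes matrix as a list of rows."""
    rng = random.Random(seed)
    counts, rows = [], []
    for i in range(1, n_customers + 1):
        row = [1 if rng.random() < reuse_prob(m, i, beta) else 0
               for m in counts]
        new = poisson(rng, new_dish_rate(i, alpha, beta))
        counts = [m + z for m, z in zip(counts, row)] + [1] * new
        rows.append(row + [1] * new)
    n_dishes = len(counts)
    return [r + [0] * (n_dishes - len(r)) for r in rows]

if __name__ == "__main__":
    for row in sample_two_param_ibp(8, alpha=2.0, beta=0.5, seed=1):
        print("".join(str(z) for z in row))
```

Setting beta = 1 recovers the one-parameter rates m_k / i and alpha / i, so the second parameter only changes how strongly customers share dishes.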
9
Beta Process vs. IBP. Beta process (concentration c, mass γ): the first customer samples Poisson(γ) dishes; the i-th customer samples a previously sampled dish k with probability m_k / (c + i − 1), then samples Poisson(cγ / (c + i − 1)) new dishes.
10
Hierarchical Factor Prior. Kingman’s coalescent: a distribution over the genealogy of a countably infinite set of individuals, used to construct the tree structure. Brownian diffusion: a Markov process that encodes a message (mean and covariance) in each node of that tree. Reference: Y. W. Teh, H. Daume III, and D. M. Roy. Bayesian Agglomerative Clustering with Coalescents. In NIPS, 2008.
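The two ingredients can be sketched for a finite number of leaves. The sketch assumes the usual n-coalescent (with k lineages left, the waiting time to the next merge is exponential with rate k(k-1)/2 and the merging pair is uniform) and a scalar Brownian diffusion with unit variance per unit time run from the root down. The function names and the scalar, unit-variance simplification are assumptions made for illustration; this is a forward simulation of the prior, not the inference procedure used in the paper.

```python
import math
import random

def sample_coalescent(n, seed=0):
    """Simulate Kingman's n-coalescent backwards in time.

    Leaves are nodes 0..n-1 at time 0; internal nodes get ids n, n+1, ...
    Returns (children, times, root).
    """
    rng = random.Random(seed)
    active = list(range(n))
    times = {i: 0.0 for i in range(n)}
    children = {}
    t, next_id = 0.0, n
    while len(active) > 1:
        k = len(active)
        t += rng.expovariate(k * (k - 1) / 2)   # time until the next merge
        a, b = rng.sample(active, 2)            # uniformly chosen pair
        active.remove(a)
        active.remove(b)
        children[next_id] = (a, b)
        times[next_id] = t
        active.append(next_id)
        next_id += 1
    return children, times, active[0]

def diffuse(children, times, root, seed=0):
    """Run scalar Brownian motion from the root down to the leaves."""
    rng = random.Random(seed)
    value = {root: rng.gauss(0.0, 1.0)}
    stack = [root]
    while stack:
        node = stack.pop()
        for child in children.get(node, ()):
            dt = times[node] - times[child]      # branch length
            value[child] = rng.gauss(value[node], math.sqrt(dt))
            stack.append(child)
    return value

if __name__ == "__main__":
    children, times, root = sample_coalescent(5, seed=1)
    values = diffuse(children, times, root, seed=1)
    for leaf in range(5):
        print(leaf, round(values[leaf], 3))
```

Leaves that merge early share most of their path from the root, so their values are strongly correlated; this is how the tree induces dependence between factors.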
11
Feature Selection Prior. Some genes are spurious. Before any dishes are selected, these ‘spurious’ customers should leave the restaurant.
12
Provided by Piyush Rai
13
Experimental results. E. coli data: 100 samples, 50 genes, 8 underlying factors. Breast cancer data: 251 samples, 226 genes, 5 underlying factors. 1. The hierarchy can be used to find factors in order of their prominence. 2. Hierarchical modeling results in better predictive performance for the factor regression task. 3. The factor hierarchy leads to faster convergence, since most of the unlikely configurations will never be visited because they are ruled out by the hierarchy.
14
Comparison of Factor Loading Matrices Learned by Different Methods. Panels: Ground Truth; NIPS Method; Sparse BPFA on Factor Loading, VB; Sparse BPFA on Factor Score, VB.
15
Factor Regression. Training and test data are combined, and the test responses are treated as missing values to be imputed.
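The idea of treating test responses as missing entries can be sketched as follows. For simplicity the sketch assumes a single factor with a standard normal prior, isotropic Gaussian noise, and loadings that are already known; in the actual model the loadings are inferred jointly with the missing values. The function names, and the convention of storing the response as the last entry of each sample, are hypothetical.

```python
def combine(train_X, train_y, test_X):
    """Stack samples as [x_1, ..., x_P, y], with y = None for test samples."""
    rows = [list(x) + [y] for x, y in zip(train_X, train_y)]
    rows += [list(x) + [None] for x in test_X]
    return rows

def impute_responses(rows, loadings, noise_var):
    """Fill missing responses under a one-factor model.

    Assumes x = a_x * f + noise and y = a_y * f + noise with f ~ N(0, 1),
    so the posterior mean of f given x is (a_x . x) / (noise_var + a_x . a_x).
    """
    a_x, a_y = loadings[:-1], loadings[-1]
    denom = noise_var + sum(a * a for a in a_x)
    completed = []
    for row in rows:
        if row[-1] is None:
            f_hat = sum(a * x for a, x in zip(a_x, row[:-1])) / denom
            completed.append(row[:-1] + [a_y * f_hat])   # imputed response
        else:
            completed.append(list(row))                  # observed, untouched
    return completed

if __name__ == "__main__":
    data = combine([[1.0, 1.2]], [2.1], [[3.0, 3.0]])
    print(impute_responses(data, loadings=[1.0, 1.0, 2.0], noise_var=1.0))
```

Because the response is just another row of the data matrix, prediction needs no separate regression step: imputing the missing entries is the prediction.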
16
The Existing Similar FA Models. Putting a binary matrix on the factor score matrix: David Knowles and Zoubin Ghahramani, Infinite Sparse Factor Analysis and Infinite Independent Components Analysis, ICA 2007; John Paisley et al., Nonparametric Factor Analysis with Beta Process Priors, in submission, 2009. Putting a binary matrix on the factor loading matrix: Piyush Rai and Hal Daume III, The Infinite Hierarchical Factor Regression Model, NIPS 2008. Summary: 1. For ‘large P, small N’ problems, the first approach is faster, since it learns the small K×N factor score matrix; with an MCMC solution, it is difficult for the second approach to handle problems with tens of thousands of genes. 2. The second approach can explain the relationship between genes and factors (pathways).
17
The New Developments of IBP. F. Doshi, K. T. Miller, J. Van Gael, and Y. W. Teh, Variational Inference for the Indian Buffet Process, AISTATS 2009. J. Van Gael, Y. W. Teh, and Z. Ghahramani, The Infinite Factorial Hidden Markov Model, NIPS 2008. K. A. Heller and Z. Ghahramani, A Nonparametric Bayesian Approach to Modeling Overlapping Clusters, AISTATS 2007.