A Non-Parametric Bayesian Method for Inferring Hidden Causes


A Non-Parametric Bayesian Method for Inferring Hidden Causes, by F. Wood, T. L. Griffiths and Z. Ghahramani. Discussion led by Qi An, ECE, Duke University.

Outline
- Introduction
- A generative model with hidden causes
- Inference algorithms
- Experimental results
- Conclusions

Introduction
A variety of methods from Bayesian statistics have been applied to recover model structure from a set of observed variables. Two broad strategies exist:
- Find the dependencies among the observed variables themselves.
- Introduce hidden causes and infer their influence on the observed variables.

Introduction
Learning a model structure that contains hidden causes presents a significant challenge:
- The number of hidden causes is unknown and potentially unbounded.
- The relation between hidden causes and observed variables is unknown.
Previous Bayesian approaches assume the number of hidden causes is finite and fixed.

A hidden causal structure
Assume we have T samples of N binary variables. Let X be the N×T matrix of data. Introduce K binary hidden causes, also observed over T samples: let Y be the K×T matrix of hidden causes and Z be the N×K dependency matrix between X and Y, with z_ik = 1 when hidden cause k can influence observed variable i. K can potentially be infinite.

A hidden causal structure
[Figure: bipartite graph linking hidden causes (diseases) to observed variables (symptoms).]

A generative model
Our goal is to estimate the dependency matrix Z and the hidden causes Y. From Bayes' rule,
P(Z, Y | X) ∝ P(X | Y, Z) P(Y) P(Z).
We start by assuming K is finite, and then consider the case where K → ∞.

A generative model
Assume the entries of X are conditionally independent given Z and Y, and are generated from a noisy-OR distribution:
P(x_it = 1 | Z, Y) = 1 − (1 − λ)^(z_i · y_t) (1 − ε),
where z_i · y_t = Σ_k z_ik y_kt counts the hidden causes that are active for x_it, ε is a baseline probability that x_it = 1 when no cause is active, and λ is the probability with which any one active hidden cause takes effect.
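To make the likelihood concrete, here is a minimal NumPy sketch of the noisy-OR computation above. The function name noisy_or_loglik and the default values of λ and ε are illustrative assumptions, not from the paper; the shapes follow the earlier slide (X is N×T, Z is N×K, Y is K×T).

```python
# Minimal sketch of the noisy-OR likelihood; names and defaults are illustrative.
import numpy as np

def noisy_or_loglik(X, Z, Y, lam=0.9, eps=0.01):
    """log P(X | Z, Y) with P(x_it = 1 | Z, Y) = 1 - (1 - lam)^(z_i . y_t) * (1 - eps)."""
    A = Z @ Y                                  # A[i, t] = number of active causes of x_it
    p1 = 1.0 - (1.0 - lam) ** A * (1.0 - eps)  # probability that x_it = 1
    return np.sum(X * np.log(p1) + (1 - X) * np.log1p(-p1))
```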

A generative model
The entries of Y are assumed to be drawn i.i.d. from a Bernoulli(p) distribution. Each column of Z is assumed to be Bernoulli(θ_k) distributed. If we further assume a Beta(α/K, 1) hyper-prior on each θ_k and integrate it out,
P(Z) = Π_{k=1..K} (α/K) Γ(m_k + α/K) Γ(N − m_k + 1) / Γ(N + 1 + α/K),
where m_k = Σ_i z_ik is the number of observed variables connected to hidden cause k. These assumptions on Z are exactly the same as those underlying the Indian buffet process (IBP).
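This column-wise integral is easy to evaluate in log space; a sketch using SciPy's gammaln follows, with log_prior_Z_finite as a hypothetical name.

```python
# Sketch of the finite-K prior on Z with theta_k ~ Beta(alpha/K, 1) integrated out.
import numpy as np
from scipy.special import gammaln

def log_prior_Z_finite(Z, alpha):
    N, K = Z.shape
    m = Z.sum(axis=0)   # m_k = number of observed variables linked to cause k
    a = alpha / K
    # log of  prod_k  a * Gamma(m_k + a) * Gamma(N - m_k + 1) / Gamma(N + 1 + a)
    return np.sum(np.log(a) + gammaln(m + a) + gammaln(N - m + 1) - gammaln(N + 1 + a))
```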

Taking the infinite limit
If we let K approach infinity, the distribution on X remains well defined, and we only need to concern ourselves with the columns of Z (equivalently, the rows of Y) for which m_k > 0. After some algebra, and after reordering the columns of Z into equivalence classes, the distribution on Z becomes
P([Z]) = (α^{K+} / Π_{h>0} K_h!) exp(−α H_N) Π_{k=1..K+} (N − m_k)! (m_k − 1)! / N!,
where K+ is the number of non-empty columns, K_h is the number of columns whose binary history equals h, and H_N = Σ_{j=1..N} 1/j is the N-th harmonic number.
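For reference, here is a sketch of evaluating the log of this equivalence-class probability; grouping identical columns yields the K_h! terms. The function name log_ibp_prob is illustrative.

```python
# Sketch of log P([Z]) under the IBP distribution above; Z is an N x K binary array.
from collections import Counter
import numpy as np
from scipy.special import gammaln

def log_ibp_prob(Z, alpha):
    N = Z.shape[0]
    m = Z.sum(axis=0)
    Z, m = Z[:, m > 0], m[m > 0]                 # keep only columns with m_k > 0
    H_N = np.sum(1.0 / np.arange(1, N + 1))      # N-th harmonic number
    hist = Counter(tuple(col) for col in Z.T)    # K_h: columns sharing a history h
    log_Kh_fact = sum(gammaln(c + 1) for c in hist.values())
    # (N - m_k)! (m_k - 1)! / N! in log space via gammaln
    return (len(m) * np.log(alpha) - log_Kh_fact - alpha * H_N
            + np.sum(gammaln(N - m + 1) + gammaln(m) - gammaln(N + 1)))
```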

The Indian buffet process
The Indian buffet process is defined in terms of a sequence of N customers entering a restaurant and choosing from an infinite array of dishes. The first customer tries the first Poisson(α) dishes. The remaining customers then enter one by one; customer i picks each previously sampled dish k with probability m_k / i, where m_k is the number of previous customers who tried that dish, and then tries Poisson(α/i) new dishes.
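The metaphor maps directly onto sampling Z: rows are customers (observed variables) and columns are dishes (hidden causes). A minimal sketch, with illustrative names and a fixed seed:

```python
# Sketch of drawing Z ~ IBP(alpha); rows = customers, columns = dishes.
import numpy as np

def sample_ibp(N, alpha, rng=np.random.default_rng(0)):
    cols, m = [], []                       # one list per dish; m[k] = takers of dish k
    for i in range(1, N + 1):
        for k in range(len(cols)):         # existing dishes: take with prob m_k / i
            take = int(rng.random() < m[k] / i)
            cols[k].append(take)
            m[k] += take
        for _ in range(rng.poisson(alpha / i)):   # then Poisson(alpha / i) new dishes
            cols.append([0] * (i - 1) + [1])
            m.append(1)
    return np.array(cols).T if cols else np.zeros((N, 0), dtype=int)
```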

Reversible jump MCMC
Because moves that add or remove hidden causes change the dimensionality of (Z, Y), sampling over structures can be framed as reversible jump MCMC, which proposes jumps between parameter spaces of different dimension.

Gibbs sampler for the infinite case
Each existing entry of Z is resampled from its conditional, P(z_ik = 1 | Z_{−ik}, X, Y) ∝ (m_{−i,k} / N) P(X | Z, Y), with additional moves for introducing new hidden causes and for resampling the entries of Y.
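A hedged sketch of one such sweep over the existing entries of Z, reusing noisy_or_loglik from the earlier sketch; λ and ε are treated as fixed, and the paper's moves for proposing new columns and for resampling Y are omitted.

```python
# Sketch of one Gibbs sweep over existing entries of Z (non-singleton columns only).
import numpy as np

def gibbs_sweep_Z(X, Z, Y, lam, eps, rng=np.random.default_rng(0)):
    N, K = Z.shape
    for i in range(N):
        for k in range(K):
            m_minus = Z[:, k].sum() - Z[i, k]   # other rows using column k
            if m_minus == 0:
                continue                        # singleton columns need birth/death moves
            logp = np.empty(2)
            for v in (0, 1):                    # score both settings of z_ik
                Z[i, k] = v
                prior = m_minus / N if v else 1.0 - m_minus / N
                logp[v] = np.log(prior) + noisy_or_loglik(X, Z, Y, lam, eps)
            p1 = 1.0 / (1.0 + np.exp(logp[0] - logp[1]))   # normalize in log space
            Z[i, k] = int(rng.random() < p1)
    return Z
```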

Experimental results: synthetic data

[Figure: results plotted against the number of iterations.]

Experimental results: real data

Conclusions
A non-parametric Bayesian technique is developed and demonstrated. It recovers the number of hidden causes correctly, can be used to obtain a reasonably good estimate of the causal structure, and can be integrated into Bayesian structure learning both over observed variables and over hidden causes.