Learning the Structure of Deep Sparse Graphical Models. Ryan Prescott Adams, Hanna M. Wallach, Zoubin Ghahramani. Presented by Zhengming Xing. Some pictures are directly copied from the paper and Hanna Wallach's slides.

Outline: Introduction; Finite belief networks; Infinite belief networks; Inference; Experiments.

Introduction. Main contribution: combining deep belief networks with nonparametric Bayesian methods. Main idea: use the Indian Buffet Process (IBP) to learn the structure of the network, where structure means the network's depth, width, and connectivity.

Single-layer network. Use a binary matrix to represent the network: a black entry denotes 1 (the two units are connected), a white entry denotes 0 (the two units are not connected). The IBP can serve as the prior on a binary matrix Z with an unbounded number of columns.

Review of the IBP. 1. The first customer tries Poisson(α) dishes. 2. The Nth customer tries each previously tasted dish k with probability m_k/N, where m_k is the number of earlier customers who tried dish k, and then tries Poisson(α/N) new dishes.
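The culinary process above maps directly onto code. The following minimal Python sketch draws one matrix Z from the one-parameter IBP prior; the function name sample_ibp and its interface are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def sample_ibp(num_customers, alpha, rng=None):
    """Draw one binary matrix Z from the one-parameter IBP prior.

    Rows are customers (units in a layer), columns are dishes
    (potential parent units). A minimal illustration of the culinary
    metaphor, not the paper's implementation.
    """
    rng = np.random.default_rng() if rng is None else rng
    counts = []   # counts[k] = number of customers who have tried dish k
    rows = []
    for n in range(1, num_customers + 1):
        # Try each previously tasted dish k with probability m_k / n ...
        row = [rng.random() < m_k / n for m_k in counts]
        # ... then try Poisson(alpha / n) brand-new dishes.
        k_new = rng.poisson(alpha / n)
        row += [True] * k_new
        counts = [m + r for m, r in zip(counts, row)] + [1] * k_new
        rows.append(row)
    Z = np.zeros((num_customers, len(counts)), dtype=int)
    for i, row in enumerate(rows):
        Z[i, :len(row)] = row
    return Z

Z = sample_ibp(num_customers=10, alpha=2.0)
print(Z.shape, Z.sum(axis=1))  # each row sums to ~Poisson(alpha) dishes
```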

Multi-layer network

Cascading IBP. The cascade is also parameterized by α: each dish in one restaurant is also a customer in another Indian buffet process, linking the layers into a chain of binary matrices. Each matrix is exchangeable in both its rows and its columns, and the chain reaches a layer with no units, and hence terminates, with probability one. Properties (writing K^(m) for the number of units in layer m): a unit in layer m+1 has α parents in expectation, while its expected number of children grows with the width K^(m) of the layer below.

Sample from the CIBP prior
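As a rough illustration of how such samples arise, the sketch below chains the sample_ibp function from the previous snippet: the dishes of each restaurant become the customers of the next, and the recursion stops when a layer introduces no units. The max_depth safety cap is an assumption for the sketch, not part of the CIBP itself.

```python
import numpy as np  # sample_ibp is defined in the previous sketch

def sample_cibp(num_visible, alpha, max_depth=50, rng=None):
    """Draw a network structure (a list of binary matrices) from the CIBP.

    Layer 1 has num_visible units (the observed dimension); the dishes
    of IBP m become the customers of IBP m+1. The chain terminates with
    probability one, but max_depth caps the sketch defensively.
    """
    rng = np.random.default_rng() if rng is None else rng
    Zs = []            # Zs[i] has rows = layer i+1 units, cols = layer i+2 units
    width = num_visible
    for _ in range(max_depth):
        Z = sample_ibp(width, alpha, rng=rng)
        if Z.shape[1] == 0:      # no dishes: the chain has terminated
            break
        Zs.append(Z)
        width = Z.shape[1]
    return Zs

Zs = sample_cibp(num_visible=9, alpha=1.5)
print([Z.shape for Z in Zs])     # widths of the successive hidden layers
```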

Model. Here m indexes the layers, increasing up to the top layer M. Each unit's activation is a sigmoid of the weighted sum of its parents' activations plus a bias, corrupted by Gaussian noise: u_k^(m) = σ( Σ_k' Z^(m)_{k,k'} W^(m)_{k,k'} u^(m+1)_{k'} + γ^(m)_k + ε ). Place layer-wise Gaussian priors on the weights and biases, and a Gamma prior on the noise precision.
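To make the generative process concrete, here is a hedged sketch of drawing one observation given a structure Zs from the CIBP sampler above. Keeping the noise precision nu fixed (rather than Gamma-distributed) and placing the noise inside the sigmoid follow the slide's description but simplify the full model.

```python
import numpy as np

def sample_network(Zs, sigma_w=1.0, sigma_b=1.0, nu=10.0, rng=None):
    """Generate one visible-layer sample from the belief network (sketch).

    Each unit's activation is a sigmoid of the weighted sum of its
    parents' activations plus a bias and Gaussian noise; weights and
    biases are drawn from their zero-mean Gaussian priors on the fly.
    """
    rng = np.random.default_rng() if rng is None else rng
    sigmoid = lambda y: 1.0 / (1.0 + np.exp(-y))
    top_width = Zs[-1].shape[1]
    # Topmost units have no parents: bias plus noise only.
    u = sigmoid(sigma_b * rng.standard_normal(top_width)
                + rng.normal(0.0, nu ** -0.5, top_width))
    for Z in reversed(Zs):       # propagate from layer M down to layer 1
        W = sigma_w * rng.standard_normal(Z.shape)  # weights on the edges
        bias = sigma_b * rng.standard_normal(Z.shape[0])
        noise = rng.normal(0.0, nu ** -0.5, Z.shape[0])
        u = sigmoid((Z * W) @ u + bias + noise)
    return u                     # activations of the visible layer
```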

Inference. The weights, biases, and noise variances can be sampled with a Gibbs sampler.
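Conditioned on the unit activations (and hence the pre-sigmoid values), the model is linear-Gaussian in each weight, so each weight has a Gaussian full conditional. Below is a minimal sketch of one such update; the argument names are illustrative.

```python
import numpy as np

def gibbs_weight(y_resid, parent_u, nu, sigma_w, rng):
    """Resample a single weight from its Gaussian full conditional.

    y_resid: the child's pre-sigmoid activations with all other terms
             subtracted out (one entry per data point);
    parent_u: the corresponding parent activations;
    nu: noise precision; sigma_w: prior standard deviation.
    """
    prec = 1.0 / sigma_w**2 + nu * np.sum(parent_u**2)   # posterior precision
    mean = nu * np.sum(parent_u * y_resid) / prec        # posterior mean
    return rng.normal(mean, prec ** -0.5)
```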

Inference (sampling Z). Two steps: (1) resample the existing dishes (connections) conditioned on the rest of the matrix; (2) MH-sample the network size: propose adding a new unit and inserting a connection to it, or removing the connection to an existing singleton unit, and accept or reject according to the MH ratio.
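The add/remove moves are ordinary Metropolis-Hastings proposals. A generic accept step might look like the sketch below, with the posterior and proposal-correction terms computed elsewhere and passed in; the function name and signature are assumptions, not the paper's code.

```python
import numpy as np

def mh_accept(log_post_new, log_post_old, log_q_ratio, rng):
    """Generic Metropolis-Hastings accept/reject step (a sketch).

    log_q_ratio = log q(old | new) - log q(new | old), the proposal
    correction for asymmetric moves such as adding or removing a unit.
    """
    log_ratio = log_post_new - log_post_old + log_q_ratio
    return np.log(rng.random()) < log_ratio  # accept with prob min(1, ratio)
```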

Experimental results: Olivetti faces. The bottom halves of the test images are removed and then reconstructed by the model.

Experimental results: MNIST digits.

Experimental results: Frey faces.