Generalization Performance of Exchange Monte Carlo Method for Normal Mixture Models Kenji Nagata, Sumio Watanabe Tokyo Institute of Technology

Contents  Background Normal Mixture Models Bayesian Learning MCMC method  Proposed method Exchange Monte Carlo method Application to Bayesian Learning  Experiment and Discussion  Conclusion

Background: Normal Mixture Models

A normal mixture model is a learning machine which estimates a target probability density by a sum of normal distributions; it is widely used in pattern recognition, data clustering, and many other applications. With $K$ the number of components, $M$ the dimension of the data, and parameter $w = \{(a_k, b_k)\}_{k=1}^{K}$ (mixing proportions $a_k \ge 0$ with $\sum_k a_k = 1$, and component means $b_k$), the model is
$p(x \mid w) = \sum_{k=1}^{K} a_k\, \mathcal{N}(x \mid b_k)$.
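As a concrete illustration of this definition (a minimal sketch, not taken from the slides), the following function evaluates a normal mixture density with identity-covariance components; the names `a` (mixing proportions) and `b` (component means) are assumptions made for this example.

```python
import numpy as np

def mixture_density(x, a, b):
    """Normal mixture density p(x | w) = sum_k a_k N(x | b_k, I) at a single point x.

    x : (M,) data point
    a : (K,) mixing proportions, a_k >= 0 and sum(a) == 1
    b : (K, M) component means
    """
    M = x.shape[0]
    sq_dist = np.sum((x - b) ** 2, axis=1)                  # squared distance to each mean, (K,)
    comp = np.exp(-0.5 * sq_dist) / (2 * np.pi) ** (M / 2)  # N(x | b_k, I) for each component
    return float(np.dot(a, comp))
```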

Background: Bayesian Learning

Posterior distribution: $p(w \mid X^n) \propto \varphi(w) \prod_{i=1}^{n} p(x_i \mid w)$
Marginal likelihood: $Z(X^n) = \int \varphi(w) \prod_{i=1}^{n} p(x_i \mid w)\, dw$
Predictive distribution: $p(x \mid X^n) = \int p(x \mid w)\, p(w \mid X^n)\, dw$
Empirical Kullback information: $H_n(w) = \frac{1}{n} \sum_{i=1}^{n} \log \frac{q(x_i)}{p(x_i \mid w)}$, with which $p(w \mid X^n) \propto \varphi(w) \exp(-n H_n(w))$

Here $\varphi(w)$ is the prior distribution, $q(x)$ is the true distribution, and $X^n = \{x_1, \ldots, x_n\}$ is the training data. Because these quantities are difficult to calculate analytically, the Markov Chain Monte Carlo (MCMC) method is widely used.
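Since the exact formulas on the slide are not recoverable, here is a minimal sketch (an illustration under assumed priors, not the authors' code) of the unnormalized log-posterior that the MCMC methods below would sample from, reusing the mixture density of the previous sketch; a standard normal prior on the component means and a uniform prior on the mixing proportions are assumed.

```python
import numpy as np

def log_likelihood(X, a, b):
    """Sum over the data X (shape (n, M)) of log p(x_i | w) for the identity-covariance mixture."""
    d2 = np.sum((X[:, None, :] - b[None, :, :]) ** 2, axis=2)   # (n, K) squared distances
    comp = np.exp(-0.5 * d2) / (2 * np.pi) ** (X.shape[1] / 2)  # N(x_i | b_k, I)
    return float(np.sum(np.log(comp @ a)))

def log_posterior_unnorm(X, a, b):
    """log phi(w) + sum_i log p(x_i | w), up to an additive constant."""
    if np.any(a < 0) or not np.isclose(a.sum(), 1.0):
        return -np.inf                        # outside the simplex: zero prior mass
    log_prior = -0.5 * np.sum(b ** 2)         # assumed standard normal prior on the means
    return log_prior + log_likelihood(X, a, b)
```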

Background: MCMC method

MCMC is an algorithm to generate a sample sequence which converges to the target distribution. In the Metropolis algorithm, a candidate $w'$ is proposed from the current position $w_t$; set $w_{t+1} = w'$ with probability $\min\{1, p(w')/p(w_t)\}$, and otherwise set $w_{t+1} = w_t$ as the next position. For multimodal target distributions such as the posterior of a normal mixture model, the chain mixes slowly: huge computational cost!!
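A minimal sketch of the Metropolis algorithm described above, assuming a Gaussian random-walk proposal and an unnormalized log target; the step size and the 1-D example target are illustrative choices, not values from the slides.

```python
import numpy as np

def metropolis(log_target, w0, n_steps, step=0.1, rng=None):
    """Random-walk Metropolis sampler for an unnormalized log target."""
    rng = np.random.default_rng() if rng is None else rng
    w = np.asarray(w0, dtype=float)
    samples = []
    for _ in range(n_steps):
        w_prop = w + step * rng.standard_normal(w.shape)    # propose a move
        # accept with probability min(1, target(w_prop) / target(w)), otherwise keep w
        if np.log(rng.random()) < log_target(w_prop) - log_target(w):
            w = w_prop
        samples.append(w.copy())                            # the next position of the chain
    return np.array(samples)

# Example: sampling a 1-D standard normal target.
chain = metropolis(lambda w: -0.5 * float(w @ w), np.zeros(1), 5000)
print(chain.mean(), chain.std())
```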

Purpose
 - We propose that the exchange MC method is appropriate for Bayesian learning in hierarchical learning machines.
 - We clarify its effectiveness by experimental results.

Contents  Background Normal Mixture Models Bayesian Learning MCMC method  Proposed method Exchange Monte Carlo method Application to Bayesian Learning  Experiment and Discussion  Conclusion

Exchange Monte Carlo method [Hukushima, 96]

We consider obtaining sample sequences from the following simultaneous (joint) distribution of $L$ tempered replicas:
$p(w_1, \ldots, w_L) = \prod_{l=1}^{L} p_l(w_l)$, where $p_l$ is the target distribution at inverse temperature $t_l$, $0 \le t_1 < t_2 < \cdots < t_L = 1$.

< Algorithm > The following two steps are performed alternately:
1. Each sequence is obtained from its own target distribution $p_l$ by using the Metropolis algorithm, independently and simultaneously, for a few iterations.
2. An exchange of two neighbouring positions, $w_l$ and $w_{l+1}$, is tried and accepted with the following probability:
$u = \min\left\{1,\ \frac{p_l(w_{l+1})\, p_{l+1}(w_l)}{p_l(w_l)\, p_{l+1}(w_{l+1})}\right\}$.
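The two alternating steps above can be sketched as follows; this is an illustrative implementation (placeholder names, not the authors' code) that assumes targets of the form $p_l(w) \propto \exp(-t_l E(w))$, for which the exchange probability reduces to $\min\{1, \exp((t_{l+1} - t_l)(E(w_{l+1}) - E(w_l)))\}$.

```python
import numpy as np

def metropolis_steps(log_target, w, n_steps, step, rng):
    """A few random-walk Metropolis updates of w against an unnormalized log target."""
    for _ in range(n_steps):
        w_prop = w + step * rng.standard_normal(w.shape)
        if np.log(rng.random()) < log_target(w_prop) - log_target(w):
            w = w_prop
    return w

def exchange_mc(energy, temps, w_init, n_sweeps, inner_steps=5, step=0.1, seed=0):
    """Exchange MC for targets p_l(w) ~ exp(-temps[l] * energy(w)); temps[-1] should be 1."""
    rng = np.random.default_rng(seed)
    ws = [np.array(w_init, dtype=float) for _ in temps]
    samples = []                                     # draws from the temps[-1] (original) replica
    for _ in range(n_sweeps):
        # Step 1: update every replica with the Metropolis algorithm, independently.
        for l, t in enumerate(temps):
            ws[l] = metropolis_steps(lambda w, t=t: -t * energy(w), ws[l], inner_steps, step, rng)
        # Step 2: try to exchange neighbouring replicas.
        for l in range(len(temps) - 1):
            log_u = (temps[l + 1] - temps[l]) * (energy(ws[l + 1]) - energy(ws[l]))
            if np.log(rng.random()) < log_u:
                ws[l], ws[l + 1] = ws[l + 1], ws[l]
        samples.append(ws[-1].copy())
    return np.array(samples)
```

High-temperature (small $t_l$) replicas move freely between modes, and accepted exchanges propagate those states toward $t_L = 1$, which is why the replica at the original target mixes much faster than a single Metropolis chain.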

Exchange Monte Carlo method [Hukushima, 96] (figure)

Application to Bayesian learning

The target distributions are obtained by introducing an inverse temperature $t$ into the Bayesian posterior:
$p_t(w \mid X^n) \propto \varphi(w) \exp(-n\, t\, H_n(w))$,
so that $t = 0$ gives the prior $\varphi(w)$ and $t = 1$ gives the posterior $p(w \mid X^n)$; the prior here is the standard normal distribution.
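A minimal sketch of how such tempered targets can be built for the sampler sketched earlier; `log_prior` and `loglik` stand for model-specific functions like those in the previous sketches, and the geometric temperature ladder is an illustrative choice, not the schedule used in the slides. Note that the prior factor $\varphi(w)$ is shared by all replicas and cancels in the exchange probability, so the swap step is unchanged.

```python
import numpy as np

def make_tempered_log_target(log_prior, loglik, t):
    """Return w -> log phi(w) + t * sum_i log p(x_i | w), i.e. log p_t(w | X^n) up to a constant."""
    return lambda w: log_prior(w) + t * loglik(w)

# Illustrative geometric ladder from the prior (t = 0) to the posterior (t = 1).
temps = np.concatenate(([0.0], 1.5 ** np.arange(-10.0, 1.0)))
```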

Contents
 - Background: Normal Mixture Models / Bayesian Learning / MCMC method
 - Proposed method: Exchange Monte Carlo method / Application to Bayesian Learning
 - Experiment and Discussion
 - Conclusion

Experimental Settings
 - dimension of data: 2
 - number of components / number of training data: (values given on the slide)
 - 5 components
 - uniform distribution on [0, 1]; standard normal distribution

Experimental Settings 1. Exchange Monte Carlo method (EMC): Monte Carlo (MC) steps and samples used for the expectation (figure).

Experimental Settings 2. Conventional Metropolis algorithm (CM): MC steps and samples used for the expectation (figure).

Experimental Settings 3. Parallel Metropolis algorithm (PM): MC steps and samples used for the expectation (figure).

Experimental Settings: the initial value of each chain is obtained by random sampling from the prior distribution, and for calculating the expectation we use the last 50% of the sample sequence.
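A minimal sketch of this rule, with placeholder names: discard the first half of the chain as burn-in and average the quantity of interest over the rest.

```python
import numpy as np

def posterior_expectation(samples, f):
    """Average f(w) over the last 50% of the sample sequence."""
    kept = samples[len(samples) // 2:]            # discard the first half as burn-in
    return np.mean([f(w) for w in kept], axis=0)
```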

Experimental result (histogram): marginal distribution of a parameter at MC step 3200 for (1) EMC, (2) CM, and (3) PM. The true marginal distribution has two peaks, around 0 and 0.5. The CM algorithm cannot approximate the Bayesian posterior distribution.

Experimental result (generalization error): convergence of the generalization error, evaluated on test data, for (1) EMC, (2) CM, and (3) PM over MC steps from 100 to 3200. EMC provides a smaller generalization error than CM.
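For reference, a minimal sketch (assumed function names, not the authors' code) of how the generalization error can be estimated in such a simulation: the Kullback-Leibler divergence from the true density $q$ to the predictive distribution, with the predictive density approximated by averaging the model density over posterior samples and the outer expectation approximated by test data.

```python
import numpy as np

def generalization_error(test_X, posterior_samples, true_density, model_density):
    """Monte Carlo estimate of E_x[ log q(x) - log p(x | X^n) ]."""
    errs = []
    for x in test_X:
        # predictive density at x: average of p(x | w) over posterior draws w
        pred = np.mean([model_density(x, w) for w in posterior_samples])
        errs.append(np.log(true_density(x)) - np.log(pred))
    return float(np.mean(errs))
```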

Contents
 - Background: Singular Learning Machine / Bayesian Learning / MCMC method
 - Proposed method: Exchange Monte Carlo method / Application to Bayesian Learning
 - Experiment and Discussion
 - Conclusion

Conclusion
 - We proposed that the exchange MC method is appropriate for Bayesian learning in hierarchical learning machines.
 - We clarified its effectiveness by a simulation of Bayesian learning in normal mixture models.
 - The experimental results show that:
   - the exchange MC method approximates the Bayesian posterior distribution more accurately than the Metropolis algorithm, and
   - the exchange MC method provides better generalization performance than the Metropolis algorithm.